Making The Case For “Small Data”

Big Data is a buzzword. Many organizations hitching themselves to the Big Data bandwagon amass data quickly in search of unicorn-esque insight, but don’t put much thought into the process. To make matters worse, data, in its various contemporary forms, is readily available. The temptation is high to collect simply because you can and because it may become useful at some point.

We should practice the discipline of small data. An admittedly tongue-in-cheek term, small data refers to these principles:

  • Practice lean data collection: Collect data only when it is absolutely necessary and critical for the business. If you are Microsoft, collecting Windows machine telemetry such as CPU utilization may be okay, but collecting IP addresses or user information as part of the telemetry is not okay. 
  • Delete superfluous data: Do not keep data beyond its business usage, and promptly delete all instances of it from your infrastructure. For example, a real estate company processing real estate bids should not keep bidding information, especially individuals’ financial records, once the transaction is closed.
  • Prevent data creep: Avoid data caches. Stop duplicating data warehouses for different data analysis purposes. And practice the need-to-know principle to limit the number of applications and users that can access sensitive data. If you are a call center agent, you may have access to the last four digits of the customer’s credit card number and/or a customer account number. The rest of the customer record should be completely opaque to you.
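
As a rough illustration, the three principles above could be sketched in a few lines of Python. Everything here is hypothetical and chosen only for the example: the field names, the allow-list, the record layout, and the zero-day retention default are assumptions, not a prescribed implementation.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical allow-list: the only telemetry fields the business needs.
ALLOWED_FIELDS = {"cpu_utilization", "os_version"}


def collect(raw_event):
    """Lean collection: keep allow-listed fields and drop everything else."""
    return {k: v for k, v in raw_event.items() if k in ALLOWED_FIELDS}


def purge_closed(records, now, retention=timedelta(0)):
    """Delete superfluous data: keep only open records, or records that
    closed less than `retention` ago (assumed default: delete once closed)."""
    return [
        r for r in records
        if r["closed_at"] is None or now - r["closed_at"] < retention
    ]


def agent_view(customer):
    """Need-to-know: expose only the account number and last four card digits."""
    return {
        "account_number": customer["account_number"],
        "card_last4": customer["card_number"][-4:],
    }
```

In this sketch, an event carrying an IP address or user name would lose those fields at collection time, a closed bid would be removed on the next purge, and a call center application would only ever see the output of agent_view rather than the full customer record.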

Data, though it can be an asset at times, is a distinct liability, the kind that has a ticking time bomb attached to it. The more you collect, the higher your liability, and the more likely the bomb will go off. Cyber criminals are more likely to choose a data-rich target than a data-poor one. The richer the data, the more determined your adversaries will be.

Many marketers seem blissfully unaware of the extensive evidence of this liability. Have Ashley Madison’s competitors stopped collecting personal data and stopped promising their customers that they will be able to keep the information secret?

In terms of data breach implications, we have not yet seen the worst. Target took down a CEO. Ashley Madison broke up families and destroyed careers. The next data breach may not be so kind; its full devastating impact is yet to be seen.

Perhaps encryption could solve the data collection problem: encrypt the data and you should be fine. Unfortunately, crypto is rarely the answer. Yes, it can help to some extent, but users remain the weak link so long as they can access key materials. Worse yet, having crypto but at the same time ignoring strong security principles to secure keys or other parts of the infrastructure is simply a fool’s errand.

Small data is, and should be, the data principle that every company abides by. Who should enforce “small data” practices in a company? The answer is corporate governance, specifically the board. The board must take on an active data governance function: advocating for “small data” and scrutinizing the practices by which the company collects, stores, and uses personal data. Data is now a corporate risk issue.

If you don’t exert governance, my dear board member, you could be out of a job tomorrow. 

Posted on September 17, 2015

by Chenxi Wang

Chief Strategy Officer, TwistLock

This document was retrieved from on Tue, 25 Oct 2016 13:19:36 -0400.
© 2016 EMC Corporation. All rights reserved.