Big Data vs. Lean Data
Or: the question of the meaning of data life
Big Data is certainly the buzzword of the past, the present and most likely next year too. But I ask myself: is it? Really? To answer that directly with my own opinion: no.
The days of dusty paper archives are thankfully over, and with them the endless searching through barely readable mountains of paper. Companies now hold a wealth of data that creates new opportunities: hard drives far beyond mega- and gigabytes, and permanent availability of all information. But most of that treasure cannot be raised.
“Not all treasure is silver and gold, mate” (Jack Sparrow, Pirates of the Caribbean)
Even though modern data centers and the corresponding architectures allow us to permanently store and use this information about every conceivable customer or not-yet-customer, most companies do not take advantage of this potential. Evaluating the data strains normal capacities and possibly the available know-how, or simply exceeds what is feasible. Are we really better off than in the days of piles of paper?
Data jungle & the NSA bogeyman
In addition, legislators already have their own view of the data jungle and, not just since the modern specter with the three letters NSA, talk about evaluating whether data collection is necessary at all. In some companies, the existing group structures even impose a kind of self-denial when it comes to data exchange, or rather: data non-exchange.
So I ask the question again: do we really need to worry about Big Data? Do we need to increase the capacity of our data centers to follow this trend? My answer is: no. We should not start by harvesting every piece of data. Instead we should ask ourselves, regardless of legislation and with an eye on our own cost structures, which data is really necessary. Which data can we use, given our business model, our processes and the analyses we need? And also: what do we not need, or not need in this form?
Rather than worrying about budgets to be spent on new capacity, we should first think about how we can meaningfully raise our own data treasure and thereby increase the efficiency of existing data through a better cost-benefit ratio.
Whereas the buzzword Big Data is considered modern, lean management is considered almost obsolete. I am for a revival, a retro look for Lean: Lean Data. Instead of orienting ourselves toward the sheer mass of data, we should bring quality back into focus and think about how to
- streamline all business processes
- avoid redundancies altogether
- use synergies (e.g. even in ETL processes)
- bring the amount of data to a reasonable level
- derive the maximum benefit from existing data
- do all of this in compliance with current legislation, while keeping the ability to implement changes (preferably dynamically)
- ideally, anchor all of this in a sustainable architecture
Lean Data is my personal favorite buzzword for 2014.