To say the volume of data is exploding is an understatement. It’s estimated that data will grow from 33 zettabytes in 2018 to 175 zettabytes by 2025, according to the most recent IDC and Seagate report. The reasoning is simple: the ability to generate data via cheap
With the advent of the internet, cloud, and all things connected, data has become the lifeblood of companies’ external communication, internal operations, and overall business value. These veins of data stream throughout a company. Each intersection can change and/or add to this flow of data. And to keep everything running smoothly, this data is stored and analyzed to promote the good health of the business. A portion of this data is also stored as it has a direct relation to the value a business produces.
However, wanting to utilize this growth in data is not enough. The issue is not in the ability to create mountains of data, nor the ability to stockpile it. The problem is in transforming raw data into valuable information. Data becomes information when questions can be asked of it and decisions and/or insights can be derived from it. And here lies the dilemma: the more data there is, the harder and more expensive it is to refine.
But why is this? What makes it so expensive? Here again, the reasoning is simple. It is far cheaper to generate and store data than it is to transform it into accessible information. This refining of data involves much more
However, there is some salvation with respect to how much additional compute is required to derive information from data. Instead of brute-force parsing through every piece of raw data, database solutions apply algorithms and data structures from computer science. Each solution has different benefits and requirements, but in the end, all pretty much do the same thing: store data in a representation such that intelligent access can be performed more efficiently than manually analyzing the raw source.
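To make that trade-off concrete, here is a minimal sketch of my own, not taken from any particular database. It contrasts a brute-force scan with a simple inverted index, one common example of such a structure. The log lines and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical raw data, invented for this illustration.
RAW_LINES = [
    "error disk full on node-1",
    "info backup complete on node-2",
    "error network timeout on node-2",
]


def brute_force_search(lines, term):
    """Re-read every line on every query; cost grows with total data size."""
    return [i for i, line in enumerate(lines) if term in line.split()]


def build_inverted_index(lines):
    """Parse the data once, up front, mapping each token to its line numbers."""
    index = defaultdict(list)
    for i, line in enumerate(lines):
        for token in set(line.split()):
            index[token].append(i)
    return index


def indexed_search(index, term):
    """Answer a query with a single lookup instead of a full scan."""
    return index.get(term, [])


index = build_inverted_index(RAW_LINES)
print(brute_force_search(RAW_LINES, "error"))
print(indexed_search(index, "error"))
```

Both calls return the same line numbers, but the indexed version pays its parsing cost once at build time and again whenever new data arrives. That up-front compute and the extra storage for the index are part of what makes refining data more expensive than simply storing it.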
At first glance, one would think today’s technology and associated databases address the cost of translating data into information. And for decades they did. But with the growth in the 3Vs (volume, velocity, and variety), these solutions are teetering. New styles of databases have been introduced to alleviate the cost, but neither the philosophy of refining data into information nor the underlying science has changed. And if it’s not obvious yet, the amount of
In part 2 of “Data is Cheap, Information is Expensive”, I will endeavor to describe an alternative to today’s database technology and associated solutions. A new outlook on how data should be stored and accessed. A philosophy that accessing information should be as simple as storing data, without breaking the bank. A viewpoint that cost has a direct relationship to its life cycle and the science behind it. What I will be describing is CHAOSSEARCH and the patent-pending technology and architecture it employs to make information inexpensive too.