ChaosSearch Blog - Tips for Wrestling Your Data Chaos

Data is Cheap, Information is Expensive – Part 2

Written by Thomas Hazel | Apr 23, 2019

In my last blog, “Data is Cheap, Information is Expensive Part 1”, I examined the big bang in big data over the last several decades, where information is increasingly the lifeblood of business. I also raised the issue of the cost and complexity of transforming (i.e. refining) raw data into valuable information, diving into the problems related to this transformation as well as the core reasoning behind them. My conclusion hinted at an answer (or alternative) to the cost and complexity dilemma.

In this blog, I will outline the origins of that answer and link to its definition.

Initial Insight

Any answer to a long-standing, hard problem starts with innovation. Solving “information” puzzles is my thing, and computer science is the vehicle. With that said, I’m always amazed at what’s possible when it comes to ideating on how to make data more valuable.

The origin story of CHAOSSEARCH begins with the simple idea of “making information smaller”: if information were at its theoretical minimum, one would store less, move less, and process less. These are all great attributes when trying to solve the cost and complexity dilemmas of big data.

Every startup I’ve worked at or launched, somehow, somewhere, creates a new database or database technology, something for which my colleagues tease me endlessly. But all kidding aside, I like to work on making new computer science concepts a reality, and CHAOSSEARCH is no different. I wanted to create a new representation that reduces information beyond the ratios of today’s compression algorithms: a new file format that could employ any such algorithm and achieve more reduction than the algorithm alone.

This is where I had my “aha moment” and awakening. What flooded into my brain more than 5 years ago was not just a file format that could compress data toward its theoretical minimum, but a database index that could (and would) do more. The idea, called Data Edge™, became more than a file format for reducing size. It became my next company: a technology that, if applied right, could disrupt the analytics space and go a long way toward solving the cost and complexity dilemma of big data. And it would allow me to build another innovative database technology, though this time as a data platform delivered as a service for today’s and tomorrow’s data explosion.

First Principles

For me, each innovation begins with First Principles. If you are interested in reading about the Data Edge first principles, see my blog “First Principles of CHAOS”. In that blog, I walk through the foundational principles that grew out of the initial insight and drove the technology of CHAOSSEARCH. These guidelines (i.e. rules) are at the heart of each and every architectural decision:

  • Minimums
  • Distribution
  • Analysis thereof

From these principles, additional meaning was derived to define and build the CHAOSSEARCH platform, with the idea that a single solution could provide an intelligent and holistic experience for all data management and analytics needs:

  • The ability to simply, quickly and inexpensively store all data at any scale
  • The removal of complexity and external systems for management/analytics
  • The unification of the first two aspects into a single solution at a disruptive price

Ultimately, this led to CHAOSSEARCH as a company. It’s been a 5-year journey from the initial awakening, with 2 of those years as an officially funded company. We made bold bets on the technology and on the vision that S3, as a data lake, could be the foundation of a database. And now those bets are truly paying off.

The mission, that anyone can store everything and ask anything of their own data, is not just an evolution in the convergence of storage and analytics, but a revolution in business opportunities.

White Paper

The following technical white paper outlines (at a high level) the advantages and benefits of the Data Edge technology and associated architecture backing the CHAOSSEARCH platform. We have built, from the ground up, a new distributed database 100% based on your S3 storage. To learn more, please visit the following link: Technical White Paper.

Check out Part 3 to learn more about HOW the CHAOSSEARCH platform aims to solve these problems for our customers.