To understand the value of logs, those countless digital records of hardware and software events, picture a big jigsaw puzzle: you put all the pieces together to make sense of the whole.
Every day the modern enterprise generates billions of logs, each capturing a user login, an application record change, a network service interruption, or the messages that users, applications, and services send to one another.
Data teams collect and normalize these logs, then use the resulting data to correlate events, describe patterns, and identify anomalies.
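As a rough illustration of that pipeline, the sketch below (Python; the log format, field names, and threshold are hypothetical, not taken from any particular tool) normalizes raw log lines into structured records and flags minutes whose event counts spike well above the average, a crude stand-in for the anomaly detection a real analytics engine would perform.

```python
import re
from collections import Counter
from datetime import datetime
from statistics import mean, stdev

# Hypothetical raw format: "<ISO timestamp> <LEVEL> <service> <message>"
LOG_PATTERN = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)

def normalize(line: str):
    """Parse one raw log line into a structured record, or None if it doesn't match."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.fromisoformat(rec["ts"])
    return rec

def flag_anomalous_minutes(records: list, threshold: float = 3.0) -> list:
    """Flag minutes whose event count sits more than `threshold` std devs above the mean."""
    per_minute = Counter(r["ts"].strftime("%Y-%m-%dT%H:%M") for r in records)
    counts = list(per_minute.values())
    if len(counts) < 2:
        return []
    mu, sigma = mean(counts), stdev(counts)
    return [minute for minute, c in per_minute.items()
            if sigma > 0 and (c - mu) / sigma > threshold]

if __name__ == "__main__":
    raw = [
        "2024-05-01T10:00:01 INFO auth user login succeeded",
        "2024-05-01T10:00:02 ERROR billing record change rejected",
        "2024-05-01T10:01:15 WARN network service interruption on link-3",
    ]
    records = [r for r in (normalize(line) for line in raw) if r]
    print(records)
    print(flag_anomalous_minutes(records))
```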
This log analytics process helps control IT operations, reduce security risk, and enable compliance.
Enterprises need log analytics to monitor and manage their fast-digitizing businesses. But rising log volumes can overwhelm their supporting architectures, most notably the ELK stack (comprising the open-source Elasticsearch, Logstash, and Kibana). Enterprise data teams need more efficient ways to index, search, and query all those log files, especially to support AI/ML algorithms. One answer is a new, lightweight index, which might also boost scale and performance for many additional workloads.
There are four primary use cases for log analytics: ITOps, DevOps, security and customer analytics.
These use cases have a common problem: processing data at scale.
Perhaps the greatest chokepoint is indexing.
As log volumes rise, they can inflate data indexes, which in turn drives up processing overhead and chokes search and query workloads. For example, the ELK stack runs on the Apache Lucene search engine, whose “inverted index” lacks (among other things) the compression needed to properly handle larger workloads. Lucene users also must spend time setting up—and scaling—their clusters, schemas, and shards.
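To see why indexing becomes the chokepoint, consider a minimal inverted index, the data structure Lucene builds: every unique term maps to a postings list of the documents that contain it. The toy Python sketch below is an illustration of the concept, not Lucene's actual implementation; in practice the structure grows with both log volume and vocabulary, which is why compression and careful shard planning matter so much.

```python
from collections import defaultdict

def build_inverted_index(docs: dict) -> dict:
    """Map each term to the set of document IDs containing it (a postings list)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index: dict, *terms: str) -> set:
    """Return IDs of documents containing all query terms (a simple AND query)."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

logs = {
    1: "ERROR auth login failed for user alice",
    2: "INFO auth login succeeded for user bob",
    3: "ERROR network timeout contacting billing service",
}
idx = build_inverted_index(logs)
print(search(idx, "ERROR", "auth"))  # -> {1}
```

Every new document and every new unique term adds entries to that mapping, so index size, and the compute needed to build and maintain it, tends to balloon as log volumes rise.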
Figure 1. How Event and Message Logs Overload the ELK Stack
All of this hurts productivity, impairs performance, and pushes cloud compute costs through the roof.
To meet SLAs and budget requirements, many enterprises are forced to scale back by shortening their log retention periods.
This in turn makes analytics output less specific and potentially less accurate. In short, log analytics is yet another case of logjammed data.
ITOps, DevOps and SecOps teams have a few options to break the logjam.
New cloud-based platforms represent one of those options. They compress indexed data rapidly and dramatically, which is critical given the huge number and small size of individual logs. Users can automatically discover, normalize, and catalog all those log files, and assemble metadata to improve query planning, all with a smaller footprint than predecessors such as Lucene. The log data remains in place, yet users query it through a single logical view with familiar visualization tools they already know (such as Kibana, via open APIs).
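Because the query surface stays Elasticsearch-compatible (that is how Kibana keeps working), existing client code continues to run largely unchanged. The sketch below is a minimal example in Python using the requests library and the standard Elasticsearch query DSL; the endpoint URL and index view name are placeholders, not a documented ChaosSearch configuration.

```python
import requests

# Placeholder endpoint and logical view name; substitute your platform's
# Elasticsearch-compatible URL and the view you created over your log data.
SEARCH_URL = "https://logs.example.com/app-logs-view/_search"

# Standard Elasticsearch query DSL: errors from the last 24 hours, newest first.
query = {
    "size": 20,
    "query": {
        "bool": {
            "must": [{"match": {"level": "ERROR"}}],
            "filter": [{"range": {"@timestamp": {"gte": "now-24h"}}}],
        }
    },
    "sort": [{"@timestamp": {"order": "desc"}}],
}

resp = requests.post(SEARCH_URL, json=query, timeout=30)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"].get("@timestamp"), hit["_source"].get("message"))
```

Any Elasticsearch client, or Kibana itself, could issue the same query against the logical view.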
Using a new solution like ChaosSearch, enterprise data teams can increase the scope and scale of their log analytics, which makes their ITOps, DevOps and security initiatives more effective.
Enterprise data teams should watch this space. Offerings like ChaosSearch will continue to make this data easier to analyze.
This log analytics case study underscores three guiding principles that apply across enterprise analytics.
First, in a volatile world, find ways to cost-effectively process and analyze as many data points as possible, including historical data. Last month’s logs might help identify the security threat that reappears next month.
Second, build architectures that minimize cloud compute cycles wherever possible to avoid runaway costs.
Third, and perhaps most importantly, seek out technologies that are fit for purpose. In this case, when you have a puzzle with millions of pieces, you need a fast and simple way to index the pieces.