Modern organizations generate and collect vast amounts of log data each day from an ever-increasing number of sources that includes IT infrastructure, networking devices, applications, cloud services, security tools, and more.
This data is essential for powering use cases from security operations and threat hunting to application performance monitoring, but tapping into the full potential of log data can be challenging for organizations without the right tools and capabilities.
This is true for log analytics users (e.g. DevSecOps teams and IT personnel) who find themselves constrained by the limited querying capabilities of legacy analytics solutions like the ELK Stack, as well as for BI users who analyze business data using SQL/relational queries but struggle with data silos and may not be able to access log data for analytical purposes.
This is where relational access to log data can play a revolutionary role in democratizing data access and enabling novel analytics use cases for both log analytics and BI users.
In this blog post, we’re taking a closer look at three cutting-edge use cases for relational access to log data and how this innovation can empower both log analytics and BI users to explore data in new ways, uncover deeper insights, and make better decisions that enhance business outcomes.
When we think about log analytics users, we’re thinking of ITOps personnel that use log analytics to monitor the health and performance of cloud services, DevOps teams that use log analytics to detect application performance issues and assess user behavior, and SecOps teams that use log analytics to enable their security monitoring and threat hunting activities.
Many of these users are accustomed to querying log data inside databases that support full-text search, but may not fully support relational queries. One common example is Elasticsearch, an open-source NoSQL database commonly deployed in conjunction with Logstash and Kibana as the ELK Stack. Log data indexed in Elasticsearch is stored in an unstructured format, so while it may be queried using full-text search, it cannot easily be queried using SQL or relational querying. Elasticsearch did recently release an SQL feature that enables some relational querying, but there are significant limitations that impact the practical value of this feature when it comes to querying logs, especially complex JSON logs with nesting or arrays.
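To make the nesting problem concrete, here is a minimal sketch of what it takes to turn one nested JSON log with an array into flat, table-like rows. The log shape, the field names, and the `flatten_log` helper are all hypothetical and chosen only for illustration; this is not how Elasticsearch or ChaosSearch handle it internally.

```python
import json


def flatten_log(record: dict, array_key: str) -> list[dict]:
    """Return one flat row per element of record[array_key] (hypothetical helper)."""

    def flatten(obj: dict, prefix: str = "") -> dict:
        # Collapse nested objects into dotted column names, e.g. user.id
        flat = {}
        for key, value in obj.items():
            name = f"{prefix}{key}"
            if isinstance(value, dict):
                flat.update(flatten(value, prefix=f"{name}."))
            else:
                flat[name] = value
        return flat

    # Fields outside the array are repeated on every output row
    base = flatten({k: v for k, v in record.items() if k != array_key})
    rows = []
    for item in record.get(array_key, []):
        row = dict(base)
        row.update(flatten(item, prefix=f"{array_key}."))
        rows.append(row)
    return rows


# Assumed example log line: one request carrying an array of events
raw = (
    '{"ts": "2024-01-01T00:00:00Z", "user": {"id": "u1", "region": "eu"}, '
    '"events": [{"type": "login", "ms": 120}, {"type": "search", "ms": 340}]}'
)
rows = flatten_log(json.loads(raw), array_key="events")
for row in rows:
    print(row)
```

Each array element becomes its own row with columns such as `user.id` and `events.type`, which is the shape a SQL engine needs before it can filter, group, or join on those fields.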
For log analytics users, gaining relational access to log data can create opportunities to unlock valuable new insights by combining SQL analytics and full-text search querying.
With relational access to log data, these users will be able to correlate log data extracted from text-based or JSON logs and stored in a tabular format, as well as perform SQL joins of log data from multiple sources to enable detailed analysis across multiple tables and reveal correlations that might have gone unnoticed in the past.
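As a rough illustration of the kind of join described above, the sketch below uses an in-memory SQLite database as a stand-in for a SQL endpoint. The `app_logs` and `auth_logs` tables, their columns, and the sample rows are assumptions made up for this example, and the `LIKE` filter only approximates a real full-text search.

```python
import sqlite3

# In-memory SQLite stands in for any SQL endpoint over log data
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_logs (ts TEXT, user_id TEXT, status INTEGER, message TEXT);
CREATE TABLE auth_logs (ts TEXT, user_id TEXT, outcome TEXT, source_ip TEXT);
""")
conn.executemany("INSERT INTO app_logs VALUES (?, ?, ?, ?)", [
    ("2024-01-01T10:00:00Z", "u1", 500, "payment service timeout"),
    ("2024-01-01T10:01:00Z", "u2", 200, "checkout ok"),
    ("2024-01-01T10:02:00Z", "u1", 500, "payment service timeout"),
])
conn.executemany("INSERT INTO auth_logs VALUES (?, ?, ?, ?)", [
    ("2024-01-01T09:59:00Z", "u1", "failure", "203.0.113.7"),
    ("2024-01-01T09:59:30Z", "u1", "success", "203.0.113.7"),
    ("2024-01-01T10:00:30Z", "u2", "success", "198.51.100.4"),
])

# Text-style filter on the message plus a join against a second log source:
# for each user hitting timeouts, how many failed logins did they also have?
query = """
SELECT a.user_id,
       COUNT(DISTINCT a.ts) AS error_count,
       COUNT(DISTINCT f.ts) AS failed_logins
FROM app_logs AS a
LEFT JOIN auth_logs AS f
       ON f.user_id = a.user_id AND f.outcome = 'failure'
WHERE a.message LIKE '%timeout%' AND a.status >= 500
GROUP BY a.user_id
"""
results = conn.execute(query).fetchall()
print(results)
```

The point of the sketch is that a single statement combines a text filter with a correlation across two sources, something that is awkward to express with full-text search alone.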
These capabilities will unlock new opportunities for log analytics users:
BI is a set of strategies, methods, and technologies used by organizations to collect, analyze, and extract useful insights from business data.
When we think about BI users, we’re thinking about revenue operations (RevOps) teams who analyze sales and marketing data to optimize the customer journey, product teams who analyze data from a variety of sources to drive strategic decision-making, business analysts who create reports on everything from product performance to operational outcomes, and C-Suite executives who consume those reports to understand business results and inform strategic decision-making.
While log analytics users are accustomed to running full-text search queries on log data, BI users rely on SQL/relational querying techniques to harvest insights from all kinds of business data, including sales and marketing data, operational data, website analytics, financial data, and more.
To power BI initiatives, enterprises build and maintain complex data pipelines to extract business data from multiple sources, transform/normalize the data, and load the data into a data warehouse or data mart. Managing ETL pipelines adds significant cost and complexity to BI workflows.
The challenge for BI teams is that business data exists in multiple silos (e.g. Salesforce CRM, ERP software, financial systems, web logs, marketing automation tools, Google Adwords, etc.), and it’s both difficult and expensive to search for correlations.
Companies solve this today by building and maintaining complex data pipelines, creating multiple transforms, and exporting data to multiple analytics tools, but it’s virtually impossible and certainly cost-prohibitive to maintain an accurate, 360-degree view of the customer while depending on these time- and cost-intensive methods.
Organizations can overcome this challenge and extend log analytics capabilities to BI users by adopting ChaosSearch, a cloud data platform architected to support both full-text search and relational querying of logs and business data via built-in SQL API.
Instead of building and maintaining complex data pipelines to keep insights flowing, organizations using ChaosSearch can ingest business and customer data into the data lake, alongside application logs and telemetry data, then execute text search or relational queries against their data.
With full-text search and relational access to both logs and business data, BI teams will be able to:
With relational access to log data, companies with a focus on product-led growth can join A/B testing data with user metadata to delve deeper into customer experience and product usage, or combine product and user experience data to drive insights that grow product usage and revenue.
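To sketch what joining A/B testing data with user metadata might look like, the example below again uses in-memory SQLite as a stand-in SQL engine. The `ab_events` and `users` tables, the `plan` column, and all sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ab_events (user_id TEXT, variant TEXT, converted INTEGER);
CREATE TABLE users (user_id TEXT PRIMARY KEY, plan TEXT);
""")
# Assumed A/B test events, e.g. extracted from application logs
conn.executemany("INSERT INTO ab_events VALUES (?, ?, ?)", [
    ("u1", "A", 1), ("u2", "A", 0), ("u3", "B", 1),
    ("u4", "B", 1), ("u5", "A", 0), ("u6", "B", 0),
])
# Assumed user metadata, e.g. loaded from a CRM or billing system
conn.executemany("INSERT INTO users VALUES (?, ?)", [
    ("u1", "free"), ("u2", "free"), ("u3", "free"),
    ("u4", "paid"), ("u5", "paid"), ("u6", "paid"),
])

# Conversion rate per plan and variant, which neither table can answer alone
query = """
SELECT u.plan, e.variant, COUNT(*) AS users, AVG(e.converted) AS conversion_rate
FROM ab_events AS e
JOIN users AS u ON u.user_id = e.user_id
GROUP BY u.plan, e.variant
ORDER BY u.plan, e.variant
"""
results = conn.execute(query).fetchall()
for plan, variant, users, rate in results:
    print(f"{plan:5} {variant} users={users} conversion={rate:.2f}")
```

Segmenting the experiment by plan is only possible once the event data and the user metadata are queryable side by side.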
With log analytics and business intelligence happening in a single data lake that supports both relational and full-text search, the last step to achieving true multi-model analytics is to enable machine learning queries.
Our final use case illustrates how both the users of BI tools and the users of log analytics software can combine text search, relational, and machine learning analysis to develop new insights.
In this unified data lake, analysts can use text search and machine learning detection to identify patterns and anomalies in their data. Analysts can also use our Chaos AI Assistant, a generative AI tool that enables users to interact with and query their data using natural language. When a curious pattern or anomaly is discovered, data analysts can use the BI tools of their choice (e.g. Tableau, Looker, etc.) and visualization tools such as Kibana to engage with the data, analyze what they’ve found, and ask the right questions to uncover valuable new insights.
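The find-then-drill-down workflow can be sketched in a few lines. To be clear about the substitutions: a simple z-score over per-minute error counts stands in for machine learning detection, a `LIKE` filter stands in for text search, and the `logs` table with its sample data is invented for this example.

```python
import sqlite3
from statistics import mean, pstdev

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (minute TEXT, service TEXT, message TEXT)")

rows = []
# Assumed baseline: nine quiet minutes with one error and one success each
for m in range(9):
    rows.append((f"10:{m:02d}", "api", "error: upstream timeout"))
    rows.append((f"10:{m:02d}", "api", "request ok"))
# Assumed incident: a burst of billing errors in the tenth minute
for _ in range(12):
    rows.append(("10:09", "billing", "error: card processor unreachable"))
conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)

# Step 1: text-style filter, then count matching lines per minute
counts = conn.execute(
    "SELECT minute, COUNT(*) FROM logs "
    "WHERE message LIKE '%error%' GROUP BY minute ORDER BY minute"
).fetchall()

# Step 2: flag minutes whose count is more than 2 standard deviations above the mean
values = [count for _, count in counts]
mu, sigma = mean(values), pstdev(values)
anomalies = [minute for minute, count in counts
             if sigma > 0 and (count - mu) / sigma > 2.0]

# Step 3: relational drill-down into the flagged minutes, grouped by service
placeholders = ",".join("?" for _ in anomalies)
drill = conn.execute(
    f"SELECT service, COUNT(*) FROM logs "
    f"WHERE minute IN ({placeholders}) AND message LIKE '%error%' "
    f"GROUP BY service",
    anomalies,
).fetchall()
print(anomalies, drill)
```

In practice the detection step would be a proper model and the drill-down would happen in a BI tool, but the division of labor is the same: search narrows the data, detection points at the odd spot, and a relational query explains it.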
Finally, large companies with massive volumes of data can consolidate data from hundreds of sources into a single data lake, analyze massive datasets with text search and machine learning algorithms to find patterns, then drill down to reveal insights using their preferred BI applications.
Enabling relational access to log data can help organizations break down data silos, democratize data access, and open up a world of enhanced analytics possibilities for both log analytics and BI users.
Traditional log users can move beyond the limitations of full-text search and start leveraging structured queries to aggregate data, find correlations, and uncover insights that would have been missed in the past, while BI users can analyze log data alongside business data to gain a deeper contextual understanding of the customer journey.
All of these benefits are made possible by ChaosSearch, a powerful data lake database that centralizes enterprise data storage and empowers our customers with true multi-model data access.
ChaosSearch is easily deployable as a SaaS solution or within your Databricks environment.
ChaosSearch via SaaS deployment:
ChaosSearch sits on top of your AWS and GCP cloud object storage, enabling full-text search and relational access to your log data via Elastic API and SQL API.
With ChaosSearch, organizations can ingest logs and customer/business data at scale into cost-effective public cloud storage, establishing a single source of truth for both log analytics and BI users. From there, ChaosSearch automatically indexes the data and makes it available for full-text search, SQL/relational queries, or Gen AI querying with no complex data pipelines, no costly data movement, and no data retention trade-offs.
ChaosSearch via Databricks deployment:
ChaosSearch is also enabling relational access to log data through our partnership with Databricks.
Databricks customers can deploy ChaosSearch inside their Databricks workspace, establish an open data lake where they can ingest and index log data, then query those logs via our compatible Elastic API or using the built-in SQL capabilities of Databricks. Read the Solution Brief: Extend Your Databricks with ChaosSearch.
Watch our free on-demand webinar Unleash the Potential of Your Log and Event Data, Including AI’s Growing Impact to discover how you can maximize the value of your log data with multi-model analytics powered by ChaosSearch.
Read the White Paper: Digital Business Observability - Analyzing IT and Business Data Together.