Datadog has made a name for itself as a popular cloud-native application performance monitoring tool, measuring a system’s health and status based on the telemetry data it generates. This telemetry includes machine-generated data such as logs, metrics and traces. Cloud-based applications and infrastructure generate millions (even billions) of logs – and analyzing them can yield a wealth of insights for DevOps, security, product teams and more.
While Datadog is a great smoke alarm for detecting issues, deeper troubleshooting and root cause analysis become a massive challenge as organizations scale. The most common challenges include log retention schedules, data transformation and rehydration processes, and cost. Let’s dive into why log analytics are important, the top log management challenges with Datadog, and how to navigate them.
Read: Managing Cloud Service Logs: Why It’s Difficult and How to Simplify It
According to Eckerson Group, there are four primary use cases for analyzing log data: ITOps, DevOps, security and customer analytics. Vice President of Research Kevin Petrie elaborates on each use case.
Petrie adds that these use cases have a common problem: processing data at scale.
This issue crops up across a wide variety of observability, monitoring and dashboarding tools such as Elasticsearch, CloudWatch and Datadog. Fortunately, organizations can supplement these tools with alternatives that enable powerful log analytics at scale with unlimited data retention.
Read: 5 Best Practices for Simplifying Data Management
To execute properly on these use cases, organizations must be able to access large volumes of log data for analysis. One of the biggest challenges with tools like Datadog is the high cost of log storage and retention. As a result, many teams choose not to retain data past 30 days – which becomes problematic when they try to dig into root cause analysis or investigate advanced persistent threats.
While centralization of telemetry and an intuitive user interface (UI) are certainly Datadog pros, its scaling challenges can quickly become costly. The metrics and traces Datadog monitors are priced per host and scale only with the number of services, but logs are priced by volume. Costs can add up quickly for organizations running microservices architectures, which generate high volumes of logs.
Datadog prices logs along two dimensions: ingestion and retention. Ingestion covers shipping logs from their source and storing them in Datadog. Retention covers indexing those logs so customers can analyze them performantly. Logs that are ingested but not retained must be re-indexed before analysis, via a process called log rehydration. Rehydration can take hours, requires a dedicated resource, and teams typically pay the retention price on the volume of logs rehydrated.
Ingestion is priced at $0.10/GB, while retention is priced at $1.06/GB to $2.50/GB for 3-day to 30-day retention, respectively. To optimize their costs, organizations try all sorts of workarounds: reducing Datadog retention periods, ingesting only a subset of logs and dealing with the rehydration process when needed, or cutting the volume of logs ingested altogether. None of these workarounds is ideal. Besides hindering the team’s ability to generate insights, limiting data retention can become a problem for regulatory compliance and legal as well.
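To make the trade-off concrete, here is a back-of-envelope sketch using the per-GB rates cited above. The log volumes and retention percentages are hypothetical, and real Datadog bills depend on plan details beyond a flat per-GB rate, so treat this strictly as an illustration of how the workarounds shift costs:

```python
# Illustrative monthly cost model using the per-GB rates cited in this
# article. Volumes are hypothetical; actual Datadog pricing depends on
# plan and billing details beyond a flat $/GB figure.

INGEST_PER_GB = 0.10      # ingestion rate, $/GB
RETAIN_30D_PER_GB = 2.50  # 30-day retention rate, $/GB (upper bound cited)

def monthly_cost(gb_ingested: float, gb_retained: float,
                 retain_rate: float) -> float:
    """Ingestion is billed on all logs; retention only on the retained subset."""
    return gb_ingested * INGEST_PER_GB + gb_retained * retain_rate

# Scenario 1: ingest 5,000 GB/month and retain all of it for 30 days
full = monthly_cost(5000, 5000, RETAIN_30D_PER_GB)

# Scenario 2 (workaround): retain only 1,000 GB up front, then later
# rehydrate another 500 GB – rehydrated logs are billed at the
# retention rate as well
partial = monthly_cost(5000, 1000, RETAIN_30D_PER_GB) + 500 * RETAIN_30D_PER_GB

print(f"Retain everything:     ${full:,.2f}")
print(f"Partial + rehydration: ${partial:,.2f}")
```

The workaround cuts the bill substantially, but only by giving up immediate access to 80% of the logs – which is exactly the trade-off described above.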
Read: Logging Blindspots: Top 7 Mistakes that are Hindering Your Log Management Strategy
As Petrie points out in the article cited above, new cloud-based platforms can alleviate some of the bottlenecks with log data:
“They rapidly and dramatically compress indexed data, which is critical given the high number and small content of logs. Users can automatically discover, normalize, and catalog all those log files, and assemble metadata to improve query planning—all with a smaller footprint than predecessors such as Lucene. The log data remains in place, but presents a single logical view with familiar visualization tools the user already knows (such as Kibana via open APIs).”
Datadog users can easily move their log workloads to a solution like ChaosSearch – while keeping the other monitoring capabilities intact. Getting started is easy.
Using ChaosSearch, organizations can take advantage of unlimited data retention with a starting price of $0.30/GB and significant discounts at scale. In addition, analyzing data in place avoids processes like data pipelines and rehydration – helping teams achieve much faster time to insights. Combining Datadog with ChaosSearch can help organizations achieve best-in-class monitoring as well as a deeper understanding of the systems that drive business growth.
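As a rough, self-contained comparison of the retention rates cited in this article (Datadog’s $2.50/GB for 30-day retention versus ChaosSearch’s $0.30/GB starting price), with a hypothetical monthly log volume:

```python
# Hypothetical per-GB retention cost comparison at the rates cited in
# this article. Volume is illustrative; real quotes vary by plan and scale.
monthly_gb = 5000  # hypothetical: 5,000 GB of logs retained per month

datadog_30d = monthly_gb * 2.50    # Datadog 30-day retention, $/GB
chaossearch = monthly_gb * 0.30    # ChaosSearch starting price, $/GB

print(f"Datadog 30-day retention: ${datadog_30d:,.2f}")
print(f"ChaosSearch:              ${chaossearch:,.2f}")
```

At these cited rates the per-GB difference compounds quickly with volume, which is why long retention windows are where the cost gap shows up most.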
Read the Blog: Best Practices for Effective Log Management
Listen to the Podcast: Differentiate or Drown: Managing Modern-Day Data
Check out the eBook: Beyond Observability: The Hidden Value of Log Analytics