ChaosSearch Blog - Tips for Wrestling Your Data Chaos

5 DevSecOps Checklists to Embrace Advanced Techniques in 2025

Written by David Bunting | Jan 3, 2025

If you’re here, you know the basic DevSecOps practices like incorporating proper encryption techniques and embracing the principle of least privilege for access control. You may be entering the realm of advanced DevSecOps maturity, where you function as a highly efficient, collaborative team, with developers embracing secure coding and automated security testing best practices. At this stage, your team members are ideally integrating security best practices throughout the software development lifecycle. You likely develop secure applications and source code by design.

This blog is intended to move your team beyond the basics to more advanced DevSecOps techniques (such as audit logging and fault tolerance) that help you detect and respond to the increasing intensity and volume of attacks on applications and infrastructure. According to the 2024 Cost of a Data Breach study, the average cost of a data breach increased by 10% over the previous year, reaching its highest level ever at $4.88 million. The longer security issues linger, the costlier they can get. For example, advanced persistent threats (APTs) can masquerade as legitimate IT administrators and go undetected for months. The techniques detailed below can help you secure your applications and infrastructure, investigate security incidents faster, and develop proactive security engineering techniques.

 

 

1. Audit logging

 

What is audit logging?

Think of audit logging as documenting the activity within software systems. An audit log records the occurrence of an event, the time it happened, the responsible user or service, and the entity impacted. An audit trail shows events in order, enabling teams to get a sequential view of what happened on their system. Advanced DevSecOps teams often review audit logs to investigate security breaches and maintain regulatory compliance standards.
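
To make the four fields described above concrete, here is a minimal Python sketch of an audit record and an audit trail. The class name, field names, and event labels such as "login.failure" are illustrative assumptions, not a standard schema; adapt them to whatever format your logging pipeline expects.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One audit log entry: what happened, when, who did it, and to what."""
    event: str      # the action that occurred (hypothetical label, e.g. "account.delete")
    timestamp: str  # ISO 8601 time in UTC
    actor: str      # the responsible user or service
    target: str     # the entity impacted by the event


def make_event(event: str, actor: str, target: str) -> AuditEvent:
    """Stamp a new event with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat(timespec="microseconds")
    return AuditEvent(event=event, timestamp=now, actor=actor, target=target)


def to_log_line(entry: AuditEvent) -> str:
    """Serialize to one JSON object per line so entries are easy to ship and query."""
    return json.dumps(asdict(entry), sort_keys=True)


def audit_trail(entries: list[AuditEvent]) -> list[AuditEvent]:
    """Return events in time order, giving the sequential view of an audit trail."""
    # Sorting the strings works here because every timestamp uses the same
    # fixed-width UTC format.
    return sorted(entries, key=lambda e: e.timestamp)
```

Writing one JSON object per line is a common choice because most log shippers and analytics tools can ingest it without a custom parser.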

 

Why it’s important

An audit trail can help your organization find out how a breach happened. According to NIST, audit trails can help you accomplish several security-related objectives, including maintaining individual accountability, reconstructing events or actions that happen on your system, detecting intrusions, and analyzing problems.

 

Pro tip

Regularly collecting and retaining logs is critical to your audit logging program. In fact, you probably need to retain logs beyond the typical 30-day retention period to detect advanced persistent threats (APTs). According to UC Berkeley’s information security resources, your audit logging program should, at a minimum, include the following checklist:

 

Audit Logging DevSecOps Checklist

Your security logging and monitoring program should include the following data.

  • Operating System (OS) Events
    • Start up and shut down of the system
    • Start up and shut down of a service
    • Network connection changes or failures
    • Changes to, or attempts to change, system security settings and controls
  • OS Audit Records
    • Logon attempts (successful or unsuccessful)
    • The function(s) performed after logging on (e.g., reading or updating a critical file, software installation)
    • Account changes (e.g., account creation and deletion, account privilege assignment)
    • Successful/failed use of privileged accounts
  • Application Account Information
    • Successful and failed application authentication attempts
    • Application account changes (e.g., account creation and deletion, account privilege assignment)
    • Use of application privileges
  • Application Operations
    • Application startup and shutdown
    • Application failures
    • Major application configuration changes
    • Application transactions, for example,
      • Email servers recording the sender, recipients, subject name, and attachment names for each email
      • Web servers recording each URL requested and the type of response provided by the server
      • Business applications recording which financial records were accessed by each user

 

 

2. Fault tolerance

 

What is fault tolerance?

Fault tolerance is the ability of a system (such as a computer, network, or cloud cluster) to continue operating without interruption when one or more of its components fail. Load balancing and failover solutions can prevent system outages and ensure high availability. In addition, closely monitoring your system’s fault tolerance can help you detect an increasing volume of sophisticated attacks.

 

Why it’s important

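Threat actors are launching more and more attacks that impact system operations. These may include DDoS attacks, SQL injection attacks, and DNS tunneling attacks, all of which are addressed in the checklist below.

Fault-tolerant systems are critical for mitigating these risks.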

 

Pro tip

Event logs contain detailed information regarding state changes in your environment. First, detecting these changes can help you pick up on potentially suspicious activity, outages, or failures on the network caused by malicious actors. Second, it’s critical to secure the sensitive data contained within these logs, such as passwords or access permissions. In Amazon S3, for example, services like AWS Trusted Advisor will help you check for misconfigurations or open access privileges in your S3 buckets that may provide a front door to attackers.
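
One way to keep secrets out of logs in the first place is to mask them before a record is written. The following is a minimal sketch using a filter from Python's standard logging module. The list of key names treated as secret is an assumption, and a regular expression like this will not catch every format, so treat it as a starting point and not as a complete control.

```python
import logging
import re

# Assumed convention: secrets appear as key=value pairs with one of these key names.
SECRET_PATTERN = re.compile(r"(?i)\b(password|secret|access_key|token)=(\S+)")


def redact(message: str) -> str:
    """Replace the value of any secret-looking key=value pair with ***."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=***", message)


class RedactingFilter(logging.Filter):
    """Logging filter that masks secrets before any handler sees the record."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so secrets passed as arguments are masked too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # never drop the record, only rewrite it


logger = logging.getLogger("app")
logger.addFilter(RedactingFilter())
```

With the filter attached, a call such as logger.warning("login failed user=bob password=%s", value) is rewritten before it reaches any handler on that logger.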

 

Fault Tolerance DevSecOps Checklist

  • Implement Robust Logging and Security Monitoring
    • Use centralized logging solutions to capture event logs from all components.
    • Secure sensitive data in logs (e.g., masking passwords, access keys).
    • Regularly analyze logs for anomalies and potential threats.
  • Configure Load Balancing and Failover Mechanisms
    • Ensure load balancers distribute traffic evenly across servers.
    • Set up automated failover to secondary systems when primary systems fail.
    • Test failover regularly to ensure seamless operations during outages.
  • Harden Application and Network Security
    • Deploy web application firewalls (WAF) to mitigate DDoS attacks and SQL injection attacks.
    • Restrict access using least privilege principles (e.g., network firewalls, IAM policies).
    • Secure DNS configurations to prevent tunneling attacks.
  • Build Redundant Infrastructure
    • Use multiple availability zones or regions to ensure redundancy.
    • Maintain backup databases and critical resources for quick restoration.
    • Regularly test disaster recovery plans and backups.
  • Perform Real-Time Threat Detection and Continuous Security Monitoring
    • Leverage tools for security scanning and real-time monitoring of threats.
    • Set up alerts for suspicious activities such as unauthorized access or abnormal traffic.
    • Train teams to respond to alerts promptly.
  • Regularly Test Fault Tolerance
    • Conduct simulated failure scenarios (e.g., Chaos Engineering tools like Gremlin).
    • Identify system bottlenecks or vulnerabilities during tests.
    • Optimize based on test outcomes to enhance resilience.
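
The load balancing and failover items above can be sketched in a few lines. This toy balancer hands out backends in round-robin order and skips any that fail a health check. The class, the backend names, and the is_healthy callback are all hypothetical; a real deployment would rely on a managed load balancer or proxy and not on application code like this.

```python
from itertools import cycle


class LoadBalancer:
    """Round-robin balancer that fails over past unhealthy backends."""

    def __init__(self, backends, is_healthy):
        self.backends = list(backends)
        self.is_healthy = is_healthy  # callback: backend -> bool
        self._ring = cycle(self.backends)

    def pick(self):
        """Return the next healthy backend, or raise if none is available."""
        # Try each backend at most once per call so we never loop forever.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if self.is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical health state; in practice this would come from real health probes.
health = {"app-1": True, "app-2": True, "app-3": True}
balancer = LoadBalancer(health.keys(), lambda name: health[name])
```

Flipping a value in health to False and calling balancer.pick() again is a simple way to rehearse the "test failover regularly" item: traffic should keep flowing to the remaining backends.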

 

3. Threat hunting

 

What is threat hunting?

Threat hunting is a purposeful and structured search for evidence of malicious activities that have not yet generated security alerts. It’s a proactive, human-centric security measure that pushes the boundaries of automated detection methods. Resources like the MITRE ATT&CK framework can give teams a strong basis from which to start.

 

Why it’s important

Threat hunting can help detect APTs in the network that mask themselves as legitimate activity. These threats can linger and become very costly and damaging. There are many threat hunting methodologies that provide a well-defined, research-based structure to the approach. Most threat hunters assume the cloud environment has already been compromised and the threat already exists.

 

Pro tip

One of the most important indicators of a security organization’s threat hunting ability is the quantity and quality of the log data it collects and makes available to the SecOps team. Most security professionals believe that enriching the systems in their security operations center (SOC) with additional data sources is the most important step they can take to enhance threat hunting capabilities.

Broadly speaking, threat hunters need access to both host and network data sources, as well as cloud application logs. Host logs can be collected via an agent or through native logging tools like Windows Event Forwarding, the Sysmon utility, auditing services for Linux, or unified logging for macOS.

These logs should provide visibility into how configuration management utilities like PowerShell are being used within the environment, since these tools are commonly exploited by attackers seeking to maintain persistence while keeping a low profile.

You can follow the threat hunting steps in the checklist below.

 

Threat Hunting DevSecOps Checklist

  • Define the Attack Scenario
    • Identify a specific potential threat in the environment.
    • Consider attacker techniques, target vulnerabilities, and potential attack paths.
  • Formulate an Initial Hypothesis
    • Predict the attacker’s goals and methods for each stage of the attack chain.
    • Plan the hunt to search for evidence supporting or refuting the hypothesis.
  • Identify and Gather Evidence
    • Assemble relevant data sources for analysis.
    • Ensure evidence is contextualized, consistent, and well-documented.
  • Analyze and Correlate Data
    • Use analytics and visualization tools to uncover relationships and anomalies.
    • Cross-reference evidence against known adversary tactics and environmental norms.
  • Report Findings
    • Document collected evidence, analysis performed, and conclusions reached.
    • Share results with management and incident response teams for follow-up actions.
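
As a small illustration of the checklist above, the sketch below tests one hypothesis: an attacker is hiding PowerShell activity behind encoded commands. It scans process-creation events for the -enc or -EncodedCommand option and groups matches by host. The event fields ("host", "command_line") are assumed for the example, and the pattern is deliberately simple; PowerShell accepts other abbreviations of this option that a real hunt would also need to cover.

```python
import re
from collections import defaultdict

# Hypothesis: encoded PowerShell commands indicate an attempt to hide activity.
ENCODED_PS = re.compile(
    r"(?i)\bpowershell(?:\.exe)?\b.*\s-(?:enc|encodedcommand)\b"
)


def hunt(events):
    """Group command lines that match the hypothesis by the host they ran on."""
    findings = defaultdict(list)
    for event in events:
        if ENCODED_PS.search(event["command_line"]):
            findings[event["host"]].append(event["command_line"])
    return dict(findings)


# Hypothetical process-creation events gathered from host logs.
events = [
    {"host": "web-1", "command_line": "powershell.exe -NoProfile -enc SQBFAFgA"},
    {"host": "web-1", "command_line": "powershell.exe Get-Process"},
    {"host": "db-1", "command_line": "ls -la"},
]

report = hunt(events)
```

A match is evidence to investigate, not proof of compromise, since administrators sometimes use encoded commands legitimately. That is why the checklist calls for comparing findings against environmental norms before reporting.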

 

Read: Log Analytics and SIEM for Enterprise Security Operations and Threat Hunting.

 

 

4. Fuzzing

 

What is it?

According to OWASP, fuzzing (or fuzz testing) is a black box software testing technique that involves finding implementation bugs using automated malformed/semi-malformed data injection. In other words, fuzzers inject data so application testers can watch how an application behaves when it receives the kind of malicious or random input it may encounter in the real world.

 

Why it’s important

For many teams, fuzzing is an important, proactive security check before an application is shipped into production. Fuzzing can show you the quality of the target system and software. Using fuzz testing, you can check the robustness and security risk posture of the system and software application you’re testing.

Fuzzing is also a primary technique attackers use to find software vulnerabilities. Incorporating fuzzing into your SOC can potentially help prevent zero-day exploits from unknown bugs and weaknesses in your system. In many cases, fuzzing can uncover vulnerabilities that otherwise wouldn’t be detected through manual audits or conventional security testing.

 

Pro tip

Fuzzing can be done at low cost and doesn’t require much human intervention. There are many open source tools and frameworks available to help teams accomplish their fuzz testing goals, including (but not limited to) the following resources:

 

Open Source Mutational Fuzzers

 

Fuzzing Frameworks

 

Fuzzing DevSecOps Checklist

  • Select Appropriate Fuzzing Tools
    • Choose tools suitable for your target system.
    • Consider open-source solutions to reduce costs.
  • Define Test Cases
    • Identify application components to test (e.g., APIs, inputs, or file parsers).
    • Create malformed and semi-malformed data injection scenarios.
  • Automate and Execute Tests
    • Run fuzz tests with minimal manual intervention.
    • Monitor the application’s behavior for crashes, errors, or unexpected responses.
  • Analyze Results
    • Investigate crashes, vulnerabilities, and anomalies detected during fuzzing.
    • Cross-check findings with system robustness and security requirements.
  • Document and Mitigate
    • Record all vulnerabilities and their root causes.
    • Implement fixes and re-test to ensure issues are resolved.
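
To show what the steps above look like in practice, here is a minimal mutational fuzzer written in plain Python. The target, parse_record, is a toy parser with a deliberately planted bug, and the mutation strategy (overwrite, insert, or delete a random byte) is far simpler than what dedicated fuzzing tools do. It is a sketch of the technique, not a replacement for those tools.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy target: the first byte is the payload length, the rest is the payload."""
    if not data:
        raise ValueError("empty input")  # documented, expected rejection
    length = data[0]
    payload = data[1:]
    # Planted bug: the length byte is trusted without checking the payload size,
    # so a length larger than the payload raises IndexError.
    return bytes(payload[i] for i in range(length))


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one to three random byte-level mutations to a valid seed input."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 3)):
        choice = rng.randrange(3)
        if choice == 0 and data:
            data[rng.randrange(len(data))] = rng.randrange(256)  # overwrite a byte
        elif choice == 1:
            data.insert(rng.randint(0, len(data)), rng.randrange(256))  # insert a byte
        elif data:
            del data[rng.randrange(len(data))]  # delete a byte
    return bytes(data)


def fuzz(target, seed: bytes, iterations: int = 1000, rng_seed: int = 0):
    """Run the target on mutated inputs and collect unexpected exceptions."""
    rng = random.Random(rng_seed)  # fixed seed so a run can be reproduced
    crashes = []
    for _ in range(iterations):
        sample = mutate(seed, rng)
        try:
            target(sample)
        except ValueError:
            pass  # the parser rejected the input cleanly
        except Exception as exc:  # anything else is a finding to investigate
            crashes.append((sample, repr(exc)))
    return crashes


crashes = fuzz(parse_record, seed=b"\x03abc", iterations=500)
```

Each entry in crashes pairs the input that triggered the failure with the exception it raised, which covers the "document" step: the saved input becomes a regression test once the bug is fixed.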

 

5. Automated testing (SAST, DAST, IAST)

 

What is it?

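There are a variety of automated security testing techniques that can help your DevSecOps team build security into the CI/CD pipeline. A few examples include static application security testing (SAST), dynamic application security testing (DAST), and interactive application security testing (IAST).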

 

Why it’s important

Certain DevSecOps tools can catch bugs you weren’t anticipating in your software applications. Automated security checks are usually worked into the DevOps pipeline, and become an integral part of application development and delivery.

 

Pro tip

Most organizations (90%) use open-source software in some way. As a result, it’s important to incorporate software composition analysis (SCA) into your automated testing routine. These tools can scan all of your open-source components and dependencies for vulnerabilities. Open-source vulnerabilities grew by 30% in 2022. In fact, many major issues, such as the infamous Log4j security vulnerability, could have been avoided by updating open-source software and dependencies regularly.
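
At its core, an SCA check compares the versions you have installed against the versions in which known vulnerabilities were fixed. The sketch below shows that comparison. The package names and the advisory table are hypothetical placeholders; a real SCA tool pulls advisories from a maintained vulnerability database and handles far more complex version schemes than plain dotted numbers.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.14.1' into (2, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical advisory data: package name -> first version containing the fix.
ADVISORIES = {
    "example-logging": "2.15.0",
    "example-http": "1.4.2",
}


def vulnerable(dependencies: dict[str, str], advisories: dict[str, str]) -> dict[str, str]:
    """Return each outdated package with the version it should be upgraded to."""
    findings = {}
    for name, installed in dependencies.items():
        fixed_in = advisories.get(name)
        if fixed_in and parse_version(installed) < parse_version(fixed_in):
            findings[name] = fixed_in
    return findings


# Hypothetical pinned dependencies for one application.
installed = {"example-logging": "2.14.1", "example-http": "1.4.2", "other-lib": "0.1.0"}
outdated = vulnerable(installed, ADVISORIES)
```

Comparing tuples of integers matters here: as plain strings, "2.9.0" would sort after "2.15.0" and an outdated package could slip through.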

 

Automated Testing (SAST, DAST, IAST) DevSecOps Checklist

  • Integrate Testing into CI/CD Pipeline
    • Incorporate SAST for early-stage code analysis before compilation.
    • Use DAST to simulate real-world attacks on running applications.
    • Leverage IAST to monitor application behavior in real-time during testing.
  • Expand Coverage with Specialized Tools
    • Add Mobile Application Security Testing (MAST) for mobile apps.
    • Use tools designed for pentesting or static/dynamic analysis tailored to mobile environments.
  • Conduct Software Composition Analysis (SCA)
    • Regularly scan open-source dependencies and libraries for vulnerabilities.
    • Update outdated or vulnerable components to mitigate risks.
  • Automate and Schedule Tests
    • Automate tests to run at every build and deployment phase.
    • Schedule periodic tests to ensure consistent application security over time.
  • Analyze Results and Address Issues
    • Review results from SAST, DAST, IAST, and SCA tools.
    • Prioritize and remediate identified vulnerabilities promptly.
    • Document fixes and verify through re-testing.
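
For a sense of how static analysis works under the hood, the sketch below uses Python's built-in ast module to flag direct calls to eval and exec in source code without running it. The rule set is a two-item assumption chosen for illustration; commercial and open-source SAST tools apply hundreds of rules and track how data flows through the program.

```python
import ast

# Assumed rule set for this sketch: built-ins that execute arbitrary code.
RISKY_CALLS = {"eval", "exec"}


def scan_source(source: str, filename: str = "<string>"):
    """Return (line number, function name) for each risky call found."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


# Hypothetical snippet standing in for a file checked during a build.
sample = "x = 1\ny = eval(input())\nprint(x)\n"
issues = scan_source(sample)
```

In a CI/CD pipeline, a wrapper script could exit with a non-zero status whenever issues is non-empty, which fails the build and forces the finding to be reviewed before deployment.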

 

Embracing a DevSecOps approach

A DevSecOps approach that involves secure platform design, automation, and culture changes can ensure that security becomes a shared responsibility throughout the organization. This approach is more important than ever before, given that most teams have relatively lean security resources. As a result, DevOps teams must take ownership of security using some of the techniques described above, protecting their organizations and ensuring compliance.

The good news is that a variety of tools that already exist within the DevOps toolchain can help with security use cases. For example, centralizing logs within an analytics solution like ChaosSearch can help create a security data lake using existing cloud object storage resources, such as Amazon S3 or Google Cloud Storage. A holistic view into logs with unlimited retention can help provide faster threat detection and incident response capabilities to DevSecOps teams.

Download the full 2025 DevSecOps checklist here.