
Instrument and monitor Boomi integration flows with OpenTelemetry and Datadog

Boomi is an Integration Platform as a Service (iPaaS) used by thousands of organizations to connect applications, data, and workflows across cloud and on-premises environments. Business-critical processes, from order fulfillment pipelines to customer data synchronization, depend on Boomi Atoms and Molecules running reliably.

Not all index scans are equal: How we cut query latency by over 99%

When engineers investigate SQL queries, they normally think of index scans as a fast and efficient step in the query’s execution plan. When executed correctly, index scans fetch only the relevant rows from a table, whereas sequential scans read the entire table, so they reduce latency and query costs. However, just because an execution plan uses an index scan doesn’t mean that the scan is fast.
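
To illustrate the difference between a sequential scan and an index scan, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite is only a stand-in for whatever database the post examines, and the `orders` table, the `idx_orders_customer` index, and the `query_plan` helper are hypothetical names made up for this example. The sketch prints the plan for the same query before and after an index is created.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); the detail text is last.
    return [row[3] for row in rows]


# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index on customer_id, SQLite has to scan the whole table.
plan_without_index = query_plan(conn, sql, (42,))

# With an index, the planner can search the index for matching rows instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_with_index = query_plan(conn, sql, (42,))

print("before index:", plan_without_index)
print("after index: ", plan_with_index)
```

The exact plan wording varies between SQLite versions, and other databases report plans differently. A plan that mentions an index also says nothing about how many index entries the scan has to read, so it does not guarantee a fast query.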

Platform engineering metrics: What to measure and what to ignore

Platform engineering teams have access to hundreds of metrics, yet over 40% of platform initiatives cannot demonstrate measurable value within the first year. Teams that cannot quantify their impact fail to obtain executive sponsorship, risk being defunded, and ultimately face deprecation. To accurately calculate a platform’s ROI, platform engineering teams need to differentiate between signals that measure platform effectiveness and those that should be used solely for investigative purposes.

Integrate Recorded Future threat intelligence with Datadog Cloud SIEM

Recorded Future provides real-time threat intelligence about indicators of compromise (IOCs), including malicious IP addresses, domains, and vulnerabilities. It also adds context on threat actors and campaigns to help security teams understand which signals represent real risk and prioritize their responses accordingly.

Operating agentic AI with Amazon Bedrock AgentCore and Datadog LLM Observability: Lessons from NTT DATA

This guest blog post is by Tohn Furutani, SRE Engineer at NTT DATA. Over the past year, the conversation around generative AI has shifted from single-shot use cases—such as summarization, Q&A, and chat interfaces—to agentic AI systems that can make decisions based on context, plan multistep actions, invoke tools, and adapt as conditions change.

Capture and analyze custom heatmaps in Session Replay

Datadog Session Replay heatmaps track where users click, scroll, and engage across your web pages. Each heatmap is overlaid on a screenshot of the page, and that background determines what you can actually analyze. But getting the right screenshot can be tricky. Many UI states are dynamic, rare, or simply impossible to capture from replays, so heatmaps can end up showing the wrong view.

Monitor ClickHouse query performance with Datadog Database Monitoring

ClickHouse is widely used for large-scale analytics, but once it is running in production, it can be difficult to understand how query activity translates into resource usage. Engineers investigating performance issues often struggle to determine which queries consume the most memory, run most frequently, or cause spikes in load. In practice, engineers are left querying system.query_log, tailing server logs, and piecing together information after an incident.

How we designed empathetic alert sounds for on-call engineers

Being on call is an essential part of operating reliable distributed systems, but it comes with real human costs such as alert fatigue, sudden wakeups in the middle of the night, and the ongoing anxiety of what the next notification might bring. Many engineers know the feeling: Your phone lights up, a sound cuts through the silence, and your heart rate spikes before you’re even fully awake.

Search and act across Datadog to resolve issues faster with Bits Assistant

Finding the right information across dashboards, monitors, and telemetry sources takes time, even for experienced engineers. When something breaks, it often means figuring out where to start, rebuilding queries, and jumping between metrics, logs, and traces before you can take action. The challenge isn’t a lack of data but the effort required to surface the right information at the right moment.

Understand session replays faster with AI summaries and smart chapters

Datadog Session Replay gives teams a video-like view of what real users experienced in their applications. Engineers rely on replays to connect errors and slowdowns to actual user behavior, while product managers use them to understand friction and improve critical flows. But finding the right replay and the right moment often means manually scanning long sessions without knowing whether they contain relevant signals.