
Test network paths with TCP, UDP, and ICMP in Datadog

When developers and SREs design application tests, they often prioritize user workflows and API availability. Extending that suite with network tests that match your app's traffic protocols can reveal whether issues originate in the network or application layer. In this post, we'll explore how you can design effective network tests using the Transmission Control Protocol (TCP), User Datagram Protocol (UDP), or Internet Control Message Protocol (ICMP).
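To make the protocol distinction concrete, here is a minimal sketch, outside of Datadog, of what a TCP check versus a UDP check looks like at the socket level. The function names and semantics are illustrative assumptions, not Datadog's implementation; note that ICMP checks require raw sockets (and usually elevated privileges), so they are typically delegated to the system `ping` utility or a managed test runner.

```python
import socket

def check_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """A TCP check is unambiguous: the three-way handshake either completes or it doesn't."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_udp(host: str, port: int, payload: bytes = b"probe", timeout: float = 1.0) -> bool:
    """A UDP check is weaker: UDP is connectionless, so the only hard failure signal
    is an ICMP port-unreachable error surfacing as an OSError. Silence is ambiguous."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            sock.recvfrom(1024)  # many UDP services never reply to arbitrary probes
            return True
        except socket.timeout:
            return True   # no error came back; host may still be unreachable
        except OSError:
            return False  # e.g., ICMP port unreachable was delivered
```

The asymmetry in the two functions is exactly why matching the test protocol to the application's traffic matters: a TCP check against a UDP-only service (or vice versa) can report failure for a perfectly healthy path.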

The product signal latency gap slowing your growth

Organizations often call product managers the CEOs of the product. But PMs know that's a myth. When a CEO wants a status report, they get one immediately. They don't need to negotiate for engineering time, reconcile conflicting project priorities, or wait for a data scientist to find a gap in their schedule. For most PMs, simply getting a current picture of the product takes days of waiting, and that latency is where growth stalls.

Turn developer feedback into operational insight with Datadog Forms and Sheets

Engineering organizations rely heavily on developer feedback to improve internal platforms, tooling, and processes. However, that feedback is often scattered across disconnected systems such as external forms, spreadsheets, chat threads, and documentation tools. Because these systems are separate from operational data, teams struggle to correlate developer sentiment with measurable performance or reliability outcomes.

Identify and fix code issues faster with Datadog's Azure DevOps Source Code integration

Developers and SREs who rely on Microsoft Azure DevOps often face fragmented workflows when investigating issues or reviewing code quality. Troubleshooting an error can require jumping between observability tools and source code repositories as you manually connect traces, stack frames, and commits. At the same time, security vulnerabilities, misconfigurations, and flaky tests may go undetected until later stages of the software delivery life cycle (SDLC), where they are more costly to fix.

Bringing observability data hosting to the UK on AWS

UK organizations are increasingly required to design systems that account for data residency requirements, ensuring that operational data remains within national boundaries. Many teams already run their applications on AWS infrastructure in the UK, but telemetry data can still be processed outside the region, creating gaps in visibility. Datadog’s upcoming UK availability zone solves this by keeping telemetry data in the same region as the workloads that generate it.

Centralize observability management with Datadog Governance Console

As organizations grow, they face increasing difficulty in managing their observability efforts. More teams mean more dashboards, monitors, API keys, pipelines, and custom configurations. Without a centralized view, administrators spend hours chasing down untagged resources, investigating surprise bills, and revoking dormant credentials. Governance becomes a reactive effort to reduce waste and address issues, falling short of its potential to proactively create standards and optimize observability.

Every team should be A/B testing

Technical teams want to know the newest, most cutting-edge tools they can implement to give themselves a competitive advantage, whether it's the latest developer framework or modern CI/CD practices that boost velocity. But there's one tool dating all the way back to the 1920s that can improve any organization, no matter its scale: the randomized controlled trial, or, simply put, experiments.
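The core of a randomized controlled trial is straightforward to compute. As a minimal sketch (the function name and the example numbers are illustrative, not drawn from the post), here is a two-proportion z-test, the standard way to judge whether variant B's conversion rate genuinely beats variant A's or the gap is just noise:

```python
from math import sqrt, erf

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in conversion rates, using pooled variance."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pool the rates under the null hypothesis that A and B convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

For example, 200 conversions out of 1,000 users on A versus 250 out of 1,000 on B yields a p-value under 0.01, so the uplift is unlikely to be chance; randomization is what makes that inference valid, because it rules out systematic differences between the two groups.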

Manage service tracing across hosts with Single Step Instrumentation rules

Single Step Instrumentation (SSI) simplifies Datadog Application Performance Monitoring (APM) by automatically discovering and instrumenting services across a host. For many teams, SSI is the ideal starting point because it helps them achieve full visibility with minimal setup. However, as environments grow, teams often want more control over which services get traced. Auxiliary workloads such as batch jobs and cron tasks might not require distributed tracing.

Route OTel data from AI apps to ClickHouse and Datadog using Observability Pipelines

As organizations continue to heavily invest in AI and build more agentic workflows, their telemetry data volumes can surge quickly, and the associated costs can become unpredictable. To regain control of their data, many AI-forward teams are turning to high-throughput, low-latency pipelines, collecting telemetry with OpenTelemetry (OTel) and routing it to destinations such as ClickHouse. But these self-hosted solutions come with drawbacks.

Offline evaluation for AI agents: Best practices

If you’re building LLM-powered applications and agents, you’ve probably asked yourself: “How do I know if my changes actually made things better?” You can tweak prompts, adjust temperature settings, or try different models, but it’s not always easy to validate whether version B’s response is better than version A’s. Most teams fly blind in preproduction and rely on user feedback to see how well their application works in the real world.