
The latest news and information on observability for complex systems and related technologies.

Observability for the Agent Era: Day 2 | Launches

Honeycomb's Innovation Week: Observability for the Agent Era (May 12-14). For Day 2 of Innovation Week, Honeycomb's product and engineering teams will take you inside the new capabilities purpose-built for the agent era. Expect live demos, real scenarios, and a hands-on look at what it means to own observability for the agentic era, with AI in Honeycomb to observe AI in production. A 3-day virtual event for teams building the future. May 12: Get insights on how the best engineering teams are tackling the challenges of the agentic era.

Honeycomb Innovation Week: Debugging Agentic Workflows with Ken Rimple

Canvas skills are how your team's runbooks and tribal knowledge become an active part of the investigation instead of a document someone has to remember to open. Pre-built skills cover the most common investigation patterns out of the box. Custom skills let you encode the specific context, thresholds, and decision logic your team has accumulated, so every auto-investigation starts with your best thinking already applied.

Innovation Week Day 1: The SDLC Is Collapsing, and Observability Has Never Mattered More

The software development lifecycle is collapsing. The multi-stage pipeline that defined how software got built and shipped for decades is compressing into rapid loops of intent and validation, with agents now part of the teams building and running it. Day 1 of Innovation Week was about what that shift means for how software gets validated, where observability fits, and the problems that have always been hard but are now genuinely urgent.

Security Integrations in Observability Self-Hosted

Integrating security data with observability data provides a comprehensive view for better threat detection and response. Security observability helps connect the dots between seemingly innocent events that, when correlated, reveal complex attack patterns. SolarWinds security products integrate into Observability Self-Hosted, including Security Event Manager for log data and event correlation, Access Rights Management for identifying potential attack vectors, configuration management for compliance monitoring, and Patch Manager for tracking critical updates.

Turn Noisy Logs Into Structured Data with Uptrace Grouping Rules

Same log pattern. Hundreds of useless groups. In this video, we show how to use Uptrace Grouping Rules to automatically turn noisy logs into structured, searchable data, without changing application code. Perfect for OpenTelemetry users, backend engineers, SREs, and anyone dealing with noisy logs.
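To make the idea of log grouping concrete, here is a minimal Python sketch of the general technique: replacing the variable parts of a message with placeholders so that messages with the same shape fall into one group. This is not Uptrace's rule syntax or API. The `RULES` list, the placeholder names, and the sample messages are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical rules (not Uptrace syntax): each (pattern, placeholder) pair
# replaces one kind of variable value with a stable token.
# Order matters: specific patterns run before the generic number rule.
RULES = [
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b"), "<uuid>"),
    (re.compile(r"\b\d+(?:\.\d+)?ms\b"), "<duration>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def group_key(message: str) -> str:
    """Return the message with variable parts replaced by placeholders."""
    for pattern, placeholder in RULES:
        message = pattern.sub(placeholder, message)
    return message


# Made-up sample messages: four raw lines, two underlying patterns.
logs = [
    "user 1042 logged in after 35ms",
    "user 7 logged in after 120.5ms",
    "order 3f2b8c1e-9a4d-4c6b-8e2f-1a2b3c4d5e6f failed",
    "order 0a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d failed",
]

# Count how many raw messages land in each group.
groups = Counter(group_key(m) for m in logs)
```

Without normalization, each of the four messages would form its own group. With it, they collapse into two.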

Making Semantic Conventions Work for You With OpenTelemetry Weaver

Your dataset has hundreds of attributes. Some are self-explanatory: http.response.status_code, server.address. Others are not: meta.refinery.reason, dataset.slug, sli.latency_target_ms. If you don't know what an attribute means, you can't write a good query. And if an AI agent doesn't know what it means, it guesses.
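One way to picture the problem is a small Python sketch that checks which attributes in a dataset have a written description and which do not. This does not use OpenTelemetry Weaver or any Honeycomb API. The `REGISTRY` dictionary and the `undocumented` helper are hypothetical, and the custom attributes are deliberately left without descriptions because their meanings are not given here.

```python
# Hypothetical in-memory registry: attribute name -> human-readable description.
# Only the two standard OpenTelemetry attributes are described.
REGISTRY = {
    "http.response.status_code": "HTTP response status code.",
    "server.address": "Server domain name, IP address, or Unix domain socket name.",
}


def undocumented(attributes, registry):
    """Return the attribute names that have no description, sorted by name."""
    return sorted(name for name in attributes if name not in registry)


# Attribute names taken from the text above.
seen = [
    "http.response.status_code",
    "server.address",
    "meta.refinery.reason",
    "dataset.slug",
    "sli.latency_target_ms",
]

# Anything listed here is an attribute a person or an agent would have to guess at.
missing = undocumented(seen, REGISTRY)
```

A check like this could run whenever new attributes appear, so gaps in the documentation surface before someone has to write a query against them.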

Why Alert Fatigue Solutions Still Miss the Root Cause

Alert fatigue solutions have never been better, but on-call engineers are still burning out. Threshold tuning, AI triage, and alert correlation reduce the noise, but every alert that clears filtering lands with the same incomplete telemetry and triggers the same manual investigation cycle. This post explains why the evidence gap survives every fix, and how runtime context changes that.

Multi-tiered Observability: A Practical Way to Handle Diverse Workloads

Observability in large companies is rarely one-size-fits-all. The VictoriaMetrics topologies guide shows why different deployment patterns are needed as scale, isolation, and reliability requirements grow. Different workloads require different trade-offs: some need long retention for audits and trend analysis, while others need higher resolution for debugging. Business-critical systems also demand dependable alerting and high availability, often with several 9s of reliability.

Observability and Security for the AI Era

Datadog has always been driven by a broader vision of helping teams understand and operate complex systems. In this session, you’ll hear from Michael Whetten, Product SVP, and Abrar Hussain, Senior Director, Product Management, as they share the latest updates across the Datadog product suite and discuss how that vision continues to shape the platform’s evolution and support the next generation of AI-driven applications.