
KubeCon + CloudNativeCon EU 2026: What We Learned About AI, Observability, and Fast Feedback Loops

Honeycomb was excited to attend KubeCon + CloudNativeCon Europe, where one theme stood out across sessions: as AI reshapes how software is built and run, teams are being pushed to rethink how they understand their systems. Without strong observability and feedback loops, AI can accelerate confusion, misalignment, and operational risk.

The Business Case for AI-Driven Observability in Network Operations

Modern network operations generate an extraordinary amount of telemetry. Metrics, logs, events, topology data, cloud signals, and service context all contribute to a richer picture of system behavior. As environments expand across cloud, data center, edge, and SaaS, the opportunity for operations teams is clear: when that telemetry is unified and understood in context, it becomes a powerful source of resilience, efficiency, and business insight.

Profiling Java apps: breaking things to prove it works

Coroot already does eBPF-based CPU profiling for Java. It catches CPU hotspots well, but that's all it can do. Every time we looked at a GC pressure issue or a latency spike caused by lock contention, we could see something was wrong but not what. We wanted memory allocation and lock contention profiling. So we decided to add async-profiler support to coroot-node-agent. The goal: memory allocation and lock contention profiles for any HotSpot JVM, with zero code changes. Here's how we got there.

AI Needs Better Inputs: Why Observability Is Becoming the Foundation of Enterprise AI Maturity

Organizations across industries are accelerating their investments in AI for operations, yet the path to meaningful impact is proving far more complex than early expectations suggested. Analysts at Gartner, Forrester, Deloitte, and McKinsey continue to highlight the same structural barrier. AI cannot produce accurate predictions or safe automation when the operational data feeding it is fragmented, incomplete, or inconsistent.

Accelerate Your OpenTelemetry Migrations With Honeycomb's Agent Skills

Since releasing our hosted MCP server last year, we've been thrilled to see customers not just adopt it but build Honeycomb deeply into their agentic development and observability workflows. Users have embraced it, leveraging Honeycomb to stay in conversation with their code and understand how it runs in production.

The Observability Gap: Why Monitoring Data Should Drive Tests

Most teams already know a lot about production. They have dashboards. They have traces. They have alerts. They have enough telemetry to explain what happened after an incident and enough graphs to argue about it for the rest of the week. Then they go to test a change and start from scratch. The integration tests hit a hand-written mock that returns {"status": "ok"}. The load tests replay a CSV somebody exported months ago. Staging is close enough to production right up until it matters.
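As a rough illustration of the idea, here is a minimal Python sketch that derives test fixtures from observed responses instead of a hand-written mock. Everything in it is an assumption made for the example: the record fields (route, status, body), the sample data, and the helper names build_fixtures and status_mix are hypothetical and do not come from the article or from any particular tool.

```python
import json
from collections import Counter

# Hypothetical sample of response records exported from production telemetry.
# The field names ("route", "status", "body") are assumptions for illustration.
OBSERVED = [
    {"route": "/orders", "status": 200, "body": {"status": "ok", "items": 3}},
    {"route": "/orders", "status": 200, "body": {"status": "ok", "items": 0}},
    {"route": "/orders", "status": 503, "body": {"error": "upstream timeout"}},
    {"route": "/orders", "status": 200, "body": {"status": "ok", "items": 12}},
]


def build_fixtures(records, route):
    """Return one representative response body per status code seen for a route."""
    fixtures = {}
    for rec in records:
        if rec["route"] != route:
            continue
        # Keep the first observed body for each distinct status code.
        fixtures.setdefault(rec["status"], rec["body"])
    return fixtures


def status_mix(records, route):
    """Return the observed share of each status code for a route."""
    counts = Counter(r["status"] for r in records if r["route"] == route)
    total = sum(counts.values())
    return {status: n / total for status, n in counts.items()}


fixtures = build_fixtures(OBSERVED, "/orders")
mix = status_mix(OBSERVED, "/orders")

# The fixtures now include the failure shape the hand-written mock never returned.
print(json.dumps(fixtures, sort_keys=True))
print(mix)
```

A mock built this way returns the error body as well as the happy path, and the status mix gives a load test a starting ratio of successes to failures that reflects what was actually observed.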

Observability Is Now a Boardroom Priority Even If Nobody Wants to Say It Out Loud

Executives rarely state the full truth publicly, but inside boardrooms the conversation has changed. Observability, once viewed as a technical capability deep within operations, has become a strategic requirement for understanding business performance. Leaders may not always use the term itself, yet they focus intensely on the outcomes it promises. Their environments have grown too fast, too fragmented, and too interdependent for traditional visibility approaches to keep pace.

Scary Things Happen in Production. Context Helps You Find Them.

Production is a rowdy place of chaos, especially at scale. When you have millions of requests per second flowing through your system, weird things are always happening. Outliers, unusual request patterns, spikes and pulses of traffic from unknown sources, port scanning…it’s all there. To the naked eye, it looks like noise. If you know what you are looking for…patterns emerge. The night sky: every dot is a request. Without intent, it's an undifferentiated field of light.

How a Runtime Aware AI SRE Agent Transforms System Reliability

A runtime-aware AI SRE extends existing AI SRE approaches by moving beyond telemetry correlation into runtime-validated reliability. While the majority of AI SRE tools accelerate incident triage using logs, metrics, and traces, they cannot confirm execution behavior if critical runtime signals were never captured. By generating on-demand evidence inside running services, AI SREs can eliminate slow redeploy cycles, ensuring your distributed systems remain resilient under real-world traffic conditions.