Streaming Video Monitoring: How to Detect Playback Issues Before Viewers Leave

Video is the single largest driver of internet traffic worldwide. According to the Sandvine Global Internet Phenomena Report, video accounts for 65% of all internet traffic, with on-demand streaming alone consuming over half of all downstream bandwidth on fixed networks. In the United States, households spend nearly five hours per day streaming content, and 94.6% of internet users worldwide watch online video monthly.

The Business Case for AI-Driven Observability in Network Operations

Modern network operations generate an extraordinary amount of telemetry. Metrics, logs, events, topology data, cloud signals, and service context all contribute to a richer picture of system behavior. As environments expand across cloud, data center, edge, and SaaS, the opportunity for operations teams is clear: when that telemetry is unified and understood in context, it becomes a powerful source of resilience, efficiency, and business insight.

KubeCon + CloudNativeCon EU 2026: What We Learned About AI, Observability, and Fast Feedback Loops

Honeycomb was excited to attend KubeCon + CloudNativeCon Europe, where one theme stood out across sessions: as AI reshapes how software is built and run, teams are being pushed to rethink how they understand their systems. Without strong observability and feedback loops, AI can accelerate confusion, misalignment, and operational risk.

The Hidden Cost of Separate Monitoring and On-Call Tools

Most engineering teams I talk to run at least two or three separate tools for monitoring, on-call, and status pages. UptimeRobot or Pingdom watches the services. PagerDuty pages the on-call engineer. Statuspage.io tells customers what is happening. The dollar cost of this stack is easy to calculate. The hidden costs are harder to see, and they add up faster than the subscription fees.

Conversations: Ask Netdata About Anything You're Looking At

Netdata AI can already troubleshoot your alerts and generate Insights reports. What it couldn’t do, until now, was have a back-and-forth conversation. You could get a one-shot analysis, but you couldn’t ask follow-up questions, pull in additional context, or go from a quick question to a full investigation without starting over. We’ve added a conversational layer to Netdata AI.

Understand session replays faster with AI summaries and smart chapters

Datadog Session Replay gives teams a video-like view of what real users experienced in their applications. Engineers rely on replays to connect errors and slowdowns to actual user behavior, while product managers use them to understand friction and improve critical flows. But finding the right replay and the right moment often means manually scanning long sessions without knowing whether they contain relevant signals.

Search and act across Datadog to resolve issues faster with Bits Assistant

Finding the right information across dashboards, monitors, and telemetry sources takes time, even for experienced engineers. When something breaks, it often means figuring out where to start, rebuilding queries, and jumping between metrics, logs, and traces before you can take action. The challenge isn’t a lack of data but the effort required to surface the right information at the right moment.

How we designed empathetic alert sounds for on-call engineers

Being on call is an essential part of operating reliable distributed systems, but it comes with real human costs such as alert fatigue, sudden wakeups in the middle of the night, and the ongoing anxiety of what the next notification might bring. Many engineers know the feeling: Your phone lights up, a sound cuts through the silence, and your heart rate spikes before you’re even fully awake.

Monitor ClickHouse query performance with Datadog Database Monitoring

ClickHouse is widely used for large-scale analytics, but once it is running in production, it can be difficult to understand how query activity translates into resource usage. Engineers investigating performance issues often struggle to determine which queries consume the most memory, run most frequently, or cause spikes in load. In practice, they are left querying system.query_log, tailing server logs, and piecing together information after an incident.

What's New in InfluxDB 3.9: More Operational Control and a New Performance Preview

We’ve spent the last few months listening to how teams are running InfluxDB 3 in the wild. The feedback was clear: as you scale, you need less “guesswork” and more control. Today’s release of InfluxDB 3.9 is our answer to that. As more teams move InfluxDB 3 into production, our focus has shifted toward the operational experience: how you manage the database at scale, how you ensure it remains secure, and how you provide a seamless experience for users.