Sponsored Post

Cost Control in SAP BTP: The Critical Need for Automation

The cloud is the cheapest processing you can buy... until you get the bill! Unfortunately, cloud service costs are notoriously opaque when it comes to transactional and operational costs. The result can be unexpected bills and even damage to the ROI of cloud programs. SAP BTP is no exception, but it doesn't have to be this way. Good FinOps discipline is readily available for BTP, and beyond avoiding "bill shock," such monitoring is simply good operational hygiene that preserves budget and resources for productive investment.

Sponsored Post

"Proactive Insights for a Reactive World": What Makes Collective IQ Different for Business Leaders

From a business executive's perspective, the core question is not how many metrics a tool collects, but how clearly it connects technology to business productivity, cost, and risk. Dave Wagner summarizes this nicely: "if you're a business leader, what's really powerful about Collective IQ is it's not just technology metrics, it's productivity metrics."

Introducing Seer Agent: The answer is already in Sentry. Now you can ask for it.

This is a story about an engineer’s night that could have been bad, but ended up… not so bad. A few weeks ago, on a Saturday, our AI debugger, Seer, started failing. Note the big scary spike on the right. The errors were generic failures from the LLM calls, nothing that pointed at a root cause. Most of the team wasn’t scheduled to be on call that weekend, but it just so happened that Indragie, our Head of AI, was online. He started paging engineers.

Misconfigured Alert Detection: Find the Alerts That Need Tuning

Netdata ships with hundreds of stock alerts. They cover a wide range of infrastructure conditions and they’re designed with sensible defaults. But “sensible defaults” and “correct for your environment” are not the same thing. A CPU threshold that’s perfectly reasonable for a build server might generate constant noise on a machine running batch jobs.

What "AI-Ready Data" actually means for observability teams

Many organizations deploying AI are learning similar lessons right now: the challenge isn’t this or that AI model, it’s the data. According to Gartner, organizations will abandon 60% of AI projects that are not supported by AI-ready data. In addition, 63% of organizations either lack the right data management practices to get there or aren’t sure whether they have them.

Apache ActiveMQ High Availability Architecture: The Complete 2026 Guide

The most common Apache ActiveMQ high availability mistake is not a configuration error; it is a false assumption. Teams deploy two broker instances, point clients at both with a comma-separated URL, and label the topology "HA." Then the primary crashes, the secondary does not have the message state, and clients start throwing exceptions while the ops team scrambles.

How to Exclude Health Check Endpoints from Python OTel Traces

Health check endpoints generate thousands of identical, useless spans per day. Here are two production-ready approaches to filter them from your Python OTel traces — and the correctness trap most implementations miss. Prathamesh works as an evangelist at Last9, runs SRE stories - where SRE and DevOps folks share their stories, and maintains o11y.wiki - a glossary of all terms related to observability.

last9-genai: Closing the Conversation Gap in LLM Observability

OpenTelemetry's GenAI instrumentation gives you spans and token counts. It does not give you conversations, workflow cost rollups, or prompts visible in your dashboard. last9-genai is an OTel extension that fills those three gaps — without replacing your existing observability stack. Prathamesh works as an evangelist at Last9, runs SRE stories - where SRE and DevOps folks share their stories, and maintains o11y.wiki - a glossary of all terms related to observability.

State of Observability in Financial Services 2026: From implementation to business impact

The demands on financial services companies are intensifying rapidly. They must not only deliver seamless system performance but also control costs, secure sensitive data, and maximize the value of their observability investments. To navigate these converging pressures, leaders are evolving their approach to system monitoring and telemetry. The 2026 State of Observability in Financial Services research report reveals a fundamental shift in how organizations manage their digital infrastructure.