The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Navigating the Middleware Maze: How meshIQ 12.1 Redefines Scale and Simplicity with Agentic AI

meshIQ v12.1 transforms middleware management with petabyte-scale data processing and agentic AI. The new intelligent launchpad, simplified onboarding, and context-aware safeguards move teams from reactive monitoring to proactive, AI-driven operations across the enterprise.

Analyze cloud costs with flexible spreadsheets in Datadog Sheets

Cloud cost data is most useful when teams can adapt it to their own reporting and planning needs. In addition to viewing cost breakdowns, FinOps teams often need to calculate forecasts, reshape datasets, and present tailored views to finance and leadership teams. In many workflows, those steps happen outside the observability platform. Once the data is exported, it quickly becomes outdated and requires repeated manual updates.

Datadog for Government achieves FedRAMP High certification

Modern government missions depend on software platforms that can perform under demanding conditions. As agencies update systems that support public safety, benefits delivery, financial operations, and national priorities, they face security and compliance requirements that shape how technology is adopted as well as how it is built, operated, and evolved over time.

Elasticsearch 9.4 powers the next phase of the Elastic AI Ecosystem: Dell AI Data Platform with NVIDIA

AI is moving fast. Enterprise adoption needs to move with purpose. Over the past year, one thing has become clear: Organizations are not looking for more AI hype. They are looking for a path to production — one that connects infrastructure, data, and intelligence in a way that delivers real business value. That is exactly what the Elastic AI Ecosystem is built to do. At Elastic, we believe AI is only as powerful as the data foundation behind it. Great models matter.

Troubleshoot performance issues faster with the new Grafana Assistant integration for Database Observability

So your database is slow. Now what? Grafana Cloud Database Observability already gives you visibility into your SQL queries with RED metrics, individual execution samples, wait event breakdowns, table schemas, and visual explain plans. But visibility is just the starting point. You can see that a query's P99 latency spiked, but what should you do about it? You can see wait events like wait/synch/mutex/innodb firing, but what does that actually mean?

Introducing Application Metrics: Track the signal, see the spike, jump to the trace

A few weeks ago we had a bug with Session Replay. Replays were failing in some browsers once more than 1,000 video segments loaded. We had no idea how often it happened or who was hitting it, and because the failure didn’t always produce an error, we had no way to find affected users to reproduce it. Previously, we could have answered this with spans or logs, but that approach is clunky: spans are often sampled, so you can miss outliers, and logs are less structured and tend to change over time.

ActiveMQ JMS 2.0 Implementation Guide: Simplified API, Transactions & Spring

For most of JMS's lifetime, writing a simple producer required creating a ConnectionFactory, creating a Connection, starting it, creating a Session, creating a MessageProducer, creating a Message, calling send(), and then closing the producer, session, and connection, with the close calls wrapped in finally blocks to prevent resource leaks. Every developer knew the pattern. Every developer wrote it slightly differently. Every code review had the same comments about resource management.
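
The lifecycle described above can be sketched roughly as follows. This is a minimal illustration, not real JMS code: the Resource class, the sendLegacyStyle method, and the event log are hypothetical stand-ins so that the example runs without a broker or the JMS library. It only shows the order of the create, send, and close steps, and why the close calls sit in a finally block.

```java
import java.util.ArrayList;
import java.util.List;

public class LegacyProducerSketch {

    // Hypothetical stand-in for a closeable messaging resource; NOT a JMS type.
    static final class Resource {
        private final String name;
        private final List<String> log;

        Resource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("create " + name);
        }

        void close() {
            log.add("close " + name);
        }
    }

    // Runs the create/send/close sequence and returns the ordered event log.
    // failOnSend simulates an error thrown by the send step.
    static List<String> sendLegacyStyle(String text, boolean failOnSend) {
        List<String> log = new ArrayList<>();
        Resource connection = null;
        Resource session = null;
        Resource producer = null;
        try {
            log.add("create factory");
            connection = new Resource("connection", log);
            log.add("start connection");
            session = new Resource("session", log);
            producer = new Resource("producer", log);
            log.add("create message");
            if (failOnSend) {
                throw new IllegalStateException("simulated send failure");
            }
            log.add("send " + text);
        } catch (IllegalStateException e) {
            log.add("error " + e.getMessage());
        } finally {
            // Close in reverse order of creation; null checks cover early failures.
            if (producer != null) producer.close();
            if (session != null) session.close();
            if (connection != null) connection.close();
        }
        return log;
    }

    public static void main(String[] args) {
        // Print the event sequence for a successful send.
        sendLegacyStyle("hello", false).forEach(System.out::println);
    }
}
```

Even when the simulated send fails, the three close events are still recorded, which is the guarantee the finally block exists to provide. In real code each close call can itself throw, which is one reason the pattern tended to be written slightly differently by every developer.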