
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Tempo 2.10 release: new TraceQL features, LLM-optimized API responses, vParquet5, and more

Tempo 2.10 has arrived, delivering TraceQL enhancements, improved cardinality management for the metrics-generator, vParquet5, and more. The Tempo 2.10 release notes and changelog cover these features in more depth and list every change included in this release.

Debug PostgreSQL query latency faster with EXPLAIN ANALYZE in Datadog Database Monitoring

In PostgreSQL, the EXPLAIN ANALYZE statement gives you a detailed report of what actually happens when you execute a query. This kind of information is important for troubleshooting slow queries, but using EXPLAIN ANALYZE to collect this data is often challenging in a production environment. Datadog Database Monitoring now supports automatic collection of EXPLAIN ANALYZE plans for PostgreSQL, enabling you to easily capture execution details that help you troubleshoot slow queries.

Less code, faster builds, same telemetry: Turbopack support for the Next.js SDK

TL;DR - Turbopack became the default in Next.js, so we reworked our SDK to stop depending on bundlers. The result is less code, faster builds, and the same telemetry. This blog explains how we got there. You know the feeling when you spend years building tooling that supports something and all of a sudden that something becomes deprecated and you have to rethink your full approach?

IT as the Proving Ground for AI: Driving Enterprise Innovation

The Enterprise AI Survey, conducted by Digitate in collaboration with Sapio Research, revealed that IT operations have emerged as the primary proving ground for artificial intelligence in the enterprise. With 78% of organizations already deploying AI in IT, 65% identifying ITOps as the biggest AI beneficiary, and adoption outpacing every other function, IT leads enterprise AI maturity.

Getting Started with Splunk Dashboards

Splunk is a leading platform for searching, monitoring, and analyzing logs across IT tools and systems. Well-known for its ability to handle vast volumes of log and event data, Splunk empowers organizations to gain real-time visibility into their systems and operations. However, while Splunk offers rich telemetry and analytics, its dashboards can sometimes become complex - making it difficult to surface the most critical insights quickly. That’s where SquaredUp can elevate the experience.

How to Choose the Right API Monitoring Tool for Production Environments

APIs are no longer just technical connectors between systems; they are production infrastructure. Customer-facing applications, partner integrations, payment flows, and internal microservices all depend on APIs working correctly, consistently, and at scale. When an API fails, the impact is rarely limited to a single endpoint; it can disrupt user journeys, compromise revenue, and breach service-level agreements (SLAs).

6 Common Factors That Influence Fleet Safety Program Success

Building a safer fleet is not about one silver bullet. It is a set of practical choices that add up, day after day, until safer habits and smarter tools become the way you operate. This article breaks the work into six factors you can act on. Each one is designed to be simple to start, measurable to manage, and durable enough to last when operations get busy.

Now available: More monitor history

We’re excited to roll out an improvement many of you have been asking for: extended historical metrics for website and ping monitors. Until now, monitor metrics like availability, downtime, and response times were limited to the last 24 hours. While useful for short-term checks, this made it harder to spot trends, investigate intermittent issues, or understand long-term performance. That changes today.