Operations | Monitoring | ITSM | DevOps | Cloud

How to Choose an AI SRE Solution

The AI SRE landscape has exploded over the past year, with vendors racing to add artificial intelligence capabilities to their platforms. For engineering leaders evaluating these solutions, the sheer number of options can feel overwhelming. Some vendors are building AI-native solutions from scratch, while others are retrofitting AI onto existing workflows. Cloud providers are embedding agents into their ecosystems, and observability platforms are adding intelligence layers to their telemetry data.

Why Agentic AI Adoption Is Accelerating in Europe and What Comes Next

Across Europe, the cautious optimism business leaders held towards AI agents has evolved into more widespread enthusiasm. What was once a curiosity is now core to how many European organizations operate, respond, and innovate. According to PagerDuty’s latest agentic AI survey, three-quarters or more of organizations in France, Germany, and the UK are deploying multiple AI agents. This growing confidence reflects a broader trend.

Improve Kubernetes reliability faster with Gremlin and Dynatrace

It’s now easier than ever to start testing Kubernetes with Dynatrace and Gremlin. With a new strategic integration, Kubernetes services set up in Dynatrace are automatically discovered in Gremlin to make testing set up simple and fast. At a time when AI is driving massive expansions in infrastructure and dramatically increasing deployment speed, being able to set up and test new services quickly is more important than ever. ‍

BigPanda Acquires Velocity: Accelerating the Future of Agentic IT Operations

Today marks an exciting milestone for BigPanda and for the future of IT Operations. We’re thrilled to announce that BigPanda has acquired Velocity, an AI SRE company whose technology and team share our passion for transforming how enterprises keep the digital world running. Velocity brings deep expertise in Site Reliability Engineering (SRE) and major incident response, developed alongside some of the world’s most sophisticated technology organizations.

Splunk Developer Program

A short video that introduces the Splunk Developer Program, highlights the end-to-end support and tooling it offers, and showcases how developers can build, test, and grow impactful apps with confidence. The video will follow the journey of a first-time app builder who discovers the program, uses its resources, and becomes an active, recognized contributor in the Splunk community.

Why 2025 & Beyond is The Builders Era

The tech world loves buzzwords. We’ve lived through the Cloud Era, the Mobile Era, the AI Era (we’re still in that one, apparently). But 2025 marks something different. Something developers have been craving for years but couldn’t quite name. Welcome to The Builders Era. Not because of some shiny new framework or yet another platform promising to 10x your productivity. The Builders Era is happening because developers are done being spectators in their own craft.

How feedback loops power progressive software delivery

Modern engineering teams face competing priorities. Developers are expected to deliver new features faster than ever, but users expect rock-solid reliability with every release. Shipping quickly can feel like you’re gambling with user trust. If you move too fast, you risk outages, but if you move too slowly, innovation stalls.

Observability and FedRAMP in Action: The VA's Mission to Deliver Reliable Digital Service

Ensuring digital services remain accessible, reliable, and secure is a high priority for any organization operating at scale. For the Department of Veterans Affairs (VA), this focus is central to its mission of providing quality care to veterans, their families, and caregivers. Often described as “the largest IT shop in the United States,” the VA manages 2.7 million pieces of equipment across a vast network of interconnected systems.

Eliminate unnecessary costs in your Amazon S3 buckets with Datadog Storage Management

Cloud object storage powers a wide range of workloads, from AI training datasets to customer-facing media libraries. As your data grows into the petabyte scale, managing storage costs and ensuring reliability requires fine-grained visibility. You need answers to questions like: Which specific teams, services, workloads, or datasets are driving spend? Which data is cold and should be archived? What fixes will have the biggest impact on cost and performance?

MCP found a thankless bug faster than us, and it was actually fun

Once, when I was a very junior developer, I was discussing a bug with a very senior developer (let's call him Burt). Satisfied with the fix, I said something like "oh, that was a great bug". He looked at me as if his eyes were going to fall out of his head. Clearly, this enraged him. He briefly went off about how there are no great bugs, there are only bugs to squash – and that’s all.