Ep 41: The cost of not thinking: Who's responsible when AI agents get it wrong?

In this episode of Masters of Data, we get into the messier side of AI adoption, tackling questions like who actually owns the output when AI gets it wrong, and whether chasing efficiency is making us forget what it means to be human in the first place. We discuss tech CEOs proudly announcing they no longer think for themselves and debate whether AI is quietly eroding our critical thinking skills. We make the case that purpose-built, narrow AI is genuinely exciting, but that no efficiency gain is worth losing the human touch that makes work, connection, and creativity meaningful.

Apple's AI Challenge: Leadership Change Meets Strategic Pressure

Apple's anniversary year is marked not only by the symbolic results of the Tim Cook era but also by a strategic turnaround aimed at the company's primary challenge: its lag in artificial intelligence. On September 1, John Ternus will take over as CEO, while Cook moves to the role of Chairman of the Board, focusing on strategic and regulatory issues.

NVIDIA DCGM Collector: Deep GPU Monitoring for Data Center and AI Infrastructure

GPU infrastructure is expensive and increasingly central to production workloads. Whether you’re running ML training jobs, inference serving, video transcoding, or HPC workloads, understanding what your GPUs are actually doing, and what’s going wrong when performance degrades, is not optional.

This Month in Datadog - April 2026

In the latest episode of This Month in Datadog, Jeremy shares how to run autonomous Cloud SIEM investigations, remediate vulnerabilities with auto-generated fixes, and use natural language to explore Datadog. Later, Sumedha Mehta spotlights the Datadog MCP Server, which gives AI agents real-time access to Datadog’s observability data. Then, Chetan Sharma walks through Datadog Experiments, which measures how product changes impact the user journey.

AI Diagnostics in Kentik NMS (Network Monitoring System)

Network problems are easy to spot. Proving root cause is the hard part — and it’s where most of MTTR gets burned. Kentik’s new AI diagnostics in the Network Monitoring System (NMS) close the gap between detection and diagnosis by bringing three capabilities directly into Kentik AI Advisor.

AI Enablement for Dev Teams: The 6-Pillar Flywheel

AI adoption is already happening on your team, whether you have a strategy or not. Tracy Lee (CEO of This Dot Labs, Microsoft MVP, Google Developer Expert) breaks down the AI Enablement Flywheel — a 6-pillar framework used by successful engineering organizations to move from scattered experimentation to scalable, ROI-positive AI workflows.

AI Supply Chain Attacks Are Here. And Most Organizations Aren't Ready

When I read about the Vercel breach tied to a Context AI compromise, I wasn’t surprised. I’ve been talking with customers for a while now about how AI was going to introduce a new kind of supply chain risk. This is exactly what that looks like. What stands out to me is how familiar the pattern is. We saw it with open source, then again with SaaS, and again with cloud.

AI in Software Delivery: Engineering Excellence or Just Market Hype? | Harness Blog

AWS re:Invent 2025 made one thing very clear: enterprise interest in AI is no longer theoretical. The conversation has moved beyond curiosity. Teams are actively experimenting, leaders are looking for production-ready use cases, and engineering organizations are trying to figure out where AI can create real leverage across software delivery, security, platform engineering, and operations.

Accelerating MTTR with Faster Root Cause Diagnosis: AI Advisor Now Supports On-Demand Connectivity, Config Context, and Device Diagnostics

Knowing something is broken is easy. Figuring out why is hard. Introducing three new, native AI diagnostic capabilities in the Kentik Network Intelligence Platform to accelerate root cause analysis and keep your network running better.