AI is moving out of the experimental phase and into the everyday rhythm of work. Teams no longer use it only occasionally, for novelty or quick wins. Instead, they are exploring more robust use cases: investigating issues, answering questions faster, surfacing context, and moving through complex workflows with more confidence. That’s the shift that most organizations’ leadership teams have been asking for.
This blog post is part of PagerDuty’s ongoing series on how we’re helping customers navigate their journey towards autonomous operations. Read on to learn how PagerDuty Advance Scribe Agent updates (Generally Available) build towards this vision. When a major operational issue hits, there’s always someone drawing the short straw to take on the most thankless job in incident response: scribing the call. Chances are you have already been that someone.
I have an Upsun project that's nothing but proofs of concept. It's a dashboard, basically. Each POC gets its own tile. Click in, and you land on a page with three tabs. The first tab is a written explanation of what the POC argues. The second tab is the POC itself, with a built-in demo that automates a walkthrough of the feature so the recipient can watch it run without me on the call.
DNS monitoring should be simple. You want to know if something changed. You want to know if a record propagated. You want to know if a phishing site just went live with your brand name in the domain. But in practice it takes work. You log in to a dashboard. You click through menus. You run a check, copy the output, paste it somewhere else. You repeat that process every time someone on the team asks a question. AI assistants like Claude and ChatGPT could help.
How should IT leaders approach automation and AI? Where should they start, and how can they drive measurable results without getting caught up in the hype? In this episode of Agents of IT, Fran Fernandez and Zach Austin sit down with Chris Ellis, Senior Technology Solutions Specialist at RICOne, to discuss practical IT automation strategies, agentic AI, service desk transformation, and the journey toward autonomous operations.
Every quarter, the same scene plays out in boardrooms across the Fortune 500. The CEO asks: “What is the return on everything the company is spending on AI?” The CTO talks about productivity gains and developer velocity. The CFO points at a cloud bill that doubled but cannot isolate which line items are AI. The board nods politely and tables the discussion until next quarter, when the same question will produce the same non-answer. (If this sounds familiar, you are not alone. Keep reading.)
CloudZero originated as a way to make sense of your cloud costs. Costs spread across bills with billions of line items belonging to resources that might or might not have been tagged (or taggable), spun up by engineers working across teams, on different microservices, features, and products, that served a wide range of customers. Kubernetes. Multi-cloud. Check, check, check.
Your logs showed 500 errors. The traces showed the dependency graph. Neither showed the actual bug, a DEL control character getting appended to the query string. This is how I found it. In this video I walk through Speedscale BYOC (bring your own cloud): capture real production traffic, store it in your own Elasticsearch cluster inside your VPC, pull it down locally with a single script, and reproduce the exact bug using proxymock. The data never leaves your environment.
If you’re prepping for your first AI or MLOps interview, the hardest part isn’t always the hands-on element. For me, it’s the vocabulary. Interviewers sometimes lob single-word concepts at you (“what’s quantization?”) and watch how far you can carry the thread. The questions sound clear-cut, but each one is really a doorway into a bigger topic, and the interviewer is judging how cleanly you walk through it.