
The AI Workload Punishes Bad Habits

The AI workload presents the ultimate challenge, highlighting the structural limitations of the traditional hyperscaler model. In this segment from a Civo Navigate London 2025 session, Kelsey Hightower explains why AI adoption forces enterprises to confront flawed architecture and astronomically rising costs. When specialized hardware is scarce and rented GPUs sit idle at a premium, it’s clear that traditional cloud providers were not built for this era. Data that never moved is forcing organizations to move compute back to where that data lives.

MachineGPT: Speaking the Language of Machines to Shape the Future of AI

At .conf25, we took a bold step forward by introducing the concept of MachineGPT, which brings the power of generative AI to one of the most overlooked resources: machine data. MachineGPT speaks the language of machines. Just as ChatGPT learned the grammar of words and sentences to understand questions and respond in human language, MachineGPT can learn the hidden “grammar” of how systems behave through machine data.

The Dawn of the 10x Team

Previously, I wrote about how debugging, whether done by humans or AI-powered tools, depends on context. Without it, even the most capable systems can only tell you what code is broken, not why it broke. Now that AI can access the same depth of context developers rely on (stack traces, traces, logs, commits, and code), the way we build and operate software is changing. We’re moving from an era of monitoring to one of reasoning.

Automating Network Devices with NETCONF and YANG in Puppet Edge

This video offers a practical guide to automating network devices with Puppet Edge, using YANG data models and the NETCONF protocol. Gain the knowledge to streamline your network operations and enhance consistency. Perforce Puppet gives IT operations teams back their time and offers peace of mind with infrastructure automation that enables security and compliance.

Prioritize errors and create tickets using Rollbar's MCP Server

Production errors can feel overwhelming. Your Rollbar dashboard is filling up with alerts, your team is scrambling to understand what needs immediate attention, and critical revenue-impacting issues might be buried among less urgent problems. In this post, we'll walk you through a workflow that transforms production error chaos into organized, prioritized action items. We'll cover everything from analyzing Rollbar errors to creating properly linked Linear tickets.

Reliability lessons from the 2025 Microsoft Azure Front Door outage

On October 29th, 2025, Azure Front Door suffered an outage that impacted Microsoft services globally, including Microsoft 365, Outlook, Xbox Live, Copilot, and more. It also affected Microsoft Azure, meaning companies like Costco, Starbucks, and Alaska Airlines ran into issues with both customer-facing and internal systems. The root of the issue was a misconfiguration in the data plane for Azure Front Door and the Azure Content Delivery Network.

Introducing the New Cloud Dedicated Admin UI

InfluxDB Cloud Dedicated provides hosted and managed InfluxDB Cloud clusters in a single-tenant environment and is optimized to handle high write and query loads. Today, InfluxData is releasing a visual overhaul and new features for its Admin UI. Among the recent updates are live observability for customer clusters, overhauled site navigation, and improved visibility into table schemas.