
The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Introducing Magellan: The AI data engine that builds your IDP

Building a catalog used to be a project. It meant months of tracking down owners, untangling dependencies, and manually piecing together a picture of your architecture. It was a tedious, thankless process that delayed the value of your Internal Developer Portal (IDP) before you even got started. Now, it’s a coffee break. We’re excited to introduce Magellan, our new AI-powered data engine designed to build your catalog and get your IDP live in minutes.

A new era for your developer portal: The Cortex MCP is now generally available

Here's a scenario every on-call engineer knows too well: a critical incident fires for a service you’ve never seen before. Your first ten minutes are a frantic scramble across wikis and Slack channels just to answer the most basic questions: Who owns this? What does it do? Where are the runbooks? By the time you’re oriented, the incident has escalated.

Your Password Reset Workflow Is Wasting Everyone's Time

Let’s not mince words; there’s a special place in hell for the password reset ticket. It’s the most boring, most avoidable, and arguably the most expensive waste of time on your service desk. And yet, in 2025, most enterprises still treat password resets like it’s 2005. They route them through manual queues, bury IT teams, and frustrate users who just want to log back in. Even when the password reset is finally resolved, nobody comes away from the experience feeling like a winner.

Choosing the Right APM for Go: 11 Tools Worth Your Time

If you’re building high-performance systems, Go has probably earned a spot in your stack. Its speed, lightweight concurrency, and quick compile times make it ideal for scalable APIs, microservices, and distributed systems. But the same qualities that make Go powerful can make performance monitoring tricky. Goroutines run fast and in parallel, which means a simple CPU or memory graph doesn’t always tell you what’s slowing things down.

What Is SolarWinds, And Should You Use It?

Downtime is brutally expensive and damaging. Enterprises can lose about $9,000 every minute systems are down, while smaller businesses lose hundreds of dollars per minute. A single outage can often cost over $100,000, and nearly a third of companies lose customers due to downtime. That’s why many organizations turn to platforms like SolarWinds to maintain reliable systems and minimize the risk of costly disruptions.

Could AI Turn Back The Clock On IT Departments?

I recently wrote about the impending SaaS crisis, driven by companies’ newfound ability to use AI to build software they used to have to buy. I predicted this phenomenon would make it even harder for SaaS vendors to drive growth, and that elite SaaS margins would fall from the mid-70% range to the mid-60% range as companies leaned more into their data and AI.

Implementing image recognition with React and continuous deployment

Integrating artificial intelligence (AI) into web applications can significantly enhance user experience. AI offers features like image recognition to process and analyze user-uploaded images. Combining this with a robust continuous integration and continuous deployment (CI/CD) pipeline using CircleCI ensures seamless updates and reliable delivery. In this article, you will learn how to build a React app that uses TensorFlow.js for client-side image recognition and set up automated testing with CircleCI.

Building LLM agents to validate LangGraph tool use and structured API responses

Moving LLM agents from intriguing prototypes to reliable, production-grade solutions introduces a significant challenge: the inherent stochasticity of LLMs. Unlike conventional software, where a given input predictably yields the same output, an LLM’s response can vary even when the model is given identical prompts. To ensure the dependability of your LLM agent, you will need a rigorous validation strategy.