CPU monitoring for network admins: Why it matters more than ever

In your role as a network administrator, maintaining smooth, uninterrupted system performance isn’t just a one-time task; it’s your daily mission. Whether you're managing hundreds of endpoints, virtual machines, or hybrid cloud environments, CPU monitoring is one of the most critical tools in your toolkit. Without it, diagnosing performance slowdowns, service lags, or outages becomes reactive guesswork.

How we're shipping faster with Claude Code and Git Worktrees

Four months ago, Claude Code was announced and we were requesting invites to its "Research Preview." Now? We've gone from no Claude Code to simultaneously running four or five Claude agents, each working on different features in parallel. It sounds chaotic, but it's been a natural progression as we've learned to trust AI more and as the tools have dramatically improved.

The Open Source Observability Podcast - EP #1: ClickHouse, Data Lakes, and AWS S3 with Joshua Lee

In this episode we dive into some of Josh's favourite databases and telemetry sources for observability. Listen to learn which open source software could be worth adding to your toolstack! Joshua Lee is a Developer Advocate at Altinity, where he applies his observability and engineering background to ClickHouse use cases and creates educational content to support the open source community. He has over 15 years of experience leading software projects across a broad range of industries.

Demo Roundups! Meet the PagerDuty AI Agents

Welcome to the future of operations, where people and agents manage critical work together, driving productivity and efficiency. Learn how PagerDuty’s AI agents can supercharge teams by autonomously handling repetitive tasks and resolving well-known issues, while surfacing data and insights that augment human expertise for faster resolution and higher operational resilience.

How Puppet is Redefining Infrastructure Management with AI, Powered by Perforce Intelligence

AI has emerged as a defining force in modern technology, spearheading transformation across industries. Yet, despite its promise to revolutionize workflows and unlock unprecedented efficiency, most DevOps organizations face significant hurdles in adopting AI safely and effectively. Concerns about complexity, scalability, and governance hold many decision makers back.

How Dropbox rebuilt its logging stack with Grafana Loki after a data center went dark

Two years ago, a power outage knocked a Dropbox data center offline. It wasn’t just any data center. It was the only one where Dropbox hosted Grafana Loki, meaning engineers couldn’t access their log data. “We had considered a data center outage when we were rolling out Loki, but it had just never risen up in priority enough to get put into multiple data centers,” said Chris Hodges, an infrastructure software engineer at the cloud storage company.

Internxt becomes Valencia CF's Official Cloud Provider & Partner

Valencia CF has reached an agreement with Internxt under which the Valencian technology company, which specializes in cloud storage services, becomes the Club's Official Cloud Provider and Sponsor. With this move, the startup strengthens its presence in and commitment to its homeland, Valencia. Founded in 1919, Valencia CF has won six La Liga titles, eight Copa del Rey titles, three UEFA Cups, and two UEFA Super Cups, among other honours, making it one of the most relevant elite football clubs in the world.

What's Slowing Down Your App? Common Performance Issues APM Can Solve

Application performance is critical to user experience and business success. When an application starts slowing down, identifying the root cause isn’t always straightforward. For developers, DevOps engineers, and SREs, Application Performance Monitoring (APM) tools provide real-time visibility into how applications behave under load.

Route your monitor alerts with Datadog monitor notification rules

As organizations scale their infrastructure, monitoring systems can become a source of noise rather than insight. A clean, straightforward set of alerts for a handful of services can quickly spiral into a mess of overlapping thresholds, redundant triggers, and inconsequential notifications across hundreds (or thousands) of components. This flood of notifications can slow response times, overwhelm engineers, and increase the chance of overlooking critical problems.