Vertical Pod Autoscaling: How It Compares to Pepperdata Capacity Optimizer

Vertical Pod Autoscaling (VPA) is a component within Kubernetes designed to automatically resize the CPU and memory requests of pods based on their observed, historical usage patterns. While Pepperdata Capacity Optimizer and VPA both change the resource requests of pods in response to changing application resource requirements, there are several key differences.
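
To make "resizing requests based on observed, historical usage" concrete, here is a minimal Python sketch that sizes a request from a usage percentile plus a safety margin. It is a simplified illustration, not VPA's actual recommender and not Pepperdata's method; the function name `recommend_request`, the 90th-percentile default, the 15% margin, and the sample data are all assumptions made for this example.

```python
import math


def recommend_request(samples, percentile=0.9, margin=0.15):
    """Return a resource request sized to a usage percentile plus a margin.

    samples    -- observed usage values (e.g. CPU millicores), hypothetical data
    percentile -- fraction of samples the request should cover (assumed default)
    margin     -- extra headroom added on top of the percentile (assumed default)
    """
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Nearest-rank percentile: the smallest observed value that has at least
    # `percentile` of all samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1] * (1 + margin)


# Hypothetical CPU usage history in millicores, including one short spike.
cpu_millicores = [120, 135, 110, 400, 150, 145, 130, 125, 140, 138]
print(recommend_request(cpu_millicores))
```

Sizing to a percentile rather than the maximum keeps a single spike (the 400 above) from inflating the request, at the cost of occasionally running above it.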

Arie's Adventures with Coroot

Arie van den Heuvel is an engineer, a System and Application Management Specialist, and a valued member of our community. Below he has shared his journey using Coroot, and how it has helped improve observability for his team. You can read more of Arie’s writing and support the resource articles he has created for open source on his blog.

Beyond Outages: The Post-Incident Reviews We Should Have Had

In the past year alone, we’ve seen just how much a single outage can disrupt and how much stronger teams become when they learn from it. From the July 16, 2024 incident to the widespread June 2025 outage, it’s clear that incidents are inevitable. The question is: how do you transform each disruption into an opportunity to improve your processes for the next one?

Demo: Running a Patch Job with Puppet Advanced Patching

With Puppet, patching is faster and easier than ever. Watch this video to learn how to set up and run a patch job with Advanced Patching in Puppet Enterprise Advanced. Puppet's Barr Iserloth and Liam Sexton cover activating Advanced Patching, creating a patch group, and running a patch job from the Puppet Enterprise console. Highlights include the easy-to-use patching GUI, custom patch groups for cross-OS patching, streamlined scheduling that obeys your defined maintenance and blackout windows, and reporting that shows you where each patch was applied.

Identifying Idle Paths in a Data Center Leaf-Spine Fabric

In a perfect leaf-spine network, traffic evenly spreads across all links. But reality is often different, leaving costly, idle paths hidden in your data center fabric. Kentik's Phil Gervasi demonstrates how Kentik's network intelligence platform helps engineers quickly identify and address these underutilized paths. With powerful visualizations, detailed telemetry analysis, and customizable alerts integrated into your ticketing systems, Kentik makes it easy to spot persistent traffic imbalances, troubleshoot ECMP issues, and optimize your infrastructure.
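
The idea of spotting persistently underutilized links can be sketched in a few lines of Python. This is an illustrative sketch only and does not use Kentik's API; the function names, the 5% idle threshold, and the utilization samples are hypothetical.

```python
from statistics import mean


def find_idle_links(utilization, idle_threshold=0.05):
    """Return links whose average utilization is below idle_threshold.

    utilization maps a (leaf, spine) pair to a list of utilization samples
    expressed as fractions of link capacity (0.0 to 1.0).
    """
    return sorted(
        link
        for link, samples in utilization.items()
        if samples and mean(samples) < idle_threshold
    )


def imbalance_ratio(utilization):
    """Ratio of the busiest link's average to the least busy link's average.

    A value near 1.0 means traffic is spread evenly; large values suggest
    an ECMP or routing imbalance worth investigating.
    """
    averages = [mean(samples) for samples in utilization.values() if samples]
    lowest = min(averages)
    return float("inf") if lowest == 0 else max(averages) / lowest


# Hypothetical samples: both uplinks to spine2 carry almost no traffic.
samples = {
    ("leaf1", "spine1"): [0.42, 0.47, 0.45],
    ("leaf1", "spine2"): [0.01, 0.02, 0.01],
    ("leaf2", "spine1"): [0.38, 0.41, 0.40],
    ("leaf2", "spine2"): [0.00, 0.01, 0.02],
}
idle = find_idle_links(samples)
print(idle)
```

Averaging over several samples, rather than checking a single reading, is what separates a persistent imbalance from a momentary lull.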

Lost Your Work? This Git Trick Saves The Day!

Ever reset too far? Deleted a branch you needed? Thought you lost a commit forever? In this episode of Wait… Git Can Do That?, we explore git reflog, Git’s local time machine. You’ll learn how to view every local Git action (even the messy ones), recover unreachable commits, and navigate using HEAD@{n}. Just remember: it’s local, it’s time-limited, and it’s seriously underrated. Subscribe for more Git features you didn’t know you needed.
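
As a rough illustration of how reflog entries lead back to a "lost" commit, the Python sketch below parses reflog-style text and finds the entry recorded just before the most recent reset. The sample output, the commit hashes, and the helper names `parse_reflog` and `find_before_reset` are hypothetical; the line format is assumed to match the default one-line output of `git reflog`.

```python
import re

# Assumed line shape: "<sha> [(decorations)] HEAD@{n}: <action>: <message>"
REFLOG_LINE = re.compile(
    r"^(?P<sha>[0-9a-f]{7,40})\s+(?:\([^)]*\)\s+)?"
    r"HEAD@\{(?P<index>\d+)\}:\s+(?P<action>[^:]+):\s*(?P<message>.*)$"
)

# Hypothetical reflog output, newest entry first.
SAMPLE = """\
a1b2c3d HEAD@{0}: reset: moving to HEAD~2
e4f5a6b HEAD@{1}: commit: Add retry logic
9c8d7e6 HEAD@{2}: commit: Fix flaky test
a1b2c3d HEAD@{3}: checkout: moving from main to feature
"""


def parse_reflog(text):
    """Turn reflog text into a list of dicts, skipping lines that do not match."""
    entries = []
    for line in text.splitlines():
        match = REFLOG_LINE.match(line.strip())
        if match:
            entries.append({
                "sha": match.group("sha"),
                "index": int(match.group("index")),
                "action": match.group("action"),
                "message": match.group("message"),
            })
    return entries


def find_before_reset(entries):
    """Return the entry recorded just before the most recent reset, or None."""
    for position, entry in enumerate(entries):
        if entry["action"].startswith("reset") and position + 1 < len(entries):
            # The reflog is newest-first, so the next entry is the prior state.
            return entries[position + 1]
    return None


entries = parse_reflog(SAMPLE)
lost = find_before_reset(entries)
print(lost["sha"], f"HEAD@{{{lost['index']}}}", lost["message"])
```

With the hash or the HEAD@{n} reference in hand, the commit can be made reachable again, for example with `git branch rescue e4f5a6b` or `git reset --hard HEAD@{1}`.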

Friends Don't Let Friends Deploy Kafka the Old Way

In the cloud, Kafka’s promise of “never lose a byte” quietly morphs into “always pay for two.” Every time the leader syncs followers across zones, you get hit with premium egress charges that can dwarf compute costs. Diskless Kafka turns that upside-down: brokers replicate data straight into S3, so the pricey cross-zone hops vanish. Yes, object storage is slower than a local SSD, but the swap buys you on-demand elasticity and a bill that finally makes sense.
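
The "always pay for two" arithmetic can be sketched as a back-of-the-envelope estimate in Python. The per-GB price below is a placeholder, not a quoted cloud rate, and the function name and defaults are assumptions; the sketch also assumes every replica sits in a different zone and ignores producer and consumer traffic.

```python
def cross_zone_replication_cost(
    ingest_gb_per_day,
    replication_factor=3,
    price_per_gb=0.02,  # placeholder cross-zone transfer price, not a real quote
    days=30,
):
    """Estimate the cost of follower replication across zones over `days`.

    With each replica in its own zone, every byte written to the leader is
    copied to (replication_factor - 1) followers, and each copy crosses a
    zone boundary.
    """
    follower_copies = replication_factor - 1
    return ingest_gb_per_day * follower_copies * price_per_gb * days


# Hypothetical workload: 1,000 GB ingested per day with three replicas.
monthly = cross_zone_replication_cost(1000)
print(f"Estimated cross-zone replication cost: ${monthly:,.2f} per month")
```

Under these assumptions the replication term scales linearly with ingest volume, which is why it can come to dominate the bill as throughput grows.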

How to Build Resilient Networks for AI Production Workloads

Production AI needs a network that can keep up. Learn why private, scalable connectivity is the key in our webinar recap with Vultr. AI is no longer a proof-of-concept hiding in a developer lab. It’s a full-fledged production workload, and it’s hungry for data. But as enterprises move their AI strategies from theory to reality, they’re hitting a wall that isn’t about algorithms or processing power – it’s about the network.

Real-Time Alerting for AI-Optimized Data Centers

Kentik transforms real-time network telemetry into actionable alerts for AI-optimized data centers. By converting database queries into custom alerts, engineers can detect issues like elephant flows, idle links, and packet loss before performance suffers, and trigger alerts in systems like ServiceNow or PagerDuty.

How to Strengthen Your Security Operations with Incident Response Software

When our organization – a mid-sized, fast-scaling technology company specializing in enterprise service management solutions, serving clients in regulated industries like finance and healthcare – faced its first serious cybersecurity breach in early 2024, we realized our incident response management approach wasn’t just outdated – it was putting the business at risk. Back then, we had alerts. We had logs.