The latest news and information on DevOps, CI/CD, automation, and related technologies.

Rancher Live: What is Developer Advocacy?

Join us for an engaging Rancher live stream hosted by Orlin Vasilev, as we dive into the world of Developer Advocacy—what it really means, why it matters, and how it's evolving in the cloud-native space. Orlin will be joined by two powerhouse guests in the field: Jorge Castro – a community strategist and long-time open source advocate, known for his work with Kubernetes and cloud-native ecosystems. Jorge brings deep insights from years of building developer communities and bridging the gap between engineers and users.

The Second Wave of Private Cloud

Over the past decade, the public cloud became the default way to run software. Its flexibility, on-demand pricing, and global reach made it the obvious choice for many teams. Startups could move fast, and enterprises could avoid long procurement cycles and complex hardware management. But as teams gain more experience with cloud infrastructure, costly unintended consequences start to surface: bills grow quickly and are difficult to predict.

Netdata Overview: All You Need to Know in Under 3 Minutes

In just a few minutes, this walkthrough will show you how to unlock the full power of Netdata during your trial period. From real-time metrics to AI-powered insights, learn how to get immediate value without any guesswork. Whether you're running a homelab or managing production systems at scale, this video will help you hit the ground running and make every minute of your trial count. Let’s turn your trial into insight, clarity, and control.

9 Best Incident Response Tools (Plus 4 Open-Source Options)

I’ve curated a list of the 9 best incident response tools, plus 4 open-source options. But first, a quick note: many people mix up alerting, monitoring, and incident response. Incident response is what you do after receiving an alert. It includes alert acknowledgment, escalations, incident communication, post-incident analysis, and response automation. Yes, some of these (incident communication and post-incident analysis) overlap with incident management.

Kubernetes Is Powerful, But It's Slowing You Down. Here's How to Fix It.

Ask any SRE what slows them down in a Kubernetes incident, and the answer is usually too much information in too many different places. Kubernetes has changed the way we run software. It’s given us incredible flexibility, scalability, and power. But in the years I’ve worked in cloud operations and platform engineering, I’ve also seen how that power comes at a price: complexity.

Developing Modules for Puppet and the Forge in 2025

Since we announced changes to our OSS plans and introduced the new licensing starting with PDK 3.5.0, the team has received questions from the community about how the changes will affect them. In this article, we’ll highlight some helpful resources on how you can develop and contribute modules to the Forge and ensure compatibility with Puppet Core and Puppet Enterprise.

Factors That Define a Scalable Reseller Hosting Plan

Many entrepreneurs are drawn to reseller hosting as an accessible and profitable business model. As you explore various options, it's important to understand the factors that contribute to a scalable reseller hosting plan. A plan that supports growth must include key elements like performance, flexibility, price, and support. Let's break down these crucial aspects in more detail.

Zero Ticket Video Series: How to Automate Password Resets with Resolve

Struggling with repetitive IT tickets like password resets and account unlocks? You're not alone — these make up nearly 30% of all service desk requests. In this demo, learn how RITA, the AI-powered IT Agent from Resolve, can eliminate these issues entirely — no ticket required.