
The latest News and Information on Containers, Kubernetes, Docker and related technologies.

Kubernetes monitoring explained: Key metrics, labels, and best practices

Monitoring Kubernetes and containers doesn’t have to be overwhelming. In this video, we’ll break down the essential metrics you need to track, why labels are critical for container visibility, and the best practices for Kubernetes monitoring at scale. You’ll also learn how tools like Site24x7 simplify Kubernetes monitoring with auto-discovery, dashboards, anomaly detection, and forecasting. Whether you’re a DevOps engineer, SRE, or developer, this video gives you the practical knowledge to improve container monitoring and observability.

Kubernetes v1.34: What You Need to Know

Kubernetes v1.34, codenamed “Of Wind & Will (O’ WaW)”, brings a wide range of enhancements aimed at making clusters more efficient, secure, and easier to manage. This release delivers 58 enhancements with 23 graduating to Stable, 22 entering Beta, and 13 in Alpha, reflecting the platform’s continued maturation as enterprises scale their container orchestration needs.

#049 - The AI Translator: Using LLMs & MCP for K8s Operations & Self-Healing Infra with Alexei Le...

In this episode, Itiel Shwartz kicks off a series on MLOps, LLMs, and GenAI in Kubernetes, starting with Alexei Ledenev, who has over two decades in software development and deep experience in cloud architecture and distributed systems. Alexei shares his journey from CoreOS Fleet to his current role on the Platform Team at Doit.

ECS Vs. EKS Vs. Fargate: AWS Container Services Compared

Amazon Web Services (AWS) provides more than 200 services. Among those, Amazon Elastic Container Service (ECS), Elastic Kubernetes Service (EKS), and AWS Fargate help deploy and manage containers. Choosing between these services can be challenging. They seem similar on the surface (and are all popular), but each offers unique benefits and limitations. In this guide, we compare the three services, discuss the best use cases for each, and help you choose the best fit for your business.

Lightning-Fast Kubernetes Management with Rancher's Vai Project

If you manage Kubernetes at scale with Rancher, you know that UI performance is not just a “nice-to-have”; it’s crucial for productivity. The Rancher team is on a continuous journey to enhance our platform’s ability to handle increasingly complex environments. In this post, we take a deep dive into an exciting, evolving improvement we’ve been developing: a project codenamed “Vai” (also called UI Server-Side Pagination or SQLite-backed caching).

Top Argo CD Anti-Patterns to avoid when adopting GitOps - Part 1 of 3

This is the first webinar in our three-part series. Join Kostis Kapelonis as he walks through common mistakes teams make when using Argo CD for GitOps. This session is great for everyone — whether you're just getting started or already experienced with GitOps. Learn what not to do, and how to avoid issues that can slow you down.

Stop Fighting Kubernetes to Go Multi Region

Every engineering leader eventually asks the same question: What happens if my cloud region goes down? Regional outages aren't unheard of, or even rare, and the stakes are obvious. A single-region deployment might work fine on day one, but it leaves you exposed: one outage, one fiber cut, or one bad update from your provider, and your application is offline. In some cases, your entire business could be at risk. That's why multi-region architecture has recently become the gold standard.

Distributed performance testing for Kubernetes environments: Grafana k6 Operator 1.0 is here

Performance testing is critical to building reliable applications, but testing at scale, especially inside modern Kubernetes environments, can be a challenge. For example, how do you coordinate tests across multiple nodes, test private services without compromising security, or even do both at once? And most importantly, how do you do all this without adding too much operational complexity to your stack?

Calico Whisker vs. Traditional Observability: Why Context Matters in Kubernetes Networking

Are you tired of digging through cryptic logs to understand your Kubernetes network? In today’s fast-paced cloud environments, clear, real-time visibility isn’t a luxury, it’s a necessity. Traditional logging and metrics often fall short, leaving you without the context needed to troubleshoot effectively. That’s precisely what Calico Whisker’s recent launch (with Calico v3.30) aims to solve. This tool provides clarity where logs alone fall short.