Optimizing Kubernetes node resources: How to avoid exhaustion and improve performance

Kubernetes automates the deployment and management of containerized applications relatively efficiently. However, resource exhaustion on a node remains a critical issue. When a node runs low on resources such as CPU, memory, or storage, its workloads may suffer from failures, degraded performance, and eviction.
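As a rough illustration of how a node approaching exhaustion might be spotted, the sketch below sums pod resource requests and compares them with what the node can allocate. The function name `node_pressure`, the resource keys, and the 90% threshold are all assumptions made for this example; they are not taken from the article or from any Kubernetes API.

```python
def node_pressure(allocatable, pod_requests, threshold=0.9):
    """Return the resources whose summed requests reach `threshold` of allocatable.

    `allocatable` maps a resource name to the node's allocatable amount.
    `pod_requests` is a list of dicts with the same keys, one per pod.
    All names and units here are hypothetical, chosen for illustration.
    """
    flagged = {}
    for resource, capacity in allocatable.items():
        # A pod that does not request a resource contributes 0 to the total.
        requested = sum(pod.get(resource, 0) for pod in pod_requests)
        ratio = requested / capacity
        if ratio >= threshold:
            flagged[resource] = ratio
    return flagged


# Hypothetical node: 4 CPU cores (in millicores) and 16 GiB of memory (in MiB).
node = {"cpu_millicores": 4000, "memory_mib": 16384}
pods = [
    {"cpu_millicores": 1500, "memory_mib": 8192},
    {"cpu_millicores": 2300, "memory_mib": 4096},
]
print(node_pressure(node, pods))
```

With these sample numbers only CPU is flagged, since requests cover 95% of allocatable CPU but only 75% of memory. A real check would read both values from the cluster rather than from hard-coded dictionaries.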

Scale Anything: How Komodor Enhances Autoscaler Capabilities

Kubernetes autoscalers like Cluster Autoscaler (CAS) and Karpenter have evolved significantly to manage the sprawling Kubernetes ecosystem, which has grown far beyond a simple container orchestration platform to include a vast array of add-ons, operators, CRDs, and third-party integrations. These autoscalers play a crucial role in ensuring K8s workloads get the resources they need, precisely when they need them, without creating excess and waste.

Remediate Kubernetes incidents faster using private actions in your apps and workflows

The Datadog Action Catalog provides more than 1,400 actions to help you accelerate remediation across your infrastructure directly within Datadog. With actions, you can use Workflow Automation to configure workflows that automatically address issues as they happen and build custom apps in App Builder that empower anyone in your organization to act when incidents occur.

SUSE Rancher Prime Meets Cluster API: From theory to practice

If you’re new to Kubernetes or looking to modernize your cluster management workflows, Cluster API and SUSE Rancher Prime make it easier than ever to provision and manage clusters declaratively. This guide walks you through enabling Cluster API in SUSE Rancher Prime, deploying your first cluster and exploring advanced features like GitOps. The original post links to some helpful documentation and lists a few prerequisites for this hands-on walkthrough.

Charmed Kubeflow 1.10

Charmed Kubeflow 1.10 is almost here! Join the live stream with Canonical’s team to get the latest news about the changes, features and integrations available. Together with Stefano Fioravanzo, Manos Vlassis and Kimonas Sotirchos from the product engineering team, we will go live to talk about the features from Kubeflow 1.10 and analyze the main differences between the upstream release and Canonical’s distribution.

Rancher Live: Cloud native sustainability footprint measurement

Measuring the sustainability footprint of software, cloud native or otherwise, is not easy. Learn how CNCF's Environmental Sustainability Technical Advisory Group plans for this through the Green Reviews Working Group by joining Divya Mohan and her guest, Antonio Di Turi, on March 27th.

From failure to fix: Diagnose Kubernetes Node and Pod problems with Site24x7

Picture a busy Monday morning. You are working on leftover projects from the previous week, and assuming everything is fine with your applications as you had not received support tickets during the weekend. All of a sudden, during the middle of the day, you get a flood of reports from users who complain about slow response in your application and error pages piling up. You and your team are scrambling hard to figure out the issue.

Kubernetes Alternatives: What the Latest Search Trends are Signaling

Search is the signal. If you want a glimpse into where things are headed, just take a peek at this graph of search interest for Kubernetes Alternatives from the last few years. Cycle is a direct Kubernetes alternative, and having been part of building this company since 2018, I can tell you that this graph is so much more accurate than you can ever imagine. Until late 2021, talking about not using Kubernetes was met with an almost dogmatic intransigence.

#039 - Banking on Kubernetes: From Fintech Frontier to Regulatory Reality with Kasper Nissen (Dash0)

In this episode, Itiel sits down with Kasper, a Developer Advocate at Dash0. Kasper discusses his background, including his extensive experience building platforms on Kubernetes at Lunar, a Nordic challenger bank. He shares insights into the challenges and successes of using cutting-edge cloud-native technologies in a regulated banking environment. Kasper also touches upon his involvement in the CNCF community as an ambassador and co-chair for KubeCon. The discussion explores Lunar's philosophy of leveraging open-source projects and their journey of adopting Kubernetes and other cloud-native tools.

Drift Away: The Hidden Risk of Large-Scale Kubernetes Environments

Configuration drift is a silent but persistent challenge in managing Kubernetes environments at scale. Whether you’re running workloads across multiple clusters in on-premises data centers, cloud providers, or edge locations, the risk of drift increases exponentially as environments grow. According to a Komodor survey, 40% of Kubernetes users report that configuration drift negatively impacts the stability of their environments.

SUSE Virtualization - Enforcing Admission Resource Integrity With Validating Admission Policy

Blog written by: Ivan Sim. With more enterprises using SUSE Virtualization (formerly Harvester) as the bedrock virtualization platform to host their modern cloud-native AI and edge workloads, it’s important that the platform provides seamless built-in guardrails to validate and sanitize resources admitted into the environment.

Container Observability: Optimizing Every Layer with Innovative New Capabilities for Kubernetes & Windows

Managing containerized workloads and Windows environments requires more than just basic monitoring—it demands deep observability to prevent performance bottlenecks, optimize costs, and accelerate troubleshooting. Virtana’s latest Container Observability enhancements provide IT teams with greater control, visibility, and analytics across Kubernetes and Windows-based workloads.

Announcing Densify's Latest Release: Smarter Kubernetes Automation, Built for the Enterprise

To coincide with KubeCon Europe 2025, we’re excited to announce the latest release of Densify’s Kubernetes optimization engine, Kubex, which delivers full-stack resource management and seamless, automated resource optimization at enterprise scale. This release delivers the advanced controls enterprises have been asking for, without sacrificing the intelligence and precision that set Densify apart.

Software Trends - Cycle Looks at DevOps & Platform Engineering

Software engineering is trending, and the latest fads come and go with passionate adoption. Remember the OpenStack craze? Kubernetes? On-prem v cloud? I could go on and on, but one trend in software engineering that has surfaced in the last few years is DevOps & Platform Engineering.

From Conflicts to Control: The Case for Virtual Clusters in Kubernetes

Managing multiple teams in Kubernetes can feel like juggling too many balls at once. Have you ever struggled with resource conflicts, security risks, or simply keeping everything running smoothly when everyone shares the same cluster? If so, you’re not alone. Let’s dive into how virtual clusters can transform this chaos into a well-orchestrated symphony.

Key Differences Between Docker and Kubernetes: A Comprehensive Guide

As microservices-based architectures have taken off, Docker and Kubernetes have risen as two leading platforms for container operations. While Docker helped popularize the container model, Kubernetes has evolved into a versatile solution for orchestrating production container workloads at a massive scale. However, their similarities obscure important distinctions in how each approaches container management. This post sheds light on the functional differences between Docker and Kubernetes.

#038 - Kubernetes Supercharging Particle Physics with Ricardo Rocha (CERN)

Ricardo from CERN, who leads the platform infrastructure teams, discusses CERN's significant role in particle physics research with the Large Hadron Collider. The conversation covers how CERN manages the massive amounts of data generated from experiments using a worldwide computing grid. Ricardo shares CERN's journey with adopting Kubernetes for various applications, including critical systems controlling detectors and accelerators. He also touches upon CERN's involvement with the CNCF and the Kubernetes community.

SUSE Rancher Prime Meets Cluster API: What You Need to Know

Kubernetes has revolutionized how we deploy and manage applications, but juggling clusters across clouds and on-premises environments can quickly become a tangled mess. Different tools, inconsistent configurations, and manual processes drain your team’s time and energy. What if there was a way to simplify Kubernetes cluster management, bringing order to the chaos? Enter Cluster API (CAPI) and SUSE Rancher Prime.

Tech Debt as Innovation? How Netflix Turns It Into Opportunity

At Civo Navigate San Francisco 2025, Lisa Smith from Netflix shares a fresh perspective on how tech debt can drive innovation instead of slowing teams down. Learn how to staff legacy systems, handle tricky deprecations, and evaluate the “tech debtiness” of your infrastructure to unlock growth and efficiency. Discover how to turn tech debt into a strategic advantage for your engineering team.

Federation Done Right: Cycle's LowOps Approach

Federation allows for distributing control and services across not just multiple regions, but multiple providers and environments as well. This is a critical capability for today's multi-cloud and bare metal deployments, and the idea has gained momentum for several practical reasons such as compliance, resilience, and latency. Now, more than ever, teams are expected to support multi-cloud deployments, navigate regional compliance requirements, and deliver those low latency experiences to users globally.

Observability Reimagined: How AI is Transforming Monitoring

Observability needs to evolve. With AI reshaping IT monitoring, how can businesses leverage predictive analysis, AI-driven monitoring, and auto-remediation workflows to create more resilient infrastructures? At Civo Navigate San Francisco 2025, Jemiah Sius, New Relic, explores how AI is transforming observability, shifting from reactive responses to proactive, intelligent solutions.

The latest in Kubernetes Monitoring: new features to track persistent storage, simplify alerting, and more

Monitoring is an essential part of any Kubernetes deployment, helping organizations optimize cluster health, streamline troubleshooting, and control their costs. In Grafana Cloud, we offer all these capabilities (and more) in our out-of-the-box Kubernetes Monitoring solution. Since introducing Kubernetes Monitoring in 2022, we’ve been steadily adding new features, improving the UI, and making it even easier to gain insights into the state of your Kubernetes fleet.

AI & ML Experts Reveal the Future - What's Next for Innovation?

Where is AI heading next? In this panel from Civo Navigate San Francisco 2025, leading AI & ML experts explore the latest advancements, challenges, and opportunities shaping the future of artificial intelligence. Join Josh Mesout (Civo), Jimil Patel (Intuit), Nami Baral (Niural), Tristian Cormier (State of California), and Gaurav Bharaj (Reality Defender) as they discuss neural networks, responsible AI governance, real-world applications, and the future of human-machine collaboration.

How Engineering Leaders Can Supercharge Developer Productivity

How do you boost developer productivity without burning out your team? At Civo Navigate San Francisco 2025, industry leaders discuss how engineering teams can improve workflows, ship quality code faster, and scale developer productivity. Join Benjie De Groot (Shipyard), Nathen Harvey (Google), Irina Nazarova (Evil Martians) and Solomon Hykes (Dagger, Docker) as they explore DORA metrics, DevEx, shift-left strategies, and tooling for high-impact engineering teams.

Shaping a Greener Future: Tech Leaders on Energy & Carbon Impact

The tech industry must innovate while cutting carbon emissions and energy use. At Civo Navigate San Francisco 2025, experts in AI, cloud, and sustainability tackle the challenges of energy-hungry technologies like GPUs, software efficiency, and industry-wide solutions. Join Simon Hansford (Civo), Hui Wen Chan (Crusoe), Suleiman Mirzad (Slalom), Dr. Thomas McDonald (Orbital Materials), and Dinesh Majrekar (Civo) as they explore how tech can drive a greener future.

Is Cloud Still King? The Shifting Landscape of Infrastructure

Believe it or not, we are in the middle of one of the biggest cloud repatriation movements of the past decade. More than ever, companies are rushing to find infrastructure solutions that better suit their needs. Over the past decade, hyperscalers have dominated the market, generating trust and, in some cases, overconfidence in software development. Drawn in by promises of reliability, ease of use, and ultimate flexibility, teams turned to providers like AWS, GCP, and Azure.

How Netflix Engineers Launch High-Stakes Products to Millions

Launching a high-impact engineering feature that reaches millions of users and drives massive revenue is no easy feat. In this talk from Civo Navigate San Francisco 2025, Ramneet Bhatia, Senior Software Engineer at Netflix, shares the strategies, challenges, and key lessons from leading the Paid Sharing launch at Netflix.

Introducing Civo FlexCore: A New Era for Private Cloud

The cloud landscape is evolving, and at Civo, we are committed to pushing boundaries and reimagining cloud infrastructure. At our recent Civo Navigate event in San Francisco, we unveiled a significant step forward in this mission—Civo FlexCore, a game-changing private cloud solution designed to bring the simplicity and efficiency of the public cloud to on-premises environments. For too long, enterprises have struggled with the limitations and costs imposed by hyperscale cloud providers.

The Secret Weapon for Culture Change: Product Operations Explained

At Civo Navigate San Francisco 2025, Chris Butler, Staff Product Operations Manager at GitHub, explores how product operations drives culture change and value alignment. Culture change is challenging, but product ops serves as the bridge between strategy and execution, fostering collaboration, accountability, and transparency. Drawing from his experience at GitHub, Microsoft, and Google, Chris shares insights on how organizations can sustain meaningful transformation.

Cycle Video Walkthrough: Securing Private Network Access with Cloudflare Tunnel

Cycle was built to be a powerful, security-focused container orchestration platform that is a more user-friendly alternative to Kubernetes. One way it achieves this is by making complex yet secure networking easy to set up. By combining Cycle's private networks, known as environments, with Cloudflare Tunnel, teams can further enhance their network security and reliability.

Marty Weiner's AI Predictions: What to Expect in the Next 2 Years

Recorded at Civo Navigate San Francisco 2025, Marty Weiner, co-founder of VerifyYou and former CTO of Reddit, delivers a shocking talk on the current state of AI and its rapid progression. From its impact on various industries to its potential effects on the economy and job market, Marty explores the exciting and terrifying aspects of AI. He also discusses the concept of AGI and its potential implications for civilization. Watch to learn more about the future of AI and what it means for humanity.

Introducing the Rancher CVE Portal: Enhanced Transparency and Security for Your Rancher Workloads

At SUSE, we’re always looking for ways to make it easier for customers to maintain secure, enterprise-grade environments. The Rancher Security team is excited to announce the public beta launch of the Rancher CVE Portal, available now at scans.rancher.com. This new resource is a significant step forward in providing clear, actionable visibility into vulnerabilities affecting Rancher and its associated dependencies.

9 Kubernetes monitoring best practices: A practical guide to successful implementation

Kubernetes has revolutionized containerized application deployment, but effective monitoring remains a crucial challenge. Unlike traditional infrastructures, Kubernetes environments are dynamic, distributed, and short-lived, making real-time visibility essential for performance, security, and cost optimization. Without proper monitoring, teams risk application downtime, resource wastage, and security vulnerabilities.

Cloud Computing Reimagined: The Game-Changing Truth

At Civo Navigate San Francisco 2025, our experts shared their insights on how to reimagine the cloud through cutting-edge solutions and advancements. Featuring CTO Dinesh Majrekar and Chief Innovation Officer Josh Mesout, this session explores Civo's multi-cloud strategy and the pivotal role of private cloud in a hybrid future.

Amazon EKS Auto Mode

EKS Auto Mode is a huge step forward in managing your EKS clusters by automating complex tasks, enhancing cost efficiency, simplifying management, and ensuring resource optimization. These features make Kubernetes more accessible and manageable, particularly for organizations looking to leverage containerized environments without the overhead of extensive manual configuration and management.

[Webinar] Kubernetes Health Management with Komodor

Modern Kubernetes environments are increasingly growing in scale and complexity. While application performance monitoring and infra-observability tools were once sufficient to maintain reliability, they are ill-equipped to handle the distributed and ever-changing nature of dozens or hundreds of Kubernetes clusters.

Will DevOps as We Know It Survive the AI Revolution?

Is DevOps on the brink of extinction? Solomon Hykes, co-founder of Docker and CEO of Dagger.io, explores how AI agents are transforming software development—not just writing code but shipping it. In this talk at Civo Navigate San Francisco 2025, Solomon retraces the history of software’s industrial revolution and examines whether AI will replace DevOps engineers or empower them. With live demos and expert insights, he reveals what’s next for the software factory and the future of platform engineering.

3 Neat Tricks and New Patterns Used in Cycle - from Cycle's Customer Success Team

In the last few months I've had the privilege to implement and witness new and exciting Cycle platform workflows emerging. Some of these come from the evolution of our own internal work, while others I've learned from our users. Reflecting on this, I thought it would be beneficial to the community to share! In this article, I'll walk through a handful of the neatest patterns we've seen in action. So I hope you'll join me for a no fluff, real, and practical look at how to get more out of Cycle.

Drift Detection in Kubernetes

When the increasingly popular strategy of configuration as code (CaC) is used to develop infrastructure, it’s known as infrastructure as code (IaC). Today, IaC is quickly becoming entrenched in development processes, especially in conjunction with Terraform and Kubernetes. Yet, although IaC (and CaC) bring immense value, they can also lead to a major problem: configuration drift.
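To make the idea of configuration drift concrete, here is a minimal sketch that compares a desired configuration (as it might be stored in version control) with the live configuration read back from a cluster. The function `find_drift` and the sample dictionaries are hypothetical, written only for this illustration; real drift detection tools handle lists, defaults, and ignored fields in more elaborate ways.

```python
def find_drift(desired, live, path=""):
    """Return (path, desired_value, live_value) for each field that differs.

    Only fields present in `desired` are checked, so extra fields that the
    platform adds to `live` (status, defaults) are not reported as drift.
    This is an illustrative helper, not part of any real tool.
    """
    drift = []
    for key, want in desired.items():
        here = f"{path}.{key}" if path else key
        have = live.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            # Recurse into nested sections such as `spec`.
            drift.extend(find_drift(want, have, here))
        elif want != have:
            drift.append((here, want, have))
    return drift


# Hypothetical example: someone scaled the deployment by hand.
desired = {"spec": {"replicas": 3, "template": {"image": "app:1.4"}}}
live = {
    "spec": {"replicas": 5, "template": {"image": "app:1.4"}, "paused": False},
    "status": {},
}
print(find_drift(desired, live))
```

Checking only the declared fields is a deliberate simplification: it keeps platform-managed fields from producing false positives, at the cost of missing settings that were added out of band.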

Advanced Container Resource Monitoring with docker stats

If you’ve ever needed to check how much CPU or memory a Docker container is using, docker stats is the command for the job. It provides real-time resource usage metrics, helping you monitor and troubleshoot containers efficiently. This guide covers everything you need to know about docker stats: how to use it, what each metric means, and how to integrate it into a larger monitoring setup.
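As one possible way to feed docker stats into a larger monitoring setup, the sketch below parses the JSON lines produced by `docker stats --no-stream --format "{{json .}}"` and picks out containers above a CPU threshold. The helper names and the 80% threshold are assumptions made for this example, and the field names (`Name`, `CPUPerc`, `MemPerc`) should be checked against the output of your Docker version.

```python
import json
import subprocess


def parse_stats(lines):
    """Turn JSON lines from `docker stats --format '{{json .}}'` into dicts."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the command output
        raw = json.loads(line)
        rows.append({
            "name": raw["Name"],
            # Percentages arrive as strings such as "91.50%".
            "cpu_pct": float(raw["CPUPerc"].rstrip("%")),
            "mem_pct": float(raw["MemPerc"].rstrip("%")),
        })
    return rows


def busy_containers(rows, cpu_threshold=80.0):
    """Names of containers at or above the (hypothetical) CPU threshold."""
    return [row["name"] for row in rows if row["cpu_pct"] >= cpu_threshold]


def read_docker_stats():
    """Take one snapshot from a local Docker daemon (requires the docker CLI)."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{json .}}"],
        capture_output=True, text=True, check=True,
    )
    return parse_stats(result.stdout.splitlines())
```

Calling `busy_containers(read_docker_stats())` on a host with Docker installed would return the names of containers currently above the threshold. Keeping the parsing separate from the subprocess call makes the logic easy to test with canned output.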

#037 - Problem First, Kubernetes Second: Insights from Ahmed Bebars (New York Times | CNCF)

In this episode of Kubernetes for Humans, we speak with Ahmed Bebars, a Principal Engineer at the New York Times and a CNCF Ambassador, who offers a unique perspective on cloud native technologies. Ahmed recounts his professional journey in accounting before transitioning into the technology sector, leading to his current deep involvement with the Kubernetes ecosystem. He shares his initial introduction to Kubernetes almost a decade ago, recognizing its capabilities in container orchestration.

Forget the Hype: Kelsey Hightower on What's Next in AI & Cloud Computing

Kelsey Hightower takes the stage at Civo Navigate San Francisco 2025 to explore the evolving world of AI, cloud computing, and Kubernetes. In this insightful discussion, he shares his thoughts on data sovereignty, AI adoption, security challenges, and the future of cloud technology. Recently joining Civo as a Board Director, Kelsey discusses his vision for reimagining cloud computing, the impact of AI agents, and what developers need to focus on in an increasingly AI-driven world.

Calico eBPF Source IP Preservation: The Unexpected Story of High Tail Latency

The Calico eBPF data plane is the natural choice when latency is your primary concern, so it was disturbing when benchmarking showed that eBPF had higher tail latency than iptables. Latencies at the 99th percentile and above were higher by as much as a few hundred milliseconds. We ran a whole series of experiments and could not crack the nut until we observed occasional, unexpected TCP reset (RST) packets, even though no connections were reset.

Java on containers: a guide to efficient deployment

Java remains one of the most widely used programming languages today, especially in enterprise backend systems—and for many good reasons. With each new release, Java’s robust runtime offers additional improvements in performance, security, scalability, and developer productivity. The portability of its code has proven increasingly relevant and useful as the industry embraces ARM64, making Java one of the go-to languages for modern workloads.

Challenges in Kubernetes monitoring and how to overcome them

Kubernetes has revolutionized how organizations deploy, scale, and manage containerized applications, offering unprecedented efficiency and flexibility. However, the very characteristics that make Kubernetes so powerful—its dynamic, distributed, and ephemeral nature—also create significant challenges for monitoring. Without robust monitoring capabilities, organizations struggle to identify and resolve performance bottlenecks, optimize resource utilization, and maintain security.

AIOps for Kubernetes (or KAIOps?)

With the growing complexity of cloud-native applications, DevOps teams often face challenges when setting up and maintaining Kubernetes observability. AIOps (artificial intelligence for IT operations) makes the process more manageable using AI and machine learning for monitoring, troubleshooting, and performance optimization. In this article, you’ll learn about the common challenges in Kubernetes observability and how AIOps can provide proactive and effective solutions.

A Conversation with Steve Wozniak: Apple, Innovation, and Beyond

Get an inside look at the life and career of Steve Wozniak, co-founder of Apple, as he joins Mark Boost and Dinesh Majrekar on stage at Civo Navigate San Francisco 2025. From his early days as a young engineer to the creation of the Apple II, Wozniak shares his unique perspective on the tech industry and his role in shaping its future. With a career spanning decades, Wozniak reflects on his experiences as a pioneer in the tech industry, and offers valuable insights on how to drive innovation and stay ahead of the curve.

OpenShift vs. Kubernetes: What's the Difference?

If asked even a year ago to forecast the most dominant technologies of 2024, it may not be too surprising that containerization would be among those seeing widespread adoption. With containers now commonplace in modern app development, organizations are faced with deciding between two leading container orchestration platforms: OpenShift and Kubernetes, each touting superior orchestration. With both platforms vying for a share in the market, many struggle to choose one over the other.

Cloud Costs Out of Control? Civo Has the Answer!

Join Mark Boost as he kicks off Civo Navigate San Francisco 2025, setting the stage for two days of tech innovation and insights. Discover Civo’s mission to create a simpler, fairer, and better-value cloud solution, challenging the dominance of hyperscalers. Mark unveils his vision for a true multi-cloud future and introduces Civo’s latest innovations: FlexCore and RelaxAI. He then welcomes Kelsey Hightower to the stage for a major announcement, his new role as Civo’s Board Director, before diving into the crucial role of private cloud solutions in today’s evolving landscape.

How Cloud Computing Powers the Modern Internet

The way we use the internet today is vastly different from what it was just a decade ago. From high-speed video streaming and social media to cloud gaming and e-commerce, everything happens in real time. But have you ever wondered what makes all of this possible? The answer lies in cloud computing, the backbone of modern digital services. Cloud computing provides the power, speed, and scalability needed to keep the internet running smoothly. It enables businesses to store massive amounts of data, process information instantly, and deliver online services without interruptions.