Zero code tracing: Kubernetes observability with Logz.io and eBPF

Distributed tracing is a core tool for operating modern microservices platforms. For SREs and DevOps teams, it is often the fastest way to understand latency issues, service dependencies, and unexpected failure modes. But achieving comprehensive tracing coverage is resource-intensive and time-consuming. It usually requires application changes, language-specific instrumentation, agent lifecycle management, and ongoing coordination with development teams.

The Rise of AI Agents and the Reinvention of Kubernetes: Ratan Tipirneni's 2026 Outlook

Prediction: The next evolution of Kubernetes is not about scale alone, but about intelligence, autonomy, and governance. As part of the article ‘AI and Enterprise Technology Predictions from Industry Experts for 2026’, published by Solutions Review, Ratan Tipirneni, CEO of Tigera, shares his perspective on how AI and cloud-native technologies are shaping the future of Kubernetes.

Kubernetes v1.35: The Release That Tackles the Industry's $100 Billion Waste Problem

Kubernetes v1.35 dropped a couple of weeks ago, and while the headlines focus on gang scheduling and in-place resizing going GA, there’s a bigger story here that every platform team needs to understand: Kubernetes is finally acknowledging that cluster utilization is fundamentally broken. At Komodor, we work with hundreds of organizations running Kubernetes at scale.

7 Kubernetes Predictions for 2026 - AI Will Push SRE to its Limit

As AI workloads shift from training to massive-scale inference, SRE teams are about to feel even more pressure. GPU-heavy computing is breaking the assumptions today’s clusters were built on, while enterprises are beginning to trust autonomous operations and cost pressure is pushing consolidation across the cloud-infrastructure stack.

[Webinar] Accelerating Kubernetes Intelligence: Cisco's Platform Evolution

Join Hasith Kalpage, Director of Platform Engineering, and Arthur Drozdov, Agentic AI Engineer, as they share how Cisco is using Komodor’s Klaudia Agentic AI to evolve its platform strategy, unlock a smoother developer experience, slash MTTR, and reduce bottlenecks across the enterprise. The session includes a live demo of the CAIPE platform!

Do You Need a Service Mesh? Understanding the Role of CNI vs. Service Mesh

The world of Kubernetes networking can sometimes be confusing. What’s a CNI? A service mesh? Do I need one? Both? And how do they interact in my cluster? The questions can go on and on. Even for seasoned platform engineers, making sense of where these two components overlap and where the boundaries of responsibility end can be challenging. Seemingly bewildering obstacles can stand in the way of getting the most out of their complementary features.

How to build AI agents with n8n and relaxAI: Live webinar

You have the ideas, now learn how to turn them into production-ready AI agents. Join us on January 21st at 5:00 PM for a live webinar featuring Ben Norris, AI Engineer at Civo, and Sophia McKee, COO at Civo. We will demonstrate how to design, build, and deploy intelligent agents using n8n’s visual workflow automation platform, all powered by secure, UK-hosted infrastructure from relaxAI. You'll learn how to orchestrate tools, APIs, and LLMs to create scalable automations without needing deep coding expertise.

Site24x7's Kubernetes monitoring | Proactive, scalable, AI-powered

Kubernetes drives modern cloud-native applications, but its distributed nature creates visibility and performance challenges at scale. In this video, discover how Site24x7 provides real-time monitoring, AI-powered anomaly detection, and scalability for Kubernetes environments, helping you to proactively manage resources and resolve issues faster. Whether you're running a single Kubernetes cluster or managing multiple environments, Site24x7 helps you ensure peak performance and faster decision-making with minimal manual intervention.

How Istio Ambient Mode Delivers Real World Solutions

For years, platform teams have known what a service mesh can provide: strong workload identity, authorization, mutual TLS authentication and encryption, fine-grained traffic control, and deep observability across distributed systems. In theory, Istio checked all the boxes. In practice though, many teams hit a wall. Across industries like financial services, media, retail, and SaaS, organizations told a similar story. They wanted mTLS between services to meet regulatory or security requirements.

Scaling Kubernetes GitOps with Fleet: Experiment Results and Lessons Learnt

Fleet, Rancher’s built-in GitOps engine, is designed to scale up to thousands of clusters. But how far can it scale in a real-world scenario? Earlier this year, we wrote about the Fleet benchmark tool and made a few instructive discoveries, especially concerning resource consumption and its impact on deployment performance.

KubeCon Atlanta 2025 & the AI-Native Shift

KubeCon + CloudNativeCon North America 2025 in Atlanta marked a definitive moment for cloud-native infrastructure. Over four days, celebrating the 10th anniversary of both CNCF and Kubernetes, more than 9,000 attendees witnessed the ecosystem’s evolution from container orchestration to AI-native operations. The conference delivered a clear message – AI workloads are no longer experimental.

Scaling faster and predictable cloud bills with Civo's FlexCore

How does Defense.com scale its SaaS security platform while keeping costs predictable? CEO Oliver Pinson-Roxburgh explains why Civo’s FlexCore was the only choice. FlexCore is engineered to deliver massive scalability and high performance, as milliseconds matter for real-time threat analysis, while ensuring UK Data Sovereignty and Compliance (ISO 27001). Crucially, FlexCore offers predictable pricing, eliminating the sudden, massive bills of larger providers. FlexCore delivers on-prem performance with public cloud scaling and simplicity.

Building Trust in AI-Powered Kubernetes Ops: Why "Good Enough" Is a Production Killer

The air in the operations world is thick with AI and LLMs. EVERY vendor is rushing to slap an “AI-powered” badge on their product. But here’s the uncomfortable truth: In high-stakes Kubernetes operations, one bad AI recommendation can destroy months of trust-building in an instant. We aren’t building a chatbot to suggest recipes. We are building systems that, armed with kubectl permissions, have the potential to take down production with a single, wrong command.

Setting Up a Windows VM on Cycle

In the last few months we've made some changes to VMs that finally allow installing and running Windows on them. MS Paint on Cycle is finally a reality. To make Windows VMs work, we had to add a few things to the platform to support it. As always with Windows, there are some quirks, gotchas, and pain points. But in the guide below, I'll show you how we solved these issues in our recent platform update, and how to install and run a Windows Server 2025 VM on Cycle with full network connectivity.

Setting up OpenTelemetry Demo in Kubernetes with Splunk Observability Cloud

Are you looking to explore the power of OpenTelemetry and Splunk Observability Cloud in a Kubernetes environment? This video provides a comprehensive, step-by-step walkthrough on how to deploy the OpenTelemetry Demo application in Kubernetes and seamlessly integrate it with Splunk Observability Cloud for metrics, traces, and logs!

Building visibility and resilience across Kubernetes

Kubernetes has transformed how modern applications are deployed and scaled. Its flexibility and automation power innovation but also expand the attack surface. From control plane access to runtime drift, Kubernetes introduces layers of complexity that can obscure visibility if not properly monitored. For security leaders, Kubernetes is both an opportunity and a risk. While it enables agility, it also decentralizes security responsibility across teams, tools, and cloud layers.

Ingress NGINX Controller Is Dead - Should You Move to Gateway API?

Ingress NGINX Controller, the trusty staple of countless platform engineering toolkits, is about to be put out to pasture. The Kubernetes community announced the news recently, and it quickly circulated throughout the cloud-native space. It’s big news for any platform team that currently uses the Ingress NGINX Controller because, as of March 26, 2026, there will be no more bug fixes, no more critical vulnerability patches and no more enhancements, even as Kubernetes continues to release new versions.

Heroku vs. Kubernetes

If you are deciding where to deploy a web app, you will almost always run into a choice between a platform like Heroku and running on Kubernetes, two popular platforms for deploying and managing applications. This article breaks down the key differences in architecture, use cases, complexity, cost, and scalability to help engineers choose the right platform for their needs.

The cloud the way you want it: Introducing cloud parity

For decades, there have been two incompatible worlds in cloud: Public (AWS, Google, Microsoft) and Private (VMware, Nutanix). Moving between them meant throwing everything away and re-architecting your systems. Civo is rewriting that script. This final thought from the Civo keynote at Civo Navigate London 2025 introduces Cloud Parity: the elimination of the public/private gap. It's just one way of working, with the same product, same API, and same support.

Monitor your Kubernetes operators to keep applications running smoothly

The performance of your Kubernetes operators often influences the behavior of the applications they manage. Operators automate the day-to-day management of your applications by executing critical activities, which may include scaling replicas, performing upgrades, and recovering from failures. For example, a PostgreSQL operator can ensure that standby servers are always deployed, that the database’s failover is correctly configured, and that data is backed up on schedule.

An In-Depth Look at Istio Ambient Mode with Calico

Organizations are struggling with rising operational complexity, fragmented tools, and inconsistent security enforcement as Kubernetes becomes the foundation for modern application platforms. As a result of this complexity and fragmentation, platform teams are increasingly burdened by the need to stitch together separate solutions for networking, network security, and observability.

The War Room of AI Agents: Why the Future of AI SRE is Multi-Agent Orchestration

We’ve all been there. It’s 2 AM, your phone is buzzing with alerts, and you’re suddenly thrust into an incident war room with a dozen other bleary-eyed engineers. The production environment is on fire, customers are affected, and everyone’s trying to piece together what went wrong. But here’s what makes these moments fascinating from a systems perspective – it’s rarely just one person silently fixing the issue in isolation.

Docker Logs Command Reference: tail, follow, since Options

Managing Docker container logs is essential for debugging and monitoring application performance. Tailing Docker logs allows for real-time insights, quick issue resolution, and optimized performance. This guide focuses on efficient methods for tailing Docker logs, with clear examples and command options to streamline log management.
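As a rough illustration of the three options named in the title, the sketch below assembles a `docker logs` command line in Python. The helper name `docker_logs_args` and the container name `web` are hypothetical and chosen only for this example. The flags `--tail`, `--follow`, and `--since` are the standard `docker logs` options. The helper only builds the argument list and does not run Docker.

```python
from typing import List, Optional


def docker_logs_args(
    container: str,
    tail: Optional[int] = None,
    follow: bool = False,
    since: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a `docker logs` call (hypothetical helper)."""
    args = ["docker", "logs"]
    if tail is not None:
        # --tail N: show only the last N lines of the log
        args += ["--tail", str(tail)]
    if follow:
        # --follow (short form -f): keep streaming new log output
        args.append("--follow")
    if since is not None:
        # --since: a timestamp or a relative duration such as "10m"
        args += ["--since", since]
    args.append(container)
    return args


if __name__ == "__main__":
    # "web" is an assumed container name used only for illustration.
    print(" ".join(docker_logs_args("web", tail=100, follow=True, since="10m")))
```

The resulting list can be passed to `subprocess.run` on a machine where Docker is installed.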

Harnessing the potential of 5G with Kubernetes: a cloud-native telco transformation perspective

Telecommunications networks are undergoing a cloud-native revolution. 5G promises ultra-fast connectivity and real-time services, but achieving those benefits requires an infrastructure that is agile, low-latency, and highly reliable. Kubernetes has emerged as a cornerstone for telecom operators to meet 5G demands.

What is cloud parity? The future of flexible and sovereign cloud computing

Back in 2024, I officially put a name to a concept we had been developing at Civo for many years. I called it cloud parity. When Civo was founded, two completely different worlds existed: the public cloud, dominated by Amazon, Microsoft and Google, and the private cloud, dominated mainly by VMware.

CI/CD for Go Microservices on Scaleway Kubernetes with CircleCI

Development teams depend on microservices to build, deploy, and scale features independently. Microservices have become the backbone of modern, scalable applications. Scaleway’s managed Kubernetes service (Kubernetes Kapsule) offers a powerful, cost-effective platform for running containerized workloads in the cloud. It’s a great fit for startups and solo engineers who want to focus on shipping features, not managing infrastructure.

Cloud cost crisis: 90% of Indian businesses face unexpected bills

Cloud promised simplicity. Instead, Indian businesses are paying for surprises! This video reveals key findings from our Cost of Cloud 2025 research, which exposes a massive cloud cost crisis for Indian organizations. The issue isn't cloud adoption, it's a lack of clarity, predictability, and control. Civo is built for the future: simple, predictable, locally compliant cloud.

AWS Batch On EKS: Streamlining Containerized Workloads

Machine learning pipelines are getting heavier by the day. From model training to large-scale inference and data preprocessing, compute demands are scaling faster than teams can manage. Kubernetes clusters groan under unpredictable job spikes. Static infrastructure wastes money when workloads slow down. The result? Organizations are perpetually chasing flexibility, automation, and cost efficiency. AWS has quietly built a solution to establish that balance.

Is It Time to Migrate? A Practical Look at Kubernetes Ingress vs. Gateway API

If you’ve managed traffic in Kubernetes, you’ve likely worked with Ingress controllers. For years, Ingress has been the standard way to expose HTTP and HTTPS services. But in practice, it often came with trade-offs. Controller-specific annotations were required to unlock critical features, the line between infrastructure and application responsibilities was unclear, and configurations often became tied to the implementation rather than the intent.

Cost Optimization Is Now Part of the SRE Playbook

In the era of cloud-native architectures, Site Reliability Engineering (SRE) has matured from a discipline focused purely on uptime to a sophisticated practice of efficient reliability. The key driver for this evolution is an undeniable truth: cloud spend has become intrinsically linked to system stability.

Scaling with Wildcard Certificates: Why Modern Infrastructure Benefits

Managing TLS certificates at scale is one of those operational tasks that starts simple and quickly grows into a sprawling problem. As organizations adopt microservices, multi-tenant architectures, and globally distributed load balancers, the number of domains and subdomains they support can expand dramatically. Each certificate then requires its own lifecycle management. Wildcard certificates offer a powerful solution to this growing complexity.
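To make the scope of a wildcard certificate concrete, here is a minimal sketch of the usual matching rule, in which `*` stands in for exactly one leftmost DNS label. The function name `wildcard_matches` and the `example.com` hostnames are assumptions made for this illustration. Real TLS libraries perform this check themselves, so this is not a replacement for their validation.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Return True if hostname is covered by a certificate name such as *.example.com.

    Simplified sketch: the wildcard must be the entire leftmost label and
    matches exactly one label, so it covers neither the bare domain nor
    deeper subdomains.
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        # No wildcard: require an exact, case-insensitive match.
        return pattern == hostname
    suffix = pattern[1:]  # e.g. ".example.com"
    if not hostname.endswith(suffix):
        return False
    left = hostname[: -len(suffix)]
    # The wildcard must consume exactly one non-empty label.
    return left != "" and "." not in left


if __name__ == "__main__":
    for host in ("api.example.com", "a.b.example.com", "example.com"):
        print(host, wildcard_matches("*.example.com", host))
```

Under this rule a single `*.example.com` certificate covers every first-level subdomain, while the bare domain and nested subdomains still need their own names.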

Level Up Your Container Security: Introducing the JFrog Kubelet Credential Provider

Amazon Elastic Kubernetes Service (Amazon EKS) is a fully managed, compliant Kubernetes service that simplifies running, managing, and scaling containerized applications. EKS automatically handles the availability and scalability of the Kubernetes control plane, allowing teams of any size or skill level to focus on building and deploying production-ready applications across diverse environments, including AWS, on-premises, and at the edge.

kubectl logs Command Reference and Documentation

The kubectl logs command retrieves container logs from Kubernetes pods. It supports real-time log streaming with -f, time-based filtering with --since, viewing previous container instances with --previous, and accessing logs from specific containers in multi-container pods using -c.
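The flags mentioned above can be combined as in the following sketch, which assembles a `kubectl logs` command line in Python. The helper name `kubectl_logs_args`, the pod name `checkout-7d9f`, and the container name `app` are hypothetical. The flags `-f`, `--since`, `--previous`, and `-c` are the ones described in the text. The helper only builds the argument list and does not contact a cluster.

```python
from typing import List, Optional


def kubectl_logs_args(
    pod: str,
    container: Optional[str] = None,
    follow: bool = False,
    since: Optional[str] = None,
    previous: bool = False,
) -> List[str]:
    """Build the argument list for a `kubectl logs` call (hypothetical helper)."""
    args = ["kubectl", "logs", pod]
    if container is not None:
        # -c: pick one container in a multi-container pod
        args += ["-c", container]
    if follow:
        # -f: stream logs in real time
        args.append("-f")
    if since is not None:
        # --since: only logs newer than a relative duration such as "1h"
        args += ["--since", since]
    if previous:
        # --previous: logs from the previous instance of the container
        args.append("--previous")
    return args


if __name__ == "__main__":
    # Pod and container names are assumed for illustration.
    print(" ".join(kubectl_logs_args("checkout-7d9f", container="app", since="1h")))
    print(" ".join(kubectl_logs_args("checkout-7d9f", previous=True)))
```

The list can be handed to `subprocess.run` wherever `kubectl` is configured for the target cluster.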

Optimize Kubernetes cluster cost with Datadog Cluster Autoscaler

Running Kubernetes at scale almost always means paying for more compute than you need. To protect reliability, platform and application teams typically overprovision nodes early in development and keep scaling up as they add features and workloads. They are often reluctant to move to smaller or different instance types without a clear picture of how those changes will affect performance or availability. The result is a fleet of underutilized nodes that silently inflate your cloud bill.

Data control with CivoStack Enterprise: Beyond the air-gap debate

When an organization talks about sovereignty it is usually about where its data lives, who can touch it and how it is protected. Adding air‑gap to the discussion often turns the conversation into a binary: either the system is completely cut off from the outside world or it isn’t. In practice the reality sits somewhere in between.

Digital sovereignty: US sanctions and the control of European cloud

"There's a big potential kill switch sitting on his desk." This clip from our Digital Sovereignty Panel exposes a fundamental threat: how US geopolitical interests can compel cloud providers like Microsoft to suspend services, even for international organizations. Panelist Johan David Michels discusses the shocking case of the ICC prosecutor, Karim Khan, whose work email was withdrawn by Microsoft following US sanctions. This illustrates the fundamental lack of digital sovereignty over data hosted by US hyperscalers.

Stop tool sprawl - Welcome to Terraform/OpenTofu support

Provisioning cloud resources shouldn’t require a second stack of tools. With Qovery’s new Terraform and OpenTofu support, you can now define and deploy your infrastructure right alongside your applications. Declaratively, securely, and in one place. No external runners. No glue code. No tool sprawl.