
Latest Posts

Crossing K8s Monitoring and Observability Gaps With Change Intelligence

Recently we had the privilege of being named a Gartner Cool Vendor in the Monitoring and Observability category. The funny thing is, while this is definitely the closest Gartner category for our solution, we aren’t really used to thinking about Komodor as a monitoring and observability tool.

Just Launched ValidKube. Here Are 7 Other K8s Open Source Projects We Love!

I am excited to share that we’ve just launched our first open source project, called ValidKube. The idea behind ValidKube is to fuse together the capabilities of three other popular open-source projects (kubeval, kubectl-neat and trivy by Aqua) and present them in a single view, giving users a way to ensure YAML code hygiene and security at the same time, with just a few clicks.

How One Company Accidentally Autoscaled to 200 Nodes and Crashed The App

This article is based on a true story. The names of the company and people involved were changed to protect the innocent 🙂. A few weeks ago, we were contacted by a pretty big e-commerce company. We can’t really share their name but, for the purpose of this story, let’s call them “KubeCorp Inc”. They reached out to us following an edge-case incident they had, which resulted in severe downtime.

The Top 5 Kubernetes Configuration Mistakes and How to Avoid Them

Let’s face it: Kubernetes can be (and often is) very complex. This means that you’re bound to make a cluster configuration mistake along the way. Apart from hurting your cluster’s performance and security, such a mistake can also heavily affect your visibility into the cluster and your ability to troubleshoot it. There is, however, a light at the end of the tunnel.

Diving Under the Hood With Our New 'Node Status' Feature

More than anything else, Kubernetes troubleshooting relies on the ability to quickly contextualize the problem with what’s happening in the rest of the cluster. As complicated as this may sound, SPEED is really the name of the game. After all, more often than not, you will be conducting your investigation under the glow of fires burning bright in production. Getting relevant context quickly and seeing things holistically is exactly what Komodor was created for.

Four Best Practices to Migrate to Kubernetes (Part 1)

Kubernetes has evolved into the leading platform to build your microservices systems. Given its increased maturity over the past few years as well as the robust ecosystem which has been built around its technology, Kubernetes has become more production-ready than ever. Nevertheless, it still has its own unique set of challenges. In particular, it brings a lot of complexity into play with its adoption.

Five Kubernetes Deployment Best Practices (Part 2)

In our previous post, we focused on tips for making the transition and migration to Kubernetes a smoother and less painful process. In this post, we’d like to provide some tips from the operational trenches for future-proofing your Kubernetes operation after making the move. Kubernetes, as a software-driven system, has many benefits for engineers and DevOps teams to take advantage of.

Komodor Workflows: Automated Troubleshooting at the Speed of WHOOSH!

Today, just in time for KubeCon 2021, I am happy to announce the beta availability of Workflows. For me, this is our most exciting product announcement to date: a completely new capability that expands the definition of what Komodor is, as it charts the course for its next evolution. Let me start with the feature first. In a nutshell, Workflows is a series of smart algorithms that operate within the “depths” of Komodor.

Best Practices Guide for Kubernetes Labels and Annotations

Kubernetes is the de facto container-management technology in the cloud world due to its scalability and reliability. It also provides a very flexible and developer-friendly API, which is the foundation of its control plane. The effectiveness of the Kubernetes API comes from how it manages the Kubernetes resources via metadata: labels and annotations. Metadata is essential for grouping resources, redirecting requests and managing deployments.

The Aftermath of the Facebook 6-Hour Outage

Less than 24 hours ago, the world came to a “social standstill” as Facebook and its sister companies, WhatsApp and Instagram, became unavailable, leaving their 3.5 billion users in a flap. The outage, which lasted almost 6 hours, shut off access for users and businesses all over the world and caused ripple effects that we will likely continue to see in the immediate (and perhaps not-so-immediate) future.