Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

The Cloud Monitoring Journey

Monitoring is not a goal, but a path. Depending on the maturity of your project, it can be labeled in one of these six steps of the cloud monitoring journey. You will find best practices for all of them and examine what companies get from each one. From classic virtual machines to large Kubernetes clusters or even serverless architectures, companies have adopted the cloud as a mainstream way to provide their online services.

Prometheus Alertmanager best practices

Have you ever fallen asleep to the sounds of your on-call team in a Zoom call? If you’ve had the misfortune to sympathize with this experience, you likely understand the problem of Alert Fatigue firsthand. During an active incident, it can be exhausting to tease the upstream root cause from downstream noise while you’re context switching between your terminal and your alerts. This is where Alertmanager comes in, providing a way to mitigate each of the problems related to Alert Fatigue.

Sysdig Monitor introduces native support for Microsoft Azure Monitor

Microsoft Azure Monitor allows customers to get critical details about their Azure cloud environments and services. The API for Azure Monitor can be a great way for teams to pull this information into their own storage systems for further analysis. However, it can be an overwhelming amount of data to process. Sysdig can help with this problem and eliminate time and effort. Here is how we do it …

Exploiting IAM security misconfigurations and how to detect them

These three IAM security misconfiguration scenarios are rather common. Discover how they can be exploited, but also, how easy it is to detect and correct them. Identity and access management (IAM) misconfigurations are one of the most common concerns in cloud security. Over the last few years, we have seen how these security holes put organizations at increased risk of experiencing serious attacks on their Cloud accounts.

Our Journey Into Cutting Kubernetes Costs by 40%

As companies start their Kubernetes and cloud-native journey, cloud infrastructures and services grow at a rapid pace. This happens all too often as organizations shift left without thorough controls, which can lead to overallocating and overspending on their Kubernetes environments. Organizations running workloads in the cloud can put budgets at risk when they lack information about key facts.

How to Monitor kube-controller-manager

When it comes to creating new Pods from a ReplicationController or ReplicaSet, ServiceAccounts for namespaces, or even new EndPoints for a Service, kube-controller-manager is the one responsible for carrying out these tasks. Monitoring the Kubernetes controller manager is fundamental to ensure the proper operation of your Kubernetes cluster. If you are in your cloud-native journey, running your workloads on top of Kubernetes, don’t miss the kube-controller-manager observability.

Exploring the New Container Checkpointing Feature

Kubernetes is a continuously evolving technology strongly supported by the open source community. In the last What’s new in Kubernetes 1.25, we mentioned the latest features that have been integrated. Among these, one may have great potential in future containerized environments because it can provide interesting forensics capabilities and container checkpointing.

Kubernetes Services: ClusterIP, Nodeport and LoadBalancer

Pods are ephemeral. And they are meant to be. They can be seamlessly destroyed and replaced if using a Deployment. Or they can be scaled at some point when using Horizontal Pod Autoscaling (HPA). This means we can’t rely on the Pod IP address to connect with applications running in our containers internally or externally, as the Pod might not be there in the future.

A day in the life of a Customer Support Detective

I open my laptop and look over my cases while I slurp down my first cup of coffee. Most of my backlog is waiting on customer updates, or bug fixes. Two of my cases have been marked for closure. Not a bad start for a Monday! A pod CrashLoopBackoff issue was resolved by bumping up memory requests, and the missing metrics issue was solved after applying some Prometheus annotations to the customer’s nginx pods. I notate and close both cases. No sooner do I hear the beep of the badge scanner.