
DevOps

The latest news and information on DevOps, CI/CD, automation, and related technologies.

Introducing Squadcast's Key Based Deduplication

We are excited to share another feature update with all our valued customers! We have recently gone live with our Key Based Deduplication feature, enabling you to define dedup keys using customizable templates for configured alert sources. With this feature, you can automatically group similar incidents and effectively deduplicate alerts.
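The general idea of key-based deduplication can be illustrated with a minimal sketch. This is not Squadcast's implementation or template syntax: the `${field}` placeholders, the alert fields, and the helper names `dedup_key` and `group_alerts` are all hypothetical, chosen only to show how a template-derived key can group similar alerts.

```python
from string import Template


def dedup_key(template: str, alert: dict) -> str:
    """Render a dedup key by filling the template with fields from the alert.

    The ${field} syntax is Python's string.Template, used here as a stand-in
    for whatever template language a real alerting tool provides.
    """
    return Template(template).substitute(alert)


def group_alerts(alerts: list, template: str) -> dict:
    """Group alerts that render to the same dedup key."""
    groups = {}
    for alert in alerts:
        groups.setdefault(dedup_key(template, alert), []).append(alert)
    return groups


# Hypothetical alert payloads; real payloads depend on the alert source.
alerts = [
    {"service": "checkout", "check": "high_latency", "host": "web-1"},
    {"service": "checkout", "check": "high_latency", "host": "web-2"},
    {"service": "search", "check": "disk_full", "host": "db-1"},
]

# Alerts that share a service and check collapse into one group,
# regardless of which host raised them.
groups = group_alerts(alerts, "${service}:${check}")
for key, members in groups.items():
    print(key, len(members))
```

Leaving `host` out of the template is the design choice that matters here: the fields included in the key decide which alerts count as duplicates of one another.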

AKS Day 2 management made easy

You’ve deployed your Azure Kubernetes Service (AKS) cluster into production. Now what? Deploying AKS clusters is cause for celebration, but don’t rest on your laurels for too long. You are now in the Day 2 Kubernetes management phase and the operational challenges are on the rise. The Kubernetes application lifecycle is broken into three main phases. They are often referred to as Days, but realistically, they take much longer than 24 hours!

Enable and use GKE Control plane logs

Are you having issues with the control plane components in your GKE cluster? Are you interested in gaining visibility into the control plane side of the cluster so you can troubleshoot issues yourself? Then GKE Control Plane Logs is a great way to gain insight into what's going on with your cluster. In this video, we provide a quick overview of control plane components and logs, and show how to enable control plane logs on new and existing GKE clusters. Watch this video to learn how to use control plane logs to troubleshoot webhook and control plane latency issues in GKE clusters.

Low Disk Space Remediation: Triaging the Explosion of Data and Closing the Loop

Today, there is an explosion of data in IT. Critical infrastructure, whether it lives in the cloud, on premises in the data center, or in orchestrated containers, can run into low disk space issues as a result. How do you respond to the challenge of low disk space?

Are you ready for DORA?

Not to be confused with the popular children’s TV character, DORA stands for the Digital Operational Resilience Act, a new EU regulation for the financial sector. DORA became law on 16 January 2023 and will apply from 17 January 2025, so it’s crucial that senior executives in the financial sector, such as Chief Risk Officers and Chief Information Security Officers, understand its implications and prepare for compliance from day one.

Enhancing the Ubuntu Experience on Azure: Introducing Ubuntu Pro Updates Awareness

Canonical works closely with Microsoft to ensure that running Ubuntu on Azure is a great experience. One of the key aspects of this collaboration is ensuring the longevity and security of Ubuntu releases, such as Ubuntu 18.04 LTS, even beyond their Standard Security Maintenance period. We are excited to announce the integration of Ubuntu Pro update awareness into Azure through the Azure Guest Patching Service (AzGPS) and Update Management Center (UMC).

Prometheus Monitoring 101

Prometheus is an increasingly popular tool in the world of SREs and operational monitoring. Based on ideas from Google’s internal monitoring service (Borgmon), and with native support from services like Docker and Kubernetes, Prometheus is designed for a cloud-based, containerized world. As a result, it’s quite different from existing services like Graphite. Starting out, it can be tricky to know where to begin with the official Prometheus docs and the wave of recent Prom content.

Connecting Prometheus and Grafana

Prometheus and Grafana together make a great combination of tools for monitoring infrastructure. In this article, we will discuss how Prometheus can be connected with Grafana and what makes Prometheus different from the other tools on the market. MetricFire's product, Hosted Graphite, runs Graphite (a Prometheus alternative) with Grafana dashboards for you, so you get the reliability and ease of use that are hard to achieve in-house.