Operations | Monitoring | ITSM | DevOps | Cloud

Your Boss Doesn't Understand Your Work (Here's Why)

Developer productivity metrics create unique anxiety. If your company rolled out tracking systems like DORA metrics or velocity dashboards, you're probably wondering what these numbers mean and how they'll evaluate your work. At GitKon 2025, we assembled senior engineers from GitHub, Cloudflare, Kong, and GitKraken to discuss "Your Boss is Measuring You, Now What?" The panel included both individual contributors and engineering leaders, creating an honest conversation about measurement from both perspectives.

From Blueprint to Production: Building a Kubernetes MCP Server

As Large Language Models (LLMs) evolve from simple chatbots into agentic workflows, the need for a standardized way to connect them to external data and infrastructure has become critical. In a recent workshop hosted by Nir Adler, Innovation Engineer at Komodor, we explored how to bridge this gap using the Model Context Protocol (MCP).

What you missed at OTel Unplugged 2026 in 8 minutes!

OTel Unplugged 2026 was different by design. Held alongside FOSDEM in Brussels, this was an unconference built by the OpenTelemetry community, for the community. No sales pitches. No product demos. Just honest conversations about what’s working, what’s broken, and where OTel needs to go next. In this recap, you’ll hear short interviews and reflections from engineers, maintainers, and practitioners.

How to Use Pandas Time Index: A Tutorial with Examples

Time series data is everywhere in modern analytics, from stock prices and sensor readings to web traffic and financial transactions. When working with temporal data in Python, pandas provides powerful tools for handling time-based indexing through its DatetimeIndex functionality. This tutorial will guide you through creating, manipulating, and extracting insights from pandas time indexes with practical examples.
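
As a rough taste of the three steps the teaser names (creating, manipulating, and extracting from a time index), here is a minimal sketch. It is not taken from the tutorial itself: the sample values and the variable names `readings`, `second_day`, `daily_mean`, and `hours` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data: eight readings taken every six hours (values are made up).
index = pd.date_range(start="2024-01-01", periods=8, freq="6h")
readings = pd.Series([10.0, 12.0, 14.0, 16.0, 20.0, 22.0, 24.0, 26.0], index=index)

# Partial string indexing: select every reading that falls on 2 January.
second_day = readings.loc["2024-01-02"]

# Resample to one row per calendar day and average the readings within each day.
daily_mean = readings.resample("D").mean()

# Extract a component from the index, here the hour of each timestamp.
hours = readings.index.hour

print(second_day)
print(daily_mean)
print(list(hours))
```

Because the index is a DatetimeIndex, the date string passed to `.loc` selects a whole day of rows rather than a single exact label, and `resample` groups rows by calendar period without any manual bucketing.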

Heartbeat behind the metrics | Raghavan on building Site24x7

How do you build an observability platform that keeps up with constant change? In this episode of Heartbeat Behind the Metrics, Srinivasa Raghavan Santhanam, Director of Product Management at Site24x7, reflects on more than 15 years with the product and what he sees as its quiet strengths. He talks about GenAI as a hidden gem inside Site24x7, and you'll hear a standout customer story where a large Indian enterprise replaced 12 different tools with Site24x7, consolidating everything into a single platform. For him, that moment confirmed the platform’s ability to solve multiple problems at scale.

Agentic AI Essentials: The Dashboard and Changing IT Roles

Dashboards provide a useful prism through which we can study the broader evolution of the IT professional’s role in the era of agentic AI. For years, dashboards have been the centerpiece of IT work, serving as the interface where teams interpret system behavior, diagnose issues, and plan actions. Dashboards epitomize the relationship between humans and their systems: humans observe, interpret, and act. As agentic AI enters the picture, that relationship begins to change. Let’s explore how.

What Is Alert Noise Reduction? Techniques & Tools

Modern IT environments are noisy. The volume of telemetry data arriving every second from microservices, hybrid clouds, and containerized applications is extraordinary. For IT Operations, NOC teams, and Site Reliability Engineers (SREs), this data is crucial, but only if it can be acted upon. When it can’t, everything becomes background noise.

What is the Open Container Initiative?

In this video, we explain the Open Container Initiative (OCI) and how open, vendor-neutral standards make containers portable and interoperable across platforms, tools, and environments. We cover what OCI is, why OCI compliance matters, and how OCI defines the core building blocks of the container ecosystem: container images, runtimes, and distribution.

The AI-Empowered Site Reliability Engineer: Automating the Balance of Risk and Velocity

You might expect an AI-SRE agent to target 100% reliable services, ones that never fail. It turns out that past a certain point, however, increasing reliability is worse for a service (and its users) rather than better! Extreme reliability comes at a non-linear cost: maximizing stability limits how fast new features can be developed, dramatically increases the operational cost, and reduces the features a team can afford to offer.