
The Inconvenient Truth About AI Ethics in Observability

Let's be honest: most conversations about AI ethics sound like they're happening in a boardroom, not an ops room. But here's the thing: when you're using AI to make sense of your telemetry data, ethics isn't some abstract concept. It's the difference between insights you can trust and algorithmic noise that leads you down the wrong path. The uncomfortable reality? Your AI is only as ethical as the messiest, most biased piece of telemetry data you feed it. And if you think your data is clean, well...

ITOps vs DevOps: Understanding Their Roles in Modern IT Environments

The conversation around ITOps vs DevOps continues as organizations pursue agile development and responsive service delivery. While both practices share the goal of improving software and infrastructure management, they emerge from distinct historical, operational, and cultural backgrounds. Understanding how these models differ at their core helps decision-makers choose the most suitable operating strategy and align their teams for smoother collaboration.

What Are Traces? A Developer's Guide to Distributed Tracing

One of the most common challenges in modern software engineering is understanding how requests flow through applications. As system architectures shift toward widely distributed, cloud-native designs, keeping track of how an application processes user actions is more difficult than ever. A single user action may trigger events processed in dozens of backend services. Traces help developers meet this challenge.
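
To make the idea concrete, here is a minimal sketch of the core data model behind tracing: a shared trace ID that follows one request across services, with parent/child links between spans. The class and field names are illustrative, not any particular tracing library's API.

```python
import time
import uuid

class Span:
    """A single unit of work within a trace (hypothetical minimal model)."""
    def __init__(self, name, trace_id=None, parent_id=None):
        self.name = name
        self.trace_id = trace_id or uuid.uuid4().hex  # shared by every span in the request
        self.span_id = uuid.uuid4().hex               # unique to this unit of work
        self.parent_id = parent_id                    # links a child span to its caller
        self.start = time.time()
        self.end = None

    def child(self, name):
        # Child spans inherit the trace_id, so every hop gets stitched together.
        return Span(name, trace_id=self.trace_id, parent_id=self.span_id)

    def finish(self):
        self.end = time.time()

# One user action fanning out across backend services:
root = Span("POST /checkout")
payment = root.child("payment-service.charge")
inventory = root.child("inventory-service.reserve")
payment.finish(); inventory.finish(); root.finish()

assert payment.trace_id == root.trace_id == inventory.trace_id
```

Real systems (OpenTelemetry, Jaeger, Zipkin) add context propagation over the wire, but the trace-ID/parent-ID structure above is the essence of how a backend reassembles dozens of service hops into one request timeline.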

OWASP CI/CD Part 9: Improper Artifact Integrity Validation

Improper artifact integrity validation is a critical vulnerability in CI/CD pipelines characterised by insufficient mechanisms to cryptographically verify the authenticity and integrity of code and build artifacts traversing the pipeline. When these controls are weak or absent, adversaries with access to any pipeline stage can inject malicious or tampered artifacts that appear legitimate, enabling undetected propagation through the pipeline and eventual deployment into production environments.
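
A basic mitigation is to record a cryptographic digest of each artifact when it is produced and verify it at every stage that consumes it. The sketch below, with hypothetical function names, shows the minimal check in Python's standard library; production pipelines typically go further with signed attestations (e.g. Sigstore, in-toto).

```python
import hashlib
import hmac

def sha256_digest(path: str) -> str:
    """Compute the SHA-256 digest of a build artifact, streaming in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_artifact(path: str, expected_digest: str) -> bool:
    """Reject any artifact whose digest differs from the one recorded at build time.
    hmac.compare_digest performs a constant-time comparison."""
    return hmac.compare_digest(sha256_digest(path), expected_digest)
```

The crucial point is that the expected digest must travel through a channel the attacker cannot write to (a signed manifest or a trusted metadata store), not alongside the artifact itself.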

Canonical announces Charmed Feast: A production-grade feature store for your open source MLOps stack

July 10, 2025: Today, Canonical announced the release of Charmed Feast, an enterprise solution for feature management with seamless integration with Charmed Kubeflow, Canonical’s distribution of the popular open source MLOps platform. Charmed Feast provides the full breadth of upstream Feast’s capabilities while adding multi-cloud deployment and comprehensive support.

Datadog named Leader in 2025 Gartner Magic Quadrant for Observability Platforms

We are thrilled to announce that, for the fifth consecutive year, Datadog has been named a Leader in the 2025 Gartner Magic Quadrant for Observability Platforms. We believe that this recognition reflects our continued focus on helping customers observe, secure, and act on everything that matters across their technology stack.

What Are Log Loss and Cross-Entropy?

You're building a classification model, and your framework throws around terms like "log loss" and "cross-entropy loss." Are they the same thing? When should you use binary cross-entropy versus categorical cross-entropy? What about focal loss? This blog breaks down these loss functions with practical examples and real-world implementations.
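
For the binary case, log loss and binary cross-entropy are the same quantity: L = -(1/N) Σ [y·ln(p) + (1−y)·ln(1−p)]. Here is a minimal pure-Python implementation of that standard formula (the clipping epsilon is a common implementation detail, mirrored in libraries like scikit-learn):

```python
import math

def binary_log_loss(y_true, y_pred, eps=1e-15):
    """Binary cross-entropy (log loss):
    L = -(1/N) * sum(y * ln(p) + (1 - y) * ln(1 - p)).
    Predictions are clipped away from 0 and 1 so ln() stays finite."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# A confident, correct classifier scores a low loss; a confident,
# wrong one is penalized heavily by the logarithm.
print(round(binary_log_loss([1, 0, 1], [0.9, 0.1, 0.8]), 4))  # → 0.1446
```

Categorical cross-entropy generalizes the same idea to more than two classes by summing −y·ln(p) over each class's predicted probability.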

Cloud Log Management: A Developer's Guide to Scalable Observability

As systems move to microservices, serverless, and multi-cloud setups, debugging gets harder. You’re no longer dealing with a single log file; you’re looking at logs from dozens of services, running across different environments. Traditional debugging methods like SSH-ing into servers or adding print statements don’t scale in these environments. Cloud log management tools help by collecting logs from all your services into one place.
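
Central collection works best when each service emits structured (usually JSON) log lines that a collector agent can ship and index. A minimal sketch, with an illustrative field schema rather than any tool's required format:

```python
import json
import sys
import time

def log(service: str, env: str, level: str, message: str, **fields):
    """Emit one structured JSON log line to stdout, where a collector
    (e.g. an agent tailing container output) can ship it to central storage.
    The field names here are illustrative, not a mandated schema."""
    record = {
        "ts": time.time(),
        "service": service,
        "env": env,
        "level": level,
        "message": message,
        **fields,
    }
    sys.stdout.write(json.dumps(record) + "\n")

log("checkout", "prod", "ERROR", "payment declined", order_id="A123")
```

Because every line carries its own `service` and `env` fields, the central store can filter and correlate across dozens of services without anyone SSH-ing into a box.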

How We Made Our Queries 99.5% Faster

We cut log-query scanning from ~100% of data blocks to under 1% by reorganizing how logs are stored in ClickHouse. Instead of relying on bloom-filter skip indexes, we generate a deterministic “resource fingerprint” (a hash of cluster, namespace, pod, etc.) for every log source and sort the table by this fingerprint in the primary-key ORDER BY clause. This packs logs from the same pod or service contiguously, letting ClickHouse’s sparse primary-key index skip irrelevant blocks.
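
The fingerprint itself can be as simple as a stable hash over the identity fields. A minimal sketch of the idea (the function name, field choices, and 64-bit truncation are assumptions for illustration, not the post's exact implementation):

```python
import hashlib

def resource_fingerprint(cluster: str, namespace: str, pod: str) -> int:
    """Deterministic fingerprint for a log source.
    Hashing the identity fields gives every pod a stable value, so sorting
    the table by this column packs each pod's logs into contiguous blocks."""
    key = f"{cluster}/{namespace}/{pod}".encode()
    # Take the first 8 bytes of SHA-256 as a 64-bit sortable integer.
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

fp1 = resource_fingerprint("prod-1", "payments", "api-7f9c")
fp2 = resource_fingerprint("prod-1", "payments", "api-7f9c")
assert fp1 == fp2  # same source always maps to the same fingerprint
```

With the table sorted by something like `ORDER BY (fingerprint, timestamp)`, a query filtered to one pod touches only the narrow range of blocks holding that fingerprint, which is what lets the sparse primary-key index skip everything else.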