Operations | Monitoring | ITSM | DevOps | Cloud

From raw data to flame graphs: A deep dive into how the OpenTelemetry eBPF profiler symbolizes Go

Imagine you're troubleshooting a production issue: your application is slow, the CPU is spiking, and users are complaining. You turn to your profiler for answers—after all, this is exactly what it's built for. The profiler runs, collecting thousands of stack samples. eBPF profilers, including the OpenTelemetry eBPF profiler, operate at the kernel level, so they capture raw program counters: memory addresses pointing into your binary.

When Code Becomes Cheap: The New Reliability Constraint in Software Engineering

For most of the history of software engineering, the primary constraint was production. Code was expensive, skilled engineers were scarce, and shipping features required concentrated human effort. Velocity was limited by how fast people could reason, implement, test, and deploy. That constraint shaped everything from team size, architecture, release cadence, through to how we thought about technical debt. When production is expensive, you optimise for output. You remove friction from shipping.

Smarter Alerts, Faster Root Cause, & Proactive IT Ops with SolarWinds AI Observability

Discover how AI is transforming IT operations with SolarWinds Observability. In this video, we showcase powerful new AI-driven features designed to help you detect issues faster, reduce alert noise, and stay ahead of performance problems across your entire stack. From applications and databases to networks, cloud infrastructure, and end-user experience SolarWinds AI delivers deep insights where it matters most.

An Oh Dear skill for use in Claude Code or Codex

AI coding agents are getting good at calling tools. Claude Code, Codex, and others can run shell commands, parse JSON, and reason about the results. But they need to know what tools are available and how to use them. That's what skills are for. A skill is a small package of documentation that teaches an AI agent how to use a specific tool. We've built one for Oh Dear.

CertKit Keystore: Private keys that never leave your infrastructure

When you use CertKit, your private keys live in CertKit’s database, encrypted at rest. We’ve written about why the actual risk is smaller than it sounds. But some organizations have policies that prohibit storing private keys with any third party, regardless of how they’re protected. That policy isn’t going away. The Local Keystore enables those organizations to use CertKit and still keep their keys local.

Monitor Juniper Mist in Datadog

From point-of-sale (POS) terminals to cloud-based applications and mobile devices, reliable connectivity is critical to business operations. Even brief disruptions can negatively impact user experiences, resulting in failed transactions, delayed application responses, or repeated attempts to reconnect. Juniper Mist is an AI-powered networking platform that provides insight into wireless environments, including access point performance and radio frequency health.

A new Host Map for modern infrastructure

A host map is a visual representation of your infrastructure that displays hosts and related resources such as clusters, pods, and containers in a single, interactive view. We introduced the Datadog Host Map more than a decade ago to help you “know thy infrastructure” and answer critical questions: Does everything look healthy? Has anything changed? Does the shape of my environment match what I expect?

Scary Things Happen in Production. Context Helps You Find Them.

Production is a rowdy place of chaos, especially at scale. When you have millions of requests per second flowing through your system, weird things are always happening. Outliers, unusual request patterns, spikes and pulses of traffic from unknown sources, port scanning…it’s all there. To the naked eye, it looks like noise. If you know what you are looking for…patterns emerge. The night sky: every dot is a request. Without intent, it's an undifferentiated field of light.

The Four Factors of Production-Ready PostgreSQL

Discover how Aiven makes the 4 factors of production-ready PostgreSQL easy A database isn't production-ready just because your application can query it. To be truly ready for production, your PostgreSQL setup must be able to survive node failures, block unauthorized network access, handle sudden connection spikes without crashing, and tell you exactly why a query is running slow. Let’s explore these four factors and how Aiven puts being production-ready on easy mode.

How to Plan a Successful CI/CD Migration Without Disrupting Developers | Harness Blog

Modern engineering teams run on CI/CD. It’s where pull requests get validated, artifacts get produced, and releases get promoted to production. That also makes CI/CD migration very risky because you're not just moving a "tool"; you're moving the workflow that developers use dozens or hundreds of times a day. The good news: disruption is optional.