
OpenTelemetry Metrics Explained: A Guide for Engineers

OpenTelemetry (often abbreviated as OTel) is widely regarded as the gold standard among observability frameworks, allowing users to collect, process, and export telemetry data from their systems. OpenTelemetry’s framework is organized into distinct signals, each covering a different aspect of observability. Among these signals, OpenTelemetry metrics are crucial in helping engineers understand their systems.

How to avoid blowing the budget on Azure AI

So you had a great day playing with really awesome new tech, solving big business challenges, and feeling like you really nailed it. Then you wake up the next day to an alert from Azure telling you you've blown your monthly budget, and it's only the first week of the month. We've all been there... right? Using any cloud service comes with a cost, but for most services the budget risk is low. Cost calculated daily isn't a problem when usage is predictable, but not everything works like that.

Search and analyze unsampled logs in real time with Live Tail

With thousands of logs generated every minute from your infrastructure, applications, services, and devices, retaining all of this data for active search and analysis can be cost-prohibitive. Because log volumes continue to grow rapidly as operations scale, it’s common for organizations to implement log management strategies and limit the amount that they store in order to minimize costs.

Integration roundup: Monitoring your modern data platforms

Modern applications increasingly rely on specialized databases and platforms to power real-time analytics and support advanced AI/ML capabilities. These tools help teams accelerate development by consolidating workflows and processes, enabling faster and more efficient data operations. That’s why Datadog has launched three new data platform integrations with Supabase, DuckDB, and Milvus.

Networks are everyone's business - TCP Checks for app developers

Checkly is the industry’s best tool to monitor your production applications. With the power of Playwright, developers can test the systems they’ve developed and roll out those tests as production monitors running from multiple geographies on the Checkly system. Checkly also monitors thousands of API endpoints with complex validation, setup and cleanup scripts, and reliable alerting. So why are we expanding into TCP-based checks?

Should you run your database on Kubernetes?

In the early days, people debated how safe it was to store their money in the bank; now, we debate running databases on Kubernetes. Over the years, Kubernetes has evolved significantly, transforming into a capable platform for handling various workloads, including stateful ones. In this blog post, I will consolidate some of the best arguments from both sides and give you some points to discuss with your team lead in your next conversation. It's an interesting topic with varying answers.

Your New Retrospective Experience: More Collaborative, Customizable, and Powerful

Run smarter, more effective retros. Customize retros, collaborate in real time, and surface key insights faster with AI. The new experience empowers you to spend less time documenting and more time working together as a team to uncover the insights that lead to real improvements in your process, roles, and technology.

Introducing Observo Orion: Your AI Data Engineer for Security and DevOps

I’m thrilled to announce the general availability of Observo Orion, the industry’s first Agentic AI Data Engineer. This launch represents more than just a new product — it’s a fundamental shift in how organizations will manage their security and observability data pipelines. For years, I’ve watched organizations struggle with data engineering challenges. It’s been a highly specialized discipline, requiring deep technical expertise and significant manual effort.