The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

How Grafana Labs enables horizontally scalable tail sampling in the OpenTelemetry Collector

Tracing is a widely adopted solution to provide performance insights into distributed applications. It is a valuable resource for developers to view the service call graph and track service latency at a granular level. It’s also a handy tool for on-call engineers to drill down and debug a problematic service during an outage. There are a number of open source distributed tracing frameworks out in the wild, including Jaeger, Zipkin, and OpenTelemetry.

How we monitor Checkly

If you follow our very own @HLENKE, you might have seen his recent tweet. Availability and responsiveness are key topics for every SaaS platform. They also happen to be multi-level, complex topics that often span different technology stacks and can be tackled with a variety of approaches. Hannes' tweet actually gives us the perfect segue into a blog post about how our engineering team currently monitors Checkly.

The Raw & Real Approach to Observability

Practicing observability isn’t just about tools. It also means improving how you work together and how you share lessons across the team. Learning from each other helps everyone on your team become better engineers who can create amazing experiences with code, make code work at incredible scale, or both. Writing software and operating it in production is, and must be, a team sport.

Serverless Logging Performance - Part 1

When thinking about serverless applications, one thing that comes to mind immediately is efficiency. Running code that gets the job done as swiftly and efficiently as possible means you spend less money, which means good coding practices suddenly directly impact your bottom line. How does logging play into this, though? Every logging action your application takes is within the scope of that same performance evaluation.

Automating Storage Forecasting Using a Time Series Database Puts the Future in Customers' Hands Today

When the stakes are high, every decision is only as good as the information behind it. With the right information, enterprises and vital sectors can confidently make informed decisions. Data becomes a foundation for action — and a source of differentiation. But how do you store the relentless influx of data — especially since data storage costs, amplified by the risk of data loss, are among the top hurdles facing organizations today?

Infrastructure as a Competitive Advantage - Tips for Managing Trading Operations

I recently spoke on a panel discussion with the Securities Technology Analysis Center (STAC) on the use of infrastructure as a competitive advantage. The event offered fresh thinking on what it takes to manage high-frequency, low-latency trading environments, so I wanted to share some best practices for organization, monitoring, and making insights operational.

Survivorship Bias in Observability

During World War II, a mathematician named Abraham Wald worked on a problem: identifying where to add armor to planes, based on the bullet puncture patterns of the aircraft that returned from missions. The obvious and accepted thought was that the bullet holes marked the problem areas for the planes. Wald pointed out that these weren’t actually the problem areas, because the planes hit there had survived.