Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Observability for complex systems and related technologies.

Datadog vs. Splunk: Which Is the Better Observability Solution [2023 Comparison]

Datadog and Splunk are among the most popular performance monitoring tools on the market. If you’re looking for such a solution and want to scratch one off your shortlist, look no further than this article. In this Datadog vs. Splunk comparison, we will take a deep dive into everything each tool has to offer. We will point out their similarities and differences to help you decide which tool better meets your needs.

Developing with OpenAI and Observability

Honeycomb recently released our Query Assistant, which uses ChatGPT behind the scenes to build queries based on your natural language question. It's pretty cool. While developing this feature, our team (including Tanya Romankova and Craig Atkinson) built tracing in from the start, and used it to get the feature working smoothly. Here's an example. This trace shows a Query Assistant call that took 14 seconds. Is ChatGPT that slow? Our traces can tell us!

What is Observability?

“Observability” seems to be the buzzword du jour in IT these days, but what does it actually mean, and how is it any different from plain old monitoring? In simple terms, observability is the ability to understand how a system is performing and behaving from the data that system generates. It is not just about monitoring metrics or collecting logs, but also about understanding the context of those metrics and logs and how they relate to the overall health of the system.

Why Paradigm switched to Grafana Cloud: Inside their observability stack

As the largest liquidity network in crypto, Paradigm facilitates more than $11 billion in monthly volumes, representing nearly 40% of global cryptocurrency option flows. Their free-to-use platform provides a single point of access to multi-asset, multi-instrument liquidity on demand, and Software Architect Jameel Al-Aziz leads the team of developers who build and maintain the platform.

Mastering Complex Progressive Delivery Challenges with Lightrun

Progressive delivery is a modification of continuous delivery that allows developers to release new features to users in a gradual, controlled fashion. It does this in two ways. Firstly, by using feature flags to turn specific features ‘on’ or ‘off’ in production, based on certain conditions, such as specific subsets of users. This lets developers deploy rapidly to production and perform testing there before turning a feature on.

Cribl Stream Production Deployment Guide

Deploying new tools can be challenging for Operations and Security data teams. However, we recently released a reference architecture for Cribl Stream to streamline this process and reduce trial and error. During a live discussion, Cribl's Ed Bailey and Eugene Katz will share a real-life example of how a customer would start the deployment planning process. We will start with requirements and finish with a diagram to help guide a production deployment.

Understanding Observability: The Key to Effective System Monitoring

In the rapidly evolving landscape of modern tech, system reliability has become a critical factor for businesses to succeed. To ensure the stability and performance of complex distributed systems, companies are relying on observability, a concept that is not synonymous with traditional monitoring but goes beyond it.

Gain insights into Kubernetes errors with Elastic Observability logs and OpenAI

As we’ve shown in previous blogs, Elastic® provides a way to ingest and manage telemetry from the Kubernetes cluster and the application running on it. Elastic provides out-of-the-box dashboards to help with tracking metrics, log management and analytics, APM functionality (which also supports native OpenTelemetry), and the ability to analyze everything with AIOps features and machine learning (ML).

Less is more: industry leaders share their success with tool consolidation for maximized productivity

We’ve known for years that context switching is detrimental to productivity. Both computers and humans become less productive with each additional concurrent task or priority. Every time you need to shift your focus between projects, you lose approximately 20% efficiency as you figure out where you left off, what needs to be done, how the work fits into the project, etc.