
Slack's New Metrics Storage Engine Challenges Prometheus

Metrics storage engines must be specially engineered to accommodate the quirks of metrics time-series data. Prometheus is probably the most popular metrics storage engine today, powering numerous services including our own Logz.io Infrastructure Monitoring. But Prometheus was not enough for Slack given their web-scale operation. They set out to design a new storage engine that can yield 10x more write throughput and 3x more read throughput than Prometheus! In February 2022 Suman Karumuri, Sr.

Why Icinga?

We have decided to make some short educational videos about Icinga, and today we will be releasing the first one: Why Icinga? In these videos we want to explain the Whys and Whats and Hows around Icinga in a way that is accessible to anyone who is interested. So why do you want to use Icinga? Monitoring is the foundation you want to build your infrastructure on.

Troubleshoot faster with improved Datadog Events

Datadog Events provides customers with a data feed about their infrastructure and applications, delivering an up-to-the-minute history of activity such as code deployments, configuration changes, and triggered alerts. Events collects data from Datadog products and over 100 third-party integrations—including Docker, Jenkins, Kubernetes, Sentry, AWS CloudWatch, and Azure Service Health.

Bolster network monitoring with root cause analysis

If you own an enterprise, then you know the value of a healthy network and how seriously detrimental a network outage is to your business. But network issues are inevitable. The heavy dependence on networks to meet the ever-changing client and internal usage requirements takes a heavy toll on the network. This makes networks vulnerable to common problems such as unplanned, sudden downtime, high resource utilization, and hardware malfunctioning.

How We Run Successful Beta Tests with Error Reporting

We’ve recently completed a large beta test for our new product here at Testmo. We build a test management tool, so most of our users are professional software testers. As you can imagine, our customers are a rather critical group of users when it comes to software quality. We’ve learned some important lessons about running a large beta test and we want to share how we benefited from Sentry error reporting to identify, find, and fix issues quickly.

New Honeycomb Whitepaper on Frontend Observability

Big news: I can finally stop pointing anyone who asks about Honeycomb’s story for frontend observability to Emily’s blog post from 2017 on “Instrumenting Page Loads with Honeycomb.” (It was a great post, don’t get me wrong, but I don’t think any of us knew it would bear such weight for so long.) I am ecstatic to announce that we have released a new whitepaper called “Getting Started With Honeycomb Client-Side Instrumentation for Browser Applications,” wri

Modernizing Network Monitoring with InfluxDB and Telegraf

This article was originally published in The New Stack. As the technology landscape continues to change at a rapid pace, enterprise companies are in a rush to catch up and modernize their legacy IT and network infrastructure to capture the benefits of newly developed tools and best practices. By adopting modern DevOps techniques, they can reduce their operational costs, increase the reliability of their services and improve the overall speed and agility at which their IT teams are able to move.

Website Performance Monitoring: What Are You Really Paying for?

Have you found yourself confused by the plans and pricing around website performance monitoring? Are you using the features you’re paying for? Finding the right service involves many moving parts. Very often, that journey begins with a quest to find a simple up or down monitoring tool for external verification. It’s only after you take that first step into the market that you begin to notice additional features and expanded functionality.

Welcome to the Auvik family, MetaGeek!

We have some exciting news. Today I’m thrilled to welcome a new member to the Auvik family. The MetaGeek team are experts in wireless, and throughout their 15+ year history MetaGeek has led the way in creating tools to help network administrators and wireless engineers build, monitor, and troubleshoot Wi-Fi networks. At Auvik, we have an ambitious and aggressive roadmap to deliver a remarkable technology management experience.

New in Grafana Loki 2.5: Faster queries, more log sources, so long S3 rate limits, and more!

I’m very excited to tell you all about the latest Grafana Loki installment, 2.5! A huge amount of work, nearly 500 PRs, has gone into Loki between v2.4 and now. The major themes for this release are improved performance, continuing ease of operations, and more ways to ingest your logs. I usually find myself the most excited about performance improvements, so let’s start there.