Operations | Monitoring | ITSM | DevOps | Cloud

April 2022

Slack's New Logging Storage Engine Challenges Elasticsearch

Elasticsearch has long been the prominent solution for log management and analytics. Cloud-native and microservices architectures, together with the surge in workload volumes and diversity, have surfaced some challenges for web-scale enterprises such as Slack and Twitter. My podcast guest Suman Karumuri, a Sr. Staff software engineer at Slack, has made a career on solving this problem. In my chat with Suman, he discusses for the first time in a public space a new project from his team at Slack: KalDB.

Survey: Complexity and Costs Threaten Health of Strong DevOps Pulse

Oh how quickly the times change. Over the past 5 years, as Logz.io has executed its annual DevOps Pulse Survey and created the accompanying report, the state of “modern” cloud engineering and monitoring practices have advanced at a torrid pace. As a result, so has the project itself. Back in 2017 when we started polling the industry and collecting data, we sought to validate the usefulness of DevOps itself.

Slack's New Metrics Storage Engine Challenges Prometheus

Metrics storage engines must be specially engineered to accommodate the quirks of metrics time-series data. Prometheus is probably the most popular metrics storage engine today, powering numerous services including our own Logz.io Infrastructure Monitoring. But Prometheus was not enough for Slack given their web-scale operation. They set out to design a new storage engine that can yield 10x more write throughput, and 3x more read throughput than Prometheus! In February 2022 Suman Karumuri, Sr.

Spring4Shell Zero-Day Vulnerability: Overview and Alert Upon Detection for CVE-2022-22965

On March 29, 2022, a critical vulnerability targeting the Spring Java framework was disclosed by VMware. This severe vulnerability is identified as a separate vulnerability inside Spring Core, tracked as CVE-2022-22965 and canonically named “Spring4Shell” or “SpringShell”, leveraging class injection leading to a full remote code execution (RCE).

Who Owns Observability In Enterprises?

It’s common sense. When a logstorm hits, you don’t want to be left scrambling to find the one engineer from each team in your organization that actually understands the logging system – then spending even more time mapping the logging format of each team with the formats of every other team, all before you can begin to respond to the incident at hand. It’s a model that simply won’t scale.