The Importance of Historical Log Data

Centralized log management lets you control who can access log data without giving those users access to the servers themselves. You can also correlate data from different sources, such as the operating system, your applications, and the firewall. Another benefit is that users do not need to log in to hundreds of devices to find out what is happening. You can also use data normalization and enhancement rules to create value for people who might not be familiar with a specific log type.

What Slack Downtime Costs, and What We Can Do About It

This morning, though, all of our backlogs were a little harder to sift through thanks to a Slack outage in Europe and the US. To calm down, some of us might have turned to our Google Home or Chromecast to unwind while the outage hours piled up, only to find those were down too! What a morning! Now that Slack is running again, let’s take a moment to reflect on what the outage means and what we can learn from it.

CFEngine 3.12.0 LTS Released

Today we are happy to announce the general availability of CFEngine 3.12.0 LTS! This release has a lot of new features, and we are very excited about all the new possibilities you get with CFEngine 3.12.0 LTS. If you are using the previous LTS, 3.10, you will also benefit from all the new features, improvements, and testing of the 3.11 release, which you can read more about in the CFEngine 3.11 release post.

How Wix Supports TDD with their TestKit for Sentry's Raven SDK

Ziv Levy, Software Engineer at Wix (a Sentry customer), recently faced two challenges: simulating a bug in Wix code and testing report data. His recent blog post, Meet Raven TestKit: Wix Engineering’s Open Source Tool to Test Sentry Reports, dives into how a plugin for Sentry’s Raven SDK helped him tackle those challenges, with specific documentation detailing what made the choice so simple. Read the post below, and then go try Raven TestKit for yourself (please, and thank you).

6 Reasons Why PagerDuty Engineering Stands Out From the Crowd

The other day, a newer Engineering Manager here at PagerDuty, Dileshni Jayasinghe, started a Slack thread expressing joy at how fantastic our engineering team is after attending a conference with engineering folk from other organizations. She explained that she’d shared our practice of owning what we build with someone—who then responded by gazing off into the distance and saying, “That’s my dream.”

Metrics At Scale: How to Scale and Manage Millions of Metrics (Part 2)

With businesses collecting millions of metrics, let’s look at how they can efficiently scale and manage that volume. As covered in the previous article (A Spike in Sales Is Not Always Good News), analyzing millions of metrics for changes may result in alert storms that notify users about EVERY change, not just the most significant ones. To bring order to this situation, Anodot groups correlated anomalies together into a unified alert.