Alerting

wavefront

SLO Alerting with Wavefront

Back in the good old days of monolithic applications, most developers and application owners relied on tribal knowledge for what performance to expect. Although applications could be incredibly complex, the understanding of their inner workings usually resided with relatively few people in the organization. Application performance was managed informally and measured casually.

pagerduty

Service-Based vs. Team-Based Approach: Which Is Better?

How is the incident response process set up at your organization? At PagerDuty, our approach is to look holistically at your infrastructure, your customer-facing applications, and your products. We describe these items as “services” that roll up to form a “business service.” This setup allows teams to better manage these services so that when incidents do happen, responders can gain context much faster. But how?

logz.io

How Kenshoo Streamlined Development by Creating an IntelliJ Plugin for Log-based Alerting within Logz.io

Complex problems can often be solved with simple, practical solutions. That’s what our team at Kenshoo discovered when we realized that we needed a way to easily and proactively track specific log messages that indicate mission-critical events.

onpage

Large Diamond Mining Organization Adopts OnPage

Diamond mining is recognized as a dangerous occupation, with serious accidents affecting mineworkers across the globe. Oftentimes, these incidents turn out to be fatal because the victim didn’t receive immediate care from first responders. However, large international organizations are making significant strides to minimize the impact of these accidents.

victorops

The Definitive Guide to DevOps Incident Management

Software developers and IT professionals alike are spending more time in production environments – detecting anomalies in performance and fixing issues in real time. Instead of writing code and deploying new updates on a monthly, quarterly, or even yearly basis, software companies are now releasing multiple deployments each day.

pagerduty

Unplanned Work, Part 2: The Impact on the Enterprise

Today, technology problems can alter the trajectory of a business. Minutes of downtime or latency (slow is the new down) cost organizations dearly in lost revenue and can jeopardize customer relationships. However, there’s an even more important consequence of technology problems than top-line risk: reduced innovation as teams are forced into reactive fire drills that take time away from product development.