The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

The Pragmatic Buyer's Guide to AIOps Platforms

Oct 17, 2019 By Yoram Pollack In BigPanda

It’s been said hundreds of times: in the digital era, customers tolerate no downtime. IT operations teams must keep systems running 24x7x365, as the price of downtime is steep. According to Gartner, in 2014, organizations lost $5,600 per minute of downtime, which worked out to well over $300,000 per hour. Today, it’s likely higher, as organizations increasingly rely on technology to power revenue-generating business services.

Announcing Runbooks

Oct 17, 2019 By Bobby Tables In FireHydrant

Since the beginning, we’ve wanted to make responding to incidents faster, easier, and even a joy. We’ve had the typical components of incident response for a while, but orchestrating them was a manual task for our users. Today we’re bringing together all the features already available in our incident response tool in our newest release: Runbooks.

Modernizing Your Digital Operations with Sumo Logic and PagerDuty

Oct 17, 2019 By Sameer Nori In PagerDuty

As digital transformation continues to be central to an organization’s growth mandate, it’s critical to ensure that customer-facing, revenue-generating, mission-critical applications are operationally reliable and secure. That’s where Sumo Logic comes in: for almost 10 years, we have been providing a Continuous Intelligence platform for DevSecOps that’s used by more than 2,000 customers in almost every vertical.

Managing technical risk effectively with Error Budgets

Oct 14, 2019 By Prakya Vasudevan In Squadcast

Tradeoffs are hard. Think about a time when you had to choose between two equally compelling options: (a) addressing technical debt or (b) pushing out that long-awaited feature release and risking breaking production. Or when your team couldn’t agree on where to draw the line between improving request latency and shipping a major new update.

Optimize Your Services - October 2019

Oct 10, 2019 By PagerDuty In PagerDuty

Interested in how to tune up your services to derive even more benefit from your PagerDuty implementation? Simple changes can have a huge impact on how much time and money you spend.

Squadcast: Incident Resolution, The SRE Way

Oct 10, 2019 By Squadcast In Squadcast

Squadcast is an incident management tool that’s purpose-built for SRE. Create a blameless culture by reducing the need for physical war rooms, centralize SLO dashboards, unify internal and external SLIs, and automate incident resolution and knowledge base creation with Squadcast Actions.

Making the Most of PagerDuty + Datadog

Oct 10, 2019 By David M. Lentz In PagerDuty

For your team to effectively respond to incidents, you need a shared, unambiguous incident definition so you can recognize when an incident has occurred and assign the appropriate severity. Definitions of an incident differ across teams, but whatever definition you use, identifying and monitoring key service level indicators (SLIs) can help you understand when your service is operating normally—and when its performance has degraded to the point where you need to trigger an incident.

PagerDuty + Customers: Hear Why the UK Loves PagerDuty, Too!

Oct 9, 2019 By PagerDuty In PagerDuty

At PagerDuty Connect London, customers shared their stories of how PagerDuty is helping lead digital transformation at their companies. We got a chance to take some of these customers aside to ask them why they love PagerDuty!

A single person on-call "rotation" is a critical vulnerability

Oct 9, 2019 By Daniel In FireHydrant

One of the most common complaints we hear from operations and site reliability engineers is about the quality-of-life impacts and resulting stress of their on-call responsibilities. Most of us are already aware that a proper on-call rotation is critical to our engineering organization’s health in terms of both immediate incident response and long-term sustainable growth.

OnPage Mentioned in Two 2019 Gartner Hype Cycle Reports

Oct 8, 2019 By Ritika Bramhe In OnPage

Gartner’s Hype Cycles for Business Continuity and IT Performance Analysis are trusted reports that identify solutions that enhance and solidify an organization’s business continuity. The OnPage team is pleased to announce that we’ve been included in two of Gartner’s Hype Cycle reports, which list OnPage’s incident alert management solution as a trusted tool for today’s support teams.
