Incident Management

opsgenie

Introducing External Services in Opsgenie, powered by Statuspage

As IT and DevOps teams rely more heavily on third-party services, the likelihood of an external incident affecting your customers increases. The 2017 Amazon S3 outage comes to mind as a particularly large downtime event that took thousands of websites down with it. When things go wrong with either an internal or external service, the right people need to be alerted to properly respond to the issue and communicate with customers.

splunk

No Fighting in the War Room: Two Steps to Reduce Mean Time to Resolution with Modern Monitoring

You’re at dinner with friends, messages start coming in, and your favorite football team is about to win their first game in a very long time. You jump on the mobile app to see the last five minutes live. Then you get that dreaded error message: “Service is unavailable, please try again.” On the other end of the message, it’s just as frustrating: the call to war has sounded and the generals start assembling. The team meets in the war room, be it physical or virtual, and the clock is ticking.

victorops

Customizing Your Incident Management Workflow With the VictorOps API

One critical aspect of application support is an on-call strategy allowing the management of application issues outside of typical business hours. The effectiveness of a DevOps or IT on-call strategy is defined by its ability to resolve high-priority incidents in a timely manner. VictorOps provides a solution for the challenges associated with incident management through the use of collaborative incident management software.

PagerDuty: Your Journey To Real-Time Operations

In a world where people expect always-on, seamless digital experiences, it is essential that teams are empowered with the right tools and processes to work together and deliver in critical moments of truth. Our CEO, Jennifer Tejada, shares how PagerDuty acts as the central nervous system for the digital enterprise, helping connect teams to real-time opportunity and elevate work to the outcomes that matter.

PagerDuty Pulse May19

Catch up on all the exciting things we’ve released over the past several months. In this edition of PagerDuty Pulse, you’ll get a view into our Spring release, which helps teams across the enterprise effectively take action during the most critical moments with the power of data, intelligence, and automation at scale. We’re excited to release and share new enhancements across all of our products (Event Intelligence, Modern Incident Response, Analytics, Visibility), as well as to the core platform.

firehydrant

SLO, SLA, SLI Oh My! Creating them can be easy

Imagine you are driving a car on a freeway. Your speedometer is telling you you’re going 62 mph. But you “gotta go fast”. Faster than the 65 mph speed limit. So you go for it: first 68 mph, then 75 mph, then 80 mph. Then you pass a police officer hiding in a speed trap. To your dismay, they pull you over and give you a ticket. All is not lost: there is a silver lining here. It’s the perfect analogy for understanding how indicators, objectives, and agreements all work with each other.
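
One way to read the analogy is as a small sketch in code. The mapping here is an assumption, not something the teaser spells out: the speedometer reading stands in for the indicator (SLI), a hypothetical self-imposed 63 mph target stands in for the objective (SLO), and the posted 65 mph limit stands in for the agreement (SLA), since exceeding it is what carries a penalty. The names `SpeedPolicy` and `evaluate` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SpeedPolicy:
    # Assumed mapping: the objective is an internal target (SLO),
    # the agreement is the posted limit with consequences attached (SLA).
    objective_mph: float
    agreement_mph: float


def evaluate(indicator_mph: float, policy: SpeedPolicy) -> str:
    """Classify a speedometer reading (the SLI) against the SLO and SLA."""
    if indicator_mph > policy.agreement_mph:
        return "SLA breached"  # over the posted limit: ticket territory
    if indicator_mph > policy.objective_mph:
        return "SLO missed"    # over the internal target, no penalty yet
    return "ok"


# Hypothetical policy: 63 mph internal target, 65 mph posted limit.
policy = SpeedPolicy(objective_mph=63, agreement_mph=65)
for reading in (62, 64, 68):
    print(reading, evaluate(reading, policy))
```

Keeping the objective stricter than the agreement is a common convention: the gap between the two gives a warning zone before any penalty applies.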