The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Taking down (and restoring) the Raygun ingestion API

Nov 2, 2023 By Vishakh Nair In Raygun

In a world where Software as a Service (SaaS) products are integral to daily life, maintaining uninterrupted service for end-users is paramount. However, stuff happens. When it does, our most valuable response (other than restoring service ASAP) is to review the series of events that led up to the incident and learn from them. On August 25th, 2023, at 7:02 AM NZT, Raygun experienced a significant incident that impacted our API ingestion cluster, leading to an outage lasting approximately 1 hour and 15 minutes. While this wasn't fun for anyone involved, this incident did prove to be a valuable learning experience, shedding light on the importance of infrastructure management and resilience.

Status Pages That Deliver: Top 10 Favorites

Nov 2, 2023 By Chitra Bisht In Squadcast

Status pages are an invaluable asset for websites and SaaS businesses, particularly now that outages are common and users expect seamless uptime. As an integral part of any robust website monitoring strategy, these pages serve as centralized hubs, giving users a single, authoritative source for tracking the status of websites and applications.

Status Pages 101: How to Create a Status Page You and Your Customers Will Actually Want to Use

Nov 2, 2023 By Ashley Sawatsky In Rootly

This blog post is adapted from my talk at SRECon EMEA 2023 - original slides are available here! Status pages are a simple yet underutilized element of incident communication. Done well, they’re a low-lift way to keep your customers and stakeholders informed when incidents impact them. But without a solid approach, updating status pages can easily become a tedious and often neglected task during incidents. In this post, we’ll cover some tips to get your status page right.

PagerDuty and Jeli Together Will Transform Incident Management

Nov 2, 2023 By Dan McCall In PagerDuty

Today is an important day for us at PagerDuty, and for the larger ecosystem of incident management. We’ve signed a definitive agreement to acquire Jeli, a standout player in the incident management space. This deal represents a strategic alignment of visions, technologies and goals that will have a lasting impact on the industry and our customers.

Basics of Incident Management

Nov 1, 2023 By Kaushik Thirthappa In Spike

Life is full of unexpected incidents, from the coffee spill that disrupts your morning routine to the sudden traffic jam that turns a 20-minute commute into an hour-long ordeal. In much the same way, our systems and infrastructure constantly face small glitches. If ignored, they can have a significant impact. Unlike minor inconveniences, these glitches, which we call incidents, have the potential to disrupt your business, frustrate customers, and eat into your revenue.

Set Responders Up for Success with New User Onboarding

Nov 1, 2023 By Cristina Dias In PagerDuty

Effective incident response plays a critical role in maintaining smooth operations at organizations of all sizes. When built up correctly, operational resilience, the ability to bounce back quickly after failure, can act as a shield that guards your customer experience, ensuring that even when incidents inevitably happen, you’re back online in no time.

xMatters Support - Dynamic Groups

Oct 31, 2023 By xMatters In xMatters

Dynamic groups are teams of users based on selected criteria. A dynamic group's members change depending on who matches the selected criteria at the time of an alert. For example, you can create a dynamic group that includes all users who have specific training (such as first aid or fire safety) in a particular physical location within your organization. You could base this on a custom user property that indicates the level of training each user has. As each user gains a certification, the group is updated to reflect that change.
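
The idea of membership being resolved at alert time can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `User` class, the `resolve_dynamic_group` function, and the sample data are hypothetical and are not part of the xMatters product or its API.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class User:
    # Hypothetical user record; "certifications" stands in for a custom user property.
    name: str
    location: str
    certifications: Set[str] = field(default_factory=set)


def resolve_dynamic_group(users: List[User], location: str, certification: str) -> List[User]:
    """Return the users who match the criteria at the moment of the call.

    Nothing is stored: membership is recomputed on every alert, so changes
    to user properties are picked up automatically.
    """
    return [
        user
        for user in users
        if user.location == location and certification in user.certifications
    ]


users = [
    User("Ana", "Building A", {"first aid"}),
    User("Ben", "Building A"),
    User("Caro", "Building B", {"first aid"}),
]

# Membership as of the first alert.
before = [u.name for u in resolve_dynamic_group(users, "Building A", "first aid")]

# Ben gains a certification; the next alert sees the updated membership.
users[1].certifications.add("first aid")
after = [u.name for u in resolve_dynamic_group(users, "Building A", "first aid")]
```

Because the group is a query and not a stored list, there is no separate membership record to keep in sync when a user's properties change.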

The Unplanned Show, Episode 18: Resilient architectures with Matt Stine

Oct 31, 2023 By PagerDuty In PagerDuty

We'll catch up with Matt Stine, the author of "Migrating to Cloud-Native Application Architectures" (O'Reilly), about his current thinking on resilient architectures.

PagerDuty Operations Cloud Fall Launch 2023

Oct 30, 2023 By Inga Weizman In PagerDuty

Across the business landscape, 2023 has been called the “year of efficiency.” Organizations have had to deliver more growth and innovation, but with tighter budgets and headcount than in prior years. CIOs have needed to build strategies to mitigate the risk of operational failure and protect their brand’s customer experience.

Interlink's Service Chain Mapping solution: Helping Banking & Finance Organizations Strengthen Operational Resilience and Meet Regulatory Requirements

Oct 30, 2023 By David Arrowsmith In Interlink

Operational resilience is a growing area of focus and scrutiny for regulators of the banking and financial services industry. In the European Union, the Digital Operational Resilience Act (DORA) looms on the near horizon, with equivalent regulatory frameworks slowly but surely rolling out across the globe.
