Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

5 Exciting Predictions for SRE in 2023

SRE is a field defined by its constant evolution: from Google’s in-house secret recipe, to the hottest new practice for the biggest enterprise orgs, to a diverse and holistic mentality practiced by orgs of all sizes. Earlier this year, we co-sponsored the Catchpoint State of SRE survey, where we took the temperature of SRE as it stood. Now, as we did in 2021 and 2020, we’ll turn to the future to speculate on what 2023 will bring for SRE.
Sponsored Post

Using AIOps for Better Adaptive Incident Management

An effective incident management strategy is crucial for any business, especially those offering consumer-facing digital services. This is because when incidents occur, they may be easily detected by your users, impact your reputation, and ultimately affect your bottom line. So, to minimize the reach and severity of incidents, your response needs to be swift and effective. One way to ensure your approach meets these requirements is to implement AIOps.

Sponsored Post

Runbook Automation as a Baseline for Controllability and Observability

Some of the highest priorities for engineers, from NOC engineers to DevOps and site reliability engineers, are the automation and optimization of their production environments. Many companies today face tough challenges with their Network Operations Centers (NOCs) or production environments. These challenges fall into the hands of engineering teams.

ITIL vs. ITSM - What's the difference?

Companies depend on IT services to support their business operations, and to meet the demands of their customers. ITIL (Information Technology Infrastructure Library) and ITSM (Information Technology Service Management) are frameworks to help organizations manage their IT services. While these two do have elements in common, they also have important differences. ITIL is a set of best practices for IT service management which emphasizes the alignment of IT with the needs of the business.

How we approach integrations at incident.io

If you pick a random SaaS company out of a jar and go to their website, chances are they integrate with another tool. Typically, the end goal of integrations is to meet users in the middle by working with other tools they’re already using day to day. Put another way, integrations are a strategic business decision. But the question remains: why don’t companies just build a tool with similar functionality in order to make the product stickier?

The Risks Of Using Small Status Page Vendors

Servers are down. Employees are scrambling. Customers are upset. The pressure is on. When internal operations are in disarray, and your business is experiencing a service outage, the last thing you need to worry about is the reliability of your incident communication solution. Keeping users informed when services are down is mission-critical, in order to prevent a flood of support requests, which compound the effects of the incident, straining employee productivity and bandwidth.

PagerDuty and Fiberplane Integration Demo

Presenter: Aparna Valsala, Solutions Engineer at Fiberplane. Using the PagerDuty and Fiberplane integration, the responding engineer can immediately start an investigation from a predefined, configurable Fiberplane template that is visible to all, while multiple engineers collaborate on the investigation with complete visibility and context.