Many tech companies struggle to create an effective, efficient on-call schedule for their products and services, which leads to much longer downtime when something goes wrong. They often overburden team members with repeated on-call duty, which causes fatigue. Here’s how to create an on-call schedule that your team might love.
At Sensu Summit 2019, Adam Westman, Sr. Engineering Manager at Target, introduced us to GoAlert, their open source project for on-call scheduling and notifications. In this post, I’ll recap his talk, sharing the journey that led them to build GoAlert, the problems they’ve solved, and how they use GoAlert with Sensu Go to simplify monitoring and reduce alert fatigue.
Being on-call is a critical duty that many operations and engineering teams must undertake to keep their services reliable and available. However, there are several pitfalls in the organization of on-call rotations and responsibilities that can lead to serious consequences for the services and the teams if not avoided.
Alerting has come a long way from the days of paging a single on-call administrator in the middle of the night to multiple on-call teams that run and manage incident response around the clock. As organizations grow and scale, responding to incidents also gets more complex, and resolving an incident often requires more than one team.
The idea behind an escalation matrix is simple: some situations require greater authority to resolve. Authority can take many forms, including experience with a particular toolset or simply the proper permissions to flip the right switches. Therefore, escalation must involve putting the proper information into the right person’s hands (well, device).
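To make the idea concrete, here is a minimal sketch of an escalation matrix in Python. It assumes escalation is driven by how long an alert has gone unacknowledged, which is one common approach but not something the text above specifies. The tier names, the delays, and the `current_responder` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EscalationStep:
    responder: str      # who gets notified at this tier (hypothetical names)
    after_minutes: int  # minutes unacknowledged before this tier is reached


def current_responder(steps: List[EscalationStep],
                      minutes_unacknowledged: int) -> Optional[str]:
    """Return the highest tier whose delay has already elapsed."""
    eligible = [s for s in steps if s.after_minutes <= minutes_unacknowledged]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.after_minutes).responder


# Assumed example policy: each tier carries more authority than the last.
POLICY = [
    EscalationStep("primary on-call", 0),
    EscalationStep("secondary on-call", 15),
    EscalationStep("engineering manager", 30),
]

if __name__ == "__main__":
    for elapsed in (0, 20, 45):
        print(elapsed, "min ->", current_responder(POLICY, elapsed))
```

A real policy would also carry the alert details along with each notification, since escalation only helps if the right information reaches the right person.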
Have you heard about OnPage’s new and exciting integration with Amion? Through this collaboration, healthcare organizations can improve clinical communications, resulting in better patient care. It’s a sure way to enhance the patient experience, ensuring that qualified, on-call physicians respond more effectively to urgent clinical matters (i.e., patient needs). So, how exactly does the integration work?