On Call

On Counting Alerts

A while ago, I wrote about how we track on-call health, and I heard from various people about how “expecting to be woken up” can be extremely unhealthy, or how tracking the number of disruptions would actually be useful. I took that feedback to heart and wanted to address the issues they raised, and also provide some numbers that explain the position I took with these metrics on alerts.

Sponsored Post

Top Five Pitfalls of On-Call Scheduling

On-call schedules ensure that there's someone available day and night to fix or escalate any issues that arise, which helps keep things running smoothly. On-call workers can be anyone from nurses and doctors required to respond to emergencies to IT and software engineering staff who need to fix service outages or significant bugs. Being on call can be challenging and stressful. But with the proper practices in place, on-call schedules can fit well into an employee's work-life balance while still meeting the organization's needs.

Uncovering the mysteries of on-call

For the vast majority of organisations, some form of round-the-clock cover is critical to successful business operations. On-call is an essential part of an effective incident response process, yet there is no commonly accepted playbook on how to most effectively structure and compensate on-callers. We ran a survey to uncover the mysteries of how on-call works in organisations of different shapes and sizes around the world.

Webinar Recap: How to Avoid Being On Call With Under-Instrumented Tools

“It’s too expensive!” “Do we really need another tool?” “Our APM works just fine.” With strapped tech budgets and an abundance of tooling, it can be hard to justify a new expense—or something new for engineers to learn. Especially when they feel their current tool does the job adequately. But, does it?

Going On Call for the First Time

I've never been on call before, and I'm not sure what to expect, or how I can best prepare for it. Will I need to upend my life just in case the pager goes off? And how should I best cope with getting paged? I've read Charity's piece on the opposite problem of wanting to stop being on call, but it didn't quite answer my question.

Tracking On-Call Health

If you have an on-call rotation, you want it to be a healthy one. But this is hard to measure because the qualities involved are so abstract. For example, are you feeling burnt out? Does it feel like you’re supported properly? Is there a sense of impending doom? Do you think everything is under control? Is it clashing with your own private life? Do you feel adequately equipped to deal with the challenges you may be asked to meet? Is there enough room given to recover after incidents?

SRE: From Theory to Practice | What's difficult about on-call?

We launched the first episode of a webinar series to tackle one of the major challenges facing organizations: on-call. “SRE: From Theory to Practice - What’s difficult about on-call” sees Blameless engineers Kurt Andersen and Matt Davis joined by Yvonne Lam, staff software engineer at Kong, and Charles Cary, CEO of Shoreline, for a fireside chat about everything on-call. As software becomes more ubiquitous and necessary in our lives, our standards for reliability grow alongside it.

OnCallogy Sessions

Being on call is challenging. It means signing up to operate complex services while being fully interruptible, at all hours of the day or night, with limited context. It’s therefore critical to have proper on-call onboarding procedures, offer continuous training sessions, and keep improving documentation. We also need to make sure people feel safe by providing ways to reduce their stress, and make room for questions that surface all sorts of uncertainties around our operations.