Your engineers probably dislike going on-call for your services. Some might even dread it. It doesn't have to be this way. With a few changes to how your team runs on-call and deals with recurring alerts, you might find your team starting to enjoy it (as unimaginable as that sounds). I wrote this article as a follow-up to "Getting over on-call anxiety".
Attention, on-call warriors! You asked, we answered — we’re excited to announce that our reimagined mobile app is now generally available to our customers.
You've joined a company, or worked there a little while, and you've just now realised that you'll have to do on-call. You feel like you don't know much about how everything fits together, so how are you supposed to fix it at 2am when you get paged? So you're a little nervous. Understandable. Here are a few tips to help you become less nervous.
Background: We recently released the biggest overhaul to one of the core features of Spike.sh, on-call schedules. Software teams use on-call schedules to designate first responders who will handle issues when they occur.
On-call planning is one of the most popular features in Enterprise Alert and is widely used by users, team managers and administrators. However, in our discussions we keep finding that it takes more than 5 minutes of planning. Scheduling often depends on external systems. These can range from a simple Excel form provided to HR all the way to a comprehensive billing system such as SAP. As a result, it takes quite a bit of time to transfer the planned shifts to third-party systems.
We’re excited to present a feature update to the OnPage platform. The new update will bring more flexibility and resiliency to a team’s on-call management workflow. With the new scheduling capabilities, OnPage system administrators can create exceptions to configured, recurring on-call schedules.
An on-call schedule tells you and everyone in the team who will be the first responder when an issue happens in production. The on-call team member is responsible for investigating the issue, either fixing the issue herself or adding other people who can help fix it. Having an on-call schedule is important for building reliable systems because making someone responsible for production issues makes sure that they're not ignored.
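To make the idea concrete, here is a minimal sketch of a schedule lookup: a recurring rotation decides who is first responder on a given day, and a one-off exception can replace that person for a single date. The function name `on_call_for`, the weekly shift length, and the example names are all assumptions made for illustration; none of the products mentioned above is claimed to work this way.

```python
from datetime import date


def on_call_for(day, rotation, start, shift_days=7, overrides=None):
    """Return the first responder for `day` (hypothetical helper).

    rotation   -- ordered list of people who take turns
    start      -- first day of the first shift
    shift_days -- length of one shift in days (assumed weekly here)
    overrides  -- optional {date: person} exceptions to the rotation
    """
    overrides = overrides or {}
    # A one-off exception wins over the recurring rotation.
    if day in overrides:
        return overrides[day]
    if day < start:
        raise ValueError("day is before the rotation starts")
    # Count how many full shifts have passed, then wrap around the list.
    shift_index = (day - start).days // shift_days
    return rotation[shift_index % len(rotation)]


# Example data (made up for illustration).
rotation = ["alice", "bob", "carol"]
start = date(2024, 1, 1)
overrides = {date(2024, 1, 10): "carol"}  # carol covers one day of bob's week

print(on_call_for(date(2024, 1, 8), rotation, start))
print(on_call_for(date(2024, 1, 10), rotation, start, overrides=overrides))
```

Keeping exceptions in a separate mapping leaves the recurring rotation untouched, so removing an exception restores the normal schedule for that day.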