On Call

On-call by default

Like many SaaS businesses, we have an on-call rota to enable us to provide 24x7 cover if there are problems with incident.io. We have a 'pager' which will alert the relevant person if something unexpected happens in our app, so that they can investigate and fix it if needed. Note: This was adapted from an internal document we wrote about how we think about on-call at incident.io.

Ask Miss O11y: I Don't Want to Be On Call Anymore. Am I a Monster?

First, I’d like to say that pager duty isn’t something we should treat like chronic pain or diabetes, where you just constantly manage symptoms and tend to flare-ups day and night. Being paged out of hours is as serious as a fucking heart attack. It should be RARE and taken SERIOUSLY. Resources should be mustered, product cycles should be reassigned, until the problem is fixed.

Publish SIGNL4 oncall and alert information to your Grafana dashboard

Grafana is an open source analytics and interactive visualization application. You can connect different data sources to display charts and graphs or even trigger alerts. Wouldn’t it be great to add information about SIGNL4 alerts, or about who is on call, to your dashboard? You would immediately get an overview of open, acknowledged, and closed alerts per category, or you could see who is currently on duty. Here is an example with a who-is-on-call panel and an alert overview panel.

Improve your on-call experience with Datadog mobile dashboard widgets

Life happens—even when you’re on-call. You can’t take your laptop everywhere, but whether you’re on the train, at dinner, or at the gym, you can count on the Datadog mobile app for access to key data about the status and performance of your applications. Now, you can use Datadog mobile widgets to build an on-call mobile dashboard directly on your phone’s home screen, so it’s even easier to track the data you care about from anywhere.

Evaluating Splunk On-Call Alternatives

Splunk On-Call (formerly VictorOps) is a popular incident response and on-call management platform that allows engineering and operations teams to collaborate with ease and resolve issues faster. As part of the Splunk Observability Suite, Splunk On-Call is combined with related products to bring monitoring, troubleshooting, and investigation into a single, comprehensive view, simplifying the process from incident detection to resolution.

3 Things to Consider When Investing in On-Call Scheduling Software

On-call scheduling software modernizes the way healthcare administrators assign responsibilities to care team members. The software helps create an equitable workforce among care teams and eliminates manual errors during the on-call scheduling process. Administrators can set up digital schedules to contact the right clinicians at the right time. This ensures that on-call providers quickly resolve patients’ issues to improve patient experience.