On Call

How to create an on-call schedule that doesn't suck.

Many tech companies struggle to create an effective, efficient on-call schedule for their products and services, which results in much longer downtime when something goes wrong. They often overburden team members with repeated on-call duty, which leads to fatigue. Here’s how to create an on-call schedule that your team might love.

How to avoid on-call burnout

It sucks to be on-call when processes are not well defined and streamlined, especially around the holidays. You really don't want to hear your phone repeatedly going off right when you're sitting down to Christmas dinner with your loved ones or getting around to unwrapping the good presents (the ones with the sparkly wrapping paper :P). Your on-call team’s stress levels reflect the health of your system, the cleanliness of your code and the culture of your organization.

Reducing alert fatigue with GoAlert, Target's on-call scheduling and notification platform

At Sensu Summit 2019, Adam Westman, Sr. Engineering Manager at Target, introduced us to GoAlert, their open source on-call scheduling and notification project. In this post, I’ll recap his talk, sharing the journey that led them to build GoAlert, the problems they’ve solved, and how they use GoAlert with Sensu Go to simplify monitoring and reduce alert fatigue.

On-call doesn't have to be stressful

Being on-call is a critical duty that many operations and engineering teams must undertake to keep their services reliable and available. However, there are several pitfalls in the organization of on-call rotations and responsibilities that, if not avoided, can lead to serious consequences for the services and the teams.

Best Practices for Managing Multiple On-Call Teams

Alerting has come a long way, from the days of paging a single on-call administrator in the middle of the night to multiple on-call teams that manage incident response around the clock. As organizations grow and scale, incident response gets more complex, and resolving an incident often takes more than one team.

Building a Smarter Escalation Matrix with Uptime.com

The idea behind an escalation matrix is simple: the situation requires greater authority to resolve. Authority can take many forms, including experience with a particular toolset or simply the proper permissions to flip the right switches. Therefore, escalation must involve putting the proper information into the right person’s hands (well, device).

OnPage's Latest Integration: Amion Physician Scheduling

Have you heard about OnPage’s new and exciting integration with Amion? Through this collaboration, healthcare organizations can improve clinical communications, resulting in better patient care. It’s a sure way to enhance the patient experience, ensuring that qualified, on-call physicians respond more effectively to urgent clinical matters (i.e., patient needs). So, how exactly does the integration work?

A New Bee's First Oncall

I’m Honeycomb’s newest engineer, now in my eighth week. Excitingly, I did my first week of oncall two weeks ago! Almost every engineer at Honeycomb participates in oncall, and I chose to join in the tradition. This may seem unconventional for a Developer Advocate: surely my time might be better spent holding more meetings with customers and giving more talks? Yet I found that being oncall was the right decision for me.