The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Managed IT Service Provider, BDNet Corporate Networking Recommends OnPage

In this video, Brian Domschke, CEO of BD Net Corporate Networking, recommends OnPage for on-call management. Keep watching to learn how his organization leverages OnPage's digital fail-safe scheduling capabilities and alerting system to notify on-call staff after hours. OnPage continues to empower Managed Service Providers of all sizes to accelerate incident remediation for clients and provide exceptional IT services.

On-call by default

Like many SaaS businesses, we have an on-call rota to enable us to provide 24x7 cover if there are problems with incident.io. We have a 'pager' which will alert the relevant person if something unexpected happens in our app, so that they can investigate and fix it if needed. Note: This was adapted from an internal document we wrote about how we think about on-call at incident.io.

What does a DevOps Engineer do? We analyzed 29 job postings to find out.

As all companies become software-driven, DevOps is becoming an important practice in enterprises and startups across the world. DevOps is about bringing velocity to delivering tech products and services, so you can delight customers and meet business goals. To achieve this velocity, development (dev) and operations (ops) teams work closely together across the software lifecycle, from planning to release. This has led to a new role in engineering teams: the DevOps Engineer.

PagerDuty for Facilities and Crisis Response

Jason Flint, Senior Manager of Facilities and Crisis Response at PagerDuty, joins the stream to chat about how PagerDuty the company uses PagerDuty the platform to meet the needs of an increasingly distributed workforce. His team keeps track of everything from extreme weather events to political unrest that might impact PagerDuty employees.

The Best Tools for System Monitoring

It takes a lot to run a modern business. From websites to technical solutions and everything in between, it's no surprise we need better monitoring systems to make sure everything is operational. With multiple gears turning at once on any given platform, incidents are inevitable, especially for companies that are constantly growing and innovating. And incidents can affect user services, operations, and even business reputation.

Breaking down complex projects into smaller, shippable increments

Building a complex new product can be scary. What if no one gets value from it? What if it doesn't work? What if it's hard to change? One way to mitigate these risks is to break down the product into smaller shippable increments, allowing you to capture feedback early and confirm the most important assumptions before fully committing to a solution.