Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

What is a Pull Request and Why You Need Them

As an engineer, you're probably familiar with version control systems like Git that let you track changes to your codebase. But are you using one of the most useful features of Git: pull requests? If not, you're missing out. Pull requests are one of the best ways to collaborate on projects and create better code. In this article, we'll go over what a pull request is, why you should be using them, and how to create your own.

The definitive guide to event correlation in AIOps: Processes, tools, examples, and checklist

Are you tired of sifting through a sea of IT events and alerts? Or perhaps you’ve found yourself overwhelmed by the volume of data flooding your monitoring systems and challenged to identify the incident root cause. There’s a better way to manage the chaos: using AIOps to unite disparate tools, data, and teams for event correlation.

PagerDuty for Customer Service Operations

Provide relevant context to solve customer problems. Customer service representatives need relevant historical context in order to accurately and quickly resolve the issue at hand. Reduce the impact on your customers by layering monitoring data from technical resources across your organization with data from customer calls and other systems of record—so you have a holistic view of an issue and can identify the right solution quickly.

Why Invest in Tooling? Benefits and Concerns

When looking to invest money in your engineering teams, what gives the best return? Hiring more staff to enable bigger projects and more diversified skill sets? Training engineers to uplevel their ability and productivity? Increasing salaries to retain the best talent? These are all great ideas that should be exercised often. But there’s one other investment worth considering that can offer huge benefits for relatively small amounts of money: tooling.

AIOps use cases: Technical, operational, and business examples

ITOps is at a crossroads: Teams struggle to manage a high volume of alerts and to coordinate across different tools and teams. They must also balance the agility of cloud technologies with the stability of on-premises solutions. The sheer speed of today’s IT demands both flexibility and visibility in development, along with harmonized tech stacks.

Getting started on alerts with Escalation Policies

Escalation policies are essential for making sure that incidents are quickly addressed and resolved. They provide a systematic way to automate alerts so that no incident goes unnoticed. Let’s get you started, shall we? The first point of contact for an incident is an alert that is sent according to the escalation policy.
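
To make the idea concrete, here is a minimal sketch of how an escalation policy might decide who gets alerted as an incident stays unacknowledged. It is an illustration only, not the API of any particular product: the `EscalationLevel` type, the `level_to_notify` function, the responder names, and the timeouts are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EscalationLevel:
    responders: List[str]   # who is alerted at this level (hypothetical names)
    timeout_minutes: int    # how long to wait for an acknowledgement before escalating


def level_to_notify(policy: List[EscalationLevel],
                    minutes_unacknowledged: int) -> Optional[EscalationLevel]:
    """Return the level that should be alerted after the given unacknowledged time.

    Returns None once every level's timeout has passed without an acknowledgement.
    """
    elapsed = 0
    for level in policy:
        elapsed += level.timeout_minutes
        if minutes_unacknowledged < elapsed:
            return level
    return None


# Example policy: assumed values, chosen only for illustration.
policy = [
    EscalationLevel(["primary-on-call"], 15),
    EscalationLevel(["secondary-on-call"], 15),
    EscalationLevel(["engineering-manager"], 30),
]

current = level_to_notify(policy, minutes_unacknowledged=20)
print(current.responders if current else "policy exhausted")
```

In this sketch the first alert goes to the primary on-call responder, and each level's timeout determines when the alert moves to the next level. A real policy would also handle acknowledgements, repeats, and notification channels.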