The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Why incident response automation is top-of-list for CISOs in 2020

Given the state of critical incidents in 2019, it’s no surprise that, looking ahead to 2020, CISOs have one of the organization’s most challenging and stressful jobs. During the first half of the year alone, 4.1 billion records were compromised, and the average cost of a data breach is now estimated at $3.92 million.

The Age of Service Mesh

You have built a massively successful system. The users just can't get enough and request new features. Your developers crank out new services on a regular basis. Your DevOps/SRE team configures and scales your Kubernetes cluster (or clusters). As the system becomes more complicated and sophisticated, you realize that there are common themes that repeat across all your services.

What Is MTTF? Mean Time to Failure Explained in Detail

“What is MTTF?” That’s the question we’ll answer with today’s post. Yep, the article’s title makes it evident that the acronym stands for “mean time to failure.” But that, on its own, doesn’t say anything. What does “mean time to failure” actually mean? Why should you care? That’s what today’s post covers in detail.

Sensitive Medical Data Hacked by Unsophisticated Software

There’s a solid rationale behind replacing antiquated technologies, as they fail to keep pace with how the healthcare environment is evolving. One such technology is the good old pager. Recently, the U.K.’s National Health Service Trust (NHS) was in the spotlight when the organization’s sensitive medical data was hacked by an individual in North London. The malicious party intercepted radio waves, converting them into legible text on his computer monitor.

IDC Finds Substantial ROI for Enterprises Using PagerDuty for Digital Operations Management

In order to keep digital services running around the clock, teams need to be able to solve problems faster—or, ideally, in real time. Many vendors claim to provide value and help organizations bolster their digital operations management.

Root Cause Changes: Real Examples of Modern Root Cause Analysis from our Beta Customers

Root Cause Analysis (RCA) is an all-encompassing process. It is usually very complicated and often requires many people with many different skills – all trying to tackle an incident to determine what happened, when, why, how and ultimately who (to blame). There is, however, a secret sauce today that can help solve many issues before a “full-scale” RCA process is initiated – and that is Root Cause Changes (RCC).

Cherwell & PagerDuty: Getting Real (Time) About Digital Transformation

Digital transformation may be the largest shift the IT industry will experience in a lifetime. It’s a term used throughout the tech industry and in various contexts. Gartner defines it as “…anything from IT modernization (for example, cloud computing), to digital optimization, to the invention of new digital business models,” which has massive implications for almost every organization.

Moogsoft User Conference 2019 Overview

Attendees at this year's inaugural Moogsoft User Conference were immersed in two full days of all things AIOps. Our goal: To help our users gain new insights, best practices, and AI and ML skills to transform their IT Operations and deliver continuous service assurance. The result? We all left invigorated with renewed enthusiasm, actionable expertise and fresh ideas to boost our collective organizations' AIOps strategies!

LogicMonitor and PagerDuty: Beyond the Basics

Out-of-the-box integrations are great, and they help organizations see an immediate return on investment when the technologies they have invested in work together seamlessly. However, a little customization to these integrations can dramatically increase productivity and reduce mean time to resolution. Here we will address a couple of best practices and customizations that can take your PagerDuty and LogicMonitor integration to the next level.