The latest news and information on incident management, on-call, incident response, and related technologies.

AI vs. AI: from alert fatigue to agentic cybersecurity

Jul 15, 2026 By Rootly In Rootly

AI is transforming cybersecurity on both sides of the battlefield. Attackers can now launch highly personalized phishing campaigns at scale and build malware capable of making autonomous decisions. At the same time, security teams are using AI agents to investigate alerts, reduce noise, and respond to threats faster. In this episode of Humans of Reliability, we speak with Nir Soudry, Head of R&D at 7AI, about the shift from alert fatigue to agentic cybersecurity.

PagerDuty Announces Arnaud Lagarde, Vice President of EMEA

Jul 14, 2026 By PagerDuty In PagerDuty

PagerDuty, Inc. announces the appointment of Arnaud Lagarde as vice president of EMEA. Lagarde will lead PagerDuty's next phase of growth in the EMEA region, bringing the entire incident management lifecycle to customers across EMEA to solve their biggest digital challenges.

How to lay the data foundation to support agentic ITOps

Jul 14, 2026 By Carlos Gutierrez In BigPanda

Agentic IT operations have arrived. It’s no longer a question of if enterprise IT departments will adopt agentic ITOps, but how quickly. Every year, IT environments grow more distributed, complex, and difficult to monitor with legacy tools and processes. At the same time, the pace of AI development is accelerating the volume of changes and incidents, straining teams that are still trying to manage them manually, reactively, and one alert at a time.

Stop Triaging in the Dark: Full Visibility Across Every IT Domain

Jul 14, 2026 By BigPanda In BigPanda

Alert correlation solved the noise problem. But noise was never the whole problem. Today’s most disruptive incidents cascade across networks, infrastructure, applications, and services simultaneously, leaving teams without clear visibility into the true root cause. As a result, L1 teams are left manually piecing together context from multiple dashboards and tools to find it while SLA clocks keep ticking and end-user tickets add up.

The Value of Preventive Maintenance in Modern Business Operations

Jul 13, 2026 By OpsMatters In OpsMatters

Preventive maintenance helps businesses reduce downtime, avoid costly breakdowns, extend equipment life, and maintain safer, more efficient operations. By addressing small issues early, companies can keep workflows running smoothly and protect productivity in a competitive business environment.

Where Status Pages Fit in a Modern Incident-Response Workflow

Jul 12, 2026 By OpsMatters In OpsMatters

An incident-response process has two audiences from the moment a service begins to fail. Engineers need evidence detailed enough to isolate the fault. Customers need a clear account of what is affected, what still works, and when they should expect another update. Trying to serve both groups from the same dashboard usually leaves each with the wrong information.

We rebuilt Spike app for Slack

Jul 10, 2026 By Spike - incident response platform In Spike

The new Spike app for Slack brings incident response into the channel your team already works in. This walkthrough covers the @Spike AI assistant, the redesigned incident alert template, Statuspage syncing, and on-call overrides. To get started, head to Slack settings inside Spike and reconnect the app. Statuspage syncing is available on all plans. Spike is an incident response and on-call management platform. It offers alert routing, escalation policies, on-call schedules, and incident management, built for engineering teams.

Slack overview video

Jul 10, 2026 By Spike - incident response platform In Spike

The new Spike app for Slack brings incident response into the channel your team already works in. This walkthrough covers the @Spike AI assistant, the redesigned incident alert template, Statuspage syncing, and on-call overrides. To get started, head to Slack settings inside Spike and reconnect the app. Statuspage syncing is available on all plans. Spike is an incident response and on-call management platform. It offers alert routing, escalation policies, on-call schedules, and incident management, built for engineering teams.

Custom Shifts

Jul 10, 2026 By PagerDuty Inc. In PagerDuty

This week on the HowTo Happy Hour, we're looking at Custom Shifts, a feature of our new Shift-Based Schedules.

From BigQuery to ClickHouse: How we made our analytics 5× faster

Jul 10, 2026 By Aleksandr Meshcheriakov In iLert

For years, ilert has given our customers extensive analytics across their alerts, notifications, and on-call activity, a comprehensive overview of how their teams and services respond to incidents. These capabilities were backed by a separate analytical database running on Google BigQuery. It held the numbers behind every reporting dashboard in ilert, and for a long stretch it was perfectly fine. Then three problems grew too big to ignore.
