Incident Response

Denmark's Largest Utility Company Accelerates Incident Response

As Denmark’s largest power, utility and telecommunications company, serving 1.5 million customers, Norlys understands the need for fast response to security alerts. When the company first started, the Norlys security team built its own log analytics and incident response capabilities from the ground up. This homegrown approach presented challenges, including manual workflows, too many repetitive tasks and difficult-to-maintain processes.

Five worthy reads: Preparing an incident response plan for the pandemic and beyond

Five worthy reads is a regular column on five noteworthy items we’ve discovered while researching trending and timeless topics. With the rising concern over cyberattacks in the distributed workforce, this week we explore the concept of cybersecurity incident response during a pandemic.

VictorOps and Relay for Incident Response

VictorOps is an incident response tool whose mission is straightforward: “To make being on call suck less.” It enables teams to quickly detect and respond to problems like a service degradation or outage. VictorOps supports a wide range of external integrations to extend its capabilities by connecting different parts of your DevOps toolchain.

Incident Ready: How to Chaos Engineer Your Incident Response Process

We’re pretty sure using a real incident to test a new response process is not the best idea. So, how do you test your process ahead of time? In this video, FireHydrant CEO, Robert Ross, shared how our customers leverage best practices to break, mitigate, resolve, and fireproof incident processes.

Incident Ready: How to Chaos Engineer Your Incident Response Process | FireHydrant

We’re pretty sure using a real incident to test a new response process is not the best idea. So, how do you test your process ahead of time? In this video, FireHydrant CEO Robert Ross will share how FireHydrant customers leverage best practices to break, mitigate, resolve, and fireproof incident processes. We’ll show you how to use chaos engineering philosophies to stress test three critical parts of a great process.

Automating incident response with Relay and PagerDuty

DevOps and SRE teams are under intense pressure to reduce Mean Time to Recovery (MTTR). The latest integration between Relay and PagerDuty eliminates the “digital duct tape” by creating reusable, event-driven workflows that close the loop on incidents faster.

Adaptable Incident Response With Splunk Phantom Modular Workbooks

Splunk Phantom is a security orchestration, automation and response (SOAR) technology that lets customers automate repetitive security tasks, accelerate alert triage, and improve SOC efficiency. Case management features are also built into Phantom, including “workbooks” that allow you to codify your security standard operating procedures into reusable templates.

Datadog and Relay for Incident Response

Datadog is an awesome tool for aggregating and visualizing the metrics that matter to you. Recently, Datadog launched a new Incident Management feature, which allows you to coordinate the activities around a problem that affected your service. In this example, I’ll walk through using Relay to roll back a Kubernetes deployment that caused a service impact, and show how the Datadog Incident timeline can keep everyone working on the incident in sync.