Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

What you need to know about the The Digital Operational Resilience Act (DORA)

The European Commission has introduced the Digital Operational Resilience Act (DORA) to bolster the digital infrastructure of the financial sector within the European Union (EU). As part of the EU's wider digital finance strategy, DORA's objective is to create a comprehensive framework governing digital operational resilience. Financial institutions must ensure full compliance with DORA by January 2025.

The Unplanned Show, Episode 19: Cloud Security response with Ashley Ward

As organizations move to the cloud, where is there overlap between security and IT and engineering? In this session, Dormain will sit down with Orca Security's Principal Technical Evangelist, Ashley Ward, to learn about how working practices have to evolve with the speed of change in the cloud.

How we manage incidents at Datadog

Incidents put systems and organizations to the test. They pose particular challenges at scale: in complex distributed environments overseen by many different teams, managing incidents requires extensive structure and planning. But incidents, by definition, break structures and foil plans. As a result, they demand carefully orchestrated yet highly flexible forms of response. This post will provide a look into how we manage incidents at Datadog. We’ll cover our entire process.

The Journey Into Automation: Optimizing Care Delivery

In a world where efficiency and precision are the cornerstones of progress, automation has become the unsung hero across diverse industries. From manufacturing floors to customer service, its transformative power has reshaped the way we work and deliver services. Today, we embark on a journey to explore the profound influence of automation on healthcare, where each automated process is a progressive step towards optimizing care delivery and reshaping the future of patient-centered care delivery.

xMatters Support - Broadcast Groups

In xMatters, groups determine how and when people are notified using on-call schedules, escalation timelines, and rotations. But what if you don't use complex on-call schedules, or need to notify all members of the group simultaneously? Broadcast groups make it easier for customers who don't always need on-call schedules. Let’s take a look.

Suppressing Alert Noise during Scheduled Maintenance

Alert noise is a common problem for IT teams that monitor and manage complex systems. Excessive unactionable alerts triggered by various sources, such as applications, servers, network devices, etc., can cause alert fatigue. The higher volume of alerts can be overwhelming, reducing the ability to respond to critical alerts. One event of possible alert noise is during scheduled maintenance, awhich is a common practice in the digital realm.

6 Best Practices for Tuning Network Monitoring Alerts

Network monitoring and alerting provide the foundation for efficient IT operations and cyber resilience. By keeping track of the status and performance of network infrastructure and applications, network monitoring tools can automatically generate alerts when defined thresholds are exceeded or specific events occur. These network monitoring alerts allow IT teams to detect outages, performance degradation, and potential security incidents so they can respond swiftly to minimize disruption.