
The latest news and information on incident management, on-call, incident response, and related technologies.

4 Ways to Reduce Your Mean Time to Resolution

Dealing with a high mean time to resolution (MTTR) in your network? Auvik Network Management is a comprehensive network monitoring and troubleshooting solution. With over 50 pre-configured alerts, it keeps you informed about critical network events. Users can customize these alerts and control notification frequency so that they have the context they need to fix issues.
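For readers who want to check their own numbers, MTTR is commonly computed as the total time spent resolving incidents divided by the number of incidents. The snippet below is a minimal sketch of that arithmetic, not anything taken from Auvik; the function name and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Return the average (resolved - opened) duration as a timedelta.

    `incidents` is a list of (opened, resolved) datetime pairs.
    This tuple layout is an assumption made for the example.
    """
    if not incidents:
        raise ValueError("need at least one incident to compute MTTR")
    # Sum the individual resolution times, starting from a zero timedelta.
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical sample data: three incidents lasting 30, 90, and 60 minutes.
sample = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30)),
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 15, 30)),
    (datetime(2024, 1, 3, 8, 0), datetime(2024, 1, 3, 9, 0)),
]
print(mean_time_to_resolution(sample))  # 1:00:00
```

Definitions of MTTR vary (resolution, repair, recovery, response), so it is worth confirming which start and end timestamps a given tool uses before comparing figures.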

NOC Success Like Never Before: Automation Strategies for All-new Incident Management

Network operations might never be the same. But then again, why would anyone want it to be? The power of automation and orchestration can bring incredible value to the Network Operations Center (NOC), including the business-critical ability to get proactive and stay ahead in incident response and management. It's about more than a towering volume of events; it's the complexities involved, too.

Global AWS Orchestration with Runbook Automation

It is common for companies to have multiple AWS accounts, and there are cases where certain operational tasks need to be performed on EC2 instances that reside in each account. Examples include standardizing practices for auditing, patching, and incident response, such as retrieving diagnostics or running remediation. This demo showcases how Runbook Automation orchestrates commands and scripts on EC2 instances spanning numerous AWS accounts through an integration with Systems Manager (SSM).
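To make the underlying idea concrete, here is a rough sketch of the AWS calls involved: assume a role in each account via STS, then send a shell command to tagged instances with the SSM Run Command document `AWS-RunShellScript`. This is not how Runbook Automation is implemented, only an illustration of the pattern it automates. The function name, the role session name, and the tag-based targeting are assumptions made for the example, and `client_factory` is expected to behave like `boto3.client`.

```python
def run_across_accounts(role_arns, commands, tag_key, tag_value, region, client_factory):
    """Send shell commands to tagged EC2 instances in several AWS accounts.

    `client_factory(service_name, **kwargs)` is assumed to behave like
    boto3.client; passing it in keeps this sketch free of a hard dependency.
    Returns a dict mapping each role ARN to the SSM command ID it started.
    """
    sts = client_factory("sts")
    command_ids = {}
    for role_arn in role_arns:
        # Obtain temporary credentials for the target account.
        creds = sts.assume_role(
            RoleArn=role_arn,
            RoleSessionName="cross-account-ssm",  # hypothetical session name
        )["Credentials"]
        ssm = client_factory(
            "ssm",
            region_name=region,
            aws_access_key_id=creds["AccessKeyId"],
            aws_secret_access_key=creds["SecretAccessKey"],
            aws_session_token=creds["SessionToken"],
        )
        # AWS-RunShellScript is the AWS-managed document for Linux shell commands.
        response = ssm.send_command(
            Targets=[{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
            DocumentName="AWS-RunShellScript",
            Parameters={"commands": commands},
        )
        command_ids[role_arn] = response["Command"]["CommandId"]
    return command_ids
```

With boto3 installed and credentials that are allowed to assume the listed roles, a call might look like `run_across_accounts(role_arns, ["uptime"], "Env", "prod", "us-east-1", boto3.client)`, where the role ARNs, tag, and region are placeholders for your own values.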

Everything you need to know about IT Operations Analytics

Data is both a challenge and an asset for IT professionals, who rely on IT Operations Analytics (ITOA) to guide them towards operational excellence, system reliability, and swift incident resolution. So whether you're seeking clarity on what ITOA is and how it connects to related technologies, are contemplating how to use it within your organization, or are curious about its efficiency and cost-saving benefits, we've got you covered.

Panel Discussion: Modern Monitoring and Observability

Struggling with effective monitoring for your services? Not sure how to handle the volume of information your environment creates? Join us for a panel discussion about Monitoring and Observability, featuring Jason Hand of Datadog, Ernest Mueller of Accenture, Steve McGhee of Google, and Peco Karayanev of PagerDuty. Hosted by PagerDuty DevOps Advocate Mandi Walls.