Latest Videos

Incidents as we Imagine Them Versus How They Actually Are with John Allspaw

There is a tendency to imagine (or remember!) incidents as unfolding far more neatly and in a more orderly way than they actually do. Events can leave some engineers scratching their heads about what is happening, while their teammates are instead confused about how it's happening.

Monitoring that Monitors the Monitors of the Monitors

One way to break the cycle of alert fatigue is to improve the quality of the signals you monitor. That can mean ingesting and processing monitoring data at a higher resolution, using smarter statistical methods to aggregate and correlate data across multiple services, or routing alerts through an escalation and incident management system.

This IS NOT Fine: Putting Out (Code) Fires

So the dumpster is on fire. Again. The site’s down. Your boss’s face is an ever-deepening purple. And you begin debating whether you should join the #incident channel or call an ambulance to deal with his impending stroke. Firefighters have clear procedures and a strong hierarchy. The first truck at a scene immediately begins assessing the situation.

Reducing Noise with Event Intelligence

Learn how Event Intelligence, the next-gen approach to Event Management and AIOps, helps teams to cut through the noise and operate at scale. This introductory session will walk through key best practices and requirements such as reducing noise via adaptive machine learning, accelerating triage via integrating machine data with human response, and much more.

Introducing Jira Ops: Respond Faster with Atlassian + PagerDuty

Atlassian’s mission is to unleash the potential of every team. Atlassian’s newest product, Jira Ops, is built on top of Jira with a direct connection to PagerDuty so that teams can respond quickly when things break. This session will cover how PagerDuty and Jira Ops work together to help teams respond to incidents quickly and in real time.

Accelerating Incident Response

Incidents are never fun, but a bad incident response process makes them even less so. How do technical teams mobilize the right people and provide the right context and tooling to rapidly take action and drive incident resolution? With the clock ticking and up to millions of dollars lost per minute of downtime, there’s no time to waste in assembling the right experts.