
The latest news and information on incident management, on-call, incident response, and related technologies.

Evolving in CloudOps Maturity? Investing in People and Teams Pays Off

CloudOps is on the rise, in part because the pandemic rapidly accelerated the shift to cloud. That shift allowed companies to innovate faster, enjoy greater flexibility and scalability, and become more cost efficient. Many organizations that rapidly adopted cloud or increased their usage now realize that they need to manage their cloud investments better in order to fully realize these benefits.

HUG Relies on PagerDuty When Healthcare Incidents Arise

The Geneva University Hospital (HUG) is one of the five university hospitals in Switzerland and one of the largest hospitals in Europe. Pierryves Fournier, SRE Team Lead at HUG, explains how PagerDuty and Rundeck help automate his team's incident response process, empowering the right action when seconds matter.

What's New With Runbook Automation: Rundeck 3.4.1

Technical teams are under more pressure than ever to move faster, protect revenue and availability, and push mean time to resolve (MTTR) ever lower. However, teams frequently find themselves encumbered by complex, repetitive, manual tasks instead of innovating. When urgent incidents arise, organizations often have to wait for specific developers or subject matter experts (SMEs) to deploy a fix.

Coffee Break Webinar Series: "Intelligent Observability - What the Analysts Say"

We know commitment issues are the real deal, especially when it comes to significant and costly tech investments. Understanding how the market is performing and what’s up ahead is critical for investing in AIOps. Our crew is here to help you through the challenging decision-making days and offer up the best analyst guidance.

Pragmatic Incident Response: 3 Lessons Learned from Failures

In my past experience as an SRE, I’ve learned some valuable lessons about how to respond to and learn from incidents. Declare and run retros for the small incidents. It's less stressful, and action items become much more actionable. Decrease the time it takes to analyze an incident. You'll remember more, and you'll learn more from the incident. Alert on pain felt by people, not computers. The only reason we declare incidents at all is because of the people on the other side of them.

Upcoming trends in DevOps and SRE

DevOps and SRE are domains with rapid growth and frequent innovation. With this blog you can explore the latest trends in DevOps and SRE and stay ahead of the curve. The past decade has seen widespread adoption of DevOps methodologies in software development. Unsurprisingly, as the needs of users change, DevOps techniques have evolved as well. In this blog we will look at the trends that are most likely to have a significant impact in the coming years.