Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Reinventing Deployments: From Docker to Dagger -- Incidentally Reliable with Solomon Hykes

Catch Solomon Hykes (Co-founder of @Docker and @Dagger) as he shares stories from the early days of Docker, the rollercoaster journey leading to 20 million active developers worldwide, the heavy crown of a tech leader and his vision to revolutionize CI/CD with Dagger today. Exclusively on The Incidentally Reliable podcast, made by SREs for SREs and hosted by Zenduty.

The Unplanned Show, Episode 32: Platform Engineering with Paula Kennedy

Supporting developer velocity AND operational efficiency, stability, and security doesn't happen by accident. In this episode, Dormain will sit down with Paula Kennedy to discuss how platform engineering helps businesses go faster, decrease risk, and increase efficiency.

Elevating Engineering Excellence: The Imperative of Site Reliability for Every Engineer

In the ever-evolving landscape of technology, engineers are the architects of the digital world. Their expertise shapes the platforms, applications, and services that define our daily interactions with technology. Yet, in the pursuit of innovation and functionality, there's one crucial aspect that often takes a backseat—site reliability. Site reliability engineering (SRE) has emerged as a critical discipline in the realm of software development and operations.

SIGNL4 Onboarding: Customizing Alerts and Notifications

The SIGNL4 Onboarding series walks users through the processes of SIGNL4, from Signup to Alerts to Settings. Today's video focuses on using Overrides to enable different alerting options during different dates and times. It is packed with tips to help you get the most out of your account.

PTO peace of mind: Sync Grafana OnCall with Google Calendar out-of-office events

Sometimes, the little things can make a big difference. We’ve added a new feature in Grafana Incident & Response Management (IRM) that lets you sync your Google Calendar out-of-office events with Grafana OnCall.

Insights of an Observability Advocate: The Challenges and Rewards

At a recent SRE Meetup in Bangalore, we had the pleasure of meeting Akshay Deshpande. During our conversation, Akshay, who manages a Performance/Observability Engineering team at Smarsh, discussed his passion for observability and his constant drive to improve the field. Smarsh helps companies gain valuable insights from their communication data, enabling them to identify potential regulatory and reputational risks before they escalate.