Operations | Monitoring | ITSM | DevOps | Cloud

Fireside Chat with Jesse Robbins and Kolton Andrus Failover Conf 2021

Long before Chaos Engineering was even a phrase, Jesse Robbins was Amazon.com's "Master of Disaster" using intentional failure to help the company become more reliable. Kolton Andrus (CEO at Gremlin), sits down with Jesse to learn more about his early work with GameDays, the evolution of reliability, and where the future of SRE lies.

Fireside Chat with Ines Sombra and Ana Medina Failover Conf 2021

Reliability is a requirement for the modern internet. Ana Medina joins Inés Sombra, Sr. Director of Engineering at Fastly, to discuss their approach to resilience, how the past year has influenced the way they work, and what practices your engineering organization can adopt to become more reliable.

Atlassian Open DevOps Overview Video

Software and DevOps teams have everything they need to develop, ship, and operate software in Atlassian Open DevOps, an development experience built on Jira Software.Open DevOps starts with Jira Software, Confluence, Bitbucket, and Opsgenie. Teams can easily add the tools they want, such as GitHub or GitLab, with a single click. In this video get an overview of Open DevOps and how it can supercharge your development.

Leaving the Nest: Guidelines, guardrails, and human error by Laura Santamaria Failover Conf 2021

When we talk about reliable systems, we talk a lot about human error. Human error in an incident or a bug report is often treated with a bit of a facepalm reaction. The term masks a lot of scenarios from accidents to exhaustion to everything in between. However, human error helps us understand where our processes failed and how we can prevent the same error from happening again. In short, we need to think in terms of a framework of guidelines and guardrails. In this short talk, let’s discuss how guidelines like runbooks and guardrails like automation can help us address the fact that everyone will, at some point, make mistakes.

Implementing DevSecOps in the DoD by Nicolas Chaillan Failover Conf 2021

Delivering software quickly and securely is important for every organization, but it's even more important at the US Department of Defence (DoD) where reliability directly impacts national security. Nicolas Chaillan (Chief Software Officer, US Air Force) will discuss the DoD Enterprise DevSecOps Initiative—an initiative he leads along with the DOD’s Chief Information Officer that brings automated software tools, services and standards to DoD programs. He'll also share about Platform One, the Air Force's DoD-wide DevSecOps Enterprise Level Service that provides managed IT services capabilities, on-boarding, support, and baked-in zero trust security. This insight from operating at the most rigorous level will help you level up your own organization.

Pragmatic Incident Response: Lessons learned from failures by Robert Ross Failover Conf 2021

Incident response is overwhelming. So where do you start? There's a lot of advice out there, but it's mostly theories that aren't taking reality into account. So how do you get a process in place that actually works and scales? In this session, FireHydrant CEO and Co-Founder, Robert Ross, will share quick stories from his experience as an SRE and what tips he’s learned along the way.