SRE

The latest news and information on Site Reliability Engineering and related technologies.

squadcast

Postmark + Squadcast Integration: Simplifying Alert Routing

Postmark is a simple email delivery service for sending transactional and marketing emails, and it ensures they are delivered to the inbox on time, every time. It also helps reduce email delivery time considerably. If you use Postmark for email delivery, you can integrate it with Squadcast, an end-to-end incident response tool, to route detailed alerts from Postmark to the right users in Squadcast. The steps below will help you set up the Postmark and Squadcast integration.

civo

Day in the life of an SRE

We spoke with two members of the SRE team, Alex Blyth and Zulhilmi Zainudin, to learn more about their role at Civo. Through this series, we aim to give you an overview of the different roles we have at Civo and the advice our team has to offer. You can discover more about our team in our “day in the life of a Go Dev” and “day in the life of an Intern” blogs.

squadcast

CircleCI + Squadcast Integration: Alert Routing Made Easy

CircleCI is a continuous integration and continuous delivery (CI/CD) platform that helps teams implement DevOps practices. It is used to build, test, and deploy projects by automating pipelines with jobs. If you use CircleCI to implement your DevOps practices, you can now integrate it with Squadcast to route detailed alerts to the right users in Squadcast. The steps below will help you set up the CircleCI and Squadcast integration.

influxdata

Reducing MTTR for DevOps and SREs with PagerDuty Process Automation and InfluxDB

Mean time to resolution (MTTR) is a metric that transcends industry and technology. It’s a measure of how quickly, on average, support teams identify, act on, and resolve IT issues and incidents. Because MTTR directly relates to service quality, maintaining a low MTTR is a critical goal for DevOps and SRE teams. These teams have a vested interest in resolving issues quickly because escalating incidents to higher levels of the support team increases response and resolution times.
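As a rough illustration of the "on average" part of that definition, MTTR can be computed as the mean of each incident's time from open to resolution. The sketch below is not taken from the linked article: the incident timestamps and the function name `mean_time_to_resolution` are hypothetical, chosen only to show the arithmetic.

```python
from datetime import datetime, timedelta

# Hypothetical incident records as (opened, resolved) timestamp pairs.
incidents = [
    (datetime(2023, 1, 3, 9, 0), datetime(2023, 1, 3, 9, 45)),      # 45 minutes
    (datetime(2023, 1, 10, 14, 30), datetime(2023, 1, 10, 16, 0)),  # 90 minutes
    (datetime(2023, 1, 21, 2, 15), datetime(2023, 1, 21, 2, 30)),   # 15 minutes
]


def mean_time_to_resolution(records):
    """Return the average open-to-resolved duration as a timedelta."""
    if not records:
        raise ValueError("need at least one resolved incident")
    # Sum the individual durations, starting from a zero timedelta.
    total = sum((resolved - opened for opened, resolved in records), timedelta())
    return total / len(records)


mttr = mean_time_to_resolution(incidents)
print(mttr)  # 0:50:00
```

With these three sample incidents the total is 150 minutes, so the mean is 50 minutes. Teams may define the start and end of an incident differently (for example, detection versus first report), which changes the number without changing the calculation.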

catchpoint

My Most Surprising Discoveries from The SRE Report 2023

I’ve had the honor and privilege of authoring The SRE Report for the last three years. For the 2023 version, this included working with some amazing individuals like Anna Jones, Kurt Andersen, and Steve McGhee. Download The SRE Report 2023 here (no registration required).

blameless

Blameless culture drives incident learning and other key insights from Catchpoint's 2022 SRE Report

SRE is a constantly evolving field, responding to the challenges of increasing reliance on tech and the opportunities of its evolving abilities. Reliability has to remain a step ahead of the cutting edge, whether it’s navigating remote work, implementing AI assistance, or optimizing internal processes. But how do we know that SRE is keeping up? We’re proud and excited to announce the results of the SRE Survey we ran in partnership with Catchpoint.

catchpoint

Empower the SREs - Conclusions from The SRE Report 2023

Let's be honest, nobody loves surveys. Ok, well I sure don't. But surveys satisfy a huge need in our demand for insights into complex human-computer, sociotechnical systems. It turns out that we've been measuring the computer part pretty well, but the humans – not as easy to keep track of. When Google SRE first defined toil as a metric we wanted to reduce, we spent far too long trying to quantify it numerically based on tooling and insights from computer systems.