
The latest News and Information on Service Reliability Engineering and related technologies.

Using Squadcast's SLO Tracker | Error Budget | Setting up SLOs and configuring SLIs | Squadcast

With Squadcast, you can define and monitor Service Level Objectives for your services. SLOs allow you to define and enforce an agreement between two parties regarding the delivery of a given service. A Service Level Objective (SLO) is a reliability target, measured by a Service Level Indicator (SLI), and sometimes serves as a safeguard for a Service Level Agreement (SLA). SLOs represent customer happiness and guide the development team’s velocity.
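To make the relationship between an SLO, its SLI, and the error budget concrete, here is a minimal sketch in Python. It is not Squadcast's implementation or API: the function name `error_budget`, the request-based SLI (good events divided by total events), and the sample numbers are all assumptions chosen for illustration.

```python
def error_budget(slo_target: float, total_events: int, bad_events: int) -> dict:
    """Summarize one measurement window for a request-based SLO.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= bad_events <= total_events:
        raise ValueError("bad_events must be between 0 and total_events")

    # SLI: the measured fraction of good events in the window.
    sli = (total_events - bad_events) / total_events
    # Error budget: how many bad events the SLO tolerates in this window.
    allowed_bad_events = (1.0 - slo_target) * total_events
    # Fraction of the budget still unspent (negative once the budget is blown).
    budget_remaining = 1.0 - bad_events / allowed_bad_events

    return {
        "sli": sli,
        "allowed_bad_events": allowed_bad_events,
        "budget_remaining": budget_remaining,
        "slo_met": sli >= slo_target,
    }


# Hypothetical window: 1,000,000 requests, 400 failures, 99.9% target.
report = error_budget(0.999, 1_000_000, 400)
print(report)
```

With these sample numbers the budget allows roughly 1,000 failed requests, so about 60% of the budget is still unspent and the SLO is met.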

Introduction to Service Catalog | Service Ownership | Service Classification | Squadcast

To make service management a breeze, we bring to you our improved Service Catalog. The Service Catalog is designed to improve Service Classification and bring more transparency to Service Ownership within your org. This video explains how a consolidated summary of all active services from a single dashboard can help you better track your service health.
Sponsored Post

Outages ITOps professionals are thankful to avoid

As we settle into the time of year when we reflect on what we're thankful for, we tend to focus on important basics such as health, family and friends. But on a professional level, IT operations (ITOps) practitioners are thankful to avoid disastrous outages that can cause confusion, frustration, lost revenue and damaged reputations. The very last thing ITOps, network operations center (NOC) or site reliability engineering (SRE) teams want while eating their turkey and enjoying time with family is to get paged about an outage. Outages can be extremely costly: $12,913 per minute, in fact, and up to $1.5 million per hour for larger organizations.
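The two cost figures are quoted in different units, so a quick calculation helps compare them. The sketch below uses only the numbers from the paragraph above; the function name `outage_cost` and the assumption that cost scales linearly with outage duration are illustrative, not claims from the original post.

```python
# Figures quoted in the text above.
COST_PER_MINUTE_USD = 12_913
LARGE_ORG_COST_PER_HOUR_USD = 1_500_000


def outage_cost(minutes: int, cost_per_minute: int = COST_PER_MINUTE_USD) -> int:
    """Estimate outage cost, assuming cost grows linearly with duration."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


# One hour at the per-minute figure.
print(outage_cost(60))  # 774780

# The larger-organization hourly figure expressed per minute.
print(LARGE_ORG_COST_PER_HOUR_USD // 60)  # 25000
```

Under that linear assumption, the per-minute figure works out to about $775,000 per hour, roughly half of the upper bound quoted for larger organizations.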

Toil: Still Plaguing Engineering Teams

Our industry has always had localized expressions for work that was necessary but didn’t move the company forward. The SRE movement calls this type of work “toil.” The concept of toil is a unifying force because it provides an impartial framework for identifying — then containing — the work that takes up our time, blocks people from fulfilling their engineering potential, and doesn’t move the company forward.

Postmark + Squadcast Integration: Simplifying Alert Routing

Postmark is a simple email delivery service used to send transactional and marketing emails, and it ensures they are delivered to the inbox on time, every time. It also helps reduce email delivery time considerably. If you use Postmark for your email delivery requirements, you can integrate it with Squadcast, an end-to-end incident response tool, to route detailed alerts from Postmark to the right users in Squadcast. The steps below will help you set up the Postmark and Squadcast integration.

Day in the life of an SRE

We spoke with two members of the SRE team, Alex Blyth and Zulhilmi Zainudin, to learn more about their role at Civo. Through this series, we aim to provide you with an overview of the different roles we have at Civo and the advice our team has to offer. You can discover more about our team in our “day in the life of a Go Dev” and “day in the life of an Intern” blog posts.

CircleCI + Squadcast Integration: Alert Routing Made Easy

CircleCI is a continuous integration and continuous delivery (CI/CD) platform that helps in implementing DevOps practices. It is used to build, test, and deploy projects by automating pipelines with jobs. If you use CircleCI for implementing your DevOps practices, you can now integrate it with Squadcast to route detailed alerts to the right users in Squadcast. The steps below will help you set up the CircleCI and Squadcast integration.

Reducing MTTR for DevOps and SREs with PagerDuty Process Automation and InfluxDB

Mean time to resolution (MTTR) is a metric that transcends industry and technology. It’s a measure of how quickly, on average, support teams identify, act on, and resolve IT issues and incidents. Because MTTR directly relates to service quality, maintaining a low MTTR is a critical goal for DevOps and SRE teams. These teams have a vested interest in resolving issues quickly because escalating incidents to higher levels of the support team increases response and resolution times.
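As a rough illustration of the metric, MTTR can be computed as the average time between when each incident is detected and when it is resolved. The sketch below is not taken from PagerDuty or InfluxDB; the function name `mean_time_to_resolution`, the (detected, resolved) tuple layout, and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average (resolved - detected) over a non-empty list of incidents.

    Each incident is a (detected_at, resolved_at) pair of datetimes.
    """
    if not incidents:
        raise ValueError("need at least one resolved incident")
    durations = [resolved - detected for detected, resolved in incidents]
    # Start the sum from a zero timedelta so the result stays a timedelta.
    return sum(durations, timedelta()) / len(durations)


# Hypothetical sample data: one 30-minute and one 90-minute incident.
incidents = [
    (datetime(2022, 11, 1, 9, 0), datetime(2022, 11, 1, 9, 30)),
    (datetime(2022, 11, 3, 14, 0), datetime(2022, 11, 3, 15, 30)),
]
mttr = mean_time_to_resolution(incidents)
print(mttr)  # 1:00:00
```

Because the result is a plain average, a single long-running incident can dominate it, which is one reason escalations that lengthen resolution times show up so clearly in this metric.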