
SRE

The latest news and information on Site Reliability Engineering and related technologies.

Integrating Microsoft Teams & Squadcast - Acknowledge, Resolve & Reassign Incidents | Squadcast

Teams that use MS Teams can now integrate it with Squadcast to acknowledge, resolve, and reassign incidents directly from MS Teams. You can configure Squadcast to send a notification to the configured MS Teams channel as soon as an incident is triggered.

Creating Routing Rules | Creating Incident Routing Flows | Alert Routing | Event Tags | Squadcast

Alert Routing lets you configure Routing Rules so that alerts reach the right responder based on the event tags attached to them. This video explains how to use Routing Rules to create various incident routing flows.

Integrating Slack & Squadcast - Trigger, Acknowledge, Resolve & Reassign incidents from a Slack channel

You can integrate Squadcast and Slack to collaborate efficiently with your team while working on incidents. Squadcast sends a notification to the configured Slack Channel as soon as an incident is triggered.

Alert Suppression Rules in Squadcast to prevent Alert fatigue | Squadcast

Alert suppression can help you avoid alert fatigue by suppressing notifications for non-actionable alerts. Squadcast will suppress the incidents that match any of the Suppression Rules you create for your Services. These incidents will go into the Suppressed state and you will not get any notifications for them.

Using StatusPage at Squadcast | SRE Best practices | Squadcast

Let your customers know how your Services are doing, without them having to ask you about it. One of the core principles of SRE is transparency, and Status Pages help you communicate the status of your Services to your customers at all times, rather than you learning about that status through support tickets logged by your customers.

APImetrics + Squadcast: Routing Alerts Made Easy

APImetrics is an API Compliance, Monitoring and Security solution that lets you make and run API calls or sequences of API calls (workflows) from external, remote cloud locations using exactly the same security configurations as a typical end user would use. If you use APImetrics for API calling requirements, you can integrate it with Squadcast, an end-to-end incident response tool, to route detailed alerts from APImetrics to the right users in Squadcast.

SRE Maturity Model: How Do You Assess Your Team?

How do you evaluate your SRE team’s progress in implementing SRE? We discuss the key SRE indicators for evaluating your team’s progress in the SRE maturity model. What is the SRE maturity model? The SRE maturity model is a way of judging how far you are in implementing SRE principles. It is a method used by teams to understand where they ought to implement more SRE best practices to reach greater SRE maturity.

Observability Pipelines for an SRE

In data management, numerous roles rely on and regularly use observability data. The Site Reliability Engineer is one of these roles. Site Reliability Engineers (SREs) work on the digital frontlines, ensuring performant experiences by using observability data to maintain stability and awareness of software running in various environments across organizations.

How to design an effective incident on-call program

If anyone on your team has paged a colleague in the middle of the night, your DevOps team has an incident on-call program. Whether that team member knew who to page, and felt comfortable sending the page, is indicative of your on-call program's effectiveness. Join Thai Wood, founder of Resilience Roundup, and Matt Davis, SRE Advocate at Blameless, as they discuss how to design an effective on-call program. This webinar was recorded live on December 13, 2022.