Monthly Archive

Things to do to make on-call less stressful

Jan 30, 2020 By Squadcast In Squadcast

How to manage on-call in a way that is less stressful and actually improves your incident response processes, uptime, and reliability.

Hiteshwar shares his thoughts on being an SRE

Jan 24, 2020 By Squadcast In Squadcast

Hiteshwar is an SRE based out of Mumbai, India, who specializes in distributed systems. He works on Kubernetes, running and maintaining his own custom clusters and creating tools to manage and monitor them. He likes to share what he learns by writing articles and blogs on Medium and LinkedIn. He is an active speaker at meetups and developer groups and also teaches DevOps and SRE practices at learning centers.

Arild Jensen from Upwork shares his thoughts on being an SRE

Jan 17, 2020 By Squadcast In Squadcast

Arild Jensen, SRE Manager at Upwork, talks about his journey into SRE and some best practices he picked up along the way, including implementing a blameless culture, doing code review, and making decisions based on hard data.

What you can show on your status page

Jan 14, 2020 By Squadcast In Squadcast

When something goes down, the first thing a customer does is check whether the problem lies with their own systems or with one of their service providers. It’s therefore important that your status page has all the information they need, so they don’t have to raise an issue or create a ticket, which adds to your support costs.

Using a Status Page in your Incident response process

Jan 10, 2020 By Prakya Vasudevan In Squadcast

A status page is a communication tool that displays the current working status of your various services: fully functional, partially degraded, severely affected, and so on. You can define the nomenclature of the service statuses yourself. On the status page, you can also access and update the uptime and incident history data for all your internal-facing or customer-impacting components.

Reducing On-call Alert Fatigue with Deduplication

Jan 8, 2020 By Prakya Vasudevan In Squadcast

Alert noise is a very common on-call complaint, and it leads to fatigue and burnout. This article is an attempt to help folks address the problem.

Making Observability Actionable at Scale - Sisir Koppaka | DBS DevConnect 2019

Jan 6, 2020 By Squadcast In Squadcast

Many organisations already possess a vast amount of existing data about production systems. As customer expectations evolve, organisations are often challenged to find more proactive ways of dealing with traditionally reactive incident response activity. In this talk, we discuss approaches to unlock value from this data by making it truly actionable.
