Squadcast

2017
Palo Alto, CA, USA
May 27, 2020   |  By Squadcast
It can be quite challenging for an SRE team to maintain the well-being of a large-scale Kubernetes-based system with hundreds or thousands of services. In this blog post, Gigi Sayfan, author of “Mastering Kubernetes”, outlines the SRE challenge and how we can achieve the ultimate goal of automated SRE with Kubernetes operators.
May 20, 2020   |  By Squadcast
And it starts with the company culture. Irrespective of how small or large your team is, it’s wise to invest some time in creating a good on-call onboarding plan. A humane on-call is the mark of a good engineering culture. Being on-call means that you’re expected to be reachable for any issues that may occur during your shift. It’s easy to lose any and all motivation by just anxiously anticipating that mid-dinner ping.
May 7, 2020   |  By Squadcast
In an always-on world, companies look to systems and processes to keep their services up and running at all times. The most important part of maintaining this uptime is having an Incident Management process in place to restore your services in the event of an interruption or unplanned downtime. Incident Management processes are typically used by SRE, DevOps, NOC and other IT teams to respond to incidents that affect services and work on restoring their uptime.
Apr 30, 2020   |  By Squadcast
Leverage Multiple Alert Sources in Squadcast to reflect your actual system infrastructure on your Service Dashboard. Having your Incident Management Tool reflect your system architecture is a big milestone in reducing cognitive load on your on-call team. In order to help our users move one step closer to this milestone, we recently released the functionality to add multiple alert sources to a service. You can now model your service dashboard to mimic your actual system architecture.
Apr 27, 2020   |  By Squadcast
An incident postmortem is not only an essential document for reference, but also necessary as a process by which teams can collaboratively learn from failure, and communicate independent learnings across the organization.
Mar 18, 2020   |  By Squadcast
Squadcast is an incident management tool that’s purpose-built for SRE. Create a blameless culture by reducing the need for physical war rooms, centralize SLO dashboards, unify internal and external SLIs, automate incident resolution with Squadcast Actions, and create a knowledge base to handle incidents effectively.
Jan 6, 2020   |  By Squadcast
Many organisations already possess a vast amount of existing data about production systems. As customer expectations evolve, organisations are often challenged to find more proactive ways of dealing with traditionally reactive incident response activity. In this talk, we discuss approaches to unlock value from this data by making it truly actionable.
Dec 24, 2019   |  By Squadcast
Squadcast is an incident management tool that’s purpose-built for SRE. Create a blameless culture by reducing the need for physical war rooms, centralize SLO dashboards, unify internal and external SLIs, automate incident resolution with Squadcast Actions, and create a knowledge base to handle incidents effectively.
Nov 15, 2019   |  By Squadcast
Squadcast is an incident management tool that’s purpose-built for SRE. Create a blameless culture by reducing the need for physical war rooms, centralize SLO dashboards, unify internal and external SLIs, automate incident resolution with Squadcast Actions, and create a knowledge base to handle incidents effectively.