February 2023

Reducing Security Incidents: Implementing Docker Image Security Scanner

Feb 28, 2023 By Shishir Khandelwal In Squadcast

Are you using Docker to deploy your applications? If so, you're not alone. Docker's popularity has skyrocketed in recent years. While it offers numerous benefits, it also introduces new security risks that need to be addressed. But why is reducing security incidents so important? Simple: the cost of a security breach can be devastating. From lost customer trust to financial losses, the consequences of a security incident can be severe. That's why it's crucial to take steps to prevent them from occurring in the first place. Enter Docker image security scanners.

Read Post


Webinar on 'Evolution of Incident Management from On-Call to SRE' | Squadcast

Feb 26, 2023 By Squadcast In Squadcast

Incident Management has evolved considerably over the last decade, more so in the last few years. What was traditionally limited to an in-house on-call team and an alerting system has now grown well beyond that to ensure Automation, Collaboration, Transparency, and Retrospection are deeply entrenched in Incident Response.

View Video


Site Reliability Engineer: Responsibilities, Roles and Salaries

Feb 24, 2023 By Stephen Watts In Splunk

DevOps gained popularity in order to combat siloed workflows, decreased collaboration and a lack of visibility across the software development lifecycle. While establishing a culture of DevOps has helped teams collaborate better and deliver reliable software faster, DevOps teams don’t necessarily have someone specifically dedicated to developing systems that increase site reliability and performance. That’s where a site reliability engineer (SRE) comes into the picture.

Read Post


Strategies for Kubernetes Cluster Administrators: Understanding Pod Scheduling

Feb 22, 2023 By Shishir Khandelwal In Squadcast

Kubernetes has revolutionized container orchestration, allowing developers to deploy and manage applications at scale. However, as the complexity of a Kubernetes cluster grows, managing resources such as CPU and memory becomes more challenging. Efficient pod scheduling is critical to ensure optimal resource utilization and enable a stable and responsive environment for applications to run in.

Read Post


Getting Started with Incident Details Page I Attaching Runbooks & Postmortems I Squadcast

Feb 19, 2023 By Squadcast In Squadcast

An Incident Details Page in Squadcast gives you a detailed overview of an incident right from when it is created till it is resolved.

View Video


Communication Channels in Squadcast | Incident Management | Squadcast

Feb 19, 2023 By Squadcast In Squadcast

Communication Channels help you add Video Call links, ChatOps links, and other external links to an incident. Additionally, you can create a dedicated Slack Channel for an incident using the Communications Card.

View Video


Deduplication Rules | Reduce Alert Noise by Clustering Similar Alerts I Squadcast

Feb 18, 2023 By Squadcast In Squadcast

Alert Deduplication can help you reduce alert noise by organising and grouping alerts. It also provides easy access to similar alerts when needed. This video on Alert Deduplication will help you define Deduplication Rules for each Service in Squadcast. Alerts are deduplicated when these rules evaluate to true for an incoming incident.

View Video


Incident Management Process | A Step-By-Step Guide

Feb 17, 2023 By Noor-ul-Anam Ruqayya In Blameless

How does an incident management workflow look? We give a step-by-step guide to the ITIL process and best practices for an effective resolution.

Read Post


Maintenance Mode in Squadcast I Creating Maintenance Windows for Services | Squadcast

Feb 17, 2023 By Squadcast In Squadcast

This video explains how Maintenance Mode enables you to reduce alert noise during the scheduled maintenance window and how alert notifications for false-positive incidents can be suppressed during Maintenance windows.

View Video


Integrating Slack & Squadcast- Trigger, Acknowledge, Resolve & Reassign incidents from Slack channel

Feb 16, 2023 By Squadcast In Squadcast

You can integrate Squadcast and Slack to collaborate efficiently with your team while working on incidents. Squadcast sends a notification to the configured Slack Channel as soon as an incident is triggered.

View Video


SLA vs. SLO vs. SLI (Differences Explained)

Feb 16, 2023 By Emily Arnott In Blameless

Wondering about SLAs, SLOs, and SLIs? We explain service level agreements, service level objectives, and service level indicators, their differences, and the importance of each.

Read Post


Rootly Overview - Quick Demo

Feb 16, 2023 By Rootly In Rootly

View Video


Creating Tickets in Jira From Squadcast I Jira Integration (Cloud & Server) I Squadcast

Feb 15, 2023 By Squadcast In Squadcast

This video will help you install and configure the Squadcast extension for Jira Cloud & Jira Server. It will help you create tickets in Jira projects whenever there is an incident in Squadcast. Also, learn to automatically or manually sync the status bidirectionally.

View Video


Integrating Microsoft Teams & Squadcast - Acknowledge, Resolve & Reassign Incidents | Squadcast

Feb 15, 2023 By Squadcast In Squadcast

Teams using MS Teams can now integrate with Squadcast and easily Acknowledge, Resolve & Reassign incidents using MS Teams. You can configure Squadcast to send a notification to the configured MS Teams channel as soon as an incident is triggered.

View Video


CommsFlow Messaging Templates | Blameless

Feb 14, 2023 By Emily Arnott In Blameless

Effective communication is critical during incidents. In order to minimize the impact of an incident and resolve it quickly, it's important that all stakeholders are kept informed and updated throughout the incident response process. However, communicating during an incident can be challenging, especially when dealing with multiple stakeholders and a high level of stress. On-call engineers can have their focus disrupted by switching out of their diagnostic tools to issue communications.

Read Post


Types of Incident Retrospective Templates

Feb 14, 2023 By Emily Arnott In Blameless

When an incident occurs, it's important to take the time to review what happened, understand all the contributing factors, and identify systemic changes to prevent similar incidents from happening in the future. This process is known as an incident retrospective. However, conducting incident retrospectives can be time-consuming and difficult, especially when dealing with multiple stakeholders and a large amount of data.

Read Post


Take the "work" out of your incident workflow: Integrating Blameless with Opsgenie

Feb 14, 2023 By Blameless In Blameless

Assemble the right team for incident management fast with the new bidirectional integration of Blameless and OpsGenie. In this 30-minute live webinar, Blameless's Aaron Lober, Paul Chu, and Nicolas Philip show you how to seamlessly connect your alerting and service registry to your incident response processes. Webinar includes a live demo.

View Video


Reporting Incident Using Webforms I Creating Alerts from Outside the Squadcast Ecosystem I Squadcast

Feb 12, 2023 By Squadcast In Squadcast

Webforms help an organization's stakeholders and customers report issues easily. This video explains how users outside the Squadcast ecosystem can report incidents by filling out a simple form, and how Webforms extend customer support by letting internal stakeholders and customers report issues on the go.

View Video


How To Setup Outgoing Webhooks in Squadcast | Receiving Incident Information | Squadcast

Feb 10, 2023 By Squadcast In Squadcast

Webhooks allow you to connect a platform you manage (either an API you create by yourself or a third-party service) to a stream of future events. Setting up a Webhook on Squadcast enables you to receive information (referred to as events) from Squadcast as they happen. This can help you avoid continuously polling Squadcast’s REST APIs or manually checking the Squadcast web/mobile application for desired information.
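The push model described above can be sketched as a small receiver that accepts POSTed events instead of polling for them. This is a minimal illustration only, not Squadcast's documented payload format: the field names `event_type` and `incident_id`, the function `summarize_event`, and the class `WebhookHandler` are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler


def summarize_event(body: bytes) -> str:
    """Parse a JSON webhook body and return a short summary string.

    The keys "event_type" and "incident_id" are assumed for illustration;
    a real payload may use different names.
    """
    event = json.loads(body)
    event_type = event.get("event_type", "unknown")
    incident_id = event.get("incident_id", "n/a")
    return f"{event_type}:{incident_id}"


class WebhookHandler(BaseHTTPRequestHandler):
    """Accepts POSTed events as they happen, so no polling loop is needed."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print(summarize_event(body))
            self.send_response(204)  # accepted, no response body
        except json.JSONDecodeError:
            self.send_response(400)  # body was not valid JSON
        self.end_headers()
```

To try it locally, the handler could be served with `http.server.HTTPServer(("", 8000), WebhookHandler).serve_forever()`, with the webhook pointed at that address.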

View Video


How to Set up SLOs and Configure SLIs in Squadcast | Tracking Error Budget & Burn Rates | Squadcast

Feb 9, 2023 By Squadcast In Squadcast

This video will help you define and monitor Service Level Objectives for your services and also set up and track error budget burn rates in Squadcast. A Service Level Objective (SLO) is a reliability target, measured by a Service Level Indicator (SLI), and sometimes serves as a safeguard for a Service Level Agreement (SLA). SLOs represent customer happiness and guide the development team’s velocity.
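As a rough illustration of how an error budget and its burn rate follow from an SLO target, here is a minimal sketch using the common event-based definitions. It is not taken from Squadcast's implementation; the function names `error_budget` and `burn_rate` and the sample numbers are assumptions.

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Number of failed events the SLO tolerates over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1.0 - slo_target) * total_events


def burn_rate(slo_target: float, good_events: int, total_events: int) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 means the budget is being consumed exactly on pace;
    values above 1.0 mean it will run out before the window ends.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 0.0  # no traffic, nothing burned
    observed_error_rate = 1.0 - good_events / total_events
    allowed_error_rate = 1.0 - slo_target
    return observed_error_rate / allowed_error_rate


# Example: a 99.9% SLO over one million requests allows about 1,000 failures.
budget = error_budget(0.999, 1_000_000)
# With 2,000 failures observed, the budget burns at roughly twice the allowed pace.
rate = burn_rate(0.999, 998_000, 1_000_000)
```

Here the SLI is the ratio of good events to total events, and the SLO target is the threshold that ratio is measured against.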

View Video


5 Best practices for developing a culture of continuous improvement

Feb 9, 2023 By Aaron Lober In Blameless

How do you create a great engineering team? Exclusively hire brilliant, tenured computer science PhDs. There we solved it. You can skip the next 400 words. (I can hear my college professor in my head saying “Humor might not be your strong suit”) Building a great engineering team isn’t easy. Understatement of the year. It’s not even a problem to be solved per se. We need to think about it as preparation to solve an infinite set of constantly evolving problems.

Read Post


Suppression Rules in Squadcast | Minimise Alert fatigue | Suppress Non-Actionable Alerts | Squadcast

Feb 8, 2023 By Squadcast In Squadcast

This video talks about Alert suppression in Squadcast. Alert Suppression helps you avoid alert fatigue by suppressing notifications for non-actionable alerts. Squadcast will suppress the incidents that match any of the Suppression Rules you create for your Services. These incidents will go into the Suppressed state and you will not get any notifications for them.

View Video


Using Tagging and Routing Rules in Squadcast I Incident Classification I Event Tagging I Squadcast

Feb 7, 2023 By Squadcast In Squadcast

Event Tagging is a rule-based, auto-tagging system with which you can define customized tags based on incident payloads. These tags are automatically assigned to incidents when they are triggered. This video explains how to create Tagging rules for efficient Incident Classification.

View Video


Announcing our improved Schedules & On-Call Rotations

Feb 7, 2023 By Nakul Shetty In Squadcast

Hey folks! We are super excited to announce that our schedules feature has gone through a bit of an update. Well, more than a bit 🙂. We’ve gone through the feature with a fine-toothed comb and introduced a bunch of UI and functional improvements which we hope will help you achieve one thing: set up, edit and manage your on-call schedules at scale in a matter of minutes (Yes, that was three things but it was tough to condense it to ONE thing)

Read Post


SRE Report 2023: Findings From the Field - Toil

Feb 7, 2023 By Kurt Andersen In Catchpoint

Toil. Few other words have the same visceral impact for SREs as their four-letter nemesis: toil. Although pretty much everyone recognizes and agrees that toil is bad, it is a term that is frequently misused in colloquial use. In common English usage, toil is defined as “long strenuous fatiguing labor”. As a term of art in the SRE profession, “toil” has several very specific characteristics which distinguish it from other sorts of work which people spend time on.

Read Post


[SRE: From Theory to Practice] What's difficult about problem detection?

Feb 7, 2023 By Blameless In Blameless

In this episode of FTTP, Kurt Andersen and Matt Davis are joined by Joanna Mazgaj and Laura Nolan to talk about the implications of and considerations for problem detection. Watch the full episode and hear them share personal stories about the types of challenges you might face. Ultimately, how do we explain and address the socio-technical concepts behind problem detection?

View Video


[SRE: From Theory to Practice] What's difficult about incident command?

Feb 7, 2023 By Blameless In Blameless

Welcome back to our mini series of fireside chats with SRE experts talking about the realities of their day-to-day. Episode 2 gets intimate — What’s difficult about incident command? We invited Alyson van Hardenberg, Engineering Manager at Honeycomb.io, and Varun Pal, Staff SRE at Procore, to chat with Jake Englund and Matt Davis from the Blameless team. Watch the full conversation where they cover everything from methodologies and technical expertise to the human and social aspects of reliability engineering.

View Video


Adding Incident Watchers in Squadcast | Incident Notifications and Updates | Squadcast

Feb 6, 2023 By Squadcast In Squadcast

This video talks about Squadcast's Incident Watchers feature. In Squadcast, any user or stakeholder can subscribe to an incident and act as a Watcher, observing the incident even if they are not an active responder. Incident Watchers can choose to receive notifications for all updates to an incident, or customize their watch options to receive notifications only for selected updates.

View Video


SRE Vs. DevOps: A Simple Breakdown Of The Differences

Feb 3, 2023 By CloudZero In CloudZero

You know this already. Regardless of your size, you must keep up with technological developments in your industry — and, increasingly, in other industries, even those that seem unrelated. Embracing disruption can enable you to increase your market share, revenue, and profit margins. Delegating some development and operations responsibilities to Site Reliability Engineering (SRE) experts allows developers to innovate and create new solutions faster.

Read Post


SRE Principles for Edge Management and Improving Resiliency Using the Best of Kubernetes

Feb 3, 2023 By Kirti Apte and Gabry (Maria Gabriella) Brodi In VMware Tanzu

This post was co-written by Kirti Apte and Gabry (Maria Gabriella) Brodi. Over the last couple of years, customers have been adopting Kubernetes and microservice-based application deployment models for various technology and business reasons. In fact, there is a trend that customers are now looking to the next set of use cases that include applications across multiple clouds, as well as edge clouds.

Read Post


Blameless Announces New Opsgenie Integration to Help Engineers Simplify and Speed Incident Management Workflow

Feb 1, 2023 By Blameless In Blameless

Assemble the Right Team to Resolve Incidents Fast by Integrating Alerting and Service Catalog Functions.

Read Post


Announcing: Blameless + OpsGenie Integration

Feb 1, 2023 By Aaron Lober In Blameless

In the opening moments of an engineering incident, the most important aspect of a response plan is speed. Getting out of the gate quickly by leveraging automation to assemble the team can save precious moments during a critical engineering incident and make the difference between happy and unhappy customers downstream. This is why we’re excited to announce the integration of Blameless with OpsGenie.

Read Post

