October 2022

Why 'owning Services' is critical for effective Incident Response

Oct 31, 2022 By Vardhan NS In Squadcast

There is a famous quote that goes, ‘For every minute spent organizing, an hour is earned.’ In the world of incident response, at least, nothing could be more apt. Digital infrastructure these days is made up of multiple services, and an outage could result from one impacted service or several. So it's essential to have a catalog of all the services, along with the point of contact (service owner) responsible for maintaining each one.

On Building a Platform Team

Oct 31, 2022 By Jess Mink In Honeycomb

It may surprise you to hear, but Honeycomb doesn’t currently have a platform team. We have a platform org, and my title is Director of Platform Engineering. We have engineers doing platform work. We even have an SRE team and a core services team. But a platform team? Nope. I’ve been thinking about what it might mean to build a platform team up from scratch (a situation some of you may also be in), and it led me to ask some crucial questions. What should such a team own?

Blameless - Integrated, proactive incident management

Oct 28, 2022 By Blameless In Blameless

Recruit the right players, manage communications and capture everything without ever leaving Slack. When things break, Blameless has you covered.

Routing alerts from AWS Elastic Beanstalk via CloudWatch

Oct 27, 2022 By Vishal Padghan In Squadcast

Amazon Web Services (AWS) offers 100+ services, each focusing on a specific area of functionality. However, it can be challenging to pick the right services for a task and to provision them. AWS Elastic Beanstalk lets you easily deploy and manage applications without needing to learn about the underlying infrastructure that runs them.

Incident Tracking - How it Works & Why It Matters | Blameless

Oct 26, 2022 By Noor-ul-Anam Ruqayya In Blameless

Looking into incident tracking? We explain what incident tracking is, how it’s done, and why it matters.

Introduction to Automation Testing Strategies For Microservices

Oct 25, 2022 By Rajiv Srivastava In Squadcast

Microservices are distributed applications that are deployed in different environments and may be developed in different programming languages, backed by different databases, with many internal and external communications. A microservice architecture depends on multiple interdependent applications for its end-to-end functionality. This complexity calls for a systematic testing strategy that ensures end-to-end (E2E) coverage for any given use case. In this blog, we will discuss some of the most widely adopted automation testing strategies for microservices, using the testing triangle approach.

What Is Infrastructure Monitoring & How Does It Work?

Oct 19, 2022 By Myra Nizami In Blameless

We explain what infrastructure monitoring is, how it works, how to overcome the challenges in complex systems, best practices for monitoring, and the tools you need.

Authors' Cut-Gear up! Exploring the Broader Observability Ecosystem of Cloud-Native, DevOps, and SRE

Oct 13, 2022 By Liz Fong-Jones In Honeycomb

You know that old adage about not seeing the forest for the trees? In our Authors’ Cut series, we’ve been looking at the trees that make up the observability forest—among them, CI/CD pipelines, Service Level Objectives, and the Core Analysis Loop. Today, I'd like to step back and take a look at how observability fits into the broader technical and cultural shifts in technology: cloud-native, DevOps, and SRE.

SRE Fundamentals: Everything you need to know

Oct 13, 2022 By Cortex In Cortex

Google has had an outsized impact on the world, from its unrivaled search engine to its expansion into a range of customer-focused services. It would be difficult to make an impact of this magnitude without also leading the way in the software development industry. One of its biggest contributions to the community is a set of principles known as site reliability engineering or SRE.

Reliability vs. Availability: What's The Difference?

Oct 12, 2022 By Noor-ul-Anam Ruqayya In Blameless

Reliability and availability have different meanings when it comes to software. What are the differences and what is the importance of each?

Setting better SLOs using Google's Golden Signals

Oct 11, 2022 By Andre Newman In Gremlin

To many engineers, the idea that you can accurately and comprehensively track your application's user experience using just a few simple metrics might sound far-fetched. Believe it or not, there are four metrics that aim to do just that. They're called the four Golden Signals and should be a core part of your observability and reliability practices.

The Blameless Complete Guide to Incident Management

Oct 10, 2022 By

Incidents are inevitable. As your service expands and becomes more complex, you are more likely to encounter outages, slowdowns, errors, and other disruptions to healthy operation. At the same time, as your service becomes more popular and relied on by users, the cost of incidents becomes higher. Studies have shown that the cost of downtime is high, and growing fast in the digital-first world. Since you can never fully prevent incidents, it's important to resolve them as efficiently as possible.

Blameless

How Many SREs Does Your Company Need? Here's How to Decide

Oct 9, 2022 By JJ Tang In Rootly

So you’ve decided to take advantage of Site Reliability Engineering by hiring SREs for your company. Now, you have a second decision to make: Exactly how many SREs to hire. Do you need just one or two SREs? Or should you build a sprawling SRE team, with a dozen or more SREs on hand to support your organization’s reliability needs? The answers to these questions will, of course, vary; every business’s needs are different.

Announcing Incident watchers: Subscribe to incidents and receive incident updates in real-time

Oct 6, 2022 By Nakul Shetty In Squadcast

Hey folks, we’re back with another feature update for all our customers! We recently went live with the incident watchers feature, which sits within the incident details page. This blog will outline how you can access the feature, its primary functions, and how we foresee it improving your incident management process. Note: This feature will be available to pro, premium and enterprise plan users only.

SRE Hiring Guide - Interview Questions and Skills to Look for

Oct 5, 2022 By Myra Nizami In Blameless

Are you looking to start an SRE team or add to your existing team? We explain the SRE hiring process and how to find and evaluate an SRE.

Kubernetes alternatives to Spring Java framework

Oct 4, 2022 By Rajiv Srivastava In Squadcast

Spring Cloud and Kubernetes complement each other to build a cloud-native platform and run microservices in Kubernetes containers. Kubernetes provides many features that are similar to those of Spring Cloud and Spring Config Server. The Spring framework has been around for many years. Even today, many organizations prefer Spring libraries because they provide so many features. It's a great deal when developers have total control over cloud configuration along with business logic source code.

Introducing Squadcast Premium

Oct 3, 2022 By Squadcast Community In Squadcast

For the last few years, Squadcast has been building out a market-leading on-call and alert management solution. Over the past few quarters, we have significantly enhanced our on-call product by releasing and improving features related to Incident Response - including Slack / MS Teams integration, Runbooks, Postmortems, Service Level Objectives, and Status Pages. We believe that a reliability platform involves both on-call and incident response - one cannot work effectively without the other.
