January 2024

How Organizations Hire SREs: Laterals or Internal?

Jan 27, 2024 By Anjali Udasi In Zenduty

Reliable systems depend on a strong Site Reliability Engineering (SRE) team. Every organization building one faces a strategic decision: develop SRE talent internally or hire from the external talent pool? Both approaches have distinct advantages and disadvantages, and each shapes the composition, skill set, and overall effectiveness of the SRE team.

8 Strategies for Reducing Alert Fatigue

Jan 16, 2024 By Anjali Udasi In Zenduty

Site Reliability Engineers (SREs) and DevOps teams often deal with alert fatigue: so many alerts arrive that it becomes hard to keep up, which slows response and adds stress on top of existing responsibilities. According to one study, 62% of participants noted that alert fatigue played a role in employee turnover, while 60% reported that it resulted in internal conflicts within their organization.

Tech is Easy, People are Hard - Incidentally Reliable with Suresh Kumar Khemka (Head of Infra @apna)

Jan 15, 2024 By Zenduty In Zenduty

Settle in and listen to Suresh Kumar Khemka (Head of Platform & Infra at apna) talk about platform engineering, balancing bureaucracy and velocity at startups and tech giants, and the ripple effects of downtime at an e-commerce company. Exclusively on The Incidentally Reliable podcast, made by SREs for SREs and hosted by Zenduty.

Non-Abstract Large System Design (NALSD): The Ultimate Guide

Jan 13, 2024 By Anjali Udasi In Zenduty

Non-Abstract Large System Design (NALSD) is an approach where intricate systems are crafted with precision and purpose. It holds particular importance for Site Reliability Engineers (SREs) due to its inherent alignment with the core principles and goals of SRE practices. It improves the reliability of systems, allows for scalable architectures, optimizes performance, encourages fault tolerance, streamlines the processes of monitoring and debugging, and enables efficient incident response.

How to Calculate and Minimize Downtime Costs

Jan 5, 2024 By Anjali Udasi In Zenduty

Downtime is an unwelcome reality. Beyond the immediate disruption, outages carry a significant financial burden, affecting revenue, customer satisfaction, and brand reputation. For SREs and IT professionals, understanding the cost of downtime is crucial to mitigating its impact and building a more resilient infrastructure.
