
August 2023

10 Observability Tools in 2023: Features, Market Share, and How to Choose the Right One for You

Understanding what's happening within your systems is a necessity. Have you ever wondered how experts keep an eye on systems to make sure everything is running smoothly? That's where observability tools come in: they give you a view inside your technology stack. In this blog, we will talk about observability tools and how they can be used in different situations, so it's easier for you to choose the right one for your organization.

How To Write Incident Postmortems

Writing a public postmortem regarding an outage is essential to maintaining transparency and accountability when things go wrong in a service or system. The purpose of writing a postmortem is to analyze and document an incident or event that has occurred, usually with a focus on identifying its root causes, understanding what went wrong, and outlining steps to prevent similar issues from happening in the future.

Evolution of Site Reliability - Incidentally Reliable with Manoj Sebastian

Catch Manoj Sebastian (ex-Flipkart, Amazon, Atlassian, Intuit, Yahoo) as he talks about the evolution of SRE over 20 years, incident response and post-incident culture at Big Tech, and the future of reliability with AI ramping up at full speed. The freshest podcast for Site Reliability Engineers, hosted by Vishwa and Shubham from Zenduty.

Understanding Blameless Postmortems

Progress in organizations often comes with unforeseen challenges and mishaps. Traditionally, these setbacks led to finger-pointing, which hindered progress and created a negative work atmosphere. However, a blameless postmortem approach transforms how organizations respond to failure. In this blog, we will delve into the importance of cultivating a blameless postmortem culture when faced with setbacks.