Zenduty

Bangalore, India
2019
  |  By Ashwin Hariharan
In 1750 BCE, in the bustling heart of ancient Mesopotamia, a copper merchant named Ea-nāṣir thought he had closed another routine sale of copper ingots. Little did he know, his customer wasn't exactly thrilled. In fact, the customer was so displeased that he decided to write Ea-nāṣir a strongly worded letter. Yes, you heard that right! A literal clay tablet of dissatisfaction, complaining about the shoddy grade of copper and some other delivery mishap.
  |  By Anjali Udasi
When it comes to managing incidents and ensuring operational efficiency, understanding key metrics is crucial. Among the most important are MTBF (Mean Time Between Failures), MTTR (Mean Time To Repair), MTTF (Mean Time To Failure), and MTTA (Mean Time To Acknowledge). In this blog, we'll explore these metrics along with some best practices and practical applications.
  |  By Anjali Udasi
System reliability is crucial for providing seamless user experiences and enabling effective business operations. The "4 Golden Signals" (latency, traffic, errors, and saturation) offer a comprehensive view of system performance and potential issues. In this blog, we take a deep dive into system reliability and explore these four key metrics for monitoring system health and ensuring optimal performance.
  |  By Anjali Udasi
We had the pleasure of meeting Ponmani Palanisamy, a Staff Site Reliability Engineer at LinkedIn, at a recent SRE Meetup in Bangalore. Ponmani gave an insightful talk on "Improving data redundancy and rebalancing data in HDFS." We were captivated by his talk and eager to learn more about his experience in the reliability space. We talked about everything including his journey, experiences, and of course, his most memorable war room stories over a steady career of 17 years. Here's what he had to share.
  |  By Anjali Udasi
Organizations prioritize Key Performance Indicators (KPIs) and Service Level Agreements (SLAs) to achieve optimal performance. However, understanding the differences between KPIs and SLAs can be challenging. In this blog, we discuss KPIs, SLAs, and the key differences between them.
  |  By Anjali Udasi
At a recent SRE Meetup in Bangalore, we had the pleasure of meeting Akshay Deshpande. During our conversation, Akshay, who manages a Performance/Observability Engineering team at Smarsh, discussed his passion for observability and his constant drive to improve the field. Smarsh helps companies gain valuable insights from their communication data, enabling them to proactively identify potential regulatory and reputational risks before they escalate.
  |  By Anjali Udasi
Traditional database design prioritizes data integrity through normalization. However, for read-heavy workloads, normalized data structures can lead to complex queries and slower performance. Denormalization offers an alternative approach to optimize query execution and improve efficiency. A study concluded that denormalization can improve query performance when implemented with a thorough understanding of application requirements.
  |  By Shubham Srivastava
I was out there in sunny Austin this February, speaking at Civo Navigate 2024. The event was jam-packed with amazing talks, and it was great meeting so many people with long and fascinating careers in engineering and Site Reliability. I had the privilege of meeting Bob Lee, who currently leads DevOps at Twingate, a cloud-based service that provides secure remote access and is poised to replace VPNs.
  |  By Anjali Udasi
The digital world comes with advantages and inherent risks. IT incidents, which can encompass cyberattacks, system outages, and data breaches, can have a devastating impact. Beyond financial losses, they disrupt business operations, damage reputations, and erode customer trust. During an outage, having a well-prepared Incident Response Team (IRT) is essential to reduce downtime and improve response times.
  |  By Zenduty
Catch Ramiro Berrelleza, Founder and CEO at Okteto, talk about how impactful DevTool startups are built, the importance of investing in Developer Experience, and the emerging issues with the Cloud Native ecosystem.
  |  By Zenduty
Catch Krishnendu Majumdar (CPTO at Yubi) talk about his journey in the dynamic Indian startup ecosystem, strategies to build for scale from Day 1 and insights into building sustained user trust via exceptional product performance in high governance industries like credit and finance.
  |  By Zenduty
Catch Niall Murphy (Co-Founder of Stanza Systems) talk about graceful degradation, what startups are getting wrong about reliability, and how well-thought-out user experiences can communicate credibility to current and potential customers. Exclusively on The Incidentally Reliable podcast, made by SREs for SREs, hosted by Zenduty.
  |  By Zenduty
What are some startups Solomon Hykes is rooting for? What's his most controversial opinion? Who are some community members that more people should follow? Discover the answers to these questions, and a lot more in the Incidentally Reliable Podcast with Solomon Hykes, live on all major platforms! Tune in as Solomon shares stories from the early days of Docker, Inc, the rollercoaster journey leading to 20 million active developers worldwide, the heavy crown of a tech leader and his vision to revolutionize CI/CD with Dagger today.
  |  By Zenduty
Catch Solomon Hykes (Co-founder of @Docker and @Dagger) share stories from the early days of Docker, the rollercoaster journey leading to 20 million active developers worldwide, the heavy crown of a tech leader, and his vision to revolutionize CI/CD with Dagger today. Exclusively on The Incidentally Reliable podcast, made by SREs for SREs, hosted by Zenduty.
  |  By Zenduty
We're about to drop a major revamp to one of your most used Zenduty features. Get ready to experience scheduling like never before! Join our YouTube Premiere Live to see the new on-call schedules that'll make your on-call life smoother and better! P.S. Zenduty is a revolutionary incident management platform that gives you greater control and automation over the incident management lifecycle.
  |  By Zenduty
Dive into an in-depth conversation on how software has now become the backbone of things and get access to extraordinary reliability nuggets with Piyush.
  |  By Zenduty
Catch Piyush Verma, Co-Founder and CTO at Last9, in conversation with Ankur Rawal, Co-Founder and CTO at Zenduty, discussing what reliability means to the modern consumer, why SREs make excellent decision-makers, and the current state of observability. Exclusively on The Incidentally Reliable podcast, made by SREs for SREs, hosted by Zenduty.
  |  By Zenduty
Settle in and listen to Suresh Kumar Khemka (Head of Platform & Infra at apna) talk about platform engineering, balancing bureaucracy and velocity at startups and tech giants, and the rippling impact of e-commerce downtime. Exclusively on The Incidentally Reliable podcast, made by SREs for SREs, hosted by Zenduty.
  |  By Zenduty
Grab some popcorn and catch Viraj talk about his experiences and BookMyShow's journey from its inception in the early 2000s to the entertainment behemoth it is today, their stints innovating at the forefront of the mobile and e-commerce revolutions, and their harmony with reliability engineering in the colourful, ever-changing yet challenging world of movies and online ticketing. Exclusively on The Incidentally Reliable podcast — made by SREs for SREs, hosted by Zenduty.

Zenduty is a collaborative incident management system for always-on services that helps teams orchestrate incident response to create better user experiences and brand value. Zenduty centralizes all incoming alerts through predefined notification rules to ensure that the right people are notified at the right time.

Zenduty supports more than 100 integrations, through which IT teams receive contextual notifications from the services of their choice to speed up the resolution of potentially damaging downtime:

  • Assign predefined incident roles along with highly customizable task templates to empower teams to rapidly resolve crises with minimal noise and confusion.
  • Define your internal alerting rules with customizable escalation policies that follow your company's on-call schedules and notify the right responders.
  • Leverage rich contextual data to perform rapid RCAs.
  • Use customizable post-mortem insights to streamline processes and institutionalize a culture of continuous improvement and world-class reliability.

Modern on-call and incident response platform for SRE, DevOps, ITOps and Support teams.