May 2020

Zenduty - Incident Priorities and SLAs

Incident SLAs in Zenduty let you set acknowledgement and resolution SLAs for your incidents. SLAs help your teams prioritize incidents and increase transparency among incident stakeholders such as support, account managers, and management. Incident priority is the order in which an incident or problem needs to be resolved, based on its impact and urgency. Priority also defines the response and resolution targets associated with Service Level Agreements. Each team in Zenduty can define its own priorities, such as P0/P1/P2/P3 or L0/L4/L16.
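To make the relationship between impact, urgency, priority, and SLA targets concrete, here is a minimal Python sketch. It is not Zenduty's API or configuration format: the matrix values, the SLA durations, and the names `PRIORITY_MATRIX`, `SLA_TARGETS`, `priority_for`, and `sla_deadlines` are all hypothetical, chosen only to illustrate the idea.

```python
from datetime import datetime, timedelta

# Hypothetical impact/urgency matrix; a real team would define its own levels.
PRIORITY_MATRIX = {
    ("high", "high"): "P0",
    ("high", "low"): "P1",
    ("low", "high"): "P2",
    ("low", "low"): "P3",
}

# Hypothetical SLA targets per priority: (acknowledge within, resolve within).
SLA_TARGETS = {
    "P0": (timedelta(minutes=5), timedelta(hours=1)),
    "P1": (timedelta(minutes=15), timedelta(hours=4)),
    "P2": (timedelta(hours=1), timedelta(hours=24)),
    "P3": (timedelta(hours=4), timedelta(hours=72)),
}


def priority_for(impact: str, urgency: str) -> str:
    """Look up the priority for an (impact, urgency) pair."""
    return PRIORITY_MATRIX[(impact, urgency)]


def sla_deadlines(created_at: datetime, priority: str) -> dict:
    """Compute acknowledgement and resolution deadlines for an incident."""
    ack_within, resolve_within = SLA_TARGETS[priority]
    return {
        "acknowledge_by": created_at + ack_within,
        "resolve_by": created_at + resolve_within,
    }
```

For example, under these assumed targets, a high-impact, low-urgency incident created at 12:00 maps to P1 and would need acknowledgement by 12:15 and resolution by 16:00 the same day.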

Using context to triage change-triggered incidents

One of the first things incident managers do when they get an alert page from Zenduty is check the “Context” tab of the incident. Incident context is critical for giving a first responder a view of what happened and what could have caused it, because it tells you what happened before the incident. For 40–50% of all incidents, Zenduty’s incident context can tell you within 5–10 seconds what the cause could be.

Real-time alerts from Zabbix and escalation with Zenduty

Recently, one of our customers, a 20-member NOC team at a large B2C company, set up Zabbix to monitor a network of more than 1,000 servers, routers, and switches. The NOC team wanted alerting, on-call scheduling, and an escalation matrix for whenever a critical network component went down. The team used Slack as its primary communication channel and Zoom for real-time communication. For NOC teams running an operation this large, setting up alerting can be tricky.

Accelerating your Zendesk customer support response times by 50% and meeting support SLAs

Zendesk is one of the most popular ticketing and customer service platforms on the market. Two metrics that measure the effectiveness of your customer support are response time and resolution time: how soon you can respond to a customer ticket, and how soon you can mobilize the relevant personnel, perform the necessary remediation tasks, and resolve the ticket.

Monitoring service health and downtime events within your Google Cloud with Zenduty

Google Cloud Platform (GCP) is a collection of Google’s computing resources, made available through services to the general public as a public cloud offering. GCP resources consist of physical hardware infrastructure (computers, hard disk drives, solid-state drives, and networking) housed in Google’s globally distributed data centers, where many of the components are custom designed using patterns similar to those available in the Open Compute Project.