Alerting

Adtech Leader Natural Intelligence Now Resolving Glitches in Minutes Rather than Days

Natural Intelligence runs comparison websites that generate millions in ad traffic. A glitch could easily cost the company thousands in ad revenue. VP R&D Lior Schachter shares the difference Anodot’s real-time analytics, with machine learning anomaly detection, has made across the company.

Making the Most of PagerDuty + Datadog

For your team to effectively respond to incidents, you need a shared, unambiguous incident definition so you can recognize when an incident has occurred and assign the appropriate severity. Definitions of an incident differ across teams, but whatever definition you use, identifying and monitoring key service level indicators (SLIs) can help you understand when your service is operating normally—and when its performance has degraded to the point where you need to trigger an incident.

A single-person on-call "rotation" is a critical vulnerability

One of the most common complaints we hear from operations and site reliability engineers concerns the quality-of-life impact and stress imposed by their on-call responsibilities. Most of us already know that a proper on-call rotation is critical to an engineering organization’s health, both for immediate incident response and for long-term sustainable growth.

OnPage Mentioned in Two 2019 Gartner Hype Cycle Reports

Gartner’s Hype Cycles for Business Continuity and IT Performance Analysis are trusted reports that identify solutions for strengthening an organization’s business continuity. The OnPage team is pleased to announce that we’ve been included in two of Gartner’s Hype Cycle reports, which list OnPage’s incident alert management solution as a trusted tool for today’s support teams.