Integrating dynamic SaaS hosted Uptime Monitoring into your customer-served Applications

Imagine you are rolling out your application to multiple customers, some of whom might even run it on premises. Of course you want to know whether your application is running fine and the customer is not experiencing any trouble or downtime. You would not want to build this validation into your own system, though, as that system might itself fail at some point. That is why you decide to go for a third-party uptime monitoring solution, e.g. Uptime Monitoring.

Tips for Modern NOCs - Correlating Incidents to the IT Changes that Caused Them

Every NOC engineer will tell you that the first thing they look for in an outage is “what changed?”. And they are right to look. While every organization is unique, Gartner reports that on average about 80% of IT incidents today are caused by changes in infrastructure and/or software.

OnPage Overrides Silent Switch on iOS and Do Not Disturb Mode

Since its inception, OnPage has been dedicated to providing a powerful critical alerting solution. This mission continues in 2020, as OnPage is pleased to introduce its ability to override the silent switch and Do Not Disturb (DND) mode on iOS. The latest advancements ensure that tasked recipients always receive high-priority, audible OnPage alerts, regardless of their current iPhone settings.

On-call On-boarding Checklist

And it starts with the company culture. Irrespective of how small or large your team is, it’s wise to invest some time in creating a good on-call onboarding plan. A humane on-call is the mark of a good engineering culture. Being on-call means that you’re expected to be reachable for any issues that may occur during your shift. It’s easy to lose any and all motivation by just anxiously anticipating that mid-dinner ping.

Importance of Dependency Mapping & Asset Intelligence

Enterprise applications typically sprawl and develop interdependencies, producing complicated solutions. Ultimately, this complexity makes change management error-prone and service issues difficult to troubleshoot, and it starts to impact the business in multiple ways. To provide the right context when taking up transformation initiatives or addressing service issues, one should be equipped with dependency and impact insights. In this video, Rich Lane, a Sr.

Announcing Our Series A

It’s Friday at about quitting time, and my plans for the evening involved a great cocktail, hanging out with friends, and maybe continuing to binge The Office. Sadly, there was a problem. Our alerting system detected an enormous and immediate spike in errors. The error description was along the lines of “table ‘servers’ does not exist” and thousands of customers couldn’t use a large cloud provider’s services.

Driving Real-Time ChatOps With PagerDuty and Microsoft Teams

With over 75 million daily active users, it’s safe to say Microsoft Teams is essential to many global businesses. On top of that, Microsoft CEO Satya Nadella recently shared that Microsoft saw 200 million meeting participants in a single day this month. While Microsoft Teams’ explosive growth can be tied to recent spikes in remote work, many enterprises have relied on Teams to connect people across the globe for quite some time.

Building an Effective Alert Strategy

Alerts are an essential part of performance monitoring. Alerts and notifications need to be sent out as soon as an issue is identified, allowing you to know about any problems before your customers do. In this week’s Tip Tuesday, we look at building an effective alert strategy and how to utilize Catchpoint Alerts so that you can quickly and effectively leverage the information provided to take carefully targeted action and improve your MTTR. Building an effective alert strategy is important.