The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

AIOps: Beyond the Hype - It's Not Hollywood AI

Many AIOps initiatives experience difficulties due to unrealistic expectations and a lack of a clear AIOps strategy. What is the reality beyond the AI hype, and how do we make these initiatives a success? Join us in this CTO Perspective discussion with Jason Walker, Field CTO at BigPanda, to find out.

Measure Customer Value with Self-Service Observability

DevOps practices, and the teams that implement them, are becoming increasingly critical to the value which any company provides its customers. This was the key message throughout a recent fireside chat between DevOps Institute Chief Ambassador Helen Beal and Moogsoft VP of Product and Design Adam Frank. A great paradox of the digital era is that, once written, software is invisible to those who write it.

Automation: The Key to Modern IT

Automation is everywhere in our day-to-day IT practices. Many of the processes that have been created for managing hardware and software components were designed, or at least initiated, in a time when managing only a few instances of an application was the norm. When we look at the work required to create, deploy, and maintain applications at a modern scale, the shortcomings of these processes become apparent.

What is IT Operations Management (and should you prioritize it)?

IT operations management (ITOM) involves the administration of technology applications and components across an enterprise. To effectively manage your IT operations, you must prioritize capacity management, security, availability, and cost control across all IT infrastructure and assets. Yet doing so can put a strain on your enterprise. At AlertOps, we offer a major incident management and response platform designed to help your enterprise manage its IT operations.

Is your online gaming platform "Chaos Monkey"-proof?

Try to imagine a bunch of monkeys running around your data center, pulling cables, trashing routers and wreaking havoc on your applications and infrastructure. Player experience is ever more crucial in these days of heated competition between online gaming operators. Continuity of operations is “Uber-Alles”, and avoiding churn due to service disruption is the organizational mantra.

Zen Your Life With IT Event Noise Reduction

IT incident responders have been inundated with alerts since the start of the COVID-19 pandemic. These engineers must dig through their messages to collect and respond to real alerts for real critical events. This process wastes time and prolongs incident response. The objective is to focus on IT event noise reduction to recognize and resolve real incidents promptly.

Incident Management in Mattermost: Creating an Incident Playbook

The idea behind Incident Management is to be ready. Not ready for anything, as that can be an unrealistic expectation, but ready to respond when the unexpected inevitably happens. DevOps teams often create incident playbooks in order to ensure they are as ready as possible to handle situations as they arise. Luckily, there is some amazing documentation on how to do just that from our friends at PagerDuty.

Improve Customer Satisfaction With Customer Service Incident Commanders

The global pandemic has drastically accelerated digital transformation initiatives and forced organizations to reimagine customer service, with customer service teams taking on the incident commander role in managing and responding to customer issues and engaging with customers. In addition to prioritizing digital services, many businesses have migrated to the cloud to increase business agility, develop and deliver new features faster, and meet the growing demands of end users.