
The latest news and information on AIOps, alerting in complex systems, and related technologies.

Resolve Systems Named a Strong Performer in Infrastructure Automation

We are excited to share that Resolve has been named a Strong Performer in “The Forrester Wave™: Infrastructure Automation Platforms, Q3 2020.” This report evaluates how the 13 most significant infrastructure automation platform providers measure up and helps infrastructure and operations professionals select the right solution for their needs.

Performing Zabbix Alert Correlation and Incident Acceleration with CloudFabrix AIOps

The CloudFabrix AIOps 360 solution can ingest alerts, events and metrics from various monitoring tools to perform event correlation, reduce alert noise and accelerate incident resolution. In this blog, I will cover how Zabbix integrates with our AIOps 360 solution. Zabbix is a popular open source monitoring platform used by many enterprises and MSPs, including some of our customers.

AIOps Best Practices | First Data/Fiserv: Going Ticketless with AIOps and Moogsoft

At First Data/Fiserv, AIOps dramatically improved incident management and resolution, a transformation that allowed this financial services provider to go almost ticketless. The speakers describe the entire process, which started when the CIO called for a global, next-gen monitoring platform. First Data/Fiserv soon realized that Moogsoft’s collaboration and record-keeping capabilities allowed it to slash tickets by 95%. They also describe how the system was fine-tuned to handle both regular and critical incidents transparently.

Telemetry Everywhere: Observability in the DevOps Cosmos

Rockets constantly blast off into space headed towards planets, aiming to create shiny new stars, while meteors whizz by them, threatening their journeys. That’s how global DevOps expert Helen Beal describes the complicated and risky universe of DevOps practitioners and SRE teams. The rockets are these teams’ frequent code releases. Planets represent customers that benefit from the value — stars — created by these launches.

Automated Root Cause Analysis & Anomaly Detection in Concert

Every day, IT operators try to prevent outages of business-critical applications. When prevention is not possible, they strive to reduce the mean time to repair (MTTR) as much as possible. Improving resolution time can be quite a challenge, but IT operators don't stand alone in it. They can use smart solutions that support Automated Root Cause Analysis and Anomaly Detection.

Why Observability Matters to Site Reliability Engineers

This is the first in a three-post series themed around Ops-led DevOps, where I’ll explore the relationship between observability and a set of software delivery lifecycle practices that support the adoption of DevOps and the transition from project-centric to product-centric ways of working. I’ll start with Site Reliability Engineering, move on to Value Stream Management and finish with Continuous Delivery.

The New Normal: Shifting IT Strategies in the Wake of the Pandemic

There’s next to nothing in the world that hasn’t been impacted by COVID-19. We’ve now reached the stage of the pandemic where we’re evaluating the effect on every part of our lives. Over the last few weeks, I’ve spent a lot of time speaking with IT leaders and reflecting on how the business technology landscape has been shifting.