
Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Webinar: 2023 ITOps budgeting to win: use new research-based outage cost data

It’s no secret that digital transformation essentially broke IT operations. With the rise in technology came a rise in outages capable of bringing organizations to a screeching halt. Those outages are expensive, and for years the same number was cited as the authority on how much an outage costs (around $5,600 per minute). This number took off and was used in presentations, sales decks and other resources. But how could this number have stayed the same year over year?

Maximize efficiency with Terraformer: Manage Squadcast resources via IaC

Ever since Terraform was first launched by HashiCorp, infrastructure teams have been quick to leverage its functionality, because deploying infrastructure via code became so much easier and less error-prone. This surely became a great way to deploy new infrastructure with custom configurations, but what about managing cloud infrastructure that is already defined? Can Terraform be used to make changes to it? Or can it be used to deploy the same configurations to new environments?

Automation Seasons Freezings Wrap Up and New Year's Resolutions

It’s that time of year when you may feel pressured to pick your New Year’s resolutions. Well, we went ahead and tried to give you a head start. 2023 is the year we tame toil so we can focus on the fun stuff like engineering and innovation. Hopefully you have had the chance to follow along with us during December for Seasons Freezings, the time of year when you are locked out of production and so have time to explore new ideas like automation 🙂.

Alarm optimization - what SIGNL4 has to offer

Having all relevant information pertaining to a critical incident is vital for quickly identifying the issue and prioritizing it. SIGNL4 optimizes the perception, response and handling of incidents through customizable alerts with enriched parameters, images, sound files, links to tickets or PDFs, as well as maps with geo-location information.

Best Practices for API Versioning

As your experience and knowledge of a system grow, change becomes inevitable. Your application requirements change, your bug fixes require code changes, and your APIs evolve. A key challenge in the software ecosystem is managing changes—especially when they concern APIs. Because you’re likely using APIs in multiple applications, you must document all updates and changes made to your APIs. This is where API versioning becomes crucial.

Why AIOps is the Connector Between Monitoring, Observability and Incident Management

Over the years, as companies have moved from monolith to cloud-native architectures, maintaining high availability has become more challenging. After all, today’s IT ecosystems are complex, distributed and ephemeral, making it increasingly difficult (and, in many cases, downright impossible) for DevOps practitioners and SREs to identify and fix issues manually.

Incident management vs. event management

As you explore IT event management and IT incident management, they may look and even sound similar, but it’s essential to understand how they differ. Your IT management team needs to know what to look for, both in an event and an incident, so they can resolve any red-flag issues and return your system to normalcy. But why is it so important to recognize the difference?

Goodbye, 2022. Hello, 2023 - reflecting on a year of change, progress and incidents

Let’s get one thing out of the way: we’re going into 2023 on a high note. We’ve closed deals with some of the most respected companies in both the UK and US, we’ve hired in the double digits, expanded into New York, and revenue is growing steadily. But we aren’t hanging up our football boots just yet. Yes, we can take some time to celebrate our wins, but we’re all hands on deck for 2023 planning.