Alerting

Introducing a brand-new look for Statuspage

Here at Statuspage, we take pride in helping you communicate proactively with customers during a service outage. We also believe that during moments of stress, the tools you rely on should be simple, intuitive, and easy to use. Using simplicity as our guide, we’ve updated the design of the Statuspage management portal. We kept things organized much as before, keeping the focus on what’s most important – helping you create and update incident communications.

Demand Forecasting Using Autonomous Forecast

Demand forecasting is the process of predicting future demand for a company’s products or services. Understanding how many customers will want to purchase or use something is critical for acquiring inventory, planning capacity, scheduling resources, producing products and managing the supply chain. How many people should you schedule for a work shift? How many widgets should you produce this quarter? When should you build additional capacity into your systems?

Takeaways from PagerDuty Summit: Tracing + Timely Alerts

Last week, we spoke with a lot of folks at PagerDuty Summit, where we explored the power of error monitoring in a world that’s always on (or on-call). Throughout our conversations, one thing became clear: developers want greater efficiency. They want to minimize time to error detection and resolution. They want to know about issues, find the root cause, and fix them quickly so that they can move on to other things, like writing more code.

Zoom is now available in Opsgenie's Incident Command Center

When incidents occur, the key to a fast resolution is seamless communication. Traditionally, folks would gather in a “war room” – a room with four walls that served as a gathering place for various teams to solve high-impact problems. As incident management modernizes, teams are more dispersed, and therefore need a higher-tech way to assemble. Opsgenie developed the Incident Command Center (ICC) with exactly this in mind.

The importance of Incident Roles

Modern technology organizations are required to be adaptive in their approach to incident management. A single project will have multiple teams working as different branches on integrated systems. Even if all the members have unified communication channels, there’s bound to be chaos when an interruption occurs in the service. The frontline response team will have to be on their toes to get to the root issues at the first signs of trouble.

Summit Day Two: New Integrations and Developer Platform to Bring Real-Time Work to More People

Yesterday, we kicked off PagerDuty Summit by launching new features that support the themes of Visibility and Intelligence. If you missed the keynotes or want to know more, check out this blog post. Today, we are making several announcements around two other themes that our CEO Jennifer Tejada touched on during her keynote yesterday: Platform and People. In fact, these themes are so closely related that we refer to them as one—that PagerDuty is a platform for people to do real-time work.

CIO Dive Playbook: AIOps Brings Calm to Overwhelmed IT Ops Teams

Much has been said about how Artificial Intelligence (AI) is already proving its ability to transform business, as well as the way most people live. In fact, according to Accenture’s “ExplAIned: A Guide for Executives,” AI is on par with such life-changing innovations as electricity and the internal combustion engine, and is no longer science fiction.

Slack Loses $8M to Outages

On July 22, 2019, Slack was in the middle of deploying an update to its desktop app. The update was supposed to decrease memory consumption and improve load time, but instead the company suffered a widespread, global outage. After approximately 40 minutes of downtime, the service was back up. But in the meantime, the company whose motto is ‘where work happens’ essentially stopped working.