Monthly Archive

What the Big Brother Approach to IT Monitoring and Incident Management May Be Missing

Jan 28, 2021 By Olaf Schouws In StackState

We asked in a recent poll which popular TV show your IT team resembles the most. Big Brother came out on top, with almost 40% of respondents saying that their incident resolution process most resembled this show. Would you compare your incident management process to an episode of Big Brother? If so, it's likely that your IT environment is highly monitored, but incidents still seem to slip through the cracks.

SLA vs SLI vs SLO: Know the differences between them.

Jan 28, 2021 By Neil Haran In OneUptime

SLA stands for Service Level Agreement. It’s a formal agreement between you and your customer that describes the reliability of your product or service. For example, the agreement might state that your product will be online 99 percent of the time annually and that, if you fail to achieve that objective, you will refund 30% of the customer’s annual license fee. SLAs typically include such penalties in the contract.
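To make the arithmetic in that example concrete, here is a minimal Python sketch of how much downtime a 99 percent annual uptime target allows and how the 30% refund would be computed. The function names, the 365-day year, and the flat all-or-nothing refund rule are assumptions for illustration only, not terms from any real contract.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year of 8,760 hours


def allowed_downtime_hours(uptime_target_percent: float) -> float:
    """Hours of downtime per year that still satisfy the uptime target."""
    return HOURS_PER_YEAR * (1 - uptime_target_percent / 100)


def sla_refund(annual_fee: float,
               measured_uptime_percent: float,
               uptime_target_percent: float = 99.0,
               refund_fraction: float = 0.30) -> float:
    """Refund owed under a hypothetical flat-penalty SLA.

    If measured uptime falls below the target, the customer gets
    refund_fraction of the annual fee back; otherwise nothing is owed.
    """
    if measured_uptime_percent < uptime_target_percent:
        return annual_fee * refund_fraction
    return 0.0


if __name__ == "__main__":
    print(f"Allowed downtime at 99%: {allowed_downtime_hours(99.0):.1f} hours/year")
    print(f"Refund at 98.5% uptime on a 10,000 fee: {sla_refund(10_000, 98.5):.2f}")
```

Under these assumptions, a 99 percent target leaves roughly 87.6 hours of downtime per year before the penalty applies.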

SLA vs SLO vs SLI

Jan 27, 2021 By OneUptime In OneUptime

Service Level Agreement (SLA): formal agreement between you and your customer. Service Level Objective (SLO): reliability goal of a resource in your organization (e.g., API uptime should be 99.0% annually). Service Level Indicator (SLI): current reliability (i.e., what your monitoring tool tells you).
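The relationship between an SLI and an SLO can be sketched in a few lines of Python: the SLI is computed from monitoring data, then compared against the SLO. The function names and the idea of deriving the SLI from a count of successful health checks are assumptions made for this illustration; a real monitoring tool may measure reliability differently.

```python
def sli_percent(successful_checks: int, total_checks: int) -> float:
    """Current reliability (SLI) as the percentage of successful checks."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return 100 * successful_checks / total_checks


def meets_slo(sli: float, slo: float = 99.0) -> bool:
    """True when the measured SLI is at or above the SLO target."""
    return sli >= slo


def error_budget_remaining(sli: float, slo: float = 99.0) -> float:
    """Fraction of the allowed unreliability not yet used up.

    1.0 means no failures so far, 0.0 means the budget is exactly spent,
    and a negative value means the SLO has been missed.
    """
    budget = 100 - slo
    if budget <= 0:
        raise ValueError("an SLO of 100% leaves no error budget")
    return (budget - (100 - sli)) / budget


if __name__ == "__main__":
    sli = sli_percent(successful_checks=995, total_checks=1000)
    print(f"SLI: {sli:.1f}%  meets 99.0% SLO: {meets_slo(sli)}")
    print(f"Error budget remaining: {error_budget_remaining(sli):.0%}")
```

The SLA sits on top of this: it is the contractual commitment, usually set looser than the internal SLO so that a missed SLO serves as an early warning before any penalty applies.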

View Video

PagerDuty + AWS Outposts Integration Workflow Demo

Jan 27, 2021 By PagerDuty In PagerDuty

PagerDuty for AWS Outposts empowers teams to manage incidents in real-time for AWS infrastructure used in a private data center, co-location space, or on-premises facility.

View Video

PagerDuty + Amazon EventBridge Quick Start Integration Workflow Demo

Jan 27, 2021 By PagerDuty In PagerDuty

View Video

Does your MSP Have a Backup and Disaster Recovery Plan?

Jan 27, 2021 By AlertOps In AlertOps

Data loss can cause big problems for managed service providers (MSPs) and their customers. With an MSP backup and disaster recovery (BDR) solution in place, MSPs can guard against data loss following a cyber attack, hardware failure, or any other IT incident.

The U.S. COVID Vaccine Distribution Plan: Challenges and Solutions

Jan 27, 2021 By Ritika Bramhe In OnPage

As coronavirus (COVID-19) continues to spread and new virus strains emerge, the public is frantically looking for answers regarding the U.S. government’s vaccine distribution plan. A sound vaccine distribution plan is especially crucial in times like these. All U.S. states, stretching from both coasts, are experiencing a vast number of COVID-related deaths and hospitalizations. The dire situation underscores the importance of having an effective, accelerated vaccine delivery process.

New Feature: Incident types

Jan 27, 2021 By Robert Ross In FireHydrant

Incidents are inevitable, and the reality is that some of them are going to repeat themselves. FireHydrant has always strived to make the entire incident response lifecycle smooth, but up until today, declaring common incident types was slightly burdensome for our customers. We decided it was time to make declaring incidents easier with easy-to-use templates, which we’re calling Incident types.

PagerDuty AWS Control Tower Integration Workflow Demo

Jan 26, 2021 By PagerDuty In PagerDuty

PagerDuty for AWS Control Tower helps teams drive real-time operations in multi-account AWS environments, add policy guardrails to disparate teams, and auto-remediate compliance issues.

View Video

OnPage Corporation Continues To Grow Despite the 2020 Pandemic

Jan 26, 2021 By OnPage Corporation In OnPage

WALTHAM, Mass., Jan. 25, 2021 - OnPage Corporation, a Boston-based incident management and pager replacement company, today unveiled its fiscal 2020 year in review. OnPage delivered another year of strong results despite the uncertainty that COVID-19 brought upon the world. Last year’s results were driven by current customers that rely on OnPage for critical notifications and had to expand their deployments.

How to build your own incident management process

Jan 25, 2021 By Eyal Katz In Exigence

IT incident management is a fundamental operational process designed to ensure rapid service restoration. This process is typically assigned to the help desk but is also very much entrenched in the day-to-day of DevOps. When incident management goes right, service is restored quickly and the impact on productivity, continuity, and customer satisfaction is minimal.

7 Tips On Building And Maintaining An SRE Team In Your Company

Jan 22, 2021 By Squadcast Community In Squadcast

In today's "always on" world, Reliability is a primary business KPI. Plant the culture of Reliability by implementing these 7 simple tips to build a solid SRE team in your organization. Many of today’s hottest jobs didn’t exist at the turn of the millennium. Social media managers, data scientists, and growth hackers were never heard of before. Another relatively new job role in demand is that of a Site Reliability Engineer or SRE. The profession is quite new.

The Key Differences between SLI, SLO, and SLA in SRE

Jan 20, 2021 By Biju Chacko In Squadcast

To incentivize reliability in your platform, there should be shared goals across your team to measure & quantify the capabilities of your product/service along with customer experience. Define the path of "Always-On" services by understanding a few key SRE fundamentals and their implications: SLIs, SLOs, and SLAs. Framing SRE metrics for building or scaling a product is quite a daunting task.

Why AlertOps is the best PagerDuty alternative

Jan 20, 2021 By AlertOps In AlertOps

We will compare AlertOps to PagerDuty in 3 broad areas. On-call management: Whether your on-call management needs are basic or complex, AlertOps has a solution for you. Creating on-call schedules is simple whether there is one person on-call, two or more people on-call, or even multiple teams on-call. Escalations: Automatic escalations based on your on-call schedules. Expand the possibilities with Workflows and Escalation Rules.

4 Essential Types of MSP Tools (in 2021)

Jan 20, 2021 By AlertOps In AlertOps

Managed service providers (MSPs) need the right tools to get the job done quickly and securely. MSP tools dictate control over everything from virtual machine (VM) management and database administration to application and server monitoring. They can also help MSPs oversee IT infrastructure. MSP tools are valuable, but not all tools are created equal.

2021 is the Year of Reliability

Jan 20, 2021 By Robert Ross In FireHydrant

There’s no better time than now to dedicate effort to reliable software. If it wasn’t apparent before, this past year has made it more evident than ever: People expect their software tools to work every time, all the time. This shift in the way end-users think about software was inevitable as everyday applications entered our lives, much as water and electricity once entered our homes.

Working with Teams in SIGNL4

Jan 19, 2021 By SIGNL4 In SIGNL4

Creating and using the new Teams features inside SIGNL4

View Video

OnPage Recognized in Gartner's Market Guide for Emergency Mass Notification Solutions

Jan 15, 2021 By Ritika Bramhe In OnPage

Gartner’s Market Guide for Emergency Mass Notification Solutions (EMNS) is a trusted report for security and risk management leaders. It provides insight into effective crisis communication procedures and identifies solutions that help perfect emergency management plans. The EMNS Market Guide has a large, loyal readership in several industries, including state and local government, healthcare, IT support, and higher education.

Best Practices for Incident Management: A Checklist

Jan 14, 2021 By Stephen Burke In Martello Technologies

If productivity is the engine that helps optimize how a business operates, then being proactive is the oil, and knowing how to effectively maintain productivity is regularly checking and replacing said oil. Whenever a service outage occurs, it throws a wrench into the whole process and can put an entire organization in flux, mainly because the outage.

The True Cost of Building your Own Incident Management System (IMS)

Jan 13, 2021 By Biju Chacko In Squadcast

Is your organization on the lookout for an incident management tool? If yes, you may wonder: am I better off building my own? Our latest blog outlines some of the key factors to consider when choosing whether to build or buy incident management software.

Incident Communications With Alina Anderson

Jan 13, 2021 By Mandi Walls In PagerDuty

Incidents happen. They’re disruptive, they can be stressful, and if they aren’t managed well, they can cause chaos on your team. How your team manages incidents is only half the battle. How you let other stakeholders know what is going on is the other half. Alina Anderson from Smartsheet joined the Community team in our booth this year at PagerDuty Summit to talk about Incident Communications, and we’ve shared that conversation as an episode of our Page It to the Limit podcast.

What's in store for IT Ops in 2021? Top execs from leading enterprises share their predictions

Jan 13, 2021 By Yoram Pollack In BigPanda

2020 is (finally) over, and it’s safe to say that this very challenging year taught us once again that (as the old Danish proverb says) it’s difficult making predictions, especially about the future. Who would have imagined in January 2020 that we would find ourselves where we are today… And yet, as Tim Harford once wrote in the Financial Times, predictions are like Pringles: nobody thinks that there’s any great virtue in them but we find them hard to resist.

A look back at 2020

Jan 13, 2021 By The FireHydrant Team In FireHydrant

2020 was, needless to say, not the best. Looking on the brighter side, in December, FireHydrant turned 2, and in spite of it all, we grew quite a bit. We raised our $8M Series A in May, our team grew nearly 4x in size, and we added some amazing features, such as making FireHydrant Runbooks even more powerful with conditions, along with great integrations, which you can find here. But even better, we got to work with all of you!

The Top 10 Incident Management Solutions for 2021

Jan 13, 2021 By Noam Morginstin In Exigence

Moving full speed ahead into 2021, a year that is slated to be marked by unpredictability, fast-paced change, and (still) a lot of disruption – no organization can afford to allow such disorder to impact productivity, operations, and the business overall.

The Top Incident Management Software, Tools & Systems For 2021

Jan 12, 2021 By Eleanor Bennett In Logit.io

Incident management tools allow technology and security teams to resolve major incidents faster, including urgent issues that may cause application and site downtime affecting their users.

2-way integration with ConnectWise Manage

Jan 12, 2021 By René In SIGNL4

SIGNL4 now includes a new app for connecting to ConnectWise Manage (“Manage”). This makes 2-way integration with Manage a breeze, and responding to service desk tickets can now be done conveniently via mobile app in SIGNL4. This blog article has all the important details.

Ruby on Rails Cheat Sheet

Jan 12, 2021 By Austin Miller In PagerTree

We’ve been doing some Ruby on Rails development lately, in preparation for PagerTree 4, and we wanted to put together a Ruby on Rails Cheat Sheet. This is a quick reference guide to common Ruby on Rails commands and usage.

Building and Scaling Your SRE Team

Jan 12, 2021 By Julie Gunderson In PagerDuty

Building Site Reliability Engineering (SRE) teams is hard! There are so many articles and explanations of what SRE means, it’s easy to get lost. Going beyond understanding what the individual SRE role is into building and scaling a team of SREs is more of a challenge. It’s important to find the right information that will help you take your SRE team to the next level.

5 Steps to Building a Robust Incident Response Plan for your MSP

Jan 12, 2021 By AlertOps In AlertOps

Today’s organizations face ransomware, malware, and other cyber attacks, and managed service providers (MSPs) need an incident response plan (IRP) to mitigate these threats. In a recent survey of 200 MSPs, 74% of respondents said they have suffered a cyber attack, and 83% noted their small and medium-sized business (SMB) customers experienced one as well. Yet, with an IRP, MSPs can protect themselves and their customers against cyber attacks.

Seamless CMDB Provisioning Gives Responders the Data They Need to Respond Faster

Jan 11, 2021 By Divya Balasubramanian In PagerDuty

We knew that the most loved feature in our ServiceNow 7.0 release would be the CMDB features. And in our ServiceNow 7.5 release (available now), we’ve expanded our CMDB capabilities even further—based on your feedback—around the importance of reducing the effort it takes to re-create the same services within PagerDuty.

2020 Year in Review: OnPage Continues to Grow Despite the Pandemic

Jan 8, 2021 By Christopher Gonzalez In OnPage

2020 was an unpredictable year that presented several challenges, such as the outbreak of the coronavirus (COVID-19) pandemic. As part of the “new normal,” the world has adopted infection prevention procedures. The 2020 calendar year was defined by face coverings, constant sanitization and physical distancing. At its core, the year was an exhausting, surreal 12-month period for many.

Better incident management while working remotely: The Squadcast way

Jan 7, 2021 By Nir Sharma In Squadcast

As the pandemic wears on, remote incident management has become the norm worldwide for businesses. Here we share some best practices that helped us to address remote incidents and make on-call less stressful. With the onset of remote work due to Covid-19, remote incident management has become the norm for businesses worldwide. Organisations that were earlier used to having war rooms now find themselves having to coordinate teams through Slack, MS Teams or other collaboration tools.

Four key metrics for responding to IT incidents and failures

Jan 7, 2021 By Jennifer Briston In netdata

If you’re a veteran in this space, you probably understand the many incident response metrics and concepts, along with the many (at times exasperating) acronyms. For those new to the space, or even those with years of experience, the terminology is often overwhelming. If you’re one of those people who’s struggling to navigate through the world of DevOps metrics, we’ve created this article for you.

G2 Recognizes Squadcast as Momentum Leader in Incident Management

Jan 6, 2021 By Nir Sharma In Squadcast

We are thrilled to begin the year on a high note! Squadcast has been recognized in the Incident Management and IT Alerting category in G2's Winter Report 2021 for the categories below. “We are honoured to be recognised as a Momentum Leader in the IT Incident management category by G2. We have always strived to create the fastest and easiest Incident Response experience for Engineering and DevOps teams that enables organisations to better monitor their IT infrastructure and applications.

Leverage MSP Automation to Drive Profitability (in 2021)

Jan 6, 2021 By AlertOps In AlertOps

Managed service providers (MSPs) require automation so they can deliver fast, efficient IT services that meet customer expectations. But MSP automation can be difficult, and the longer it takes an MSP to automate IT service management (ITSM), the further it falls behind its competitors. Today’s MSPs face several challenges relative to automation, including: 1. Complex Scripting Language: IT technicians may need to learn a complex scripting language to leverage an ITSM platform.

Incident Ready: How to Chaos Engineer Your Incident Response Process - FireHydrant

Jan 5, 2021 By FireHydrant In FireHydrant

We’re pretty sure using a real incident to test a new response process is not the best idea. So, how do you test your process ahead of time? In this video, FireHydrant CEO, Robert Ross, will share how FireHydrant customers leverage best practices to break, mitigate, resolve, and fireproof incident processes. We’ll show you how to use chaos engineering philosophies to stress test 3 critical parts of a great process.

View Video

Boost IT Savings with CloudReady and Incident Workflow

Jan 4, 2021 By Sidharth Kumar In Exoprise

Companies love data. Aggregating data from multiple sources makes decision-making easier and brings new depth to the conversation in business meetings. But all of this is at the management level. IT managers and administrators also search for data from multiple sources to ensure that the ecosystem works. Companies demand the continued maintenance and availability of mission-critical applications. Without a framework or incident workflow, revenue can suffer and customers can churn if the company does not proactively address problems that arise in its infrastructure.

Signl4 Teams

Jan 4, 2021 By SIGNL4 In SIGNL4

Creating and using the new Teams features inside SIGNL4

View Video

Segment and SIGNL4: Know your Customer's Actions, Anywhere and Anytime

Jan 4, 2021 By Ronald In SIGNL4

Do you have a website, app, online shop, or SaaS offering? Then you have plenty of user actions. These can include visiting a certain page, signing up for a service, or canceling a subscription. Wouldn’t it be great to know in real time when an important customer action takes place? This would allow your sales, customer service, or technical teams to act immediately, no matter where they are.
