Latest Posts

Site Reliability Engineer's Guide to Black Friday

It’s gotten to the point where Black Friday reliability prep has to start on… well, Black Friday. This year, 32% of consumers in the US said they planned to start their holiday shopping between July and October. Plus, Black Friday isn’t the only day eCommerce businesses have to worry about: now we have Cyber Monday, Travel Tuesday, and the thousands of Prime Days from Amazon.

Cloud Engineer - Roles and Responsibilities

Cloud engineers have become a vital part of many organizations – orchestrating cloud services to create seamless digital experiences for clients. With responsibilities spanning across cloud security to troubleshooting incidents, cloud engineers are key to keeping modern businesses running efficiently. And as the need for cloud expertise continues to rise, so do opportunities in the field.

The Vital Signs: Why Managed IT Services for Healthcare?

Organizations across the globe are seeing rapid growth in the technologies they use every day. And while the healthcare industry has always been slow to adopt new technology, it is quickly starting to benefit from the role these technologies play in enhancing patient care and operational efficiency. However, one major setback for healthcare SMBs when investing in advanced technology is working out how they are going to keep up with the cybersecurity, performance, and management of these IT solutions.

How Effective are Your Alerting Rules?

Recently, I came across a Reddit post highlighting the challenges of having ineffective alerting rules. Here at OnPage, we have worked with various companies that have dealt with just that, so I felt I should share some of our top tips for creating effective alerting rules in this blog. Read on to discover…

Using LLMs for Automated IT Incident Management

Large language models are algorithms designed to understand, generate, and manipulate human language. State-of-the-art large language models include OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Meta’s Llama 3.1. They are built using neural networks with billions or even trillions of parameters. They are trained on vast datasets that can include text from the internet, books, code, and other information sources.

Health Unit Coordinator - Roles and Responsibilities

In bustling healthcare settings, where patients, doctors, and nurses are always on the move, maintaining order can feel like an uphill battle. The constant activity makes it challenging to stay organized and keep everyone in sync. That is why it is essential for healthcare facilities to maintain a sense of coordination that enables them to seamlessly deliver quality patient care. That’s where the Health Unit Coordinator comes in…

Protect Your Alerts: Why Incident Alert Management Shouldn't Share a Cloud

When managing IT infrastructure, one crucial aspect is ensuring that your incident alert management system remains operational during critical failures or outages. Relying on a single cloud provider for both your primary services and incident management can create a significant vulnerability. If that cloud provider experiences an outage, your alert management system could become inaccessible precisely when it’s needed most, leading to delayed responses and extended downtime.

Evaluating Opsgenie Alternatives in 2024

In today’s digital age, customer expectations are at an all-time high, with demands for instant support, flawless user experiences, and constant service availability. This environment of heightened expectations pushes organizations to innovate and streamline their operations continuously. Ensuring seamless service delivery hinges on the ability to detect and resolve issues swiftly, whether they are server crashes, software bugs, or unexpected outages.

Evaluating PagerDuty Alternatives in 2024 (Updated)

We live in times of instant gratification, where customers expect same-day delivery, round-the-clock tech support, and seamless browsing experiences. Disruptive technologies and continuous innovation have raised expectations for faster and uninterrupted delivery of services. This shift is compelling organizations to adapt their operations to meet these new demands and stay competitive.

The Impact of On-Call on Mental Health

Lately, I have been thinking about the mental health effects that stem from working in the cybersecurity industry. And in my research, I came across an Afternoon Cyber Tea podcast that sparked my interest. During their talk, host Ann Johnson and Dr. Ryan Louie, MD, PhD, dissect parallels between those who work in cybersecurity and those who work in healthcare, and uncover how these types of jobs affect mental health.