Latest Posts

Humanizing a DevOps Transformation

Anyone who’s ever played chess knows there’s more than one way to reach a desired outcome. There are 400 possible setups after the first turn; 197,742 after the second; and just north of 120 million after the third, all of which are marching toward the same desired outcome. “So, what does any of this have to do with DevOps?” you ask. Fair question.

The "Problems" With Agile and Scrum

Agile is a popular buzzword in software development, with some organizations and teams masquerading as “agile” while actually doing something very different. I’ve seen it numerous times in my career as an Agile Coach: a leader claims to embrace agile values but micromanages engineering teams and uses agility as a way to manipulate developers into working long hours. The result?

Keeping PagerDuty Always On With Remote Incident Response

Earlier this month, many areas of the internet experienced a major incident caused by a router misconfiguration within a widely used service provider. This led to cascading service failures, causing widespread outages and disruptions for several well-known SaaS organizations. When the outage occurred, our teams at PagerDuty immediately noticed a global spike in events and incidents.

What's New: Updates to Visibility Console, Event Intelligence, Analytics, and More!

We’re excited to announce a new set of product updates and enhancements to the PagerDuty platform! PagerDuty partners with organizations to help teams create efficiencies across IT organizations and protect customer relationships. These updates will help further improve your team’s ability to manage and reduce noise, automate critical response workflows, and quickly mobilize a response in order to mitigate disruptions across your digital operations when seconds matter.

Rein in Your Incidents: Incidents and Alerts Foundations

Solving incidents is hard. Depending on your current situation, you may also be losing a lot of time figuring out which notifications constitute an incident. This results in more and more lost time, as every notification must be triaged as a potential incident before you can resolve it or disregard it as a non-incident. All this may sound very cumbersome, but the fastest way to improve is to learn and define what incidents are. And you’re in luck!

Summit EMEA: How Vodafone Is Enabling Immutable Telemetry

In June, we were delighted to host our first-ever virtual PagerDuty Summit EMEA! Llywelyn Griffith-Swain, SRE Manager, and David Jambor, Head of Systems Engineering at Vodafone, were among our speakers. They outlined Vodafone’s approach to achieving immutable telemetry. David opened the session by defining Vodafone’s strategic goals. “Our vision is to create an engineering-driven culture,” he explained. “We want to empower development teams to be self-sufficient.

PagerDuty Paying Dividends for Form3's Digital Payment Platform

Your payment systems have slowed to a crawl, customers are getting impatient and abandoning their shopping carts both online and in stores, and you’re losing money every minute this problem goes on. Behind the scenes, technical responders are scrambling to resolve the issue before it impacts more customers—and before even more money is lost.

Postmortems and More With J. Paul Reed

PagerDuty sat down with J. Paul Reed, a Senior Applied Resilience Engineer at Netflix, for an Ask Me Anything (AMA) to discuss best practices around postmortems. Reed is a prominent speaker and advocate of DevOps and operations complexity, and has over 15 years of experience in release engineering. His background in tech, along with his previous work at companies like Mozilla and VMware, gives him a unique perspective on the inner workings of innovative organizations.

Improve Customer Experiences & Collaboration Between Support and Engineering With Bidirectional Communication

We are delighted to announce our new PagerDuty integration for Salesforce Cloud. This integration empowers Customer Service, Engineering, and IT teams to proactively resolve customer issues in real time by improving communication and collaboration.