Operations | Monitoring | ITSM | DevOps | Cloud

AWS Outage: How do you prepare for the failure of your own safety net?

When AWS’s massive outage struck, it didn’t just take down cloud services, apps, and enterprise platforms. It also knocked out many of the monitoring systems organizations depend on for real-time answers. Observability companies, including Datadog, New Relic, Checkly, Dynatrace, SpeedCurve, and Splunk Observability, lost visibility or functionality precisely when organizations needed them most.

PagerDuty Joins AWS QuickSuite: Connect Your Incident Management with 1,000+ Applications

Today, we’re announcing that PagerDuty is now available in AWS QuickSuite through the Model Context Protocol (MCP). This means PagerDuty’s incident management capabilities can now connect with the 1,000+ applications and data sources that QuickSuite integrates with, from AWS services to enterprise SaaS platforms, all accessible through natural language.

How to test the reliability of a Point of Sale (POS) system

Point of Sale (POS) systems are the backbone of any retail store. A single outage can cost retail companies thousands of dollars each minute in lost sales, and even more if the outage happens during peak hours. If the outage goes on too long, it can cause even more costly damage as customers abandon carts and turn to competitors. In an industry where customer loyalty is worth its weight in gold, that brand damage can end up even more costly than the initial lost sales.

Unreal Engine crash reporting now available on gaming consoles with trace-connected logs

With the first major release of the Sentry Unreal SDK (now on v1.2.0, and you can also explore in our interactive sandbox), we’ve made some important improvements to support cross-platform Unreal developers when it comes to platform coverage, debugging with user feedback, and performance monitoring improvements. Here’s what’s new.

Exploring why PostgreSQL made generated columns virtual

Generated columns have existed in PostgreSQL since version 12. The idea was to allow people to make calculations based on other columns in the table and store them. It makes your life easier by ensuring data accuracy and preventing unauthorized manual modifications. But why, after 6 versions, is PostgreSQL making generated columns virtual and also making this new way the default? This is an example of the types of questions we had when exploring and playing around with this new feature.

Announcing HAProxy Enterprise 3.2

HAProxy Enterprise 3.2 is a pivotal release that reinforces the product’s identity as both the world’s fastest software load balancer and a sophisticated edge security layer. This release brings next-generation security intelligence, extends its industry-leading performance, and expands the native routing and integration capabilities in HAProxy Enterprise.

Unpatchable Vulnerabilities: Key Risk Mitigation Strategies

Wouldn’t it be great if every vulnerability had a fix waiting in the wings? If patching were always fast, easy, and complete? That’s not the world we live in. Some vulnerabilities can’t be patched at all. Others are buried in systems or services you don’t fully control. And the longer your focus stays limited to internal infrastructure, the more risk slips through the cracks.

Secrets We Forgot... Until Automation Saved Us

We All Have That One Secret… That API key that has been sitting in production for ages. The personal access token that was supposed to be rotated 2 months ago. The service key that is about to expire… wait, when does it expire again? Most developers have experienced working with secrets. We create secrets, use them, and promise ourselves that we will rotate them. But somehow, the secret that was supposed to be rotated after 90 days is still standing strong after 6 months. Sounds familiar?