
Cloudflare outage: another wake-up call for resilience planning

Another day, another massive Internet disruption, and this time it’s Cloudflare taking huge parts of the Internet offline. This incident is not an anomaly. It is part of a recurring pattern that has become standard in digital infrastructure. We have reached an inflection point in digital operations. Outages at major cloud and content delivery network (CDN) providers are now expected. The only real uncertainty is when the next one will happen.

Fork Your Database for Staging & Testing

Learn how to quickly create a fork of your Aiven for PostgreSQL instance to set up a safe staging or testing environment. In this demo, we walk you through selecting your project and service, navigating to the backups & forking section, naming your new instance, choosing the cloud provider and plan, and finalising the fork to replicate your original database. This approach allows you to test changes safely without affecting your production database, making development and QA workflows much more reliable.

Introducing webvitals.com: Find out what's slowing down your site

Developers don’t need another “run this tool, stare at a number, and feel bad about it” website. So we built something different. WebVitals helps you analyze, optimize, and ship faster websites, all in one place. Built by the same folks who obsess over stack traces and slow queries, it connects the dots between performance metrics and what’s actually slowing your users down.

KubeCon Atlanta Signals Key Shift: From Cloud Cost To Value Engineering

After three days of demos, sessions, and hallway conversations at KubeCon Atlanta, one thing became clear to CloudZero CTO Erik Peterson: the cloud-native world is shifting from cost control to value engineering. Teams aren’t just fighting bills anymore. They’re fighting complexity, GPU scarcity, Kubernetes sprawl, and pressure from the business to justify every dollar of technical investment. And this year’s KubeCon attendees? They were ready for those conversations.

AWS And Azure Outages Will Recur - Here's How You Ensure Resilience

The cloud has long promised limitless scalability and near-perfect uptime. But if you tried to access your Microsoft 365 dashboard or recline your smart bed last week, and got nothing but a spinning icon, you weren’t alone. In the span of 10 days, both Amazon Web Services (AWS) and Microsoft’s Azure Cloud suffered widespread outages that rippled across industries.

Uptrends x OpenTelemetry: Stream browser-level synthetic data into your observability stack

Dashboards and alerts can tell you something’s wrong, but they don’t immediately tell you why. A red indicator or synthetic test failure prompts detective work. You flip between dashboards, timestamps, and logs, trying to line up what the check saw with what the system did. Now imagine your monitoring could explain itself by sending traces directly into your OpenTelemetry (OTel) backend.

When Bots Grow Brains: RPA and Agentic AI For the Win

For a long time, robotic process automation (RPA) was the fastest way to scale repetitive digital work. Bots copied, clicked, and executed rule-based tasks faster than any human. They reduced error rates and delivered early wins for efficiency. Sounds just fine, right? Prepare for a Matrix moment, because the truth is that IT teams built RPA only for predictability. It could follow instructions, but it couldn’t adapt when something unexpected happened.

Maintaining Software Excellence in the Age of AI Coding Assistance

In this preview of his AWS re:Invent session, Cortex CTO & Co-Founder Ganesh Datta breaks down how AI coding assistants are transforming software development, and what high-performing teams are doing to keep speed and reliability in balance. If you care about AI, engineering velocity, or building sustainable systems, this is a must-watch. Full session: December 3 at 2:30 PM. Learn more about Cortex: go.cortex.io/reinvent.