The latest News and Information on DevOps, CI/CD, Automation and related technologies.

A guide to setting up alerts for a new service

When you launch a new service in production, you’re working with a lot of unknowns. You don’t yet know how it behaves under real traffic or which incidents are worth waking someone up for. That makes alerting for a new service a little different from what you’re used to with an established one. The goal in the early days isn’t to get everything perfectly configured. It’s to learn enough about the service to get your alerting right.

Stop ECS Containers From Collapsing Into One Service in OpenTelemetry

Why ECS containers collapse under service.name = aws_ecs and how to fix it for both EC2 launch type and Fargate, including the resource-vs-log-record pitfall that quietly breaks log filtering. Prathamesh works as an evangelist at Last9, runs SRE stories - where SRE and DevOps folks share their stories, and maintains o11y.wiki - a glossary of all terms related to observability.

Step 5 to Web App Deployment: Cloud Configuration (Where Your App Actually Lives)

This deployment series now arrives at the layer that quietly determines whether your app thrives… or throws mysterious 2am errors. Step 5 is cloud configuration. This is where your application gets its infrastructure, its environment, and its ability to scale without drama.

Build with Claude Code, Deploy with Qovery

AI coding tools eliminated the 'writing code' bottleneck. But deploying that code? Still a mess. Here's how Claude Code + Qovery Skill lets you go from idea to production in a single prompt - with enterprise-grade guardrails. Romaric founded Qovery to make Kubernetes accessible to every engineering team. He writes about platform strategy, developer experience, and the future of cloud infrastructure.

Hyperscaler vs. independent cloud: How startups should choose in 2026

A two-person startup signs up for the obvious hyperscaler because their last company used it, because Stripe runs on it, because the documentation is exhaustive, and because the free tier looks generous. Eighteen months later, with a small team and a healthy seed round, they discover they're spending $18,000 a month, and they don't quite know where most of it is going. Three engineers can describe the architecture in detail. Nobody can describe the bill.

Google Cloud Storage Pricing: The No BS Guide To GCP Storage Costs [2026]

This straightforward guide will help you understand GCP storage pricing without the jargon. Understanding where your cloud spend goes lets you pinpoint who and what drives your cloud costs, and why. This visibility supports informed decisions about reducing unnecessary spend or increasing investment in high-return areas.

How Criteo handles 23M requests per second (RPS) with HAProxy Runtime API automation

Criteo handles 23 million requests per second (RPS) while maintaining peak performance and minimizing downtime. For most organizations, handling that level of traffic is just a theoretical stress test: a what-if scenario should their infrastructure ever be overwhelmed by an unexpected wave of requests. But for Criteo, 23 million RPS is just another Tuesday.

Resolve Webinar: Introducing AgentLab: The Foundation of the Autonomous Service Desk

Most service desks still operate across fragmented systems. A single ticket can touch 4–7 tools, often more, slowing resolution and increasing cost. Copilots suggest. Traditional automation executes fixed paths. Neither closes the loop. AgentLab changes that. In this webinar, we introduce a new model built on agentic AI and orchestration. One where AI agents don’t just assist. They act, adapt, and resolve.