June 24 Global Shopify outage: Timeline and impact

On June 24, 2026, Shopify experienced a widespread service disruption that affected storefronts, admin dashboards, and merchant access across multiple regions. While the outage did not impact every user, reports quickly surfaced from merchants around the world who were unable to access stores, log in to administrative tools, or complete routine operations.

Achieving sovereign and secure AIOps with Ollama and OpManager

Enterprise IT networks power business operations across the world. As businesses scale to keep up with an increasingly demanding user base, networks also grow more complex. IT teams managing these networks have to monitor more data than before, under more stringent SLA terms, with little room for failure. Doing this manually across thousands of devices takes a lot of time and effort and is prone to errors.

The Four Pillars of AI Observability in 90 Seconds

AI applications can behave unpredictably, producing errors such as hallucinations or data leaks even when classic monitoring reports a successful response. Monitoring AI systems effectively means focusing on four key areas. Implementing these pillars can enhance trust in AI deployments, help manage costs, and identify safety issues before they impact users.

How Grafana Cloud Ingests Your Data | Data Sources, Alloy & OTel Explained

Learn the two main ways to get data into Grafana Cloud. In this video, we break down how Grafana Cloud connects to over 150 external data sources (like Salesforce, Postgres, and CloudWatch) where your data stays in place, and how you can send raw telemetry into Grafana’s fully managed databases for logs, metrics, traces, and profiles.

Why you should use Language Server Protocol (LSP) with Claude Code

Agentic coding tools like Claude Code can write, refactor, and debug across an entire codebase, but by default they read code as plain text, the way grep does. The Language Server Protocol (LSP) changes that: it’s the same code-intelligence layer an IDE uses, and wiring it into an agent lets it read code by meaning instead of by string match. The bigger the codebase, the more a wrong guess about a symbol costs, and the more that structural view pays off.

Network Monitoring, the Netdata Way: Topology, NetFlow, SNMP, and Traps

Interface counters tell you a port is busy. Bytes in, bytes out, errors, drops. That’s enough to know a link is saturated, but not enough to know which conversations are saturating it, which devices are involved, or how a problem propagates across your network. For that you’ve traditionally needed dedicated network performance monitoring tools, usually expensive, usually a separate console from the rest of your monitoring.

How Git Worktrees Changed My Development Workflow

Since I started using Claude Code more frequently, I kept noticing a “worktree” checkbox popping up whenever I started a session in a Git repository. I had no idea what it meant, so I did what any curious developer would do and started digging. What I found was a Git feature I had somehow never come across before: git worktrees.

Multi-Agent Architectures - What we shipped, what broke, and what we'd do differently

At LLMday Lisbon, our Software Engineer, Viktor Vasylkovskyi, highlights the realities of building production AI agents with LangGraph - sometimes getting it right, often learning the hard way. This talk is about what was actually shipped, including a distributed multi-agent setup at PagerDuty. Viktor breaks down the real tradeoffs between LLM-driven and deterministic orchestration, what broke, and how he’d approach it differently now.