Introducing: Checkly Agent Skills

AI coding agents are excellent at writing code. Ask Claude Code, Codex, or Cursor to add a feature, and it just works. At Checkly, we were ready for the new agentic world from the start! Monitoring as Code means your entire monitoring setup lives in your repository. API Checks, Browser Checks, alert channels, and status pages are all defined in code, managed with the Checkly CLI, and version-controlled like any other part of your stack.

The post-mortem problem

Post-mortems are required, time-consuming, and widely disliked, but they’re also one of the biggest opportunities to improve reliability. In this webinar, we talked about how to run post-mortems that actually lead to learning and improvement. We covered why most post-mortems fall flat and how to structure them effectively, and we walked through a real example to show what good looks like in practice. The goal: fewer wasted hours, better outcomes, and post-mortems that actually matter.

Improve performance and reliability with APM Recommendations

SREs and application developers rely on telemetry data to understand and improve their systems. As organizations scale and evolve, those systems generate an ever-growing volume of metrics, logs, and traces. But more data alone does not make it easier to improve performance or reliability: Identifying meaningful optimizations still requires careful investigation and analysis.

Automation Unlocked: Making Your IT Life Easier

Knowing how to take advantage of automation is a massive part of maintaining your sanity when running an IT organization. Between automating technical workflows and creating templates for your people management processes, there are so many opportunities for you to make your IT life easier with automation! As always, we hosted a panel of IT leaders who discussed their own automations and how those automations save them time. If you have any that you'd like to share, feel free to leave them in the comments.

AI Query Assist for SolarWinds SQL Sentry

Rewrite inefficient SQL Server queries in seconds, not hours. In this demo, we show you how AI Query Assist in SolarWinds SQL Sentry transforms the way you tune performance. Watch how to take a problematic query from the "Top SQL" view and use generative AI to produce optimized rewrites and uncover missing indexes. What you will see: Instant Optimization: How to automate query rewriting and get plain-language explanations of the logic changes.

Predict, compare, and reduce costs with our S3 cost calculator

I have previously written about how useful public cloud storage can be when starting a new project without knowing how much data you will need to store. However, as datasets grow over time, the costs of public cloud storage can become overwhelming. This is where an on-premises or co-located self-hosted storage system becomes advantageous: it provides the greatest range of benefits, including cost, performance, security, and data sovereignty.
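
The cost comparison described above can be sketched in a few lines of Python. This is only an illustrative model, not the calculator the article refers to: the function names are hypothetical, every price is a placeholder you would replace with your own quotes, and it assumes a fixed dataset size while ignoring egress fees, request charges, and hardware refresh cycles.

```python
import math
from typing import Optional


def cloud_cost(tb_stored: float, months: int, price_per_tb_month: float) -> float:
    """Total pay-as-you-go cost of keeping tb_stored terabytes for `months` months."""
    return tb_stored * price_per_tb_month * months


def self_hosted_cost(months: int, upfront: float, monthly_running: float) -> float:
    """Total self-hosted cost: one-off purchase plus monthly running costs."""
    return upfront + monthly_running * months


def break_even_month(
    tb_stored: float,
    price_per_tb_month: float,
    upfront: float,
    monthly_running: float,
) -> Optional[int]:
    """First whole month at which self-hosting costs no more than cloud storage.

    Returns None when the monthly cloud bill never exceeds the self-hosted
    running costs, so the upfront spend is never recovered.
    """
    monthly_saving = tb_stored * price_per_tb_month - monthly_running
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront / monthly_saving)


if __name__ == "__main__":
    # All figures below are hypothetical placeholders, not real prices.
    month = break_even_month(
        tb_stored=500,
        price_per_tb_month=20.0,
        upfront=120_000,
        monthly_running=4_000,
    )
    print(f"Break-even month: {month}")
```

With these placeholder figures the two options cost the same after 20 months. A smaller dataset may never break even, which matches the point that public cloud storage suits projects that are still small or of unknown size.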

GPU-as-a-Service: The network's critical role in accelerated computing

The explosion of AI has created a continuous demand for computing power. At the heart of this need sits one critical resource: GPUs. They have become the hardware of choice for AI and machine learning, particularly deep learning workloads that operate on enormous data sets. However, as organisations race to train larger models and deliver faster inference, many are discovering that owning and operating GPU infrastructure isn’t always practical.