Checkly

How Often Should You Ping Your Site?

How often should you ping your site? Should you be checking every few minutes, or every hour? Surely you have other ways to detect problems, so maybe just a daily check of your API and main page would be enough, right? While there’s no single right answer for everyone, this post tries to break down how you can find the right cadence for your site checks.

Your Practical Guide to Reducing MTTR

Let’s face it: incidents will always happen. We simply can’t prevent them. But we can strive to mitigate the impact they have on our product and customers. Ensuring high reliability depends on finding and fixing problems quickly and effectively. This is where MTTR, short for “mean time to restore” or “mean time to resolve,” becomes a valuable metric for organizations.

How to wait for a specific API response in your Playwright end-to-end tests

Learn in this video how to monitor network HTTP calls in your end-to-end tests and use Playwright's "waitForResponse" method to capture specific network responses. This approach allows you to wait for specific API calls and validate that your website or app shows the correct data.
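
A minimal sketch of how this can look in a Playwright test, assuming a hypothetical /api/orders endpoint and response shape:

```ts
import { test, expect } from '@playwright/test';

test('renders the data returned by the orders API', async ({ page }) => {
  // Start waiting for the response *before* the action that triggers it,
  // so the call can't complete before the listener is registered.
  const responsePromise = page.waitForResponse(
    (response) =>
      response.url().includes('/api/orders') && response.status() === 200
  );

  await page.goto('https://example.com/orders');

  // Resolves once the matching network call has finished.
  const response = await responsePromise;
  const body = await response.json();

  // Validate that the UI shows what the API actually returned.
  await expect(page.getByText(body.orders[0].name)).toBeVisible();
});
```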

Open Source Observability with OpenTelemetry and Checkly

We need to monitor our service's performance, but large closed SaaS options are expensive and complex. OpenTelemetry is the 'wave of the future' for observability, but is it ready for your team? Yes! Join Nočnica to see a demonstration of instrumenting a demo application and learn what OpenTelemetry can do. We'll also add external site monitors with Checkly synthetics checks.
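
For reference, a minimal tracing setup with the OpenTelemetry Node SDK could look roughly like this; the service name and collector URL are placeholders, not details from the talk:

```ts
// tracing.ts — load this before your application code starts.
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

const sdk = new NodeSDK({
  serviceName: 'demo-app', // placeholder service name
  traceExporter: new OTLPTraceExporter({
    // Point this at your OpenTelemetry Collector or any OTLP-compatible backend.
    url: 'http://localhost:4318/v1/traces',
  }),
  // Auto-instrument common libraries (HTTP, Express, database drivers, ...).
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```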

Exploring the Synergy Between Testing and Monitoring in Software Development

The roles of testing and monitoring often intersect, yet they maintain distinct identities. In my near-decade in the tech sector, I've observed how end-to-end (E2E) tests and synthetic monitoring, despite sharing frameworks and requirements, often fail to benefit from collaboration and synergy.

Parallel Scheduling Is Now GA: Detect Regional Outages Up to 20x Faster

I am happy to announce that Checkly now supports parallel scheduling as a new way to schedule your checks. Parallel scheduling lets you reduce mean time to detection, gain better insights when addressing outages, and improve the accuracy of your performance trends, making it a powerful new feature for all Checkly users.

Never miss an Outage: Improve your monitoring with Checkly's Parallel Scheduling

In this video, you will learn how to leverage Checkly's parallel scheduling feature to simultaneously monitor and test all your essential production targets. This knowledge will help you reduce your mean time to detect outages, assess whether production problems are regional or global, and enhance your monitoring data granularity.
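
If you define your checks with the Checkly CLI, enabling parallel scheduling on a check could look roughly like the sketch below. Treat it as an illustration: the `runParallel` property name and the endpoint URL are assumptions here, not details taken from the announcement.

```ts
import { ApiCheck, AssertionBuilder, Frequency } from 'checkly/constructs';

new ApiCheck('home-api-check', {
  name: 'Home API',
  frequency: Frequency.EVERY_1M,
  locations: ['us-east-1', 'eu-west-1', 'ap-southeast-1'],
  // Assumption: run from all locations on every tick instead of round-robin,
  // so a regional outage shows up on the very next run.
  runParallel: true,
  request: {
    method: 'GET',
    url: 'https://api.example.com/health', // placeholder endpoint
    assertions: [AssertionBuilder.statusCode().equals(200)],
  },
});
```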

Observability with OpenTelemetry and Checkly

Observability isn't just a buzzword; it's a vital compass guiding us through the maze of system health and performance. As we’ve adopted microservice architectures, our ability to know ‘what is currently happening in our system’ has diminished even as our operational resilience has increased. We find services scattered among a maze of interconnections and interdependencies. And even the logs that used to guide us are now scattered throughout this maze.

How to combine Playwright locators to test non-deterministic application flows

Sometimes, applications behave differently even though your users do the same things. How can you test these non-deterministic flows? Learn in this video how Playwright's "locator.or()" method helps you write tests that can handle different application flows.
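
As a sketch of the idea, assuming an app that sometimes shows a newsletter dialog before the dashboard (both locators are hypothetical):

```ts
import { test, expect } from '@playwright/test';

test('handles an optional newsletter dialog', async ({ page }) => {
  await page.goto('https://example.com');

  const newsletterDialog = page.getByRole('dialog', { name: 'Subscribe to our newsletter' });
  const dashboardHeading = page.getByRole('heading', { name: 'Dashboard' });

  // locator.or() matches whichever of the two elements shows up.
  await expect(newsletterDialog.or(dashboardHeading)).toBeVisible();

  // If the dialog appeared, dismiss it before continuing with the main flow.
  if (await newsletterDialog.isVisible()) {
    await newsletterDialog.getByRole('button', { name: 'No thanks' }).click();
  }

  await expect(dashboardHeading).toBeVisible();
});
```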

A Guide to Visual Regression Testing With Playwright and How to Get Started

I’m pretty sure you’ve had a situation where you deployed a major UX change to your web app and missed the most obvious issues, like a misaligned button or distorted images. Unintended changes on your site can cause not only a sharp decline in user satisfaction but also a drop in sales and customer retention. By identifying and resolving these discrepancies before the update goes live, you can prevent these outcomes.
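
To give a flavor of what the guide covers, a minimal visual regression check with Playwright's built-in screenshot assertion might look like this (the URL and thresholds are illustrative):

```ts
import { test, expect } from '@playwright/test';

test('landing page has no unintended visual changes', async ({ page }) => {
  await page.goto('https://example.com');

  // The first run stores a baseline screenshot; later runs compare
  // against it and fail when the pixel difference exceeds the threshold.
  await expect(page).toHaveScreenshot('landing-page.png', {
    fullPage: true,
    maxDiffPixelRatio: 0.01, // tolerate tiny rendering differences
  });
});
```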