Reducing MTTR: Why Speed Matters for B2B SaaS Companies

For B2B SaaS companies, downtime isn’t just an inconvenience; it’s a direct threat to customer satisfaction and revenue. Unlike consumer applications, B2B products serve a mix of power users pushing the system to its limits and new users expecting a seamless experience from day one. Reliability isn’t just about keeping services online. It’s about ensuring every user interaction runs smoothly. A minor hiccup for one customer might be a major disruption for another.

Why you should never use page.waitForTimeout() in Playwright

Playwright isn’t a testing framework. Sure, it’s got assertions, scripted behaviors, even controls over environments. But testing isn’t Playwright’s only purpose. Playwright is an automation tool. It can carry out any browser-based action consistently and follow instructions robustly. Its recommended locators for buttons and other elements are based on ARIA role rather than visual position or CSS class, so even small styling changes won’t make the scripted action fail.

Why you shouldn't run tests sequentially

A problem comes up frequently in support conversations and posts on Playwright forums. It’s a little hard to describe, but it comes down to order-dependent testing: developers write a series of Playwright tests on the assumption that one test will always run first and another will always run last, so that those tests can act as setup and cleanup scripts.

TCP Checks Now Available in Checkly

Checkly has always helped you monitor your APIs and web services, ensuring they stay fast, reliable, and available. But application reliability doesn’t stop there—databases, message queues, and mail servers all play a crucial role in your infrastructure. To provide full application reliability, we’re expanding into network monitoring with TCP checks. Now, you can monitor critical non-HTTP services directly in Checkly—without adding extra tools to your stack.

Why and How You Should Use Your Learning & Visiting Budget

When I joined Checkly as Junior People Operations Manager, one of the benefits that immediately stood out to me was the Learning & Visiting budget. I found myself wondering—how is this budget actually being used across the company? At the start of the year, many of our team members plan how they’ll use their learning budget—whether to enhance professional skills or pursue self-driven projects. With flexible guidelines, we encourage them to invest in what matters most.

Shorten your MTTR with Checkly Traces

We all know that Checkly is a ‘secret weapon’ for engineering teams who want to shorten their mean time to detection (MTTD). With Checkly, you can know within minutes if your service is unavailable for users, or acting unexpectedly. In this article we’ll talk about how Checkly Traces can help you expand on the benefits of Checkly, adding insights that help you diagnose root causes and further reduce your mean time to resolution (MTTR) for outages and other incidents.

Networks are everyone's business - TCP Checks for app developers

Checkly is the industry’s best tool to monitor your production applications. With the power of Playwright, developers can test the systems they’ve developed, and roll out those tests as production monitors running from multiple geographies on the Checkly system. And Checkly monitors thousands of API endpoints with complex validation, setup and cleanup scripts, and reliable alerting. So why are we expanding into TCP-based checks?

Optimize MTTD with the right check frequency

Checkly enables engineers to automate the monitoring of their production services. Using the automation framework Playwright, you can run an end-to-end test on a regular cadence to make sure every feature is working for your users. But once you’ve got your check set up, whether with Playwright scripting, a Terraform template, or an OpenAPI spec, we come to the question of how frequently you should run these checks. Should you be checking every few minutes, or every hour?
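
One rough way to reason about that trade-off is to estimate how long a failure can go unnoticed at a given check interval. The sketch below is only an illustration. It assumes a failure starts at a uniformly random moment between two runs and that a single failed run counts as detection. The names `detectionDelay` and `checkDurationSeconds` are hypothetical and not part of any Checkly API.

```typescript
// Hypothetical helper: plain arithmetic on check intervals, not a Checkly API.
interface DetectionDelay {
  averageSeconds: number;   // expected delay if failures start at a random moment
  worstCaseSeconds: number; // failure starts right after a check has sampled the service
}

function detectionDelay(
  intervalSeconds: number,
  checkDurationSeconds: number = 0,
): DetectionDelay {
  if (intervalSeconds <= 0 || checkDurationSeconds < 0) {
    throw new RangeError("interval must be positive and duration non-negative");
  }
  return {
    // On average the next run starts half an interval after the failure begins,
    // and it still has to finish before it can report anything.
    averageSeconds: intervalSeconds / 2 + checkDurationSeconds,
    // In the worst case a full interval passes before the next run starts.
    worstCaseSeconds: intervalSeconds + checkDurationSeconds,
  };
}

// A check every 5 minutes that takes 10 seconds to run:
const everyFiveMinutes = detectionDelay(300, 10);
console.log(everyFiveMinutes); // { averageSeconds: 160, worstCaseSeconds: 310 }

// A check every hour, ignoring run time:
const hourly = detectionDelay(3600);
console.log(hourly); // { averageSeconds: 1800, worstCaseSeconds: 3600 }
```

Under these assumptions, moving from hourly to five-minute checks cuts the average detection delay from 30 minutes to under 3 minutes. Retries or multi-location confirmation before alerting would add to both figures.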

Making sure you get a Checkly alert for every detected failure

It’s every ops team’s biggest anxiety: a monitoring system detects a failure, but the notification either isn’t delivered or isn’t noticed by the team. Now we have to wait for users to complain before our team knows about the problem. Checkly sends an alert every time the system detects a failure, but how can you be sure you’re getting those alerts, and that those alerts are going to the right people?

Announcing Checkly Traces: Unified Synthetic Monitoring and Distributed Tracing

Until recently, Checkly was telling you what broke in your app. Now, it can also tell you why it broke. We're excited to announce the general availability of Checkly Traces, a new addition to our synthetic monitoring platform that bridges the gap between frontend monitoring and backend observability. By combining synthetic monitoring with distributed tracing, Checkly Traces empowers development teams to detect, diagnose, and resolve issues faster than ever before.