Monthly Archive

API Latency Monitoring: Metrics, Percentiles, and Alerting Best Practices

Mar 31, 2026 By Dotcom-Monitor In Dotcom-Monitor

APIs power modern applications. Every login request, product search, payment authorization, and mobile app refresh depends on an API responding quickly and reliably. When latency increases, users feel it immediately. Pages stall. Transactions hang. Confidence drops. Most engineering teams measure API latency. Fewer truly monitor it. There is a difference. Many teams track average latency in dashboards and assume performance is healthy.

API Endpoint Monitoring: How to Ensure Reliability, Performance & Functional Accuracy

Mar 31, 2026 By Dotcom-Monitor In Dotcom-Monitor

APIs sit at the core of modern digital infrastructure. From e-commerce checkouts and payment processing to SaaS platforms and mobile applications, APIs move the data that keeps systems running. But APIs do not operate as a single unit. They are made up of individual endpoints, and each endpoint represents a specific function or resource that users depend on. As organizations shift toward microservices, cloud-native applications, and third-party integrations, the number of endpoints increases rapidly.

Stop Mocking K8s. Start Recording with eBPF. #speedscale #sre #ebpf #kubernetes #devops

Mar 31, 2026 By Speedscale In Speedscale

Forget OpenTelemetry overhead. Capture every K8s transaction via eBPF with zero manual work. Turn production "gobbledygook" into a perfect staging environment instantly.

View Video

Is OpenTelemetry overkill? There's a lazier (and better) way. #speedscale #sre #ebpf #kubernetes

Mar 31, 2026 By Speedscale In Speedscale

If you "aspire to be lazy" like we do, you know that building staging environments and mocking complex back-ends (like MySQL, AI models, and third-party APIs) is a massive time sink. In this demo, we show you how to use Internet Magic (aka eBPF) to record them instead. Stay tuned for Part 2, where we take these recordings and spin up a staging environment automatically.

View Video

AI Coding Agents Break What Works

Mar 30, 2026 By Josh Thornton In Speedscale

Your AI coding agent just made every test pass. Ship it, right? Not so fast. A growing class of AI-generated bugs doesn’t come from writing bad code. It comes from the AI changing working code to accommodate its own mistakes. This isn’t a theoretical risk. It’s happening now, in production codebases, and it’s harder to catch than any bug the AI might introduce from scratch.

The 4 Golden Signals of Monitoring Explained

Mar 27, 2026 By Kush Mansingh In Speedscale

As a team, we have spent many years troubleshooting performance problems in production systems. Applications have become so complex that you need a standard methodology to understand performance. Our approach to this problem is called the Golden Signals. By measuring these signals and paying very close attention to these four key metrics, providers can simplify even the most complex systems into an understandable corpus of services and systems.

Enhancing our API for better agentic consumption

Mar 27, 2026 By Mattias Geniar In Oh Dear

AI coding agents like Claude Code and Codex are becoming a real part of developer workflows. They don't just write code, they call APIs, interpret responses, and take action based on what they find. That means the quality of your API responses directly affects how useful an agent can be. We've shipped a series of improvements to the Oh Dear API with this in mind. Every change helps humans too, but we specifically optimized for how agents consume and reason about data.

The Observability Gap: Why Monitoring Data Should Drive Tests

Mar 26, 2026 By Matt LeRay In Speedscale

Most teams already know a lot about production. They have dashboards. They have traces. They have alerts. They have enough telemetry to explain what happened after an incident and enough graphs to argue about it for the rest of the week. Then they go to test a change and start from scratch. The integration tests hit a hand-written mock that returns {"status": "ok"}. The load tests replay a CSV somebody exported months ago. Staging is close enough to production right up until it matters.

Automate Your Monitoring and Incident Handling: How Agents Dominate the Checkly CLI

Mar 26, 2026 By Checkly In Checkly

50% of Checkly's CLI users are already coding agents. We predict that agents will become dominant by the end of 2026. This video demonstrates an agentic workflow where an alert reports a broken Shopify store login flow, and Claude Code, using the installed Checkly Skill and the Checkly CLI, pulls monitoring results, identifies a Playwright test failure, investigates the codebase, finds and fixes a bug, and then updates a Checkly status page by creating an incident.

View Video

Checkly and the Agentic Software Layer

Mar 26, 2026 By Hannes Lenke In Checkly

On November 24th, the Opus 4.5 release turned the entire tech industry around. This was the moment when agents became capable. Capable enough to write solid staff-level code. Capable enough to reason about alerts, investigate root causes much faster than most engineers, and set up the reliability layer faster. For me, this feels like an iPhone moment on steroids; the adoption of AI is accelerating much faster than any adoption curve I’ve seen over the past few decades.

One CLI, Two Audiences: How We Built for Agents and Humans

Mar 26, 2026 By Stefan Judis In Checkly

Half of the Checkly CLI users are already coding agents. This is not a prediction — it's what the data shows today. Since February, more and more agents have been using the CLI to manage and configure their Checkly monitoring setups. Right now, we're at 50% human and 50% agentic CLI users. And we predict that by the end of 2026, it won't be humans using the CLI; the agents will have taken over. The terminal became the primary interface for AI agents doing real work in the Checkly ecosystem.

The Secret to 10x Faster API Testing #speedscale #apitesting #api #automation #production

Mar 26, 2026 By Speedscale In Speedscale

Stop living in the past. See how to use real production traffic to automate your API testing with zero code changes. Replay real-world patterns in your CI/CD and catch regressions before your users do. Learn more: speedscale.com.

View Video

API
DevOps

Production Data Access for Developers: RBAC and DLP

Mar 24, 2026 By Speedscale Team In Speedscale

If you run a software engineering tools team, you have almost certainly had this conversation: a developer asks for production data access to debug a real incident, and someone in the room says no. Not because the request is unreasonable (it isn’t), but because nobody wants to be the person who said yes when something goes wrong. That instinct is understandable. Production environments carry real risk. But the reflex to lock everything down has a cost that rarely gets accounted for.

What Your Engineering Team Means When They Say "We Need a Better Solana API"

Mar 24, 2026 By OpsMatters In OpsMatters

If your engineering lead has asked for a budget to upgrade your Solana infrastructure, you've probably heard about APIs, RPC nodes, latency, and landing rates. And if you are like most non-technical executives, you nodded, approved a line item, and moved on. This post is the translation layer. A Solana API is the single connection between your product and the blockchain it runs on. When that connection is slow, unreliable, or undersized, your users feel it before your dashboards show it.

Nano Banana 2 API in Production: Real Use Cases and Why APIPASS Makes It Accessible

Mar 24, 2026 By OpsMatters In OpsMatters

The first question is not which model in Google's Nano Banana family looks better on a benchmark, but which one you should actually ship with. Nano Banana Pro has always had the luxury edge: higher reasoning, maximal photorealism, studio-grade fidelity. Nano Banana 2, based on Gemini 3.1 Flash Image, came with an entirely different promise: Pro-level world knowledge and output quality on Flash-speed infrastructure at penny-pinching prices.

FastAPI Testing: Mock LLM APIs for Free

Mar 23, 2026 By Ken Ahrens In Speedscale

Testing a FastAPI app that calls OpenAI, Anthropic, or Gemini gets expensive fast. The problem is not just the API bill in production. It is all the repeated traffic in development: prompt tweaks, CI runs, regression checks, and the load tests you keep putting off because every run burns tokens. Hand-written mocks do not help much once the app is doing multi-step LLM work.

New API update: Filter incidents by phase & severity

Mar 21, 2026 By Valeria Kurolapova In StatusGator

We’ve enhanced our API for custom monitors with more powerful filtering options – giving you better control over how incidents are tracked and surfaced.

API Availability Monitoring: How to Measure True API Availability

Mar 21, 2026 By Dotcom-Monitor In Dotcom-Monitor

APIs are no longer just integration layers. They power customer logins, payment processing, SaaS workflows, partner ecosystems, and mobile applications. When an API becomes unavailable, revenue stops, user trust declines, and service level agreements are immediately at risk. Yet many teams still define API availability in the simplest possible way. If an endpoint responds with a 200 OK, the API is considered available. Monitoring dashboards stay green. Alerts remain silent. Everything appears healthy.

API Error Monitoring: A Complete Guide to Detecting and Resolving API Failures

Mar 21, 2026 By Dotcom-Monitor In Dotcom-Monitor

APIs power nearly every modern digital experience. From mobile apps and SaaS platforms to payment gateways and internal microservices, APIs handle authentication, transactions, content delivery, and system-to-system communication. When an API fails, users often experience broken features, slow responses, or complete service outages. In many cases, they leave before your team even realizes something is wrong. The business impact of API failures is significant.

The Hidden AI Bill: Why Non-Prod LLM Costs Spiral

Mar 20, 2026 By Ken Ahrens In Speedscale

Most teams know they are spending money on AI in production. Far fewer realize how much they are spending outside production. It’s easy to get lost as you evaluate which model has the best responses, is fast enough, and cheap enough to run in production. That is because the AI bill usually shows up as a giant blob. It is easy to see the total.

API Observability Tools: Complete Guide to Platforms, Features & Use Cases (2026)

Mar 20, 2026 By Dotcom-Monitor In Dotcom-Monitor

Modern software runs on APIs. Whether you are operating microservices, integrating third-party services, or building customer-facing platforms, APIs are the backbone of your architecture. As systems become more distributed, simply knowing whether an endpoint is up or down is no longer enough. Teams need deeper visibility into performance, reliability, latency, and behavior across environments. That is where API observability tools come in. API observability goes beyond basic health checks.

Cut your AI API costs while you develop. #speedscale #api #softwaredevelopment #aicoding #devops

Mar 20, 2026 By Speedscale In Speedscale

Speed is everything, but accuracy matters too. Learn the exact procedure to record live AI responses and use them as simulations for your automated tests. Watch the full breakdown and start saving tokens today.

View Video

API Status Monitoring: Real-Time Health & Uptime Tracking

Mar 20, 2026 By Dotcom-Monitor In Dotcom-Monitor

APIs sit at the center of modern digital infrastructure. Mobile applications, SaaS platforms, microservices, and third-party integrations all depend on APIs to exchange data and execute business logic in real time. When an API becomes unavailable, slows down, or returns incorrect data, users feel it immediately. Transactions fail. Dashboards stop updating. Logins break. Revenue and trust are affected within minutes.

The "Secret" to Faster LLM Development Cycles

Mar 20, 2026 By Speedscale In Speedscale

Stop paying for every test run! Building AI apps is expensive, but your dev environment shouldn't be. In this video, I show you how to use LLM simulation to get realistic responses and latency without the massive API bill.

View Video

Stop Guessing Why Your App Broke #speedscale #api #sre #observability #devopsengineering

Mar 20, 2026 By Speedscale In Speedscale

Learn more: speedscale.com.

View Video

API
DevOps

API Response Time Monitoring: Metrics, SLAs & Optimization Guide

Mar 20, 2026 By Dotcom-Monitor In Dotcom-Monitor

Modern applications are powered by APIs. Every login request, checkout transaction, mobile interaction, and third-party integration depends on APIs responding quickly and reliably. When an API slows down, the entire user experience suffers. Even a one-second delay in response time can have measurable consequences. For ecommerce platforms, fintech systems, SaaS products, and real-time applications, slow APIs do not simply create inconvenience. They directly affect revenue, customer retention, and operational stability.

Network Monitoring as Code

Mar 18, 2026 By Checkly In Checkly

Dangling DNS, TCP handshake failures, packet loss: your network has blind spots that application-level dashboards miss. In this session, Daniel Paulus (VP Engineering, Checkly) sets up DNS, TCP, and ICMP monitors from scratch and deploys them as code using the Checkly CLI. You'll see how to import checks from the UI to a code project, use coding agents to build monitors, and debug network failures with Rocky AI, trace routes, and packet captures.

View Video

Prompt, Deploy, Pray Is Dead: Validating AI Code with Proxymock

Mar 13, 2026 By Alan Mon In Speedscale

Recent outages tied to AI-assisted code changes have pushed companies into a corner. After several incidents with massive “blast radius” impacts, organizations like Amazon introduced stricter controls—mandating that senior engineers manually review all AI-generated code before it hits production. That response makes sense on paper, but it exposes a fatal flaw in the modern development pipeline.

API Failure: 7 Causes and How to Fix Them

Mar 13, 2026 By Harness Team In Harness

APIs have revolutionized how web and web app developers interact with data, whether for personal use or business. One of our most profound responsibilities as API developers is to protect our endpoints from being hacked. Even with essential safeguards in place, our websites can be vulnerable. This post discusses seven causes of API failures and how to fix them.

Why 200k Developers Ditched Big Tech AI #openclaw #openai #claude #aicoding #aiagents #speedscale

Mar 12, 2026 By Speedscale In Speedscale

Is architectural purity dead? The big labs are racing for enterprise control, but developers are flocking to OpenClaw for one reason: ergonomics. It treats AI like a human, not a restricted tool. Are you sticking with the corporate harnesses or going unfiltered? Let’s talk in the comments. Learn more: speedscale.com.

View Video

API
DevOps

Your Flaky Tests Are a Data Problem, Not a Test Problem

Mar 11, 2026 By Ken Ahrens In Speedscale

Your tests are not flaky. Your test data is. That 401 Unauthorized that fails every Monday morning? The OAuth token in your test fixture expired 72 hours ago. The order_id that works in staging but not in CI? It was hardcoded six months ago and the format changed from integer to UUID in January. The timestamp assertion that passes at 2pm and fails at midnight? You are comparing a hardcoded 2026-01-15T14:30:00Z against Date.now(). These are not test infrastructure problems. Retrying them will not help.

Runtime Validation vs Static Analysis: Why You Need Both

Mar 11, 2026 By Ken Ahrens In Speedscale

Runtime validation does not replace static analysis. They solve different problems. Static analysis catches structural defects in code before it runs. Runtime validation catches behavioral failures by testing code against real production traffic. Enterprise teams adopting AI coding tools need both layers because AI-generated code introduces a new class of defects that neither layer catches alone. According to CodeRabbit's State of AI vs Human Code Generation report, AI-generated pull requests contain roughly 1.7x more issues than human-written ones. Many of those issues pass static checks cleanly.

AI Coding Agents Have a UX Problem Nobody Wants to Talk About

Mar 11, 2026 By Kush Mansingh In Speedscale

The pitch was simple: let AI write your code so you can focus on the hard problems. Three years into the AI coding revolution, and developers are focused on hard problems alright, just not the ones anyone expected. Instead of designing systems and solving business logic, engineers in 2026 spend a startling amount of their day managing the AI itself. Should you use Fast Mode or Deep Thinking? Haiku or Opus? Cursor or Claude Code or Windsurf? Should you write a SKILL.md file or a custom system prompt?

Expanding Uptime Monitoring Down The Stack: ICMP Monitors Are Now Available In Checkly

Mar 10, 2026 By Susa Tünker In Checkly

When we started building Checkly's uptime monitoring suite, the goal was to give engineering teams complete visibility across every layer of their stack, from application down to network, in one place. URL, TCP, DNS, and Heartbeat monitors covered a lot of that ground. But one fundamental piece was missing: the ability to simply ping a host and know if it's reachable.

ICMP Monitors Are Now Available in Checkly

Mar 10, 2026 By Checkly In Checkly

Checkly introduces ICMP monitoring to complement its existing uptime and synthetic monitoring (URL/HTTP, TCP, DNS, and heartbeat checks) for systems without HTTP endpoints, such as database hosts, VPN gateways, and load balancers.

View Video

Why API Documentation Is a Core Engineering Discipline, Not an Afterthought

Mar 10, 2026 By OpsMatters In OpsMatters

Developers rarely cite documentation as the most exciting part of building an API. Yet it is frequently the factor that determines whether an integration succeeds in days or drags on for weeks. Poor documentation creates friction at every stage of the API lifecycle. Consumers misunderstand endpoints, send malformed requests and file support tickets that a well-structured reference would have made unnecessary.

WireMock vs MockServer vs Proxymock: Java Mocking in 2026

Mar 9, 2026 By Ken Ahrens In Speedscale

Your WireMock stubs are lying to you. They were accurate when someone wrote them six months ago, but the payment API added a metadata field in January, the inventory service switched from REST to gRPC in February, and nobody updated the stubs because the tests still pass. Meanwhile, production is breaking in ways your mocks will never catch. This is not a WireMock problem. It is a hand-written mock problem.

New API: Submit outage reports

Mar 6, 2026 By Valeria Kurolapova In StatusGator

We’ve added a new endpoint to the StatusGator API that allows you to submit outage reports for monitors on your board. With the new Outage Reports API, you can programmatically report issues you’re experiencing with a service. These reports help StatusGator detect outages faster and improve visibility for other users who rely on the same services.

We Turned Our Wireshark Wizard Into a Markdown File

Mar 4, 2026 By Tim Nolet In Checkly

Rocky AI, Checkly’s AI agent, is now Generally Available. We developed Rocky AI over the last ~6 to 8 months. This is an aeon in AI-years. During this period, we learned a ton. About AI, but mostly about how to fit it into an existing SaaS product as more than just another chat widget. This is my ramble…

Introducing Rocky AI to General Availability

Mar 4, 2026 By Dan Giordano In Checkly

After months of being available in Beta for our app users, Rocky AI is now generally available to all users and plans. Rocky AI is Checkly’s AI agent that works around the clock to keep your application’s reliability optimal. In this first release, Rocky AI ships with the ability to run continual Analysis on test and check failures, giving your teams AI-powered root cause analysis, impact analysis, and more.

Spring Boot API Testing: A Practical Guide for Enterprise Teams

Mar 3, 2026 By Ken Ahrens In Speedscale

Enterprise Spring Boot APIs should be tested at three levels: unit tests for business logic, integration tests for external service behavior, and traffic replay for production edge cases. Most teams only do the first. This guide shows all three using a real Spring Boot application that calls external APIs (SpaceX, US Treasury) with JWT authentication. The kind of service that looks simple in development and breaks in production.

Debugging Encrypted Microservice Traffic with Speedscale's eBPF Collector

Mar 3, 2026 By Matthew LeRay In Speedscale

Production bugs that only reproduce in actual traffic can be some of the most frustrating bugs in software development. You can stare at your logs, add traces to your code, add instrumentation – and still not be able to see the actual requests that went over the wire. And that gets even harder when the requests are encrypted and the system is a black box. You can use tools like Wireshark or Kubeshark to capture the requests.
