
Applications Manager now officially supports Podman monitoring!

As organizations shift away from traditional container engines to embrace Podman’s rootless and daemon-less design, visibility often becomes a challenge. Because Podman doesn't rely on a central background service, traditional monitoring tools can leave you in the dark. Applications Manager's new Podman monitoring feature bridges that gap, giving you total visibility into your Podman workloads without compromising the security model you worked so hard to build.

API Status Monitoring: Real-Time Health & Uptime Tracking

APIs sit at the center of modern digital infrastructure. Mobile applications, SaaS platforms, microservices, and third-party integrations all depend on APIs to exchange data and execute business logic in real time. When an API becomes unavailable, slows down, or returns incorrect data, users feel it immediately. Transactions fail. Dashboards stop updating. Logins break. Revenue and trust are affected within minutes.
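As a rough illustration of the three failure modes named above (unavailable, slow, incorrect data), a status check might classify each probe along these lines. This is a minimal sketch, not any vendor's implementation: the `ProbeResult` shape, the 500 ms threshold, and the `required_keys` check are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Outcome of one synthetic check against an endpoint (hypothetical shape)."""
    status_code: Optional[int]  # None means the request never completed
    latency_ms: float
    body: dict


def classify(probe: ProbeResult,
             max_latency_ms: float = 500.0,
             required_keys: tuple = ("status",)) -> str:
    """Map a probe to one of: 'down', 'invalid', 'slow', 'healthy'."""
    # Unavailable: no response at all, or a server-side error.
    if probe.status_code is None or probe.status_code >= 500:
        return "down"
    # Incorrect data: a response arrived but expected fields are missing.
    if any(key not in probe.body for key in required_keys):
        return "invalid"
    # Slow: a valid answer that took longer than the assumed threshold.
    if probe.latency_ms > max_latency_ms:
        return "slow"
    return "healthy"
```

Checking for "invalid" before "slow" is a design choice in this sketch: a wrong answer is treated as worse than a late one.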

API Observability Tools: Complete Guide to Platforms, Features & Use Cases (2026)

Modern software runs on APIs. Whether you are operating microservices, integrating third-party services, or building customer-facing platforms, APIs are the backbone of your architecture. As systems become more distributed, simply knowing whether an endpoint is up or down is no longer enough. Teams need deeper visibility into performance, reliability, latency, and behavior across environments. That is where API observability tools come in. API observability goes beyond basic health checks.

API Response Time Monitoring: Metrics, SLAs & Optimization Guide

Modern applications are powered by APIs. Every login request, checkout transaction, mobile interaction, and third-party integration depends on APIs responding quickly and reliably. When an API slows down, the entire user experience suffers. Even a one-second delay in response time can be costly. For ecommerce platforms, fintech systems, SaaS products, and real-time applications, slow APIs do not simply create inconvenience. They directly affect revenue, customer retention, and operational stability.
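To make the response-time idea concrete, here is a minimal sketch of two numbers a team might track: a nearest-rank percentile and the share of requests that meet a latency target. The function names, the sample latencies, and the 500 ms target are illustrative assumptions, not taken from the article.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile of a list of latencies (pct in 1..100)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def slo_compliance(samples, threshold_ms):
    """Fraction of requests that finished within threshold_ms."""
    if not samples:
        raise ValueError("no samples")
    return sum(1 for s in samples if s <= threshold_ms) / len(samples)


# Hypothetical latencies in milliseconds; one request is a 2-second outlier.
latencies = [120, 80, 95, 2000, 110, 105, 90, 100, 115, 85]
summary = {
    "mean": mean(latencies),
    "p50": percentile(latencies, 50),
    "p95": percentile(latencies, 95),
    "within_500ms": slo_compliance(latencies, 500),
}
```

With this sample the median stays at 100 ms while the mean is pulled to 290 ms and the 95th percentile lands on the 2000 ms outlier, which is why tail percentiles are usually reported alongside, or instead of, averages.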

Error Monitoring for Elixir: Now in Scout APM

Elixir’s “let it crash” philosophy is one of the best ideas in modern software design. Supervisors restart failed processes, the system self-heals, and life goes on. It’s like having a really good immune system. The problem is that a really good immune system can also hide chronic conditions. A GenServer crashing and restarting is working as designed.

Seer fixes Seer: How Seer pointed us toward a bug and helped fix an outage

Seer is our AI agent that takes bugs and uses all of the context Sentry has to find the root cause and suggest a fix. We use it all the time to help us improve Sentry. Seer fixes Sentry. More recently, Seer has been helping us fix itself — Seer fixing Seer. An upstream outage triggered a bit of an avalanche, revealing a bug that had been hiding away for months. When it came time to fix it, Seer pointed us exactly where we needed to look.

How to monitor LLMs in production with Grafana Cloud, OpenLIT, and OpenTelemetry

Moving a large language model (LLM) application from a demo to a production‑scale service raises very different questions than the ones you ask when playing with an API key in a notebook. In production, you have to answer: How much is each model costing us? Are we keeping latency within our service‑level objectives? Are we accidentally returning hallucinations or toxic content? Is the system vulnerable to prompt‑injection attacks?
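The first question, how much each model costs, comes down to arithmetic over token counts. The sketch below is plain Python and does not use the OpenLIT or Grafana APIs; the model names, the per-million-token prices, and the shape of the call records are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical USD prices per one million tokens; not real vendor pricing.
PRICES = {
    "model-a": {"input": 1.00, "output": 2.00},
    "model-b": {"input": 0.50, "output": 1.50},
}


def cost_by_model(calls, prices=PRICES):
    """Sum the estimated spend per model from recorded token counts."""
    totals = defaultdict(float)
    for call in calls:
        rate = prices[call["model"]]
        totals[call["model"]] += (
            call["input_tokens"] * rate["input"]
            + call["output_tokens"] * rate["output"]
        ) / 1_000_000
    return dict(totals)


# Hypothetical call records, as they might be exported from telemetry.
calls = [
    {"model": "model-a", "input_tokens": 1_000_000, "output_tokens": 500_000},
    {"model": "model-b", "input_tokens": 2_000_000, "output_tokens": 0},
]
spend = cost_by_model(calls)
```

In practice the token counts would come from instrumentation rather than a hand-written list, and the price table would need to track the provider's current rates.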

Observe your AI agents: End-to-end tracing with OpenLIT and Grafana Cloud

In another post in this series, we discussed how to instrument large language model (LLM) calls. This can be a good starting point, but generative AI workloads increasingly rely on agents, which are systems that plan, call tools, reason, and act autonomously. Their non‑deterministic behavior makes incidents harder to diagnose, in part because the same prompt can trigger different tool sequences and costs.

Monitor Model Context Protocol (MCP) servers with OpenLIT and Grafana Cloud

Large language models don’t work in a vacuum. They often rely on Model Context Protocol (MCP) servers to fetch additional context from external tools or data sources. MCP provides a standard way for AI agents to talk to tool servers, but this extra layer introduces complexity. Without visibility, an MCP server becomes a black box: you send a request and hope a tool answers. When something breaks, it’s hard to tell if the agent, the server or the downstream API failed.