
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Detect and map third-party outages with Datadog External Provider Status

Modern applications depend on dozens of external cloud platforms, APIs, and SaaS services to function. But when those providers experience issues, engineers often spend valuable time asking a basic question: Is the problem with us or with them? Provider-maintained status pages are often slow to update, leaving teams waiting for confirmation while incidents escalate. This delay prolongs investigations and risks customer trust.

Optimize HPC jobs and cluster utilization with Datadog

High-performance computing (HPC) environments support some of the most critical workloads in the world—from asset pricing models in financial institutions to molecular simulations in drug discovery. These workloads often span hundreds of thousands of cores, depend on specialized infrastructure such as GPUs, and run for extended periods. As a result, performance and efficiency are critical.

Introducing Updog.ai: Real-time provider status from Datadog

When external SaaS providers or cloud services degrade or go down, engineers often find themselves wondering if the issue they're encountering is local or more widespread. The answers they find are usually slow to surface, limited in detail, or entirely dependent on the provider's updates. Vendor-controlled status pages and third-party aggregators don’t provide the timely, independent visibility that's necessary to quickly and accurately identify the root cause of slowdowns.

What is Open Telemetry? The Future Is Here

Watch SolarWinds tech evangelist Sascha Giese dive into OpenTelemetry (OTel) and explain why a vendor-agnostic standard is the future of observability and application performance monitoring (APM). If you’ve ever wondered what OpenTelemetry is, Sascha’s presentation is a great place to start, or to dive back into the topic.

What Is an Email Blacklist?

An email blacklist is a database that lists IP addresses or domains suspected of sending spam or malicious emails. Mail servers use these lists to decide whether to deliver or reject incoming messages. Understanding how blacklists work is essential for keeping your messages deliverable and your domain reputation intact.
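As a rough illustration of how a mail server might consult such a list, the sketch below assumes a DNS-based blacklist (DNSBL), which is one common way these databases are published. In that scheme the sender's IPv4 octets are reversed, the list's zone is appended, and the resulting name is looked up in DNS; an answer in the 127.0.0.0/8 range conventionally means "listed", while no record means "not listed". The zone `dnsbl.example.org` and the function names are hypothetical placeholders, not a real service or API.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolver=socket.gethostbyname) -> bool:
    """Return True if the (assumed) DNSBL zone has an entry for this IP.

    `resolver` is injectable so the logic can be exercised without network access.
    """
    try:
        answer = resolver(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such record: the address is not on this list.
        return False
    # By convention, DNSBLs answer with an address in 127.0.0.0/8 when listed.
    return answer.startswith("127.")


if __name__ == "__main__":
    # Hypothetical zone; 192.0.2.1 is a reserved documentation address.
    print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
```

A receiving server would typically run a check like this against the connecting IP before accepting a message, then reject, flag, or deliver based on the result and its own policy.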

AI-Powered Translation Tools: A Hidden Asset for Scaling DevOps Globally

DevOps teams, which combine development (Dev) and IT operations (Ops), are no longer confined to single geographic locations or language groups. With over 80% of organizations now practicing DevOps (a figure projected to reach 94% in the near future), the challenge of scaling operations globally has never been more critical. Yet one persistent bottleneck continues to slow down even the most sophisticated DevOps workflows: language barriers.

Get started with Grafana Alerting: Route alerts using dynamic labels

In this tutorial, you will learn how to configure notification policies for dynamic routing based on query values. Don't miss the rest of the "Get started with Grafana Alerting" series! Each part dives into a different feature to help you get the most out of alerting in Grafana.

Demo of Raygun's remote MCP

This Raygun remote MCP demo highlights the new depth of context available. The agent isn’t just fetching error lists; it’s reasoning through stack traces to find the issues. Combined with the ability to view associated deployment versions, browser information, breadcrumbs, customer data, and more, the agent becomes far more capable of solving errors. We’ve even heard of some early testers going from having errors in production to having them solved within minutes.