What You Actually Need to Monitor AI Systems in Production

You did it. You added the latest AI agent into your product. Shipped it. Went to sleep. Woke up to find it returning a blank string, taking five seconds longer than yesterday, or confidently outputting lies in perfect JSON. Naturally, you check your logs. You see a prompt. You see a response. And you see nothing helpful. Surprise. Prompt in and response out is not observability. It is vibes.
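As a minimal sketch of what "more than prompt in, response out" might look like, the wrapper below records the three symptoms named above for each call: latency, blank output, and whether the output parses as JSON. The names `CallRecord` and `observe_call` are hypothetical, and `model` stands in for whatever client function you actually call; none of this comes from a specific library.

```python
import json
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CallRecord:
    """Per-call signals worth logging alongside the raw prompt and response."""
    prompt: str
    response: str
    latency_s: float
    is_empty: bool
    is_valid_json: bool


def _parses_as_json(text: str) -> bool:
    # json.JSONDecodeError subclasses ValueError, so this covers parse failures.
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def observe_call(model: Callable[[str], str], prompt: str) -> CallRecord:
    """Call a (hypothetical) model function and capture basic health signals."""
    start = time.perf_counter()
    response = model(prompt)
    latency_s = time.perf_counter() - start
    return CallRecord(
        prompt=prompt,
        response=response,
        latency_s=latency_s,
        is_empty=response.strip() == "",
        is_valid_json=_parses_as_json(response),
    )
```

Emitting these records to your existing metrics pipeline lets you alert on a rising empty-response rate or a latency regression. Note that a valid-JSON flag says nothing about whether the content is true; catching confident but wrong answers needs a separate evaluation step.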

Critical RCE Vulnerability in mcp-remote: CVE-2025-6514 Threatens LLM Clients

The JFrog Security Research team recently discovered and disclosed CVE-2025-6514, a critical (CVSS 9.6) security vulnerability in the mcp-remote project, a popular tool used by Model Context Protocol clients. The vulnerability allows attackers to trigger arbitrary OS command execution on the machine running mcp-remote when it initiates a connection to an untrusted MCP server, putting users at risk of full system compromise.

Introducing the InfluxDB 3 MCP Server: Natural Language for Time Series

Time series data underpins all real-time systems. From high-resolution telemetry to long-range trends, it’s essential for monitoring, automation, predictive maintenance, and operational insight. But it’s also hard to work with: high cardinality, shifting schemas, and time-based queries make even basic tasks feel heavy.

AI-Enabled Network Management: Revolutionize Operator Workflows with AI Agents

For today's leading service providers and large enterprises, ensuring peak performance requires navigating a labyrinth of data streams, monitoring tools, and legacy systems. This often leaves network operators spending more time searching for information than acting on it. A new era of AI-enabled network management is dawning, promising to upend these cumbersome workflows.

Dashboard Sharing - The Hard Way

Unlike menu items, dashboards in Icinga Web 2 currently can’t be shared across users. This is something we will implement in future versions, but for now users can only create dashboards for themselves. We don’t have an exact timeline for the dashboard sharing feature yet and our roadmap is already pretty packed for this year, so we won’t be tackling this until later next year.

Security is a leading priority for 2025

The Cloudsmith 2025 Artifact Management Report offers timely insights into how engineering and DevOps teams are evolving their approach to software artifact management and software supply chain security. With supply chain attacks on the rise and Generative AI reshaping development practices, teams are reevaluating how they manage, secure, and scale their artifact repository infrastructure.

Reduce your mean time to repair with the Datadog mobile app

For on-call engineers responding to alerts, every minute counts. Faster incident response means faster mitigation, reduced downtime, and a better customer experience. But even the most finely tuned, meticulously detailed alerts can leave responders scrambling for more information. To triage and investigate incidents effectively and set remediation in motion, responders need data that puts alerts in context.

Monitor your LiteLLM AI proxy with Datadog

As organizations rapidly scale their use of large language models (LLMs), many teams are adopting LiteLLM to simplify access to a diverse set of LLM providers and models. LiteLLM provides a unified interface through both an SDK and proxy to speed up development, centralize control, and optimize LLM-powered workflows. But introducing a proxy layer adds abstraction, making it harder to understand how requests are processed.

Troubleshoot root causes with GitHub commit and ownership data in Error Tracking

When an error occurs, developers need to act quickly. But too often, they’re left searching through stack traces without enough context to understand what happened, who owns the code, or what change may have introduced the issue. This slows down triage, creates inefficient handoffs, and takes time away from building new features.