Operations | Monitoring | ITSM | DevOps | Cloud

AI Observability Deep Dive Demo | Grafana Cloud

Grafana AI Observability is our new database and platform for observing AI Agents. Over the past year at Grafana Labs, we built Agents and needed a way to understand how they perform: what they cost, what their error rate and time to first token are, and how they behave. Grafana Staff Engineer Ivana Hučková provides a deep-dive demo of how Grafana AI Observability connects our experience building Agents with our experience building observability systems.

Shipped: Keep your cost allocation logic out of the wrong hands

CostFormation is how your organization models cost allocation. As more teams adopt it, protecting that logic matters. RBAC for CostFormation Namespaces lets you scope access at the namespace level, so the right people can view and edit Dimensions, and everyone else can’t.

Introducing Bits Agent Builder: Build agentic workflows for alert response and remediation

Building automated workflows that adapt to real-world complexity can be a challenge. As systems scale and scenarios multiply, teams often end up hardcoding endless logic branches just to handle every potential outcome. That’s why we’re introducing Bits Agent Builder, a powerful new tool that lets you create custom AI agents that are fully hosted by Datadog.

What Is IPoDWDM? A Guide to Converged IP and Optical Networking

IP over Dense Wavelength Division Multiplexing (IPoDWDM) is a network architecture that integrates optical transmission capabilities directly into IP networking equipment such as routers and switches. This approach represents a significant evolution from traditional network designs, where IP and optical layers were managed as separate domains with distinct hardware and operational teams.

Autonomous IT Is Here. Are You Prepared?

Enterprise IT was built for a more predictable workplace, where support began when an employee reported a problem and IT worked backward from the details they could provide. That model made sense when devices, applications, and ways of working were easier to control. Today, the digital workplace moves too quickly for IT to rely on reported issues alone. By the time a ticket appears, employees may have already lost time, worked around the problem, abandoned the tool, or turned to an unmanaged alternative.

UK GDPR compliance for cloud & hosting: requirements, risks and responsibilities

UK organisations using cloud services carry a clear legal obligation: they must demonstrate compliance with UK GDPR and the Data Protection Act 2018, not simply assert it. The shift to cloud and hosted infrastructure does not transfer that responsibility to a provider. It distributes it across a chain of controllers and processors that regulators expect you to understand and manage. Post-Brexit, that obligation is set within a distinct legal framework.

Anomaly Detection and Forecasting That Learns From Every Write in InfluxDB

For many operational time series workloads, machine learning can’t operate in the historical way, where data is compiled once and models are trained offline. Sensor readings, infrastructure metrics, application telemetry, energy data, industrial measurements, and financial ticks all share a basic property: the next datapoint is more useful when the system can respond to it immediately (or at least close to immediately).

Why Is Observability Essential for Platform Engineers?

Observability is how platform teams stop being the answer to every question and start building platforms that answer those questions themselves. This article explains specifically how observability enables platform engineers to support development teams better by reducing ticket volume, cutting MTTR, enabling SLO ownership, and making microservice debugging something devs can do without escalating to you.