
The latest news and information on observability for complex systems and related technologies.

Peacetime Observability: Spotting Risks Before They Become Incidents

Most of the time, nothing’s broken. Traffic’s flowing, alerts are quiet, and everything seems fine. That’s peacetime, when no one’s getting paged. Coroot helps in both peacetime and wartime. When things go wrong, it guides you to the root cause fast. But during peacetime, it helps you spot risks early, clean up inefficiencies, and prevent those incidents from happening in the first place.

Why database observability is key to successful cloud data platform adoption

Data is the lifeblood of businesses the world over, from the smallest startup to the largest enterprise. Making sure that it’s available when you need it, secured for authorized use, and recoverable from faults is vital to operating data platforms, no matter where your business is on its cloud journey. This can only be achieved by putting the right data into the hands of the right people, in a timely way, to make the right decisions about how to manage that platform effectively.

Monitoring Backstage with OpenTelemetry: Closing the observability blind spot

‘One small step for a man, but a huge leap for developers’, as I put it when I realised how to observe my Backstage with OpenTelemetry. Backstage is often the “portal” through which we manage all our other systems, but who watches the watcher? We recently gave a KubeCon talk highlighting that monitoring Backstage itself is critical. When Backstage isn’t observable, it becomes a blind spot in your infrastructure.

Syslog Implementation: Servers, Integration and Best Practices

Syslog is a fundamental protocol for collecting messages and event data from various devices and applications across a network. Think of it as a universal language that allows your servers, routers, firewalls, and software to send their operational insights to a central logging point. Born from Unix systems, Syslog has evolved to become the industry standard, forming the backbone of effective log management and providing a unified view of your infrastructure's activity.
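
To make the "universal language" idea concrete, here is a minimal sketch of how a receiver might decode the priority field that opens a syslog message. It assumes the standard `<PRI>` encoding (facility × 8 + severity) from RFC 3164 and RFC 5424. The function name `decode_pri` is hypothetical, and the sample line is the example message given in RFC 3164.

```python
import re

# Severity names indexed by their numeric code (0-7), per RFC 5424.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# A syslog message starts with the priority value in angle brackets, e.g. "<34>".
PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


def decode_pri(message: str) -> tuple[int, str]:
    """Return (facility code, severity name) from the leading <PRI> field.

    Hypothetical helper for illustration; not part of any syslog library.
    """
    match = PRI_PATTERN.match(message)
    if match is None:
        raise ValueError("message does not start with a <PRI> field")
    pri = int(match.group(1))
    if pri > 191:  # highest valid value: facility 23 * 8 + severity 7
        raise ValueError(f"priority {pri} is out of range")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
print(decode_pri(sample))  # (4, 'crit'): facility 4 is security/authorization
```

Because every sender encodes priority the same way, a central collector can filter or route messages by severity without knowing anything else about the device that produced them.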

Kubernetes observability: How to enrich logs with GeoIP using the Kubernetes Monitoring Helm Chart

When your Kubernetes app suddenly has traffic spikes in a distant country, it can be difficult to determine why. Let’s say, for example, we have an e-commerce app that started to receive an unusual surge of visitors from Australia — something we never anticipated. We search for answers in our logs, but without geographic context, we don’t have the full insights we need.
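
As a rough illustration of what GeoIP enrichment means conceptually, the sketch below attaches a country code to a log record based on its client IP. It is not the Helm chart's configuration. The lookup table `GEO_TABLE`, the field names `client_ip` and `geo_country`, and the function `enrich` are all hypothetical, and the address ranges are reserved documentation ranges standing in for a real GeoIP database.

```python
import ipaddress

# Hypothetical CIDR-to-country table standing in for a real GeoIP database.
# The ranges are reserved documentation ranges, not real allocations.
GEO_TABLE = {
    ipaddress.ip_network("203.0.113.0/24"): "AU",
    ipaddress.ip_network("198.51.100.0/24"): "US",
}


def enrich(record: dict) -> dict:
    """Return a copy of a log record with a 'geo_country' field added."""
    enriched = dict(record)
    enriched["geo_country"] = "unknown"
    try:
        addr = ipaddress.ip_address(record.get("client_ip", ""))
    except ValueError:
        # Missing or malformed IP: leave the country as "unknown".
        return enriched
    for network, country in GEO_TABLE.items():
        if addr in network:
            enriched["geo_country"] = country
            break
    return enriched


log = {"path": "/checkout", "status": 200, "client_ip": "203.0.113.42"}
print(enrich(log)["geo_country"])  # AU
```

With the country stored on each record, a spike like the Australian surge described above becomes a simple group-by on one field instead of a manual IP investigation.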

Detect hallucinations in your RAG LLM applications with Datadog LLM Observability

Hallucinations occur when a large language model (LLM) confidently generates information that is false or unsupported. These responses can spread misinformation that jeopardizes safety, causes reputational damage, and erodes user trust. Augmented generation techniques, such as retrieval-augmented generation (RAG), aim to reduce hallucinations by providing LLMs with relevant context from verified sources and prompting the LLMs to cite these sources in their responses.