Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Sponsored Post

What's new in Avantra 24.2

It's my pleasure to announce the release of Avantra 24.2. As the second update of Avantra 24, it builds on 24.1, which brought performance and customer-requested bug fixes, and adds new innovations and enhancements to our Avantra platform. With over 300 changes in our development management system, Avantra 24.2 feels like a major release to us, and we have something new everywhere you look. Let's dive deeper into the new features.

Going for gold: Testing the resilience of Olympic websites

As the world gears up for the Paris Olympics, it’s not just athletes who need to be in peak condition. This Olympics comes hot on the heels of the largest IT outage in history. Recovery efforts from the CrowdStrike outage are still ongoing. Lessons will be learned, no doubt, but at least one takeaway is already evident: the modern web is an oh-so-fragile thing; neglect digital resilience at your peril.

Kubernetes Monitoring Demo: How to Lower Costs and Improve Fleet Efficiency | Grafana

The Kubernetes Monitoring app in Grafana Cloud helps you visualize infrastructure costs across providers, identify unallocated and idle resources, and visualize and optimize Kubernetes resources. In this video, Vijay Tolani shows how to lower costs and improve fleet efficiency with the Kubernetes Monitoring app in Grafana Cloud.

Why Your Telemetry (Observability) Pipelines Need to be Responsive

At Mezmo, we consider Understand, Optimize, and Respond the three tenets that help control telemetry data and maximize the value derived from it. We have previously discussed data Understanding and Optimization in depth. This blog discusses the need for responsive pipelines and what it takes to design them.

Kubernetes 1.31 - What's new?

Kubernetes 1.31 brings a plethora of enhancements, including 37 line items tracked as ‘Graduating’ in this release. Of these, 11 enhancements are graduating to stable, including the highly anticipated AppArmor support for Kubernetes, which lets you specify an AppArmor profile for a container or pod in the API and have that profile applied by the container runtime.

How Network Observability Helps Lay the Foundation of Autonomous IT Operations

We often hear the term "observability" in the context of DevOps and how SREs use telemetry data. Collecting and analyzing this telemetry data is a vital first step to a successful autonomous IT operations strategy. Observability can help you find out about problems in your system you didn’t know you had—and before your users are impacted—by giving you new visibility that your monitoring systems don’t provide. But any observability initiative must also include network observability.

Monitor Amazon MemoryDB with Datadog

Amazon MemoryDB for Redis is a highly durable in-memory database service that uses cross-availability-zone data storage and fast failover, providing microsecond read times and single-digit-millisecond write times. Datadog’s integration for MemoryDB uses a range of metrics to provide important visibility into MemoryDB performance.

How to Monitor Your Email Services

Verifying email performance takes more than a basic understanding of message flow. Outbound mail in the form of Simple Mail Transfer Protocol (SMTP) and inbound mail through MAPI or Microsoft’s Graph API are only parts of the email system to monitor, usually through pings or basic delivery confirmations. Often, once email is moved to Exchange Online, even basic visibility of mail flow and reliable delivery is lost.

Sponsored Post

Can the EventSentry Agents cause the same kind of outage and disruption as the CrowdStrike Falcon sensor did?

The faulty CrowdStrike Rapid Response Content update that disabled millions of Windows machines across the globe on 7/19/2024 was any IT professional’s nightmare. Having to manually visit and restore each affected machine (further complicated by BitLocker) severely limited the recovery speed, especially for businesses with remote locations, TVs, kiosks, etc.