
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Top Metrics for CRM companies

CRMs are a valuable tool for businesses to organize their sales and customers. The benefits of having one include increased revenue, better visibility into accounts, automated tasks, and more. But if your CRM is not working properly, it can create challenges for your business. CRM monitoring helps you catch and fix problems before they become apparent. In this article, we’ll show you how to get started with MetricFire.

Monitor Your ZFS Volume Manager With Telegraf

ZFS (Zettabyte File System) is a file system and volume manager that has robust data integrity features and uses checksums for every block of data, ensuring that any data corruption is detected and corrected. Additionally, it offers advanced features such as pooled storage, efficient snapshots and cloning, built-in data compression, deduplication, and high scalability, making it ideal for large-scale and high-performance storage environments.

Going for gold: Testing the resilience of Olympic websites

As the world gears up for the Paris Olympics, it’s not just athletes who need to be in peak condition. This Olympics comes hot on the heels of the largest IT outage in history. Recovery efforts from the CrowdStrike outage are still ongoing. Lessons will be learned, no doubt, but at least one takeaway is already evident: the modern web is an oh-so-fragile thing; neglect digital resilience at your peril.

Monitor Amazon MemoryDB with Datadog

Amazon MemoryDB for Redis is a highly durable in-memory database service that uses cross-availability-zone data storage and fast failover, providing microsecond read times and single-digit-millisecond write times. Datadog’s integration for MemoryDB uses a range of metrics to provide important visibility into MemoryDB performance.

How Network Observability Helps Lay the Foundation of Autonomous IT Operations

We often hear the term "observability" in the context of DevOps and how SREs use telemetry data. Collecting and analyzing this telemetry data is a vital first step to a successful autonomous IT operations strategy. Observability can help you find out about problems in your system you didn’t know you had—and before your users are impacted—by giving you new visibility that your monitoring systems don’t provide. But any observability initiative must also include network observability.

Kubernetes 1.31 - What's new?

Kubernetes 1.31 brings a plethora of enhancements, including 37 line items tracked as ‘Graduating’ in this release. Of these, 11 enhancements are graduating to stable, including the highly anticipated AppArmor support for Kubernetes, which lets you specify an AppArmor profile for a container or pod in the API and have that profile applied by the container runtime.
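
As a rough illustration of what specifying an AppArmor profile in the API can look like, here is a minimal Python sketch that builds a Pod manifest with the `securityContext.appArmorProfile` field set at the pod level. The helper name `pod_with_apparmor`, the pod name `demo`, and the `nginx` image are assumptions made for this example and do not come from the article.

```python
import json


def pod_with_apparmor(name, image, profile_type="RuntimeDefault", localhost_profile=None):
    """Build a Pod manifest (as a dict) with a pod-level AppArmor profile.

    Hypothetical helper for illustration only. profile_type is one of
    "RuntimeDefault", "Localhost", or "Unconfined".
    """
    profile = {"type": profile_type}
    if profile_type == "Localhost":
        # A Localhost profile must name a profile already loaded on the node.
        if not localhost_profile:
            raise ValueError("Localhost profiles need a profile name")
        profile["localhostProfile"] = localhost_profile
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pod-level setting; a container can override it in its own securityContext.
            "securityContext": {"appArmorProfile": profile},
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Print the manifest as JSON, which kubectl also accepts.
    print(json.dumps(pod_with_apparmor("demo", "nginx"), indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, assuming a cluster whose nodes have AppArmor enabled.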

Why Your Telemetry (Observability) Pipelines Need to be Responsive

At Mezmo, we consider Understand, Optimize, and Respond to be the three tenets that help control telemetry data and maximize the value derived from it. We have previously discussed data Understanding and Optimization in depth. This blog discusses the need for responsive pipelines and what it takes to design them.

Kubernetes Monitoring Demo: How to Lower Costs and Improve Fleet Efficiency | Grafana

The Kubernetes Monitoring app in Grafana Cloud helps you visualize infrastructure costs across providers, identify unallocated and idle resources, and visualize and optimize Kubernetes resources. In this video, Vijay Tolani shows how to lower costs and improve fleet efficiency with the Kubernetes Monitoring app in Grafana Cloud.
Sponsored Post

Can the EventSentry Agents cause the same outage and disruption as the CrowdStrike Falcon sensor did?

The faulty Rapid Response Content CrowdStrike update that disabled millions of Windows machines across the globe on 7/19/2024 was any IT professional’s nightmare. Having to manually visit and restore each affected machine (further complicated by BitLocker) severely limited the recovery speed, especially for businesses with remote locations, TVs, kiosks, etc.