
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

High Availability by Design: WhatsUp Gold Strategic Shift from Failover

As IT environments grow more distributed and resilient, the Progress WhatsUp Gold network monitoring solution is evolving to meet the moment. Starting in early 2026, Progress will officially retire the legacy Failover Manager and usher in a new era of high availability (HA) by design. This modern, scalable approach aligns with today’s best practices in infrastructure.

How to Prove DNS Monitoring ROI to Clients (Without Getting Technical)

Most clients don’t care how DNS works—until it breaks. But as an MSP, you know the damage a single DNS misconfiguration or unnoticed change can cause. So how do you prove the ROI of DNS monitoring to clients who don't speak in TTLs or CNAMEs? Here’s how to bridge the gap between technical benefits and business value—so your clients understand exactly why they’re paying for DNS protection.

Grafana Mimir: 3 reasons to run the TSDB for Prometheus on bare metal

Wilfried Roset is an engineering manager who leads an SRE team and is a Grafana Champion. Wilfried currently works at OVHcloud, where he focuses on prioritizing sustainability, resilience, and industrialization to guarantee customer satisfaction. Whether it’s for efficient resource allocation, flexibility, high availability, or scalability, it makes a lot of sense to run Grafana Mimir on Kubernetes, but it’s not the only way to deploy Mimir.

Instrument your Azure Container Apps workloads with the new Datadog Agent sidecar

Modern application development is evolving rapidly, with serverless containers and microservices becoming the standard for scalable, resilient architectures. Azure Container Apps is at the forefront of this movement, enabling developers to deploy containerized applications without having to manage infrastructure.

Identify slowdowns across your entire network with Datadog Network Path

As modern infrastructure becomes increasingly distributed across on-premises data centers, multi-cloud environments, ISPs, and remote offices, understanding how traffic flows across your network is critical to delivering reliable performance and great user experiences. But pinpointing the source of network slowdowns remains one of the most persistent challenges for operations, network, and IT teams.

Debugging Slow PHP Applications with APM Tools

A slow PHP application in production is not just a performance issue; it poses a significant risk to business operations and user satisfaction. Slow page loads frustrate users, increase bounce rates, and directly impact revenue. For developers, the bigger challenge is that these slowdowns often hide deep in the code, database queries, or external dependencies, making them hard to find.

Alerting Best Practices

A firing alert is like someone ringing your doorbell - it demands your immediate attention, interrupting whatever else you’re doing. It requires focus and a quick response. But imagine trying to live in an apartment where the doorbell never stops ringing. You could put in earplugs to block the noise, but that only masks the problem - it doesn’t solve it. On the other hand, disconnecting the doorbell entirely isn’t a solution either.
Sponsored Post

Atlassian Bitbucket Monitoring on Microsoft SCOM

As part of a customer project, we developed a custom Bitbucket Management Pack for Microsoft System Center Operations Manager (SCOM). This tailored solution enables IT operations teams to monitor key performance and health metrics of Bitbucket environments, ensuring planning and bug-tracking platforms remain available and performant. With this Use Case paper, we aim to share our knowledge with the SCOM community, highlighting the possibilities of advanced monitoring on Microsoft SCOM and helping teams improve their day-to-day tasks.