Operations | Monitoring | ITSM | DevOps | Cloud

Top AI Prompts for Engineering Leaders using the Cortex MCP

AI assistants have transformed how developers work. Coupled with the Cortex MCP, which connects AI assistants directly to live service data, ownership records, and organizational standards, they now give developers accurate, context-rich answers about their services and standards right in their IDE. But what about engineering leaders? Your opportunities with AI assistants extend far beyond code generation.

How to Prove DNS Monitoring ROI to Clients (Without Getting Technical)

Most clients don’t care how DNS works—until it breaks. But as an MSP, you know the damage a single DNS misconfiguration or unnoticed change can cause. So how do you prove the ROI of DNS monitoring to clients who don't speak in TTLs or CNAMEs? Here’s how to bridge the gap between technical benefits and business value—so your clients understand exactly why they’re paying for DNS protection.

A complete security view for every Ubuntu LTS VM on Azure

Azure’s Update Manager now shows missing Ubuntu Pro updates for all Ubuntu Long-Term Support (LTS) releases: 18.04, 20.04, 22.04 and 24.04. The feature was first introduced for only 18.04 during its move to Expanded Security Maintenance. With this addition, Azure highlights where Ubuntu LTS instances would benefit from Expanded Security Maintenance updates if the administrator attaches an Ubuntu Pro license, even for instances running more recent Ubuntu releases.

IT Alerting: Everything You Need to Know

Behind every reliable service is a team of people watching for problems. But they don’t stare at screens all day. They rely on IT alerting systems. An IT alerting system tells you when something is wrong. It finds problems fast, so your team can fix them before your business or customers are affected. This article will explain everything you need to know about IT alerting. You’ll learn what it is, why you need it, how to set it up, and which tools work best.

Grafana Mimir: 3 reasons to run the TSDB for Prometheus on bare metal

Wilfried Roset is an engineering manager who leads an SRE team and is a Grafana Champion. Wilfried currently works at OVHcloud, where he focuses on prioritizing sustainability, resilience, and industrialization to guarantee customer satisfaction. Whether it’s for efficient resource allocation, flexibility, high availability, or scalability, it makes a lot of sense to run Grafana Mimir on Kubernetes, but it’s not the only way to deploy Mimir.

Instrument your Azure Container Apps workloads with the new Datadog Agent sidecar

Modern application development is evolving rapidly, with serverless containers and microservices becoming the standard for scalable, resilient architectures. Azure Container Apps is at the forefront of this movement, enabling developers to deploy containerized applications without having to manage infrastructure.

Identify slowdowns across your entire network with Datadog Network Path

As modern infrastructure becomes increasingly distributed across on-premises data centers, multi-cloud environments, ISPs, and remote offices, understanding how traffic flows across your network is critical to delivering reliable performance and great user experiences. But pinpointing the source of network slowdowns remains one of the most persistent challenges for operations, network, and IT teams.

Debugging Slow PHP Applications with APM Tools

A slow PHP application in production is not just a performance issue; it poses a significant risk to business operations and user satisfaction. Slow page loads frustrate users, increase bounce rates, and directly impact revenue. For developers, the bigger challenge is that these slowdowns often hide deep in the code, database queries, or external dependencies, making them hard to find.

You Can't Keep Hiring: It's Time to Rethink Operations With AI

Operations has always been a headcount game. More systems mean more people, with human judgment as the irreplaceable element at the end of every alert chain. This fundamental relationship between complexity and operators has defined how we’ve built and run operations infrastructure for decades. But modern product velocity and complexity outpace any organization’s ability to hire and train operators.

Self-Service Query UI for Logs in Azure Data Explorer (ADX)

This video focuses on how to create a self-service user interface (UI) for querying logs using Azure Data Explorer (ADX) and the Business Activity Monitoring (BAM) module. Perfect for developers and business users aiming to gain actionable operational insights from log data with simple visualizations and monitoring.