Unlocking Edge AI: a collaborative reference architecture with NVIDIA

The world of edge AI is rapidly transforming how devices and data centers work together. Imagine healthcare tools powered by AI, or self-driving vehicles making real-time decisions. These advancements rely on bringing AI directly to edge devices. However, building a robust architecture for diverse edge environments presents significant hurdles. This blog introduces our new reference architecture, designed to simplify edge AI deployment.

Using CircleCI to test and deploy Python serverless functions on Microsoft Azure

Serverless computing simplifies app development by abstracting away server management. Azure Functions provides a robust platform for event-driven, on-demand code execution. In this tutorial, we’ll create and deploy a Python-based Azure Function, one that parses incoming JSON, using CircleCI. For more granular control and programmatic access to Azure resources, we’ll use a service principal for secure authentication and the Azure CLI orb to streamline our CI/CD pipeline.
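As a rough illustration of the JSON-parsing step, here is a minimal sketch that uses only the Python standard library. The function name `parse_payload` and the response shape are hypothetical and not taken from the tutorial; in a real Azure Function this logic would sit inside the HTTP-triggered handler rather than stand alone.

```python
import json


def parse_payload(body: bytes) -> tuple[int, dict]:
    """Parse a raw JSON request body and return (status_code, response)."""
    try:
        data = json.loads(body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Malformed or undecodable input: report a client error.
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(data, dict):
        # Assumption for this sketch: only JSON objects are accepted.
        return 400, {"error": "expected a JSON object"}
    return 200, {"keys": sorted(data.keys()), "count": len(data)}
```

Keeping the parsing in a plain function like this makes it easy to unit test in a CI pipeline without needing the Azure runtime.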

Proactive Monitoring: How DinoCloud Uses CloudWatch to Save Clients Money

At MetricFire, we love talking with engineers about their tech stacks, SRE challenges, and how they approach infrastructure monitoring. Recently, we had a great chat with Yoimer Roman from DinoCloud, a Latin American company that helps clients make smarter business decisions by leveraging AWS CloudWatch monitoring. Yoimer wears many hats: mentoring his team on all things AWS, designing custom cloud environments, and bridging the gap between technical challenges and non-technical stakeholders.

IT Governance Software: Best Options and Key Features to Look For

IT governance software helps organizations take control of their IT strategy by providing frameworks, automation, and monitoring tools to oversee IT operations. This software is the best ally for managing IT effectively, a process that requires more than just keeping systems running: it involves aligning technology with business goals, addressing security risks, and meeting compliance standards.

Rethinking WhatsApp Alerts - A Data-Driven Approach

WhatsApp has become a major alerting channel for incident response teams. It's popular and, for many, a great alternative to SMS. In our 2024 recap, we mentioned how Spike sent over 25,000 alerts on WhatsApp. It is now the 2nd most used alert channel for responders on Spike (rising from 4th spot in 2023). But... I will be the first one to admit – the WhatsApp alerts experience needed work to help responders react to incidents quicker!

What Your Mobile Devices and Favorite Jeans Might Have in Common

We can all agree that we need our mobile devices to be as secure as possible. No one wants to be hacked. No one wants to deal with the fallout of a breach. If you’re a small business owner, you could be out of business in six months because of how hard it is to recover from a single cybersecurity incident. If you’re in charge of a larger business, you might have to clean up the damage caused by leaked data for years.

Introducing Coralogix's AI Center: Real-time AI Observability

Traditional observability wasn't built for AI. The reason? AI operates in shades of grey, where outcomes are non-deterministic. That's why we built the AI Center, bringing real-time AI observability to thousands of enterprises worldwide. As part of our AI Center, we built an evaluation engine designed to oversee agents and detect the issues that are most common when building them. Teams can choose the evaluators they want to oversee each agent and receive live alerts and reports on specific quality, security, and compliance issues.

Going beyond MTTx: measuring what "good" incident management looks like

Traditional MTTx metrics have long been the go-to measure for incident management effectiveness, but they often fail to provide a full picture or drive meaningful improvements. We analyzed data from over 100,000 incidents to develop new industry benchmark metrics that better define what "good" incident management looks like.
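For readers unfamiliar with the MTTx family (mean time to acknowledge, mean time to resolve, and so on), the sketch below shows how such an average is typically computed from incident timestamps. The field names and sample incidents are hypothetical and are not drawn from the dataset mentioned above.

```python
from datetime import datetime
from statistics import mean


def mean_minutes(incidents, start_key, end_key):
    """Average elapsed minutes between two timestamps across incidents."""
    return mean(
        (i[end_key] - i[start_key]).total_seconds() / 60 for i in incidents
    )


# Hypothetical sample data, for illustration only.
incidents = [
    {
        "opened": datetime(2025, 1, 1, 10, 0),
        "acked": datetime(2025, 1, 1, 10, 5),
        "resolved": datetime(2025, 1, 1, 10, 45),
    },
    {
        "opened": datetime(2025, 1, 2, 12, 0),
        "acked": datetime(2025, 1, 2, 12, 15),
        "resolved": datetime(2025, 1, 2, 13, 15),
    },
]

mtta = mean_minutes(incidents, "opened", "acked")     # mean time to acknowledge
mttr = mean_minutes(incidents, "opened", "resolved")  # mean time to resolve
```

A single average like this hides the spread between incidents, which is one reason such metrics may fail to give a full picture.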

Announcing HAProxy ALOHA 17.0

HAProxy ALOHA 17.0 is now available, delivering powerful new features that improve UDP load balancing, simplify network management, and enhance performance. With this release, we’re introducing the new UDP Module and extending network management to the Data Plane API, a new API-based approach to network configuration. The Network Management CLI is enhanced with exit status codes and contextual help.