Operations | Monitoring | ITSM | DevOps | Cloud

Datadog On-Call, Code Analysis & More - This Month's Updates! #Observability #opentelemetry

On This Month in Datadog, we’re bringing you a bonus episode to spotlight Datadog On-Call, which is now generally available, and covering other updates, including the general availability of Code Analysis and our expanded integration with Pinecone.

From writing code to running a company of 300+ employees

Today we break down another exciting edition of Founders and Friends, the podcast we’ve created to hold conversations with visionary leaders shaping the tech industry. Today’s conversation features Paul Stovell, co-founder and CEO of Octopus Deploy, and of course, JD Trask, co-founder and CEO of Raygun. Together, they explore the realities of running software businesses, from the evolving nature of agile practices to scaling software teams efficiently.

Top Google Cloud Platform (GCP) Services Explained with Use Cases

Google Cloud Platform (GCP) is a suite of cloud computing services that runs on the same infrastructure Google uses internally for its products, such as Google Search and YouTube. With a global network of data centers, GCP offers over 200 fully managed services spanning compute, storage, databases, AI/ML, analytics, networking, and more, enabling businesses to innovate and scale without heavy upfront infrastructure costs.

This Month in Datadog: Datadog On-Call is now generally available

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit Cloud Monitoring as a Service | Datadog. This month, we put the Spotlight on Datadog On-Call.

Intro to Synthetic Monitoring

Welcome to the second video of our new series, Frontend Observability & Monitoring! Datadog Synthetic Monitoring is a proactive monitoring solution that enables you to create code-free API, browser, and mobile tests to automatically simulate end-user workflows and requests on your front-end applications. This video will walk you through setting up browser and API testing capabilities so you can keep tabs on your application uptime and ensure a reliable user experience.

Designing for Scale: How eG Enterprise Manages Millions of Metrics with AIOps-driven Self-Monitoring

Customers evaluate a modern observability and monitoring solution by the ROI they get, and self-monitoring capabilities ultimately improve its scalability and quality. The value of any observability solution lies in its ability to proactively detect and alert customers to issues before they cause a business-impacting outage. IT infrastructures and applications can fail in many different ways.

Optimizing Contract Management at Icertis with Datadog

Icertis is a leading contract lifecycle management (CLM) platform that empowers organizations to manage their contracts effectively from initiation to renewal. By leveraging advanced AI and analytics, Icertis helps businesses ensure compliance, mitigate risks, and drive better decision-making. The integration of Datadog has tripled the speed of incident detection and resolution, achieving a 20-30% reduction in overall MTTR and saving approximately $500,000 USD through optimized infrastructure scaling at Icertis.

New Relic Cost Optimization: 9 Surefire Ways To Cut Your Observability Costs

New Relic has established itself as a top observability platform with full-stack monitoring. Unifying all telemetry data — metrics, events, logs, and traces — into one platform delivers deep performance insights and enables faster troubleshooting without juggling multiple tools. Also, New Relic prioritizes developers with tools like CodeStream, integrating error details and telemetry directly into the IDE.

eG Innovations' AIOps-Powered Approach for Optimizing Digital Workspaces and ITOM

eG Innovations brings a unique AIOps-powered approach to IT Service Management (ITSM) and IT Operations Management (ITOM) cycles for managing digital workspaces. The eG Enterprise platform is equipped with capabilities for automated corrective actions, event-based triggers, and remote-control functionalities.

Managing code quality at scale with NDepend

Ensuring code quality at scale is one of the biggest challenges in software development. As applications grow in size and complexity, producing high-quality, maintainable code becomes increasingly vital. In a recent conversation on the Founders and Friends podcast, Raygun CEO John-Daniel Trask (JD) sat down with Patrick Smacchia, founder of NDepend, to discuss how this tool is revolutionizing .NET development.

Benefits of combining the trifecta of APM, RUM, and synthetic monitoring in IT operations

APM is foundational in assessing an application's internal health. It employs a variety of tools and techniques to monitor crucial metrics such as response times, error rates, and resource utilization. This comprehensive analysis enables teams to identify bottlenecks, slow database queries, and other potential performance-related issues that could diminish the user experience.

Unlock better Flutter error insights with native symbols support

We’re excited to announce that native symbols support for Flutter is now live in Raygun Crash Reporting! If you’ve ever struggled with obfuscated stack traces in your Flutter apps, this update will simplify your debugging workflow and give you more actionable insights into app crashes.

What is the Digital Operational Resilience Act (DORA)? Everything you need to know about DORA compliance.

The Digital Operational Resilience Act (DORA) is a European Union regulation designed to enhance the digital operational resilience of financial institutions and their critical third-party ICT (Information and Communication Technology) service providers. DORA has two primary objectives.

How to Use Service Level Objectives (SLOs) in Your IT Monitoring

With countless companies delivering their services digitally, reliability and performance are more important than ever. Whether you want to keep your website running 24/7 or ensure your application is responsive to user actions, you need a dependable way to measure your services’ performance and ensure they meet your requirements and those of your customers. Central to this is SLO (Service Level Objective). SLOs are targets that outline the expected performance of a particular service.
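As a rough illustration of how an SLO target turns into numbers a team can track, the sketch below computes availability and the remaining error budget for one measurement window. The function name, its parameters, and the request counts are hypothetical and chosen only for this example; real monitoring tools define and calculate SLOs in their own ways.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarise SLO compliance for a single measurement window.

    slo_target is a fraction, e.g. 0.999 for a "99.9% of requests succeed" objective.
    All names here are illustrative assumptions, not any vendor's API.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The error budget is the number of failures the objective tolerates.
    allowed_failures = (1 - slo_target) * total_requests
    # Measured availability over the window.
    availability = 1 - failed_requests / total_requests
    # Fraction of the budget still unspent (negative once the SLO is breached).
    budget_remaining = 1 - failed_requests / allowed_failures

    return {
        "availability": availability,
        "allowed_failures": allowed_failures,
        "budget_remaining": budget_remaining,
        "slo_met": availability >= slo_target,
    }


if __name__ == "__main__":
    # Hypothetical window: 100,000 requests, 25 of which failed, against a 99.9% target.
    print(error_budget(0.999, 100_000, 25))
```

With a 99.9% target over 100,000 requests, roughly 100 failures are tolerated, so 25 failures leave about three quarters of the budget unspent.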

Why UX Friction is Killing Your Growth (...and How to Fix It)

Ever clicked around a website, got frustrated, and just left? Yeah, so have 88% of users. Once they have a bad experience, they don’t come back. (Google) Friction is the thing that ruins smooth experiences. It makes people abandon carts, close apps, and shake their heads at slow-loading pages. And the worst part? Most businesses don’t even realize it’s happening. Let’s talk about what UX friction really is, how to spot it, and—most importantly—how to fix it.

Datadog on LLMs: From Chatbots to Autonomous Agents

As companies rapidly adopt Large Language Models (LLMs), understanding their unique challenges becomes crucial. Join us for a special episode of "Datadog On LLMs: From Chatbots to Autonomous Agents," streaming directly from DASH 2024 on Wednesday, June 26th, to discuss this important topic. In this live session, host Jason Hand will be joined by Othmane Abou-Amal from Datadog’s Data Science team and Conor Branagan from the Bits AI team. Together, they will explore the fascinating world of LLMs and their applications at Datadog.

Raygun's 2024 in review: New features that empower developers

As 2024 wraps up, we’re taking a moment to look back at the updates and tools we launched to make your life as a developer and Raygun user easier. This year, we focused on enhancing how you monitor errors, track performance, and optimize user experiences. Here’s a breakdown of the key features we shipped in 2024.

Monitoring in the Age of the Internet: DEM, IPM, and APM - What You Need to Know

Gartner recently published the first ever Magic Quadrant for Digital Experience Monitoring (DEM). This landmark report raises important questions about what DEM is and why we need a new category now. It also prompts discussions about how DEM, Internet Performance Monitoring (IPM), and Application Performance Monitoring (APM) relate to each other and what roles they play in modern monitoring strategies.

Engineering AI systems with Model Context Protocol

On November 26, 2024, Anthropic released the Model Context Protocol (MCP), an open standard for data exchange between applications and data sources. MCP simplifies how Large Language Models (LLMs) interact with external tools and data, addressing the challenges developers face when integrating AI into their systems. At Raygun, we've been exploring agentic workflows to improve productivity and saw real potential in MCP. This post will explain how MCP works, what we've implemented, and where we think the standard is headed.

DataDog vs Prometheus - Comprehensive Comparison Guide [2025]

Both Datadog and Prometheus are application monitoring tools aimed at improving application performance. While Datadog is a cloud-based SaaS solution, meaning there's no need to install or maintain any infrastructure, Prometheus is an open-source tool that requires manual download and installation on your infrastructure. Let us compare Datadog and Prometheus to see which one suits your needs. The biggest difference between Datadog and Prometheus is that while Prometheus is open-source, Datadog is proprietary.

Kibana vs. Grafana - A Scenario-Based Decision Guide [2025]

Both Kibana and Grafana are data visualization tools providing users capabilities to explore, analyze and visualize data with dashboards. The difference between Kibana and Grafana lies in their genesis. Kibana was built on top of the Elasticsearch stack, famous for log analysis and management. In comparison, Grafana was created mainly for metrics monitoring supporting visualization for time-series databases.

Top 14 ELK alternatives [open source included] in 2025

ELK is an acronym for Elasticsearch, Logstash, and Kibana. Combined, they form one of the most popular log analytics tools. Elastic changed the license of Elasticsearch and Kibana from the fully open Apache 2 license to a proprietary dual license. The ELK stack is also hard to manage at scale. In this article, we will discuss 14 ELK alternatives that you can consider using.

Top 11 Grafana Alternatives & Competitors [2025]

Are you looking for Grafana alternatives? Then you have come to the right place. Grafana started as a data visualization tool. It slowly evolved into a tool that can take data from multiple data sources for visualization. For observability, Grafana offers the LGTM stack (Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics). You need to set up and maintain multiple configurations for a full-stack observability setup.

The Ultimate Guide to API Monitoring in 2025 - Metrics, Tools, and Proven Practices

According to Akamai, 83% of web traffic is through APIs. Microservices, servers, and clients constantly communicate to exchange information. Even the Google search you made to reach this article involved your browser client calling Google APIs. Given APIs govern the internet, businesses rely on them heavily. API health is directly proportional to business prosperity. This article covers everything about API monitoring, so your API infrastructure’s health is always in check.

What is Event Correlation? And Why Does Event Correlation Matter when Monitoring?

Event correlation in the context of an AIOps (Artificial Intelligence for IT Operations) monitoring tool, such as eG Enterprise, is the automated process of analyzing and linking related IT events to identify patterns, root causes, and significant incidents within complex IT environments. By correlating events from various sources (like servers, applications, networks, and databases), AIOps tools help IT teams manage alerts more efficiently, reduce noise, and address issues faster and more effectively.
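To make the idea of linking related events concrete, here is a minimal sketch of one simple heuristic: grouping events that arrive close together in time. It is not how eG Enterprise or any particular AIOps tool works. The event fields (`timestamp`, `source`, `message`), the `correlate` function, and the 60-second window are all assumptions made for illustration, and production tools typically also use topology and dependency information.

```python
def correlate(events, window_seconds=60.0):
    """Group events into candidate incidents by time proximity.

    An event joins the current group when it occurs within window_seconds
    of the previous event in that group; otherwise it starts a new group.
    Each event is a dict with "timestamp" (seconds), "source", and "message".
    """
    groups = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if groups and event["timestamp"] - groups[-1][-1]["timestamp"] <= window_seconds:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups


if __name__ == "__main__":
    # Hypothetical alerts from different layers of one environment.
    alerts = [
        {"timestamp": 0.0, "source": "database", "message": "slow queries"},
        {"timestamp": 10.0, "source": "application", "message": "request timeouts"},
        {"timestamp": 30.0, "source": "network", "message": "packet loss"},
        {"timestamp": 500.0, "source": "server", "message": "disk nearly full"},
    ]
    for incident in correlate(alerts):
        print([f'{e["source"]}: {e["message"]}' for e in incident])
```

In this sample, the first three alerts collapse into one candidate incident and the later disk alert stands alone, which is the kind of noise reduction the paragraph above describes.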

Top 5 Azure Monitoring tools to maximize application and service performance

Many organizations migrate their workloads to the cloud or begin leveraging what the cloud offers. However, to keep their businesses up and running during this process, organizations still need to integrate their cloud systems, like Dynamics 365, Salesforce, and ServiceNow, with Azure Integration Services (AIS) and potentially with on-premises systems. One crucial aspect of such integrations is keeping them healthy and available, which requires monitoring and diagnostics.