Use library injection to auto-instrument and trace your Kubernetes applications with Datadog APM

Many organizations rely on distributed tracing in Datadog APM to gain end-to-end visibility into the performance of their Kubernetes applications. But as teams grow, it can become impractical for them to manually configure each new application with the libraries and environment variables needed for tracing.

Monitor Boundary on the HashiCorp Cloud Platform with Datadog

HashiCorp Boundary provides a secure way to manage remote access to applications and infrastructure without exposing the underlying network or credentials. Boundary launched two years ago as an open source solution, and HashiCorp recently announced a fully managed version on the HashiCorp Cloud Platform (HCP), enabling you to manage identity-based authorizations, user and target onboarding, and more for dynamic environments.

Monitor Tanzu Kubernetes Grid on vSphere with Datadog

With vSphere and Tanzu Kubernetes Grid (TKG), VMware enables enterprise organizations to combine the economic advantages of virtual machines (VMs) with the agility, portability, and scalability provided by Kubernetes. vSphere is VMware’s platform for the provisioning and management of VMs.

Best practices to prevent alert fatigue

As your environment changes, new trends can quickly make your existing monitoring less accurate. At the same time, building alerts after every new incident can turn a straightforward strategy into a convoluted one. Treating monitoring as either a one-time effort or a purely reactive one can result in alert fatigue. Alert fatigue occurs when monitoring systems generate an excessive number of alerts, or alerts that are irrelevant or unhelpful, diminishing your ability to see critical issues.

Identify and resolve incidents faster with InsightFinder's offering in the Datadog Marketplace

InsightFinder is a SaaS platform that uses AI-backed predictive analytics to anticipate and prevent production incidents. Using InsightFinder with Datadog, you can quickly identify hidden correlations in your application metrics, logs, and events and address application issues before they devolve into production outages and affect customers.

Best practices for continuous testing with Datadog

In Parts 1 and 2, we looked at how you can build and maintain effective test suites. These steps are a key part of ensuring that application workflows function as expected. But how you run your tests is another important point to consider, so in this post, we’ll walk through best practices for executing your tests across every stage of development. Along the way, we’ll also look at how Datadog supports these practices for the applications that you are already monitoring.

Use HiveMQ and OpenTelemetry to monitor IoT applications in Datadog

Large IoT environments are highly complex and comprise multiple layers of disparate devices that must exchange data with one another across potentially unreliable connections. Having visibility into each layer of your IoT environment is critical for quickly identifying problems with your deployment that could negatively impact user experience.

Configure pipeline alerts with Datadog CI monitors

CI pipelines have become an integral part of the development workflow, helping teams automate the continuous building and testing of new updates to application code. The growing importance of CI pipelines has naturally led to a need for increased visibility into their performance. In 2021, Datadog introduced CI Visibility to deliver granular performance metrics for each individual pipeline, allowing you to monitor build duration and related telemetry across all recent commits.

Highlights from AWS re:Invent 2022

Just like shopping on Black Friday, AWS re:Invent has become a post-Thanksgiving tradition for some of us at Datadog. We were excited to join tens of thousands of fellow AWS users and partners for this annual gathering that features new product announcements, technical sessions, networking, and fun. This year, we saw three themes emerge from the conference announcements and sessions.

Golden signals in seconds with Universal Service Monitoring

Whether you are a site reliability engineer, DevOps engineer, or application developer, you need visibility into the health and performance of every service you run or support. But in complex, dynamic environments, it can be difficult to ensure that all services are accounted for.