
Key metrics for monitoring Istio

Istio is an open source service mesh that was released in 2017 as a joint project from Google, IBM, and Lyft. By abstracting the network routes between services from your application logic, Istio allows you to manage your network architecture without altering your application code. Istio makes it easier to implement canary deployments, circuit breakers, load balancing, and other architectural changes, while also offering service discovery, built-in telemetry, and transport layer security.

Generate metrics from your logs to view historical trends and track SLOs

Web server logs and other access logs from technologies such as NGINX, Apache, and AWS Elastic Load Balancing (ELB) provide a wealth of key performance indicators (KPIs) for monitoring the health and performance of your application and understanding your users’ experience. These logs tell you how long pages take to load, where errors are occurring, which parts of your application are requested the most, and much more.
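As a rough illustration of the kind of KPIs access logs can yield, the sketch below derives request count, average request time, error rate, and the most requested paths from a few log lines. The log format (an NGINX-style line with the request time in seconds appended as the last field), the sample lines, and the `summarize` helper are all assumptions made for this example; they are not taken from any particular product or configuration.

```python
import re
from collections import Counter
from statistics import mean

# Assumed format: ... "METHOD /path PROTOCOL" STATUS BYTES REQUEST_TIME
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ (?P<rt>\d+\.\d+)$'
)

# Hypothetical sample lines, made up for illustration only.
SAMPLE_LINES = [
    '10.0.0.1 - - [01/Jan/2020:00:00:01 +0000] "GET /home HTTP/1.1" 200 512 0.120',
    '10.0.0.2 - - [01/Jan/2020:00:00:02 +0000] "GET /home HTTP/1.1" 200 512 0.080',
    '10.0.0.3 - - [01/Jan/2020:00:00:03 +0000] "POST /checkout HTTP/1.1" 502 0 1.300',
    '10.0.0.4 - - [01/Jan/2020:00:00:04 +0000] "GET /cart HTTP/1.1" 200 256 0.100',
]

def summarize(lines):
    """Aggregate simple KPIs from access log lines; unparseable lines are skipped."""
    durations = []
    statuses = Counter()
    paths = Counter()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match is None:
            continue
        durations.append(float(match.group("rt")))
        statuses[int(match.group("status"))] += 1
        paths[match.group("path")] += 1

    total = sum(statuses.values())
    server_errors = sum(n for status, n in statuses.items() if status >= 500)
    return {
        "requests": total,
        "avg_request_time": mean(durations) if durations else 0.0,
        "error_rate": server_errors / total if total else 0.0,
        "top_paths": paths.most_common(3),
    }

if __name__ == "__main__":
    print(summarize(SAMPLE_LINES))
```

Computing these values per time window instead of over the whole input would give the kind of time series that can be graphed for historical trends or compared against an SLO target.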

Watchdog for Infra automatically detects infrastructure anomalies

Last year, we introduced Watchdog to help Datadog APM users detect performance problems in their services by applying machine learning algorithms to automatically surface anomalies. Today, we’re excited to announce Watchdog for Infra, which expands the scope of Watchdog to automatically provide ongoing visibility into the health and performance of your infrastructure with no setup required.

Speed up your root cause analysis with Metric Correlations

In a world where the applications we run are constantly changing, the number of monitored metrics and events is skyrocketing, and responsibility for system components is fragmented across teams, it becomes increasingly difficult to pinpoint possible root causes of an issue in a timely manner. To address this challenge, we’re introducing Metric Correlations, which automatically finds candidates for the causes of an issue by searching your system for correlated metrics.
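To make the idea of searching for correlated metrics concrete, here is a minimal sketch that ranks candidate metrics by the absolute value of their Pearson correlation with a metric of interest. This is a generic statistical technique chosen for illustration, not a description of how Metric Correlations is implemented; the metric names and values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_candidates(target, candidates):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scored = [(name, pearson(target, series)) for name, series in candidates.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)

# Hypothetical data: a latency metric that jumps halfway through the window.
LATENCY_MS = [100, 110, 105, 300, 320, 310]
CANDIDATES = {
    "db.connections": [20, 22, 21, 80, 85, 82],
    "cache.hit_rate": [0.90, 0.91, 0.90, 0.40, 0.38, 0.41],
    "disk.free_gb": [50, 50, 50, 50, 50, 50],
}

if __name__ == "__main__":
    for name, score in rank_candidates(LATENCY_MS, CANDIDATES):
        print(f"{name}: {score:+.3f}")
```

Ranking by absolute value keeps strongly negative relationships (such as a falling cache hit rate) near the top. A high score only marks a metric as a candidate worth investigating; correlation alone does not establish a cause.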

SAP HANA monitoring with Datadog

SAP HANA is a data analytics platform that uses an in-memory, column-oriented data store to efficiently execute transactional (OLTP) and analytical (OLAP) queries. It can perform these queries against its own tables, or against data that resides in remote, non-SAP databases like Hadoop or SQL Server. SAP HANA also serves as the database behind SAP’s S/4HANA ERP platform. Datadog’s new integration helps you better understand the health and performance of your SAP HANA systems.

Introducing Datadog Agent 7 with Python 3 support

We’re excited to release version 7 of the Datadog Agent. It has all of the same functionality as Agent 6, but it is the first version to ship with only the Python 3 runtime. With Python 2 reaching its end of life on January 1, 2020, migrating your services to Python 3 will ensure that they continue working as expected. We’ve tested all of our more than 350 integrations to ensure they work with Python 3.

Monitor Cilium with Datadog

Cilium is an open source technology that delivers network security to large-scale containerized environments at the packet and application levels. Cilium integrates easily with your Kubernetes clusters, whether they are self-managed or run on managed services (e.g., Amazon EKS, Google GKE, and Microsoft AKS). You can also deploy Cilium to Docker environments using Mesos.

Monitor systemd with Datadog

Systemd is an initialization program that manages processes on Linux systems. It was designed to improve on the performance of its predecessors by creating a dependency tree of system components, initializing them only when needed, and using as much parallelization as possible. With systemd becoming ubiquitous in Linux distributions, it’s crucial that you monitor the health and performance of both systemd and the components that it manages.

Monitor Azure DevOps workflows and pipelines with Datadog

Microsoft Azure DevOps is a leading platform for planning, building, and deploying code. We are excited to announce a new integration with Azure DevOps, which helps organizations see the full picture as they build and deploy dynamic applications. Teams can get new insights into their builds, releases, work items, and code events; understand how deployments impact application performance; and even halt bad updates automatically.

Best practices for tagging your infrastructure and applications

Modern platforms like AWS and Kubernetes create dynamic environments by quickly spinning up instances or containers with significantly shorter lifespans than physical hosts. In these environments, where large-scale applications can be distributed across multiple ephemeral containers or instances, tagging is essential to monitoring services and their underlying infrastructure.