Datadog

New York City, NY, USA
2010
  |  By Ivan Ilichev
Processes—the service workloads that run on your infrastructure—are the building blocks of your application, and it’s critical to know how well they operate at every level of the stack. Degraded process performance can lead to downtime for your mission-critical services, resulting in loss of customer trust and potentially impacting revenue for the business.
  |  By David Lentz
So far in this series, we’ve walked through key etcd metrics and tools you can use to monitor etcd metrics and logs. In this post, we’ll show you how you can monitor etcd with Datadog. But first, we’ll show you how to set up and configure the Datadog Agent and Cluster Agent to send etcd monitoring data to your Datadog account.
  |  By David Lentz
In Part 1 of this series, we looked at how etcd works and the role it plays in managing the state of a Kubernetes cluster. We also explored key etcd metrics you should monitor to ensure the health and performance of your etcd cluster. In this post, we’ll show you how you can use tools like Prometheus, Grafana, and etcdctl to collect and visualize etcd metrics. We’ll also show you how to collect etcd logs that provide context for those metrics.
  |  By David Lentz
Etcd is a distributed key-value data store that provides highly available, durable storage for distributed applications. In Kubernetes, etcd functions as part of the control plane, storing data about the actual and desired state of the resources in a cluster. Kubernetes controllers use etcd’s data to reconcile the cluster’s actual state to its desired state. This series focuses on monitoring etcd in Kubernetes.
  |  By Nicholas Thomson
The Windows Registry is a centralized key-value database that stores permissions, user data, and configuration settings for the Windows operating system and many Windows native applications. The keys stored in the registry provide a granular view into the processes occurring on a Windows host, such as certificate expirations, security checks, and pending reboots.
  |  By Addie Beach
It’s relatively easy to study the immediate impact of new releases by analyzing short-term changes in user behavior or system activity. However, this information doesn’t tell you much about the long-term viability of your application, which depends less on the novelty of major application updates and more on sustained usability.
  |  By Kai Xin Tai
Complex systems require many different monitors to assess the health of their infrastructure and applications, creating a wealth of alerts that can be hard to track. Due to a lack of effective triage processes, many organizations page engineers for every alert that comes in, making it difficult to separate false positives from issues that actually require immediate attention.
  |  By Antoine Dussault
Tracing provides indispensable insights into the state and performance of distributed applications, but it can often be difficult to determine the root cause or ultimate business impact of issues indicated by traces. Translating visibility of individual microservices into broader performance insights often requires drawing complex correlations between spans. This can be a laborious process, which can complicate everything from troubleshooting and triage to tracking KPIs and managing costs.
  |  By Meghan Lo
Faulty deployments and other types of erroneous changes may account for around 70% of all application outages. With the prevalence of CI/CD workflows, engineering teams make changes to their applications, services, and infrastructure all the time, which can make it difficult to trace issues to specific changes.
  |  By Nicholas Thomson
The Windows operating system exposes metrics such as CPU, memory, and disk usage as built-in performance counters, which provide a unified way to observe performance, state, and other high-level facets of Windows subsystems, components, and native or third-party applications. As such, Windows Performance Counters can be invaluable for monitoring resource usage and the health of your infrastructure, as well as systems your services are using.
  |  By Datadog
Learn how the team at Complyt was able to integrate Cloud Cost Management in a matter of hours and quickly pinpoint underutilized services to cut their cloud spend in half. CCM delivers cost data where engineers work, with resource-level context like CPU, memory, and requests that is easily scoped to their services and applications, so that they can take action and spend effectively.
  |  By Datadog
ngrok delivers instant ingress to your applications in any cloud, private network, or device with authentication, load balancing, and other critical controls using their global points of presence. Hear from Chad Tindel, Field CTO & VP WW Solution Architecture, on why Datadog was their most requested integration and how it provides an easy pathway to ship application and traffic logs into one unified observability platform.
  |  By Datadog
As the world’s largest automotive manufacturer and the leading software-first mobility company, Toyota leans on Datadog to achieve its goals of delivering value to customers and uplifting employees with new technologies and processes. Jason Ballard, IT Executive and General Manager, shares his top priorities for the enterprise in North America and offers his advice for how other leaders in the industry can transform their business.
  |  By Datadog
With Incident Management, Datadog provides a unified platform to seamlessly detect, investigate, and manage incidents end to end, helping you to streamline processes and quickly mobilize the right teams for faster incident resolution.
  |  By Datadog
Datadog, the observability platform used by thousands of companies, runs on dozens of self-managed Kubernetes clusters in a multi-cloud environment, adding up to tens of thousands of nodes, or hundreds of thousands of pods. This infrastructure is also used by a wide variety of engineering teams at Datadog, with different feature and capacity needs that may change over time.
  |  By Datadog
Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. This month, we put the Spotlight on Dynamic Instrumentation.
  |  By Datadog
Learn how Wawa's engineering team incentivized customers to increase spending in stores using Datadog Real User Monitoring. RUM gave them a holistic view to understand customer patterns and provide the right in-store incentives.
  |  By Datadog
Morgan Goose, Autodesk, shares how he and his team have democratized observability and made it a default offering for all their engineers. Autodesk is a global leader in software for people who design and make the world. That includes software for architects, builders, engineers, 3D artists, and production teams. To ensure the best customer experience, Autodesk has partnered with Datadog and is taking advantage of products like DBM to quickly identify and maintain the systems they instrument.
  |  By Datadog
Businesses consolidate their tools with Datadog’s all-in-one observability and security solution in order to drive cost savings and resource efficiencies while accelerating time to market and delivering superior customer experiences.
  |  By Datadog
Improve the performance and reliability of your CI pipelines and test suites and accelerate developer velocity with Datadog Software Delivery. In this demo, discover how Datadog empowers Platform Engineering & DevOps teams to enhance CI pipeline performance, improve test reliability, and boost development velocity.
  |  By Datadog
As Docker adoption continues to rise, many organizations have turned to orchestration platforms like ECS and Kubernetes to manage large numbers of ephemeral containers. Thousands of companies use Datadog to monitor millions of containers, which enables us to identify trends in real-world orchestration usage. We're excited to share 8 key findings of our research.
  |  By Datadog
The elasticity and nearly infinite scalability of the cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived VMs or containers. This has elevated the need for new methods and new tools for monitoring. In this eBook, we outline an effective framework for monitoring modern infrastructure and applications, however large or dynamic they may be.
  |  By Datadog
Where does Docker adoption currently stand and how has it changed? With thousands of companies using Datadog to track their infrastructure, we can see software trends emerging in real time. We're excited to share what we can see about true Docker adoption.
  |  By Datadog
Build an effective framework for monitoring AWS infrastructure and applications, however large or dynamic they may be. The elasticity and nearly infinite scalability of the AWS cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived components. This has elevated the need for new methods and new tools for monitoring.
  |  By Datadog
Like a car, Elasticsearch was designed to allow you to get up and running quickly, without having to understand all of its inner workings. However, it's only a matter of time before you run into engine trouble here or there. This guide explains how to address five common Elasticsearch challenges.
  |  By Datadog
Monitoring Kubernetes requires you to rethink your monitoring strategies, especially if you are used to monitoring traditional hosts such as VMs or physical machines. This guide prepares you to effectively approach Kubernetes monitoring in light of its significant operational differences.

Datadog is the essential monitoring platform for cloud applications. We bring together data from servers, containers, databases, and third-party services to make your stack entirely observable. These capabilities help DevOps teams avoid downtime, resolve performance issues, and ensure customers are getting the best user experience.

See it all in one place:

  • See across systems, apps, and services: With turn-key integrations, Datadog seamlessly aggregates metrics and events across the full DevOps stack.
  • Get full visibility into modern applications: Monitor, troubleshoot, and optimize application performance.
  • Analyze and explore log data in context: Quickly search, filter, and analyze your logs for troubleshooting and open-ended exploration of your data.
  • Build real-time interactive dashboards: More than summary dashboards, Datadog offers all high-resolution metrics and events for manipulation and graphing.
  • Get alerted on critical issues: Datadog notifies you of performance problems, whether they affect a single host or a massive cluster.

Modern monitoring & analytics. See inside any stack, any app, at any scale, anywhere.