New York City, NY, USA
Jun 11, 2021   |  By Thomas Sobolik
Migrating your on-prem infrastructure to the cloud offers a host of benefits, including scalability, mobility, security, and cost reduction. When it comes to cloud network monitoring, tracking the availability and performance of the cloud services your applications rely on becomes even more important. However, moving from self-managed infrastructure to third party–managed services introduces a number of challenges.
Jun 10, 2021   |  By Ryan Warrier
AWS Service Quotas helps you manage limits on the number of resources or API operations that are possible for a given AWS service. Hitting such limits could cause operational disruptions related to getting rate limited on the critical APIs that your applications rely on or being unable to provision additional AWS resources.
Jun 7, 2021   |  By Kai Xin Tai
BigPanda is a domain-agnostic AIOps platform that helps organizations detect and resolve incidents in their complex IT environments. By unifying and correlating data from monitoring, change, and topology tools, BigPanda enables teams to quickly pinpoint the root cause of issues and prevent costly outages.
May 28, 2021   |  By Kai Xin Tai
Arm processors have long been at the center of mobile computing, powering billions of smartphones, tablets, smartwatches, and other IoT devices. Today, these processors are beginning to see broader adoption in the cloud as they promise better performance, higher energy efficiency, and lower costs than their x86-based predecessors. Just this week, Oracle announced its new Oracle Cloud Infrastructure Ampere A1 Compute platform, built on the Ampere Altra Arm processor.
May 27, 2021   |  By Paul Gottschling
Amazon Elastic Container Service (ECS) is a managed compute platform for containers that was designed to be simple to configure, with opinionated defaults to help users get started quickly. ECS customers can run containerized workloads on either Amazon EC2 instances or the serverless Fargate platform without having to maintain a control plane—and can easily integrate ECS with other AWS resources, like Network Load Balancers, to architect their infrastructure.
May 24, 2021   |  By Stephen Pinkerton
AWS Lambda extensions enable you to seamlessly integrate third-party tooling with your Lambda environment so you can run custom code or monitoring agents alongside your functions. We’ve partnered with AWS to create a Lambda extension that offers a more cost-effective, simplified process for collecting data from your functions.
May 21, 2021   |  By Thomas Sobolik
Datadog’s infrastructure list provides a central, high-level view of every host in your environment and pulls together metadata and relevant metrics from across Datadog to help you get the full picture of each one. You can easily filter and sort the list using any host tags, letting you quickly view the status of the parts of your infrastructure you need.
May 20, 2021   |  By Mary Jac Heuman
Dashboards are a crucial tool in your monitoring arsenal, as they allow you to visualize and correlate telemetry data from across your stack in a single place. Historically, Datadog offered two dashboard types: Screenboards, for pixel-level control on a canvas, and Timeboards, for troubleshooting a specific point in time. Now, we’re excited to introduce a new dashboard layout that combines the best of Timeboards and Screenboards in a single, seamless editing experience.
May 20, 2021   |  By Emily Chang
When Kubernetes launches and schedules workloads in your cluster, such as during an update or scaling event, you can expect to see short-lived spikes in the number of Pending pods. As long as your cluster has sufficient resources, Pending pods usually transition to Running status on their own as the Kubernetes scheduler assigns them to suitable nodes. However, in some scenarios, Pending pods will fail to get scheduled until you fix the underlying problem.
May 20, 2021   |  By Stephanie Niu
Datadog Notebooks simplify the way teams across an organization find and share knowledge. By bringing together live data and rich Markdown text, Notebooks help teams create powerful, data-driven documents—from runbooks and support playbooks to incident postmortems and data reports. And with collaboration functionalities like real-time editing and commenting, team members can simultaneously make changes to a document and gather feedback along the way.
Jun 4, 2021   |  By Datadog
As you scale your applications, remaining resilient to underlying network failures, resource constraints introduced by other applications, or spikes in traffic can become exponentially more complex, even with very thorough testing and processes. Chaos engineering is a discipline that encourages experimenting in production and injecting controlled failures into the system to understand how the system will react in such conditions and to improve its reliability.
Jun 2, 2021   |  By Datadog
Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events.
Jun 1, 2021   |  By Datadog
Justin Bodeutsch, Systems Administrator at Planning Center, discusses how Datadog’s alerting, log management, serverless, and infrastructure monitoring tools have simplified internal processes and been instrumental in minimizing MTTR across the business.
Jun 1, 2021   |  By Datadog
Since originating at Google, site reliability engineering (SRE) has enabled countless teams to effectively manage large-scale systems, improve the stability of complex services, and automate operational tasks using software. In this SRE panel, Yuri Grinshteyn (Customer Reliability Engineer, Google) will speak about the core principles of SRE and how the culture is practiced at Google. He will be joined by Llywelyn Griffith-Swain (SRE Manager, Vodafone), who will share Vodafone’s story of adopting SRE, lessons learned, and their best practices for maintaining the cultural shift across teams.
May 5, 2021   |  By Datadog
At Datadog, customer trust and data security are of the utmost importance. As a high-growth company, navigating the tradeoffs between security and development agility is especially critical. Our customers expect us to continually improve our platform, while providing a compliant, secure environment for their most critical data. Balance is key to rolling out features rapidly and keeping systems secure.
May 4, 2021   |  By Datadog
Datadog Live Containers provides multidimensional, real-time visibility into Kubernetes workloads, from Deployments and ReplicaSets down to individual containers. Using Datadog’s curated metrics, teams can track the health and performance of their Kubernetes resources in the appropriate context and surface critical information about every layer of their cluster.
May 3, 2021   |  By Datadog
Datadog APM provides end-to-end application monitoring, from frontend browsers to backend database queries and code profiles, so you can monitor and optimize your stack at any scale—no sampling required. APM and distributed tracing are fully integrated with the rest of Datadog, giving you rich context for troubleshooting issues in real time.
Apr 30, 2021   |  By Datadog
Datadog offers a single unified platform to monitor your infrastructure, applications, networks, security threats, UX, and more. For full visibility, you can seamlessly navigate between metrics, traces, and logs. Built-in machine learning tools, clear visualizations, and a companion mobile app make it easy to monitor growing environments. See inside any stack, any app, at any scale, anywhere.
Apr 29, 2021   |  By Datadog
Datadog Serverless Monitoring tracks the health of all your functions in a single pane of glass. Whether your applications are completely serverless or use a mix of infrastructure components, you can quickly detect and troubleshoot performance issues with end-to-end tracing, auto-generated insights, deployment tracking, and more.
Apr 27, 2021   |  By Datadog
Datadog offers a single unified platform to monitor your infrastructure, applications, networks, security threats, UX, and more. See inside any stack, any app, at any scale, anywhere. Get started with a free 14-day trial.
Oct 29, 2018   |  By Datadog
As Docker adoption continues to rise, many organizations have turned to orchestration platforms like ECS and Kubernetes to manage large numbers of ephemeral containers. Thousands of companies use Datadog to monitor millions of containers, which enables us to identify trends in real-world orchestration usage. We’re excited to share 8 key findings of our research.
Oct 29, 2018   |  By Datadog
The elasticity and nearly infinite scalability of the cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived VMs or containers. This has elevated the need for new methods and new tools for monitoring. In this eBook, we outline an effective framework for monitoring modern infrastructure and applications, however large or dynamic they may be.
Oct 1, 2018   |  By Datadog
Where does Docker adoption currently stand and how has it changed? With thousands of companies using Datadog to track their infrastructure, we can see software trends emerging in real time. We’re excited to share what we can see about true Docker adoption.
Oct 1, 2018   |  By Datadog
Build an effective framework for monitoring AWS infrastructure and applications, however large or dynamic they may be. The elasticity and nearly infinite scalability of the AWS cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived components. This has elevated the need for new methods and new tools for monitoring.
Sep 1, 2018   |  By Datadog
Like a car, Elasticsearch was designed to allow you to get up and running quickly, without having to understand all of its inner workings. However, it’s only a matter of time before you run into engine trouble here or there. This guide explains how to address five common Elasticsearch challenges.
Aug 1, 2018   |  By Datadog
Monitoring Kubernetes requires you to rethink your monitoring strategies, especially if you are used to monitoring traditional hosts such as VMs or physical machines. This guide prepares you to effectively approach Kubernetes monitoring in light of its significant operational differences.

Datadog is the essential monitoring platform for cloud applications. We bring together data from servers, containers, databases, and third-party services to make your stack entirely observable. These capabilities help DevOps teams avoid downtime, resolve performance issues, and ensure customers are getting the best user experience.

See it all in one place:

  • See across systems, apps, and services: With turnkey integrations, Datadog seamlessly aggregates metrics and events across the full DevOps stack.
  • Get full visibility into modern applications: Monitor, troubleshoot, and optimize application performance.
  • Analyze and explore log data in context: Quickly search, filter, and analyze your logs for troubleshooting and open-ended exploration of your data.
  • Build real-time interactive dashboards: Beyond summary dashboards, Datadog makes all high-resolution metrics and events available for manipulation and graphing.
  • Get alerted on critical issues: Datadog notifies you of performance problems, whether they affect a single host or a massive cluster.

Modern monitoring & analytics. See inside any stack, any app, at any scale, anywhere.