Mar 21, 2023 | By Ara Pulido
When containers and container orchestration were introduced, they opened the possibility of helping companies utilize physical resources like CPU and memory more efficiently. But as more companies and bigger enterprises have adopted Kubernetes, FinOps professionals may wonder why their cloud bills haven’t gone down—or worse, why they have increased.
Mar 17, 2023 | By Paul Gottschling
While the output of certain RabbitMQ CLI commands uses the term “slave” to refer to mirrored queues, RabbitMQ has disavowed this term, as has Datadog. When collecting RabbitMQ metrics, you can take advantage of RabbitMQ’s built-in monitoring tools and ecosystem of plugins. In this post, we’ll introduce these RabbitMQ monitoring tools and show you how you can use them in your own messaging setup.
Mar 17, 2023 | By Paul Gottschling
RabbitMQ is a message broker, a tool for implementing a messaging architecture. Some parts of your application publish messages, others consume them, and RabbitMQ routes them between producers and consumers. The broker is well suited for loosely coupled microservices. If no service or part of the application can handle a given message, RabbitMQ keeps the message in a queue until it can be delivered.
Mar 9, 2023 | By Nicholas Thomson
Modern applications running on distributed systems often complicate service ownership because of their ever-growing web of microservice dependencies. This complication challenges engineers’ ability to shepherd their software through every stage of the development life cycle, as well as teams’ ability to train new engineers on the application’s architecture. With increased complexity, clarity is key for quick, effective troubleshooting and delivering value to end users.
Mar 6, 2023 | By Thomas Sobolik
Modern, high-scale applications can generate hundreds of millions of logs per day. Each log provides point-in-time insights into the state of the services and systems that emitted it. But logs are not created in isolation. Each log event represents a small, sequential step in a larger story, such as a user request, database restart process, or CI/CD pipeline.
Mar 3, 2023 | By Meghan Lo
Many developers and product teams are iterating faster and deploying more frequently to meet user expectations for responsive and optimized apps. These constant deployments—which can number in the dozens or even hundreds per day for larger organizations—are essential for keeping your customer base engaged and delighted. However, they also make it harder to pinpoint the exact deployment that led to a rise in errors, a new error, or a performance regression in your app.
Mar 1, 2023 | By Aaron Kaplan
Blocked queries are one of the key issues faced by database analysts, engineers, and anyone managing database performance at scale. Blocking can be caused by inefficient query or database design as well as resource saturation, and can lead to increased latency, errors, and user frustration. Pinpointing root blockers—the underlying problematic queries that set off cascading locks on database resources—is key to troubleshooting and remediating database performance issues.
Mar 1, 2023 | By Guto Costa
As the world’s leading local delivery platform, Delivery Hero brings groceries and household goods to customers in more than 70 countries. Their technology stack comprises over 200 services across 20 Kubernetes clusters running on Amazon EKS. This cloud-based, containerized infrastructure enabled them to scale their operation to support increasing demand as the volume of orders placed on their platform doubled during the pandemic.
Feb 28, 2023 | By Jesse Mack
StormForge Optimize Live is a machine learning-powered performance and resource optimization solution for Kubernetes workloads. Optimize Live ingests and analyzes production observability data and recommends specific actions to optimize CPU and memory utilization. You can take these actions manually or set them to occur automatically, making it easier to maintain a high level of application performance while minimizing cloud costs.
Feb 24, 2023 | By Thomas Sobolik
In dynamic production environments, unpredictable traffic loads and frequent code changes can make it difficult for organizations to consistently optimize their cloud infrastructure, resulting in application performance issues, latency, and wasted cloud spend. Teams that manage large-scale cloud infrastructure deployments are often forced to tune their workloads’ configurations using a complicated mesh of script jobs—or worse, manual remediation by on-call engineers prompted by alerts.
Mar 17, 2023 | By Datadog
Hear from a Glovo engineer how Datadog Database Monitoring helped their storage team reduce costs and time spent on databases, as well as peak CPU usage.
Mar 16, 2023 | By Datadog
Hear firsthand how Datadog helped Restaurant Brands International get real-time metrics, correlate logs, create alerts, and reduce their downtime.
Mar 13, 2023 | By Datadog
Hear why Seven.One Entertainment Group, a subsidiary of ProSiebenSat.1 Media SE, Germany’s top commercial broadcaster, chose Datadog Real User Monitoring and how the solution enabled them to better understand client-side issues.
Mar 7, 2023 | By Datadog
There are many ways to implement Site Reliability Engineering (SRE). From team structures to roles and responsibilities to planning and prioritization flows, there’s no golden path for how to organize things. As Datadog has shifted from a startup to a quickly growing public company, we’ve seen our own SRE practice evolve. With over 22,000 customers sending trillions of data points each day, keeping Datadog reliable is critical to our business.
Mar 7, 2023 | By Datadog
Hear firsthand from an engineer at Whatnot why they chose Datadog Synthetic Monitoring and how the solution supports an automated testing process.
Feb 28, 2023 | By Datadog
Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events.
Feb 14, 2023 | By Datadog
Datadog Continuous Testing helps you accelerate releases while reducing bugs. Create scalable tests directly in the UI with our user-friendly codeless web recorder, and ensure fast, reliable troubleshooting across any environment with parallel, multi-location, and self-healing browser tests.
Feb 10, 2023 | By Datadog
Hear how Datadog Log Management has enabled Whatnot, DoControl, and AVIV Group to gain greater visibility into their code and costs.
Feb 8, 2023 | By Datadog
Hear how Klarna’s engineers utilize Datadog APM to analyze and optimize their services.
Jan 10, 2023 | By Datadog
Canva is an online design platform with a mission to empower everyone in the world to design anything and publish anywhere. To guarantee our customers have the best experience using our products, Canva engineers rely on the tools and products provided by the Observability team to measure and quantify critical application health and performance metrics. Canva’s Observability team uses OpenTelemetry components to collect, transform, and export standardised telemetry data from our applications and platforms. Canva has been an early adopter of OTel, using the OTel SDK for tracing and the collector gateway to process and export telemetry to various tools. In this talk, we’ll take a deeper look at how Canva uses OTel in our current observability workflows.
Oct 29, 2018 | By Datadog
As Docker adoption continues to rise, many organizations have turned to orchestration platforms like ECS and Kubernetes to manage large numbers of ephemeral containers. Thousands of companies use Datadog to monitor millions of containers, which enables us to identify trends in real-world orchestration usage. We're excited to share 8 key findings of our research.
Oct 29, 2018 | By Datadog
The elasticity and nearly infinite scalability of the cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived VMs or containers. This has elevated the need for new methods and new tools for monitoring. In this eBook, we outline an effective framework for monitoring modern infrastructure and applications, however large or dynamic they may be.
Oct 1, 2018 | By Datadog
Where does Docker adoption currently stand and how has it changed? With thousands of companies using Datadog to track their infrastructure, we can see software trends emerging in real time. We're excited to share what we can see about true Docker adoption.
Oct 1, 2018 | By Datadog
Build an effective framework for monitoring AWS infrastructure and applications, however large or dynamic they may be. The elasticity and nearly infinite scalability of the AWS cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived components. This has elevated the need for new methods and new tools for monitoring.
Sep 1, 2018 | By Datadog
Like a car, Elasticsearch was designed to allow you to get up and running quickly, without having to understand all of its inner workings. However, it's only a matter of time before you run into engine trouble here or there. This guide explains how to address five common Elasticsearch challenges.
Aug 1, 2018 | By Datadog
Monitoring Kubernetes requires you to rethink your monitoring strategies, especially if you are used to monitoring traditional hosts such as VMs or physical machines. This guide prepares you to effectively approach Kubernetes monitoring in light of its significant operational differences.
Datadog is the essential monitoring platform for cloud applications. We bring together data from servers, containers, databases, and third-party services to make your stack entirely observable. These capabilities help DevOps teams avoid downtime, resolve performance issues, and ensure customers are getting the best user experience.
See it all in one place:
- See across systems, apps, and services: With turnkey integrations, Datadog seamlessly aggregates metrics and events across the full DevOps stack.
- Get full visibility into modern applications: Monitor, troubleshoot, and optimize application performance.
- Analyze and explore log data in context: Quickly search, filter, and analyze your logs for troubleshooting and open-ended exploration of your data.
- Build real-time interactive dashboards: More than summary dashboards, Datadog offers all high-resolution metrics and events for manipulation and graphing.
- Get alerted on critical issues: Datadog notifies you of performance problems, whether they affect a single host or a massive cluster.