Latest Posts

Key ECS metrics to monitor

Amazon Elastic Container Service (ECS) is an orchestration service for Docker containers running within the Amazon Web Services (AWS) cloud. You can declare the components of a container-based infrastructure, and ECS will deploy, maintain, and remove those components automatically. The resulting ECS cluster lends itself to a microservice architecture where containers are scaled and scheduled based on need.

Tools for ECS monitoring

In Part 1, we introduced a number of key metrics that you can use for ECS monitoring. Monitoring ECS involves paying attention to two levels of abstraction: the status of your services, tasks, and containers, and the resource use of the underlying compute and storage infrastructure, monitored per EC2 host or Docker container. In this post, we’ll survey some techniques you can use to monitor both levels of your ECS deployment.

Monitoring ECS with Datadog

As we explained in Part 1, it’s important to monitor task status and resource use at the level of ECS constructs like clusters and services, while also paying attention to what’s taking place within each host or container. In this post, we’ll show you how Datadog can help you automatically collect metrics from every layer of your ECS deployment, track data from your ECS cluster, its hosts, and its running services in dashboards, and more.

Performance monitoring with OpenTracing, OpenCensus, and OpenMetrics

If you are familiar with instrumenting applications, you may have heard of OpenMetrics, OpenTracing, and OpenCensus. These projects aim to create standards for application performance monitoring and collecting metric data. Although the projects do overlap in terms of their goals, they each take a different approach to observability and instrumentation.

2018 year in review

There were some big IT headlines this past year. Microsoft acquired GitHub and IBM bought Red Hat. Kubernetes graduated from the CNCF incubator program. And the biggest headline of all—at least to those of us at Datadog, where we live and breathe monitoring—we released Datadog Agent version 6, a completely new monitoring agent written in Go! As we start the new year, we’d like to take a moment to recognize some of the incredible things our engineers accomplished in 2018.

Join us in NYC for Dash 2019

We are thrilled to announce Dash 2019, the second year of Datadog’s conference on building and scaling the next generation of applications, infrastructure, and technical teams. This two-day conference will be attended by forward-thinking software developers and operations engineers who are taking the velocity, performance, reliability, and scale of their organizations to the next level.

Monitor all your CI pipelines with Datadog

With continuous integration becoming standard practice, full visibility into your CI pipelines has become a key part of monitoring and troubleshooting. Datadog gives you that visibility with out-of-the-box support for several continuous integration tools, including GitLab, Jenkins, Travis CI, CircleCI, and TeamCity. Monitoring your CI servers can help you identify bottlenecks in your pipelines.

Rethinking UX for AI-driven Alerting

I’ve been designing monitoring tools for almost 10 years now, and in that time a lot has changed. The infrastructure we build software on, for example, has been transformed multiple times—moving first from physical hosts to VMs in the cloud, then from VMs to containers, and now from containers to serverless and cloud service-based infrastructure.