Deploying OpenTelemetry Organizationally: From Proof of Concept to In-Production at Scale

Observability involves telling a coherent story about an entire system. Over the years, video streaming service Pluto TV has had to navigate many storytellers in terms of observability vendors, tools, and formats before settling on OpenTelemetry to analyze and compare features across its many destination platforms. During this presentation, you'll see how Bharathi Ramachandran—Engineering Manager at Pluto TV—used OpenTelemetry to implement his initial proof of concept and get his entire organization shipping observability data at scale.

Changing Perspectives: A Deep Dive into the Security Posture of 600+ Real-World AWS Environments

Earlier this year, Datadog released the “State of AWS Security” study, which examined real-world data from more than 600 organizations and AWS accounts to understand the security posture of global AWS users who also leverage the Datadog Cloud Security Platform. Join Datadog’s Christophe Tafani-Dereeper and Andrew Krug as they explore some important insights from this study, such as the top ways organizations are breached on AWS and how tooling like Datadog Cloud Security Posture Management can help.

Auditing Your Automation's Access: Using More Automation

Between CI/CD pipelines, container orchestrators, and developer debugging tools, more and more automation is needed to scale your systems. But how do you know if that automation is accessing the right systems at the right time? And how do you ensure that your automation is safe from exploits by unauthorized users?

New GKE dashboards and metrics provide deeper visibility into your environment

Google Kubernetes Engine (GKE) is a managed Kubernetes service that enables users to deploy and orchestrate containerized applications on Google’s infrastructure. Datadog’s GKE integration, when paired with our Kubernetes integration, has always provided deep visibility into the health and performance of your clusters at the node, pod, container, and application levels.

Monitoring MongoDB performance metrics (WiredTiger)

This post is part 1 of a 3-part series about monitoring MongoDB performance with the WiredTiger storage engine. Part 2 explains the different ways to collect MongoDB metrics, and Part 3 details how to monitor its performance with Datadog. If you are using the MMAPv1 storage engine, visit the companion article “Monitoring MongoDB performance metrics (MMAP)”.

What is Kafka?

Apache Kafka is a popular open source platform for streaming, storing, and processing high volumes of data. In this video, we break down how Kafka works and how it’s able to provide you with a reliable, scalable, and highly performant service for managing events. We also touch on some key resources for effectively monitoring your Kafka deployments via Datadog.

Showcase dashboards securely and effortlessly with Skykit's offering in the Datadog Marketplace

For many organizations, making the most of the visibility Datadog offers into the health and performance of their infrastructure means displaying dashboards to stakeholders in various settings continuously and in real time. But the standard solutions for sharing dashboards to large-format displays can be onerous, involving sundry software and hardware and restrictive manual setups. These solutions can also pose significant security risks, since they tend to involve sharing passwords or devices.

Best practices for network perimeter security in cloud-native environments

Cloud-native infrastructure has become the standard for deploying applications that are performant and readily available to a globally distributed user base. While this has enabled organizations to quickly adapt to the demands of modern app users, the rapid nature of this migration has also made cloud resources a primary target for security threats.

When Cloud Native Stacks Misbehave - Pitfalls and Lessons Learned | Itiel Shwartz (Komodor)

In this session, Itiel Shwartz will demonstrate common failure scenarios, both app- and infra-related. We will laugh a little and cry a little, and then cover best practices and methodologies for monitoring, observability, and troubleshooting, such as metrics, distributed tracing, logging, network visualization, and more. But cheer up! We’ll wrap up by introducing some helpful tools for finding and fixing issues as fast as possible.

Bringing "Blameless" to Traffic Court | J. Paul Reed (Release Engineering Approaches)

What do modern incident analysis techniques and moving violations have in common? This Quick Bite tells the story of taking the same retrospective techniques that the most innovative technology companies in the world use to understand their operational incidents... to traffic court, to help us all understand what really happened. What happened next? Come find out!