Datadog on Profiling in Production

Feb 9, 2022

Depending on your chosen programming language and stack, you may have never used a profiler in production. The very idea of running a profiler against a production web service may seem unrealistic due to the overhead involved. After all, aren't profilers extremely computationally expensive to run?

Despite that reputation, many programming languages have profilers built to run in production. Seeing how your application behaves in production is critical to understanding how it performs in the real world.

In this episode of Datadog On, we'll learn how Datadog created a production-ready profiler for Python using statistical sampling. We'll also look at production profilers in other languages, and see how languages like Java have a long history of profiling in production.

We’ll also see how profilers can be used to solve tricky memory leaks, save on cloud costs with more efficient CPU usage, and help you deploy better, more robust software to end users.
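The core idea behind a statistical sampling profiler is simple: instead of instrumenting every function call (as a deterministic profiler does), periodically interrupt the process and record the current call stack; the hottest stacks accumulate the most samples. The sketch below is a minimal, hypothetical illustration of that technique on Unix-like systems, using `signal.setitimer` to trigger samples — it is not Datadog's actual implementation, and the class and function names are invented for this example.

```python
import collections
import signal
import traceback


class SamplingProfiler:
    """Minimal sketch of a statistical sampling profiler.

    Periodically interrupts the main thread via SIGALRM and records
    the current Python call stack. Illustrative only -- not Datadog's
    production profiler.
    """

    def __init__(self, interval=0.005):
        self.interval = interval  # seconds between samples
        self.samples = collections.Counter()

    def _handler(self, signum, frame):
        # Collapse the interrupted stack into a tuple of "file:function" frames.
        stack = tuple(
            f"{entry.filename}:{entry.name}"
            for entry in traceback.extract_stack(frame)
        )
        self.samples[stack] += 1

    def start(self):
        signal.signal(signal.SIGALRM, self._handler)
        # ITIMER_REAL delivers SIGALRM every `interval` wall-clock seconds.
        signal.setitimer(signal.ITIMER_REAL, self.interval, self.interval)

    def stop(self):
        signal.setitimer(signal.ITIMER_REAL, 0, 0)
        signal.signal(signal.SIGALRM, signal.SIG_DFL)


def busy_work(n):
    # CPU-bound loop for the profiler to observe.
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = SamplingProfiler(interval=0.005)
profiler.start()
busy_work(5_000_000)
profiler.stop()

# The most frequently sampled stacks approximate where CPU time was spent.
for stack, count in profiler.samples.most_common(3):
    print(count, stack[-1])
```

Because the cost is proportional to the sampling rate rather than the number of function calls, overhead stays low and roughly constant no matter how hot the code is — which is what makes this approach viable in production.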

00:00 Introduction

02:17 Datadog's Approach to APM

07:15 Deterministic Profiling & Using Profilers in Development

11:12 Profiling in Python

11:40 Developer feedback loop when profiling in CPython

19:00 Building a Statistical Sampler for Python

25:51 CPU and Memory Profiling

30:01 Heap Live Size

32:46 Lock Wait Time

34:28 Profile comparison

35:51 Managing cloud costs with a profiler

38:24 What's next

40:55 Q&A