Building a flexible, realtime data warehouse at Sentry with Beam + Dataflow (Syd Ryan)

Feb 21, 2020

Syd Ryan describes two hard problems solved at Sentry with streaming Beam pipelines. The first solution combines Postgres change data capture with SQL views to produce a table in BigQuery that appears to update in real time. The second solution aggregates thousands of events per second and backfills historical data effectively using Beam's unified batch/streaming interfaces.
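
The abstract does not spell out how the change-data-capture view works, so the following is only a minimal sketch of the general idea, not Sentry's implementation. It assumes each captured change is appended as a row carrying a primary key, a monotonically increasing version, and an operation type; the field names `id`, `version`, `op`, and `data` are hypothetical. A SQL view over such a changelog could pick the newest change per key in the same way, which is what makes the table appear to update in real time.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def latest_rows(changelog: Iterable[Row]) -> List[Row]:
    """Reconstruct the current table state from an append-only changelog.

    Assumed (hypothetical) row shape: {"id", "version", "op", "data"}.
    """
    latest: Dict[Any, Row] = {}
    for change in changelog:
        key = change["id"]
        seen = latest.get(key)
        # Keep only the highest-version change for each primary key.
        if seen is None or change["version"] > seen["version"]:
            latest[key] = change
    # Keys whose most recent change is a delete are hidden from the result.
    return [
        {"id": row["id"], "data": row["data"]}
        for row in sorted(latest.values(), key=lambda r: r["id"])
        if row["op"] != "delete"
    ]


changelog = [
    {"id": 1, "version": 1, "op": "insert", "data": "a"},
    {"id": 2, "version": 2, "op": "insert", "data": "b"},
    {"id": 1, "version": 3, "op": "update", "data": "a2"},
    {"id": 2, "version": 4, "op": "delete", "data": None},
]
current = latest_rows(changelog)
```

Here `current` holds only the row for `id` 1 with its updated value, since the last change for `id` 2 was a delete. Because the changelog is only ever appended to, a streaming pipeline can keep writing new changes while readers always see the latest state.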

Syd is a data engineer at Sentry, an open-source error monitoring tool that helps developers ship better software, faster. Most recently, they have been replacing batch ETL jobs with streaming data pipelines, because fast data is better than slow data.