Open Source Data Lakehouse Architecture with Spark and Kyuubi: Engineering Deep Dive

This webinar explores an open source data lakehouse architecture in detail and shows how we implement it at Canonical.

Watch to discover how Spark’s scalable processing engine and Kyuubi’s user-friendly SQL gateway enable efficient, secure, and high-performance analytics on unified data sets. Let’s dig deeper into how this combination simplifies big data storage, interactive analytics, and ETL – all through a single, streamlined open source lakehouse architecture.

In this webinar you will learn more about:

  • Apache Spark and Apache Kyuubi
  • Practical data lakehouse implementations
  • Our reference architecture for a truly open source data lakehouse

Learn more or contact our team: https://canonical.com/data/spark

Time codes:

00:00 - 02:25 introduction

02:25 - 05:45 charming components

05:45 - 08:01 data lakehouse architecture

08:01 - 08:59 integrations

08:59 - 12:35 data lakehouse overview

12:35 - 14:50 benefits of cloud-native data lakehouse

14:50 - 16:50 services for the data lakehouse

16:50 - 48:20 demo: open source data lakehouse

48:20 - 52:59 closing words