PlusOne

Telling the ASF's Stories

Driving dynamic Beam pipelines - Alex Van Boxel

September 13, 2019
timothyarthur

Using Apache Beam to get data into your data lake? In an agile company you don’t want to recompile your ingestion pipeline every time a sprint finishes. In this talk we go over all the mechanisms and building blocks you need to make dynamic pipelines really work.
We’ll see why schemas are so important, look at how to get these schemas into our pipelines, and discuss methods to protect ourselves from data corruption and incompatible schema evolution.
New features such as schema-aware PCollections get a thorough deep dive, and finally we go over real-world examples and position Apache Beam in the new PLT (Push Load Transform) world.
