Data Pipelines with Apache Airflow 2E (MEAP) by Julian de Ruiter (.PDF)

File Size: 50.6 MB

Data Pipelines with Apache Airflow, Second Edition (MEAP 18) by Julian de Ruiter, Ismael Cabral, Kris Geusebroek, Daniel van der Ende, Bas Harenslak
Requirements: .ePUB, .PDF reader, 50.6 MB
Overview: This book focuses on Apache Airflow, a batch-oriented framework for building data pipelines. Airflow’s key feature is that it lets you build scheduled data pipelines in Python, while also providing many building blocks for stitching together the different technologies found in modern technology landscapes. In Airflow, you define your DAGs (directed acyclic graphs) using Python code in DAG files, which are essentially Python scripts that describe the structure of the corresponding DAG. Each DAG file typically describes the set of tasks for a given DAG and the dependencies between those tasks; Airflow then parses the file to identify the DAG structure. One advantage of defining Airflow DAGs in Python code is that this programmatic approach gives you a lot of flexibility when building DAGs. For example, as the book shows later, you can use Python code to dynamically generate optional tasks depending on certain conditions, or even generate entire DAGs based on external metadata or configuration files. The book is aimed at DevOps engineers, data engineers, machine learning engineers, and sysadmins with intermediate Python skills.
Genre: Non-Fiction > Tech & Devices
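
To make the overview's point about programmatic DAG definitions concrete, here is a minimal sketch in plain Python. It does not use Airflow's API and is not taken from the book. It models a DAG as a mapping from each task to its upstream tasks and uses the standard library's `graphlib` to derive a run order. The task names (`fetch`, `clean`, `validate`, `load`) and the `validate` config flag are hypothetical, chosen only for illustration.

```python
from graphlib import TopologicalSorter


def build_dag(config):
    """Return a mapping of task name -> set of upstream task names.

    The task names and the "validate" flag are illustrative assumptions,
    not part of Airflow or of the book.
    """
    dag = {"fetch": set(), "clean": {"fetch"}}
    final_upstream = {"clean"}

    # Optional task: only generated when the configuration asks for it.
    if config.get("validate", False):
        dag["validate"] = {"clean"}
        final_upstream = {"validate"}

    dag["load"] = final_upstream
    return dag


def execution_order(dag):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    print(execution_order(build_dag({"validate": True})))
    print(execution_order(build_dag({"validate": False})))
```

Because the structure is built by ordinary Python code, the same function can yield different graphs from different configuration, which is the flexibility the overview describes. A real Airflow DAG file would express the tasks with Airflow's own operators instead of a dictionary.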

Free Download links:

https://trbt.cc/ngo0ck1pesxf.html

https://upfiles.com/lg6sTbju