
DataTalks.Club

Understanding ETL

by Matt Palmer

The book of the week from 15 Apr 2024 to 19 Apr 2024

Whether your title is data engineer, scientist, or analyst, you’ve likely heard the term ETL. There’s a good chance ETL is a part of your life, even if you don’t know it.

Short for extract, transform, load, ETL describes the foundational workflow most data practitioners are tasked with: taking data from a source system, changing it to suit their needs, and loading it into a target.

  • Want to help product leaders make data-driven decisions? ETL builds the critical tables for your reports.
  • Want to train the next iteration of your team’s machine learning model? ETL creates quality datasets.
  • Are you trying to bring more structure and rigor to your company’s storage policies to meet compliance requirements? ETL will bring process, lineage, and observability to your workflows.
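
As a rough illustration of the extract, transform, load pattern described above, here is a minimal Python sketch. It is not taken from the book: the sample data, the `customer_totals` table, and the function names are all hypothetical, and an in-memory CSV string and SQLite database stand in for a real source and target.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from a file, API, or database.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: read raw rows from the source (here, an in-memory CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and total the rest per customer."""
    totals = {}
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        customer = row["customer"]
        totals[customer] = totals.get(customer, 0.0) + float(row["amount"])
    return sorted(totals.items())


def load(records, conn):
    """Load: write the transformed records to the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(customer TEXT PRIMARY KEY, total REAL)"
    )
    # INSERT OR REPLACE keeps the load safe to re-run with the same input.
    conn.executemany("INSERT OR REPLACE INTO customer_totals VALUES (?, ?)", records)
    conn.commit()


def run_pipeline(conn):
    """Run the three steps in order and return the contents of the target table."""
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline(sqlite3.connect(":memory:")))
```

Keeping each step in its own function means the source, the transformation logic, or the target can be swapped out independently, which is one common way to keep a pipeline like this maintainable.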

If you want to do anything with data, you need a reliable process or pipeline. This holds true from classic business intelligence (BI) workloads to cutting-edge advancements, like large language models (LLMs) and AI.

In Understanding ETL, we walk through the components of ETL step by step, discussing architecture, maintainability, and scalability. With a focus on brevity, we’ll give you the tools you need to understand the basics of the pattern that drives data processing at scale.

To take part in the book of the week event:

  • Register in our Slack
  • Join the #book-of-the-week channel
  • Ask as many questions as you'd like
  • The book authors answer questions from Monday till Thursday
  • On Friday, the authors decide who wins free copies of their book

To see other books, check the book of the week page.


