Curriculum
The Data Engineering Zoomcamp covers six modules over seven weeks plus the final project. Each module has video lectures, reading material, and a homework assignment.
For the canonical curriculum (videos, code, exact homework questions), see the GitHub repository.
Modules
Module 1: Containerization and Infrastructure as Code
- Docker and Postgres.
- Setting up a development environment locally and on GCP.
- Terraform for infrastructure provisioning.
- Module 1 gets two weeks because environment setup can be the trickiest part.
Module 2: Workflow Orchestration
- Workflow orchestration with Kestra.
- Building data pipelines that schedule, retry, and backfill.
- Loading data to Google Cloud Storage and BigQuery.
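To make "schedule, retry, and backfill" concrete, here is a minimal sketch in plain Python. It is not Kestra's API (Kestra declares these behaviours in its flow definitions); the names `month_starts`, `run_with_retry`, and `backfill` are hypothetical helpers invented for illustration. A backfill reruns the same task once per past period, and a retry wraps each run so that a transient failure does not abort the whole range.

```python
from datetime import date
import time


def month_starts(start: date, end: date) -> list[date]:
    """Return the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def run_with_retry(task, arg, max_attempts=3, base_delay=0.0):
    """Call task(arg); on failure, retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(arg)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))


def backfill(task, start: date, end: date, **retry_kwargs) -> dict:
    """Run task once per month in [start, end], each run with retries."""
    return {
        m.isoformat(): run_with_retry(task, m, **retry_kwargs)
        for m in month_starts(start, end)
    }
```

A call such as `backfill(load_month, date(2021, 1, 1), date(2021, 3, 1))`, where `load_month` is whatever function loads one month of data, would run three times, once per month. An orchestrator adds the scheduling, state tracking, and logging around the same idea.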
Workshop: Data Ingestion with dlt
- A hands-on workshop on the dlt library for ingestion, slotted between modules.
- Has its own homework. See Workshops for logistics.
Module 3: Data Warehousing
- BigQuery as a data warehouse.
- Partitioning and clustering for performance.
- Cost optimization.
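The link between partitioning and cost can be illustrated with a toy model in plain Python. It does not call BigQuery; the rows, byte sizes, and function names below are hypothetical. The assumption it encodes is that a query filtering on the partition column only reads the matching partition, while the same query on an unpartitioned table reads every row. Clustering, which further orders data inside each partition, is left out of this sketch.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (pickup_date, vendor_id, size_in_bytes)
ROWS = [
    (date(2021, 1, 1), 1, 100),
    (date(2021, 1, 1), 2, 100),
    (date(2021, 1, 2), 1, 100),
    (date(2021, 1, 2), 2, 100),
    (date(2021, 1, 3), 1, 100),
]


def partition_by_date(rows):
    """Group rows into one partition per pickup date."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)
    return partitions


def bytes_scanned_full(rows):
    """Unpartitioned table: every row is read, whatever the filter."""
    return sum(size for _, _, size in rows)


def bytes_scanned_pruned(partitions, wanted_date):
    """Partitioned table: only the partition matching the filter is read."""
    return sum(size for _, _, size in partitions.get(wanted_date, []))


full = bytes_scanned_full(ROWS)
pruned = bytes_scanned_pruned(partition_by_date(ROWS), date(2021, 1, 2))
print(f"full scan: {full} bytes, pruned scan: {pruned} bytes")
```

With this sample data the full scan reads 500 bytes and the pruned scan reads 200. In a warehouse billed by bytes scanned, that difference is where the cost saving comes from.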
Module 4: Analytics Engineering
- dbt for analytics engineering.
- Building staging, intermediate, and mart models.
- Testing and documentation.
- Connecting BigQuery to a dashboard.
Module 5: Batch Processing
- Apache Spark fundamentals.
- PySpark with the NYC taxi dataset.
- Running on Google Dataproc.
Module 6: Streaming
- Apache Kafka.
- Stream processing with Kafka and Flink.
- Schema registry and data serialization.
Final Project
- Three weeks at the end of the cohort.
- Build an end-to-end pipeline of your choice.
- See the Project page.
What changes between cohorts
Most modules are stable across cohorts. Notable past changes:
- 2024: Mage replaced Prefect as the workflow orchestrator.
- 2025: Kestra replaced Mage.
- 2026: The dlt workshop continues; some modules use updated tooling.
If a video references a tool you do not see in the current code, check the cohort folder (cohorts/2026/) for the current version.
Pace
A typical week:
- Watch the module videos (3 to 5 hours).
- Work through the code examples (3 to 5 hours).
- Complete the homework (2 to 5 hours).
Plan for 10 to 15 hours per week. Module 1 takes longer and is spread over two weeks instead of one.
For the final project, plan 2 to 3 weeks of focused work.
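As a quick check on the weekly estimate, the per-activity ranges listed under "A typical week" can be summed directly. This is only arithmetic on the figures above; the dictionary and function names are made up for the sketch.

```python
# Hour ranges (low, high) copied from the "typical week" list above.
WEEKLY_HOURS = {
    "videos": (3, 5),
    "code examples": (3, 5),
    "homework": (2, 5),
}


def total_range(ranges):
    """Sum the low ends and the high ends of each (low, high) pair."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


low, high = total_range(WEEKLY_HOURS)
print(f"Itemized total: {low} to {high} hours per week")
```

The itemized ranges add up to 8 to 15 hours, so the 10 to 15 hour planning figure sits in the upper part of that span and leaves some slack for setup and debugging.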