Data Engineering Zoomcamp: Free Data Engineering Course and Certification

Q: How does the Data Engineering Zoomcamp certificate work?

To earn a certificate , you need to complete one capstone project by building an end-to-end data pipeline. After submitting your project, you must also review at least 3 other students’ projects by the deadline and provide constructive feedback.

Q: Who runs the Data Engineering Zoomcamp?

DE Zoomcamp is run by DataTalks.Club , a global online community of data professionals and learners. While the initial idea and most of the content were created by Alexey Grigorev , members of the DataTalks.Club community contribute as instructors and maintainers. DataTalks.Club is often referred to as “the DataTalks Club”, “data talks club”, or “datatalks club”.

Become a Data Engineer: Master Modern Data Engineering with Hands-On Training

25 Nov 2025 by Valeriia Kuka

Breaking into data engineering takes real, hands-on experience with production tools, but most courses stop at theory.

The Data Engineering Zoomcamp changes that. It’s a free data engineering course that teaches you how to build production-grade data pipelines from start to finish. You’ll work with Docker, Terraform, BigQuery, dbt, Spark, and Kafka, and graduate with a portfolio project and a certificate.

Free Data Engineering Zoomcamp 2026 course curriculum - 9-week program covering Docker, Terraform, BigQuery, dbt, Spark, and Kafka for building production data pipelines — Complete Data Engineering Zoomcamp curriculum: from infrastructure setup to stream processing

It’s ideal for beginners and career switchers preparing for junior data engineer roles, as well as experienced professionals who want to refresh their knowledge, expand their network, or test themselves as a mentor for less experienced professionals.

Unlike most courses, DE Zoomcamp helps you build your public portfolio, share your work confidently, and connect with a global community for feedback, mentorship, and career opportunities.

TL;DR: Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here.

What is the Data Engineering Zoomcamp?
Why Learn Data Engineering?
Who Can Learn Data Engineering?
Course Prerequisites
Course Curriculum: What You’ll Learn
How the Data Engineering Zoomcamp Works
Zoomcamp vs. Bootcamp
How to Get Started
Testimonials
Frequently Asked Questions

What is the Data Engineering Zoomcamp?

Data Engineering Zoomcamp is a 9-week program that follows a clear progression: infrastructure setup, workflow orchestration, data warehousing, analytics engineering, batch processing, streaming, and a final capstone project.

Data Engineering Zoomcamp GitHub repository - free course materials, homework assignments, and video lectures for learning data engineering — Data Engineering Zoomcamp GitHub repository showing the course materials

What makes it different is the community. You’ll join an active Slack workspace where thousands of learners troubleshoot together, share progress, and connect for jobs and collaborations. The course encourages learning in public: sharing your work earns bonus points and builds your online presence.

The final three weeks focus on your capstone that gets peer-reviewed, and you graduate with a polished GitHub portfolio that proves you can ship real data systems.

Why Learn Data Engineering?

Typically, data science teams are comprised of data analysts, data scientists, and data engineers. Among different data roles, data engineers are the specialists who connect all the pieces of the data ecosystem within a company or institution.

Data team roles list - data analyst, data scientist, data engineer, analytics engineer, and machine learning engineer positions explained — Data engineers are the specialists who connect all the pieces of the data ecosystem.

But why would you want to become one? Here are some of the main reasons that make data engineering roles rewarding and valuable:

You become the builder behind every data product that keeps information flowing.
You increase your earning potential by joining a smaller, high-value pool of professionals.
You develop transferable skills that are valuable across industries and provide long-term career flexibility. These skills are also foundational for roles in machine learning and MLOps.
You future-proof your career by building a mindset that will stay essential even as tools change and processes get automated.

Who Can Learn Data Engineering (and What Roles It Leads To)

The great thing about data engineering is that you don’t need to be a computer science graduate or have years of experience in data to start.

If you understand basic programming and have curiosity about how data systems work, you can learn data engineering.

If you tick most of the following boxes, data engineering might be a good fit for you:

Enjoy solving technical, logical problems and building things that work reliably.
Like Python, SQL, or scripting and want to use those skills for something impactful.
Want to understand how data moves inside organizations — from raw sources to analysis and AI (including LLM applications).
Prefer practical, system-level work over purely theoretical or statistical modeling.
Appreciate clear structure and step-by-step learning (which is how DE Zoomcamp is built).

Course Prerequisites

As mentioned earlier, learning data engineering doesn’t require prior data engineering experience.

The course is designed to be accessible to beginners.

The only requirements are:

Comfort with the command line (basic navigation and file operations)
Basic SQL knowledge (SELECT, WHERE, JOIN statements)
Python experience is helpful but not required

If you’re completely new to programming, consider spending a few weeks learning the basics before starting the course.

Course Curriculum: What You’ll Learn in the Data Engineering Zoomcamp

The course follows a logical progression from infrastructure setup to advanced data processing, culminating in an end-to-end project.

Module	What You'll Learn	Tools & Technologies
1. Infrastructure & Prerequisites	• Set up your development environment with Docker and PostgreSQL • Learn cloud basics with GCP • Master infrastructure-as-code using Terraform	Docker, PostgreSQL, GCP, Terraform
2. Workflow Orchestration	• Master data pipeline orchestration with Mage.AI • Implement and manage Data Lakes using Google Cloud Storage • Build automated, reproducible workflows	Mage.AI, Google Cloud Storage
3. Data Warehouse	• Deep dive into BigQuery for enterprise data warehousing • Learn optimization techniques like partitioning and clustering • Implement best practices for data storage and retrieval	BigQuery
4. Analytics Engineering	• Transform raw data into analytics-ready models using dbt • Develop testing and documentation strategies • Create impactful visualizations with modern BI tools	dbt, BI tools
5. Batch Processing	• Process large-scale data with Apache Spark • Master Spark SQL and DataFrame operations • Optimize batch processing workflows	Apache Spark, Spark SQL
6. Stream Processing	• Build real-time data pipelines with Kafka • Develop streaming applications using KSQL and Faust • Implement stream processing patterns	Kafka, KSQL, Faust
Final Project	• Build an end-to-end data pipeline from ingestion to visualization • Apply all learned concepts in a real-world project • Create a portfolio-ready project with documentation	Cloud platforms (GCP/AWS/Azure), Terraform, Spark, Kafka, dbt, BigQuery

Capstone Project

The final three weeks are dedicated to applying your knowledge in a real-world project that showcases everything you’ve learned throughout the course.

You’ll build an end-to-end data pipeline using a dataset of your choice, implementing both data lake and warehouse solutions with proper documentation.

Your project will be peer-reviewed by fellow participants and you’ll peer-review at least three other projects.

Project Requirements	Deliverables	Evaluation Criteria
• Select and process a dataset that interests you • Build end-to-end data pipelines (batch or streaming) • Implement both data lake and warehouse solutions • Create analytical dashboards	• Production-ready data pipeline • Documented data models • Interactive dashboard • Project presentation	• Peer review of at least three other projects • Technical implementation quality • Documentation completeness • Solution architecture design

Join the next cohort →

How the Data Engineering Zoomcamp Works

The course runs for 9 weeks in cohort format, providing structure and community support throughout your learning journey.

While you can access all course materials at your own pace without joining a cohort, participating in the structured program offers significant advantages: graded homework assignments, project submission and evaluation, peer interaction, and the opportunity to earn a certificate.

Below, we list the key features of the course and how they work.

GitHub Repository: The Central Hub of the Course

Data Engineering Zoomcamp course materials on GitHub - complete learning resources for data pipeline development and data engineering certification — Data Engineering Zoomcamp GitHub repository showing the course materials

All course materials live in the GitHub repository. The lectures are pre-recorded and available on YouTube, so you can watch at your own pace.

Data Engineering Zoomcamp YouTube playlist - free video lectures on Docker, Terraform, BigQuery, dbt, Spark, Kafka, and data pipeline development — Data Engineering Zoomcamp YouTube playlist with pre-recorded lectures

Homework Assignments

To reinforce your learning, we release homework assignments for each week of the course. You can submit a homework assignment at the end of each week.

Data Engineering Zoomcamp 2026 schedule - weekly course timeline with homework submission deadlines for 9-week data engineering certification program — Data Engineering Zoomcamp schedule showing the course schedule and submission deadlines

It doesn’t count toward your certificate, but it helps you practice and appears on an optional anonymous leaderboard.

Your scores are added to an anonymous leaderboard, creating friendly competition among learners and motivating you to do your best.

Data Engineering Zoomcamp homework leaderboard - anonymous student rankings and scores showing course progress and achievements — Course leaderboard displaying student progress and achievements anonymously

You can earn bonus points by learning in public — sharing your work on blogs, YouTube, or social media. The next section explains how it works in practice.

Learning in Public: Build Your Online Presence

A unique feature is our “learning in public” approach, inspired by Shawn @swyx Wang’s article. We believe that everyone has something valuable to contribute, regardless of their expertise level.

Learning in public concept by Shawn Wang - building online portfolio and sharing data engineering projects publicly for career growth — An extract from Shawn @swyx Wang's article about learning in public

Throughout the course, we actively encourage and incentivize learning in public. By sharing your progress, insights, and projects online, you earn additional points for your homework and projects.

Data Engineering Zoomcamp leaderboard with bonus points - students earn extra credit by sharing data engineering projects and learning publicly on social media — Previous cohort's leaderboard highlighting bonus points earned through learning in public activities

This not only demonstrates your knowledge but also builds a portfolio of valuable content. Sharing your work online also helps you get noticed by social media algorithms, reaching a broader audience and creating opportunities to connect with individuals and organizations you may not have encountered otherwise.

Many of our graduates have shared that their social media presence has helped them attract job offers and collaborations.

How to Get a Certificate

Data Engineering Zoomcamp certificate 2026 - free data engineering certification awarded after completing capstone project and peer reviews — Data Engineering Zoomcamp certificate showing the certificate requirements

To earn your certificate, you must complete the course with a live cohort and fulfill three key requirements:

Build a capstone project: Create an end-to-end data pipeline that demonstrates your mastery of the course concepts
Submit on time: Meet the project submission deadline to qualify for certification
Peer review: Evaluate and provide feedback on 3 fellow students’ projects during the peer review process

What is DataTalks.Club Community? A Place to Connect and Learn with Other Data Professionals

DataTalks.Club Slack community - data engineering students and professionals discussing course topics, sharing projects, and helping each other learn data pipelines — Active discussions and peer support in our dedicated Slack community channel

DataTalks.Club is a global community of 80,000+ data professionals who connect on Slack to share knowledge, ask career questions, and discuss everything from analytics and visualization to machine learning and data engineering. As one of the largest digital groups dedicated to data, it’s where you’ll find data scientists, ML engineers, data analysts, and enthusiasts at all career stages.

When you join a cohort, the dedicated course channel becomes your home base. Here, you’ll troubleshoot problems with peers working through the same challenges and share your progress and insights.

Data Engineering Zoomcamp FAQ - frequently asked questions about course registration, homework, certification, and technical troubleshooting — DataTalks.Club FAQ repository showing common questions and technical issues

Beyond peer support, there are two ways to get help: our FAQ repository covers common questions and technical issues, and the @ZoomcampQABot in Slack provides quick answers when you need them.

Data Engineering Zoomcamp QA Bot on Slack - automated assistant answering course questions about data pipelines, homework, and technical issues — DataTalks.Club Zoomcamp QABot in Slack providing quick answers when you need them

Zoomcamp vs. Bootcamp: What’s the Difference?

We often get asked what the difference is between the Data Engineering Zoomcamp and paid bootcamps.

Below, we list the key features of the Data Engineering Zoomcamp and how they compare to paid bootcamps.

Feature	DE Zoomcamp (Cohort)	DE Zoomcamp (Self-paced)	Paid Bootcamps
Cost	Free	Free	$2,000–$10,000+
Format	9-week cohort with fixed schedule	Learn anytime at your own pace	Fixed schedule, instructor-led
Homework	Weekly scored assignments	Available but no scoring	Weekly with instructor feedback
Projects	Capstone project with peer review and scoring	Build on your own, no evaluation and scoring	Instructor-reviewed projects
Certificate	Yes, after completing project + peer reviews	No certificate	Certificate of completion
Community Support	Active Slack + optional live Q&A sessions	Slack community only	Instructor-led, 1:1 or group mentorship
Learning in Public	Encouraged with bonus points	Optional	Rarely emphasized
Timeline	9 weeks (Jan–Mar 2026)	Flexible, self-paced	Typically 12–24 weeks
Best For	Career switchers or experienced data engineers wanting community and accountability	Self-motivated learners exploring data engineering	Those needing intensive structured mentorship and guidance

How to Get Started with the Data Engineering Zoomcamp

To get started with the Zoomcamp, you can either join a live cohort or learn at your own pace.

All materials are freely available in the Data Engineering Zoomcamp GitHub repository. Each module has its own folder (e.g., 01-docker, 03-data-warehouse), and cohort-specific homework and deadlines are in the cohorts directory. Lectures are pre-recorded and available in our official Data Engineering Zoomcamp YouTube Playlist, so you can watch at your own pace.

Learn at Your Own Pace

The self-paced mode lets you start immediately and progress on your own schedule. You’ll get access to the course materials and the community and can complete the course at a pace that works for you.

All you need is to go to the Data Engineering Zoomcamp GitHub repository and start learning. It serves as a central hub for the course for easier navigation through the course materials. All the lectures are pre-recorded and available on YouTube in our official Data Engineering Zoomcamp YouTube Playlist, so you can watch whenever it suits you.

You can also join the DataTalks.Club Slack community to get help and support from the community in the #course-data-engineering channel. Homework and solutions are available on the course platform, and you can build a project for your portfolio.

Remember, self-paced learning does not include homework submissions, project evaluations, or the ability to earn a certificate. To receive certification, you need to join an active cohort.

Join a Live Cohort

2026 Cohort: Starts January 2026. Register here: Fill in this form

When you join a live cohort, you’ll work through the same materials as self-paced learners, but with the added structure of a published schedule and the energy of hundreds of peers progressing alongside you.

The course runs once per year (starts around January). Each module typically spans one week: you watch the lectures, complete hands-on exercises, and submit a homework assignment. Your submissions get scored and appear on an anonymous leaderboard.

After completing all six modules, the capstone project phase begins. You’ll build your own end-to-end pipeline, submit it through our form, and peer-review at least three other students’ projects while yours gets reviewed by your peers. This reciprocal process gives you valuable feedback on your work and exposes you to different approaches and solutions you might not have considered.

Even if you join after the official start date, you can still follow along — but note that some homework forms may already be closed. All active deadlines are listed on the course platform.

To earn a certificate, you’ll need enough time to complete one final project and the required peer reviews. Details are in the Projects and Certificate sections.

Ready to join DE Zoomcamp? Here’s how it works:

Register for the course, you’ll be automatically accepted into the next cohort
Join the DataTalks.Club Slack community and the #course-data-engineering channel for updates, questions, and peer support
(Optional) Get a head start by exploring the GitHub repository and watching lectures before the cohort officially begins
When the cohort starts, you’ll receive an email with the full schedule and submission deadlines
Follow the weekly rhythm: watch lectures, complete exercises, submit homework
During the final three weeks, build and submit your capstone project, then peer-review three other projects
Receive your certificate once your project passes peer review

The entire journey takes 9 weeks from start to certificate, and you’ll be part of a global cohort tackling the same challenges at the same time.

Comparison

We summarized the key differences between the two joining options in this table:

Feature	Self-Paced	Live Cohort
Timing	Learn at your own pace, start anytime	Fixed 9-week schedule (January–March each year)
Course Materials	Full access to GitHub repository and YouTube lectures	Full access to GitHub repository and YouTube lectures
Community	Access to Slack community (`#course-data-engineering`)	Access to Slack community (`#course-data-engineering`)
Homework	Available but not scored	Scored automatically, appears on leaderboard
Projects	Build on your own, no evaluation	Submit 1 final project (capstone) with peer review
Certificate	Not available	Available after completing project and peer reviews
Structure	Flexible, no deadlines	Weekly rhythm with deadlines and peer accountability

Join the next cohort →

Testimonials

We’ve collected some testimonials from our students who have completed the Data Engineering Zoomcamp.

Thank you for what you do! The Data Engineering Zoomcamp gave me skills that helped me land my first tech job.

— Tim Claytor (Source)

Three months might seem like a long time, but the growth and learning during this period are truly remarkable. It was a great experience with a lot of learning, connecting with like-minded people from all around the world, and having fun. I must admit, this was really hard. But the feeling of accomplishment and learning made it all worthwhile. And I would do it again!

— Nevenka Lukic (Source)

One of the significant things I inferred from the Zoomcamp is to prioritize fundamentals and principles over ever-evolving tools and tech stacks. Hugely grateful to Alexey Grigorev for putting together this incredible course and offering it for free.

— Siddhartha Gogoi (Source)

Such a fun deep dive into data engineering, cloud automation, and orchestration. I learned so much along the way. Big shoutout to Alexey Grigorev and the DataTalksClub team for the opportunity and guidance throughout the 3 months of the free course.

— Assitan NIARE (Source)

If you’re serious about breaking into data engineering, start here. The repo’s structure, community, and hands-on focus make it unparalleled.

— Wady Osama (Source)

Frequently Asked Questions

Yes. The Data Engineering Zoomcamp is a free, project-based data engineering course that covers pipelines, data warehouses, batch and streaming, and orchestration—without any tuition fees.

Bootcamps usually charge tuition, offer full-time schedules, and sometimes include dedicated career services. Zoomcamps are part-time, community-run, open-source, and free, with an emphasis on self-directed learning and peer support.

During an active cohort, you can earn a certificate by completing the final project, reviewing peers’ projects, and meeting all deadlines. Self-paced learners don’t receive certificates but still build the same projects.

To earn a certificate, you need to complete one capstone project by building an end-to-end data pipeline. After submitting your project, you must also review at least 3 other students’ projects by the deadline and provide constructive feedback.

The next cohort of the DE Zoomcamp starts around January each year. Register here: https://airtable.com/appzbS8Pkg9PL254a/shr6oVXeQvSI5HuWD before the course starts.

No, the 2025 cohort is closed. You can join the 2026 cohort when it starts in January 2026. Register here before the course starts.

DE Zoomcamp is run by DataTalks.Club, a global online community of data professionals and learners. While the initial idea and most of the content were created by Alexey Grigorev, members of the DataTalks.Club community contribute as instructors and maintainers.

DataTalks.Club is often referred to as “the DataTalks Club”, “data talks club”, or “datatalks club”.

You should be comfortable with basic coding (Python or similar), the command line, and basic SQL. No prior data engineering experience is required.

Expect to spend 5-15 hours per week, depending on your background. This includes watching videos, completing homework, and working on the capstone project. More time may be needed during the final project weeks.

Yes! All course materials, videos, and recordings remain available after the cohort ends, and you can learn at your own pace. You’ll have access to the Slack community for support. However, self-paced learning does not include homework submissions, project evaluations, or the ability to earn a certificate. To receive a certificate, you need to join an active cohort.

You have multiple support channels available. Join the Slack community where you can ask questions and get help from instructors and fellow students. We also have an FAQ repository with answers to common questions and a @ZoomcampQABot in Slack for quick help.

The GitHub repository is https://github.com/DataTalksClub/data-engineering-zoomcamp.

You can find the full syllabus in the readme of the Data Engineering Zoomcamp GitHub repository.

Yes, people use these names interchangeably. Throughout this page we’ll use “Data Engineering Zoomcamp” as the canonical name.

The course focuses on building real-world pipelines and infrastructure with tools like Docker, BigQuery, Spark, and Kafka. Many learners have used the final project and GitHub portfolio as part of their data engineering job search.

You can request an invite via the Slack signup form. After confirming your email, you’ll receive an invitation link to join the workspace.

The main community hub is Slack. Data Engineering Zoomcamp also has a Telegram announcement channel, but all core discussion happens in Slack.

Join the next cohort →

Subscribe to our weekly newsletter and join our Slack.
We'll keep you informed about our events, articles, courses, and everything else happening in the Club.

DataTalks.Club

Data Engineering Zoomcamp: Free Data Engineering Course and Certification

Become a Data Engineer: Master Modern Data Engineering with Hands-On Training

Table of Contents

What is the Data Engineering Zoomcamp?

Why Learn Data Engineering?

Who Can Learn Data Engineering (and What Roles It Leads To)

Course Prerequisites

Course Curriculum: What You’ll Learn in the Data Engineering Zoomcamp

Capstone Project

How the Data Engineering Zoomcamp Works

GitHub Repository: The Central Hub of the Course

Homework Assignments

Learning in Public: Build Your Online Presence

How to Get a Certificate

What is DataTalks.Club Community? A Place to Connect and Learn with Other Data Professionals

Zoomcamp vs. Bootcamp: What’s the Difference?

How to Get Started with the Data Engineering Zoomcamp

Learn at Your Own Pace

Join a Live Cohort

Comparison

Testimonials

Frequently Asked Questions

Related Posts

Free DataTalks.Club Courses: ML, Data Engineering, MLOps, LLM & AI Dev Tools Zoomcamps

ML Zoomcamp: Free Machine Learning Engineering Course and Certification

MLOps Zoomcamp: Free MLOps Course and Certification

LLM Zoomcamp: Free LLM Engineering Course and Certification

AI Dev Tools Zoomcamp: Free Course to Master AI Tools for Developers