Final Project
Build an end-to-end data pipeline with a dashboard using any dataset, tools, and cloud provider you choose.
What You’ll Build
Over two weeks, create a pipeline that:
- Ingests data from your chosen source
- Processes and stores it in a data lake
- Moves it to a data warehouse
- Transforms it for analysis
- Visualizes results in a dashboard
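The five stages above can be sketched end to end with a small local stand-in (a minimal sketch, not a cloud implementation: a CSV file on disk plays the role of the data lake, sqlite3 plays the role of the warehouse, and all dataset and column names are illustrative):

```python
# Local sketch of the pipeline stages: ingest -> lake -> warehouse -> transform.
# Standard library only; swap each piece for your chosen cloud services.
import csv
import sqlite3
import tempfile
from pathlib import Path

def run_pipeline(records):
    # 1) Ingest raw records and 2) store them in the "lake" (a CSV file here)
    lake = Path(tempfile.mkdtemp()) / "raw_trips.csv"
    with lake.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["day", "fare"])
        writer.writerows(records)

    # 3) Load the raw file into the "warehouse" (an in-memory SQLite table here)
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE trips (day TEXT, fare REAL)")
    with lake.open() as f:
        rows = [(r["day"], float(r["fare"])) for r in csv.DictReader(f)]
    con.executemany("INSERT INTO trips VALUES (?, ?)", rows)

    # 4) Transform for analysis: one row per day, ready for a dashboard
    return con.execute(
        "SELECT day, COUNT(*), AVG(fare) FROM trips GROUP BY day ORDER BY day"
    ).fetchall()

result = run_pipeline(
    [("2025-01-01", 10.0), ("2025-01-01", 14.0), ("2025-01-02", 8.0)]
)
print(result)  # [('2025-01-01', 2, 12.0), ('2025-01-02', 1, 8.0)]
```

In a real submission each stage becomes its own component (e.g. an Airflow DAG task per step), but the data flow is the same.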
Your Choices
You decide:
- Dataset (any source you’re interested in; see datasets for suggestions)
- Processing approach (batch or stream)
- Cloud provider (AWS, GCP, Azure, etc.)
- Tools (Terraform, Airflow, Spark, Kafka, dbt, etc.)
Dashboard
Create a dashboard with at least two visualizations. Use Looker Studio, Streamlit, or any BI tool.
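If you pick Streamlit, a two-visualization dashboard can be sketched as below (a hypothetical example: the data, column names, and chart choices are placeholders for your own; the aggregation is kept in a plain function so it works without Streamlit installed):

```python
# Minimal Streamlit dashboard sketch: a bar chart of rides per day and a
# line chart of average fare per day, built from (day, fare) records.
from collections import defaultdict

def daily_stats(rows):
    """Aggregate (day, fare) records into {day: (ride_count, avg_fare)}."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for day, fare in rows:
        counts[day] += 1
        totals[day] += fare
    return {d: (counts[d], totals[d] / counts[d]) for d in counts}

def main():
    import streamlit as st  # imported lazily so the aggregation runs standalone
    # In a real project, query these rows from your warehouse instead.
    rows = [("2025-01-01", 10.0), ("2025-01-01", 14.0), ("2025-01-02", 8.0)]
    stats = daily_stats(rows)
    days = sorted(stats)
    st.title("Trips dashboard")
    st.bar_chart({"rides per day": [stats[d][0] for d in days]})
    st.line_chart({"avg fare per day": [stats[d][1] for d in days]})

if __name__ == "__main__":
    main()
```

Run it with `streamlit run app.py`; BI tools like Looker Studio achieve the same result by pointing directly at your warehouse tables.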
Submission
You get two attempts to submit your project. If you fail the first attempt or need more time, use the second. See the project submission guide for details on how to submit, commit IDs, certificate name, and other platform mechanics.
You are not restricted to the technologies covered in the course: even where the course uses GCP, you are free to use AWS, Azure, or anything else. If you use tools outside the course content, document everything in detail so your peer reviewers can understand your choices.
Evaluation
Peer review is mandatory to pass the project. After submitting, you must review 3 projects from your cohort. See the peer review guide for evaluation tips and process details.
Your project will be assessed on:
- Problem description and clarity
- Cloud and infrastructure-as-code usage
- Data ingestion pipeline (batch or stream)
- Data warehouse optimization
- Data transformations
- Dashboard quality
- Code reproducibility and documentation
Resources
- README.md - Complete requirements and evaluation criteria
- datasets - Curated dataset suggestions
- 2025 projects - 2025 cohort submissions
- 2024 projects - 2024 cohort submissions
- Projects Gallery - Community projects for inspiration
- 2023 examples
- 2022 examples