Final Project

Build an end-to-end data pipeline with a dashboard using any dataset, tools, and cloud provider you choose.

What You’ll Build

Over two weeks, create a pipeline that:

  • Ingests data from your chosen source
  • Processes and stores it in a data lake
  • Moves it to a data warehouse
  • Transforms it for analysis
  • Visualizes results in a dashboard
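To make the stages concrete, here is a minimal batch sketch of the first four steps using only the Python standard library, with a local file standing in for the data lake and SQLite standing in for the warehouse. The dataset, table, and column names (`trips`, `borough`, `fare`) are hypothetical; your pipeline will use real cloud services and your own schema.

```python
import csv
import sqlite3

# Hypothetical raw extract; a real pipeline would pull this from your
# chosen source (an API, a public dataset, a Kafka topic, ...).
RAW_CSV = """trip_id,pickup_date,borough,fare
1,2024-01-01,Queens,12.50
2,2024-01-01,Brooklyn,8.75
3,2024-01-02,Queens,20.00
"""

def ingest(raw: str, lake_path: str) -> None:
    """Data-lake step: land the raw file unchanged."""
    with open(lake_path, "w") as f:
        f.write(raw)

def load_warehouse(lake_path: str, conn: sqlite3.Connection) -> int:
    """Warehouse step: parse the landed file into a typed table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trips "
        "(trip_id INTEGER, pickup_date TEXT, borough TEXT, fare REAL)"
    )
    with open(lake_path) as f:
        rows = [
            (int(r["trip_id"]), r["pickup_date"], r["borough"], float(r["fare"]))
            for r in csv.DictReader(f)
        ]
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?, ?)", rows)
    return len(rows)

def transform(conn: sqlite3.Connection) -> list:
    """Transform step: a dashboard-ready aggregate."""
    return conn.execute(
        "SELECT borough, COUNT(*), ROUND(AVG(fare), 2) "
        "FROM trips GROUP BY borough ORDER BY borough"
    ).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    ingest(RAW_CSV, "raw_trips.csv")
    load_warehouse("raw_trips.csv", conn)
    print(transform(conn))
```

In a real submission each function would be replaced by a managed service or tool — ingestion orchestrated by Airflow, the lake on object storage, the warehouse in BigQuery/Redshift/Synapse, and the transform in dbt or Spark — but the data flow is the same.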

Your Choices

You decide:

  • Dataset (any source you’re interested in; see the datasets list for suggestions)
  • Processing approach (batch or stream)
  • Cloud provider (AWS, GCP, Azure, etc.)
  • Tools (Terraform, Airflow, Spark, Kafka, dbt, etc.)

Dashboard

Create a dashboard with at least two visualizations. Use Looker Studio, Streamlit, or any BI tool.
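Whichever BI tool you choose, each visualization is typically backed by one aggregate query against the warehouse. A hedged stdlib sketch, assuming a hypothetical `trips` table, of the two aggregates a bar chart and a line chart in Looker Studio or Streamlit would render:

```python
import sqlite3

# Hypothetical warehouse table; your schema will differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (pickup_date TEXT, borough TEXT, fare REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [
        ("2024-01-01", "Queens", 12.50),
        ("2024-01-01", "Brooklyn", 8.75),
        ("2024-01-02", "Queens", 20.00),
    ],
)

# Visualization 1: a categorical breakdown (e.g. a bar chart).
trips_by_borough = conn.execute(
    "SELECT borough, COUNT(*) FROM trips GROUP BY borough ORDER BY borough"
).fetchall()

# Visualization 2: a time series (e.g. a line chart).
trips_by_day = conn.execute(
    "SELECT pickup_date, COUNT(*) FROM trips "
    "GROUP BY pickup_date ORDER BY pickup_date"
).fetchall()

print(trips_by_borough)  # [('Brooklyn', 1), ('Queens', 2)]
print(trips_by_day)      # [('2024-01-01', 2), ('2024-01-02', 1)]
```

A categorical breakdown plus a time series is a common way to satisfy the two-visualization minimum, but any two charts that answer a question about your dataset will do.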

Submission

You have two attempts to submit your project. If your first attempt fails or you need more time, use the second. See the project submission guide for details on how to submit, commit IDs, certificate name, and other platform mechanics.

You are not restricted to the technologies covered in the course. If the course uses GCP, you are free to use AWS, Azure, or anything else. If you use tools outside the course content, make sure to document everything in detail so your peer reviewers can understand your choices.

Evaluation

Peer review is mandatory to pass the project: after submitting, you must review three projects from your cohort. See the peer review guide for evaluation tips and process details.

Your project will be assessed on:

  • Problem description and clarity
  • Cloud and infrastructure-as-code usage
  • Data ingestion pipeline (batch or stream)
  • Data warehouse optimization
  • Data transformations
  • Dashboard quality
  • Code reproducibility and documentation

Resources

Project guidelines on GitHub