In this article, we will talk about LLM Zoomcamp, our free online course to get started with real-life applications of LLMs. In 10 weeks, you will learn how to build an AI system that answers questions about your knowledge base.
We will cover the different aspects of the course so you can decide whether it is a good fit for you.
Who is the course for?
Before we get into the details, it’s important to know what skills you should have to comfortably join the course.
Here are the main prerequisites for the course:
- Comfortable with programming in Python
- Comfortable with the command line
- Familiarity with Docker
- No previous exposure to AI or ML is required
Course Curriculum
- Module 1: Introduction to LLMs and RAG
- Module 2: Open-source LLMs
- Module 3: Vector Databases
- Module 4: Evaluation and Monitoring
- Module 5: LLM Orchestration and Ingestion
- Module 6: Best Practices
- Module 7 (bonus): End-to-End Project
Let’s quickly review each module, focusing on the main points and the tech you’ll use.
Module 1: Introduction to LLMs and RAG
We introduce the core ideas behind Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). You’ll set up your development environment, learn how retrieval works, and start experimenting with APIs and search tools. By the end of this module, you’ll have a basic RAG setup and be familiar with text search fundamentals.
You will learn to:
- Set up your environment for LLM and RAG experimentation
- Understand the basics of retrieval and search
- Use the OpenAI API to integrate LLM capabilities
- Build a simple RAG system
- Implement basic text search with Elasticsearch
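The RAG loop from this module can be sketched in a few lines. The snippet below is a toy illustration, not the course's actual code: the keyword search stands in for Elasticsearch, the FAQ documents are made up, and the OpenAI call is shown but commented out since it needs an API key.

```python
# Minimal RAG sketch: retrieve relevant documents, build a prompt from them,
# then (in a real setup) send that prompt to an LLM.

docs = [
    {"q": "How do I install Docker?", "a": "Follow the official Docker install guide for your OS."},
    {"q": "Can I join the course late?", "a": "Yes, all materials stay available after the start."},
]

def search(query, top_k=1):
    """Toy retrieval: score documents by the number of words shared with the query."""
    words = set(query.lower().split())
    scored = [(len(words & set((d["q"] + " " + d["a"]).lower().split())), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]

def build_prompt(query, results):
    """Put the retrieved documents into the prompt as context for the LLM."""
    context = "\n".join(f"Q: {d['q']}\nA: {d['a']}" for d in results)
    return f"Answer the QUESTION using only the CONTEXT.\n\nCONTEXT:\n{context}\n\nQUESTION: {query}"

prompt = build_prompt("how to install docker", search("how to install docker"))
# With an OpenAI API key, the final step would look something like:
# client.chat.completions.create(model="gpt-4o-mini",
#                                messages=[{"role": "user", "content": prompt}])
```

In the course, the toy `search` function is replaced by a proper Elasticsearch index, but the retrieve-then-generate shape stays the same.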
Module 2: Open-source LLMs
We dive into the world of open-source LLMs, providing hands-on experience with popular, freely available models. You’ll learn how to configure a GPU environment, access models from the Hugging Face Hub, and even run LLMs on CPUs when GPUs aren’t available. This module ends with creating a simple UI to see your model in action.
You will learn to:
- Set up and optimize a GPU environment
- Access and use open-source models from Hugging Face
- Run models on a CPU using Ollama when GPUs aren’t available
- Create a basic, interactive UI with Streamlit for testing your model
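To give a feel for the Ollama part of this module: Ollama serves a local HTTP API (by default on port 11434, with a chat endpoint at `/api/chat`). The sketch below only builds the request with the standard library; actually sending it assumes a running Ollama server and a pulled model, and the model name `phi3` is just an example.

```python
# Build (but don't send) a chat request for a local Ollama server.
import json
import urllib.request

def ollama_chat_request(model, prompt, host="http://localhost:11434"):
    """Prepare a POST request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response instead of a stream
    }
    return urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = ollama_chat_request("phi3", "What is RAG in one sentence?")
# With Ollama running: urllib.request.urlopen(req) returns the model's reply as JSON.
```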
Module 3: Vector Databases
This module covers how to use vector databases for effective search and retrieval. You’ll learn to create embeddings (vector representations of text), index them, and use vector search to improve RAG performance.
You will learn to:
- Create and index embeddings for vector-based retrieval
- Implement vector search using Elasticsearch
- Conduct offline evaluations to assess your retrieval system
- Get hands-on practice with dlt for indexing and searching embeddings
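The core idea of vector search fits in a few lines. In the sketch below, the document vectors are tiny hand-made stand-ins (in the course, an embedding model produces them) and the search is a linear scan (a vector database like Elasticsearch uses an index to do this at scale); the document ids are invented for illustration.

```python
# Brute-force vector search: rank documents by cosine similarity to the query.
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy "embeddings" for two documents.
index = {
    "doc_docker": [0.9, 0.1, 0.0],
    "doc_refunds": [0.0, 0.2, 0.9],
}

def vector_search(query_vec, top_k=1):
    ranked = sorted(index, key=lambda doc_id: cosine(index[doc_id], query_vec), reverse=True)
    return ranked[:top_k]

top = vector_search([0.8, 0.2, 0.1])  # a query vector "close to" the Docker doc
```

The payoff over plain text search is that similarity is measured in meaning-space: a query can match a document even when they share no exact keywords.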
Module 4: Evaluation and Monitoring
We focus on evaluating your RAG system and setting up monitoring tools. You’ll explore different metrics to judge your system’s performance and set up a feedback loop for continuous improvement. Grafana dashboards will help you visualize insights and track system usage.
You will learn to:
- Perform offline evaluations of your RAG pipeline
- Use cosine similarity and LLM-as-a-Judge metrics to assess retrieval
- Track chat history and collect user feedback for iterative improvement
- Build Grafana dashboards to monitor performance in real time
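Two metrics commonly used for offline retrieval evaluation are hit rate (did the relevant document show up at all?) and mean reciprocal rank, or MRR (how high up did it show up?). The sketch below assumes you have a set of test queries with known relevant documents; the ids and rankings are made up for illustration.

```python
# Offline retrieval evaluation: hit rate and MRR over a set of test queries.
# `results` holds the ranked doc ids returned for each query;
# `relevant` holds the known correct doc id for each query.

def hit_rate(results, relevant):
    """Fraction of queries whose relevant document appears in the results."""
    return sum(r in ranked for ranked, r in zip(results, relevant)) / len(relevant)

def mrr(results, relevant):
    """Mean reciprocal rank: 1/position of the relevant document, averaged."""
    total = 0.0
    for ranked, r in zip(results, relevant):
        if r in ranked:
            total += 1.0 / (ranked.index(r) + 1)
    return total / len(relevant)

results = [["d1", "d2"], ["d3", "d1"], ["d2", "d3"]]
relevant = ["d1", "d1", "d1"]
hr = hit_rate(results, relevant)  # found in 2 of 3 queries
score = mrr(results, relevant)    # ranks 1, 2, and miss -> (1 + 0.5 + 0) / 3
```

Running these on every change to your retrieval setup gives you a feedback loop: you can tell whether a tweak actually improved ranking before putting it in front of users.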
Module 5: LLM Orchestration and Ingestion
This module teaches you how to efficiently manage data ingestion for LLMs.
You will learn to:
- Ingest data from your sources into a RAG system
- Set up a repeatable data pipeline for LLM projects
- Prepare data for scalable, efficient processing in RAG systems
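A typical early step in an ingestion pipeline is splitting long documents into overlapping chunks before embedding and indexing them. The chunk sizes below are illustrative, not the course's settings; real pipelines tune them to the embedding model and the data.

```python
# Split a document into overlapping word-based chunks for indexing.

def chunk_text(text, size=50, overlap=10):
    """Split `text` into chunks of `size` words, each overlapping the previous by `overlap`."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]

doc = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(doc)  # 120 words -> 3 chunks of up to 50 words each
```

The overlap matters: it keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.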
Module 6: Best Practices
We dive into advanced techniques for refining your RAG pipeline, from improving retrieval quality to enhancing search relevance. You’ll practice hybrid search methods, document reranking, and explore using LangChain for more complex applications.
You will learn to:
- Apply best practices to optimize your RAG pipeline
- Use hybrid search techniques to increase retrieval accuracy
- Implement document reranking to enhance search results
- Set up a hybrid search with LangChain for advanced retrieval tasks
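One common way to build hybrid search is to run a keyword search and a vector search separately, then merge the two rankings with Reciprocal Rank Fusion (RRF). The sketch below assumes you already have the two ranked result lists; the doc ids are invented, and `k=60` is the constant conventionally used with RRF.

```python
# Reciprocal Rank Fusion: merge several ranked lists into one.

def rrf(rankings, k=60):
    """Score each doc by sum of 1/(k + rank) across rankings; higher is better."""
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["d2", "d1", "d3"]  # ranking from keyword search
vector_hits = ["d1", "d3", "d2"]   # ranking from vector search
fused = rrf([keyword_hits, vector_hits])
```

Because RRF only looks at rank positions, it sidesteps the problem that keyword scores and cosine similarities live on incompatible scales.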
Module 7: End-to-End Project
You’ll bring everything together in a practical end-to-end project, applying all the skills you’ve learned, from data preprocessing to deploying your solution.
You will learn to:
- Build an end-to-end project using RAG techniques
- Practice preprocessing text data for specific use cases
- Apply learned techniques in a real-world project
The course description on GitHub provides a detailed overview of the topics covered in each module. To dive into the content, you can watch the video lectures and review the slides, code, and community notes for each course module.
If you’re ready to join the next cohort of the course, submit this form to register and stay updated.
Theory and practice
We make LLM theory accessible and engaging through real-world examples. We also demonstrate code directly in the lectures to show how the concepts are implemented, so you can easily apply them in your own projects.
For instance, in one of the lectures, a linear algebra refresher, the lecturer switches between screens: first they explain the concept of the dot product of two vectors, and then they demonstrate its implementation in Python.
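That lecture example can be sketched in a couple of lines: the dot product of two vectors is the sum of their element-wise products, and the Python implementation mirrors the formula directly.

```python
# Dot product of two vectors: sum of element-wise products.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

result = dot([1, 2, 3], [4, 5, 6])  # 1*4 + 2*5 + 3*6 = 32
```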
Course assignments and scoring
Homework and getting feedback
To reinforce your learning, we offer regular homework assignments. Your scores are added to a leaderboard, creating friendly competition among course members and motivating you to do your best.
For support, we have an FAQ section with quick answers to common questions. If you need more help, our Slack community is always available for technical questions, clarifications, or guidance. Additionally, we host live Q&A sessions called “office hours” where you can interact with instructors and get immediate answers to your questions.
Learning in public approach
A unique feature is our “learning in public” approach, inspired by Shawn @swyx Wang’s article. We believe everyone has something valuable to contribute, regardless of their expertise level.
Throughout the course, we actively encourage and incentivize learning in public. By sharing your progress, insights, and projects online, you earn additional points for your homework and projects.
Sharing your work online also helps social media algorithms surface it to a broader audience, creating opportunities to connect with individuals and organizations you may not have encountered otherwise.
Course projects for your portfolio
If you’ve ever interviewed for a job or researched job requirements online, you likely understand the significance of personal projects. To receive a certificate, you’ll need to finalize and submit an end-to-end RAG application. This lets you choose a problem that interests you, find a suitable dataset, and build your own application.
DataTalks.Club community
DataTalks.Club has a supportive community of like-minded individuals in our Slack. It is the perfect place to enhance your skills, deepen your knowledge, and connect with peers who share your passion. These connections can lead to lasting friendships, potential collaborations in future projects, and exciting career prospects.
Conclusion
LLM Zoomcamp is a structured and practical introduction to applying Large Language Models in real-world contexts. Over 10 weeks, you gain hands-on experience, from setting up retrieval systems to building a complete RAG application.
Each module is crafted to build useful skills step by step, ensuring you can put what you learn into practice. If you’re interested in learning about and applying LLMs, joining the next cohort is a good way to start.
Register for the next LLM Zoomcamp cohort and stay updated on start dates by filling out this form.