LLM Zoomcamp FAQ
General Course-Related Questions
# I just discovered the course. Can I still join?
Yes, but if you want to receive a certificate, you need to submit your project while we’re still accepting submissions.
# Course: I have registered for the LLM Zoomcamp. When can I expect to receive the confirmation email?
You don't need it. You're accepted. You can also just start learning and submitting homework (while the form is open) without registering. It is not checked against any registered list. Registration is just to gauge interest before the start date.
# What is the video/zoom link to the stream for the “Office Hours” or live/workshop sessions?
The Zoom link is only published to instructors/presenters/TAs.
Students participate via YouTube Live and submit questions to Slido (link is pinned in the chat when live). The video URL should be posted in the announcements channel on Telegram and Slack before it begins. You can also watch live on the DataTalksClub YouTube Channel.
Don’t post questions in chat as they may be missed if the room is very active.
# How should I start the course and follow the weekly workflow?
Start with the LLM Zoomcamp docs, the general Zoomcamp logistics docs, and the LLM Zoomcamp GitHub repository.
You can start whenever you want. The videos and GitHub materials are available, and the deadlines are listed in the course management platform.
A typical workflow is:
- Watch the lesson videos.
- Work through the lesson notebooks/code.
- Read the homework instructions on GitHub.
- Submit answers through the course platform before the deadline.
Homework is similar to the lesson flow, but uses a different dataset or slightly different task.
# Leaderboard: I can't find myself on the leaderboard. How do I know which entry is mine?
When you set up your account, you are automatically assigned a random name, such as “Lucid Elbakyan.” Click on the "Jump to your record on the leaderboard" link to find your entry.
If you want to see what your Display name is, click on the "Edit Course Profile" button.

- First field: This is your nickname/displayed name. You can change it if you want to be known by your Slack username, GitHub username, or any other nickname of your choice. This is useful if you want to remain anonymous.
- Second field: Change this to your official name as in your identification documents—passport, national ID card, driver's license, etc. This is mandatory if you do not want "Lucid Elbakyan" on your certificate. This name will appear on your Certificate!
# Certificate: Can I follow the course in a self-paced mode and get a certificate?
No, you can only get a certificate if you finish the course with a "live" cohort.
We don't award certificates for the self-paced mode because you need to peer-review three capstone projects after submitting your own.
Peer review is only possible while the course is running, after the submission form has closed and the peer-review list has been compiled.
# I missed the first homework - can I still get a certificate?
Yes. To get the certificate, you only need to pass the capstone project. Homework is not mandatory, though it is recommended for reinforcing concepts, and the points awarded count towards your rank on the leaderboard.
# Homework: Why does the content keep changing?
If the homework title contains [DRAFT], it means the homework is not ready yet.
The homework is ready only when both are true:
- The homework form is open on the course management platform.
- The homework title does not contain [DRAFT].
Until then, the content can still change. Working on the material or homework in advance is at your own risk, because the final version can be different.
# When will the course be offered next?
Summer 2027.
# Are there any lectures/videos? Where are they?
Use the LLM Zoomcamp GitHub repository as the main entry point.
Open the lesson folders in the repo. Each lesson page has the relevant videos linked at the top.

When in doubt, follow the GitHub repo first, because it is easier to keep updated than the YouTube playlist.
# Where can I track the LLM Zoomcamp syllabus, deadlines, homework, and progress?
Use the LLM Zoomcamp course management platform.
It contains the current cohort structure, homework, deadlines, and progress tracking. The process is the same as in other DataTalks.Club Zoomcamps.
# Are there live sessions or office hours for each module?
There are no separate live sessions for every module by default. Module materials are pre-recorded and available in the course repo.
Live sessions are announced separately when they happen. If you are stuck, ask your question in Slack and follow the asking questions guidelines.
Optional extra support is available through AI Shipping Labs, a paid community that includes regular Zoom office hours and additional structure. This is optional; the DataTalks.Club course content remains free.
# Can I use Bluesky for learning in public credits?
Yes. Bluesky posts can be used for learning in public credits.
# Where is the LLM Zoomcamp Telegram channel?
The Telegram channel is https://t.me/llm_zoomcamp.
Use it for announcements. For technical discussion and questions, use the course Slack channel.
Module 1: RAG
# Why are we not using LangChain in the course?
LangChain is a framework for building LLM-powered apps. In this course, we first build the core pieces ourselves: prompting, retrieval, indexing, and evaluation.
Think of it like learning HTML, CSS, and JavaScript before using React or Angular. Frameworks are easier to use well once you understand what they automate.
# OpenAI: Error when running OpenAI responses.create command
You may receive the following error when running the OpenAI responses.create command due to insufficient credits in your OpenAI account:
OpenAI API Error: Insufficient credits
# OpenAI: Error: RateLimitError: Error code: 429 -
RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
The above errors are related to your OpenAI API account’s quota. There is no free usage of OpenAI’s API, so you will need to add funds using a credit card (see pay-as-you-go in the OpenAI settings at platform.openai.com). Once added, re-run your Python command and you should receive a successful return code.
Steps to resolve:
- Add credits to your account here (min $5).
- In responses.create(model='gpt-4o', …), specify one of the models available to you.
- You might need to recreate an API key after adding credits to your account and update it locally.
# OpenAI: How much will I have to spend to use the OpenAI API?
Using the OpenAI API for the course should cost very little. You can recharge starting from $5, but initial usage is usually fractions of one cent.
# OpenAI: Do I have to subscribe and pay for the OpenAI API for this course?
No, you don't have to pay for this service to complete the course homework. You can use free or low-cost alternatives listed in the course GitHub repo.
See the course list of OpenAI API alternatives.
# Authentication: Why is my OPENAI_API_KEY not found in the Jupyter notebook?
Make sure you installed and used python-dotenv.
pip install python-dotenv
Then load the .env file in the notebook before creating the OpenAI client:
from dotenv import load_dotenv
load_dotenv()
Also check that the variable name in .env is exactly OPENAI_API_KEY.
# How to store and load API keys using .env file
Store API keys in a .env file and load them with python-dotenv, as recommended in the course.
Add .env to .gitignore so keys are never committed:
.env
Create a .env file:
OPENAI_API_KEY=sk-...
GROQ_API_KEY=gsk_...
GEMINI_API_KEY=...
Install python-dotenv if needed:
pip install python-dotenv
Load the keys in Python:
import os
from dotenv import load_dotenv
load_dotenv()
openai_api_key = os.getenv("OPENAI_API_KEY")
groq_api_key = os.getenv("GROQ_API_KEY")
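os.getenv returns None when a key is missing, and the problem then only surfaces later as an authentication error. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper name introduced here for illustration and is not part of python-dotenv.

```python
import os


def require_env(name, environ=os.environ):
    """Return a required environment variable or raise a clear error."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; check your .env file and call load_dotenv() first"
        )
    return value
```

After load_dotenv(), you could then write openai_api_key = require_env("OPENAI_API_KEY") instead of os.getenv(...).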
# Can I use a model or provider different from the one recommended in homework?
Yes. The recommended model is not mandatory. You can use OpenAI, Gemini, Groq, OpenRouter, Azure OpenAI, local models, or another provider.
The homework is designed so you do not need a paid service. You may need to adapt the code for your provider, because response formats, tool schemas, and tokenizers differ.
For provider ideas, see the course list of OpenAI API alternatives.
# How do I start using Google Gemini models in the Module 1 notebook through the OpenAI-compatible endpoint?
To get started you need three things:
- A Gemini API key saved in your .env file, for example as GEMINI_API_KEY.
- An OpenAI client pointed at Google’s OpenAI-compatible base URL.
- Your selected Google Gemini model name in your request.
Example code (loads the API key from .env, creates the Gemini client, and defines the llm helper):
import os
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()
client = OpenAI(
    api_key=os.getenv("GEMINI_API_KEY"),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)
def llm(instructions, user_prompt, model="gemini-3.1-flash-lite"):
    message_history = [
        {"role": "developer", "content": instructions},
        {"role": "user", "content": user_prompt}
    ]
    response = client.chat.completions.create(
        model=model,
        messages=message_history
    )
    return response.choices[0].message.content
This uses the older chat completions style via the OpenAI-compatible endpoint, whereas many course examples use the newer Responses format. That means you will need to change the notebook code in a few places, especially where it reads the model response and where it handles tools or function calls.
# Ollama: How to install Ollama?
First, install Ollama by visiting https://ollama.com/download and choosing your operating system:
- macOS: Download the installer and run it.
- Windows: Download the installer and run it.
- Linux: Run the following command in the terminal:
curl -fsSL https://ollama.com/install.sh | sh
Once installed, open a terminal and type:
ollama run llama3
This command will:
- Download the LLaMA 3 model (~4GB).
- Start the model locally.
- Open a chat-like interface where you can type questions.
To test the Ollama local server, run the following command:
curl http://localhost:11434
You should receive the response:
Ollama is running
Then, install the Python client with:
pip install ollama
Here is a minimal Python example:
import ollama
your_prompt = "Why is the sky blue?"  # replace with your own prompt
response = ollama.chat(
    model='llama3',
    messages=[{"role": "user", "content": your_prompt}]
)
print(response['message']['content'])
# Connection refused error when prompting Ollama RAG
If you encounter this error while doing the homework, you can resolve it by restarting the Ollama server using the following command:
!nohup ollama serve > nohup.out 2>&1 &
Make sure to rerun the cell containing ollama serve if you stop and restart the notebook.
# OpenAI: Why does my token count differ from what OpenAI reports?
When using tiktoken.encode() to count tokens in your prompt, you might see a difference compared to OpenAI’s API response. For instance, you might get 320 tokens, while OpenAI reports 327. This is due to internal tokens added by OpenAI’s chat formatting.
Here’s what happens:
- Each message in a chat.completions.create() call (e.g., {role: "user", content: "..."}) adds 4 structural tokens (role, content, separators).
- The API adds 2 tokens globally to mark the start of assistant response generation.
- Extra newlines, whitespace, or uncommon Unicode characters in your content may slightly increase the token count.
Thus, even if your visible text is 320 tokens, OpenAI may count 327 due to these internal additions.
# API keys: how do I set them once and not re-export every terminal?
Use dirdotenv. It is like direnv, but works with both .env and .envrc, and is more portable across shells and operating systems.
uv tool install dirdotenv
# add this line to your ~/.bashrc or ~/.zshrc:
eval "$(dirdotenv hook bash)" # or zsh
# inside your project:
echo 'OPENAI_API_KEY=sk-...' > .env
echo '.env' >> .gitignore
After that, the key is loaded automatically when you cd into the project directory.
Important: always add .env and .envrc to .gitignore so keys never land on GitHub.
direnv is also fine if you already use it. In that case, create .envrc, add your exports there, and run direnv allow.
For GitHub Codespaces, use the built-in Codespaces secrets instead of files in the repo.
For Python scripts, the equivalent is python-dotenv:
from dotenv import load_dotenv
load_dotenv() # loads .env from project root
# How should I choose field weights for minsearch or another search engine?
The systematic approach is to evaluate different weight settings against a ground-truth dataset.
For example:
- Create a small set of representative questions.
- Mark which documents should be retrieved for each question.
- Try different field weights.
- Compare retrieval metrics such as hit rate, precision@k, recall@k, or MRR.
You can tune weights by trial and error for small projects, but evaluation is the more reliable approach. The course covers this topic more directly in the evaluation module.
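To make the metrics above concrete, here is a minimal sketch of hit rate and MRR in plain Python. It assumes you have already run each question through your search function and recorded, per returned rank, whether the result was the expected document. The `relevance` rows below are made-up illustration data, not course results.

```python
def hit_rate(relevance):
    """Share of questions with at least one relevant document in the results."""
    return sum(1 for row in relevance if any(row)) / len(relevance)


def mrr(relevance):
    """Mean reciprocal rank of the first relevant document (0 if none was found)."""
    total = 0.0
    for row in relevance:
        for rank, is_relevant in enumerate(row, start=1):
            if is_relevant:
                total += 1.0 / rank
                break
    return total / len(relevance)


# One row per question; each entry says whether the result at that rank
# was the expected document.
relevance = [
    [True, False, False],   # found at rank 1
    [False, False, True],   # found at rank 3
    [False, False, False],  # not found
]

print(hit_rate(relevance))  # 2 of the 3 questions had a hit
print(mrr(relevance))       # (1 + 1/3 + 0) / 3
```

Run the same evaluation once per weight setting and keep the setting with the best scores.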
# How should I prepare documents for RAG?
Prepare the data so it is clean, structured, and easy to chunk and retrieve.
Common steps:
- Remove obvious noise such as broken HTML, duplicate text, boilerplate, OCR errors, repeated headers, and repeated footers.
- Preserve useful context such as titles, section names, dates, page numbers, speaker names, and Q&A structure.
- Store the result in a structured format that is easy to process. JSON is often convenient, but it is not mandatory.
- Chunk the documents in a way that keeps related context together.
- Keep metadata that may help filtering or ranking later.
The exact format depends on the source data. The goal is not just to make the text shorter, but to make retrieval more accurate.
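One possible way to chunk while keeping metadata is a sliding window over words, sketched below. The function names, the "text" field, and the default sizes are assumptions made for this example; the course materials may chunk differently.

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping windows of at most `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def chunk_document(doc, size=200, overlap=50):
    """Chunk doc["text"] and copy the other fields into every chunk as metadata."""
    metadata = {key: value for key, value in doc.items() if key != "text"}
    return [
        {**metadata, "chunk_id": i, "text": chunk}
        for i, chunk in enumerate(chunk_text(doc["text"], size, overlap))
    ]
```

The overlap keeps sentences that straddle a boundary visible in both neighbouring chunks, and the copied metadata stays available for filtering or ranking.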
# uv says Failed to hardlink files in Codespaces. Is it an error?
No. This warning can happen in GitHub Codespaces when uv cannot hardlink files between the cache and the target environment.
The package still installs. uv falls back to copying files.
To suppress the warning for the current shell:
export UV_LINK_MODE=copy
To make it persistent:
echo 'export UV_LINK_MODE=copy' >> ~/.bashrc
source ~/.bashrc
See the uv documentation for more details: https://docs.astral.sh/uv/.
# dotenv is not recognized. What should I install?
Install python-dotenv:
uv add python-dotenv
Then import and use it in Python:
from dotenv import load_dotenv
load_dotenv()
The package is documented here: python-dotenv.
# Can I run the course locally instead of Codespaces?
Yes. Codespaces is just the easiest way for everyone to start with the same environment.
You can run the course locally if you are comfortable setting up Python, uv, Jupyter, Docker, and any other tools needed for the module.
If you run locally, make sure you document your setup and keep your environment reproducible.
# What happens to code saved in Codespaces if I do not commit it?
The code is saved inside the Codespaces Linux VM.
However, you should still commit your work regularly. Codespaces can stop, disconnect, or be deleted later, and committing makes sure your work is stored in your GitHub repository.
# WSL2: ResponseError: model requires more system memory (X.X GiB) than is available (Y.Y GiB). My system has more than X.X GiB.
Your WSL2 is set to use Y.Y GiB, not all your computer memory. To allocate more RAM, follow these steps:
- Create a .wslconfig file under your Windows user profile directory: C:\Users\YourUsername\.wslconfig.
- Include the desired RAM allocation in the file:
[wsl2]
memory=8GB
- Restart WSL using the command:
wsl --shutdown
- Run the free command in WSL to verify the changes.
For more details, read this article.
Module 1: Agentic RAG
# What are tools and functions in agentic RAG?
In the context of this course, tools and functions are closely related terms. Do not worry too much about the naming difference.
A tool/function is something the model can call when it needs external help, such as searching documents, querying a database, calling an API, or running a calculation.
The important idea is that in agentic RAG, the model can decide when to call a tool/function. In basic RAG, retrieval is usually a fixed step that happens before the model answers.
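The application side of a tool call can be sketched without any provider at all: the model returns a tool name plus JSON-encoded arguments, and your code looks up and runs the matching Python function. The names below (`search_faq`, `TOOLS`, `run_tool_call`) are hypothetical and exist only for this illustration.

```python
import json


def search_faq(query: str) -> list[str]:
    """Stand-in for a real search function; returns a canned result."""
    return [f"FAQ entry matching '{query}'"]


# Maps the tool names the model may request to the Python functions that run them.
TOOLS = {"search_faq": search_faq}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one tool call requested by the model and return a JSON string."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    arguments = json.loads(arguments_json)
    result = TOOLS[name](**arguments)
    return json.dumps(result)


# In agentic RAG the model decides whether to emit a call like this one.
print(run_tool_call("search_faq", '{"query": "certificate"}'))
```

The returned string is what you would send back to the model as the tool result so it can continue answering.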
# Any free models with tool use support?
Several Groq models offer tool use, such as DeepSeek R1 or Llama 4, all of which can be used for free for development.
Other providers also support tool or function calling, including Mistral, Gemini, and some local Ollama models.
You'll typically need to adapt the code when not using OpenAI, because tool schemas and response shapes differ between providers.
For more details, see the Groq Tool Use Documentation.
# Agents: "AttributeError: 'str' object has no attribute 'output'" when using OpenAI's Responses API on a non-OpenAI model
The new OpenAI Responses API (client.responses.create(...), accessed via response.output) is OpenAI-specific. Other providers (Mistral, Groq, Gemini, etc.) don't implement it.
For non-OpenAI providers, use the chat-completions API and read response.choices[0].message.content:
response = client.chat.completions.create(
    model="<provider-model>",
    messages=[{"role": "user", "content": prompt}],
    tools=tools_schema,  # may need adapting per provider
)
answer = response.choices[0].message.content  # may be None if the model requested a tool call
You'll also have to adapt the tools schema to whatever shape your provider expects.
It is okay to use the older chat.completions API for homework or projects if your provider supports that interface better than the Responses API.
# I am using Azure OpenAI and I am still getting an error of Error code: 400 - {'error': {'message': "Missing required parameter: 'tools[0].function'.", 'type': 'invalid_request_error', 'param': 'tools[0].function', 'code': 'missing_required_parameter'}}?
Modify the get_weather_tool JSON to be the following:
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a specific city or generate fake data",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The name of the city to get the weather for."
                }
            },
            "required": ["city"],
            "additionalProperties": False
        }
    }
}
Module 1 Homework
# Where can I find the homework questions?
Homework links are available in the course GitHub repo and in the course management platform.
For the 2026 Module 1 homework, use the homework instructions in the course GitHub repo.
The course platform is useful for submission and deadlines, but the GitHub homework instructions often contain important extra context.
# Homework: Returning Empty list after filtering my query (HW Q3)
This is most likely an indexing error. Apply the index settings before adding the data to the index; after that, your filters and query should return results.
# OpenRouter: Error code 402 when calling responses.create (max_output_tokens)
OpenRouter can return APIStatusError with code 402 when responses.create() is called without a reasonable max_output_tokens limit. OpenRouter checks your billing limits against the maximum possible output (which can be very large, around 65536 tokens), so a free or low-limit key can be rejected before the model runs. This is different from a direct OpenAI endpoint, which typically returns 429 for insufficient quota.
Fix
Pass a lower limit in your responses.create() call:
import os
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()
client = OpenAI()  # uses OPENAI_API_KEY and OPENAI_BASE_URL from .env
response = client.responses.create(
    model=os.environ["OPENAI_MODEL"],
    input=message_history,
    max_output_tokens=1024,
)
For Module 1 homework with rag_helper.py, add the same parameter in the llm() method.
response = client.responses.create(
    model=os.environ["OPENAI_MODEL"],
    input=message_history,
    max_output_tokens=1024,
)
For Module 1 homework Q6 with ToyAIKit:
import os
from toyaikit.llm import OpenAIClient
llm_client = OpenAIClient(
    model=os.environ["OPENAI_MODEL"],
    extra_kwargs={"max_output_tokens": 1024},
)
1024 is enough for homework answers; you can raise it later if needed.
If it still fails
- Confirm your .env points at OpenRouter:
OPENAI_API_KEY=sk-or-v1-...
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_MODEL=openai/gpt-oss-120b:free
- Check your OpenRouter key limit and remaining credits at https://openrouter.ai/settings/keys
- Prefer a pinned model (for example openai/gpt-oss-120b:free) instead of openrouter/free, which can route to models with different limits.
Note: If you still encounter issues after these steps, double-check the model and endpoint configuration, and make sure the key is valid and has sufficient credits.
# What does it mean to point RAG at the chunk index in Module 1 homework?
It means you should build and search the index using the chunked documents, not the original full documents.
For example, if your original records are in documents and your split records are in chunks, fit the search index on chunks:
index.fit(chunks)
not:
index.fit(documents)
Then the RAG pipeline retrieves relevant chunks and puts only those chunks into the prompt. This is what reduces the amount of context sent to the model.
# Do homework answers need to match the options exactly?
Not always. If your numeric answer is close to one of the options, choose the closest option.
Small differences can come from:
- Slightly different filtering.
- Different dataset versions.
- Floating-point or rounding differences.
- Different model/provider behavior.
If your answer is far from every option, re-check the question, the dataset version, and the GitHub homework instructions.
See the general homework guidance: homework logistics.
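Picking the closest option is just a minimum over absolute differences. A tiny sketch with made-up numbers (the values are not taken from any homework):

```python
def closest_option(answer, options):
    """Return the option with the smallest absolute difference from the answer."""
    return min(options, key=lambda option: abs(option - answer))


# Illustration data only.
print(closest_option(0.643, [0.25, 0.45, 0.65, 0.85]))  # 0.65
```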
# What should I do if homework questions feel unclear?
First read the GitHub homework instructions, not only the course platform page.
For Module 1 in the 2026 cohort, start with the homework instructions in the course GitHub repo.
The homework follows the lesson workflow, but usually uses a different dataset or asks you to apply the same idea in a slightly different way.
If it is still unclear, ask in Slack and include the code or command output as text, not a screenshot. Follow the asking questions guidelines.
# How do I get token counts for Module 1 homework if I use a different provider?
For the current Module 1 homework, get the token count from the model response object.
For example, OpenAI-compatible clients usually return usage information on the response, such as response.usage.input_tokens or response.usage.prompt_tokens, depending on the API style.
If you use a non-OpenAI provider, check the provider's response object for its usage fields and adapt the code. Do not use tiktoken or cl100k_base as a generic tokenizer for Gemini, Mistral, Hugging Face, Groq, or other providers because tokenization differs by model.
If your provider does not expose token usage, use that provider's native tokenizer as an approximation. For multiple-choice homework questions, choose the closest option.
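Because the field name depends on the API style, a small helper can try both names mentioned above. This is a sketch under the assumption that the response exposes a `usage` attribute; the `SimpleNamespace` objects are fake responses used only to demonstrate the helper, and `get_input_tokens` is a hypothetical name.

```python
from types import SimpleNamespace


def get_input_tokens(response):
    """Read the prompt token count from a response, whichever field name is used."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    for field in ("input_tokens", "prompt_tokens"):
        value = getattr(usage, field, None)
        if value is not None:
            return value
    return None


# Fake response objects standing in for real API responses.
responses_style = SimpleNamespace(usage=SimpleNamespace(input_tokens=327))
chat_style = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=327))

print(get_input_tokens(responses_style), get_input_tokens(chat_style))  # 327 327
```

If the helper returns None, check your provider's documentation for where it reports usage.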
Module 2: Vector Search
# What are embeddings?
An embedding is a numerical representation (a vector) of non-numerical data such as text, produced in a way that preserves meaning and context. Similar inputs should yield similar vectors, so the distance between vectors can be used to measure semantic similarity mathematically. Related concepts include the "vector space model" and "dimensionality reduction."
# Warning: 'model "multi-qa-mpnet-base-dot-v1" was made on sentence transformers v3.0.0 bet' how to suppress?
To suppress the warning, upgrade sentence-transformers to version 3.0.0 or higher. You can do this by running the following command:
pip install -U "sentence-transformers>=3.0.0"
# Why was .dot(...) used directly to compute cosine similarity in the lesson, but normalization is emphasized in the homework?
In the lesson, .dot(...) was used under the assumption that the embeddings returned by the model (e.g., by model.encode(...)) are already normalized to unit length. In that case, the dot product is mathematically equivalent to cosine similarity.
In the homework, however, we use classic embeddings like TF-IDF + SVD, which are not normalized by default. This means that the dot product does not represent cosine similarity unless we manually normalize the vectors.
# Vector search: should I embed the question, the answer, or both?
There's no single right answer — it's an experiment to run on your dataset. The course shows three options:
- Embed the answer (text) only: works because the model captures semantic similarity between questions and their answers.
- Embed the question only: works because user queries look like the indexed questions.
- Embed question + " " + text: often the best, but produces longer input and slightly more cost.
Pick whichever gives the best hit rate / MRR on your ground-truth set. The course materials include a side-by-side comparison.
# What is the cosine similarity?
Cosine similarity measures how similar two non-zero vectors are. It is often used in text analysis to determine how similar two documents are based on their content. The metric is the cosine of the angle between the two vectors, which can be word counts, TF-IDF values, or embeddings of the documents. The value ranges from -1 to 1: 1 means the vectors point in the same direction, 0 means they are orthogonal (no similarity), and -1 means they point in opposite directions.
# Why does cosine similarity reduce to a matrix multiplication between the embeddings and the query vector?
Cosine similarity measures how aligned two vectors are, regardless of their magnitude. When all vectors (including the query) are normalized to unit length, their magnitudes no longer matter. In this case, cosine similarity is equivalent to simply taking the dot product between the query and each document embedding. This allows us to compute similarities efficiently using matrix multiplication.
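The equivalence can be checked on a toy example. The sketch below uses plain Python lists instead of NumPy so it runs anywhere; the vectors are made up, and `matvec` plays the role of the matrix multiplication (in NumPy this would be a single `.dot(...)` call).

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine(a, b):
    """Cosine similarity computed directly from the definition."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def matvec(matrix, vector):
    """Matrix-vector product: one dot product per row."""
    return [sum(x * y for x, y in zip(row, vector)) for row in matrix]


documents = [[3.0, 4.0], [1.0, 0.0], [-2.0, 0.0]]  # made-up embeddings
query = [6.0, 8.0]

# Normalize once, then score every document with a single matrix-vector product.
scores = matvec([normalize(d) for d in documents], normalize(query))
expected = [cosine(d, query) for d in documents]

print(scores)
print(expected)
```

Both lists agree up to floating-point rounding, which is why normalized embeddings can be scored with one matrix multiplication.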
# Do I need a new GitHub repo for Module 2, or just a new codespace?
Just a new codespace. A codespace is an environment (see Can I run the course locally instead of Codespaces?); you create it from your existing repository, so you don't need a new GitHub repo.
Use a separate codespace for Module 2 because the vector-search dependencies are fairly heavy. Keeping them isolated means you can simply stop or delete that codespace when you're done, rather than leaving the extra weight in your Module 1 environment. Setup is quick.
Since you may delete this codespace later, commit your work to your repository (see What happens to code saved in Codespaces if I do not commit it?).
Module 4: Evaluation
# Evaluation: "JSONDecodeError: Expecting value" when generating ground-truth questions with the LLM
The LLM sometimes wraps the JSON in a markdown code fence or adds prose around it, so json.loads(response) fails with:
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Force JSON output with OpenAI's response_format:
import json
response = openai_client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[{"role": "user", "content": prompt}],
    response_format={"type": "json_object"},
)
parsed = json.loads(response.choices[0].message.content)
Also be explicit in the prompt about the expected shape:
Output a JSON object with a single key "questions" whose value is a list of 5 strings.
Do not include any extra text, explanation, or formatting.
Most providers have an equivalent (Gemini's response_mime_type="application/json", Groq's response_format, etc.).
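If your provider has no JSON mode, a defensive parser is a possible fallback. The sketch below first tries json.loads and, if that fails, extracts the outermost {...} span, which skips markdown fences and surrounding prose. `parse_llm_json` is a hypothetical helper, and this approach assumes the reply contains exactly one JSON object.

```python
import json
import re


def parse_llm_json(text: str):
    """Parse JSON from an LLM reply, tolerating code fences and surrounding prose."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span in the reply.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in the model output")
    return json.loads(match.group(0))


fence = "`" * 3  # a markdown code fence, built this way to keep the example readable
reply = f'Here you go:\n{fence}json\n{{"questions": ["q1", "q2"]}}\n{fence}'
print(parse_llm_json(reply)["questions"])  # ['q1', 'q2']
```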
# Evaluation: Jupyter kernel crashes when embedding the ground-truth set
Small-RAM machines (Codespaces default, low-end laptops) run out of memory when an embedding model is loaded alongside the rest of the notebook state.
Workarounds:
- Switch to a smaller embedder. sentence-transformers/all-MiniLM-L6-v2 (384-dim) is a common drop-in. Note: switching models will change your hit-rate / MRR numbers, so re-run the eval after the switch.
- Move the embedding step into a separate Python script that you run from the terminal, then load the saved vectors back into the notebook.
- Use a Codespaces machine type with more RAM (Settings → "Machine type" on a Codespace), or run locally.
- Process the ground-truth set in batches and free memory between batches (del, gc.collect()).
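The batching workaround can be sketched as follows. `embed_batch` is an assumption: it stands for whatever function embeds a list of texts in your notebook (for example a wrapper around model.encode), and `fake_embed_batch` is a stand-in so the sketch runs without a model.

```python
import gc


def embed_in_batches(texts, embed_batch, batch_size=32):
    """Embed texts batch by batch so only one batch is processed at a time."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors.extend(embed_batch(batch))
        del batch
        gc.collect()  # release temporary objects before the next batch
    return vectors


def fake_embed_batch(batch):
    """Hypothetical stand-in for a real embedding call."""
    return [[float(len(text))] for text in batch]


print(embed_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2))
```

A smaller batch_size lowers peak memory at the cost of more calls to the model.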
# Evaluation: hitting rate limits while generating the ground-truth dataset
Free-tier Gemini limits both per-minute and per-day requests. Adding time.sleep(4) only fixes the per-minute side — a long tqdm loop can still blow through the per-day quota in one run.
Options when this happens:
- Spend ~$5 on OpenAI and use gpt-4o-mini. It's cheap enough to embed/generate the entire ground-truth set and has higher rate limits.
- Use Groq's free tier (llama-3.3-70b-versatile), which has generous request-per-minute limits.
- Lower concurrency for thread-pool calls. Use a smaller pool size (2–3 workers) instead of pushing the API hard.
- Resume from where you stopped. Save progress periodically (e.g. dump the partial results to a JSONL file) so hitting a limit doesn't lose all your work.
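Resuming can be sketched with an append-only JSONL file: each finished document is written immediately, and a rerun skips the ids already on disk. The record layout ("id", "questions") and the function names are assumptions for this example; `fake_generate` stands in for the real LLM call.

```python
import json
import os
import tempfile


def load_done_ids(path):
    """Return the ids already written to the JSONL file (empty set if none)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return {json.loads(line)["id"] for line in f if line.strip()}


def generate_with_resume(documents, generate, path):
    """Call generate(doc) only for documents that are not in the file yet."""
    done = load_done_ids(path)
    with open(path, "a", encoding="utf-8") as f:
        for doc in documents:
            if doc["id"] in done:
                continue
            record = {"id": doc["id"], "questions": generate(doc)}
            f.write(json.dumps(record) + "\n")
            f.flush()  # keep progress on disk even if the next call fails


# Demo with a hypothetical generator; the second run skips document "a".
calls = []


def fake_generate(doc):
    calls.append(doc["id"])
    return [f"question about {doc['id']}"]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ground_truth.jsonl")
    generate_with_resume([{"id": "a"}], fake_generate, path)
    generate_with_resume([{"id": "a"}, {"id": "b"}], fake_generate, path)
    done_ids = load_done_ids(path)

print(calls)  # ['a', 'b']
```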
Module 5: Monitoring
# In Windows OS: OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\USER\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\fbgemm.dll" or one of its dependencies.
Solution 1: Install Visual C++ Redistributable.
Solution 2: Install Visual Studio, not Visual Studio Code.

For more details, please follow this link: discuss.pytorch.org
# OperationalError when running python prep.py: psycopg2. OperationalError: could not translate host name "postgres" to address: No such host is known. How do I fix this issue?
To resolve this error, update the .env file:
- Change the POSTGRES_HOST variable to localhost.
POSTGRES_HOST=localhost
# How do I set Pandas to show the entire text content of a column? This is useful for viewing the entire Explanation column in the LLM-as-judge section of the offline-rag-evaluation notebook
By default, Pandas truncates text content in a column to 50 characters. To view the entire explanation provided by the judge LLM for a non-relevant answer, use the following instruction:
pd.set_option('display.max_colwidth', None)
- Option: display.max_colwidth
- Type: int or None
- Description: Sets the maximum width in characters of a column in the representation of a pandas data structure. When a column overflows, a "..." placeholder is used in the output. Setting it to None allows unlimited width.
- Default: 50
Refer to the official documentation for more details.

# How to normalize vectors in a Pandas DataFrame column (or Pandas Series)?
To normalize vectors in a Pandas DataFrame column, you can use the following approach:
import numpy as np
normalize_vec = lambda v: v / np.linalg.norm(v)
df["new_col"] = df["org_col"].apply(normalize_vec)
# How to compute the quantile or percentile of Pandas DataFrame column (or Pandas Series)?
To compute the 75th percentile (the 0.75 quantile):
quantile = df["col"].quantile(q=0.75)
# How can I remove all Docker containers, images, volumes, and builds from the terminal?
- Delete all containers (including running ones):
docker rm -f $(docker ps -aq)
- Remove all images:
docker rmi -f $(docker images -q)
- Delete all volumes:
docker volume rm $(docker volume ls -q)
- Remove the build cache:
docker builder prune -af
# Session State: I want the user to only be able to give feedback once per submission (+1 or -1). When I submit text using the ask button, the buttons should be disabled if `st.session.submitted` is False. The issue is mainly with `st.session.submitted`, which gets reassigned to True again despite one feedback button being pressed.
Module 6: Best Practices
# Docker: When trying to run a streamlit app using docker-compose, I get: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "streamlit": executable file not found in $PATH: unknown. The app runs fine outside of docker-compose
To resolve this issue:
- Ensure you have created a Dockerfile.
- Add streamlit to the docker-compose configuration.
- Run the following command to rebuild and start the application:
docker-compose up --build
Capstone Project
# Is it a group project?
No, the capstone is an individual project.
You can collaborate or discuss a larger idea with other students, but each submitted project must stand on its own. A shared system can work only if it is clearly decomposed into independent projects, where each person has a separate qualifying component and a separate repository.
If the work cannot be evaluated independently for each participant, it does not satisfy the project requirement.
# Do we submit 2 projects? What do attempt 1 and attempt 2 mean?
You only need to submit one project. If your first-attempt submission fails, you can improve it and resubmit during the attempt #2 submission window.
- If you want to submit two projects for the experience and exposure, you must use different datasets and problem statements.
- If you can’t make it to the attempt #1 submission window, you still have time to catch up and meet the attempt #2 submission window.
Remember that the submission does not count towards the certification if you do not participate in the peer-review of three peers in your cohort.
# Does the competition count as the capstone?
No, it does not. You can participate in the math-kaggle-llm-competition as a group if you want to form teams, but the capstone is an individual effort.
# How is my capstone project going to be evaluated?
Each submitted project will be evaluated by three randomly assigned students who have also submitted a project.
You are also responsible for grading the projects of three fellow students. If you do not complete these reviews, you may not receive the certificate at the end of the course.
The final grade you receive will be the median score of the grades from the peer reviewers. The peer review criteria for evaluation must follow the guidelines defined here (TBA for link).
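As a small illustration of the median rule, here is a sketch with made-up scores (the three values are hypothetical, not real grades or a real scale):

```python
from statistics import median

# Hypothetical scores given by the three peer reviewers.
peer_scores = [12, 15, 20]

# The median is the middle value, so one unusually low or high
# review does not shift the final grade.
final_grade = median(peer_scores)
print(final_grade)
```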
# When and how will we be assigned projects for review/grading?
After the submission deadline has passed, an Excel sheet will be shared with 3 projects being assigned to each participant.
# I’ve already submitted my project. Why can’t I review any projects?
Once the project submission deadline has passed, projects will be assigned to you for evaluation. You can't choose which projects to evaluate, and you can’t review before the list has been released.
# How can I find some good ideas or datasets for the project?
Please check this GitHub page for several ideas and datasets that could be used for the project, along with tips and guidelines.
# Project: do I need an orchestration tool (Airflow, Mage, Kestra) for the capstone?
No. A plain Python script that ingests and indexes your data is enough for full points on the "ingestion pipeline" criterion. A Jupyter notebook with the same steps is worth 1 point instead of 2.
Use an orchestrator only if it actually fits your project — for example, recurring ingestion of a feed that updates daily. Don't add one just to score the criterion.
# Project: how do I evaluate a recommender-style RAG (no obvious Q&A ground truth)?
Two complementary approaches that both score for the evaluation criterion:
- Synthetic ground truth (same idea as the course, adapted): for each item in your dataset, prompt the LLM with the item's description and ask it to generate ~5 user queries that should return that item as the top result. Then run those queries through your retrieval and measure hit rate / MRR / NDCG.
- LLM-as-a-judge for end-to-end quality: sample queries, run the full RAG, and have an LLM rate the result for relevance/usefulness on a fixed rubric (e.g. 1–5 scale, with criteria you specify in the prompt).
NDCG is often a better fit than hit-rate for ranking-style problems where multiple items are acceptable answers — it rewards getting good items high in the list, not just first.
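The three retrieval metrics can be sketched for a single query as follows. This is a minimal illustration using binary relevance (an item is either acceptable or not); the function names, the cutoff `k=5`, and the sample IDs are assumptions for the example, not course code. Average each metric over all queries to get the final score.

```python
import math


def hit_rate(ranked_ids, relevant_ids, k=5):
    """1.0 if any relevant item appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant_ids for doc in ranked_ids[:k]) else 0.0


def reciprocal_rank(ranked_ids, relevant_ids, k=5):
    """1 / rank of the first relevant item in the top k, else 0.0."""
    for rank, doc in enumerate(ranked_ids[:k], start=1):
        if doc in relevant_ids:
            return 1.0 / rank
    return 0.0


def ndcg(ranked_ids, relevant_ids, k=5):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked_ids[:k], start=1)
        if doc in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query result: two acceptable items, found at ranks 2 and 4.
ranked = ["b", "a", "x", "c", "y"]
relevant = {"a", "c"}
print(hit_rate(ranked, relevant), reciprocal_rank(ranked, relevant), ndcg(ranked, relevant))
```

Hit rate and reciprocal rank only look at the first relevant item, while NDCG also credits the second acceptable item at rank 4.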
# Project: my corpus is large (long PDFs, many paragraphs). What's a good chunking strategy?
Don't try to find the perfect chunker upfront — iterate.
- Start simple: fixed-size chunking (~1000 tokens with some overlap) and run a small ground-truth eval.
- Try smart chunking: ask an LLM to split each document into logical sections, then index each section.
- Add a short LLM-generated summary per chunk and index it alongside, or use it to boost retrieval.
- For long, structured documents (legal, financial), prefer hybrid search (BM25 + dense) so exact wording isn't lost during semantic matching.
Useful tools for parsing PDFs to clean markdown before chunking:
- pymupdf4llm: fast, decent quality.
- Docling: slower but higher quality on tables/figures.
- GROBID: for academic papers, extracts structure (sections, refs, etc.).
Run the eval again after each change. The goal is measurable improvement on hit rate / MRR for your ground-truth set, not a "perfect" chunker in the abstract.
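The "start simple" step can be sketched as a fixed-size chunker with overlap. This is a minimal illustration, not a recommended implementation: it treats whitespace-separated words as tokens (an assumption; a real tokenizer would count differently), and the function name and default sizes are hypothetical.

```python
def chunk_tokens(tokens, chunk_size=1000, overlap=100):
    """Split a token list into chunks of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a chunk reaches the end, to avoid a tiny trailing duplicate.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Whitespace split stands in for a real tokenizer here.
text = "one two three four five six seven eight nine ten"
chunks = [" ".join(c) for c in chunk_tokens(text.split(), chunk_size=4, overlap=1)]
print(chunks)
```

Keeping the chunker as a small function like this makes it easy to swap in another strategy and rerun the same eval.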
# Project: what does "reproducibility" mean — do reviewers need access to my API keys?
Never share API keys or hosted-service credentials in your repo. Reproducibility means a peer reviewer can clone the repo and follow your README to recreate the system from scratch — using their own credentials.
Concretely:
- Provide a script (or notebook) that ingests the dataset and (re)builds the search index locally.
- Ship a .env.example with the variable names but no values; have the reviewer create their own .env with their own keys. Keep .env in .gitignore.
- Use a cheap model (gpt-4o-mini, Groq, etc.) so reviewers don't burn through credits when running your project.
- Pin dependency versions (requirements.txt / pyproject.toml lock file) and document the Python version (and Docker version, if used).
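One way to make the credentials step friendlier for reviewers is to fail early with a clear message when a key is missing. The sketch below assumes the variables are already in the environment (for example exported from the reviewer's own .env); the variable name `OPENAI_API_KEY` and the function name are placeholders for whatever your project actually uses.

```python
import os

# Hypothetical list; use the same names as in your .env.example.
REQUIRED_VARS = ["OPENAI_API_KEY"]


def load_settings(env=os.environ, required=REQUIRED_VARS):
    """Return the required settings, or raise with a hint if any are missing."""
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: "
            + ", ".join(missing)
            + ". Copy .env.example to .env and fill in your own keys."
        )
    return {name: env[name] for name in required}
```

Calling `load_settings()` at the top of the ingestion script means a reviewer sees the missing-key hint before any API call is made.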
# Can I use something other than Python for homework or the project?
In most cases, use Python. The course materials, examples, and reviewer expectations are Python-based, so Python is the easiest path.
Using another language or stack is technically possible, but do it only if you have a strong reason. We do not want to restrict your choice of technology, but using a different stack makes reproducibility and review harder.
If you use another language, for example Go, your documentation must be very thorough. Assume the reviewer has no knowledge of that language or ecosystem. Your README should explain how to install dependencies and run the homework or project on Windows, macOS, and Linux.
For Go, include steps at the level of:
go mod tidy
go run .
The submission must still be easy to reproduce and evaluate.
# Where can I find previous LLM Zoomcamp projects?
You can browse previous LLM Zoomcamp project submissions here:
These pages show submitted repositories and can help you understand the expected scope and quality of capstone projects.
# Do I need to announce or reserve my project idea?
No. You do not need to announce or reserve your project idea before starting.
If you want feedback, you can ask in Slack. You can also look at previous projects to understand scope:
Similar topics are okay as long as your implementation is your own and your submission meets the project requirements.
Workshop: Open-Source Data Ingestion (dlt)
# Can I use the workshop materials for my own projects or share them with others?
Since dlt is open-source, you can use the content of this workshop for a capstone project. As the main goal of dlt is to load and store data easily, you can even use it for other Zoomcamps (like the MLOps Zoomcamp project). Feel free to ask questions or use it directly in your projects.
# How to set up a new dlt project when loading from cloud?
Start with the following command on the command line:
dlt init filesystem duckdb
More directions can be found at dlthub.com
# There is an error when opening the table using `dbtable = db.open_table("notion_pages___homework")`: `FileNotFoundError: Table notion_pages___homework does not exist. Please first call db.create_table(notion_pages___homework, data)`
The error indicates that you have not changed all instances of "employee_handbook" to "homework" in your pipeline settings.
# There is an error when running main(): FileNotFoundError: Table notion_pages___homework does not exist. Please first call db.create_table(notion_pages___homework, data)
Make sure you open the correct table in line 3:
dbtable = db.open_table("notion_pages___homework")
# How do I know which tables are in the db?
You can use the db.table_names() method to list all the tables in the database.
# Does DLT have connectors to ClickHouse or StarRocks?
Currently, DLT does not have connectors for ClickHouse or StarRocks but is open to contributions from the community to add these connectors.
# Notebook does not have secret access or 401 Client Error: Unauthorized for url: [api.notion.com](https://api.notion.com/v1/search)
If you encounter this error, it typically indicates an authorization issue with the Notion API. Here’s how you can resolve it:
- Check API Key: Ensure that you are using the correct API key with appropriate permissions.
- Verify API Endpoint: Confirm that you are hitting the correct Notion API endpoint.
- Token Expiry: Check if your token has expired and regenerate it if necessary.
- Configurations: Double-check all access configurations in your application.
A 401 Client Error may also indicate that the key is incorrect or has not been granted access. If the error persists, review the API documentation and make sure all necessary authentication steps are correctly implemented.
# Error: The requests library only installs v2.28 instead of the v2.32 required for lancedb. How do I fix it?
To install the correct version directly from the source, use the following command:
pip install "requests @ https://github.com/psf/requests/archive/refs/tags/v2.32.3.zip"