GitHub Codespaces
These notes are based on the video ML Zoomcamp 1.6 - GitHub Codespaces
GitHub Codespaces is a cloud-based development environment that requires minimal configuration. It provides a remote environment with most of the tools needed for the Machine Learning Zoomcamp course. The main advantages include:
- Almost no configuration required
- Remote environment with pre-installed tools
- Seamless integration with GitHub
- Accessible from anywhere
Setting Up a Repository with Codespaces
Creating a New Repository
- Create a new repository on GitHub
- Name it appropriately (e.g., “machine-learning-zoomcamp-homework”)
- Add a README file
- Make it public
- Add a Python .gitignore file
- Click “Create repository”
Launching Codespaces
- Navigate to the repository
- Click on the “Code” button
- Select the “Codespaces” tab
- Click “Create codespace on main”
This will create a Visual Studio Code instance within your browser. You can either:
- Use it directly in the browser
- Open it in VS Code desktop by clicking the button in the corner labeled “Open in VS Code Desktop”
Working with Codespaces
Basic Operations
- The environment feels like local development
- Files can be created and edited as usual
- Terminal is accessible via:
  - Ctrl+` (Control+backtick)
  - View > Terminal menu
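Because the environment feels so much like local development, it can be useful to confirm from code where you are actually running. The sketch below assumes the `CODESPACES` and `CODESPACE_NAME` environment variables that GitHub sets inside a codespace; the function names are hypothetical helpers, not part of any library.

```python
import os


def in_codespace(env=None):
    """Return True when the CODESPACES variable marks this as a codespace."""
    # Assumption: GitHub sets CODESPACES=true inside a codespace.
    env = os.environ if env is None else env
    return env.get("CODESPACES", "").lower() == "true"


def describe_environment(env=None):
    """Build a one-line description of where the code is running."""
    env = os.environ if env is None else env
    if in_codespace(env):
        # Assumption: CODESPACE_NAME holds the name of the current codespace.
        name = env.get("CODESPACE_NAME", "<unknown>")
        return f"Running in codespace: {name}"
    return "Running outside Codespaces"


print(describe_environment())
```

Passing a plain dictionary instead of `os.environ` makes the helpers easy to try out on a local machine.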
Terminal Tips
For a cleaner terminal prompt, you can use:
PS1="> "
This shortens the prompt to just a “>” sign, giving you more space to see your commands.
Git Operations
Git is pre-configured in Codespaces:
git status
git commit -am "message"
git push
Installing Required Libraries
Install the necessary Python libraries using pip:
pip install jupyter numpy pandas scikit-learn seaborn
Additional libraries like XGBoost and TensorFlow can be installed the same way when needed.
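To check that the install worked, one option is to ask Python which of the libraries it can locate. This is a minimal sketch: the mapping from pip package names to import names is written by hand (note that scikit-learn is imported as `sklearn`), and `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# pip package name -> top-level module name used in `import ...`
REQUIRED = {
    "jupyter": "jupyter",
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose modules cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None  # None means the module is not installed
    ]


missing = missing_packages()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required libraries are installed")
```

Extra entries such as XGBoost or TensorFlow can be added to the dictionary once they are needed.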
Using Jupyter Notebooks
Starting Jupyter
Launch Jupyter Notebook:
jupyter notebook
Codespaces automatically detects the running service on port 8888 and forwards it to your local machine.
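If the forwarded port does not show up, it may help to confirm that something is listening on port 8888 inside the codespace at all. The sketch below assumes Jupyter is bound to the local interface on its default port; `port_is_open` is a hypothetical helper built on the standard `socket` module.

```python
import socket


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host is unreachable.
        return False


print("Jupyter reachable on 8888:", port_is_open(8888))
```

Run it in a second terminal while `jupyter notebook` is running in the first one.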
Accessing Jupyter
- In Codespaces, look for the “Ports” tab
- Find the forwarded port 8888
- Click on the link to open Jupyter in your browser
- If prompted for a token, copy it (or the full URL that contains it) from the terminal output and paste it into the browser
Working with Notebooks
- Create folders for organization (e.g., “01-intro”)
- Create new notebooks
- Import libraries and start working:
import pandas as pd
df = pd.read_csv('file.csv')
Completing and Submitting Homework
- Create and complete your homework notebook
- Rename files as needed (can be done directly in VS Code)
- Commit and push your changes:
git add .
git commit -m "homework"
git push
- Submit the GitHub repository URL in the course homework submission form
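Before submitting the URL, it can be worth checking that nothing was left uncommitted. The following is a rough sketch that reads the output of `git status --porcelain`; both function names are hypothetical, and the parsing assumes the simple two-character status prefix that git prints for modified and untracked files.

```python
import subprocess


def parse_porcelain(output):
    """Extract file paths from `git status --porcelain` output."""
    # Each line is two status characters, a space, then the path.
    return [line[3:] for line in output.splitlines() if line.strip()]


def uncommitted_files(repo_dir="."):
    """Return paths with uncommitted changes in the given repository."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. outside a repository
    )
    return parse_porcelain(result.stdout)


# Example with sample output; call uncommitted_files() inside a real repository.
print(parse_porcelain(" M homework.ipynb\n?? notes.md\n"))
```

An empty list means the working tree is clean. It does not show whether the commits were pushed, so `git status` is still worth a glance for the "ahead of origin" message.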
Additional Tips
- Install the VS Code Python extension for better Python support
- The first time you open a codespace in VS Code desktop, it prompts you to install the Codespaces extension
- If not prompted, you can install it manually:
  - Go to Extensions
  - Search for “GitHub Codespaces”
  - Install the extension
Conclusion
GitHub Codespaces provides a convenient, pre-configured environment for the Machine Learning Zoomcamp course. It eliminates most setup issues and allows you to focus on learning machine learning concepts rather than environment configuration.