DataTalks.Club

Machine Learning Zoomcamp

Learn Machine Learning engineering in 4 months

17 Aug 2023 by Valeriia Kuka

In this article, we will talk about Machine Learning Zoomcamp. It’s a four-month-long program to get started with machine learning engineering.

We will cover different aspects of this course so you can learn more about it:

  • Course curriculum: what topics and technologies are covered by the course
  • Course assignments and scoring
    • Homework and getting feedback
    • Learning in public approach
    • Course projects for your portfolio
  • DataTalks.Club community

Course curriculum: where theory meets practice

The course consists of two parts.

Part 1 of the course covers machine learning algorithms implemented in Python, including Linear Regression, Classification, Decision Trees, Ensemble Learning, and Neural Networks.

Part 2 focuses on deploying models using frameworks like Flask, TensorFlow, and Kubernetes, enabling practical application of machine learning in real-world scenarios.

Part 1: Machine learning algorithms and their implementation

Part 1 focuses on the main machine learning algorithms and their practical application using Python. The topics covered include:

  • Linear Regression: feature engineering, handling categorical variables, and the importance of regularization.
  • Classification: logistic regression and feature importance.
  • Decision Trees and Ensemble Learning: gradient boosting technique and XGBoost, a popular ensemble learning algorithm.
  • Neural Networks and Deep Learning: Convolutional Neural Networks (CNNs) and transfer learning techniques for tackling complex problems with deep learning.
  • Python and Jupyter Notebooks: working efficiently with code.
  • NumPy and Pandas: linear algebra concepts such as vectors and matrices, plus data manipulation and analysis.
  • Matplotlib and Seaborn: data visualization and graphical representations.
  • Scikit-Learn: application of various machine learning algorithms to real-world datasets.
  • TensorFlow and Keras: popular frameworks for building neural networks and deep learning models.

Technologies used in the first part of the course
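To make the regularization topic from the list above more concrete, here is a minimal sketch in plain Python. It is not taken from the course materials: the function name `fit_ridge_1d`, the single-feature model without a bias term, and the toy data are all assumptions chosen for illustration. It shows how adding a regularization term `r` to the least-squares objective shrinks the learned weight.

```python
def fit_ridge_1d(xs, ys, r=0.0):
    """Closed-form weight for the model y ≈ w * x (one feature, no bias).

    Minimizes sum((y - w * x) ** 2) + r * w ** 2, which gives
    w = sum(x * y) / (sum(x * x) + r).
    The name and the setup are hypothetical, for illustration only.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + r)


# Toy data where y is exactly 2 * x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(fit_ridge_1d(xs, ys, r=0.0))   # 2.0 (no regularization)
print(fit_ridge_1d(xs, ys, r=14.0))  # 1.0 (weight shrunk toward zero)
```

With several features, the same idea is usually expressed with matrices, which is where NumPy and Scikit-Learn come in.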

For the deep learning part, we need to use a GPU. Luckily, we partnered with Saturn Cloud, so any Machine Learning Zoomcamp student can get an extra 150 GPU hours in Saturn Cloud. To get them, message Saturn Cloud support and say “I’m enrolled in ML Zoomcamp”.

Part 2: Deployment

Part 2 is dedicated to model deployment, which involves putting machine learning models into production. In this section, you’ll gain practical skills using popular frameworks and tools. The topics covered include:

  • Flask, Pipenv, and Docker: machine learning models deployment, enabling you to move your models from notebooks to services and applications.
  • AWS Lambda and TensorFlow Lite: serverless deep learning, understanding how to efficiently operate within this paradigm.
  • Kubernetes and TensorFlow Serving: automating deployment, scaling, and management of containerized applications.
  • KServe (optional): an additional topic for those seeking advanced knowledge, offering insights into further enhancing deployment capabilities.

Technologies used in the second part of the course

The course description on GitHub provides a detailed overview of the topics covered each week, so you can explore the content in more depth. By the end of the course, you will have acquired the fundamental skills necessary for a career as a machine learning engineer.

Theory and practice

Our lectures aim to make machine learning theory accessible and engaging through real-world examples. Code demonstrations are provided directly in the lectures to show the implementation of concepts, enabling easier application in your projects.

For instance, in the linear algebra refresher lecture, the lecturer switches between screens. First, they explain the concept of the dot product of two vectors, and then they demonstrate its implementation in Python.

Extract from the linear algebra refresher lecture
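As a rough idea of what such a demonstration might look like, here is a minimal sketch of the dot product in plain Python. It is not the code from the lecture: the function name `dot` and the example vectors are assumptions made for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors: the sum of element-wise products."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


u = [2, 4, 5, 6]
v = [1, 0, 0, 2]

print(dot(u, v))  # 2*1 + 4*0 + 5*0 + 6*2 = 14
```

With NumPy arrays, the same operation is typically written as `np.dot(u, v)` or `u.dot(v)`.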

Course assignments and scoring

Homework and getting feedback

To reinforce your learning, we offer regular homework assignments, reviewed and scored by industry professionals. Your scores are added to an anonymous leaderboard, creating friendly competition among course members and motivating you to do your best.

Anonymous leaderboard with scored homework

For support, we have an FAQ section with quick answers to common questions. If you need more help, our Slack community is always available for technical questions, clarifications, or guidance. Additionally, we host live Q&A sessions called “office hours” where you can interact with instructors and get immediate answers to your questions.

A screenshot of an FAQ document

Learning in public approach

A unique feature is our “learning in public” approach, inspired by Shawn @swyx Wang’s article. We believe that everyone has something valuable to contribute, regardless of their expertise level.

An extract from Shawn @swyx Wang's article about learning in public

Throughout the course, we actively encourage and incentivize learning in public. By sharing your progress, insights, and projects online, you earn additional points for your homework and projects.

Anonymous leaderboard from a previous cohort of the course. On the right, you can see the bonus points for learning in public

This not only demonstrates your knowledge but also builds a portfolio of valuable content. Sharing your work online also helps you get noticed by social media algorithms, reaching a broader audience and creating opportunities to connect with individuals and organizations you may not have encountered otherwise.

Course projects for your portfolio

If you’ve ever interviewed for a machine learning engineer role or researched one online, you likely understand how much personal projects matter. They matter even more if this would be your first job in machine learning and you have no previous experience in the field.

To receive a certificate, you’ll need to finalize and submit two projects: one in the middle of the course (Midterm project) and another at the end (Capstone project 1 and/or Capstone project 2). These projects allow you to choose a problem that interests you, find a suitable dataset, and develop your model. For the capstone project, you are also required to deploy your model as a web service, either locally or in the cloud, and deployment can earn bonus points.

Screenshot of the GitHub repository for Pastor Soto's project. Pastor placed first in the final leaderboard of the 2022 cohort

For proactive participants, there’s an exciting opportunity to engage in an optional project and write an article. The article will require you to conduct research on a topic not covered in the course, encouraging you to explore beyond the curriculum’s confines.

You can find all the projects from the 2022 cohort in the final leaderboard. These projects allow you to apply everything you’ve learned and make a great addition to your GitHub profile.

DataTalks.Club community

DataTalks.Club has a supportive community of like-minded individuals in our Slack. It is the perfect place to enhance your skills, deepen your knowledge, and connect with peers who share your passion. These connections can lead to lasting friendships, potential collaborations in future projects, and exciting career prospects.

Course channel in our Slack community

Conclusion

The Machine Learning Zoomcamp covers key machine learning concepts, algorithms, and deployment techniques. With a practical, hands-on approach focused on real-world application, this four-month program provides the essential skills to kickstart a career in machine learning engineering.

The combination of theory, coding implementation, and project work develops proficiency across the machine learning pipeline. Supported by a motivated community and led by experienced instructors, the Machine Learning Zoomcamp delivers an engaging learning experience. For anyone seeking to gain industry-relevant machine learning skills and build an impressive portfolio in just 4 months, this course provides an accelerated path to launch or advance your career.

The next cohort starts on September 11, 2023!

Register for the Machine Learning Zoomcamp: https://airtable.com/shryxwLd0COOEaqXo
