
DataTalks.Club

Comet for Data Science

by Angelica Lo Duca

The book of the week from 24 Oct 2022 to 28 Oct 2022

This book provides concepts and practical use cases which can be used to quickly build, monitor, and optimize data science projects. Using Comet, you will learn how to manage almost every step of the data science process from data collection through to creating, deploying, and monitoring a machine learning model.

The book starts by explaining the features of Comet, along with exploratory data analysis and model evaluation in Comet. You’ll see how Comet gives you the freedom to choose from a selection of programming languages, depending on which is best suited to your needs. Next, you will focus on workspaces, projects, experiments, and models. You will also learn how to build a narrative from your data, using the features provided by Comet. Later, you will review the basic concepts behind DevOps and how to extend the GitLab DevOps platform with Comet, further enhancing your ability to deploy your data science projects. Finally, you will cover various use cases of Comet in machine learning, NLP, deep learning, and time series analysis, gaining hands-on experience with some of the most interesting and valuable data science techniques available.

By the end of this book, you will be able to confidently build data science pipelines according to bespoke specifications and manage them through Comet.

Questions and Answers

Angelica Lo Duca

Hello everyone! It’s a pleasure for me to meet you! If you have any questions, I’ll be glad to answer them.

Ifra

Hi Angelica Lo Duca, great to have you here! I came across the Comet platform recently and am curious to know how ML on Comet differs from traditional machine learning.

Angelica Lo Duca

Hi Ifra, thanks for your question! Comet helps you track your ML experiments and see the results directly in the Comet dashboard. For example, suppose you want to choose the best model for a classification task. You need to test different models, such as KNN, RandomForest and so on. Once you have tested your models, you choose the one with the best performance. This is the standard way. In Comet, the flow is the same, but you track each experiment by adding just one or two lines of code. As a result, you can compare the different experiments directly in Comet. You could watch this video to get an idea of how Comet works: https://www.youtube.com/watch?v=GHyX9VeuPn0&t=2s
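As a rough illustration of the flow described above, here is a minimal sketch in which each candidate model gets its own tracked experiment. The toy data, the two tiny pure-Python models, the `DryRunExperiment` stand-in, and the project name are all assumptions made for this example; only `Experiment`, `set_name`, `log_parameters`, `log_metric`, and `end` are meant to correspond to calls in the `comet_ml` package.

```python
class DryRunExperiment:
    """Hypothetical stand-in with the same methods used on a Comet Experiment."""

    def __init__(self, name):
        self.name, self.params, self.metrics = name, {}, {}

    def log_parameters(self, params):
        self.params.update(params)

    def log_metric(self, name, value):
        self.metrics[name] = value

    def end(self):
        pass


def comet_experiment(name):
    """Create a real Comet experiment; needs `pip install comet_ml` and COMET_API_KEY."""
    from comet_ml import Experiment

    experiment = Experiment(project_name="model-comparison")  # hypothetical project name
    experiment.set_name(name)
    return experiment


# Toy data, made up for illustration: two well-separated clusters.
X_TRAIN = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5)]
Y_TRAIN = [0, 0, 0, 0, 1, 1, 1]
X_TEST = [(0, 2), (2, 1), (6, 6), (4, 5)]
Y_TEST = [0, 0, 1, 1]


def majority_baseline(x):
    """Always predict the most frequent training label."""
    return max(set(Y_TRAIN), key=Y_TRAIN.count)


def knn_k3(x):
    """Predict by majority vote among the 3 nearest training points."""
    by_distance = sorted(
        zip(X_TRAIN, Y_TRAIN),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    votes = [label for _, label in by_distance[:3]]
    return max(set(votes), key=votes.count)


def compare_models(factory=DryRunExperiment):
    """Run one tracked experiment per candidate model and return accuracy by name."""
    results = {}
    for model in (majority_baseline, knn_k3):
        experiment = factory(model.__name__)
        experiment.log_parameters({"model": model.__name__})
        hits = sum(model(x) == y for x, y in zip(X_TEST, Y_TEST))
        results[model.__name__] = hits / len(Y_TEST)
        experiment.log_metric("accuracy", results[model.__name__])
        experiment.end()
    return results
```

Calling `compare_models()` runs everything locally with the stand-in, while `compare_models(factory=comet_experiment)` would send the same runs to Comet, where they can be compared side by side in the dashboard.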

Angelica Lo Duca

Comet also provides other features. For example, you can add it directly to your deployment environment.

Angelica Lo Duca

I hope that I’ve answered your question. For more details, you can also watch the other episodes of the series "11 weeks of Comet for Data Science": https://www.youtube.com/watch?v=GHyX9VeuPn0&list=PLLT9c0_6kiT6rRQHs1TVA0ZzaE5B3v8KF

Ifra

Yes you did Angelica Lo Duca, thank you so much! I’ll surely check out the read!🙂

Alexey Grigorev

Just curious, why did you decide to write a book about Comet? Did you compare it with other ML platforms before making this decision?

Angelica Lo Duca

Hi Alexey! Thanks for your question! Before discovering Comet, I organized my ML experiments in a directory on my local computer and compared the different models manually. This process took a lot of time, so I decided to search for tools that could help me solve my problem. I discovered Comet and other similar platforms. I started to test Comet and realized that working with it is simpler than doing things manually. To be honest, I didn’t try other platforms because I was really excited by Comet! Then I wrote an article about Comet on my blog. Next, someone from Packt Publishing contacted me asking if I was interested in writing a book about Comet. And so the book was born.

Eunice

Hi, Angelica Lo Duca; thank you so much for joining. In your experience, how much did using Comet help you save time in the ML/DL delivery process?
Would you recommend using Comet for an MVP or more advanced projects?

Angelica Lo Duca

Hi Eunice! Thanks for your question! I think that using Comet in the ML/DL lifecycle saves a lot of time because you can use Comet both for model tracking and model updating. For example, you can store the best model in the Comet Registry and then use it in production. When you update your model, you just need to update it in the Comet Registry. The figure attached shows how you can integrate Comet with the GitLab CI/CD pipeline. I hope that I answered your question.

Angelica Lo Duca

Regarding your second question, whether I would recommend Comet for an MVP project: yes, absolutely!

Hareesh

Hi Angelica Lo Duca,
I am just curious to know if Comet has capabilities to put ML models into production?
My follow-up question is about EDA: does Comet automate EDA? If so, are there options that would help us identify outliers in data?
Thanks.
HT

Angelica Lo Duca

Hi Hareesh, thanks for your questions! Yes, you can integrate Comet in the production workflow. Comet provides a Registry where you can store the best model. Thanks to the Comet REST API, you can download the best model in your production environment. In addition, Comet is fully integrated with GitLab, so you can use the GitLab CI/CD workflow to move your ML to production. Chapters 6 and 7 of the book explain this aspect.
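The Registry round trip described above could look roughly like the sketch below. It assumes the `comet_ml` Python package, whose `API` class wraps the REST API, and it takes the experiment and API objects as arguments so the logic can be tried without a Comet account. The function names, workspace, model name, version, and output path are hypothetical placeholders, not values from the book.

```python
def publish_model(experiment, model_name, model_path):
    """Upload a saved model file to an experiment and add it to the Registry."""
    experiment.log_model(model_name, model_path)
    experiment.register_model(model_name)


def fetch_model(api, workspace, model_name, version, output_path="./model"):
    """Download one registered model version into the production environment."""
    api.download_registry_model(
        workspace, model_name, version=version, output_path=output_path, expand=True
    )
    return output_path


def deploy_latest(workspace="my-workspace", model_name="best-classifier", version="1.0.0"):
    """Hypothetical entry point for a CI/CD job; all default values are placeholders."""
    from comet_ml import API  # reads COMET_API_KEY from the environment

    return fetch_model(API(), workspace, model_name, version)
```

A GitLab CI/CD job could call something like `deploy_latest()` in its deploy stage, so that updating the model in the Registry is enough to change what production downloads.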

Angelica Lo Duca

Regarding EDA, Comet is not designed to support it natively. However, Comet integrates with some external libraries, such as pandas profiling, that let you perform EDA in Comet.
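One way this integration might be wired up is sketched below: build an HTML profile of a DataFrame and attach it to a Comet experiment with `log_html`. This assumes the `pandas-profiling` package (published as `ydata-profiling` in newer releases) and an existing `comet_ml` experiment; the function names and the report title are made up for illustration.

```python
def build_profile_html(df):
    """Render a pandas profiling report for `df` as an HTML string."""
    # Assumes `pip install pandas-profiling`; imported lazily so the sketch loads without it.
    from pandas_profiling import ProfileReport

    return ProfileReport(df, title="EDA report").to_html()


def log_eda_report(experiment, df, build_html=build_profile_html):
    """Attach the EDA report to a Comet experiment and return its size in characters."""
    html = build_html(df)
    experiment.log_html(html)
    return len(html)
```

The profile report includes per-column statistics such as quantiles and extreme values, which is one place to start when looking for outliers.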

Angelica Lo Duca

I hope that I have answered your questions.

To take part in the book of the week event:

  • Register in our Slack
  • Join the #book-of-the-week channel
  • Ask as many questions as you'd like
  • The book authors answer questions from Monday till Thursday
  • On Friday, the authors decide who wins free copies of their book

To see other books, check the book of the week page.
