DataTalks.Club

Grokking Machine Learning

by Luis Serrano

The book of the week from 09 Aug 2021 to 13 Aug 2021

It’s time to dispel the myth that machine learning is difficult. Grokking Machine Learning teaches you how to apply ML to your projects using only standard Python code and high school-level math. No specialist knowledge is required to tackle the hands-on exercises using readily-available machine learning tools!

Questions and Answers

Kashan Ahmed

Will this book cover unsupervised learning as well?

Luis Serrano

Hi Kashan Ahmed !
No, the book only covers supervised learning. Perhaps a future book will cover unsupervised learning.

Luis Serrano

I mention some things about unsupervised learning at the beginning, but never cover the algorithms in detail

Kashan Ahmed

Thank you for answering. 🙂

Kashan Ahmed

will it cover scikit learn?

Alper Demirel

Hi Luis Serrano, Firstly thank you for your time.
What makes this book special from other related books on the market?

Luis Serrano

Hi Alper Demirel , thank you for your interest!
I wrote this book to be understandable by readers without an extensive background in math and programming, so the algorithms are explained in a more conceptual way with examples, figures, and stories, as opposed to formulas. (The formulas are there too, but they appear after the conceptual explanations)

Alper Demirel

Thanks a lot for the answer, I’m so glad it’s not a book full of formulas. I hope I get the chance to read it!

Wendy Mak

hi Luis, how is a ‘grokking xxx’ book different to the other manning books? and what made you write a book in this style?

Luis Serrano

Hi Wendy Mak !
The grokking books tend to make a big effort in making the topic understandable for most people, not just those with a super technical background. I always like to explain things in that manner, as that is the way I like to understand things. When I started writing this book it wasn’t meant to be a grokking book, but when the editors saw it had that style, they asked me if I wanted to make it part of the series and I thought it was a good idea.

David Cox

Riffing off this thread, how did you go about identifying (a) that you wanted to write this book and (b) connect with Manning for publishing?

Wendy Mak

also, how did you get into teaching ML, and what do you enjoy about teaching?

Luis Serrano

Wendy Mak I have always loved teaching. One reason is that I am a slow learner, so I always have to digest everything I learn until I make it very understandable for myself (with figures, examples, etc), and so when I explain it to others it has already been digested in my mind, so teaching comes out easier.
I started as a mathematician: my PhD was in math, and I was teaching and working on research. The ML bug bit me when I started hearing about it, so I decided to switch careers and started working at Google. That’s when I started learning ML more seriously.

Neal Lathia

❔ What does grokking actually mean? 😂

Luis Serrano

Hi Neal Lathia ! To grok is to understand something very well in a conceptual and intuitive manner. This is why the grokking books try to explain things in such a way for most people to understand even if they don’t have a technical background.

Luis Serrano

Thanks Alexander Seifert !

Eric Sims

Do you have a favorite ML algorithm at the moment? 🙂

Luis Serrano

Hi Eric Sims !
Great question! I like many algorithms. I gotta say I really enjoy the kernel method (SVMs) because it’s so clever. Also naive Bayes, and anything that is purely probabilistic, since all they do is play with conditional probabilities and they can do wonders.
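To illustrate the "playing with conditional probabilities" remark, here is a minimal sketch of a Bernoulli naive Bayes classifier in plain Python. The toy messages, the function names `train` and `predict_proba`, and the add-one smoothing are assumptions made for this illustration; this is not code from the book.

```python
from collections import Counter

# Hypothetical toy data: (set of words in the message, label).
messages = [
    ({"win", "money"}, "spam"),
    ({"win", "prize"}, "spam"),
    ({"meeting", "money"}, "ham"),
    ({"meeting", "notes"}, "ham"),
    ({"lunch", "notes"}, "ham"),
]


def train(messages):
    """Count how often each label occurs and how often each word occurs per label."""
    label_counts = Counter(label for _, label in messages)
    word_counts = {label: Counter() for label in label_counts}
    for words, label in messages:
        word_counts[label].update(words)
    vocab = set().union(*(words for words, _ in messages))
    return label_counts, word_counts, vocab


def predict_proba(words, label_counts, word_counts, vocab):
    """Return P(label | words), assuming words are independent given the label."""
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = n / total  # prior P(label)
        for word in vocab:
            # P(word present | label), with add-one smoothing to avoid zeros.
            p = (word_counts[label][word] + 1) / (n + 2)
            score *= p if word in words else (1 - p)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


model = train(messages)
probs = predict_proba({"win", "money"}, *model)
print(probs)
```

Everything the classifier does is multiply a prior by one conditional probability per word and then normalize, which is the simplicity the answer points at.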

Eric Sims

How did you determine where to draw the line between “essential” math for understanding ML and “extra stuff” that can be learned later if needed? I often worry that I will miss something important if I don’t have a decent grasp on what’s actually happening. Of course, my decent grasp is not necessarily super strong - I don’t really know any calculus.

Luis Serrano

Eric Sims , yes, that’s a huge part of the book, and the reason is that I believe that to start in ML, one doesn’t need that much understanding of math. The math can be learned as you learn ML, hand in hand. Most experts recommend learning a lot of math and then starting ML, which I don’t fully agree with, which is why I wrote this book.
As for essential math to read this book, high school math is enough. Having an intuition of what a formula is, basic equations like the equation of a line, and basic probability (e.g., being able to calculate a probability as a ratio of numbers) is enough, as everything else can be developed as we go.

Chetna

Hi Luis Serrano, thanks for this QnA.
Is this book about ML system design?

Luis Serrano

Hi Chetna ! Great question, no, the book doesn’t talk about system design. It talks about the algorithms, how they work, and how to apply them. There is code, but mostly to study datasets, not to put in production.

Chetna

got it, thanks 🙂

Luis Serrano

Hello everyone! It’s such a pleasure being here, and thank you for your great questions! Answering them now in the threads.

Kashan Ahmed

Will a person who has read the whole book and implemented all the examples be able to consider themselves intermediate level, or will they need more practice and familiarity with high-level libraries?

Luis Serrano

Hi Kashan Ahmed! Great question, yes, the person who finishes the book will have a working knowledge of most supervised learning algorithms, and the packages to use them (including a deep learning package), so they’re definitely an intermediate level ML practitioner.

Luis Serrano

More can always be learned after the book, including other similar libraries, other fields such as unsupervised learning, generative learning, and reinforcement learning. Also, ML production and system design is a field people can learn more about outside of the book, in case they want to implement models in production.

Kashan Ahmed

Thank you for the detailed answer.

Krzysztof Ograbek

Hi Luis Serrano. I have to admit I never heard of your book. Thanks for doing this!! Which libraries are you using for projects in your book?

Luis Serrano

Thanks Krzysztof Ograbek!
The ML libraries are scikit-learn, Turi Create, Keras (TensorFlow), and XGBoost. For other stuff, matplotlib, pandas, and NumPy.

xnot

How much importance would you place on developing geometric intuition behind linear algebraic concepts? Do you think you can do without it when developing or maintaining ML projects in production?

Luis Serrano

Hi xnot!
I think geometric intuition is more important than the formulas and the math. And yes, I think this can be done while learning and developing ML.
When I talk about geometric intuition, I mean things like drawing lines that pass close to a group of points, lines and planes that separate points of different colors, rotations, transformations, etc. Most people have this type of intuition, it’s only a matter of tying it to the algorithms and the applications.
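As one concrete instance of "lines that separate points of different colors", here is a minimal perceptron sketch in plain Python. The toy points, the labels (+1 and -1 standing in for two colors), and the names `train_perceptron` and `classify` are assumptions for illustration, not code from the book.

```python
# Hypothetical toy data: 2D points, each labelled +1 or -1 (two "colors").
points = [
    ((1.0, 1.0), -1), ((2.0, 1.5), -1), ((1.5, 0.5), -1),
    ((4.0, 4.5), 1), ((5.0, 4.0), 1), ((4.5, 5.5), 1),
]


def train_perceptron(points, epochs=100, lr=0.1):
    """Find weights w and bias b so the line w[0]*x + w[1]*y + b = 0 splits the labels."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x, y), label in points:
            side = w[0] * x + w[1] * y + b
            if label * side <= 0:
                # The point is on the wrong side: nudge the line toward it.
                w[0] += lr * label * x
                w[1] += lr * label * y
                b += lr * label
                mistakes += 1
        if mistakes == 0:
            break  # every point is on its correct side
    return w, b


def classify(point, w, b):
    """Report which side of the line a point falls on."""
    x, y = point
    return 1 if w[0] * x + w[1] * y + b > 0 else -1


w, b = train_perceptron(points)
print(w, b)
```

Each update simply moves the line a little toward a misclassified point, so the geometric picture and the algorithm are the same thing.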

xnot

Thanks Luis Serrano. Does the book cover ways of approaching this?

Quynh Le

Hi Luis Serrano, I am glad to know about the book. Thanks for writing it! I learned about regressions in school but have never done machine learning yet. How would you suggest I approach machine learning in general? Can I implement machine learning with only Python?

Luis Serrano

Thanks Quynh Le!
If you learned regression, then you know machine learning. Most of the main algorithms are similar to that one, only with some small tweaks. I think a book or introductory course in ML can get you up to speed. And yes, one can implement everything only in Python, as a matter of fact, it’s the one I recommend the most, since most of the important packages are written there.

Quynh Le

Do you suggest any intro ML course or book (other than Andrew Ng course)? Can I read and implement projects from your book using Python?

Luis Serrano

I have a bunch of videos here that you may enjoy: https://serrano.academy

Luis Serrano

Also, there are some interesting courses at Udacity. Here is one on deep learning that I teach with a few other people:
Intro to Deep Learning with PyTorch

Luis Serrano

Quynh Le ^^

Quynh Le

Luis Serrano Thanks for the suggestions, I am a big fan of Udacity courses! I’ll check your website as well!

WingCode

Hi Luis Serrano, Your book looks like a fun read!
Do you find it difficult to strike the right balance between complicating vs oversimplifying while explaining a data science concept?

WingCode

Can your book be used for communicating data science concepts to higher stakeholders who don’t necessarily understand the nitty gritty of data science?

WingCode

What is the topic you found the most difficult to grok? Have you ever felt that any data science topic cannot be just “grokked” ? 😅

Luis Serrano

Hi WingCode, great questions!

  1. It is challenging to strike the right balance between complicating and oversimplifying while explaining. This is why I try my explanations on people, including experts and non-experts. If I can get a non-expert to understand it while not boring an expert, I feel that I’ve hit the right spot.
  2. Yes absolutely, the book can be used to communicate data science to stakeholders, since it has lots of real-life applications, which they can use to see the value of these algorithms. It also explains the details, which can show them they’re not that complicated or hard to implement, and that they’re not voodoo, just simple math used with the purpose of solving the problem.
  3. Ah interesting. There are many topics I’ve found hard to ‘grok’. In the book, some of the ensemble methods such as xgboost or gradient boosted trees took me a long time to understand and to simplify, since they have lots of ins and outs, and some of them sound a bit arbitrary. Once you really get into the details, they are not arbitrary, but it takes a while to realize. Other topics outside of the book scope, such as reinforcement learning, take me quite a long time, for the same reason. But I have a belief that everything can be ‘grokked’, and the ones that haven’t been yet are that way because we don’t fully understand them. 🙂

WingCode

Thank you Luis for the great answers! :)

WingCode

Luis Serrano I have a few more questions 🙂

  1. What are the next upcoming topics under your “grokking” radar (excluding the ones in MEAP) ?
  2. How do you perform candidate selection of a topic for “grokking”? Is it the hype or general consensus that something is inherently difficult?
  3. Any plans for quantum mechanics or quantum computation grok? I asked because in pop culture adding “quantum” before any word is generally a good way to mystify a topic.

Luis Serrano

WingCode great questions, keep them coming! 🙂

  1. Unsupervised learning is a big one, since it covers generative learning. Aside from that, reinforcement learning is one.
  2. For this book, I picked the most popular algorithms of supervised learning, so it was pretty straightforward. As for other types of content creation (videos, etc), I normally have a list of things that I’m interested in because I’m trying to understand them, in topics like probability, ML, statistics, etc., so as I understand them, I create content about them. Some of them come out of projects at work, and others from things I watch or read, etc. It’s pretty random. 🙂
  3. YES!!! Quantum is definitely next on the list. Right now I’m working on quantum ML, and learning a lot of stuff. My goal is to understand it in a simple way, just like it is with ML. Definitely keep an eye out for that material, because I’ll be grokking quantum a lot in the near future.

WingCode

Haha, thanks Luis. It is a pleasure to ask you questions & also to get your answers :)

Alex

Hola Luis Serrano, super excited to have you in here!
Who is actually the target reader of your book? Who is it aimed at?
Muchas gracias!

Luis Serrano

Gracias Alex !
The book aims to be a one-size-fits-all, as it offers beginners a chance to get into ML, and gives experts a different and simpler view of the algorithms that perhaps they haven’t seen before.
But in a nutshell, the reader who’ll enjoy this book the most is the beginner in ML who comes without a very heavy knowledge of mathematics and programming. The reason is that in the book we explain the methods and algorithms from a more intuitive perspective, where the math is there, but more in drawings and examples than in formulas.

Lavanya M K

Hi Luis Serrano What are the tools you use to create illustrations in the book?

Luis Serrano

Hi Lavanya M K!
For the illustrations I used mostly Keynote. For some of the plots and graphs I used matplotlib.

Utkarsh Agrawal

I recently read Machine Learning Yearning by Andrew Ng and I particularly liked how he describes multiple scenarios and then suggests strategies for tackling them.
Would you guys recommend similar books?

Alexey Grigorev

I like “Rules of machine learning” from folks at Google. I find it somewhat similar to ML Yearning

Andrea Mordenti

Hi Luis Serrano
it is really exciting to have the possibility to chat with you. I am an adjunct professor both at university in Italy and high school and I believe the way you explain everything is straightforward for an audience that can span from very entry levels to people that are already approaching the world of AI. I’d like to suggest your book to my classroom and use it as a reference for the course 🙂 Where will the printed version be available?
And, are you thinking about also a book for advanced users? If so, what would you like to focus on? Thanks!

Luis Serrano

Thank you Andrea Mordenti, glad you enjoy the explanations, and thank you for considering the book for the course!
The printed version will be available soon, hopefully in a month or two, I’ll keep you posted.

luckylittle

Welcome Luis Serrano - I like the grokking series of Manning and your book definitely caught my attention. I would be very interested in reading it. I checked your website https://serrano.academy/ to learn more about you. It is very impressive that you were part of the Google video recommendations team at YouTube, where you trained machine learning algorithms to recommend videos. My questions are:

  1. Quickly looking inside the book live preview, I see the chapter about overfitting/underfitting. The Fukushima power plant disaster when predicting the probability of a very strong earthquake is a devastating example of overfitting. What is the best way to train a predictive model that does not “follow” the training data too closely (and thus prevent potential disaster)?
  2. Has any of your experience from the YouTube recommendation team been reflected in chapters of the book?
  3. You have been writing it since ~2019, do you enjoy writing the book and is it going to be released soon?

Luis Serrano

Hi luckylittle, thank you for your interest in the book and the page! I hope you had a chance to check out the YouTube videos.

  1. For anomaly detection, such as Fukushima, I would use unsupervised learning techniques, such as clustering, etc. Some of them, like DBSCAN, have a way to detect outliers.
  2. I add examples of recommendation engines such as YouTube in several places in the book. For example, you can use linear regression to predict how long a video will be watched by a user based on how long that user has watched other videos. Also, classification methods can be used similarly to predict if the video will get watched or not. Also, there is an example of analyzing the text in Netflix reviews of movies, using Python.
  3. The book will be released very soon, it’s in the last stage of production. I think there’ll be a physical version in about 1-2 months. I have enjoyed the writing (and re-writing) process, as it has taught me to understand things in a different way.

luckylittle

Luis Serrano Yes, I have checked out your YouTube videos - very impressive!
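The linear regression idea mentioned in this thread (predicting how long a user will watch a video from how long they have watched other videos) can be sketched in a few lines of plain Python. The numbers below are made up, the names `fit_line` and `predict` are hypothetical, and the closed-form least-squares fit is just one simple way to fit a line, not necessarily the approach used in the book.

```python
# Hypothetical data: each user's average watch time on past videos (minutes)
# and how long that user watched a new video (minutes).
past_avg = [2.0, 4.0, 6.0, 8.0, 10.0]
watched = [1.5, 3.0, 4.5, 6.5, 7.5]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted watch time for a user whose past average is x minutes."""
    return slope * x + intercept


slope, intercept = fit_line(past_avg, watched)
print(slope, intercept, predict(7.0, slope, intercept))
```

A classification variant of the same setup would predict whether the video gets watched at all instead of for how long.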

Luis Serrano

Hello everyone here! As I’ve been speaking to people about other types of content, I want to invite you to check out my YouTube channel, where I have videos about machine learning explained in friendly conceptual ways (not with crazy formulas).

Krzysztof Ograbek

Luis Serrano What are other grokking books you would recommend to aspiring Data Scientists?

Luis Serrano

Krzysztof Ograbek I recommend Grokking Deep Learning, Grokking Deep Reinforcement Learning, and Grokking Algorithms. All very good. DL and DRL are written by two friends and former coworkers of mine (Andrew Trask and Miguel Morales), who are great at explaining.
I also recommend Advanced Algorithms and Data Structures by Marcello La Rocca, also from Manning, great for the fundamentals.

Matthew Emerick

Hey, Luis Serrano! Thanks for doing this.
Do you talk about graph machine learning in your book?

Luis Serrano

Thanks Matthew Emerick !
No, the book focuses on supervised learning, and there is no graph machine learning.
The books I recommend after this would be Grokking Deep Learning and Grokking Deep Reinforcement Learning. They are a good continuation of the topic, and written in a similar style.

Matthew Emerick

What do you consider the best follow up book after reading yours?

Alexey Grigorev

Hi Luis, you’ve had quite an interesting career so far. I’m curious to know a bit more about your experience as the content lead at Udacity and lead educator at Apple. How did you transition to this kind of role from being a ML engineer?

Alexey Grigorev

And as a follow-up, what did your day back then look like?

Luis Serrano

Thanks Alexey Grigorev!
Being at Udacity was great because I really enjoy teaching and that’s what I was doing. Being able to lead a team that produces educational content was great because I would design the courses and was able to propagate my style to others, who would also inject their style into the courses, getting the best of both worlds. The transition to Apple was smooth since at Apple I was teaching workshops, it was actually fun because they were in person, and I enjoy that.
Coming from ML engineer to educator was very rewarding, because I think as a teacher more than as an engineer. The engineering mindset is more of a “get things done” mindset, while I like to understand things slowly and thoroughly, which is more suitable for teaching.

Luis Serrano

My day back then had a lot of meetings with instructors in my team, I would check their material and give them feedback. There were team meetings to decide on curriculum etc. I would try to crunch the meetings into some days, and leave some days empty so that I could sit down and create educational content for hours, as this can be a lengthy process.

Alexey Grigorev

Thanks for sharing it! Sounds quite fun

Luis Serrano

Thank you Alexey Grigorev for the invitation, and thank you all so much for your great questions! It’s an honor to be a part of this group!
Here is my contact for anyone who’d like to stay in touch, or has any further questions/comments!
email: luisgui.serrano@gmail.com
Webpage
LinkedIn

To take part in the book of the week event:

  • Register in our Slack
  • Join the #book-of-the-week channel
  • Ask as many questions as you'd like
  • The book authors answer questions from Monday till Thursday
  • On Friday, the authors decide who wins free copies of their book

To see other books, check the book of the week page.
