
DataTalks.Club

Everyday ML Questions

by Santiago Valdarrama, Vladimir Haltakov

The book of the week from 16 May 2022 to 20 May 2022

A different way to learn

This book aims to teach you something new, one question at a time.

There are 20,000 + 1 machine learning and data science books out there. That’s great, but we wanted something different.

In April, we started publishing one machine learning question every day. A quick story with a problem and a few possible answers. Something quick, practical, and fun that you could solve in a few seconds. In just 30 days, the site attracted more than 80,000 visitors. Over 1,500 people now answer the question every day, and the number keeps growing.

Many reached out asking for previous questions.

This book is a compendium of the 30 questions we published online from April 1 to April 30, 2022.

Questions and Answers

Alexey Grigorev

Hi Santiago and Vladimir! Welcome!
How did you come up with the idea of bnomial? Which problem did you see with other similar websites and blog posts?

Santiago

Thanks for inviting us!
I’ve always been fascinated by multi-choice questions because they are an excellent way to exercise your brain.
When I’m studying something new, I need to find ways to practice. Sometimes building projects is the way, but sometimes that’s not an option.
So we decided to build a site that would help people with their machine learning and data science journeys. Not to overwhelm people and make them sit through a long exam, but to go at it in a different way: a little bit every day.
Bnomial tries really hard to do what I want for my life: to build a learning habit.

Santiago

Our goal is not to compete with blog posts, tutorials, videos, or books. Those are great!
Our goal is to become a different component of your learning journey. One that’s fun, doesn’t take a long time, and will stick with you for a long time.

Alexey Grigorev

Have you thought of doing something like Duolingo eventually? I use it for learning German.

Santiago

This is an interesting question.
Here is where we are headed in the short term: creating a scoring system that encourages people to show up and answer more questions.
Letting you know where you stand, and helping you track your improvements over time is key.
Right after that, we’ll see where we want to take this. We are taking it one step at a time.

Alexey Grigorev

By the way, that’s what Duolingo does as well. They really built gamification into learning and did it pretty well. If you haven’t used it previously, check it out.

Santiago

Yeah, I’ve used it. My daughter learns French through it.

Shaksham Kapoor

Santiago & Vladimir Haltakov Thank you both for such a unique initiative and for writing the first volume of this book 🙂 🙌
I have a few questions for you:-

  • The questions that you post on your website, for instance, today’s question, it is a mix of an algorithm taught in AI + another taught in Data Science. So, this mix-and-match thing, is it something that you intend to do? Trying to show similarities and differences between concepts from similar fields.
  • Curiosity - how do you come up with these questions? Are these some sort of interview questions or something that you come up with while studying?
  • Are you planning to cover the recent advancements in NLP, CV etc., too via these questions?

All the best for your future endeavors 👍

Santiago

Hey Shaksham Kapoor!

  1. You’ll see purely theoretical and practical questions. They range from probabilities and statistics, to algorithms, machine learning, data analysis, game theory and pretty much anything under the Artificial Intelligence umbrella. This is on purpose: our goal is not to tackle one specialized narrow area, but to find ways to increase the knowledge we all have.
  2. Some of these are problems we deal with every day. Some are the result of what we read or conversations we have with others.
  3. Absolutely! If it’s useful, there’s a place for it here.

Shaksham Kapoor

Thank you for your response!!
Another question, we have certifications for data science like “AWS Certified Machine Learning – Specialty”, any plans to cover questions asked in these exams in the future?
PS: of course not just focused on AWS but in general if anyone wants to prepare for a certification exam.
> I like the way the questions are posted, like storytelling, so it helps connect things quickly + we don’t forget it easily 🙂

Santiago

I’d love to cover some of those questions—at least the core learnings they are trying to evaluate!

Shaksham Kapoor

That would really be helpful. I know these certifications do have suggested learning paths to help prepare for the exams, but they are more focused around their ecosystem and how it is used for performing data science.
As you mention, if there is a possibility to cover generic things that any of these certifications evaluate on, then it’s worth looking at 🙂

Santiago

Definitely!

Alexey Grigorev

What were the easiest and the most difficult questions so far?

Santiago

The hardest question so far was the one we published on April 19. It was about the Softmax activation function.
The easiest question was the one we published on April 30. It was about the definition of hyperparameter tuning.

Alexey Grigorev

What was the question on April 19? Maybe you could share the pages from the book with it? Really curious about it.

Santiago

Here is the question:

Alexey Grigorev

Hmm I initially wanted to say it’s one, but softmax doesn’t get the input to the network. It gets the output of the previous dense layer.
It also doesn’t return the index or the max value, so 2 and 4 are also out.
The closest is 3, but it doesn’t sort the probabilities…
So which one is it? =)

Santiago

2 and 4 don’t say that softmax returns that thing. They say that softmax is a version of the function that returns that thing.

Santiago

Basically, Option 2 says that Softmax is a smooth version of argmax
Option 4 says that Softmax is a smooth version of max.

Alexey Grigorev

I guess then it’s probably that this value is max… then it’s option 4?
But I understand now why it was the most difficult one =)
Thanks for sharing it!

Santiago

Here is part of the answer 🙂

Alexey Grigorev

Nice, thanks!

Alexey Grigorev

How many people got this one right?

Santiago

Can’t tell. We weren’t storing individual data back then. We were only tracking correct choices (every question has up to 4 correct choices).

Santiago

So I can tell you the overall performance, but not how many people got it right.

Santiago

But yeah, it’s been by far the most difficult question. Very tricky + dense wording = HARD
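
The argmax-versus-max distinction discussed in this thread can be checked numerically. The following is a minimal sketch, not taken from the book: the logits and the temperature parameter are made up for illustration, and `smooth_max` is a hypothetical helper name for the standard log-sum-exp function. It shows that softmax returns a whole probability vector that approaches the one-hot argmax vector as the temperature shrinks, while log-sum-exp is the function that approaches the max value itself.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; a lower temperature sharpens the output."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def smooth_max(logits, temperature=1.0):
    """Temperature-scaled log-sum-exp, a smooth approximation of max."""
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max()
    return temperature * (m + np.log(np.exp(z - m).sum()))

# Made-up outputs of a final dense layer (an assumption for this example).
logits = np.array([2.0, 1.0, 0.1])

probs = softmax(logits)
one_hot_argmax = np.eye(len(logits))[np.argmax(logits)]

print(probs)                     # a probability vector, not a single value
print(softmax(logits, 0.01))     # approaches the one-hot argmax vector
print(one_hot_argmax)
print(smooth_max(logits, 0.01))  # approaches max(logits)
```

Note that softmax never outputs the index or the maximum itself, which matches the point made above: the options describe what softmax is a smooth version of, not what it returns.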

Alexey Grigorev

How much time do you spend preparing questions and answers? Is it something you do daily or you do it once per week?

Alexey Grigorev

And also curious to know how you work together on this. What does the process look like?

Santiago

I write one question every day and review and schedule another question.
Vladimir does the first round of reviews of everything I write. When he finishes, I let it sit for a couple of weeks before I do the final review.
Leaving some time between the writing and the editing session is key to getting a much stronger question.
In summary, we write a question, then it goes through a couple of editing sessions before it gets scheduled.

Vladimir Haltakov

When I write questions I try to write a couple at once. However, I have to admit recently Santiago is writing almost all of them…
One interesting thing is that we do this fully asynchronously. We have a GitHub repo and we use the issues. One issue is one question. The one who writes it assigns it to the other one for review. We use several labels to track where the question is in the pipeline.

adanai

I enjoy the case-study based exercises, thank you for doing this 😄
The questions in some way make one feel that they are planning a decision rather than just giving out an answer, it’s nice!
Would you….consider ….doing a freemium model for students? It would be helpful!

Santiago

Hey Adanai, answering the daily question is completely free and it will stay that way.
We are charging money for the book with every month’s questions, but that’s a byproduct of what we really want to accomplish: get people to show up every day and learn something new.

Merve Unlu

Hello! Thank you Santiago and Vladimir Haltakov for this interesting and fun way of learning. The detailed explanation about the answers is very helpful.
I am not sure if there is one, but maybe a discussion forum could be useful. Could other people contribute to publish/generate questions?

Santiago

Thanks, Merve!
We are planning to enable a Discord server so people can get together and discuss individual questions.
Regarding collaborations, we are taking it very slow because we aren’t sure how to compensate people yet. Right now, I’m working with a person who reached out to me to try and explore what collaborations would look like. Based on the experience, we will open it up to more people in the future.
(The main issue with collaborations is compensation and rights. We are using these questions as commercial content, so we need to ensure we retain the rights to the questions. That’s why we want to compensate those who contribute.)

Merve Unlu

Thank you for the answer.

Gur Hevroni

Hi Santiago & Vladimir Haltakov, I have a non-technical question.
The concept of daily problems is a great choice (very much like the awesome https://thedailybyte.dev/), and your mission statement is aligned with that idea (by mentioning the overwhelming amount of machine learning and data science books out there).
But in essence, when I look at your subscription plans, it seems that your subscribers are getting a ML-DS book every month…
Is there a more creative way to keep your subscribers engaged (and paying) instead of overwhelming them with more books?
I don’t mean to sound sarcastic or patronizing, I’m truly curious!

Santiago

This is a great question. Just for now, the subscription will get people every question we publish, but we will not stop there.
We have several ideas to improve that subscription over time. Without promising much, I can tell you that there’s a lot of value in a community of machine learning and data science professionals, so we are thinking of ways we can tap into that and connect companies with individuals.

Vladimir Haltakov

I like to think about the books as a reference. The goal is to have people coming to the website every day and answering the questions. As Santiago mentioned, we are working on a kind of a leaderboard to gamify the process a little bit 🙂
The book is for when you want to go through the old questions. Maybe you had a problem at work and remembered that you saw the question on Bnomial and want to check it again and follow the reference link. Or maybe you are preparing for an interview and want to test your knowledge on past questions? Then you need the book.

Santiago

By the way, Gur Hevroni I didn’t know about TheDailyByte. Seems to be pretty similar.

Alexey Grigorev

Where and how do you get inspiration? It’s quite challenging to come up with a question and the explanation for the answer every day

Santiago

That’s my super power 🙂
I draw inspiration all around me, from the problems I face, the books I read, the conversations I have.

Alexey Grigorev

I guess this is what happens when you tweet multiple times per day for more than a year? 😃

Shaksham Kapoor

Do you think it would be helpful to include the concept of ELI5 in your current model? For instance, currently, the questions assume that anyone attempting them has a background in AI (in general); however, that is not always the case.
Therefore, it would be helpful if in future, you can explain the important concepts from stats, probability, ML, DL etc. in an ELI5 manner. I believe it will be very helpful, since it is difficult to find ELI5 examples of such concepts. Of course, for a few of these, you can find ELI5 explanations, but those are scattered across the web. So, a common place for anything like this would definitely be an extremely useful resource.

Vladimir Haltakov

I think we already kind of do this sometimes. There are certain questions on more basic concepts, and the explanation gives you more background about them.
Like for example, questions about the bias and variance trade-off.

Gregor

Are you in any way affiliated with Lindsey Martin? She posts the #DailyML questions (as far as I know from r/dailymachinelearning). It seems like your questions are more of a case-study type (from what I see at today.bnomial.com) while hers are very short statements. I still like the overall concept, as people start interacting with each other, discussing the questions and edge cases, and even seemingly simple questions often offer some new insight.

Vladimir Haltakov

I didn’t know about Lindsey Martin and #DailyML - thanks for sharing! It certainly looks very interesting and I’ll check it out in more detail.
I couldn’t find the subreddit you mentioned, is it maybe r/learnmachinelearning?
And regarding discussions - I totally agree! This is something we have on the roadmap - a community where people can discuss the questions.

Arsen Poghosyan

I just love the picture and the font of the cover, it’s so different from the covers of other ML related books! Did any of you guys (Santiago and Vladimir Haltakov) participate in making this picture and in designing the book?

Vladimir Haltakov

Thanks 🙂 The image is in fact AI-generated using VQGAN + CLIP. I wrote more about how this is done here: https://twitter.com/haltakov/status/1455982555610636291
After that Santiago did some more processing to get the final look 🙂

Arsen Poghosyan

Thanks for the link! AI-generated picture for the book about AI, that totally makes sense 🙂

Vladimir Haltakov

Yeah, that was the goal 🙂

Allan

Thanks for answering questions here. Are all of the daily questions in multiple-choice format? That seems like one of the challenging parts of creating these questions - taking complex concepts and preparing answer choices that are at just the right level of difficulty for learning purposes. Also, love the recommended reading links provided in the solution section!

Vladimir Haltakov

Thank you! Yes, creating this type of question is more difficult than the usual trivia questions, but they teach you much more.
They are all multiple choice for now (and usually more than one choice can be correct), but we also have an idea how we can extend them a bit so you will have to write some simple code.

Santiago

Interesting. We are not affiliated with her, but I’ll check her work out!

Allan

Thanks!

Sandhya G

Awesome concept. Wordle for ML. I think the website/ app + chat works better for this than a book format. The information is just right-sized.
I like the setup of today’s question, Riley’s speed dating match. It’s fun, light hearted.
One thing is that the UI needs a bit more clarity. Once I submitted the answers, it was not clear from the > and * which ones I had chosen.

Santiago

You are right.
Funny story: a few weeks into this, we decided to change the way the colors and symbols work. For some reason, I talked myself into believing that the new way was better.
We released it. People hated it so much that we reverted it a few hours later.
I do agree with you: we need to think about it a little bit more carefully.

Tim Becker

Very interesting Santiago and Vladimir Haltakov. Do you attempt to switch between more theory and more case-study related questions? Also, do you try to evenly cover different areas of ML or is the selection based on your experience/preference or just randomly?

Santiago

For now, it’s mostly random, although I try to ensure that questions covering similar topics don’t go out close to each other.
We have a little bit of everything: there are purely theoretical questions, and other more practical ones. We are planning to keep it that way with a slight bias towards practicality.
And undoubtedly, the flavor of the questions highly depends on our own knowledge and experience. For example, I don’t have time-series analysis experience, so it’s hard for me to write about that.

Prateek Joshi

Santiago and Vladimir Haltakov: Kudos on building bnomial. Love the concept. The questions are engaging. What’s the future plan for this as it continues to grow? Would this be a full fledged learning platform for ML? Also will you be expanding into other types of content to support the learning needs of different types of ML students?

Santiago

Hey Prateek! Thanks for the comment!
For now, we are focused on bringing a community of people that want to learn while answering these practical questions.

Santiago

My focus is 100% on helping them build a new habit. Something they can do every day and make improvements without even noticing.

Prateek Joshi

Santiago Thank you

Batul Bombaywala

Santiago Vladimir Haltakov great website - it’s a fun way to learn!! My observation is that I just don’t remember to go to the website every day - is it possible for you guys to send an email/notification every day?

Vladimir Haltakov

Oh, we have this feature already 🙂
You can click LOGIN in the menu of the main page and enter your email. We will send you the new question every day then 🙂

Santiago

This might help:

Santiago

But point well taken: we need to make sure this is more obvious.

Arsen Poghosyan

I registered 2 days ago and receive just one email with the question each day, no more and no less. To me everything is very convenient 🙂

Batul Bombaywala

oh, i missed this! thank you!

Santiago

No worries. I need to do a better job at making sure that’s obvious.

To take part in the book of the week event:

  • Register in our Slack
  • Join the #book-of-the-week channel
  • Ask as many questions as you'd like
  • The book authors answer questions from Monday till Thursday
  • On Friday, the authors decide who wins free copies of their book

To see other books, check the book of the week page.
