Season 20, episode 2 of the DataTalks.Club podcast with Alexander Guschin
The transcripts are edited for clarity, sometimes with AI. If you notice any incorrect information, let us know.
Alexey: This week, we’ll talk about competitive machine learning and teaching. Our special guest today is Alexander Guschin, a machine learning engineer with 10+ years of experience, a Kaggle Grandmaster ranked 5th globally, and a teacher to over 100,000 students. He leads data science and software engineering teams and contributes to open-source machine learning tools. Welcome, Alexander! (3:50)
Alexander: Hi, thank you for having me. It’s great to talk to you today. (5:38)
Alexey: I also want to thank Marina for helping us get in touch. I’ve been following your work for over 10 years, so it’s a great moment to finally talk to you. (5:42)
Alexey: The questions for today’s interview were prepared by Johanna Bayer. Thank you, Johanna, for your help. (5:42)
Alexey: Before we dive into teaching and competitive machine learning, let’s start with your background. Can you tell us about your career journey so far? (5:42)
Alexander: Sure. My career started about 10 years ago when I was a student at a university in Moscow. I was looking for an opportunity to learn machine learning, which was a new and exciting field at the time. I found that opportunity on Kaggle. (6:28)
Alexander: After Kaggle, I moved to the industry. I worked as a data analyst at Avito, then as a data scientist at Yandex, working on ride-sharing services. Later, I worked at Mechanica AI, a startup focused on bringing machine learning to industrial enterprises. We even traveled to small cities in Siberia to find product engineers and see if they needed our machine learning solutions. (6:28)
Alexander: Eventually, I worked at Iterative.ai, a San Francisco-based startup that creates open-source tools for machine learning engineers and data scientists. After that, I moved back to Moscow to work at Central University, where I now focus on teaching. (6:28)
Alexey: Since you mentioned Iterative.ai, I want to thank them for supporting this community. Their support helped grow the community to where it is today. (8:06)
Alexander: Yes, I was part of the team. My friend Mikhail Sveshenkov gave a talk at one of your events about a tool we were building called MLEM. (8:36)
Alexey: MLEM? Is that the sound a cat makes? (8:55)
Alexander: Yeah, it’s based on an internet meme where a dog or cat sticks its tongue out. We were trying to be funny. (9:07)
Alexey: Are you still involved in open-source projects? (9:25)
Alexander: Not right now. I mostly worked on open-source tools while at Iterative. (9:31)
Alexey: You mentioned starting your career as a student 10 years ago with Kaggle. For me, Kaggle was also a gateway into data science. I thought I was good at machine learning, but my first Kaggle competition showed me how much I still had to learn. (9:39)
Alexander: I started Kaggle while studying at university. I chose a machine learning lab for my bachelor’s thesis but quickly realized how challenging it was. I found Kaggle as a place to test my practical skills. (11:41)
Alexander: There was a strong Kaggle community in Moscow, with weekly sessions where people shared solutions and discussed ideas. I spent a lot of time on Kaggle, even skipping lectures to focus on competitions. (11:41)
Alexey: Skipping lectures? Did you get kicked out of university? (14:18)
Alexander: No, the educational system allowed me to skip most of the semester and focus on exams. (14:33)
Alexey: How did you balance Kaggle and university? (14:47)
Alexander: I spent a lot of time on Kaggle, but I managed to pass my exams. It was a year of intense focus before I got my first job. (14:55)
Alexey: In Germany, it’s similar—what matters is how well you pass the exams, not how much you study during the semester. (15:24)
Alexander: Yeah, the system is changing now, but back then, it worked for me. (15:42)
Alexey: You said the secret to success on Kaggle is putting in a lot of time. I can confirm that. In one competition, I spent 3 months working on it, putting in 5-10 hours per week. It was exhausting, but I learned a lot. (15:56)
Alexander: Kaggle broadens your perspective on machine learning. You learn different domains, models, and frameworks, which is incredibly valuable when you start working in the industry. (17:10)
Alexey: How long did you compete on Kaggle? (17:34)
Alexander: About a year. After that, I got my first job and continued competing, but it became harder to balance. (17:45)
Alexey: Was your first job as a data analyst? (18:14)
Alexander: Yes, at Avito. (18:18)
Alexey: Kaggle helped me throughout my career. I worked on a competition about duplicate detection, and that experience was useful in my job at OLX, which was the parent company of Avito. (18:20)
Alexander: My first serious competition was on a platform called CrowdAnalytix. The training set was small, so it was more about luck than skill. (18:54)
Alexey: Kaggle solutions often involve complex stacking, which isn’t practical for production. But you still learn a lot from competing. (20:25)
Alexander: Yes, you learn to explore the data carefully, do EDA, and build robust validation strategies. (20:57)
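A minimal sketch of what a basic validation strategy can look like, not something shown in the episode: plain k-fold cross-validation, where every sample is used for validation exactly once. The function name `k_fold_indices` and its parameters are illustrative assumptions. Real competitions often need grouped or time-based splits instead.

```python
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Return a list of (train_indices, validation_indices) pairs."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, validation in enumerate(folds):
        # Training data is everything outside the current validation fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


for train, validation in k_fold_indices(n_samples=10, n_folds=5):
    print(len(train), len(validation))
```

Averaging a model's score over the folds usually gives a steadier estimate than a single train/validation split, which helps when the public leaderboard is noisy.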
Alexey: Did you do anything special to prepare for competitions? (21:42)
Alexander: Iteration is key. The more competitions you solve, the easier it gets. For hackathons, it’s useful to set up infrastructure and create baselines in advance. (21:53)
Alexey: Do you still take part in hackathons? (22:41)
Alexander: Not anymore. Now I’m usually on the organizing side. (22:45)
Alexey: How did Kaggle help you transition to the industry? (22:55)
Alexander: Kaggle broadened my perspective on machine learning. It helped me understand different domains and frameworks, which was valuable in job interviews. (23:51)
Alexander: Kaggle also taught me to spot data leaks, which is useful in production. Plus, it helped me build a public profile and connect with others in the community. (23:51)
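One simple kind of data leak is the same entity appearing in both the training and the test set. The sketch below is an illustration under that assumption, not a method described in the episode. The helper `shared_rows` and the `user_id` column are hypothetical.

```python
def shared_rows(train_rows, test_rows, key_columns):
    """Return the key tuples that occur in both train and test."""
    train_keys = {tuple(row[c] for c in key_columns) for row in train_rows}
    test_keys = {tuple(row[c] for c in key_columns) for row in test_rows}
    return train_keys & test_keys


train = [
    {"user_id": "u1", "label": 0},
    {"user_id": "u2", "label": 1},
]
test = [
    {"user_id": "u2"},
    {"user_id": "u3"},
]

overlap = shared_rows(train, test, key_columns=["user_id"])
print(overlap)
```

A non-empty overlap does not prove a leak, but it is a cheap first check before trusting a validation score.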
Alexey: Do you think Kaggle is still useful for starting a data science career today? (25:51)
Alexander: It depends on your location and opportunities. In places like Kazakhstan, where educational resources are limited, Kaggle is still valuable. But in the US or Europe, there are other ways to start a career. (26:18)
Alexey: In Eastern Europe, competitions like Kaggle are more popular, possibly because of the strong focus on mathematics in schools. (27:49)
Alexander: That could be a factor. (28:35)
Alexey: Should we do competitive machine learning alone or in teams? (28:57)
Alexander: It’s better to form a team or at least have a community to discuss ideas. Different people explore different solutions, which can lead to better results. (29:08)
Alexey: Teams can combine models through stacking, which is often more effective. (29:54)
Alexander: Yes, even without stacking, having multiple predictions can be beneficial. (30:07)
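The simplest way to benefit from multiple predictions without stacking is to average them. The sketch below assumes each teammate's model outputs a probability per row. The function `blend` and the example numbers are made up for illustration.

```python
def blend(predictions, weights=None):
    """Weighted average of several models' predictions for the same rows."""
    if not predictions:
        raise ValueError("need at least one list of predictions")
    n_rows = len(predictions[0])
    if any(len(p) != n_rows for p in predictions):
        raise ValueError("all prediction lists must have the same length")
    if weights is None:
        weights = [1.0] * len(predictions)  # plain average by default
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction list")
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_rows)
    ]


model_a = [0.9, 0.2, 0.6]  # hypothetical probabilities from one teammate
model_b = [0.7, 0.4, 0.2]  # hypothetical probabilities from another
print(blend([model_a, model_b]))
print(blend([model_a, model_b], weights=[3, 1]))
```

Averaging tends to help most when the models make different mistakes, which is one reason teams that explore different approaches often do well.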
Alexey: I remember a competition where the winning team combined expertise in different areas. (30:45)
Alexander: Yes, they found a data leak, which was key to their success. (31:14)
Alexey: You teach school children about machine learning? (31:24)
Alexander: Yes, I teach 16- to 18-year-olds as part of an international AI Olympiad. (31:30)
Alexey: It’s impressive that they can solve competitions at such a young age. (32:22)
Alexander: They learn machine learning on their own. (33:02)
Alexey: How did you start teaching? (33:19)
Alexander: I joined a class at my university and was asked to teach instead of being a student. (33:25)
Alexey: So you wanted to join as a student but ended up teaching? (34:09)
Alexander: Yes, I was overqualified for the class. (34:11)
Alexey: Teaching helps you learn. (34:38)
Alexander: Absolutely. Teaching forces you to understand the material deeply. (34:42)
Alexey: How did your teaching journey continue? (35:15)
Alexander: I found teaching fulfilling and continued. I also realized I needed to improve my understanding of machine learning theory to teach effectively. (35:25)
Alexey: The best way to learn something is to teach it. (36:21)
Alexander: Totally agree. (36:30)
Alexey: When I studied machine learning, we focused on theory, but it wasn’t very practical. (37:08)
Alexander: I prefer teaching practical skills, like production machine learning and MLOps. (37:57)
Alexey: How do you approach teaching now? (39:51)
Alexander: I design problems that students find interesting and build the curriculum around them. For example, I have a class where students build a classifier to detect bots in dialogues. (40:02)
Alexey: That’s a great approach. (41:00)
Alexander: Another class focuses on machine learning system design, where students work on real-world problems. (41:10)
Alexey: How did you develop your production machine learning skills? (42:33)
Alexander: After Kaggle, I worked as a data analyst and realized I needed to improve my Python and machine learning skills. I moved to Yandex, where I learned more about statistics and Python. (42:45)
Alexander: Later, I worked at Mechanica AI, where I learned about Docker, Kubernetes, and FastAPI. This helped me become a full-stack machine learning engineer. (42:45)
Alexey: It’s interesting how you and Mikhail took different paths but ended up in similar roles. (45:42)
Alexander: Yes, we started from different positions but ended up doing similar work. (45:56)
Alexey: Do students work individually or in teams? (50:02)
Alexander: They work in teams to learn collaboration and communication. (50:10)
Alexey: Where is the competitive aspect in your classes? (51:16)
Alexander: Students deploy their classifiers on a platform, and we have leaderboards for both ML metrics and technical metrics like uptime and response time. (51:22)
Alexey: That’s an amazing idea. (52:22)
Alexander: It helps students focus on building reliable solutions. (52:27)
Alexey: Do you have a leaderboard? (53:00)
Alexander: Yes, we have two leaderboards: one for ML metrics and one for technical metrics. (53:04)
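A rough sketch of how two such leaderboards could be computed, assuming each team's deployment reports one ML score plus uptime and response time. The `Submission` fields, the choice of F1 as the ML metric, and the tie-breaking rule are all assumptions; the episode does not describe how the actual platform ranks teams.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    team: str
    f1_score: float     # ML metric (assumed), higher is better
    uptime_pct: float   # technical metric, higher is better
    response_ms: float  # technical metric, lower is better


def ml_leaderboard(submissions):
    """Rank teams by model quality only."""
    return sorted(submissions, key=lambda s: s.f1_score, reverse=True)


def technical_leaderboard(submissions):
    """Rank teams by uptime first, then by faster response time."""
    return sorted(submissions, key=lambda s: (-s.uptime_pct, s.response_ms))


submissions = [
    Submission("alpha", f1_score=0.91, uptime_pct=97.0, response_ms=120.0),
    Submission("beta", f1_score=0.85, uptime_pct=99.9, response_ms=300.0),
    Submission("gamma", f1_score=0.88, uptime_pct=99.9, response_ms=80.0),
]

print([s.team for s in ml_leaderboard(submissions)])
print([s.team for s in technical_leaderboard(submissions)])
```

Keeping the rankings separate means the most accurate model does not automatically win; a team with a slightly weaker model but a more reliable service can top the technical board.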
Alexey: How many students are in your classes? (53:44)
Alexander: Around 80 students per class. (53:50)
Alexey: You also teach online courses, right? (54:05)
Alexander: Yes, I created a Coursera course on winning Kaggle competitions, which has been taken by over 100,000 students. (54:10)
Alexey: How do you manage teaching so many students online? (54:48)
Alexander: Online courses are a different format. You design the class, create automated checks, and have a team to answer questions. (54:55)
Alexey: Your approach to teaching is inspiring. (55:41)
Alexander: Thank you. (55:49)
Alexey: Do you develop the platform yourself? (56:16)
Alexander: No, I oversee the development, but students and mentors handle the coding. (56:22)
Alexey: Are students developing it, or is it professional software engineers? (56:38)
Alexander: Students develop it as part of a university project. (56:47)
Alexey: How do you manage the turnover of students? (57:28)
Alexander: We document everything and have mentors to guide new students. (57:35)
Alexey: Do you bring in industry mentors? (58:03)
Alexander: Yes, we have mentors from companies like Tinkoff. (58:10)
Alexey: Do you have a few more minutes for questions? (59:40)
Alexander: Of course. (59:50)
Alexey: One question is about convincing managers of the value of Kaggle. (59:51)
Alexander: It’s tricky, but showing examples of successful Kagglers in the company or industry can help. (1:00:08)
Alexey: Another question: Is there a secret trick that works in competitions? (1:01:28)
Alexander: There’s no single trick, but careful data exploration and validation are key. (1:01:48)
Alexey: So the secret is to be thorough. (1:02:46)
Alexander: Exactly. (1:02:53)
Alexey: How does generative AI affect competitive machine learning? (1:03:11)
Alexander: Tools like ChatGPT can boost productivity, but they can’t replace the need for careful problem-solving. (1:03:20)
Alexey: Do people use AutoML in competitions? (1:03:57)
Alexander: Yes, AutoML tools can provide good baselines, but they can’t deliver winning solutions on their own. (1:04:34)
Alexey: What’s your current Kaggle rank? (1:05:13)
Alexander: It’s not great. I don’t compete actively anymore. (1:05:47)
Alexey: People still recognize you from your Kaggle days. (1:06:20)
Alexander: Yes, Kaggle was a big part of my career. (1:06:22)
Alexey: Do you still compete? (1:06:47)
Alexander: No, I focus on teaching now. (1:06:49)
Alexey: There are still people who make a living from Kaggle. (1:07:31)
Alexander: Yes, some people are still very active and successful. (1:07:37)
Alexey: The Kaggle leaderboard has changed a lot. (1:08:18)
Alexander: Yes, it’s a different landscape now. (1:08:22)
Alexey: Alexander, thank you for joining us today. It was great to talk to you. (1:08:54)
Alexander: Thank you, Alexey. Thank you, everyone. (1:08:57)
Alexey: Have a great week! (1:08:59)