Teaching
How DataTalks.Club guests teach data, ML, and AI through projects, feedback, community, documentation, bootcamps, and public explanation.
Teaching in the DataTalks.Club archive is practical education for people trying to do real data, ML, and AI work. Guests rarely describe it as lecture delivery. They tie it to curriculum design, mentoring, and project work. Feedback, documentation, community support, and public explanation also do real teaching work.
For role-specific study paths, use the Data Engineering Roadmap, AI Engineering Roadmap, and MLOps Roadmap. For public work that proves skill, use Open Source Portfolio Evidence, Data Engineering Portfolio Projects, and Machine Learning Portfolio Projects.
Teach Through Real Work
DataTalks.Club guests tend to make projects the center of technical education. In Teaching Data Engineers, Jeff Katz explains a bootcamp model that uses employer research, active learning, and repeated labs. Students also give feedback during the course. Around 23:35, he names Python, SQL, and cloud fundamentals as the core skills for junior data engineers.
Around 38:05, Jeff explains why his junior curriculum leaves Spark, Kafka, and Kubernetes out of the early stages. Learners need interviewable fundamentals before platform breadth. That same episode connects teaching to career transition because the curriculum exists to move students from study into hired work.
Erum Afzal gives the community-scale version in Community Building and Teaching in AI & Tech. Omdena turns real AI projects into courses. Collaborators work on practical problems, and the academy turns what they learned into structured lessons, foundational courses, and instructor pathways. Around 10:19, she describes the project-to-course model directly.
Around 46:33, Erum discusses basic, intermediate, and advanced course tiers. Those tiers keep a large community from forcing every learner through the same path.
DataTalks.Club Behind the Scenes explains the DataTalks.Club course model. The ML Bookcamp and Machine Learning Zoomcamp discussion around 38:22 centers on project-based, end-to-end learning. Around 50:31, the episode connects learning with projects, notes, READMEs, and GitHub. This makes teaching inseparable from community building because office hours, public deadlines, and peer questions help learners keep moving.
Build Curriculum From Constraints
The archive’s strongest curriculum discussions begin with constraints. Teachers need to know the target role, learner background, and employer expectations. They also need to account for data access, tools, and available time.
Jeff’s bootcamp episode shows this clearly around 9:58, when he describes market research and employer validation. Around 11:44, he discusses syllabi, labs, and reinforcement cycles.
Around 56:46, Jeff says most junior data engineering course time should stay on Python and SQL. Tools get a smaller share. That guidance fits the data engineering roadmap because a roadmap should sequence skills, not collect every current tool.
Irina Brudaru adds a domain curriculum example in Teaching and Mentoring in Data Analytics. For a FinTech analytics program, she collected market practices, datasets, and exercises. She then found teachers for modules such as fraud, chargebacks, BigQuery, and data storytelling. The program also included business skills. Around 25:43, she explains hands-on cloud teaching with BigQuery access and shared datasets.
Around 58:08, Irina returns to analyst fundamentals by naming SQL, visualization, and product tracking. She also includes soft skills and communication.
Alexander Guschin connects curriculum design to production ML in Competitive Machine Learning and Teaching. His teaching arc moves from Kaggle competition rigor to production ML, MLOps, and system design. His assignments also include communication and engineering quality. Around 46:50, the episode covers problem-centered assignments such as bot detection.
Around 50:10, the episode discusses dual evaluation across model quality and technical execution. Competition-style learning becomes useful when the curriculum also teaches maintainable systems and collaboration.
Use Feedback and Mentoring as Teaching Infrastructure
Teaching in these episodes depends on feedback. Jeff says around 3:56 that teachers should constantly check what students actually learned, not assume a clear lecture became understanding. Around 5:44, he contrasts passive and active learning. Students learn more when they do the work, ask questions, and expose misunderstandings.
Irina’s mentoring discussion shows the same idea at one-to-one scale. Around 9:34, she describes adapting explanations to the learner, including visual explanations for tables and databases. Around 45:24, she discusses finding technical reviewers and asking for feedback from engineering, domain, and community peers. Mentoring isn’t only encouragement here. It’s a feedback channel that helps a learner correct SQL, modeling, product logic, and career direction.
Community gives feedback a place to happen. Erum’s Omdena discussion around 22:29 covers live sessions, selection, and graduation. Around 37:26, she connects communities to skill discovery and faster learning. DataTalks.Club’s own community episode adds the same structure. Events, office hours, answered questions, and mentors make learning more durable than a course watched alone.
That connects teaching directly to Community and Community Building.
Teach Reproducibility, Documentation, and Open Work
Johanna Bayer treats teaching as research culture change in Teaching Open Science and Reproducible Research. Around 5:27, she describes teaching open science with Git, homework support, and course structure. Around 7:39, she references Carpentries-style beginner curricula. Around 27:38, the episode names coding practices students should learn. They include packaging, environments, formatting, and tests.
That episode belongs with Documentation because reproducible work needs more than code. Around 49:46, Johanna discusses README files, contribution guides, issues, and project communication for open-source projects.
Around 36:05, Johanna connects open code to citations, collaboration, and career visibility. For data and ML learners, those are also portfolio signals. A good project explains how to run, review, and extend the work.
Elle O’Brien makes a similar bridge from university teaching to developer relations in DevRel for Data Science. Around 7:50, she discusses applied data science teaching and research reproducibility. Around 52:06, she links teaching with DevRel through curriculum design and reusable video content.
The same work overlaps with Developer Relations, Open Source and Developer Relations, and Technical Writing because tutorials teach users through explanation, while demos and video lessons teach through examples. These materials also send product feedback back to maintainers.
Make Concepts Visible and Explainable
Meor Amer focuses on explanation craft in Using Visualizations to Explain Machine Learning. Around 11:40, he describes visuals as a way to build intuition before the math. Around 17:33, he discusses visualizing the verb and using metaphors. Around 43:37, he recommends breaking and modifying code to understand how a machine learning model behaves.
This visual-first approach complements Irina’s teaching of cohort analysis, fraud, and SQL, as well as Eugene Yan’s writing advice in Technical Writing for Data Scientists.
Eugene Yan presents writing as a way to share, learn, and build career proof. Around 54:00, he covers decision logs, rationales, and team memory. Around 56:30, he recommends clear portfolio READMEs, quick starts, and repo tours. Around 58:30, he connects practical writing habits with learning by teaching.
Public explanation also appears in Learn in Public. Shawn Swyx Wang frames public learning as honest progress, correction, and earned expertise around 23:53. Around 47:14, he references open knowledge projects such as collaborative docs and cheat sheets. Learners can help others with notes, corrections, tutorials, and project writeups. Those public explanations can also turn study into open-source portfolio evidence.
Teach for Access and Career Mobility
Several guests connect teaching to access. Jeff’s bootcamp discussion includes part-time models, affordability, and career services. It also includes internships with employer projects. Irina teaches through NGOs, bootcamps, FrauenLoop, and AI Guild.
Around 41:16, Irina discusses recruiting more women into Zoomcamps through targeted outreach, partnerships, and scheduling choices. Erum’s Omdena Academy discussion adds free learner courses, organizational partnerships, and scholarships. It also adds women-focused support and pathways from courses into projects.
That access work matters for career transitions in data. Teaching helps career changers when it gives them a role target, repeated practice, feedback, and visible work. Irina’s learner moving into analytics needs SQL, visualization, product context, and communication. Jeff’s learner moving into data engineering needs Python, SQL, data modeling, cloud basics, and interview practice.
Alexander’s learner moving into ML needs problem framing, validation, system design, and engineering quality.
Guests don’t tell learners to take more courses. They ask learners to study a focused concept and apply it in a realistic project. Then learners explain the work clearly, ask for feedback, and publish enough for another person to evaluate.
People and Episodes
Start with these teaching-focused episodes:
- Teaching Data Engineers with Jeff Katz for bootcamp curriculum, active learning, fundamentals, and job-aligned data engineering education.
- Teaching and Mentoring in Data Analytics with Irina Brudaru for FinTech analytics curriculum, BigQuery labs, mentoring, diversity, and analyst fundamentals.
- Community Building and Teaching in AI & Tech with Erum Afzal for project-to-course AI education and community-based learning paths.
- Teaching Open Science and Reproducible Research with Johanna Bayer for Git, reproducible manuscripts, open-source onboarding, and research software engineering practices.
- Using Visualizations to Explain Machine Learning with Meor Amer for intuition-first machine learning explanations.
- Competitive Machine Learning and Teaching with Alexander Guschin for large-scale ML education, competition rigor, MLOps, and system design assignments.
- DevRel for Data Science with Elle O’Brien for teaching, reproducibility, content, documentation, and developer relations.
- Technical Writing for Data Scientists with Eugene Yan for writing as learning, documentation, and portfolio explanation.
- Learn in Public with Shawn Swyx Wang for public learning, feedback, open knowledge projects, and career visibility.
- DataTalks.Club Behind the Scenes for project-based courses, public deadlines, office hours, and community-supported learning.
Use these adjacent pages for deeper work:
- Community Building and Community for peer support, office hours, events, and contributor paths.
- Documentation and Technical Writing for READMEs, decision logs, tutorials, and team memory.
- Developer Relations and Open Source and Developer Relations for teaching users through demos, docs, videos, and product feedback.
- Career Transition and Career Transitions in Data for moving from study into data, ML, AI engineering, and analytics roles.
- Open Source Portfolio Evidence, Data Engineering Portfolio Projects, and Machine Learning Portfolio Projects for turning learning into visible proof.