Teaching

Teaching data, ML, and AI through projects, feedback, community, documentation, bootcamps, and public explanation.

Related Wiki Pages

Community Building, Documentation, Technical Writing, Developer Relations, LLMs, Retrieval-Augmented Generation, Data Engineer Roadmap, Career Transitions in Data, Open Source Portfolio Evidence, Data Engineering Portfolio Projects, Machine Learning Portfolio Projects, Job Search

Teaching is practical education for people trying to do real data, ML, and AI work. It’s rarely just lecture delivery. The strongest examples tie it to curriculum design, mentoring, and project work. Feedback, documentation, community support, and public explanation also do real teaching work.

For role-specific study paths, use the Data Engineering Roadmap, AI Engineering Roadmap, and MLOps Roadmap. For public work that proves skill, use Open Source Portfolio Evidence, Data Engineering Portfolio Projects, and Machine Learning Portfolio Projects.

Teach Through Real Work

Projects tend to sit at the center of technical education. Jeff Katz explains a bootcamp model that uses employer research, active learning, and repeated labs. Students also give feedback during the course. He names Python, SQL, and cloud fundamentals as the core skills for junior data engineers ^[1].

Jeff also explains why his junior curriculum leaves out Spark, Kafka, and Kubernetes early on: learners need interviewable fundamentals before platform breadth. That same episode connects teaching to career transition because the curriculum exists to move students from study into employment ^[1].

Erum Afzal gives the community-scale version through Omdena, which turns real AI projects into courses. Collaborators work on practical problems, and the academy turns what they learned into structured lessons, foundational courses, and instructor pathways. She describes the project-to-course model directly ^[2].

Erum also discusses basic, intermediate, and advanced course tiers. Those tiers keep a large community from forcing every learner through the same path ^[2].

The DataTalks.Club course model centers project-based, end-to-end learning through the ML Bookcamp and Machine Learning Zoomcamp. The discussion connects learning with projects, notes, READMEs, and GitHub ^[3] ^[4]. This makes teaching inseparable from community building because office hours, public deadlines, and peer questions help learners keep moving.

Free access matters to the DataTalks.Club course model. The free-to-learn approach came from Open Data Science and from the value of free courses early in a data science career. The Data Engineering Zoomcamp grew when a student proposed a course and several community members split modules. Teaching became a shared community project rather than one instructor’s content pipeline ^[5].

Daniel Egbo gives a learner-side example of that teaching model. ML Zoomcamp helped move his astronomy work from notebooks toward reusable code and production practices. Later course projects added orchestration, object storage, Spark, and warehouse thinking. Teaching becomes a bridge between domain research and practical engineering. It isn’t only a set of lectures ^[6] ^[7].

Build Curriculum From Constraints

Strong curricula begin with constraints. Teachers need to know the target role, learner background, and employer expectations. They also need to account for data access, tools, and available time.

Jeff’s bootcamp episode shows this clearly when he describes market research and employer validation. He also covers syllabi, labs, and reinforcement cycles ^[8].

Jeff says most junior data engineering course time should stay on Python and SQL. Tools get a smaller share: he describes the balance as roughly 85% Python and SQL and 15% newer tools ^[9]. That guidance fits the data engineering roadmap because a roadmap should sequence skills, not collect every current tool.
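To make the ratio concrete, here is a minimal Python sketch that splits a course's hours using the rough 85/15 balance described above. The 200-hour total, the function name `split_course_hours`, and the dictionary keys are hypothetical and chosen only for illustration. The source gives an approximate ratio, not a course length or a schedule.

```python
def split_course_hours(total_hours: int, fundamentals_pct: int = 85) -> dict[str, int]:
    """Split course hours between fundamentals and newer tools.

    fundamentals_pct defaults to the rough 85% share for Python and SQL;
    the remainder goes to newer tools. All names here are illustrative.
    """
    if total_hours < 0:
        raise ValueError("total_hours must be non-negative")
    if not 0 <= fundamentals_pct <= 100:
        raise ValueError("fundamentals_pct must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    fundamentals = total_hours * fundamentals_pct // 100
    return {
        "python_and_sql": fundamentals,
        "newer_tools": total_hours - fundamentals,
    }


# Hypothetical 200-hour course: 170 hours of fundamentals, 30 hours of tools.
plan = split_course_hours(200)
print(plan)
```

Under these assumptions, a tool such as Spark would have to fit inside the smaller share, which is one way to read the advice to sequence skills instead of collecting every current tool.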

Irina Brudaru adds a FinTech curriculum example. For a FinTech analytics program, she collected market practices, datasets, and exercises. She then found teachers for modules such as fraud, chargebacks, BigQuery, and data storytelling. The program also included business skills. Hands-on cloud teaching meant BigQuery access and shared datasets ^[10].

Irina returns to analyst fundamentals by naming SQL, visualization, and product tracking. She also includes soft skills and communication ^[10].

Alexander Guschin connects curriculum design to production ML. His teaching arc moves from Kaggle competition rigor to production ML, MLOps, and system design. His assignments also include communication and engineering quality. The episode covers problem-centered assignments such as bot detection ^[11].

The episode also discusses dual evaluation across model quality and technical execution. Competition-style learning becomes useful when the curriculum also teaches maintainable systems and collaboration ^[11].

David Bader brings curriculum design to the university-program scale. He describes founding data science schools and degree programs at Georgia Tech and NJIT. His approach aligns curricula with regional workforce needs and industry partnerships (NSF, Accenture, NVIDIA). He treats the research lab like a startup that ships open-source code such as Arkouda rather than only publishing papers. He also emphasizes mentoring from high school through PhD students ^[12].

Use Feedback and Mentoring as Teaching Infrastructure

Teaching depends on feedback. Jeff says teachers should constantly check what students actually learned rather than assume that a clear lecture became understanding. He contrasts passive and active learning: students learn more when they do the work, ask questions, and expose misunderstandings ^[1].

Irina’s mentoring discussion shows the same idea at one-to-one scale. She describes adapting explanations to the learner, including visual explanations for tables and databases. She also discusses finding technical reviewers and asking for feedback from engineering, domain, and community peers. Mentoring also gives learners a feedback channel that helps them correct SQL, modeling, product logic, and career direction ^[10].

Community gives feedback a place to happen, and Erum’s Omdena discussion covers live sessions, selection, and graduation. She also connects communities to skill discovery and faster learning ^[2].

The community-building account adds the same structure. Events, office hours, answered questions, and mentors make learning more durable than a course watched alone. Community also accelerates software-to-ML learning because peer groups give learners a place to test ideas and get feedback ^[13].

Teaching becomes part of the same feedback loop. Writing about a new topic, explaining it in public, and taking questions force learners to check what they understand. That work can become portfolio evidence or a career transition signal ^[13].

Teaching is community work, so it belongs with Community and Community Building. When that teaching moves into a larger event, the organizer questions belong with data AI conference building. Speaker programs, workshops, and community feedback become one learning system.

Teach Reproducibility, Documentation, and Open Work

Johanna Bayer treats teaching as research culture change. She describes teaching open science with Git, homework support, and course structure. She references Carpentries-style beginner curricula. The episode names packaging, environments, formatting, and tests as coding practices students should learn ^[14] ^[15].

That episode belongs with Documentation because reproducible work needs more than code. Johanna discusses README files, contribution guides, issues, and project communication for open-source projects ^[16].

Johanna connects open code to citations, collaboration, and career visibility. For data and ML learners, those are also portfolio signals. A good project explains how to run, review, and extend the work ^[17].

The resource path matters too. Johanna points learners toward The Turing Way, The Carpentries, and related ML handbooks. Those resources make open-science teaching a structured beginner curriculum rather than a vague call to “share code” ^[18].

Elle O’Brien makes a similar bridge from university teaching to developer relations. She discusses applied data science teaching and research reproducibility, then links teaching with DevRel through curriculum design and reusable video content ^[19].

The same work overlaps with Developer Relations, Open Source and Developer Relations, and Technical Writing because tutorials teach users through explanation. Demos teach through examples, and video lessons do the same. These materials also send product feedback back to maintainers.

Demo-first education shows that overlap. A useful developer-education video starts with the goal and keeps pace. It walks through the working feature so the viewer can reproduce the path rather than only hear a concept ^[20].

Make Concepts Visible and Explainable

Meor Amer focuses on explanation craft by using visuals to build intuition before the math. He discusses visualizing the verb and using metaphors. He also recommends breaking and modifying code to understand machine learning behavior ^[21].

This visual-first approach complements Irina’s teaching of cohort analysis, fraud, and SQL. It also complements Eugene Yan’s writing advice ^[22].

Eugene Yan presents writing as a way to share, learn, and build career proof. He covers decision logs, rationales, and team memory. He recommends clear portfolio READMEs, quick starts, and repo tours, then connects practical writing habits with learning by teaching ^[22].

Public explanation also supports learn-in-public practice. Shawn “swyx” Wang frames public learning as honest progress, correction, and earned expertise. He references open knowledge projects such as collaborative docs and cheat sheets. Learners can help others with notes, corrections, tutorials, and project writeups. Those public explanations can also turn study into open-source portfolio evidence ^[23].

For the AI career-switch version of this practice, see learning in public for an AI career switch.

Teach for Access and Career Mobility

Teaching also works as access infrastructure. Jeff’s bootcamp discussion includes part-time models, affordability, career services, and internships with employer projects. Irina teaches through NGOs and bootcamps, including FrauenLoop and AI Guild.

Irina discusses targeted outreach and partnerships for recruiting more women into Zoomcamps. She also names scheduling choices ^[10]. Erum’s Omdena Academy discussion adds free learner courses, organizational partnerships, and scholarships. It also adds women-focused support and pathways from courses into projects ^[2].

That access work matters for career transitions in data. Teaching helps career changers when it gives them a role target, repeated practice, and visible work. It also gives them a way to learn from interviews and rejections ^[24]. For AI engineering learners, the same access route connects bootcamp structure and visible practice to Nontraditional AI Engineering.

Irina’s learner moving into analytics needs SQL, visualization, product context, and communication. Jeff’s learner moving into data engineering needs Python, SQL, and data modeling. They also need cloud basics and interview practice.

DataTalks.Club keeps the course portfolio free to learn while adding newer LLM/RAG material for current AI engineering demand ^[5] ^[25]. Once the community has a practical topic, this course path connects teaching with LLMs and Retrieval-Augmented Generation, alongside the AI Engineering Roadmap.

Alexander’s learner moving into ML needs problem framing, validation, system design, and engineering quality.

University teaching can also become part of an independent practitioner income mix when expertise is visible enough to create course, curriculum, training, and consulting opportunities. Noah Gift frames course and curriculum work as a more scalable part of independent work than unlimited consulting. He then describes the university route as deep subject expertise plus professor relationships and written credibility such as a book. Readers should connect university teaching with technical writing, developer relations, data freelancing strategy, and career transitions in data ^[26] ^[27].

Learners shouldn’t default to more courses. Instead, they should study a focused concept and apply it in a realistic project. Then they should explain the work clearly, ask for feedback, and publish enough for another person to evaluate.

The learning loop connects to community, documentation, and portfolio pages.

Community Building and Community for peer support, office hours, events, and contributor paths.
Documentation and Technical Writing for READMEs, decision logs, tutorials, and team memory.
Developer Relations and Open Source and Developer Relations for teaching users through demos, docs, videos, and product feedback.
Career Transition and Career Transitions in Data for moving from study into data, ML, AI engineering, and analytics roles.
Open Source Portfolio Evidence, Data Engineering Portfolio Projects, and Machine Learning Portfolio Projects for turning learning into visible proof.

DataTalks.Club