How to Become a Data Engineer: Skills, MLOps, Pipelines, SQL, CI/CD & Cloud
Original Episode
Use these links for the canonical episode and media sources.
- Open the original DataTalks.Club podcast page
- Watch on YouTube
- Listen on Spotify
- Listen on Apple Podcasts
Episode Overview
In this episode, Ellen König, Head of Engineering at alcemy, shares her journey from software and data science to data engineering leadership. She explains why many professionals make the switch to data engineering, the skills that matter most (from DevOps and CI/CD to collaboration), and how to prepare through side projects and software fundamentals.
Chapter Summary
Use these checkpoints to decide whether to open the source transcript.
- 0:00 - Episode Introduction & Guest Overview
- 1:51 - Career Narrative: From Backend Developer to Data Engineering Lead
- 6:32 - Motivation to Switch: Black-Box Models, Code Quality, and Professional Fit
- 9:41 - Role Overlap: Data Science Tasks That Are Data Engineering Work
- 12:02 - Data Intuition: How Data Is Produced, Structured, and Biased
- 13:55 - Transferable Strengths: Pipelines, Stakeholder Communication, Exploration
- 15:02 - Core Upskills: Collaborative Coding, CI/CD and DevOps Practices
- 17:34 - MLOps vs Research: When Data Scientists Need Production Engineering Skills
- 19:36 - Learning Pathways: On-the-Job Mentorship, Bootcamps, and Courses
- 21:25 - Experiment First: Side Projects and Small Work Assignments Before Switching
- 23:41 - Software Foundations: Take General Dev Courses (Web, Mobile) to Learn Engineering
- 26:20 - Essential Course Components: Git, Docker, Testing, CLI, Clean Code
- 28:54 - Language Guidance: SQL & Python for Analytics; Java/Scala for Streaming
- 32:43 - Market Dynamics: Strong Demand for Data Engineers and Expectation Gaps
- 35:40 - Teamwork Shift: Adapting to Pair Programming and Close Collaboration
- 38:20 - Organizational Models: Embedded Data Engineers vs Central Platform Teams
- 39:30 - Intersection Roles: Analytics Engineers, Data Science Engineers, MLOps
- 41:29 - Project Recipes: Build Scrapers, ETL Pipelines, Schedulers (Airflow)
- 44:00 - Portfolio Example: Domain-Focused Pipelines with Real Data & Automation
- 49:22 - Cloud Cost Control: Billing Exploration, Budgets, and Alerting
- 52:46 - Entry Strategy: When to Apply for Entry-Level Roles vs Internships
- 55:46 - Career Acceleration: Benefits of Consultancies and Large Companies
- 58:36 - Cloud Choice: Practical Differences, Local Demand, and Free Tiers