Modern Data Engineering: Iceberg, Delta Lake & AI-Powered Pipelines

S20E3

data engineering, data governance, AI, open-source

Original Episode

Use these links for the canonical episode and media sources.

Open the original DataTalks.Club podcast page
Watch on YouTube
Listen on Spotify
Listen on Apple Podcasts

Episode Overview

How can engineering teams build reliable, scalable lakehouse pipelines that combine transactional table formats with AI-driven automation? In this episode, Adrian Brudaru, an economics-trained analyst turned freelance data practitioner and co-founder of a data company focused on open-source tooling, joins us to explore the realities of modern data engineering.

People

Use these links to connect the episode to guest notes.

Adrian Brudaru

Chapter Summary

Use these checkpoints to decide whether to open the source transcript.

0:01 - Episode opening & guest introduction
2:23 - Perspective on evolving data engineering challenges
3:10 - Career journey: startups, freelancing, founding DLT
4:03 - DLT as a Python-based ingestion standard and market impact
7:45 - DLT Plus vision and partnership outreach for freelancers
11:03 - Industry shift toward specialization: governance, data quality, streaming
12:37 - Early-career opportunities: AI projects and startup hiring
14:32 - Modern data stack critique and open-source “postmodern” alternatives
16:40 - 2025 trends: AI integration in data engineering and Apache Iceberg adoption
18:17 - Apache Iceberg explained: table format, Parquet storage, vendor lock-in reduction
21:27 - Database layers and catalog role: storage, compute, access, metadata & lineage
23:41 - Metadata and catalog tooling overview (AWS Glue and peers)
25:58 - DuckDB impact: embeddable local OLAP and portable query engine
27:40 - Cost-efficient pipelines: DuckDB with GitHub Actions and headless table formats
30:31 - Headless table formats and DLT support for Delta Lake and Iceberg
31:29 - dbt’s influence on engineering workflows and alternatives like SQLMesh
35:37 - Workflow orchestration options in 2025: Airflow, Prefect, Dagster, GitHub
38:02 - AI engineering convergence: data engineers building AI agents
41:06 - Beginner roadmap: SQL, Python, capturing business requirements, building
44:42 - Tool selection guidance and vendor caution for modern data stacks
45:56 - Transition paths: senior backend engineers moving into data engineering
48:04 - Job market outlook: senior vs junior data engineering opportunities
49:42 - Table format comparisons: Delta, Hudi, and Iceberg differences
51:19 - Streaming architectures and tools: micro-batching, Kafka, SQS, Flink
56:15 - AI-driven commoditization and code generation in data engineering
59:42 - DLT roadmap: DLT Plus and a marketplace for reusable data products
1:01:19 - Episode wrap-up and key takeaways