DataOps & GitOps for Data Teams: Onboarding, IaC, Reproducibility & Production Best Practices
Original Episode
Use these links for the canonical episode and media sources.
- Open the original DataTalks.Club podcast page
- Watch on YouTube
- Listen on Spotify
- Listen on Apple Podcasts
Episode Overview
How do you make data work less fragile and easier to onboard while keeping production safe and reproducible? In this episode, Tomasz Hinc, a DataOps practitioner from Poznań with roots in econometrics, product analytics, data engineering, and ML, walks through practical DataOps and GitOps patterns for data teams. We cover platform onboarding (requesting infra vs. opening a merge request), Infrastructure as Code with Terraform, Terragrunt, and Atlantis, and a GitOps workflow that runs from branch to Atlantis dry run and apply.
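To make the branch-to-apply flow concrete, here is a minimal Python sketch of the ordering it implies: a change is only applied after it has gone through a branch, a merge request, and a dry run. This is an illustration, not how Atlantis itself is implemented or how the guest's team runs it; the `Step` and `InfraChange` names are hypothetical and exist only for this example.

```python
from enum import IntEnum


class Step(IntEnum):
    """The four workflow steps named in the episode, in order."""
    BRANCH = 1
    MERGE_REQUEST = 2
    DRY_RUN = 3  # the planned changes are shown for review, nothing is changed yet
    APPLY = 4


class InfraChange:
    """Hypothetical tracker for one infrastructure change (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.completed = []  # steps finished so far, in order

    def advance(self, step):
        """Record a step, refusing any step that is taken out of order."""
        if len(self.completed) == len(Step):
            raise ValueError(f"{self.name}: workflow already finished")
        expected = Step(len(self.completed) + 1)
        if step is not expected:
            raise ValueError(
                f"{self.name}: expected {expected.name}, got {step.name}"
            )
        self.completed.append(step)

    @property
    def applied(self):
        """True once the APPLY step has been recorded."""
        return Step.APPLY in self.completed
```

Calling `advance` with each member of `Step` in order completes the workflow, while trying `Step.APPLY` straight after the merge request raises `ValueError`. That mirrors the point of the dry run: nobody changes production without first seeing what would change.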
Chapter Summary
Use these checkpoints to decide whether to open the source transcript.
- 0:00 - Podcast Introduction
- 1:40 - Guest Introduction & Episode Overview
- 2:25 - Career Journey: Econometrics → ML Trainee → Data Roles
- 4:31 - Early Experience: OLX, Government Statistics, Academia
- 5:20 - ML Education: Multi-Dimensional Analysis to Machine Learning
- 6:34 - Behavioral Analysis & Product Analytics: Clickstream Modeling
- 7:08 - Operational Realities: ETL Failures, Production Constraints
- 12:40 - Platform Onboarding: Requesting Infra vs. Doing a Merge Request
- 13:07 - Platform Teams’ Role: Review, Enablement, and Safe Practices
- 14:12 - Motivation Shift: From Model-Centric to Data-Centric Work
- 18:59 - Defining DataOps: Enabling Faster, Less Scary Data Work (DataOps, DevOps)
- 20:56 - DataOps & Infra: SQL, Secrets, GitOps, and Developer Enablement
- 23:04 - GitOps & IaC Overview: Terraform, Terragrunt, Atlantis
- 23:42 - Infrastructure as Code: Declarative Configurations & Reproducibility
- 26:21 - GitOps Workflow: Branch, Merge Request, Atlantis Dry Run, Apply
- 27:34 - Onboarding Friction: Tooling Challenges for Data Scientists
- 29:34 - Learning Path: Narrow Scope, Hands-On Mentorship, Roadmap Advice
- 35:55 - Terminal Comfort: Shell Setup, Autocomplete, and Productivity Tweaks
- 38:20 - Learning Resources: YouTube, Articles, and CLI Tutorials
- 40:44 - DataOps vs Data Engineering: Support & Communication vs Pipeline Coding
- 41:52 - Proactive Support: Monitoring, Onboarding, and Cross-Team Education
- 44:23 - Suitable Backgrounds: Any Data Role; Log Reading & Troubleshooting
- 47:55 - Minimal Operational Skills: Git, Command Line, IAM, Password Managers
- 54:37 - Distinction from Management: Cross-Team Enablement vs Team Leads
- 56:44 - Infrastructure Choices for Data: Batch Workloads, ECS/AWS Batch vs Kubernetes
- 58:26 - Company-Scale Migration: Jenkins → GitLab CI and Broad Collaboration
- 1:01:27 - Reproducibility & Dependencies: Fixed Versions, Docker, Silent Failures
- 1:02:28 - Confidence in Data: Pragmatic Edge-Case Checks & Airflow Caveats