Podcast

Data Privacy Playbook: Differential Privacy, Federated Learning, PETs & Consent UX

S14E2

data governance, data privacy, machine learning, federated learning

Original Episode

Use these links for the canonical episode and media sources.

Open the original DataTalks.Club podcast page
Watch on YouTube
Listen on Spotify
Listen on Apple Podcasts

Episode Overview

How can teams build useful machine learning while respecting user privacy, staying compliant, and limiting re-identification risk? In this episode, Katharine Jarmul, privacy activist and Principal Data Scientist at ThoughtWorks Germany, walks through a practical Data Privacy Playbook focused on differential privacy, federated learning, privacy-enhancing technologies (PETs), and consent UX.

People

Use these links to connect the episode to guest notes.

Katharine Jarmul

Chapter Summary

Use these checkpoints to decide whether to open the source transcript.

0:00 - Episode Introduction
1:40 - Guest Introduction: Katharine Jarmul, privacy activist, ML engineer, ThoughtWorks
2:32 - Career Journey: data journalism, NLP, consulting, and machine learning
9:08 - Startup Focus: KIProtect, pseudonymisation, encrypted & federated ML
11:33 - Privacy Regulation Overview: GDPR, CCPA, CPRA and cookie consent defaults
14:35 - Cookie Consent & Opt-Out UX: one-click rejects and user behavior
16:24 - Defining Data Privacy: legal, social, and technical perspectives
21:35 - Practical Data Privacy (book): availability, previews, and giveaways
22:38 - Bridging Legal & Technical Views: privacy risk, translation, and collaboration
25:12 - User Profiling & Fingerprinting: browser history, apps, and re-identification
30:15 - Privacy-Friendly Personalization: session-based intent and ephemeral inference
33:08 - Privacy Engineering & PETs: encrypted ML, federated learning, and architecture
35:09 - Business Case for Privacy: risk management, regulation, and customer trust
40:50 - Differential Privacy Explained: formal definition, use cases, and libraries
45:08 - Anonymization Pitfalls: hashing, k-anonymity, Netflix de-anonymization lessons
47:00 - Designing for Privacy: consent, data minimization, and workflow practices
52:35 - Generative AI & Privacy: ChatGPT incidents, consent, retention, and enterprise
59:29 - Deploying Localized Models: Azure localization, fine-tuning, and ownership
1:01:15 - Further Learning: Probably Private newsletter, notebooks, and differential