Podcast
Applying Computer Vision Research to Building Production-Ready AI Systems for Real-World Deployment
Original Episode
Use these links for the canonical episode and media sources.
- Open the original DataTalks.Club podcast page
- Watch on YouTube
- Listen on Spotify
- Listen on Apple Podcasts
Episode Overview
How do you take computer vision research out of the lab and turn it into production-ready AI that actually works in the real world? In this episode, Aishwarya Jadhav, a Machine Learning Engineer with over four years of industry experience and a Master’s from Carnegie Mellon University, walks through the challenges of applying computer vision research to production systems. Her background spans multimodal LLMs, generative AI, and computer vision, with research experience in multimodal deep learning and text.
People
Use these links to connect the episode to guest notes.
Chapter Summary
Use these checkpoints to decide whether to open the source transcript.
- 0:00 - Podcast Introduction
- 1:33 - Guest Bio & Career Overview: Finance to Self-Driving AI
- 2:51 - Morgan Stanley: Big Data Engineering & Transition to ML
- 3:55 - Carnegie Mellon: Research Focus & Computer Vision Projects
- 5:39 - AI Guide Dog: Mobile Navigation for the Visually Impaired
- 9:14 - AI Guide Dog: Beta Testing, Iterative Development, Hardware Constraints
- 11:22 - Sensor Tradeoffs: LiDAR, Radar, and Cost Considerations
- 11:58 - LiDAR vs Cameras: Principles and Automotive Use Cases
- 14:45 - Tesla’s Camera-First Perception: 360° Vision without LiDAR
- 16:06 - Autopilot Use Cases: Driver Assistance vs Full Autonomy
- 19:41 - Waymo Ride-Hailing: App, Service Model, and Driverless Rides
- 19:57 - Gesture Recognition for Traffic Control: Police & Construction Signals
- 22:17 - On-Vehicle Inference: Performance Constraints and Optimization
- 23:28 - Model Compression Techniques: Quantization and Speedups
- 24:05 - Malaria Mapping: AI for Social Good Using Satellite & Topographic Data
- 27:03 - Malaria Project Impact: Field Feedback and Resource Optimization
- 29:45 - Validation Pipeline: Simulation, Closed Tracks, and On-Road Testing
- 31:02 - Sensor Data Management: Collection, Privacy, and Scale
- 32:09 - Labeling Strategy: Human Annotation and Automated Labeling
- 32:43 - Model Release Cadence: Safety Checks and Staged Deployments
- 36:12 - Cross-Domain Transfer: Perception Techniques for Robotics & Drones
- 37:18 - Real-World Complexity: Edge Cases, Geography, and System Coordination
- 43:44 - Reinforcement Learning vs Perception: Roles and Practical Constraints
- 51:28 - Testing Sensitive Cases: Evaluation Stages and Inherited Tests
- 52:53 - Multimodal LLMs in Autonomous Driving: Research and Practical Challenges
- 55:25 - Career Pathways: Skills, Projects, and Entry Routes into Self-Driving AI
- 56:24 - Practical Projects & Tools: Vision Apps, LLMs, and Coding Agents
- 58:35 - Closing Remarks and Final Advice