Podcast
Building Agentic AI Systems: Pragmatic Agent Engineering, Tooling, Retrieval & Evaluation
Original Episode
Use these links for the canonical episode and media sources.
- Open the original DataTalks.Club podcast page
- Watch on YouTube
- Listen on Spotify
- Listen on Apple Podcasts
Episode Overview
How do you build reliable agentic AI systems that balance practical engineering, tooling, retrieval, and robust evaluation? In this episode, Ranjitha Kulkarni, Staff Machine Learning Engineer at NeuBird.ai, who previously worked on LLM- and agent-powered product features at Dropbox Dash and Microsoft, explores pragmatic approaches to agent design. She draws on her work in speech recognition, language modeling, and assistant evaluation, and on her publications on voice query reformulation and automatic online evaluation.
People
Use these links to connect the episode to guest notes.
Chapter Summary
Use these checkpoints to decide whether to open the source transcript.
- 0:00 - Event Introduction & Community Links
- 3:12 - Early ML Projects: Image Search with OpenCV
- 4:25 - Speech Recognition & Language Modeling Experience
- 4:57 - Transition to Recommendation Systems at Dropbox
- 5:52 - Question Answering & Early Agent Experiments
- 7:44 - Joining NeuBird.ai: Automating On-call with Agents
- 11:00 - Agent Definition: Autonomy, Objectives & LLMs
- 12:31 - Agent Orchestration: Tools, Memory & Knowledge Stores
- 15:10 - Planning Strategies: Single-step, Multi-pass & Self-reflection
- 18:23 - Implementation Approaches: Prompts, SDKs & Tool Wrappers
- 19:58 - Code Agents vs Natural-Language Agents: Trade-offs
- 21:21 - Context Engineering: Designing Effective LLM Inputs
- 22:50 - SRE Workflows Modeled by Agents: Logs, Metrics & Remediation
- 24:59 - Integration Abstractions: Handling Diverse Tooling
- 29:30 - RAG Reality Check: Latency, Cost & Garbage-In/Garbage-Out
- 31:38 - Retrieval Limitations: Reworking Backends for LLM Context
- 32:48 - Context Engineering Techniques: Chunking, Metadata & Wrappers
- 36:11 - Agentic RAG: Using Retrieval as a Tool Within Agents
- 37:39 - Use Cases: When RAG Is Enough vs When Agents Are Needed
- 40:30 - Dynamic Planning Example: Calendar & Meeting Assistant
- 43:06 - Dropbox Dash & AI Productivity Assistants for Enterprises
- 44:08 - Framework Choices: Build from Scratch vs Use Libraries
- 46:00 - Framework Trade-offs: LangChain, OpenAI Agents SDK, smolagents
- 48:00 - Agent Marketplaces & Tool Protocols (MCP)
- 51:17 - Evaluation Strategy: Custom Datasets & System Benchmarks
- 53:20 - Testing Agents: Mocking Tools, Integration & Regression Tests
- 56:02 - Goal-based Evaluation: Outcome Assertions Over Exact Paths
- 58:11 - Specialization Challenge: Why Generic Agent Solutions Lag
- 59:06 - Closing Thoughts & Future Outlook for Agent Engineering