Podcast

Building Agentic AI Systems: Pragmatic Agent Engineering, Tooling, Retrieval & Evaluation

S22E1

Open original DataTalks.Club episode

YouTube, Spotify, Apple Podcasts

LLMs AI agent engineering retrieval-augmented generation MLOps tools

Original Episode

Use these links for the canonical episode and media sources.

Episode Overview

How do you build reliable agentic AI systems that balance practical engineering, tooling, retrieval, and robust evaluation? In this episode, Ranjitha Kulkarni, Staff Machine Learning Engineer at NeuBird.ai and a former engineer on LLM- and agent-powered product features at Dropbox Dash and Microsoft, explores pragmatic approaches to agent design. She draws on her work in speech recognition, language modeling, and assistant evaluation, and on her publications on voice query reformulation and automatic online evaluation.

People

Use these links to connect the episode to guest notes.

Ranjitha Kulkarni

Chapter Summary

Use these checkpoints to decide whether to open the source transcript.

0:00 - Event Introduction & Community Links
3:12 - Early ML Projects: Image Search with OpenCV
4:25 - Speech Recognition & Language Modeling Experience
4:57 - Transition to Recommendation Systems at Dropbox
5:52 - Question Answering & Early Agent Experiments
7:44 - Joining NeuBird.ai: Automating On-call with Agents
11:00 - Agent Definition: Autonomy, Objectives & LLMs
12:31 - Agent Orchestration: Tools, Memory & Knowledge Stores
15:10 - Planning Strategies: Single-step, Multi-pass & Self-reflection
18:23 - Implementation Approaches: Prompts, SDKs & Tool Wrappers
19:58 - Code Agents vs Natural-Language Agents: Trade-offs
21:21 - Context Engineering: Designing Effective LLM Inputs
22:50 - SRE Workflows Modeled by Agents: Logs, Metrics & Remediation
24:59 - Integration Abstractions: Handling Diverse Tooling
29:30 - RAG Reality Check: Latency, Cost & Garbage-In/Garbage-Out
31:38 - Retrieval Limitations: Reworking Backends for LLM Context
32:48 - Context Engineering Techniques: Chunking, Metadata & Wrappers
36:11 - Agentic RAG: Using Retrieval as a Tool Within Agents
37:39 - Use Cases: When RAG Is Enough vs When Agents Are Needed
40:30 - Dynamic Planning Example: Calendar & Meeting Assistant
43:06 - Dropbox Dash & AI Productivity Assistants for Enterprises
44:08 - Framework Choices: Build from Scratch vs Use Libraries
46:00 - Framework Trade-offs: LangChain, OpenAI Agents SDK, smolagents
48:00 - Agent Marketplaces & Tool Protocols (MCP)
51:17 - Evaluation Strategy: Custom Datasets & System Benchmarks
53:20 - Testing Agents: Mocking Tools, Integration & Regression Tests
56:02 - Goal-based Evaluation: Outcome Assertions Over Exact Paths
58:11 - Specialization Challenge: Why Generic Agent Solutions Lag
59:06 - Closing Thoughts & Future Outlook for Agent Engineering