Distributed machine learning systems allow developers to handle extremely large datasets across multiple clusters, take advantage of automation tools, and benefit from hardware acceleration. This book reveals best-practice techniques and insider tips for tackling the challenges of scaling machine learning systems.
In Distributed Machine Learning Patterns, you will learn how to:
- Apply distributed systems patterns to build scalable and reliable machine learning projects
- Build ML pipelines with data ingestion, distributed training, model serving, and more
- Automate ML tasks with Kubernetes, TensorFlow, Kubeflow, and Argo Workflows
- Make trade-offs between different patterns and approaches
- Manage and monitor machine learning workloads at scale
Inside Distributed Machine Learning Patterns, you’ll learn to apply established distributed systems patterns to machine learning projects and explore cutting-edge patterns created specifically for machine learning. Firmly rooted in the real world, this book demonstrates how to apply patterns using examples based on TensorFlow, Kubernetes, Kubeflow, and Argo Workflows. Hands-on projects and clear, practical DevOps techniques let you easily launch, manage, and monitor cloud-native distributed machine learning pipelines.