Practical patterns for scaling machine learning from your laptop to a distributed cluster.
Distributed machine learning systems allow developers to handle extremely large datasets across multiple clusters, take advantage of automation tools, and benefit from hardware acceleration. This book reveals best-practice techniques and insider tips for tackling the challenges of scaling machine learning systems.
In Distributed Machine Learning Patterns you will learn how to:
Apply distributed systems patterns to build scalable and reliable machine learning projects
Build ML pipelines with data ingestion, distributed training, model serving, and more
Automate ML tasks with Kubernetes, TensorFlow, Kubeflow, and Argo Workflows
Make trade-offs between different patterns and approaches
Manage and monitor machine learning workloads at scale

Inside Distributed Machine Learning Patterns you’ll learn to apply established distributed systems patterns to machine learning projects, and you’ll explore new patterns created specifically for machine learning. Firmly rooted in the real world, this book demonstrates how to apply patterns using examples based on TensorFlow, Kubernetes, Kubeflow, and Argo Workflows. Hands-on projects and clear, practical DevOps techniques let you launch, manage, and monitor cloud-native distributed machine learning pipelines.
About the technology
Deploying a machine learning application on a modern distributed system puts the spotlight on reliability, performance, security, and other operational concerns. In this in-depth guide, Yuan Tang, project lead of Argo and Kubeflow, shares patterns, examples, and hard-won insights on taking an ML model from a single device to a distributed cluster.
About the book
Distributed Machine Learning Patterns provides dozens of techniques for designing and deploying distributed machine learning systems. In it, you’ll learn patterns for distributed model training, managing unexpected failures, and dynamic model serving. You’ll appreciate the practical examples that accompany each pattern along with a full-scale project that implements distributed model training and inference with autoscaling on Kubernetes.
What's inside
Data ingestion, distributed training, model serving, and more
Automating Kubernetes and TensorFlow with Kubeflow and Argo Workflows
Managing and monitoring workloads at scale
About the reader
For data analysts and engineers familiar with the basics of machine learning, Bash, Python, and Docker.
About the author
Yuan Tang is a project lead of Argo and Kubeflow, maintainer of TensorFlow and XGBoost, and author of numerous open source projects.
Table of Contents
PART 1 BASIC CONCEPTS AND BACKGROUND
1 Introduction to distributed machine learning systems
PART 2 PATTERNS OF DISTRIBUTED MACHINE LEARNING SYSTEMS
2 Data ingestion patterns
3 Distributed training patterns
4 Model serving patterns
5 Workflow patterns
6 Operation patterns
PART 3 BUILDING A DISTRIBUTED MACHINE LEARNING WORKFLOW
7 Project overview and system
Yuan Tang's 'Distributed Machine Learning Patterns' emerges as a crucial text in understanding the intricate landscape of contemporary machine learning challenges, which are inherently distributed and demand reproducibility. This book stands out for its adept blend of theoretical insights and practical applications, making it an indispensable resource for newcomers and seasoned practitioners. Tang demystifies complex concepts and offers a comprehensive guide to implementing distributed machine learning systems. Readers seeking a deep dive into the subject will find no better starting point than this book, as it meticulously bridges the gap between theory and real-world application in distributed machine learning.
This book is a must-read if you work in machine learning and want to understand how to release your model to production. The author uses clear and simple language to describe the entire process, and the diagram in the book makes it very easy to follow and understand. It is a must-read for everyone working in the machine learning space.
A few days ago, I started reading this book, and I loved that it covers patterns for the construction and operation of ML solutions. Chapter 7 was my favorite because it adequately synthesizes the architecture of ML solutions. Highly recommended!
The content is useful, but the author repeats themself over and over. The writing is also unclear, even when the topic is straightforward. Additionally, the book goes into machine learning basics that don't seem necessary or relevant for a book focused on distributed ML patterns.