Distributed Machine Learning Patterns (Paperback)
Jazper Carter
Sold by CitiRetail, Stevenage, United Kingdom
AbeBooks Seller since 29 June 2022
New - Soft cover
Condition: New
Ships from United Kingdom to U.S.A.
Quantity: 1 available
Paperback. Distributed machine learning systems fail in ways single-node systems never do. A 1024-GPU training job stalls for four hours while every worker reports healthy; gradient synchronization deadlocks leave no stack trace and no alert. A serving cluster absorbs a traffic spike, then silently doubles inference cost because the KV cache policy was tuned for a model half the size. Checkpoint corruption surfaces only after twelve hours of resumed training. These are the predictable failure modes of distributed systems, and the teams that ship reliable distributed ML design against them with patterns that hold across frameworks, clouds, and model scales.

Inside this book, readers will learn how to:

- Design parallelism strategies that fit workload shape and hardware, selecting among data, tensor, pipeline, and expert axes based on architecture, memory budget, and interconnect topology.
- Tune gradient synchronization and sharding, applying ZeRO, FSDP, and pipeline schedules to keep accelerator utilization high without amplifying communication overhead as cluster size grows.
- Build fault-tolerant training pipelines with checkpoint strategies, elastic cluster patterns, and spot instance management that recover from mid-run hardware failures without restarting from epoch zero.
- Operate inference at scale, using continuous batching, paged attention, and KV cache management to maximize throughput and meet latency SLOs under variable load.
- Instrument distributed jobs for observability, tracing per-rank metrics, gradient norms, and communication timings so silent failures surface before consuming days of compute budget.
- Manage multi-tenant clusters securely with workload isolation, quota enforcement, and cost attribution that keep shared GPU infrastructure safe and financially accountable.
- Apply LLM and foundation model patterns for distributed pre-training, RLHF infrastructure, and large-scale inference that generalize across architectures as hardware generations turn over.
- Assess platform maturity using the book's maturity model to locate gaps in reliability, cost efficiency, and operational readiness across the distributed ML stack.

Frameworks rotate; the parallelism decisions, synchronization tradeoffs, and fault-tolerance designs that determine whether a distributed ML system works at scale do not. As foundation models grow larger and serving loads grow steeper, the distance between teams that reason in patterns and teams that copy configurations will only widen.

The book is organized in four parts: Foundations, covering parallelism patterns, data sharding, I/O, and orchestration; Training at Scale, addressing fault-tolerant training, checkpoint management, and spot scheduling; Serving and Operations, covering inference architecture, cost control, observability, and multi-tenant security; and Frontier Patterns, applying everything to LLMs and foundation models and closing with end-to-end case studies and a full platform synthesis.

This book is for ML architects who design distributed systems others depend on, ML engineers and data engineers who build and operate them, and technical team leads who set reliability and cost standards, with platform and SRE engineers as a strong secondary audience. Every chapter opens with a production incident scenario, teaches canonical patterns by name, and closes with a checklist the team can apply immediately. Readers finish with the vocabulary, playbook, and pattern library to ship reliable distributed ML systems with confidence.

This item is printed on demand. Shipping may be from our UK warehouse or from our Australian or US warehouses, depending on stock availability.
Seller Inventory # 9798904980030
"About this title" may belong to another edition of this title.
Orders can be returned within 30 days of receipt.
If you are a consumer you can cancel the contract in accordance with the following. Consumer means any natural person who is acting for purposes which are outside his trade, business, craft or profession.
INFORMATION REGARDING THE RIGHT OF CANCELLATION
Statutory Right to cancel
You have the right to cancel this contract within 14 days without giving any reason.
The cancellation period will expire after 14 days from the day on which you acquire, or a third party other than the carrier and indicated by you acquires, physical possession of the last good or the last lot or piece.
To exercise the right to cancel, you must inform us, CitiRetail, ABC Books c/o International Logistics, Unit 2D Gatwick Gate Industrial Estate, RH11 0TG, Lowfield Heath, United Kingdom, 44 020 3290 3457, of your decision to cancel this contract by a clear statement (e.g. a letter sent by post, fax or e-mail). You may use the attached model cancellation form, but it is not obligatory. You can also electronically fill in and submit a clear statement on our website, under "My Purchases" in "My Account". If you use this option, we will communicate to you an acknowledgement of receipt of such a cancellation on a durable medium (e.g. by e-mail) without delay.
To meet the cancellation deadline, it is sufficient for you to send your communication concerning your exercise of the right to cancel before the cancellation period has expired.
Effects of cancellation
If you cancel this contract, we will reimburse to you all payments received from you, including the costs of delivery (except for the supplementary costs arising if you chose a type of delivery other than the least expensive type of standard delivery offered by us).
We may make a deduction from the reimbursement for loss in value of any goods supplied, if the loss is the result of unnecessary handling by you.
We will make the reimbursement without undue delay, and not later than 14 days after the day on which we are informed about your decision to cancel this contract.
We will make the reimbursement using the same means of payment as you used for the initial transaction, unless you have expressly agreed otherwise; in any event, you will not incur any fees as a result of such reimbursement.
We may withhold reimbursement until we have received the goods back or you have supplied evidence of having sent back the goods, whichever is the earliest.
You shall send back the goods or hand them over to us or CitiRetail, ABC Books c/o International Logistics, Unit 2D Gatwick Gate Industrial Estate, RH11 0TG, Lowfield Heath, United Kingdom, 44 020 3290 3457, without undue delay and in any event not later than 14 days from the day on which you communicate your cancellation of this contract to us. The deadline is met if you send back the goods before the period of 14 days has expired. You will have to bear the direct cost of returning the goods. You are only liable for any diminished value of the goods resulting from handling other than what is necessary to establish the nature, characteristics and functioning of the goods.
Exceptions to the right of cancellation
The right of cancellation does not apply to:
Model withdrawal form
(complete and return this form only if you wish to withdraw from the contract)
To: (CitiRetail, ABC Books c/o International Logistics, Unit 2D Gatwick Gate Industrial Estate, RH11 0TG, Lowfield Heath, United Kingdom, 44 020 3290 3457)
I/We (*) hereby give notice that I/We (*) withdraw from my/our (*) contract of sale of the following goods (*)/for the provision of the following goods (*)/for the provision of the following service (*),
Ordered on (*)/received on (*)
Name of consumer(s)
Address of consumer(s)
Signature of consumer(s) (only if this form is notified on paper)
Date
* Delete as appropriate.
Please note that titles are dispatched from our US, Canadian or Australian warehouses. Delivery times are specified in the shipping terms. Orders ship within 2 business days. Delivery to your door then takes 7-14 days.
| Order quantity | 7 to 60 business days | 7 to 14 business days |
|---|---|---|
| First item | £ 37.00 | £ 37.00 |
Delivery times are set by sellers and vary by carrier and location. Orders passing through Customs may face delays and buyers are responsible for any associated duties or fees. Sellers may contact you regarding additional charges to cover any increased costs to ship your items.