Blueprints for High Availability: Designing Resilient Distributed Systems - Hardcover

Marcus, Evan; Stern, Hal

 
9780471356011: Blueprints for High Availability: Designing Resilient Distributed Systems

Synopsis

"Rely on this book for information on the technologies and methods you'll need to design and implement high-availability systems... It will help you transform the vision of always-on networks into a reality." - Dr. Eric Schmidt, Chairman and CEO, Novell Corporation
Your system will crash! The reason could be something as complex as network congestion or something as mundane as an operating system fault. The good news is that there are steps you can take to maximize your system availability and prevent serious downtime. This authoritative book will provide you with the tools to deploy a system with confidence. The authors guide you through the building of a network that runs with high availability, resiliency, and predictability. They clearly show you how to assess the elements of a system that can fail, select the appropriate level of reliability, and provide steps for designing, implementing, and testing your solution to reduce downtime to a minimum. All the while, they help you determine how much you can afford to spend by balancing costs and benefits. This book of practical, hands-on blueprints:
* Examines what can go wrong with the various components of your system
* Provides twenty key system design principles for attaining resilience and high availability
* Discusses how to arrange disks and disk arrays for protection against hardware failures
* Looks at failovers, the software that manages them, and sorts through the myriad of different failover configurations
* Provides techniques for improving network reliability and redundancy
* Reviews techniques for replicating data and applications to other systems across a network
* Offers guidance on application recovery
* Examines Disaster Recovery

About the Author

EVAN MARCUS is a Senior Systems Engineer at VERITAS Software Corporation and co-designed a key piece of the first commercial Sun-based software for High Availability. He has been the company's consultant for successful implementations of VERITAS High Availability Products around the world.

HAL STERN is a Distinguished Systems Engineer at Sun Microsystems. He has led reliability and improvement teams for several financial services clients and focuses on performance, reliability, and networked system architecture. He is also the author of Managing NFS and NIS.
