
Blueprints For High Availability


DOCUMENT INFORMATION

Basic information

Title: Blueprints For High Availability
Authors: Evan Marcus, Hal Stern
School: Wiley Publishing, Inc.
Field: Information Technology
Type: Book
Year of publication: 2003
City: Indianapolis
Pages: 31
File size: 279.98 KB




Evan Marcus

Hal Stern

Blueprints for High Availability

Second Edition


Development Editor: Scott Amerman

Editorial Manager: Kathryn A. Malm

Production Editor: Vincent Kunkemueller

Text Design & Composition: Wiley Composition Services

Copyright © 2003 by Wiley Publishing, Inc., Indianapolis, Indiana. All rights reserved. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8700. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4447, E-mail: permcoordinator@wiley.com.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Trademarks: Wiley, the Wiley Publishing logo and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data is available from the publisher.

ISBN: 0-471-43026-9

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1


For Carol, Hannah, Madeline, and Jonathan

—Evan Marcus

For Toby, Elana, and Benjamin

—Hal Stern


Contents vii

For the Second Edition xix

Preface from the First Edition xxiv


Chapter 3 The Value of Availability 31

What Is High Availability? 31

Direct Costs of Downtime 34

Indirect Costs of Downtime 36

The Value of Availability 37

Example 1: Clustering Two Nodes 42

Example 2: Unknown Cost of Downtime 46

The Availability Continuum 47

The Availability Index 51

The Lifecycle of an Outage 52

Chapter 4 The Politics of Availability 61

Beginning the Persuasion Process 61

Delivering the Message 70

The Slide Presentation 70

After the Message Is Delivered 73


Chapter 5 20 Key High Availability Design Principles 75

#18: Remove Single Points of Failure (SPOFs) 78

#16: Consolidate Your Servers 81

#14: Enforce Change Control 83

#13: Document Everything 84

#12: Employ Service Level Agreements 87

#9: Separate Your Environments 90

#8: Learn from History 92

#6: Choose Mature Software 94

#5: Choose Mature, Reliable Hardware 95

#4: Reuse Configurations 97

#3: Exploit External Resources 98

#2: One Problem, One Solution 99

#1: K.I.S.S. (Keep It Simple...) 101

Chapter 6 Backups and Restores 105

The Basic Rules for Backups 106

Do Backups Really Offer High Availability? 108

What Should Get Backed Up? 109

Getting Backups Off-Site 110

Commercial or Homegrown? 111

Examples of Commercial Backup Software 113

Commercial Backup Software Features 113

Improving Backup Performance:

Solving for Performance 122


Use More Hardware 135

Third-Mirror Breakoff 136

Sophisticated Software Features 138

Copy-on-Write Snapshots 138

Fast and Flash Backup 141

Handling Backup Tapes and Data 141

General Backup Security 144

Disk Space Requirements for Restores 146

Chapter 7 Highly Available Data Management 149

Four Fundamental Truths 150

Likelihood of Failure of Disks 150

Ensuring Data Accessibility 151

Six Independent Layers of Data Storage and Management 152

Disk Hardware and Connectivity Terminology 153


Managing Disk and Volume Availability 180

Chapter 8 SAN, NAS, and Virtualization 183

Storage Area Networks (SANs) 184

Network Failure Taxonomy 204

Network Reliability Challenges 205

Network Failure Modes 207

Physical Device Failures 208

IP Address Configuration 209

Congestion-Induced Failures 211

Network Traffic Congestion 211

Design and Operations Guidelines 213

Building Redundant Networks 214

Redundant Network Connections 216

Redundant Network Attach 217

Multiple Network Attach 217

Configuring Multiple Networks 220

IP Routing Redundancy 223

Dynamic Route Recovery 224

Static Route Recovery with VRRP 225

Routing Recovery Guidelines 226

Choosing Your Network Recovery Model 227

Load Balancing and Network Redirection 228


Network Service Reliability 232

Network Service Dependencies 233

Hardening Core Services 236

Denial-of-Service Attacks 237

Chapter 10 Data Centers and the Local Environment 241

Advantages and Disadvantages to Data Center Racks 244

The China Syndrome Test 247

Balancing Security and Access 247

Off-Site Hosting Facilities 250

Chapter 11 People and Processes 263

System Management and Modifications 264

Maintenance Plans and Processes 265

Working with Your Vendors 274

The Vendor’s Role in System Recovery 275

Vendor Consulting Services 277

The Audience for Documentation 282

Documentation and Security 283

Reviewing Documentation 284

System Administrators 284


Chapter 12 Clients and Consumers 291

Hardening Enterprise Clients 292

Database Application Recovery 299

Chapter 13 Application Design 303

Application Recovery Overview 304

Application Failure Modes 305

Application Recovery Techniques 306

Kinder, Gentler Failures 308

Application Recovery from System Failures 309

Virtual Memory Exhaustion 309

Memory Corruption and Recovery 318

Boundary Condition Checks 322

Chapter 14 Data and Web Services 333

Network File System Services 334

Detecting RPC Failures 334

NFS Server Constraints 336

Inside an NFS Failover 337

Optimizing NFS Recovery 337


Database Servers 342

Managing Recovery Time 343

Web Services Standards 357

Chapter 15 Local Clustering and Failover 361

A Brief and Incomplete History of Clustering 362

Server Failures and Failover 365

Logical, Application-centric Thinking 367

Failover Requirements 369

Chapter 16 Failover Management and Issues 387

Failover Management Software (FMS) 388

Who Performs a Test, and Other Component Monitoring Issues 391

When Component Tests Fail 392

Time to Manual Failover 393

Homemade Failover Software or Commercial Software? 395

Commercial Failover Management Software 397

When Good Failovers Go Bad 398

Causes and Remedies of Split-Brain Syndrome 400

Undesirable Failovers 404


Verification and Testing 404

State Transition Diagrams 405

Chapter 17 Failover Configurations 415

Two-Node Failover Configurations 416

Active-Passive Failover 416

Active-Passive Issues and Considerations 417

How Can I Use the Standby Server? 418

Active-Active Failover 421

Active-Active or Active-Passive? 424

Service Group Failover 425

Larger Cluster Configurations 426


Chapter 19 Virtual Machines and Resource Management 465

Partitions and Domains: System-Level VMs 466

Containers and Jails: OS Level VMs 468

Chapter 20 The Disaster Recovery Plan 473

Should You Worry about DR? 474

Three Primary Goals of a DR Plan 475

Health and Protection of the Employees 475

The Survival of the Enterprise 476

The Continuity of the Enterprise 476

What Goes into a Good DR Plan 476

Preparing to Build the DR Plan 477

So What Should You Do? 490

Equipping the DR Site 498

Is Your Plan Any Good? 500

Qualities of a Good Exercise 500

Planning for an Exercise 501

Possible Exercise Limitations 503

Make It More Realistic 503

Ideas for an Exercise Scenario 504

Three Types of Exercises 507

The Effects of a Disaster on People 509

Typical Responses to Disasters 509

What Can the Enterprise Do to Help? 510


Chapter 21 A Resilient Enterprise* 513

The New York Board of Trade 514

No Way for a Major Exchange to Operate 517

Chaotic Trading Environment 528

Improvements to the DR Site 531

The New Trading Facility 533

Future Disaster Recovery Plans 534

The Outcry for Open Outcry 535

Modernizing the Open Outcry Process 536

The Effects on the People 538


Preface For the Second Edition

The strong positive response to the first edition of Blueprints for High Availability was extremely gratifying. It was very encouraging to see that our message about high availability could find a receptive audience. We received a lot of great feedback about our writing style that mentioned how we were able to explain technical issues without getting too technical in our writing.

Although the comments that reached us were almost entirely positive, this book is our child, and we know where the flaws in the first edition were. In this second edition, we have filled some areas out that we felt were a little flat the first time around, and we have paid more attention to the arrangement of the chapters this time.

Without question, our “Tales from the Field” received the most praise from our readers. We heard from people who said that they sat down and just skimmed through the book looking for the Tales. That, too, is very gratifying. We had a lot of fun collecting them, and telling the stories in such a positive way. We have added a bunch of new ones in this edition. Skim away!

Our mutual thanks go out to the editorial team at John Wiley & Sons. Once again, the push to complete the book came from Carol Long, who would not let us get away with slipped deadlines, or anything else that we tried to pull. We had no choice but to deliver a book that we hope is as well received as the first edition. She would accept nothing less. Scott Amerman was a new addition to the team this time out. His kind words of encouragement balanced with his strong insistence that we hit our delivery dates were a potent combination.

From Evan Marcus

It’s been nearly four years since Hal and I completed our work on the first edition of Blueprints for High Availability, and in that time, a great many things have changed. The biggest personal change for me is that my family has had a new addition. At this writing, my son Jonathan is almost three years old. A more general change over the last 4 years is that computers have become much less expensive and much more pervasive. They have also become much easier to use. Jonathan often sits down in front of one of our computers, turns it on, logs in, puts in a CD-ROM, and begins to play games, all by himself. He can also click his way around Web sites like www.pbskids.org. I find it quite remarkable that a three-year-old who cannot quite dress himself is so comfortable in front of a computer.

The biggest societal change that has taken place in the last 4 years (and, in fact, in much longer than the last 4 years) occurred on September 11, 2001, with the terrorist attacks on New York and Washington, DC. I am a lifelong resident of the New York City suburbs, in northern New Jersey, where the loss of our friends, neighbors, and safety is keenly felt by everyone. But for the purposes of this book, I will confine the discussion to how computer technology and high availability were affected.

In the first edition, we devoted a single chapter to the subject of disaster recovery, and in it we barely addressed many of the most important issues. In this, the second edition, we have totally rewritten the chapter on disaster recovery (Chapter 20, “The Disaster Recovery Plan”), based in part on many of the lessons that we learned and heard about in the wake of September 11. We have also added a chapter (Chapter 21, “A Resilient Enterprise”) that tells the most remarkable story of the New York Board of Trade, and how they were able to recover their operations on September 11 and were ready to resume trading less than 12 hours after the attacks. When you read the New York Board of Trade’s story, you may notice that we did not discuss the technology that they used to make their recovery. That was a conscious decision that we made because we felt that it was not the technology that mattered most, but rather the efforts of the people that allowed the organization to not just survive, but to thrive.

Chapter 21 has actually appeared in almost exactly the same form in another book. In between editions of Blueprints, I was co-editor and contributor to an internal VERITAS book called The Resilient Enterprise, and I originally wrote this chapter for that book. I extend my gratitude to Richard Barker, Paul Massiglia, and each of the other authors of that book, who gave me their permission to reuse the chapter here.

But some people never truly learn the lessons. Immediately after September 11, a lot of noise was made about how corporations needed to make themselves more resilient, should another attack occur. There was a great deal of discussion about how these firms would do a better job of distributing their data to multiple locations, and making sure that there were no single points of failure. Because of the economy, which suffered greatly as a result of the attacks, no money was budgeted for protective measures right away, and as time wore on, other priorities came along and the money that should have gone to replicating data and sending backups off-site was spent other ways. Many of the organizations that needed to protect themselves have done little or nothing in the time since September 11, and that is a shame. If there is another attack, it will be a great deal more than a shame.

Of course, technology has changed in the last 4 years. We felt we needed to add a chapter about some new and popular technology related to the field of availability. Chapter 8 is an overview of SANs, NAS, and storage virtualization. We also added Chapter 22, which is a look at some emerging technologies.

Despite all of the changes in society, technology, and families, the basic principles of high availability that we discussed in the first edition have not changed. The mission statement that drove the first book still holds: “You cannot achieve high availability by simply installing clustering software and walking away.” The technologies that systems need to achieve high availability are not automatically included by system and operating system vendors. It’s still difficult, complex, and costly.

We have tried to take a more practical view of the costs and benefits of high availability in this edition, making our Availability Index model much more detailed and prominent. The technology chapters have been arranged in an order that maps to their positions on the Index; earlier chapters discuss more basic and less expensive examples of availability technology like backups and disk mirroring, while later chapters discuss more complex and expensive technologies that can deliver the highest levels of availability, such as replication and disaster recovery.

As much as things have changed since the first edition, one note that we included in that Preface deserves repeating here: Some readers may begrudge the lack of simple, universal answers in this book. There are two reasons for this. One is that the issues that arise at each site, and for each computer system, are different. It is unreasonable to expect that what works for a 10,000-employee global financial institution will also work for a 10-person law office. We offer the choices and allow the reader to determine which one will work best in his or her environment. The other reason is that after 15 years of working on, with, and occasionally for computers, I have learned that the most correct answer to most computing problems is a rather unfortunate, “It depends.”

Writing a book such as this one is a huge task, and it is impossible to do it alone. I have been very fortunate to have had the help and support of a huge cast of terrific people. Once again, my eternal love and gratitude go to my wonderful wife Carol, who puts up with all of my ridiculous interests and hobbies (like writing books), our beautiful daughters Hannah and Madeline, and our delightful son Jonathan. Without them and their love and support, this book would simply not have been possible. Thanks, too, for your love and support to my parents, Roberta and David Marcus, and my in-laws, Gladys and Herb Laden, who still haven’t given me that recipe.
