
DOCUMENT INFORMATION

Basic information

Title: Introduction to Algorithms, Second Edition
Authors: Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
Institution: Massachusetts Institute of Technology
Subject: Computer Programming, Computer Algorithms
Type: Book
Year published: 2001
City: Cambridge
Pages: 1,203
Size: 13.8 MB



Introduction to Algorithms

Second Edition


The MIT Press

Cambridge, Massachusetts London, England

McGraw-Hill Book Company


Ordering Information:

North America

Text orders should be addressed to the McGraw-Hill Book Company. All other orders should be addressed to The MIT Press.

Outside North America

All orders should be addressed to The MIT Press or its local distributor.

This book was printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Introduction to algorithms / Thomas H. Cormen [et al.].—2nd ed.

p. cm.

Includes bibliographical references and index.

ISBN 0-262-03293-7 (hc : alk. paper, MIT Press).—ISBN 0-07-013151-1 (McGraw-Hill)

1. Computer programming. 2. Computer algorithms. I. Title: Algorithms. II. Cormen, Thomas H.

QA76.6.I5858 2001

005.1—dc21

2001031277


4.1 The substitution method 63

4.2 The recursion-tree method 67

4.3 The master method 73

★ 4.4 Proof of the master theorem 76

5 Probabilistic Analysis and Randomized Algorithms 91

5.1 The hiring problem 91

5.2 Indicator random variables 94

5.3 Randomized algorithms 99

★ 5.4 Probabilistic analysis and further uses of indicator random variables 106


II Sorting and Order Statistics

8 Sorting in Linear Time 165

8.1 Lower bounds for sorting 165

8.2 Counting sort 168

8.3 Radix sort 170

8.4 Bucket sort 174

9 Medians and Order Statistics 183

9.1 Minimum and maximum 184

9.2 Selection in expected linear time 185

9.3 Selection in worst-case linear time 189

III Data Structures

10 Elementary Data Structures 200

10.1 Stacks and queues 200

10.2 Linked lists 204

10.3 Implementing pointers and objects 209

10.4 Representing rooted trees 214


12 Binary Search Trees 253

12.1 What is a binary search tree? 253

12.2 Querying a binary search tree 256

12.3 Insertion and deletion 261

★ 12.4 Randomly built binary search trees 265

14 Augmenting Data Structures 302

14.1 Dynamic order statistics 302

14.2 How to augment a data structure 308

15.3 Elements of dynamic programming 339

15.4 Longest common subsequence 350

15.5 Optimal binary search trees 356

17.2 The accounting method 410

17.3 The potential method 412

17.4 Dynamic tables 416


V Advanced Data Structures

18.1 Definition of B-trees 438

18.2 Basic operations on B-trees 441

18.3 Deleting a key from a B-tree 449

19.1 Binomial trees and binomial heaps 457

19.2 Operations on binomial heaps 461

20.1 Structure of Fibonacci heaps 477

20.2 Mergeable-heap operations 479

20.3 Decreasing a key and deleting a node 489

20.4 Bounding the maximum degree 493

21 Data Structures for Disjoint Sets 498

22.5 Strongly connected components 552

23.1 Growing a minimum spanning tree 562

23.2 The algorithms of Kruskal and Prim 567

24 Single-Source Shortest Paths 580

24.1 The Bellman-Ford algorithm 588

24.2 Single-source shortest paths in directed acyclic graphs 592

24.3 Dijkstra’s algorithm 595

24.4 Difference constraints and shortest paths 601

24.5 Proofs of shortest-paths properties 607


25 All-Pairs Shortest Paths 620

25.1 Shortest paths and matrix multiplication 622

25.2 The Floyd-Warshall algorithm 629

25.3 Johnson’s algorithm for sparse graphs 636

26.1 Flow networks 644

26.2 The Ford-Fulkerson method 651

26.3 Maximum bipartite matching 664

★ 26.4 Push-relabel algorithms 669

★ 26.5 The relabel-to-front algorithm 681

VII Selected Topics

27 Sorting Networks 704

27.1 Comparison networks 704

27.2 The zero-one principle 709

27.3 A bitonic sorting network 712

27.4 A merging network 716

27.5 A sorting network 719

28 Matrix Operations 725

28.1 Properties of matrices 725

28.2 Strassen’s algorithm for matrix multiplication 735

28.3 Solving systems of linear equations 742

28.4 Inverting matrices 755

28.5 Symmetric positive-definite matrices and least-squares approximation 760

29.1 Standard and slack forms 777

29.2 Formulating problems as linear programs 785

29.3 The simplex algorithm 790

29.4 Duality 804

29.5 The initial basic feasible solution 811

30 Polynomials and the FFT 822

30.1 Representation of polynomials 824

30.2 The DFT and FFT 830

30.3 Efficient FFT implementations 839


31 Number-Theoretic Algorithms 849

31.1 Elementary number-theoretic notions 850

31.2 Greatest common divisor 856

31.3 Modular arithmetic 862

31.4 Solving modular linear equations 869

31.5 The Chinese remainder theorem 873

32.1 The naive string-matching algorithm 909

32.2 The Rabin-Karp algorithm 911

32.3 String matching with finite automata 916

★ 32.4 The Knuth-Morris-Pratt algorithm 923

33.1 Line-segment properties 934

33.2 Determining whether any pair of segments intersects 940

33.3 Finding the convex hull 947

33.4 Finding the closest pair of points 957

35.1 The vertex-cover problem 1024

35.2 The traveling-salesman problem 1027

35.3 The set-covering problem 1033

35.4 Randomization and linear programming 1039

35.5 The subset-sum problem 1043

VIII Appendix: Mathematical Background

A.1 Summation formulas and properties 1058

A.2 Bounding summations 1062


C.3 Discrete random variables 1106

C.4 The geometric and binomial distributions 1112

★ C.5 The tails of the binomial distribution 1118


This book provides a comprehensive introduction to the modern study of computer algorithms. It presents many algorithms and covers them in considerable depth, yet makes their design and analysis accessible to all levels of readers. We have tried to keep explanations elementary without sacrificing depth of coverage or mathematical rigor.

Each chapter presents an algorithm, a design technique, an application area, or a related topic. Algorithms are described in English and in a “pseudocode” designed to be readable by anyone who has done a little programming. The book contains over 230 figures illustrating how the algorithms work. Since we emphasize efficiency as a design criterion, we include careful analyses of the running times of all our algorithms.

The text is intended primarily for use in undergraduate or graduate courses in algorithms or data structures. Because it discusses engineering issues in algorithm design, as well as mathematical aspects, it is equally well suited for self-study by technical professionals.

In this, the second edition, we have updated the entire book. The changes range from the addition of new chapters to the rewriting of individual sentences.

To the teacher

This book is designed to be both versatile and complete. You will find it useful for a variety of courses, from an undergraduate course in data structures up through a graduate course in algorithms. Because we have provided considerably more material than can fit in a typical one-term course, you should think of the book as a “buffet” or “smorgasbord” from which you can pick and choose the material that best supports the course you wish to teach.

You should find it easy to organize your course around just the chapters you need. We have made chapters relatively self-contained, so that you need not worry about an unexpected and unnecessary dependence of one chapter on another. Each chapter presents the easier material first and the more difficult material later, with section boundaries marking natural stopping points. In an undergraduate course, you might use only the earlier sections from a chapter; in a graduate course, you might cover the entire chapter.

We have included over 920 exercises and over 140 problems. Each section ends with exercises, and each chapter ends with problems. The exercises are generally short questions that test basic mastery of the material. Some are simple self-check thought exercises, whereas others are more substantial and are suitable as assigned homework. The problems are more elaborate case studies that often introduce new material; they typically consist of several questions that lead the student through the steps required to arrive at a solution.

We have starred (★) the sections and exercises that are more suitable for graduate students than for undergraduates. A starred section is not necessarily more difficult than an unstarred one, but it may require an understanding of more advanced mathematics. Likewise, starred exercises may require an advanced background or more than average creativity.

To the student

We hope that this textbook provides you with an enjoyable introduction to the field of algorithms. We have attempted to make every algorithm accessible and interesting. To help you when you encounter unfamiliar or difficult algorithms, we describe each one in a step-by-step manner. We also provide careful explanations of the mathematics needed to understand the analysis of the algorithms. If you already have some familiarity with a topic, you will find the chapters organized so that you can skim introductory sections and proceed quickly to the more advanced material.

This is a large book, and your class will probably cover only a portion of its material. We have tried, however, to make this a book that will be useful to you now as a course textbook and also later in your career as a mathematical desk reference or an engineering handbook.

What are the prerequisites for reading this book?

• You should have some programming experience. In particular, you should understand recursive procedures and simple data structures such as arrays and linked lists.

• You should have some facility with proofs by mathematical induction. A few portions of the book rely on some knowledge of elementary calculus. Beyond that, Parts I and VIII of this book teach you all the mathematical techniques you will need.


To the professional

The wide range of topics in this book makes it an excellent handbook on algorithms. Because each chapter is relatively self-contained, you can focus in on the topics that most interest you.

Most of the algorithms we discuss have great practical utility. We therefore address implementation concerns and other engineering issues. We often provide practical alternatives to the few algorithms that are primarily of theoretical interest.

If you wish to implement any of the algorithms, you will find the translation of our pseudocode into your favorite programming language a fairly straightforward task. The pseudocode is designed to present each algorithm clearly and succinctly. Consequently, we do not address error-handling and other software-engineering issues that require specific assumptions about your programming environment. We attempt to present each algorithm simply and directly without allowing the idiosyncrasies of a particular programming language to obscure its essence.

To our colleagues

We have supplied an extensive bibliography and pointers to the current literature. Each chapter ends with a set of “chapter notes” that give historical details and references. The chapter notes do not provide a complete reference to the whole field of algorithms, however. Though it may be hard to believe for a book of this size, many interesting algorithms could not be included due to lack of space.

Despite myriad requests from students for solutions to problems and exercises, we have chosen as a matter of policy not to supply references for problems and exercises, to remove the temptation for students to look up a solution rather than to find it themselves.

Changes for the second edition

What has changed between the first and second editions of this book? Depending on how you look at it, either not much or quite a bit.

A quick look at the table of contents shows that most of the first-edition chapters and sections appear in the second edition. We removed two chapters and a handful of sections, but we have added three new chapters and four new sections apart from these new chapters. If you were to judge the scope of the changes by the table of contents, you would likely conclude that the changes were modest.

The changes go far beyond what shows up in the table of contents, however. In no particular order, here is a summary of the most significant changes for the second edition:


• Cliff Stein was added as a coauthor.

• Errors have been corrected. How many errors? Let’s just say several.

• There are three new chapters:

• Chapter 1 discusses the role of algorithms in computing.

• Chapter 5 covers probabilistic analysis and randomized algorithms. As in the first edition, these topics appear throughout the book.

• Chapter 29 is devoted to linear programming.

• Within chapters that were carried over from the first edition, there are new sections on the following topics:

• perfect hashing (Section 11.5),

• two applications of dynamic programming (Sections 15.1 and 15.5), and

• approximation algorithms that use randomization and linear programming (Section 35.4).

• To allow more algorithms to appear earlier in the book, three of the chapters on mathematical background have been moved from Part I to the Appendix, which is Part VIII.

• There are over 40 new problems and over 185 new exercises.

• We have made explicit the use of loop invariants for proving correctness. Our first loop invariant appears in Chapter 2, and we use them a couple of dozen times throughout the book.

• Many of the probabilistic analyses have been rewritten. In particular, we use in a dozen places the technique of “indicator random variables,” which simplify probabilistic analyses, especially when random variables are dependent.

• We have expanded and updated the chapter notes and bibliography. The bibliography has grown by over 50%, and we have mentioned many new algorithmic results that have appeared subsequent to the printing of the first edition.

We have also made the following changes:

• The chapter on solving recurrences no longer contains the iteration method. Instead, in Section 4.2, we have “promoted” recursion trees to constitute a method in their own right. We have found that drawing out recursion trees is less error-prone than iterating recurrences. We do point out, however, that recursion trees are best used as a way to generate guesses that are then verified via the substitution method.

• The partitioning method used for quicksort (Section 7.1) and the expected linear-time order-statistic algorithm (Section 9.2) is different. We now use the method developed by Lomuto, which, along with indicator random variables, allows for a somewhat simpler analysis. The method from the first edition, due to Hoare, appears as a problem in Chapter 7.

• We have modified the discussion of universal hashing in Section 11.3.3 so that it integrates into the presentation of perfect hashing.

• There is a much simpler analysis of the height of a randomly built binary search tree in Section 12.4.

• The discussions on the elements of dynamic programming (Section 15.3) and the elements of greedy algorithms (Section 16.2) are significantly expanded. The exploration of the activity-selection problem, which starts off the greedy-algorithms chapter, helps to clarify the relationship between dynamic programming and greedy algorithms.

• We have replaced the proof of the running time of the disjoint-set-union data structure in Section 21.4 with a proof that uses the potential method to derive a tight bound.

• The proof of correctness of the algorithm for strongly connected components in Section 22.5 is simpler, clearer, and more direct.

• Chapter 24, on single-source shortest paths, has been reorganized to move proofs of the essential properties to their own section. The new organization allows us to focus earlier on algorithms.

• Section 34.5 contains an expanded overview of NP-completeness as well as new NP-completeness proofs for the hamiltonian-cycle and subset-sum problems.

Finally, virtually every section has been edited to correct, simplify, and clarify explanations and proofs.

Web site

Another change from the first edition is that this book now has its own web site: http://mitpress.mit.edu/algorithms/. You can use the web site to report errors, obtain a list of known errors, or make suggestions; we would like to hear from you. We particularly welcome ideas for new exercises and problems, but please include solutions.

We regret that we cannot personally respond to all comments.


Acknowledgments for the first edition

Many friends and colleagues have contributed greatly to the quality of this book. We thank all of you for your help and constructive criticisms.

MIT’s Laboratory for Computer Science has provided an ideal working environment. Our colleagues in the laboratory’s Theory of Computation Group have been particularly supportive and tolerant of our incessant requests for critical appraisal of chapters. We specifically thank Baruch Awerbuch, Shafi Goldwasser, Leo Guibas, Tom Leighton, Albert Meyer, David Shmoys, and Éva Tardos. Thanks to William Ang, Sally Bemus, Ray Hirschfeld, and Mark Reinhold for keeping our machines (DEC Microvaxes, Apple Macintoshes, and Sun Sparcstations) running and for recompiling TeX whenever we exceeded a compile-time limit. Thinking Machines Corporation provided partial support for Charles Leiserson to work on this book during a leave of absence from MIT.

Many colleagues have used drafts of this text in courses at other schools. They have suggested numerous corrections and revisions. We particularly wish to thank Richard Beigel, Andrew Goldberg, Joan Lucas, Mark Overmars, Alan Sherman, and Diane Souvaine.

Many teaching assistants in our courses have made significant contributions to the development of this material. We especially thank Alan Baratz, Bonnie Berger, Aditi Dhagat, Burt Kaliski, Arthur Lent, Andrew Moulton, Marios Papaefthymiou, Cindy Phillips, Mark Reinhold, Phil Rogaway, Flavio Rose, Arie Rudich, Alan Sherman, Cliff Stein, Susmita Sur, Gregory Troxel, and Margaret Tuttle.

Additional valuable technical assistance was provided by many individuals. Denise Sergent spent many hours in the MIT libraries researching bibliographic references. Maria Sensale, the librarian of our reading room, was always cheerful and helpful. Access to Albert Meyer’s personal library saved many hours of library time in preparing the chapter notes. Shlomo Kipnis, Bill Niehaus, and David Wilson proofread old exercises, developed new ones, and wrote notes on their solutions. Marios Papaefthymiou and Gregory Troxel contributed to the indexing. Over the years, our secretaries Inna Radzihovsky, Denise Sergent, Gayle Sherman, and especially Be Blackburn provided endless support in this project, for which we thank them.

Many errors in the early drafts were reported by students. We particularly thank Bobby Blumofe, Bonnie Eisenberg, Raymond Johnson, John Keen, Richard Lethin, Mark Lillibridge, John Pezaris, Steve Ponzio, and Margaret Tuttle for their careful readings.

Colleagues have also provided critical reviews of specific chapters, or information on specific algorithms, for which we are grateful. We especially thank Bill Aiello, Alok Aggarwal, Eric Bach, Vašek Chvátal, Richard Cole, Johan Håstad, Alex Ishii, David Johnson, Joe Kilian, Dina Kravets, Bruce Maggs, Jim Orlin, James Park, Thane Plambeck, Hershel Safer, Jeff Shallit, Cliff Stein, Gil Strang, Bob Tarjan, and Paul Wang. Several of our colleagues also graciously supplied us with problems; we particularly thank Andrew Goldberg, Danny Sleator, and Umesh Vazirani.

It has been a pleasure working with The MIT Press and McGraw-Hill in the development of this text. We especially thank Frank Satlow, Terry Ehling, Larry Cohen, and Lorrie Lejeune of The MIT Press and David Shapiro of McGraw-Hill for their encouragement, support, and patience. We are particularly grateful to Larry Cohen for his outstanding copyediting.

Acknowledgments for the second edition

When we asked Julie Sussman, P.P.A., to serve as a technical copyeditor for the second edition, we did not know what a good deal we were getting. In addition to copyediting the technical content, Julie enthusiastically edited our prose. It is humbling to think of how many errors Julie found in our earlier drafts, though considering how many errors she found in the first edition (after it was printed, unfortunately), it is not surprising. Moreover, Julie sacrificed her own schedule to accommodate ours—she even brought chapters with her on a trip to the Virgin Islands! Julie, we cannot thank you enough for the amazing job you did.

The work for the second edition was done while the authors were members of the Department of Computer Science at Dartmouth College and the Laboratory for Computer Science at MIT. Both were stimulating environments in which to work, and we thank our colleagues for their support.

Friends and colleagues all over the world have provided suggestions and opinions that guided our writing. Many thanks to Sanjeev Arora, Javed Aslam, Guy Blelloch, Avrim Blum, Scot Drysdale, Hany Farid, Hal Gabow, Andrew Goldberg, David Johnson, Yanlin Liu, Nicolas Schabanel, Alexander Schrijver, Sasha Shen, David Shmoys, Dan Spielman, Gerald Jay Sussman, Bob Tarjan, Mikkel Thorup, and Vijay Vazirani.

Many teachers and colleagues have taught us a great deal about algorithms. We particularly acknowledge our teachers Jon L. Bentley, Bob Floyd, Don Knuth, Harold Kuhn, H. T. Kung, Richard Lipton, Arnold Ross, Larry Snyder, Michael I. Shamos, David Shmoys, Ken Steiglitz, Tom Szymanski, Éva Tardos, Bob Tarjan, and Jeffrey Ullman.

We acknowledge the work of the many teaching assistants for the algorithms courses at MIT and Dartmouth, including Joseph Adler, Craig Barrack, Bobby Blumofe, Roberto De Prisco, Matteo Frigo, Igal Galperin, David Gupta, Raj D. Iyer, Nabil Kahale, Sarfraz Khurshid, Stavros Kolliopoulos, Alain Leblanc, Yuan Ma, Maria Minkoff, Dimitris Mitsouras, Alin Popescu, Harald Prokop, Sudipta Sengupta, Donna Slonim, Joshua A. Tauber, Sivan Toledo, Elisheva Werner-Reiss, Lea Wittie, Qiang Wu, and Michael Zhang.


Computer support was provided by William Ang, Scott Blomquist, and Greg Shomo at MIT and by Wayne Cripps, John Konkle, and Tim Tregubov at Dartmouth. Thanks also to Be Blackburn, Don Dailey, Leigh Deacon, Irene Sebeda, and Cheryl Patton Wu at MIT and to Phyllis Bellmore, Kelly Clark, Delia Mauceli, Sammie Travis, Deb Whiting, and Beth Young at Dartmouth for administrative support. Michael Fromberger, Brian Campbell, Amanda Eubanks, Sung Hoon Kim, and Neha Narula also provided timely support at Dartmouth.

Many people were kind enough to report errors in the first edition. We thank the following people, each of whom was the first to report an error from the first edition: Len Adleman, Selim Akl, Richard Anderson, Juan Andrade-Cetto, Gregory Bachelis, David Barrington, Paul Beame, Richard Beigel, Margrit Betke, Alex Blakemore, Bobby Blumofe, Alexander Brown, Xavier Cazin, Jack Chan, Richard Chang, Chienhua Chen, Ien Cheng, Hoon Choi, Drue Coles, Christian Collberg, George Collins, Eric Conrad, Peter Csaszar, Paul Dietz, Martin Dietzfelbinger, Scot Drysdale, Patricia Ealy, Yaakov Eisenberg, Michael Ernst, Michael Formann, Nedim Fresko, Hal Gabow, Marek Galecki, Igal Galperin, Luisa Gargano, John Gately, Rosario Genario, Mihaly Gereb, Ronald Greenberg, Jerry Grossman, Stephen Guattery, Alexander Hartemik, Anthony Hill, Thomas Hofmeister, Mathew Hostetter, Yih-Chun Hu, Dick Johnsonbaugh, Marcin Jurdzinski, Nabil Kahale, Fumiaki Kamiya, Anand Kanagala, Mark Kantrowitz, Scott Karlin, Dean Kelley, Sanjay Khanna, Haluk Konuk, Dina Kravets, Jon Kroger, Bradley Kuszmaul, Tim Lambert, Hang Lau, Thomas Lengauer, George Madrid, Bruce Maggs, Victor Miller, Joseph Muskat, Tung Nguyen, Michael Orlov, James Park, Seongbin Park, Ioannis Paschalidis, Boaz Patt-Shamir, Leonid Peshkin, Patricio Poblete, Ira Pohl, Stephen Ponzio, Kjell Post, Todd Poynor, Colin Prepscius, Sholom Rosen, Dale Russell, Hershel Safer, Karen Seidel, Joel Seiferas, Erik Seligman, Stanley Selkow, Jeffrey Shallit, Greg Shannon, Micha Sharir, Sasha Shen, Norman Shulman, Andrew Singer, Daniel Sleator, Bob Sloan, Michael Sofka, Volker Strumpen, Lon Sunshine, Julie Sussman, Asterio Tanaka, Clark Thomborson, Nils Thommesen, Homer Tilton, Martin Tompa, Andrei Toom, Felzer Torsten, Hirendu Vaishnav, M. Veldhorst, Luca Venuti, Jian Wang, Michael Wellman, Gerry Wiener, Ronald Williams, David Wolfe, Jeff Wong, Richard Woundy, Neal Young, Huaiyuan Yu, Tian Yuxing, Joe Zachary, Steve Zhang, Florian Zschoke, and Uri Zwick.

Many of our colleagues provided thoughtful reviews or filled out a long survey. We thank reviewers Nancy Amato, Jim Aspnes, Kevin Compton, William Evans, Peter Gacs, Michael Goldwasser, Andrzej Proskurowski, Vijaya Ramachandran, and John Reif. We also thank the following people for sending back the survey: James Abello, Josh Benaloh, Bryan Beresford-Smith, Kenneth Blaha, Hans Bodlaender, Richard Borie, Ted Brown, Domenico Cantone, M. Chen, Robert Cimikowski, William Clocksin, Paul Cull, Rick Decker, Matthew Dickerson, Robert Douglas, Margaret Fleck, Michael Goodrich, Susanne Hambrusch, Dean Hendrix, Richard Johnsonbaugh, Kyriakos Kalorkoti, Srinivas Kankanahalli, Hikyoo Koh, Steven Lindell, Errol Lloyd, Andy Lopez, Dian Rae Lopez, George Lucker, David Maier, Charles Martel, Xiannong Meng, David Mount, Alberto Policriti, Andrzej Proskurowski, Kirk Pruhs, Yves Robert, Guna Seetharaman, Stanley Selkow, Robert Sloan, Charles Steele, Gerard Tel, Murali Varanasi, Bernd Walter, and Alden Wright. We wish we could have carried out all your suggestions. The only problem is that if we had, the second edition would have been about 3000 pages long!

The second edition was produced in LaTeX 2ε. Michael Downes converted the LaTeX macros from “classic” LaTeX to LaTeX 2ε, and he converted the text files to use these new macros. David Jones also provided LaTeX 2ε support. Figures for the second edition were produced by the authors using MacDraw Pro. As in the first edition, the index was compiled using Windex, a C program written by the authors, and the bibliography was prepared using BibTeX. Ayorkor Mills-Tettey and Rob Leathern helped convert the figures to MacDraw Pro, and Ayorkor also checked our bibliography.

As it was in the first edition, working with The MIT Press and McGraw-Hill has been a delight. Our editors, Bob Prior of The MIT Press and Betsy Jones of McGraw-Hill, put up with our antics and kept us going with carrots and sticks.

Finally, we thank our wives—Nicole Cormen, Gail Rivest, and Rebecca Ivry—our children—Ricky, William, and Debby Leiserson; Alex and Christopher Rivest; and Molly, Noah, and Benjamin Stein—and our parents—Renee and Perry Cormen, Jean and Mark Leiserson, Shirley and Lloyd Rivest, and Irene and Ira Stein—for their love and support during the writing of this book. The patience and encouragement of our families made this project possible. We affectionately dedicate this book to them.

May 2001


This part will get you started in thinking about designing and analyzing algorithms. It is intended to be a gentle introduction to how we specify algorithms, some of the design strategies we will use throughout this book, and many of the fundamental ideas used in algorithm analysis. Later parts of this book will build upon this base.

Chapter 1 is an overview of algorithms and their place in modern computing systems. This chapter defines what an algorithm is and lists some examples. It also makes a case that algorithms are a technology, just as are fast hardware, graphical user interfaces, object-oriented systems, and networks.

In Chapter 2, we see our first algorithms, which solve the problem of sorting a sequence of n numbers. They are written in a pseudocode which, although not directly translatable to any conventional programming language, conveys the structure of the algorithm clearly enough that a competent programmer can implement it in the language of his choice. The sorting algorithms we examine are insertion sort, which uses an incremental approach, and merge sort, which uses a recursive technique known as “divide and conquer.” Although the time each requires increases with the value of n, the rate of increase differs between the two algorithms. We determine these running times in Chapter 2, and we develop a useful notation to express them.
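For concreteness, the two algorithms previewed above can be sketched in Python. This is an illustrative translation of the ideas, not the book's pseudocode; function names are our own.

```python
def insertion_sort(a):
    """Incremental approach: grow a sorted prefix a[:i] by
    inserting a[i] into its proper position."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]   # shift larger elements one slot right
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    """Divide and conquer: split, recursively sort each half, merge."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```

The contrast in growth rates mentioned above shows up here: insertion sort's nested loops take quadratic time in the worst case, while merge sort's recursive halving keeps the work to n lg n, as Chapter 2 derives.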

Chapter 3 precisely defines this notation, which we call asymptotic notation. It starts by defining several asymptotic notations, which we use for bounding algorithm running times from above and/or below. The rest of Chapter 3 is primarily a presentation of mathematical notation. Its purpose is more to ensure that your use of notation matches that in this book than to teach you new mathematical concepts.
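As a preview of what Chapter 3 makes precise, the standard definition of Θ-notation (stated here in its usual form, not quoted verbatim from the text) bounds a running time from above and below simultaneously:

```latex
\Theta(g(n)) = \{\, f(n) : \text{there exist positive constants } c_1, c_2, n_0
\text{ such that } 0 \le c_1\,g(n) \le f(n) \le c_2\,g(n) \text{ for all } n \ge n_0 \,\}
```

O-notation keeps only the upper bound and Ω-notation only the lower bound.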


Chapter 4 delves further into the divide-and-conquer method introduced in Chapter 2. In particular, Chapter 4 contains methods for solving recurrences, which are useful for describing the running times of recursive algorithms. One powerful technique is the “master method,” which can be used to solve recurrences that arise from divide-and-conquer algorithms. Much of Chapter 4 is devoted to proving the correctness of the master method, though this proof may be skipped without harm.

Chapter 5 introduces probabilistic analysis and randomized algorithms. We typically use probabilistic analysis to determine the running time of an algorithm in cases in which, due to the presence of an inherent probability distribution, the running time may differ on different inputs of the same size. In some cases, we assume that the inputs conform to a known probability distribution, so that we are averaging the running time over all possible inputs. In other cases, the probability distribution comes not from the inputs but from random choices made during the course of the algorithm. An algorithm whose behavior is determined not only by its input but by the values produced by a random-number generator is a randomized algorithm. We can use randomized algorithms to enforce a probability distribution on the inputs—thereby ensuring that no particular input always causes poor performance—or even to bound the error rate of algorithms that are allowed to produce incorrect results on a limited basis.

Appendices A–C contain other mathematical material that you will find helpful as you read this book. You are likely to have seen much of the material in the appendix chapters before having read this book (although the specific notational conventions we use may differ in some cases from what you have seen in the past), and so you should think of the Appendices as reference material. On the other hand, you probably have not already seen most of the material in Part I. All the chapters in Part I and the Appendices are written with a tutorial flavor.
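To illustrate the master method on a familiar recurrence, consider merge sort; the case analysis below follows the theorem's standard statement for recurrences of the form T(n) = aT(n/b) + f(n):

```latex
T(n) = 2\,T(n/2) + \Theta(n), \qquad a = 2,\; b = 2,\; n^{\log_b a} = n^{\log_2 2} = n.
```

Since f(n) = Θ(n) matches n^{log_b a} exactly, the second case of the master theorem applies and yields T(n) = Θ(n lg n).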


1 The Role of Algorithms in Computing

What are algorithms? Why is the study of algorithms worthwhile? What is the role of algorithms relative to other technologies used in computers? In this chapter, we will answer these questions.

1.1 Algorithms

Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output.

We can also view an algorithm as a tool for solving a well-specified computational problem. The statement of the problem specifies in general terms the desired input/output relationship. The algorithm describes a specific computational procedure for achieving that input/output relationship.

For example, one might need to sort a sequence of numbers into nondecreasing order. This problem arises frequently in practice and provides fertile ground for introducing many standard design techniques and analysis tools. Here is how we formally define the sorting problem:

Input: A sequence of n numbers ⟨a1, a2, …, an⟩.

Output: A permutation (reordering) ⟨a′1, a′2, …, a′n⟩ of the input sequence such that a′1 ≤ a′2 ≤ · · · ≤ a′n.

For example, given the input sequence ⟨31, 41, 59, 26, 41, 58⟩, a sorting algorithm returns as output the sequence ⟨26, 31, 41, 41, 58, 59⟩. Such an input sequence is called an instance of the sorting problem. In general, an instance of a problem consists of the input (satisfying whatever constraints are imposed in the problem statement) needed to compute a solution to the problem.
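Because the problem statement above is precise, a proposed solution can be checked mechanically. As a small illustrative sketch (the function name and example instances are ours, not the book's), this Python checker tests whether an output solves a given instance of the sorting problem:

```python
def is_sorted_permutation(inp, out):
    """Check that `out` solves the sorting problem for instance `inp`:
    it must be a nondecreasing permutation of the input sequence."""
    # A permutation must contain exactly the same multiset of values...
    if sorted(inp) != sorted(out):
        return False
    # ...and each element must be no larger than its successor.
    return all(out[i] <= out[i + 1] for i in range(len(out) - 1))

print(is_sorted_permutation([31, 41, 59, 26, 41, 58],
                            [26, 31, 41, 41, 58, 59]))   # True
print(is_sorted_permutation([31, 41, 59], [26, 31, 41]))  # False: not a permutation
```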

Sorting is a fundamental operation in computer science (many programs use it as an intermediate step), and as a result a large number of good sorting algorithms have been developed. Which algorithm is best for a given application depends on—among other factors—the number of items to be sorted, the extent to which the items are already somewhat sorted, possible restrictions on the item values, and the kind of storage device to be used: main memory, disks, or tapes.

An algorithm is said to be correct if, for every input instance, it halts with the correct output. We say that a correct algorithm solves the given computational problem. An incorrect algorithm might not halt at all on some input instances, or it might halt with an answer other than the desired one. Contrary to what one might expect, incorrect algorithms can sometimes be useful, if their error rate can be controlled. We shall see an example of this in Chapter 31 when we study algorithms for finding large prime numbers. Ordinarily, however, we shall be concerned only with correct algorithms.

An algorithm can be specified in English, as a computer program, or even as a hardware design. The only requirement is that the specification must provide a precise description of the computational procedure to be followed.

What kinds of problems are solved by algorithms?

Sorting is by no means the only computational problem for which algorithms have been developed. (You probably suspected as much when you saw the size of this book.) Practical applications of algorithms are ubiquitous and include the following examples:

• The Human Genome Project has the goals of identifying all the 100,000 genes in human DNA, determining the sequences of the 3 billion chemical base pairs that make up human DNA, storing this information in databases, and developing tools for data analysis. Each of these steps requires sophisticated algorithms. While the solutions to the various problems involved are beyond the scope of this book, ideas from many of the chapters in this book are used in the solution of these biological problems, thereby enabling scientists to accomplish tasks while using resources efficiently. The savings are in time, both human and machine, and in money, as more information can be extracted from laboratory techniques.

• The Internet enables people all around the world to quickly access and retrieve large amounts of information. In order to do so, clever algorithms are employed to manage and manipulate this large volume of data. Examples of problems which must be solved include finding good routes on which the data will travel (techniques for solving such problems appear in Chapter 24), and using a search engine to quickly find pages on which particular information resides (related techniques are in Chapters 11 and 32).


• Electronic commerce enables goods and services to be negotiated and exchanged electronically. The ability to keep information such as credit card numbers, passwords, and bank statements private is essential if electronic commerce is to be used widely. Public-key cryptography and digital signatures (covered in Chapter 31) are among the core technologies used and are based on numerical algorithms and number theory.

• In manufacturing and other commercial settings, it is often important to allocate scarce resources in the most beneficial way. An oil company may wish to know where to place its wells in order to maximize its expected profit. A candidate for the presidency of the United States may want to determine where to spend money buying campaign advertising in order to maximize the chances of winning an election. An airline may wish to assign crews to flights in the least expensive way possible, making sure that each flight is covered and that government regulations regarding crew scheduling are met. An Internet service provider may wish to determine where to place additional resources in order to serve its customers more effectively. All of these are examples of problems that can be solved using linear programming, which we shall study in Chapter 29.

While some of the details of these examples are beyond the scope of this book, we do give underlying techniques that apply to these problems and problem areas. We also show how to solve many concrete problems in this book, including the following:

• We are given a road map on which the distance between each pair of adjacent intersections is marked, and our goal is to determine the shortest route from one intersection to another. The number of possible routes can be huge, even if we disallow routes that cross over themselves. How do we choose which of all possible routes is the shortest? Here, we model the road map (which is itself a model of the actual roads) as a graph (which we will meet in Chapter 10 and Appendix B), and we wish to find the shortest path from one vertex to another in the graph. We shall see how to solve this problem efficiently in Chapter 24.

• We are given a sequence ⟨A1, A2, …, An⟩ of n matrices, and we wish to determine their product A1 A2 · · · An. Because matrix multiplication is associative, there are several legal multiplication orders. For example, if n = 4, we could perform the matrix multiplications as if the product were parenthesized in any of the following orders: (A1(A2(A3A4))), (A1((A2A3)A4)), ((A1A2)(A3A4)), ((A1(A2A3))A4), or (((A1A2)A3)A4). If these matrices are all square (and hence the same size), the multiplication order will not affect how long the matrix multiplications take. If, however, these matrices are of differing sizes (yet their sizes are compatible for matrix multiplication), then the multiplication order can make a very big difference. The number of possible multiplication orders is exponential in n, and so trying all possible orders may take a very long time. We shall see in Chapter 15 how to use a general technique known as dynamic programming to solve this problem much more efficiently.

• We are given an equation ax ≡ b (mod n), where a, b, and n are integers, and we wish to find all the integers x, modulo n, that satisfy the equation. There may be zero, one, or more than one such solution. We can simply try x = 0, 1, …, n − 1 in order, but Chapter 31 shows a more efficient method.

• We are given n points in the plane, and we wish to find the convex hull of these points. The convex hull is the smallest convex polygon containing the points. Intuitively, we can think of each point as being represented by a nail sticking out from a board. The convex hull would be represented by a tight rubber band that surrounds all the nails. Each nail around which the rubber band makes a turn is a vertex of the convex hull. (See Figure 33.6 on page 948 for an example.) Any of the 2ⁿ subsets of the points might be the vertices of the convex hull. Knowing which points are vertices of the convex hull is not quite enough, either, since we also need to know the order in which they appear. There are many choices, therefore, for the vertices of the convex hull. Chapter 33 gives two good methods for finding the convex hull.
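For illustration, one standard way to compute the hull vertices in order is Andrew's monotone chain, a close relative of the sweep-based methods of Chapter 33; the sketch below is ours, not the book's:

```python
def convex_hull(points):
    """Vertices of the convex hull in counterclockwise order
    (Andrew's monotone chain; assumes at least three distinct points)."""
    def cross(o, a, b):
        # z-component of (a-o) x (b-o); positive means a left turn.
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    pts = sorted(set(points))
    lower, upper = [], []
    for p in pts:                      # build lower hull, left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # build upper hull, right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]     # endpoints are shared; drop duplicates

print(convex_hull([(0, 0), (2, 0), (1, 1), (2, 2), (0, 2)]))
# [(0, 0), (2, 0), (2, 2), (0, 2)] — the interior point (1, 1) is not a vertex
```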

These lists are far from exhaustive (as you again have probably surmised from this book's heft), but exhibit two characteristics that are common to many interesting algorithms.

1. There are many candidate solutions, most of which are not what we want. Finding one that we do want can present quite a challenge.

2. There are practical applications. Of the problems in the above list, shortest paths provides the easiest examples. A transportation firm, such as a trucking or railroad company, has a financial interest in finding shortest paths through a road or rail network because taking shorter paths results in lower labor and fuel costs. Or a routing node on the Internet may need to find the shortest path through the network in order to route a message quickly.
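As a taste of the shortest-path algorithms Chapter 24 develops, here is a rough Python sketch of Dijkstra's algorithm for graphs with nonnegative edge weights; the toy road map and the names are our own illustration:

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from `source` in a graph given as
    {vertex: [(neighbor, weight), ...]} with nonnegative weights."""
    dist = {source: 0}
    heap = [(0, source)]               # (distance estimate, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                   # stale entry; a shorter path was found
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# A toy road map: intersections a-d with marked distances.
roads = {"a": [("b", 4), ("c", 1)],
         "b": [("d", 1)],
         "c": [("b", 2), ("d", 6)],
         "d": []}
print(dijkstra(roads, "a"))   # a to b is 3 (via c); a to d is 4 (via c, b)
```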

Data structures

This book also contains several data structures. A data structure is a way to store and organize data in order to facilitate access and modifications. No single data structure works well for all purposes, and so it is important to know the strengths and limitations of several of them.


Hard problems

Most of this book is about efficient algorithms. Our usual measure of efficiency is speed, i.e., how long an algorithm takes to produce its result. There are some problems, however, for which no efficient solution is known. Chapter 34 studies an interesting subset of these problems, which are known as NP-complete.

Why are NP-complete problems interesting? First, although no efficient algorithm for an NP-complete problem has ever been found, nobody has ever proven that an efficient algorithm for one cannot exist. In other words, it is unknown whether or not efficient algorithms exist for NP-complete problems. Second, the set of NP-complete problems has the remarkable property that if an efficient algorithm exists for any one of them, then efficient algorithms exist for all of them. This relationship among the NP-complete problems makes the lack of efficient solutions all the more tantalizing. Third, several NP-complete problems are similar, but not identical, to problems for which we do know of efficient algorithms. A small change to the problem statement can cause a big change to the efficiency of the best known algorithm.

It is valuable to know about NP-complete problems because some of them arise surprisingly often in real applications. If you are called upon to produce an efficient algorithm for an NP-complete problem, you are likely to spend a lot of time in a fruitless search. If you can show that the problem is NP-complete, you can instead spend your time developing an efficient algorithm that gives a good, but not the best possible, solution.

As a concrete example, consider a trucking company with a central warehouse. Each day, it loads up the truck at the warehouse and sends it around to several locations to make deliveries. At the end of the day, the truck must end up back at the warehouse so that it is ready to be loaded for the next day. To reduce costs, the company wants to select an order of delivery stops that yields the lowest overall distance traveled by the truck. This problem is the well-known “traveling-salesman problem,” and it is NP-complete. It has no known efficient algorithm. Under certain assumptions, however, there are efficient algorithms that give an overall distance that is not too far above the smallest possible. Chapter 35 discusses such “approximation algorithms.”
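To make the trade-off concrete, here is a hypothetical sketch of the simplest greedy heuristic for the trucking example: always drive to the nearest remaining stop. Note that this nearest-neighbor rule is only an illustration of settling for a good-but-not-optimal tour; it is not the guaranteed approximation algorithm of Chapter 35, and it can produce tours far from optimal:

```python
def nearest_neighbor_tour(dist, start=0):
    """Greedy delivery tour: from each stop, go to the nearest unvisited one,
    then return to the warehouse. dist[i][j] is the distance between stops."""
    n = len(dist)
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        here = tour[-1]
        tour.append(min(unvisited, key=lambda j: dist[here][j]))
        unvisited.remove(tour[-1])
    tour.append(start)                 # end back at the warehouse
    return tour, sum(dist[a][b] for a, b in zip(tour, tour[1:]))

# Symmetric distances between the warehouse (stop 0) and three delivery stops.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 8],
        [10, 4, 8, 0]]
print(nearest_neighbor_tour(dist))   # ([0, 1, 3, 2, 0], 23)
```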


1.1-1

Give a real-world example in which one of the following computational problems appears: sorting, determining the best order for multiplying matrices, or finding the convex hull.

1.2 Algorithms as a technology

Suppose computers were infinitely fast and computer memory was free. Would you have any reason to study algorithms? The answer is yes, if for no other reason than that you would still like to demonstrate that your solution method terminates and does so with the correct answer.

If computers were infinitely fast, any correct method for solving a problem would do. You would probably want your implementation to be within the bounds of good software engineering practice (i.e., well designed and documented), but you would most often use whichever method was the easiest to implement.

Of course, computers may be fast, but they are not infinitely fast. And memory may be cheap, but it is not free. Computing time is therefore a bounded resource, and so is space in memory. These resources should be used wisely, and algorithms that are efficient in terms of time or space will help you do so.


Efficiency

Algorithms devised to solve the same problem often differ dramatically in their efficiency. These differences can be much more significant than differences due to hardware and software.

As an example, in Chapter 2, we will see two algorithms for sorting. The first, known as insertion sort, takes time roughly equal to c1·n² to sort n items, where c1 is a constant that does not depend on n. That is, it takes time roughly proportional to n². The second, merge sort, takes time roughly equal to c2·n lg n, where lg n stands for log2 n and c2 is another constant that also does not depend on n. Insertion sort usually has a smaller constant factor than merge sort, so that c1 < c2. We shall see that the constant factors can be far less significant in the running time than the dependence on the input size n. Where merge sort has a factor of lg n in its running time, insertion sort has a factor of n, which is much larger. Although insertion sort is usually faster than merge sort for small input sizes, once the input size n becomes large enough, merge sort's advantage of lg n vs. n will more than compensate for the difference in constant factors. No matter how much smaller c1 is than c2, there will always be a crossover point beyond which merge sort is faster.

For a concrete example, let us pit a faster computer (computer A) running insertion sort against a slower computer (computer B) running merge sort. They each must sort an array of one million numbers. Suppose that computer A executes one billion instructions per second and computer B executes only ten million instructions per second, so that computer A is 100 times faster than computer B in raw computing power. To make the difference even more dramatic, suppose that the world's craftiest programmer codes insertion sort in machine language for computer A, and the resulting code requires 2n² instructions to sort n numbers. (Here, c1 = 2.) Merge sort, on the other hand, is programmed for computer B by an average programmer using a high-level language with an inefficient compiler, with the resulting code taking 50·n lg n instructions (so that c2 = 50). To sort one million numbers, computer A takes

    2 · (10⁶)² instructions / (10⁹ instructions/second) = 2000 seconds,

while computer B takes

    50 · 10⁶ · lg 10⁶ instructions / (10⁷ instructions/second) ≈ 100 seconds.

By using an algorithm whose running time grows more slowly, even with a poor compiler, computer B runs 20 times faster than computer A!
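The arithmetic above can be reproduced directly; this sketch plugs the assumed instruction counts and machine speeds into the two cost models and then searches for the crossover point:

```python
from math import log2

def insertion_time(n):   # computer A: 2n^2 instructions at 10^9 instructions/second
    return 2 * n**2 / 1e9

def merge_time(n):       # computer B: 50 n lg n instructions at 10^7 instructions/second
    return 50 * n * log2(n) / 1e7

m = 10**6
print(insertion_time(m), merge_time(m))   # 2000.0 vs. roughly 99.7 seconds

# Find the crossover point beyond which the slow machine running merge sort wins.
n = 2
while insertion_time(n) <= merge_time(n):
    n += 1
print(n)   # first input size at which computer B beats computer A
```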


Algorithms and other technologies

The example above shows that algorithms, like computer hardware, are a technology. Total system performance depends on choosing efficient algorithms as much as on choosing fast hardware. Just as rapid advances are being made in other computer technologies, they are being made in algorithms as well.

com-You might wonder whether algorithms are truly that important on contemporarycomputers in light of other advanced technologies, such as

• hardware with high clock rates, pipelining, and superscalar architectures,

• easy-to-use, intuitive graphical user interfaces (GUIs),

• object-oriented systems, and

• local-area and wide-area networking.

The answer is yes. Although there are some applications that do not explicitly require algorithmic content at the application level (e.g., some simple web-based applications), most also require a degree of algorithmic content on their own. For example, consider a web-based service that determines how to travel from one location to another. (Several such services existed at the time of this writing.) Its implementation would rely on fast hardware, a graphical user interface, wide-area networking, and also possibly on object orientation. However, it would also require algorithms for certain operations, such as finding routes (probably using a shortest-path algorithm), rendering maps, and interpolating addresses.

Moreover, even an application that does not require algorithmic content at the application level relies heavily upon algorithms. Does the application rely on fast hardware? The hardware design used algorithms. Does the application rely on graphical user interfaces? The design of any GUI relies on algorithms. Does the application rely on networking? Routing in networks relies heavily on algorithms. Was the application written in a language other than machine code? Then it was processed by a compiler, interpreter, or assembler, all of which make extensive use of algorithms. Algorithms are at the core of most technologies used in contemporary computers.

Furthermore, with the ever-increasing capacities of computers, we use them to solve larger problems than ever before. As we saw in the above comparison between insertion sort and merge sort, it is at larger problem sizes that the differences in efficiencies between algorithms become particularly prominent.

Having a solid base of algorithmic knowledge and technique is one characteristic that separates the truly skilled programmers from the novices. With modern computing technology, you can accomplish some tasks without knowing much about algorithms, but with a good background in algorithms, you can do much, much more.


Suppose we are comparing implementations of insertion sort and merge sort on the same machine. For inputs of size n, insertion sort runs in 8n² steps, while merge sort runs in 64·n lg n steps. For which values of n does insertion sort beat merge sort?

Problems for Chapter 1

1-1 Comparison of running times

For each function f(n) and time t in the following table, determine the largest size n of a problem that can be solved in time t, assuming that the algorithm to solve the problem takes f(n) microseconds.

            1 second   1 minute   1 hour   1 day   1 month   1 year   1 century
  lg n
  √n
  n
  n lg n
  n²
  n³
  2ⁿ
  n!


Chapter notes

There are many excellent texts on the general topic of algorithms, including those by Aho, Hopcroft, and Ullman [5, 6], Baase and Van Gelder [26], Brassard and Bratley [46, 47], Goodrich and Tamassia [128], Horowitz, Sahni, and Rajasekaran [158], Kingston [179], Knuth [182, 183, 185], Kozen [193], Manber [210], Mehlhorn [217, 218, 219], Purdom and Brown [252], Reingold, Nievergelt, and Deo [257], Sedgewick [269], Skiena [280], and Wilf [315]. Some of the more practical aspects of algorithm design are discussed by Bentley [39, 40] and Gonnet [126]. Surveys of the field of algorithms can also be found in the Handbook of Theoretical Computer Science, Volume A [302] and the CRC Handbook on Algorithms and Theory of Computation [24]. Overviews of the algorithms used in computational biology can be found in textbooks by Gusfield [136], Pevzner [240], Setubal and Meidanis [272], and Waterman [309].


2 Getting Started

This chapter will familiarize you with the framework we shall use throughout the book to think about the design and analysis of algorithms. It is self-contained, but it does include several references to material that will be introduced in Chapters 3 and 4. (It also contains several summations, which Appendix A shows how to solve.)

We begin by examining the insertion sort algorithm to solve the sorting problem introduced in Chapter 1. We define a “pseudocode” that should be familiar to readers who have done computer programming and use it to show how we shall specify our algorithms. Having specified the algorithm, we then argue that it correctly sorts and we analyze its running time. The analysis introduces a notation that focuses on how that time increases with the number of items to be sorted. Following our discussion of insertion sort, we introduce the divide-and-conquer approach to the design of algorithms and use it to develop an algorithm called merge sort. We end with an analysis of merge sort's running time.

2.1 Insertion sort

Our first algorithm, insertion sort, solves the sorting problem introduced in Chapter 1:

Input: A sequence of n numbers ⟨a1, a2, …, an⟩.

Output: A permutation (reordering) ⟨a′1, a′2, …, a′n⟩ of the input sequence such that a′1 ≤ a′2 ≤ · · · ≤ a′n.

The numbers that we wish to sort are also known as the keys.

In this book, we shall typically describe algorithms as programs written in a pseudocode that is similar in many respects to C, Pascal, or Java. If you have been introduced to any of these languages, you should have little trouble reading our algorithms. What separates pseudocode from “real” code is that in pseudocode, we employ whatever expressive method is most clear and concise to specify a given algorithm. Sometimes, the clearest method is English, so do not be surprised if you come across an English phrase or sentence embedded within a section of “real” code. Another difference between pseudocode and real code is that pseudocode is not typically concerned with issues of software engineering. Issues of data abstraction, modularity, and error handling are often ignored in order to convey the essence of the algorithm more concisely.

We start with insertion sort, which is an efficient algorithm for sorting a small number of elements. Insertion sort works the way many people sort a hand of playing cards. We start with an empty left hand and the cards face down on the table. We then remove one card at a time from the table and insert it into the correct position in the left hand. To find the correct position for a card, we compare it with each of the cards already in the hand, from right to left, as illustrated in Figure 2.1. At all times, the cards held in the left hand are sorted, and these cards were originally the top cards of the pile on the table.

Figure 2.1 Sorting a hand of cards using insertion sort.

Our pseudocode for insertion sort is presented as a procedure called INSERTION-SORT, which takes as a parameter an array A[1..n] containing a sequence of length n that is to be sorted. (In the code, the number n of elements in A is denoted by length[A].) The input numbers are sorted in place: the numbers are rearranged within the array A, with at most a constant number of them stored outside the array at any time. The input array A contains the sorted output sequence when INSERTION-SORT is finished.
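The INSERTION-SORT pseudocode itself is missing from this copy of the text. As a stand-in, here is a sketch of the procedure in Python; note that Python arrays are 0-indexed, whereas the book's run from 1 to length[A]:

```python
def insertion_sort(A):
    """Sort A in place, mirroring INSERTION-SORT: at most a constant
    number of elements (the current key) are stored outside the array."""
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:   # shift larger elements one slot right
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key                 # insert the key into its correct position
    return A

print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```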


Figure 2.2 The operation of INSERTION-SORT on the array A = ⟨5, 2, 4, 6, 1, 3⟩. Array indices appear above the rectangles, and values stored in the array positions appear within the rectangles. (a)–(e) The iterations of the for loop of lines 1–8. In each iteration, the black rectangle holds the key taken from A[j], which is compared with the values in shaded rectangles to its left in the test of line 5. Shaded arrows show array values moved one position to the right in line 6, and black arrows indicate where the key is moved to in line 8. (f) The final sorted array.

Loop invariants and the correctness of insertion sort

Figure 2.2 shows how this algorithm works for A = ⟨5, 2, 4, 6, 1, 3⟩. The index j indicates the “current card” being inserted into the hand. At the beginning of each iteration of the “outer” for loop, which is indexed by j, the subarray consisting of elements A[1..j−1] constitutes the currently sorted hand, and elements A[j+1..n] correspond to the pile of cards still on the table. In fact, elements A[1..j−1] are the elements originally in positions 1 through j−1, but now in sorted order. We state these properties of A[1..j−1] formally as a loop invariant:

At the start of each iteration of the for loop of lines 1–8, the subarray A[1..j−1] consists of the elements originally in A[1..j−1] but in sorted order.
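The loop invariant can also be checked mechanically. In this hypothetical instrumented sketch (ours, and 0-indexed), an assertion verifies the invariant at the start of every iteration of the outer loop: the first j elements are the original first j elements, in sorted order:

```python
def insertion_sort_checked(A):
    """Insertion sort with the loop invariant asserted each iteration."""
    original = list(A)
    for j in range(1, len(A)):
        # Loop invariant: A[0..j-1] holds the original first j elements, sorted.
        assert A[:j] == sorted(original[:j])
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key
    return A

print(insertion_sort_checked([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```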

We use loop invariants to help us understand why an algorithm is correct. We must show three things about a loop invariant:
