
The art of computer programming volume 3 sorting and searching (second edition 2011) part 1


DOCUMENT INFORMATION

Basic information

Title: Sorting and Searching
Author: Donald E. Knuth
Field: Computer Science
Type: Reference book
Year of publication: 2011
Format
Pages: 412
File size: 14.1 MB


the classic work


Third Edition (0-201-89683-4)

This first volume begins with basic programming concepts and techniques, then focuses on information structures: the representation of information inside a computer, the structural relationships between data elements and how to deal with them efficiently. Elementary applications are given to simulation, numerical methods, symbolic computing, software and system design.

Volume 2 / Seminumerical Algorithms
Third Edition (0-201-89684-2)

The second volume offers a complete introduction to the field of seminumerical algorithms, with separate chapters on random numbers and arithmetic. The book summarizes the major paradigms and basic theory of such algorithms, thereby providing a comprehensive interface between computer programming and numerical analysis.

Volume 3 / Sorting and Searching
Second Edition (0-201-89685-0)

The third volume comprises the most comprehensive survey of classical computer techniques for sorting and searching. It extends the treatment of data structures in Volume 1 to consider both large and small databases and internal and external memories.

Volume 4A / Combinatorial Algorithms, Part 1 (0-201-03804-8)

This volume introduces techniques that allow computers to deal efficiently with gigantic problems. Its coverage begins with Boolean functions and bitwise tricks and techniques, then treats in depth the generation of all tuples and permutations, all combinations


THE ART OF

COMPUTER PROGRAMMING

SECOND EDITION


Upper Saddle River, NJ • Boston • Indianapolis • San Francisco
New York • Toronto • Montreal • London • Munich • Paris • Madrid

TeX is a trademark of the American Mathematical Society

METAFONT is a trademark of Addison-Wesley

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purposes or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:

U.S. Corporate and Government Sales (800) 382-3419
corpsales@pearsontechgroup.com

For sales outside the U.S., please contact:

International Sales international@pearsoned.com

Visit us on the Web: informit.com/aw

Library of Congress Cataloging-in-Publication Data

Knuth, Donald Ervin, 1938-
The art of computer programming / Donald Ervin Knuth.
xiv, 782 p. 24 cm.
Includes bibliographical references and index.
Contents: v. 1. Fundamental algorithms -- v. 2. Seminumerical algorithms -- v. 3. Sorting and searching -- v. 4a. Combinatorial algorithms

Copyright © 1998 by Addison-Wesley

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.
Rights and Contracts Department
501 Boylston Street, Suite 900

a noble science; cooks are gentlemen.

-- TITUS LIVIUS, Ab Urbe Condita XXXIX.vi (Robert Burton, Anatomy of Melancholy 1.2.2.2)

This BOOK forms a natural sequel to the material on information structures in Chapter 2 of Volume 1, because it adds the concept of linearly ordered data to the other basic structural ideas.

The title “Sorting and Searching” may sound as if this book is only for those systems programmers who are concerned with the preparation of general-purpose sorting routines or applications to information retrieval. But in fact the area of sorting and searching provides an ideal framework for discussing a wide variety of important general issues:

• How are good algorithms discovered?

• How can given algorithms and programs be improved?

• How can the efficiency of algorithms be analyzed mathematically?

• How can a person choose rationally between different algorithms for the same task?

• In what senses can algorithms be proved “best possible”?

• How does the theory of computing interact with practical considerations?

• How can external memories like tapes, drums, or disks be used efficiently with large databases?

Indeed, I believe that virtually every important aspect of programming arises somewhere in the context of sorting or searching!

This volume comprises Chapters 5 and 6 of the complete series. Chapter 5 is concerned with sorting into order; this is a large subject that has been divided chiefly into two parts, internal sorting and external sorting. There also are supplementary sections, which develop auxiliary theories about permutations (Section 5.1) and about optimum techniques for sorting (Section 5.3). Chapter 6 deals with the problem of searching for specified items in tables or files; this is subdivided into methods that search sequentially, or by comparison of keys, or by digital properties, or by hashing, and then the more difficult problem of

between both chapters, with strong analogies tying the topics together. Two important varieties of information structures are also discussed, in addition to those considered in Chapter 2, namely priority queues (Section 5.2.3) and linear lists represented as balanced trees (Section 6.2.3).

Like Volumes 1 and 2, this book includes a lot of material that does not appear in other publications. Many people have kindly written to me about their ideas, or spoken to me about them, and I hope that I have not distorted the material too badly when I have presented it in my own words.

I have not had time to search the patent literature systematically; indeed, I decry the current tendency to seek patents on algorithms (see Section 5.4.5). If somebody sends me a copy of a relevant patent not presently cited in this book, I will dutifully refer to it in future editions. However, I want to encourage people to continue the centuries-old mathematical tradition of putting newly discovered algorithms into the public domain. There are better ways to earn a living than to prevent other people from making use of one’s contributions to computer science.

Before I retired from teaching, I used this book as a text for a student’s second course in data structures, at the junior-to-graduate level, omitting most of the mathematical material. I also used the mathematical portions of this book as the basis for graduate-level courses in the analysis of algorithms, emphasizing especially Sections 5.1, 5.2.2, 6.3, and 6.4. A graduate-level course on concrete computational complexity could also be based on Sections 5.3 and 5.4.4, together with Sections 4.3.3, 4.6.3, and 4.6.4 of Volume 2.

For the most part this book is self-contained, except for occasional discussions relating to the MIX computer explained in Volume 1. Appendix B contains a summary of the mathematical notations used, some of which are a little different from those found in traditional mathematics books.

Preface to the Second Edition

This new edition matches the third editions of Volumes 1 and 2, in which I have been able to celebrate the completion of TeX and METAFONT by applying those systems to the publications they were designed for.

The conversion to electronic format has given me the opportunity to go over every word of the text and every punctuation mark. I’ve tried to retain the youthful exuberance of my original sentences while perhaps adding some more mature judgment. Dozens of new exercises have been added; dozens of old exercises have been given new and improved answers. Changes appear everywhere, but most significantly in Sections 5.1.4 (about permutations and tableaux), 5.3 (about optimum sorting), 5.4.9 (about disk sorting), 6.2.2 (about entropy), 6.4 (about universal hashing), and 6.5 (about multidimensional trees

The Art of Computer Programming is, however, still a work in progress. Research on sorting and searching continues to grow at a phenomenal rate. Therefore some parts of this book are headed by an “under construction” icon, to apologize for the fact that the material is not up-to-date. For example, if I were teaching an undergraduate class on data structures today, I would surely discuss randomized structures such as treaps at some length; but at present, I am only able to cite the principal papers on the subject, and to announce plans for a future Section 6.2.5 (see page 478). My files are bursting with important material that I plan to include in the final, glorious, third edition of Volume 3, perhaps 17 years from now. But I must finish Volumes 4 and 5 first, and I do not want to delay their publication any more than absolutely necessary.

I am enormously grateful to the many hundreds of people who have helped me to gather and refine this material during the past 35 years. Most of the hard work of preparing the new edition was accomplished by Phyllis Winkler (who put the text of the first edition into TeX form), by Silvio Levy (who edited it extensively and helped to prepare several dozen illustrations), and by Jeffrey Oldham (who converted more than 250 of the original illustrations to METAPOST format). The production staff at Addison-Wesley has also been extremely helpful, as usual.

I have corrected every error that alert readers detected in the first edition (as well as some mistakes that, alas, nobody noticed) and I have tried to avoid introducing new errors in the new material. However, I suppose some defects still remain, and I want to fix them as soon as possible. Therefore I will cheerfully award $2.56 to the first finder of each technical, typographical, or historical error. The webpage cited on page iv contains a current listing of all corrections that have been reported to me.

Stanford, California          D. E. K.
February 1998

There are certain common Privileges of a Writer, the Benefit whereof, I hope, there will be no Reason to doubt; Particularly, that where I am not understood, it shall be concluded, that something very useful and profound is coucht underneath.

-- JONATHAN SWIFT, Tale of a Tub, Preface (1704)

The EXERCISES in this set of books have been designed for self-study as well as for classroom study. It is difficult, if not impossible, for anyone to learn a subject purely by reading about it, without applying the information to specific problems and thereby being encouraged to think about what has been read. Furthermore, we all learn best the things that we have discovered for ourselves. Therefore the exercises form a major part of this work; a definite attempt has been made to keep them as informative as possible and to select problems that are enjoyable as well as instructive.

In many books, easy exercises are found mixed randomly among extremely difficult ones. A motley mixture is, however, often unfortunate because readers like to know in advance how long a problem ought to take; otherwise they may just skip over all the problems. A classic example of such a situation is the book Dynamic Programming by Richard Bellman; this is an important, pioneering work in which a group of problems is collected together at the end of some chapters under the heading “Exercises and Research Problems,” with extremely trivial questions appearing in the midst of deep, unsolved problems. It is rumored that someone once asked Dr. Bellman how to tell the exercises apart from the research problems, and he replied, “If you can solve it, it is an exercise; otherwise it’s a research problem.”

Good arguments can be made for including both research problems and very easy exercises in a book of this kind; therefore, to save the reader from the possible dilemma of determining which are which, rating numbers have been provided to indicate the level of difficulty. These numbers have the following general significance:

Rating  Interpretation

00  An extremely easy exercise that can be answered immediately if the material of the text has been understood; such an exercise can almost always be worked “in your head.”

10  A simple problem that makes you think over the material just read, but is by no means difficult. You should be able to do this in one minute at most; pencil and paper may be useful in obtaining the solution.

20  An average problem that tests basic understanding of the text material, but you may need about fifteen or twenty minutes to answer it completely.

30  A problem of moderate difficulty and/or complexity; this one may involve more than two hours’ work to solve satisfactorily, or even more if the TV is on.

40  Quite a difficult or lengthy problem that would be suitable for a term project in classroom situations. A student should be able to solve the problem in a reasonable amount of time, but the solution is not trivial.

50  A research problem that has not yet been solved satisfactorily, as far as the author knew at the time of writing, although many people have tried. If you have found an answer to such a problem, you ought to write it up for publication; furthermore, the author of this book would appreciate hearing about the solution as soon as possible (provided that it is correct).

By interpolation in this “logarithmic” scale, the significance of other rating numbers becomes clear. For example, a rating of 17 would indicate an exercise that is a bit simpler than average. Problems with a rating of 50 that are subsequently solved by some reader may appear with a 45 rating in later editions of the book, and in the errata posted on the Internet (see page iv).

The remainder of the rating number divided by 5 indicates the amount of detailed work required. Thus, an exercise rated 24 may take longer to solve than an exercise that is rated 25, but the latter will require more creativity.

The author has tried earnestly to assign accurate rating numbers, but it is difficult for the person who makes up a problem to know just how formidable it will be for someone else to find a solution; and everyone has more aptitude for certain types of problems than for others. It is hoped that the rating numbers represent a good guess at the level of difficulty, but they should be taken as general guidelines, not as absolute indicators.

This book has been written for readers with varying degrees of mathematical training and sophistication; as a result, some of the exercises are intended only for the use of more mathematically inclined readers. The rating is preceded by an M if the exercise involves mathematical concepts or motivation to a greater extent than necessary for someone who is primarily interested only in programming the algorithms themselves. An exercise is marked with the letters “HM” if its solution necessarily involves a knowledge of calculus or other higher mathematics not developed in this book. An “HM” designation does not necessarily imply difficulty.

Some exercises are preceded by an arrowhead, “▶”; this designates problems that are especially instructive and especially recommended. Of course, no reader/student is expected to work all of the exercises, so those that seem to be the most valuable have been singled out. (This distinction is not meant to detract from the other exercises!) Each reader should at least make an attempt to solve all of the problems whose rating is 10 or less; and the arrows may help to indicate which of the problems with a higher rating should be given priority.

Solutions to most of the exercises appear in the answer section. Please use

solve the problem by yourself, or unless you absolutely do not have time to work this particular problem. After getting your own solution or giving the problem a decent try, you may find the answer instructive and helpful. The solution given will often be quite short, and it will sketch the details under the assumption that you have earnestly tried to solve it by your own means first. Sometimes the solution gives less information than was asked; often it gives more. It is quite possible that you may have a better answer than the one published here, or you may have found an error in the published solution; in such a case, the author will be pleased to know the details. Later printings of this book will give the improved solutions together with the solver’s name where appropriate.

When working an exercise you may generally use the answers to previous exercises, unless specifically forbidden from doing so. The rating numbers have been assigned with this in mind; thus it is possible for exercise n + 1 to have a lower rating than exercise n, even though it includes the result of exercise n as a special case.

Summary of codes:

▶   Recommended
M   Mathematically oriented
HM  Requiring “higher math”

00  Immediate
10  Simple (one minute)
20  Medium (quarter hour)
30  Moderately hard
40  Term project
50  Research problem

EXERCISES

1. [00] What does the rating “M20” mean?

2. [10] Of what value can the exercises in a textbook be to the reader?

3. [HM45] Prove that when n is an integer, n > 2, the equation $x^n + y^n = z^n$ has no solution in positive integers x, y, z.

Two hours’ daily exercise will be enough to keep a hack fit for his work.

-- M. H. MAHON, The Handy Horse Book (1865)

5.2.3. Sorting by Selection
5.2.4. Sorting by Merging
5.2.5. Sorting by Distribution
5.3. Optimum Sorting
5.3.1. Minimum-Comparison Sorting
*5.3.2. Minimum-Comparison Merging
*5.3.3. Minimum-Comparison Selection
*5.3.4. Networks for Sorting
5.4. External Sorting
5.4.1. Multiway Merging and Replacement Selection
*5.4.2. The Polyphase Merge
*5.4.3. The Cascade Merge
*5.4.4. Reading Tape Backwards
*5.4.5. The Oscillating Sort
*5.4.6. Practical Considerations for Tape Merging
*5.4.7. External Radix Sorting
*5.4.8. Two-Tape Sorting
*5.4.9. Disks and Drums
5.5. Summary, History, and Bibliography

Chapter 6: Searching
6.1. Sequential Searching
6.2. Searching by Comparison of Keys
6.2.1. Searching an Ordered Table
6.2.2. Binary Tree Searching
6.2.3. Balanced Trees
6.2.4. Multiway Trees
6.3. Digital Searching
6.5. Retrieval on Secondary Keys

Answers to Exercises

Appendix A: Tables of Numerical Quantities
1. Fundamental Constants (decimal)
2. Fundamental Constants (octal)
3. Harmonic Numbers, Bernoulli Numbers, Fibonacci Numbers

Appendix B: Index to Notations

Appendix C: Index to Algorithms and Theorems

Index and Glossary

There is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things.

-- NICCOLO MACHIAVELLI, The Prince (1513)

“But you can’t look up all those license numbers in time,” Drake objected.
“We don’t have to, Paul. We merely arrange a list and look for duplications.”

-- PERRY MASON, in The Case of the Angry Mourner (1951)

“Treesort” Computer -- With this new ‘computer-approach’ to nature study you can quickly identify over 260 different trees of U.S., Alaska, and Canada, even palms, desert trees, and other exotics. To sort, you simply insert the needle.

-- EDMUND SCIENTIFIC COMPANY, Catalog (1964)

In THIS CHAPTER we shall study a topic that arises frequently in programming: the rearrangement of items into ascending or descending order. Imagine how hard it would be to use a dictionary if its words were not alphabetized! We will see that, in a similar way, the order in which items are stored in computer memory often has a profound influence on the speed and simplicity of algorithms that manipulate those items.

Although dictionaries of the English language define “sorting” as the process of separating or arranging things according to class or kind, computer programmers traditionally use the word in the much more special sense of marshaling things into ascending or descending order. The process should perhaps be called ordering, not sorting; but anyone who tries to call it “ordering” is soon led into confusion because of the many different meanings attached to that word. Consider the following sentence, for example: “Since only two of our tape drives were in working order, I was ordered to order more tape units in short order, in order to order the data several orders of magnitude faster.” Mathematical terminology abounds with still more senses of order (the order of a group, the order of a permutation, the order of a branch point, relations of order, etc., etc.). Thus we find that the word “order” can lead to chaos.

Some people have suggested that “sequencing” would be the best name for the process of sorting into order; but this word often seems to lack the right connotation, especially when equal elements are present, and it occasionally conflicts with other terminology. It is quite true that “sorting” is itself an overused word (“I was sort of out of sorts after sorting that sort of data”), but it has become firmly established in computing parlance. Therefore we shall use the word “sorting” chiefly in the strict sense of sorting into order, without further apologies.

Some of the most important applications of sorting are:

a) Solving the “togetherness” problem, in which all items with the same identification are brought together. Suppose that we have 10000 items in arbitrary order, many of which have equal values; and suppose that we want to rearrange the data so that all items with equal values appear in consecutive positions. This is essentially the problem of sorting in the older sense of the word; and it can be solved easily by sorting the file in the new sense of the word, so that the values are in ascending order, $v_1 \le v_2 \le \cdots \le v_{10000}$. The efficiency achievable in this procedure explains why the original meaning of “sorting” has changed.

b) Matching items in two or more files. If several files have been sorted into the same order, it is possible to find all of the matching entries in one sequential pass through them, without backing up. This is the principle that Perry Mason used to help solve a murder case (see the quotation at the beginning of this chapter). We can usually process a list of information most quickly by traversing it in sequence from beginning to end, instead of skipping around at random in the list, unless the entire list is small enough to fit in a high-speed random-access memory. Sorting makes it possible to use sequential accessing on large files, as a feasible substitute for direct addressing.

c) Searching for information by key values. Sorting is also an aid to searching, as we shall see in Chapter 6, hence it helps us make computer output more suitable for human consumption. In fact, a listing that has been sorted into alphabetic order often looks quite authoritative even when the associated numerical information has been incorrectly computed.
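Applications (a) and (b) can be sketched in a few lines of Python. This is only an illustration, not code from the book: the function names `bring_together` and `match_sorted` are hypothetical, the data is assumed to fit in memory as lists, and equal entries in the two files are assumed to be paired off one for one.

```python
from itertools import groupby


def bring_together(items, key=lambda x: x):
    """Application (a): sort, then collect runs of equal keys.

    After sorting, items with equal keys occupy consecutive positions,
    so a single forward scan (groupby) is enough to group them.
    """
    ordered = sorted(items, key=key)
    return [(k, list(group)) for k, group in groupby(ordered, key=key)]


def match_sorted(left, right):
    """Application (b): find matching entries of two ascending lists.

    Both inputs are traversed once, front to back, without backing up.
    Assumption: duplicates are matched pairwise (one from each list).
    """
    i = j = 0
    matches = []
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1                     # left entry has no partner; skip it
        elif right[j] < left[i]:
            j += 1                     # right entry has no partner; skip it
        else:
            matches.append(left[i])    # equal keys: record the match
            i += 1
            j += 1
    return matches


if __name__ == "__main__":
    print(bring_together(["b", "a", "b", "c", "a"]))
    print(match_sorted([1, 3, 5, 7], [3, 4, 5, 8]))
```

The matching loop only ever moves forward in each list, which is why the same idea carries over to files that can only be read sequentially.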

Although sorting has traditionally been used mostly for business data processing, it is actually a basic tool that every programmer should keep in mind for use in a wide variety of situations. We have discussed its use for simplifying algebraic formulas, in exercise 2.3.2-17. The exercises below illustrate the diversity of typical applications.

One of the first large-scale software systems to demonstrate the versatility of sorting was the LARC Scientific Compiler developed by J. Erdwinn, D. E. Ferguson, and their associates at Computer Sciences Corporation in 1960. This optimizing compiler for an extended FORTRAN language made heavy use of sorting so that the various compilation algorithms were presented with relevant parts of the source program in a convenient sequence. The first pass was a lexical scan that divided the FORTRAN source code into individual tokens, each representing an identifier or a constant or an operator, etc. Each token was assigned several sequence numbers; when sorted on the name and an appropriate

“defining entries” by which a user would specify whether an identifier stood for a function name, a parameter, or a dimensioned variable were given low sequence numbers, so that they would appear first among the tokens having a given identifier; this made it easy to check for conflicting usage and to allocate storage with respect to EQUIVALENCE declarations. The information thus gathered about each identifier was now attached to each token; in this way no “symbol table” of identifiers needed to be maintained in the high-speed memory. The updated tokens were then sorted on another sequence number, which essentially brought the source program back into its original order except that the numbering scheme was cleverly designed to put arithmetic expressions into a more convenient “Polish prefix” form. Sorting was also used in later phases of compilation, to facilitate loop optimization, to merge error messages into the listing, etc. In short, the compiler was designed so that virtually all the processing could be done sequentially from files that were stored in an auxiliary drum memory, since appropriate sequence numbers were attached to the data in such a way that it could be sorted into various convenient arrangements.

Computer manufacturers of the 1960s estimated that more than 25 percent of the running time on their computers was spent on sorting, when all their customers were taken into account. In fact, there were many installations in which the task of sorting was responsible for more than half of the computing time. From these statistics we may conclude that either (i) there are many important applications of sorting, or (ii) many people sort when they shouldn’t, or (iii) inefficient sorting algorithms have been in common use. The real truth probably involves all three of these possibilities, but in any event we can see that sorting is worthy of serious study, as a practical matter.

Even if sorting were almost useless, there would be plenty of rewarding reasons for studying it anyway! The ingenious algorithms that have been discovered show that sorting is an extremely interesting topic to explore in its own right. Many fascinating unsolved problems remain in this area, as well as quite a few solved ones.

From a broader perspective we will find also that sorting algorithms make a valuable case study of how to attack computer programming problems in general. Many important principles of data structure manipulation will be illustrated in this chapter. We will be examining the evolution of various sorting techniques in an attempt to indicate how the ideas were discovered in the first place. By extrapolating this case study we can learn a good deal about strategies that help us design good algorithms for other computer problems.

Sorting techniques also provide excellent illustrations of the general ideas involved in the analysis of algorithms: the ideas used to determine performance characteristics of algorithms so that an intelligent choice can be made between competing methods. Readers who are mathematically inclined will find quite a few instructive techniques in this chapter for estimating the speed of computer algorithms and for solving complicated recurrence relations. On the other hand, the material has been arranged so that readers without a mathematical bent can skip over these calculations.

Before going on, we ought to define our problem a little more clearly, and introduce some terminology. We are given N items

$R_1, R_2, \ldots, R_N$

to be sorted; we shall call them records, and the entire collection of N records will be called a file. Each record $R_j$ has a key, $K_j$, which governs the sorting process. Additional data, besides the key, is usually also present; this extra “satellite information” has no effect on sorting except that it must be carried along as part of each record.

An ordering relation “<” is specified on the keys so that the following conditions are satisfied for any key values a, b, c:

i) Exactly one of the possibilities a < b, a = b, b < a is true. (This is called the law of trichotomy.)

ii) If a < b and b < c, then a < c. (This is the familiar law of transitivity.)

Properties (i) and (ii) characterize the mathematical concept of linear ordering, also called total ordering. Any relationship “<” satisfying these two properties can be sorted by most of the methods to be mentioned in this chapter, although some sorting techniques are designed to work only with numerical or alphabetic keys that have the usual ordering.
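For a small finite set of keys, the two laws can be checked by brute force. The following Python sketch is an illustration only; the name `is_linear_order` and the rock-paper-scissors example are assumptions introduced here, not part of the text.

```python
from itertools import product


def is_linear_order(values, less):
    """Return True if `less` obeys trichotomy and transitivity on `values`."""
    values = list(values)
    # Law (i): exactly one of a < b, a = b, b < a holds for every pair.
    for a, b in product(values, repeat=2):
        if [less(a, b), a == b, less(b, a)].count(True) != 1:
            return False
    # Law (ii): a < b and b < c together imply a < c.
    for a, b, c in product(values, repeat=3):
        if less(a, b) and less(b, c) and not less(a, c):
            return False
    return True


# A relation that satisfies trichotomy but is cyclic, hence not transitive.
BEATEN_BY = {("rock", "paper"), ("paper", "scissors"), ("scissors", "rock")}


def cyclic_less(a, b):
    return (a, b) in BEATEN_BY


if __name__ == "__main__":
    print(is_linear_order(range(5), lambda a, b: a < b))
    print(is_linear_order(["rock", "paper", "scissors"], cyclic_less))
```

The cyclic relation passes the first loop and fails the second, which shows that the two laws are independent requirements.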

The goal of sorting is to determine a permutation p(1) p(2) ... p(N) of the indices {1, 2, ..., N} that will put the keys into nondecreasing order:

$$K_{p(1)} \le K_{p(2)} \le \cdots \le K_{p(N)}. \tag{1}$$

The sorting is called stable if we make the further requirement that records with equal keys should retain their original relative order. In other words, stable sorting has the additional property that

$$p(i) < p(j) \quad\text{whenever}\quad K_{p(i)} = K_{p(j)} \text{ and } i < j. \tag{2}$$

In some cases we will want the records to be physically rearranged in storage so that their keys are in order. But in other cases it will be sufficient merely to have an auxiliary table that specifies the permutation in some way, so that the records can be accessed in order of their keys.
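As a minimal sketch of the "auxiliary table" idea, the Python fragment below computes the permutation p without moving any records, and then checks conditions (1) and (2) directly. The function names are hypothetical, and the sketch relies on the documented fact that Python's built-in `sorted` is stable.

```python
def sorting_permutation(keys):
    """Return the 1-based permutation p(1), ..., p(N) for a stable sort."""
    # Sorting the indices by key leaves the records themselves untouched;
    # because sorted() is stable, equal keys keep their original order.
    return sorted(range(1, len(keys) + 1), key=lambda j: keys[j - 1])


def satisfies_conditions(keys, p):
    """Check that p is a permutation meeting conditions (1) and (2)."""
    n = len(keys)
    if sorted(p) != list(range(1, n + 1)):
        return False
    k = [keys[j - 1] for j in p]
    # Condition (1): keys are nondecreasing when read in the order p.
    ordered = all(k[i] <= k[i + 1] for i in range(n - 1))
    # Condition (2): once (1) holds, equal keys are adjacent, so comparing
    # neighbouring positions is enough to verify stability.
    stable = all(p[i] < p[i + 1] for i in range(n - 1) if k[i] == k[i + 1])
    return ordered and stable


if __name__ == "__main__":
    keys = [503, 87, 512, 87, 61]
    p = sorting_permutation(keys)
    print(p)
    print([keys[j - 1] for j in p])
```

With the keys shown, records 2 and 4 share the key 87, and a stable sort must list index 2 before index 4; the permutation 5 4 2 1 3 would still satisfy (1) but not (2).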

A few of the sorting methods in this chapter assume the existence of either or both of the values “∞” and “−∞”, which are defined to be greater than or less than all keys, respectively:

$$-\infty < K_j < \infty, \qquad \text{for } 1 \le j \le N. \tag{3}$$

Such extreme values are occasionally used as artificial keys or as sentinel indicators. The case of equality is excluded in (3); if equality can occur, the algorithms can be modified so that they will still work, but usually at the expense of some elegance and efficiency.
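One way such a sentinel might be used is shown below: a straight insertion sort that places an artificial key −∞ in front of the data, so the inner loop needs no separate bounds test. This is a sketch under stated assumptions (numeric keys, Python's `math.inf` standing in for ∞, and a hypothetical function name), not a method quoted from the text.

```python
import math


def insertion_sort_with_sentinel(keys):
    """Sort numeric keys by straight insertion, using -inf as a sentinel."""
    a = [-math.inf] + list(keys)   # a[0] is the artificial key; data is a[1..N]
    for j in range(2, len(a)):
        k = a[j]
        i = j - 1
        # No "i >= 1" test is needed: a[0] = -inf is never greater than k,
        # so the scan always stops before running off the front.
        while a[i] > k:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = k
    return a[1:]


if __name__ == "__main__":
    print(insertion_sort_with_sentinel([503, 87, 512, 61, 908, 170]))
```

The design trade-off is the one the paragraph describes: the sentinel removes a comparison from the inner loop, at the cost of requiring a value that compares below every real key.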

Sorting can be classified generally into internal sorting, in which the records are kept entirely in the computer’s high-speed random-access memory, and external sorting, when there are more records than can be kept in memory at once. Internal sorting allows more flexibility in the structuring and accessing of the data, while external sorting shows us how to live with rather stringent accessing constraints.

The time required to sort N records, using a decent general-purpose sorting algorithm, is roughly proportional to N log N; we make about log N “passes” over the data. This is the minimum possible time, as we shall see in Section 5.3.1, if the records are in random order and if sorting is done by pairwise comparisons of keys. Thus if we double the number of records, it will take a little more than twice as long to sort them, all other things being equal. (Actually, as N approaches infinity, a better indication of the time needed to sort is $N(\log N)^2$

be accomplished in O(N) steps on the average
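The remark about doubling the number of records can be checked with a line of algebra. As a rough sketch, assume the running time is exactly $T(N) = cN\log N$ for some constant $c$ (the symbols $T$ and $c$ are introduced here only for illustration). Then

$$\frac{T(2N)}{T(N)} = \frac{c\,(2N)\log(2N)}{c\,N\log N} = 2\left(1 + \frac{\log 2}{\log N}\right).$$

The ratio exceeds 2 by the factor $1 + \log 2/\log N$, which shrinks slowly as N grows. For $N = 10^6$, for example, $\log 2/\log N \approx 0.05$, so doubling the file would take about 2.1 times as long under this assumption.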

EXERCISES -- First Set

1. [M20] Prove, from the laws of trichotomy and transitivity, that the permutation p(1) p(2) ... p(N) is uniquely determined when the sorting is assumed to be stable.

2. [21] Assume that each record $R_j$ in a certain file contains two keys, a “major key” $K_j$ and a “minor key” $k_j$, with a linear ordering < defined on each of the sets of keys. Then we can define lexicographic order between pairs of keys (K, k) in the usual way:

$(K_i, k_i) < (K_j, k_j)$ if $K_i < K_j$ or if $K_i = K_j$ and $k_i < k_j$.

Alice took this file and sorted it first on the major keys, obtaining n groups of records with equal major keys in each group,

$$K_{p(1)} = \cdots = K_{p(i_1)} < K_{p(i_1+1)} = \cdots = K_{p(i_2)} < \cdots < K_{p(i_{n-1}+1)} = \cdots = K_{p(i_n)},$$

where $i_n = N$. Then she sorted each of the n groups $R_{p(i_{j-1}+1)}, \ldots, R_{p(i_j)}$ on their minor keys.

Bill took the same original file and sorted it first on the minor keys; then he took the resulting file, and sorted it on the major keys.

Chris took the same original file and did a single sorting operation on it, using lexicographic order on the major and minor keys $(K_j, k_j)$.

Did everyone obtain the same result?

3. [M25] Let < be a relation on $K_1, \ldots, K_N$ that satisfies the law of trichotomy but not the transitive law. Prove that even without the transitive law it is possible to sort the records in a stable manner, meeting conditions (1) and (2); in fact, there are at least three arrangements that satisfy the conditions!

4. [21] Lexicographers don’t actually use strict lexicographic order in dictionaries, because uppercase and lowercase letters must be interfiled. Thus they want an ordering such as this:

a < A < aa < AA < AAA < Aachen < aah < ... < zzz < ZZZ

5. [M28] Design a binary code for all nonnegative integers so that if n is encoded as the string p(n) we have m < n if and only if p(m) is lexicographically less than p(n). Moreover, p(m) should not be a prefix of p(n) for any m ≠ n. If possible, the length of p(n) should be at most lg n + O(log log n) for all large n. (Such a code is useful if we want to sort texts that mix words and numbers, or if we want to map arbitrarily large alphabets into binary strings.)

6. [15] Mr. B. C. Dull (a MIX programmer) wanted to know if the number stored in location A is greater than, less than, or equal to the number stored in location B. So he wrote ‘LDA A; SUB B’ and tested whether register A was positive, negative, or zero. What serious mistake did he make, and what should he have done instead?

7. [17] Write a MIX subroutine for multiprecision comparison of keys, having the following specifications:

Calling sequence: JMP COMPARE

Entry conditions: rI1 = n; CONTENTS(A + k) = $a_k$ and CONTENTS(B + k) = $b_k$, for 1 ≤ k ≤ n; assume that n ≥ 1.

Exit conditions:
CI = GREATER, if $(a_n, \ldots, a_1) > (b_n, \ldots, b_1)$;
CI = EQUAL, if $(a_n, \ldots, a_1) = (b_n, \ldots, b_1)$;
CI = LESS, if $(a_n, \ldots, a_1) < (b_n, \ldots, b_1)$;
rX and rI1 are possibly affected.

Here the relation $(a_n, \ldots, a_1) < (b_n, \ldots, b_1)$ denotes lexicographic ordering from left to right; that is, there is an index j such that $a_k = b_k$ for n ≥ k > j, but $a_j < b_j$.

8. [30] Locations A and B contain two numbers a and b, respectively. Show that it is possible to write a MIX program that computes and stores min(a, b) in location C, without using any jump operators. (Caution: Since you will not be able to test whether or not arithmetic overflow has occurred, it is wise to guarantee that overflow is impossible regardless of the values of a and b.)

9. [M27] After N independent, uniformly distributed random variables between 0 and 1 have been sorted into nondecreasing order, what is the probability that the rth smallest of these numbers is ≤ x?

EXERCISES —SecondSet

Eachofthefollowing exercises statesaproblem that a computerprogrammermighthavehadto solve intheolddayswhencomputersdidn’thavemuchrandom-accessmemory Suggest a “good”wayto solvetheproblem, assumingthatonlya few thousandwordsof internalmemoryare available,supplemented by abouthalfa dozen tapeunits(enough tapeunits for sorting).Algorithms that workwellunder suchlimitations alsoprovetobeefficientonmodernmachines

10. [15] You are given a tape containing one million words of data. How do you determine how many distinct words are present on the tape?

11. [18] You are the U.S. Internal Revenue Service; you receive millions of "information" forms from organizations telling how much income they have paid to people, and millions of "tax" forms from people telling how much income they have been paid. How do you catch people who don't report all of their income?

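12. [M25] (Transposing a matrix.) You are given a magnetic tape containing one million words, representing the elements of a 1000 × 1000 matrix stored in order by rows: a_{1,1} a_{1,2} ... a_{1,1000} a_{2,1} ... a_{2,1000} ... a_{1000,1000}. How do you create a tape in which the elements are stored by columns a_{1,1} a_{2,1} ... a_{1000,1} a_{1,2} ... a_{1000,2} ... a_{1000,1000} instead? (Try to make less than a dozen passes over the data.)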

13. [M26] How could you "shuffle" a large file of N words into a random rearrangement?

14. [20] You are working with two computer systems that have different conventions for the "collating sequence" that defines the ordering of alphameric characters. How do you make one computer sort alphameric files in the order used by the other computer?

15. [18] You are given a list of the names of a fairly large number of people born in the U.S.A., together with the name of the state where they were born. How do you count the number of people born in each state? (Assume that nobody appears in the list more than once.)

16. [20] In order to make it easier to make changes to large FORTRAN programs, you want to design a "cross-reference" routine; such a routine takes FORTRAN programs as input and prints them together with an index that shows each use of each identifier (that is, each name) in the program. How should such a routine be designed?

17. [33] (Library card sorting.) Before the days of computerized databases, every library maintained a catalog of cards so that users could find the books they wanted. But the task of putting catalog cards into an order convenient for human use turned out to be quite complicated as library collections grew. The following "alphabetical" listing indicates many of the procedures recommended in the American Library Association Rules for Filing Catalog Cards (Chicago: 1942):

Text of card

R. Accademia nazionale dei Lincei, Rome
1812; ein historischer Roman
Bibliothèque d'histoire révolutionnaire
Bibliothèque des curiosités
Brown, Mrs. J. Crosby
Brown, John
Brown, John, mathematician
Brown, John, of Boston
Le XIXe siècle français
The 1847 issue of U.S. stamps
1812 overture
IBM journal of research and development
ha-I ha-ehad
Ia; a love story
International Business Machines Corporation
al-Khuwārizmī, Muḥammad ibn Mūsā, fl. 813–846
Labour. A magazine for all workers
Labor research association
Labour, see Labor
Uncle Tom's cabin
U.S. bureau of the census
Vandermonde, Alexandre Théophile, 1735–1796
Van Valkenburg, Mac Elwyn, 1921–
Von Neumann, John, 1903–1957
The whole art of legerdemain
Who's afraid of Virginia Woolf?
Wijngaarden, Adriaan van, 1916–

Remarks

Ignore foreign royalty (except British)
Achtzehnhundertzwölf
Treat apostrophe as space in French
Ignore accents on letters
Ignore designation of rank
Names with dates follow those without
... and the latter are subarranged by descriptive words
Arrange identical names by birthdate
Works "about" follow works "by"
Sometimes birthdate must be estimated
Ignore designation of rank
Treat hyphen as space
Book titles follow compound names
& in English becomes "and"
Ignore apostrophe in names
Ignore an initial article
... provided it's in nominative case
Names precede words
Dix-huit cent douze
Dix-neuvième
Eighteen forty-seven
Eighteen twelve
Initials are like one-letter words
Ignore initial article
Ignore punctuation in titles
Ignore initial "al-" in Arabic names
Respell it "Labor"
Cross-reference card
Ignore apostrophe in English
Mc = Mac
Treat hyphen as space
Ignore designation of rank
"Mrs." = "Mistress"
Don't ignore British royalty
"St." = "Saint", even in German
Treat hyphen as space
Sainte
(a book by Donald Ervin Knuth)
(a book by Harriet Beecher Stowe)
"U.S." = "United States"
Ignore space after prefix in surnames
Ignore initial article
Ignore apostrophe in English
Surname begins with uppercase letter

(Most of these rules are subject to certain exceptions, and there are many other rules not illustrated here.)

If you were given the job of sorting large quantities of catalog cards by computer, and eventually maintaining a very large file of such cards, and if you had no chance to change these long-standing policies of card filing, how would you arrange the data in such a way that the sorting and merging operations are facilitated?

18. [M25] (E. T. Parker.) Leonhard Euler once conjectured [Nova Acta Acad. Sci. Petropolitanae 13 (1795), 45–63, §3; written in 1778] that there are no solutions to the equation … Infinitely many counterexamples when n = 4 were subsequently found by Noam Elkies [Math. Comp. 51 (1988), 825–835]. Can you think of a way in which sorting would help in the search for counterexamples to Euler's conjecture when n = 6?

19. [24] Given a file containing a million or so distinct 30-bit binary words x_1, ..., x_N, what is a good way to find all complementary pairs {x_i, x_j} that are present? (Two words are complementary when one has 0 wherever the other has 1, and conversely; thus they are complementary if and only if their sum is (11...1)_2, when they are treated as binary numbers.)

20. [25] Given a file containing 1000 30-bit words x_1, ..., x_1000, how would you prepare a list of all pairs (x_i, x_j) such that x_i = x_j except in at most two bit positions?

21. [22] How would you go about looking for five-letter anagrams such as CARET, CARTE, CATER, CRATE, REACT, RECTA, TRACE; CRUEL, LUCRE, ULCER; DOWRY, ROWDY, WORDY? [One might wish to know whether there are any sets of ten or more five-letter English anagrams besides the remarkable set

APERS, ASPER, PARES, PARSE, PEARS, PRASE, PRESA, RAPES, REAPS, SPAER, SPARE, SPEAR,

to which we might add the French word APRÈS.]

22. [M28] Given the specifications of a fairly large number of directed graphs, what approach will be useful for grouping the isomorphic ones together? (Directed graphs are isomorphic if there is a one-to-one correspondence between their vertices and a one-to-one correspondence between their arcs, where the correspondences preserve incidence between vertices and arcs.)

23. [30] In a certain group of 4096 people, everyone has about 100 acquaintances. A file has been prepared listing all pairs of people who are acquaintances. (The relation is symmetric: If x is acquainted with y, then y is acquainted with x. Therefore the file contains roughly 200,000 entries.) How would you design an algorithm to list all the k-person cliques in this group of people, given k? (A clique is an instance of mutual acquaintances: Everyone in the clique is acquainted with everyone else.) Assume that there are no cliques of size 25, so the total number of cliques cannot be enormous.

24. [30] Three million men with distinct names were laid end-to-end, reaching from New York to California. Each participant was given a slip of paper on which he wrote down his own name and the name of the person immediately west of him in the line. The man at the extreme western end didn't understand what to do, so he threw his paper away; the remaining 2,999,999 slips of paper were put into a huge basket and taken to the National Archives in Washington, D.C. Here the contents of the basket were shuffled completely and transferred to magnetic tapes.

At this point an information scientist observed that there was enough information on the tapes to reconstruct the list of people in their original order. And a computer scientist discovered a way to do the reconstruction with fewer than 1000 passes through the data tapes, using only sequential accessing of tape files and a small amount of random-access memory. How was that possible?

[In other words, given the pairs (x_i, x_{i+1}), for 1 ≤ i < N, in random order, where the x_i are distinct, how can the sequence x_1 x_2 ... x_N be obtained, restricting all operations to serial techniques suitable for use with magnetic tapes? This is the problem of sorting into order when there is no easy way to tell which of two given keys precedes the other.]

25. [M21] (Discrete logarithms.) You know that p is a (rather large) prime number, and that a is a primitive root modulo p. Therefore, for all b in the range 1 ≤ b < p, there is a unique n such that a^n mod p = b, 1 ≤ n < p. (This n is called the index of b modulo p, with respect to a.) Explain how to find n, given b, without needing Ω(n) steps. [Hint: Let m = ⌈√p⌉ and try to solve a^{m n_1} ≡ b a^{-n_2} (modulo p) for 0 ≤ n_1, n_2 < m.]
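The hint in exercise 25 can be turned into a short program. The following is only a sketch under stated assumptions: the function name `discrete_log` is hypothetical, a Python dictionary stands in for the sorted table that a tape-oriented solution would build, and `pow(a, -1, p)` requires Python 3.8 or later.

```python
from math import isqrt

def discrete_log(a, b, p):
    """Return n with 1 <= n < p and a**n % p == b.

    Assumes p is prime and a is a primitive root modulo p.
    """
    m = isqrt(p - 1) + 1            # m = ceil(sqrt(p))
    a_inv = pow(a, -1, p)           # modular inverse of a
    # Tabulate b * a^(-n2) mod p for 0 <= n2 < m (right side of the hint).
    table = {}
    value = b % p
    for n2 in range(m):
        table.setdefault(value, n2)
        value = value * a_inv % p
    # Look for a^(m*n1) in the table, for 0 <= n1 < m (left side of the hint).
    step = pow(a, m, p)
    value = 1
    for n1 in range(m):
        if value in table:
            n = (m * n1 + table[value]) % (p - 1)
            return n if n != 0 else p - 1
        value = value * step % p
    raise ValueError("no solution; is a really a primitive root modulo p?")
```

Both loops run about √p times, so the total work is on the order of √p table operations rather than p multiplications.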


*5.1. COMBINATORIAL PROPERTIES OF PERMUTATIONS

A PERMUTATION of a finite set is an arrangement of its elements into a row. Permutations are of special importance in the study of sorting algorithms, since they represent the unsorted input data. In order to study the efficiency of different sorting methods, we will want to be able to count the number of permutations that cause a certain step of a sorting procedure to be executed a certain number of times.

We have, of course, met permutations frequently in previous chapters. For example, in Section 1.2.5 we discussed two basic theoretical methods of constructing the n! permutations of n objects; in Section 1.3.3 we analyzed some algorithms dealing with the cycle structure and multiplicative properties of permutations; in Section 3.3.2 we studied their "runs up" and "runs down." The purpose of the present section is to study several other properties of permutations, and to consider the general case where equal elements are allowed to appear. In the course of this study we will learn a good deal about combinatorial mathematics.

The properties of permutations are sufficiently pleasing to be interesting in their own right, and it is convenient to develop them systematically in one place instead of scattering the material throughout this chapter. But readers who are not mathematically inclined and readers who are anxious to dive right into sorting techniques are advised to go on to Section 5.2 immediately, since the present section actually has little direct connection to sorting.

*5.1.1. Inversions

Let a_1 a_2 ... a_n be a permutation of the set {1, 2, ..., n}. If i < j and a_i > a_j, the pair (a_i, a_j) is called an inversion of the permutation; for example, the permutation 3 1 4 2 has three inversions: (3, 1), (3, 2), and (4, 2). Each inversion is a pair of elements that is out of sort, so the only permutation with no inversions is the sorted permutation 1 2 ... n. This connection with sorting is the chief reason why we will be so interested in inversions, although we have already used the concept to analyze a dynamic storage allocation algorithm (see exercise 2.2.2–9).
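The definition of an inversion translates directly into a brute-force enumeration. This is a minimal sketch (the function name `inversions` is an illustrative choice); it examines all pairs and so takes quadratic time.

```python
def inversions(perm):
    """List the inversions (a_i, a_j) of perm, i.e. pairs with i < j and a_i > a_j."""
    return [(perm[i], perm[j])
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if perm[i] > perm[j]]

print(inversions([3, 1, 4, 2]))   # [(3, 1), (3, 2), (4, 2)]
print(inversions([1, 2, 3, 4]))   # [] since the sorted permutation has no inversions
```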

The concept of inversions was introduced by G. Cramer in 1750 [Intr. à l'Analyse des Lignes Courbes Algébriques (Geneva: 1750), 657–659; see Thomas Muir, Theory of Determinants 1 (1906), 11–14], in connection with his famous rule for solving linear equations. In essence, Cramer defined the determinant of an n × n matrix in the following way:

$$\det(x_{ij}) = \sum (-1)^{\operatorname{inv}(a_1 a_2 \ldots a_n)}\, x_{1a_1} x_{2a_2} \ldots x_{na_n},$$

summed over all permutations a_1 a_2 ... a_n of {1, 2, ..., n}, where inv(a_1 a_2 ... a_n) is the number of inversions of the permutation.
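Cramer's definition can be evaluated literally for small matrices. The sketch below assumes a square matrix given as a list of rows; the names `inv` and `det` are illustrative. It sums n! terms, so it serves only to illustrate the definition and is not a practical way to compute determinants.

```python
from itertools import permutations

def inv(perm):
    """Number of inversions of a sequence of distinct numbers."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def det(matrix):
    """Determinant as a signed sum over all permutations of the column indices."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = (-1) ** inv(perm)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total
```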

The inversion table b_1 b_2 ... b_n of the permutation a_1 a_2 ... a_n is obtained by letting b_j be the number of elements to the left of j that are greater than j. In other words, b_j is the number of inversions whose second component is j. It follows, for example, that the permutation

5 9 1 8 2 6 4 7 3 (1)

has the inversion table

2 3 6 4 0 2 2 1 0, (2)

since 5 and 9 are to the left of 1; 5, 9, 8 are to the left of 2; etc. This permutation has 20 inversions in all. By definition the numbers b_j will always satisfy

0 ≤ b_1 ≤ n - 1, 0 ≤ b_2 ≤ n - 2, ..., 0 ≤ b_{n-1} ≤ 1, b_n = 0. (3)

Perhaps the most important fact about inversions is the simple observation that an inversion table uniquely determines the corresponding permutation. We can go back from any inversion table b_1 b_2 ... b_n satisfying (3) to the unique permutation that produces it, by successively determining the relative placement of the elements n, n - 1, ..., 1 (in this order). For example, we can construct the permutation corresponding to (2) as follows: Write down the number 9; then place 8 after 9, since b_8 = 1. Similarly, put 7 after both 8 and 9, since b_7 = 2. Then 6 must follow two of the numbers already written down, because b_6 = 2; the partial result so far is therefore

9 8 6 7.

Continue by placing 5 at the left, since b_5 = 0; put 4 after four of the numbers; and put 3 after six numbers (namely at the extreme right), giving

5 9 8 6 4 7 3.

The insertion of 2 and 1 in an analogous way yields (1).
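Both directions of this correspondence are easy to sketch in code. The function names below are illustrative, and the list insertions make the reconstruction quadratic in the worst case; the point is only to mirror the placement procedure just described (place n, n - 1, ..., 1 so that each element j has exactly b_j larger elements to its left).

```python
def inversion_table(perm):
    """Return [b_1, ..., b_n]: b_j counts the elements to the left of j that exceed j."""
    table = [0] * len(perm)
    for pos, value in enumerate(perm):
        table[value - 1] = sum(1 for left in perm[:pos] if left > value)
    return table

def permutation_from_table(table):
    """Rebuild the permutation by placing n, n-1, ..., 1 in turn."""
    result = []
    for value in range(len(table), 0, -1):
        # Everything already placed is larger than value, so putting it at
        # offset b_value leaves exactly b_value larger elements to its left.
        result.insert(table[value - 1], value)
    return result

print(inversion_table([5, 9, 1, 8, 2, 6, 4, 7, 3]))
# [2, 3, 6, 4, 0, 2, 2, 1, 0]
print(permutation_from_table([2, 3, 6, 4, 0, 2, 2, 1, 0]))
# [5, 9, 1, 8, 2, 6, 4, 7, 3]
```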

This correspondence is important because we can often translate a problem stated in terms of permutations into an equivalent problem stated in terms of inversion tables, and the latter problem may be easier to solve. For example, consider the simplest question of all: How many permutations of {1, 2, ..., n} are possible? The answer must be the number of possible inversion tables, and they are easily enumerated since there are n choices for b_1, independently n - 1 choices for b_2, ..., 1 choice for b_n, making n(n - 1)...1 = n! choices in all. Inversions are easy to count, because the b's are completely independent of each other, while the a's must be mutually distinct.

In Section 1.2.10 we analyzed the number of local maxima that occur when a permutation is read from right to left; in other words, we counted how many elements are larger than any of their successors. (The right-to-left maxima in (1), for example, are 3, 7, 8, and 9.) This is the number of j such that b_j has its maximum value, n - j. Since b_1 will equal n - 1 with probability 1/n, and (independently) b_2 will be equal to n - 2 with probability 1/(n - 1), etc., it is clear that the average number of right-to-left maxima is 1/n + 1/(n - 1) + ... + 1 = H_n. The corresponding generating function is also easily derived in a similar way.

Fig. 1. The truncated octahedron, which shows the change in inversions when adjacent elements of a permutation are interchanged.
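The probabilities 1/n, 1/(n - 1), ..., 1 quoted above add up to the harmonic number 1/n + 1/(n - 1) + ... + 1, which should therefore be the average number of right-to-left maxima. The following sketch checks this by exhaustive enumeration for small n; the function names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations

def right_to_left_maxima(perm):
    """Elements larger than all of their successors, listed from right to left."""
    maxima = []
    largest = 0
    for value in reversed(perm):
        if value > largest:
            maxima.append(value)
            largest = value
    return maxima

def average_maxima(n):
    """Exact average number of right-to-left maxima over all n! permutations."""
    perms = list(permutations(range(1, n + 1)))
    total = sum(len(right_to_left_maxima(p)) for p in perms)
    return Fraction(total, len(perms))

def harmonic(n):
    """The harmonic number 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

print(right_to_left_maxima([5, 9, 1, 8, 2, 6, 4, 7, 3]))   # [3, 7, 8, 9]
print(average_maxima(5), harmonic(5))                       # 137/60 137/60
```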

If we interchange two adjacent elements of a permutation, it is easy to see that the total number of inversions will increase or decrease by unity. Figure 1 shows the 24 permutations of {1, 2, 3, 4}, with lines joining permutations that differ by an interchange of adjacent elements; following any line downward inverts exactly one new pair. Hence the number of inversions of a permutation π is the length of a downward path from 1234 to π in Fig. 1; all such paths must have the same length.

Incidentally, the diagram in Fig. 1 may be viewed as a three-dimensional solid, the "truncated octahedron," which has 8 hexagonal faces and 6 square faces. This is one of the classical uniform polyhedra attributed to Archimedes (see exercise 10).

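The reader should not confuse inversions of a permutation with the inverse of a permutation. Recall that we can write a permutation in two-line form

$$\begin{pmatrix} 1 & 2 & 3 & \ldots & n \\ a_1 & a_2 & a_3 & \ldots & a_n \end{pmatrix};$$ (4)

the inverse a'_1 a'_2 a'_3 ... a'_n of this permutation is the permutation obtained by interchanging the two rows and then sorting the columns into increasing order of the new top row:

$$\begin{pmatrix} a_1 & a_2 & a_3 & \ldots & a_n \\ 1 & 2 & 3 & \ldots & n \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & \ldots & n \\ a'_1 & a'_2 & a'_3 & \ldots & a'_n \end{pmatrix}.$$ (5)

H. A. Rothe noticed an interesting connection between inverses and inversions: The inverse of a permutation has exactly as many inversions as the permutation itself. Rothe's proof of this fact was not the simplest possible one, but it is instructive and quite pretty nevertheless. We construct an n × n chessboard having a dot in column j of row i whenever a_i = j. Then we put ×'s in all squares that have dots lying both below (in the same column) and to their right (in the same row). For example, the diagram for 5 9 1 8 2 6 4 7 3 is

[Diagram: 9 × 9 chessboard of dots and ×'s for the permutation 5 9 1 8 2 6 4 7 3.]

The number of ×'s is the number of inversions, since it is easy to see that b_j is the number of ×'s in column j. Now if we transpose the diagram (interchanging rows and columns) we get the diagram corresponding to the inverse of the original permutation. Hence the number of ×'s (the number of inversions) is the same in both cases. Rothe used this fact to prove that the determinant of a matrix is unchanged when the matrix is transposed.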
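Rothe's observation is easy to check by machine. In this sketch (function names are illustrative) the inverse is computed from the rule that a'_j = k exactly when a_k = j, and the inversion counts of a permutation and its inverse are compared.

```python
from itertools import permutations

def inverse(perm):
    """Inverse permutation: result[j-1] = k exactly when perm[k-1] = j."""
    result = [0] * len(perm)
    for position, value in enumerate(perm, start=1):
        result[value - 1] = position
    return result

def count_inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

p = [5, 9, 1, 8, 2, 6, 4, 7, 3]
print(inverse(p))                                          # [3, 5, 9, 7, 1, 6, 8, 4, 2]
print(count_inversions(p), count_inversions(inverse(p)))   # 20 20
```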

The analysis of several sorting algorithms involves the knowledge of how many permutations of n elements have exactly k inversions. Let us denote that number by I_n(k). Table 1 lists the first few values of this function.

By considering the inversion table b_1 b_2 ... b_n, it is obvious that I_n(0) = 1, I_n(1) = n - 1, and there is a symmetry property

$$I_n\Bigl(\binom{n}{2} - k\Bigr) = I_n(k).$$ (6)

[Table 1. PERMUTATIONS WITH k INVERSIONS; columns n, I_n(0), I_n(1), ..., I_n(11).]

It is not difficult to see that the generating function

$$G_n(z) = I_n(0) + I_n(1)z + I_n(2)z^2 + \cdots$$ (7)

satisfies G_n(z) = (1 + z + ... + z^{n-1}) G_{n-1}(z); hence it has the comparatively simple form noticed by O. Rodrigues [J. de Math. 4 (1839), 236–240]:

$$(1 + z + \cdots + z^{n-1}) \ldots (1 + z)(1) = (1 - z^n) \ldots (1 - z^2)(1 - z)/(1 - z)^n.$$ (8)

From this generating function, we can easily extend Table 1, and we can verify that the numbers below the zigzag line in that table satisfy

$$I_n(k) = I_n(k-1) + I_{n-1}(k), \qquad \text{for } k < n.$$ (9)

(This relation does not hold above the zigzag line.) A more complicated argument (see exercise 14) shows that, in fact, we have the formulas … in which (3j^2 - j)/2 is a so-called "pentagonal number."
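The recurrence G_n(z) = (1 + z + ... + z^{n-1}) G_{n-1}(z) gives a direct way to tabulate I_n(k). The sketch below multiplies out the product one factor at a time; the names `inversion_counts` and `I` are illustrative.

```python
def inversion_counts(n):
    """Coefficients of G_n(z): entry k is I_n(k), for 0 <= k <= n(n-1)/2."""
    coeffs = [1]                                  # empty product
    for m in range(1, n + 1):
        # Multiply the polynomial by 1 + z + ... + z^(m-1).
        new = [0] * (len(coeffs) + m - 1)
        for k, c in enumerate(coeffs):
            for shift in range(m):
                new[k + shift] += c
        coeffs = new
    return coeffs

def I(n, k):
    """I_n(k), taken to be 0 when k is out of range."""
    counts = inversion_counts(n)
    return counts[k] if 0 <= k < len(counts) else 0

for n in range(1, 6):
    print(n, inversion_counts(n))
# 1 [1]
# 2 [1, 1]
# 3 [1, 2, 2, 1]
# 4 [1, 3, 5, 6, 5, 3, 1]
# 5 [1, 4, 9, 15, 20, 22, 20, 15, 9, 4, 1]
```

Each row is symmetric and sums to n!, and relation (9) can be confirmed for k < n by comparing consecutive rows; for example I_4(4) = 5 while I_4(3) + I_3(4) = 6, so the relation indeed fails at k = n.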

If we divide G_n(z) by n! we get the generating function g_n(z) for the probability distribution of the number of inversions in a random permutation of n elements.

A remarkable discovery about the distribution of inversions was made by P. A. MacMahon [Amer. J. Math. 35 (1913), 281–322]. Let us define the index of the permutation a_1 a_2 ... a_n as the sum of all subscripts j such that a_j > a_{j+1}, 1 ≤ j < n. For example, the index of 5 9 1 8 2 6 4 7 3 is 2 + 4 + 6 + 8 = 20. By coincidence the index is the same as the number of inversions in this case. If we list the 24 permutations of {1, 2, 3, 4}, namely

[Table: Permutation, Index, Inversions, for each of the 24 permutations.]

we see that the number of permutations having a given index, k, is the same as the number having k inversions.
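MacMahon's observation, that as many permutations have index k as have k inversions, can be checked by brute force for small n. This sketch tallies both statistics over all permutations; the function names are illustrative.

```python
from collections import Counter
from itertools import permutations

def index(perm):
    """Sum of the subscripts j (1-based) such that a_j > a_{j+1}."""
    return sum(j for j in range(1, len(perm)) if perm[j - 1] > perm[j])

def count_inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def distributions(n):
    """Tallies of index values and of inversion counts over all n! permutations."""
    perms = list(permutations(range(1, n + 1)))
    return Counter(map(index, perms)), Counter(map(count_inversions, perms))

by_index, by_inversions = distributions(4)
print([by_index[k] for k in range(7)])        # [1, 3, 5, 6, 5, 3, 1]
print([by_inversions[k] for k in range(7)])   # [1, 3, 5, 6, 5, 3, 1]
```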

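At first this fact might appear to be almost obvious, but further scrutiny makes it very mysterious. MacMahon gave an ingenious indirect proof, as follows: Let ind(a_1 a_2 ... a_n) be the index of the permutation a_1 a_2 ... a_n, and let

$$H_n(z) = \sum z^{\operatorname{ind}(a_1 a_2 \ldots a_n)}$$ (14)

be the corresponding generating function, where the sum is over all permutations of {1, 2, ..., n}. We wish to show that H_n(z) = G_n(z). For this purpose we will define a one-to-one correspondence between arbitrary n-tuples (q_1, q_2, ..., q_n) of nonnegative integers, on the one hand, and ordered pairs of n-tuples

((a_1, a_2, ..., a_n), (p_1, p_2, ..., p_n))

on the other hand, where a_1 a_2 ... a_n is a permutation of the indices {1, 2, ..., n} and p_1 ≥ p_2 ≥ ... ≥ p_n ≥ 0. This correspondence will satisfy the condition

$$q_1 + q_2 + \cdots + q_n = \operatorname{ind}(a_1 a_2 \ldots a_n) + (p_1 + p_2 + \cdots + p_n).$$ (15)

The generating function Σ z^{q_1 + q_2 + ... + q_n}, summed over all n-tuples of nonnegative integers (q_1, q_2, ..., q_n), is Q_n(z) = 1/(1 - z)^n; and the generating function Σ z^{p_1 + p_2 + ... + p_n}, summed over all n-tuples of integers (p_1, p_2, ..., p_n) such that p_1 ≥ p_2 ≥ ... ≥ p_n ≥ 0, is

$$P_n(z) = 1/(1 - z)(1 - z^2) \ldots (1 - z^n),$$ (16)

as shown in exercise 15. In view of (15), the one-to-one correspondence we are about to establish will prove that Q_n(z) = H_n(z) P_n(z), that is,

$$H_n(z) = Q_n(z)/P_n(z).$$ (17)

But Q_n(z)/P_n(z) is G_n(z), by (8).

The desired correspondence is defined by a simple sorting procedure: Any n-tuple (q_1, q_2, ..., q_n) can be rearranged into nonincreasing order q_{a_1} ≥ q_{a_2} ≥ ... ≥ q_{a_n} in a stable manner, where a_1 a_2 ... a_n is a permutation such that q_{a_j} = q_{a_{j+1}} implies a_j < a_{j+1}. We set (p_1, p_2, ..., p_n) = (q_{a_1}, q_{a_2}, ..., q_{a_n}) and then, for 1 ≤ j < n, subtract 1 from each of p_1, ..., p_j for each j such that a_j > a_{j+1}. We still have p_1 ≥ p_2 ≥ ... ≥ p_n, because p_j was strictly greater than p_{j+1} whenever a_j > a_{j+1}. The resulting pair satisfies (15), because the total number of subtractions is ind(a_1 a_2 ... a_n).

Conversely, we can easily go back to (q_1, q_2, ..., q_n) when a_1 a_2 ... a_n and (p_1, p_2, ..., p_n) are given. (See exercise 17.) So the desired correspondence has been established, and MacMahon's index theorem has been proved.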
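The sorting procedure behind MacMahon's correspondence can be sketched as follows. The function names `to_pair` and `from_pair` and the sample input are illustrative assumptions; Python's `sorted` is stable, which is exactly the property the construction needs.

```python
def index(perm):
    """Sum of the subscripts j (1-based) such that a_j > a_{j+1}."""
    return sum(j for j in range(1, len(perm)) if perm[j - 1] > perm[j])

def to_pair(q):
    """Map (q_1, ..., q_n) to the pair (a_1 ... a_n, (p_1, ..., p_n))."""
    n = len(q)
    # Stable sort of the subscripts 1..n into nonincreasing order of q.
    a = sorted(range(1, n + 1), key=lambda s: -q[s - 1])
    p = [q[s - 1] for s in a]
    for j in range(1, n):
        if a[j - 1] > a[j]:           # a_j > a_{j+1}
            for i in range(j):        # subtract 1 from p_1, ..., p_j
                p[i] -= 1
    return a, p

def from_pair(a, p):
    """Inverse mapping: undo the subtractions, then undo the sort."""
    p = list(p)
    for j in range(1, len(a)):
        if a[j - 1] > a[j]:
            for i in range(j):
                p[i] += 1
    q = [0] * len(a)
    for subscript, value in zip(a, p):
        q[subscript - 1] = value
    return q

q = [3, 1, 4, 1, 5, 9, 2, 6, 5]
a, p = to_pair(q)
print(a)   # [6, 8, 5, 9, 3, 1, 7, 2, 4]
print(p)   # [5, 2, 2, 2, 2, 2, 1, 1, 1]
print(sum(q), index(a) + sum(p))   # 36 36
```

Here the sum of the q's equals the index of the permutation plus the sum of the p's, and `from_pair(a, p)` recovers the original n-tuple.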

D. Foata and M. P. Schützenberger discovered a surprising extension of MacMahon's theorem, about 65 years after MacMahon's original publication: The number of permutations of n elements that have k inversions and index l is the same as the number that have l inversions and index k. In fact, Foata and Schützenberger found a simple one-to-one correspondence between permutations of the first kind and permutations of the second (see exercise 25).

EXERCISES

1. [10] What is the inversion table for the permutation 2 7 1 8 4 5 9 3 6? What permutation has the inversion table 5 0 1 2 1 2 0 0?

2. [M20] In the classical problem of Josephus (exercise 1.3.2–22), n men are initially arranged in a circle; the mth man is executed, the circle closes, and every mth man is repeatedly eliminated until all are dead. The resulting execution order is a permutation of {1, 2, ..., n}. For example, when n = 8 and m = 4 the order is 5 4 6 1 3 8 7 2 (man 1 is 5th out, etc.); the inversion table corresponding to this permutation is 3 6 3 1 0 0 1 0. Give a simple recurrence relation for the elements b_1 b_2 ... b_n of the inversion table in the general Josephus problem for n men, when every mth man is executed.
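The example in exercise 2 can be reproduced by simulating the circle directly. This sketch only checks the stated example and does not supply the recurrence the exercise asks for; the function names are illustrative.

```python
def josephus_order(n, m):
    """Men 1..n stand in a circle; return them in the order they are executed."""
    circle = list(range(1, n + 1))
    order = []
    pos = 0
    while circle:
        pos = (pos + m - 1) % len(circle)   # count off m men from the current spot
        order.append(circle.pop(pos))
    return order

def josephus_permutation(n, m):
    """Entry k-1 is j when man k is the j-th to be executed."""
    perm = [0] * n
    for turn, man in enumerate(josephus_order(n, m), start=1):
        perm[man - 1] = turn
    return perm

def inversion_table(perm):
    """Return [b_1, ..., b_n]: b_j counts the elements to the left of j that exceed j."""
    table = [0] * len(perm)
    for pos, value in enumerate(perm):
        table[value - 1] = sum(1 for left in perm[:pos] if left > value)
    return table

perm = josephus_permutation(8, 4)
print(perm)                    # [5, 4, 6, 1, 3, 8, 7, 2]
print(inversion_table(perm))   # [3, 6, 3, 1, 0, 0, 1, 0]
```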

3. [18] If the permutation a_1 a_2 ... a_n corresponds to the inversion table b_1 b_2 ... b_n, what is the permutation ā_1 ā_2 ... ā_n that corresponds to the inversion table

(n - 1 - b_1)(n - 2 - b_2) ... (0 - b_n)?

4. [20] Design an algorithm suitable for computer implementation that constructs the permutation a_1 a_2 ... a_n corresponding to a given inversion table b_1 b_2 ... b_n satisfying (3). [Hint: Consider a linked-memory technique.]

5. [35] The algorithm of exercise 4 requires an execution time roughly proportional to n + b_1 + ... + b_n on typical computers, and this is Θ(n^2) on the average. Is there an algorithm whose worst-case running time is substantially better than order n^2?

6. [26] Design an algorithm that computes the inversion table b_1 b_2 ... b_n corresponding to a given permutation a_1 a_2 ... a_n of {1, 2, ..., n}, where the running time is essentially proportional to n log n on typical computers.

7. [20] Several other kinds of inversion tables can be defined, corresponding to a given permutation a_1 a_2 ... a_n of {1, 2, ..., n}, besides the particular table b_1 b_2 ... b_n defined in the text; in this exercise we will consider three other types of inversion tables that arise in applications.

Let c_j be the number of inversions whose first component is j, that is, the number of elements to the right of j that are less than j. [Corresponding to (1) we have the table 0 0 0 1 4 2 1 5 7; clearly 0 ≤ c_j < j.] Let B_j = b_{a_j} and C_j = c_{a_j}.

Show that 0 ≤ B_j < j and 0 ≤ C_j ≤ n - j, for 1 ≤ j ≤ n; furthermore show that the permutation a_1 a_2 ... a_n can be determined uniquely when either c_1 c_2 ... c_n or B_1 B_2 ... B_n or C_1 C_2 ... C_n is given.

8. [M24] Continuing the notation of exercise 7, let a'_1 a'_2 ... a'_n be the inverse of the permutation a_1 a_2 ... a_n, and let the corresponding inversion tables be b'_1 b'_2 ... b'_n, c'_1 c'_2 ... c'_n, B'_1 B'_2 ... B'_n, and C'_1 C'_2 ... C'_n. Find as many interesting relations as you can between the numbers a_j, b_j, c_j, B_j, C_j, a'_j, b'_j, c'_j, B'_j, C'_j.

9. [M21] Prove that, in the notation of exercise 7, the permutation a_1 a_2 ... a_n is an involution (that is, its own inverse) if and only if b_j = C_j for 1 ≤ j ≤ n.

10. [HM20] Consider Fig. 1 as a polyhedron in three dimensions. What is the diameter of the truncated octahedron (the distance between vertex 1234 and vertex 4321), if all of its edges have unit length?

b) Conversely, let E be any transitive subset of T = {(x, y) | 1 ≤ y < x ≤ n} whose complement Ē = T \ E is also transitive. Prove that there exists a permutation π such that E(π) = E.

12. [M28] Continuing the notation of the previous exercise, prove that if π_1 and π_2 are permutations and if E is the smallest transitive set containing E(π_1) ∪ E(π_2), then Ē is transitive. [Hence, if we say π_1 is "above" π_2 whenever E(π_1) ⊆ E(π_2), a lattice of permutations is defined; there is a unique "lowest" permutation "above" two given permutations. Figure 1 is the lattice diagram when n = 4.]

13. [M23] It is well known that half of the terms in the expansion of a determinant have a plus sign, and half have a minus sign. In other words, there are just as many permutations with an even number of inversions as with an odd number, when n ≥ 2. Show that, in general, the number of permutations having a number of inversions congruent to t modulo m is n!/m, regardless of the integer t, whenever n ≥ m.

14. [M24] (F. Franklin.) A partition of n into k distinct parts is a representation n = p_1 + p_2 + ... + p_k, where p_1 > p_2 > ... > p_k > 0. For example, the partitions of 7 into distinct parts are 7, 6 + 1, 5 + 2, 4 + 3, 4 + 2 + 1. Let f_k(n) be the number of partitions of n into k distinct parts; prove that Σ_k (-1)^k f_k(n) = 0, unless n has the form (3j^2 ± j)/2, for some nonnegative integer j; in the latter case the sum is (-1)^j. For example, when n = 7 the sum is -1 + 3 - 1 = 1, and 7 = (3·2^2 + 2)/2. [Hint: Represent a partition as an array of dots, putting p_i dots in the ith row, for 1 ≤ i ≤ k. Find the smallest j such that p_{j+1} < p_j - 1, and encircle the rightmost dots in the first j rows. If j < p_k, these j dots can usually be removed, tilted 45°, and placed as a new (k+1)st row. On the other hand if j ≥ p_k, the kth row of dots can usually be removed, tilted 45°, and placed to the right of the circled dots. (See Fig. 2.) This process pairs off partitions having an odd number of rows with partitions having an even number of rows, in most cases, so only unpaired partitions must be considered in the sum.]

Fig. 2. Franklin's correspondence between partitions with distinct parts.

Note: As a consequence, we obtain Euler's formula

$$(1 - z)(1 - z^2)(1 - z^3) \ldots = \sum_{-\infty < j < \infty} (-1)^j z^{(3j^2 + j)/2}.$$
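The identity in exercise 14 can be checked numerically for small n before attempting a proof. This sketch counts partitions into distinct parts by brute-force recursion and compares the signed sum with the pentagonal-number rule; all function names are illustrative, and the code verifies the statement without reproducing Franklin's pairing argument.

```python
def distinct_part_counts(n):
    """Return {k: f_k(n)}, the number of partitions of n into k distinct parts."""
    counts = {}

    def extend(remaining, largest_allowed, parts_used):
        if remaining == 0:
            counts[parts_used] = counts.get(parts_used, 0) + 1
            return
        # Choose the next (strictly smaller) part.
        for part in range(min(remaining, largest_allowed), 0, -1):
            extend(remaining - part, part - 1, parts_used + 1)

    extend(n, n, 0)
    return counts

def signed_sum(n):
    """Sum over k of (-1)^k f_k(n)."""
    return sum((-1) ** k * f for k, f in distinct_part_counts(n).items())

def pentagonal_sign(n):
    """(-1)^j if n = (3j^2 - j)/2 or (3j^2 + j)/2 for some j >= 0, else 0."""
    j = 0
    while (3 * j * j - j) // 2 <= n:
        if n in ((3 * j * j - j) // 2, (3 * j * j + j) // 2):
            return (-1) ** j
        j += 1
    return 0

print(distinct_part_counts(7))              # {1: 1, 2: 3, 3: 1}
print(signed_sum(7), pentagonal_sign(7))    # 1 1
```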


15. [M23] Prove that (16) is the generating function for partitions into at most n parts; that is, prove that the coefficient of z^m in 1/(1 - z)(1 - z^2)...(1 - z^n) is the number of ways to write m = p_1 + p_2 + ... + p_n with p_1 ≥ p_2 ≥ ... ≥ p_n ≥ 0. [Hint: Drawing dots as in exercise 14, show that there is a one-to-one correspondence between n-tuples (p_1, p_2, ..., p_n) such that p_1 ≥ p_2 ≥ ... ≥ p_n ≥ 0 and sequences (P_1, P_2, P_3, ...) such that n ≥ P_1 ≥ P_2 ≥ P_3 ≥ ... ≥ 0, with the property that p_1 + p_2 + ... + p_n = P_1 + P_2 + P_3 + ... . In other words, partitions into at most n parts correspond to partitions into parts not exceeding n.]

16. [M25] (L. Euler.) Prove the following identities by interpreting both sides of the equations in terms of partitions: …

18. [M30] (T. Hibbard, CACM 6 (1963), 210.) Let n > 0, and assume that a sequence of 2^n n-bit integers X_0, ..., X_{2^n - 1} has been generated at random, where each bit of each number is independently equal to 1 with probability p. Consider the sequence X_0 ⊕ 0, X_1 ⊕ 1, ..., X_{2^n - 1} ⊕ (2^n - 1), where ⊕ denotes the "exclusive or" operation on the binary representations. Thus if p = 0, the sequence is 0, 1, ..., 2^n - 1, …

19. [M28] (C. Meyer.) When m is relatively prime to n, we know that the sequence (m mod n)(2m mod n) ... ((n - 1)m mod n) is a permutation of {1, 2, ..., n - 1}. Show that the number of inversions of this permutation can be expressed in terms of Dedekind sums (see Section 3.3.3).

20. [M43] The following famous identity due to Jacobi [Fundamenta Nova Theoriæ Functionum Ellipticarum (1829), §64] is the basis of many remarkable relationships involving elliptic functions:

$$\prod_{k \ge 1} (1 - u^k v^{k-1})(1 - u^{k-1} v^k)(1 - u^k v^k) = \sum_{-\infty < j < \infty} (-1)^j u^{\binom{j}{2}} v^{\binom{j+1}{2}}.$$

For example, if we set u = z, v = z^2, we obtain Euler's formula of exercise 14. If we set z = √(u/v), q = √(uv), we obtain

$$\prod_{k \ge 1} (1 - q^{2k-1} z)(1 - q^{2k-1} z^{-1})(1 - q^{2k}) = \sum_{-\infty < n < \infty} (-1)^n z^n q^{n^2}.$$

Is there a combinatorial proof of Jacobi's identity, analogous to Franklin's proof of the special case in exercise 14? (Thus we want to consider "complex partitions"

m + ni = (p_1 + q_1 i) + (p_2 + q_2 i) + ... + (p_k + q_k i) …

21. [M25] (G. D. Knott.) Show that the permutation a_1 ... a_n is obtainable with a stack, in the sense of exercise 2.2.1–5 or 2.3.1–6, if and only if C_j ≤ C_{j+1} + 1 for 1 ≤ j < n in the notation of exercise 7.

22. [M26] Given a permutation a_1 a_2 ... a_n of {1, 2, ..., n}, let h_j be the number of indices i < j such that a_i ∈ {a_j + 1, a_j + 2, ..., a_{j+1}}. (If a_{j+1} < a_j, the elements of this set "wrap around" from n to 1. When j = n we use the set {a_n + 1, a_n + 2, ..., n}.) For example, the permutation 5 9 1 8 2 6 4 7 3 leads to h_1 ... h_9 = 0 0 1 2 1 4 2 4 6.

a) Prove that a_1 a_2 ... a_n can be reconstructed from the numbers h_1 h_2 ... h_n.

b) Prove that h_1 + h_2 + ... + h_n is the index of a_1 a_2 ... a_n.

23. [M27] (Russian roulette.) A group of n condemned men who prefer probability theory to number theory might choose to commit suicide by sitting in a circle and modifying Josephus's method (exercise 2) as follows: The first prisoner holds a gun and aims it at his head; with probability p he dies and leaves the circle. Then the second man takes the gun and proceeds in the same way. Play continues cyclically, with constant probability p > 0, until everyone is dead.

Let a_j = k if man k is the jth to die. Prove that the death order a_1 a_2 ... a_n occurs with a probability that is a function only of n, p, and the index of the dual permutation (n + 1 - a_n) ... (n + 1 - a_2)(n + 1 - a_1). What death order is least likely?

24. [M26] Given integers t(1) t(2) ... t(n) with t(j) ≥ j, the generalized index of a permutation a_1 a_2 ... a_n is the sum of all subscripts j such that a_j > t(a_{j+1}), plus the total number of inversions such that i < j and t(a_j) ≥ a_i > a_j. Thus when t(j) = j for all j, the generalized index is the same as the index; but when t(j) ≥ n for all j it is the number of inversions. Prove that the number of permutations whose generalized index equals k is the same as the number of permutations having k inversions. [Hint: Show that, if we take any permutation a_1 ... a_{n-1} of {1, ..., n - 1} and insert the number n in all possible places, we increase the generalized index by the numbers {0, 1, ..., n - 1} in some order.]

25. [M30] (Foata and Schützenberger.) If α = a_1 ... a_n is a permutation, let ind(α) be its index, and let inv(α) count its inversions.

a) Define a one-to-one correspondence that takes each permutation α of {1, ..., n} to a permutation f(α) that has the following two properties: (i) ind(f(α)) = inv(α); (ii) for 1 ≤ j < n, the number j appears to the left of j + 1 in f(α) if and only if it appears to the left of j + 1 in α. What permutation does your construction assign to f(α) when α = 1 9 8 2 6 3 7 4 5? For what permutation α is f(α) = 1 9 8 2 6 3 7 4 5? [Hint: If n > 1, write α = x_1 α_1 x_2 α_2 ... x_k α_k a_n, where x_1, ..., x_k are all the elements < a_n if a_1 < a_n, otherwise x_1, ..., x_k are all the elements > a_n; the other elements appear in (possibly empty) strings α_1, ..., α_k. Compare the number of inversions of h(α) = α_1 x_1 α_2 x_2 ... α_k x_k to inv(α); in this construction the number a_n does not appear in h(α).]

b) Use f to define another one-to-one correspondence g having the following two properties: (i) ind(g(α)) = inv(α); (ii) inv(g(α)) = ind(α). [Hint: Consider inverse permutations.]

26. [M25] What is the statistical correlation coefficient between the number of inversions and the index of a random permutation? (See Eq. 3.3.2–( ).)

27. [M37] Prove that, in addition to (15), there is a simple relationship between inv(a_1 a_2 ... a_n) and the n-tuple (q_1, q_2, ..., q_n). Use this fact to generalize the derivation of (17), obtaining an algebraic characterization of the bivariate generating function

$$H_n(w, z) = \sum w^{\operatorname{inv}(a_1 a_2 \ldots a_n)} z^{\operatorname{ind}(a_1 a_2 \ldots a_n)},$$

where the sum is over all n! permutations a_1 a_2 ... a_n.

28. [25] If a_1 a_2 ... a_n is a permutation of {1, 2, ..., n}, its total displacement is defined to be Σ_{j=1}^{n} |a_j - j|. Find upper and lower bounds for total displacement in terms of the number of inversions.

29. [28] If π = a_1 a_2 ... a_n and π' = a'_1 a'_2 ... a'_n are permutations of {1, 2, ..., n}, their product ππ' is a'_{a_1} a'_{a_2} ... a'_{a_n}. Let inv(π) denote the number of inversions, as in exercise 25. Show that inv(ππ') ≤ inv(π) + inv(π'), and that equality holds if and only if ππ' is "below" π' in the sense of exercise 12.

*5.1.2. Permutations of a Multiset

So far we have been discussing permutations of a set of elements; this is just a special case of the concept of permutations of a multiset. (A multiset is like a set except that it can have repetitions of identical elements. Some basic properties of multisets have been discussed in exercise 4.6.3–19.)

For example, consider the multiset

M = {a, a, a, b, b, c, d, d, d, d}, (1)

which contains 3 a's, 2 b's, 1 c, and 4 d's. We may also indicate the multiplicities of elements in another way, namely

M = {3·a, 2·b, c, 4·d}. (2)

How many permutations of M are possible? If we regarded the elements of M as distinct, by subscripting them a_1, a_2, a_3, b_1, b_2, c_1, d_1, d_2, d_3, d_4,

Date posted: 10/12/2022, 23:23
