With such values, MAX-HEAPIFY will be called h times, where h is the heap height, which is the number of edges in the longest path from the root to a leaf, so its running time will be Θ(h).
Analysis: constant-time assignments plus the time for HEAP-INCREASE-KEY.
Time: O(lg n).
Min-priority queue operations are implemented similarly with min-heaps.
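As a sketch of these two operations (a hypothetical Python rendering, not the book's pseudocode; note it uses 0-based array indices instead of the text's 1-based ones):

```python
import math

def heap_increase_key(a, i, key):
    # O(lg n): raise a[i] to key, then float it up toward the root,
    # swapping with smaller parents along the way.
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:
        a[i], a[(i - 1) // 2] = a[(i - 1) // 2], a[i]
        i = (i - 1) // 2

def max_heap_insert(a, key):
    # Constant-time assignment (append a -infinity sentinel) plus
    # one call to heap_increase_key: O(lg n) total.
    a.append(-math.inf)
    heap_increase_key(a, len(a) - 1, key)
```

The standard-library `heapq` module provides the min-heap analogues of these operations.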
tree, possibly even at more than one location. Let m be the index at which the
maximum appears (the lowest such index if the maximum appears more than once).
Since the maximum is not at the root of the subtree, node m has a parent. Since the parent of a node has a lower index than the node, and m was chosen to be the smallest index of the maximum value, A[PARENT(m)] < A[m]. But by the max-heap property, we must have A[PARENT(m)] ≥ A[m]. So our assumption is false,
and the claim is true.
Solution to Exercise 6.2-6
If you put a value at the root that is less than every value in the left and right subtrees, then MAX-HEAPIFY will be called recursively until a leaf is reached. To make the recursive calls traverse the longest path to a leaf, choose values that make MAX-HEAPIFY always recurse on the left child. It follows the left branch when the left child is ≥ the right child, so putting 0 at the root and 1 at all the other nodes, for example, will accomplish that. With such values, MAX-HEAPIFY will be called h times (where h is the heap height, which is the number of edges in the longest path
from the root to a leaf), so its running time will be Θ(h) (since each call does Θ(1)
work), which is Θ(lg n). Since we have a case in which MAX-HEAPIFY's running time is Θ(lg n), its worst-case running time is Ω(lg n).
Solution to Exercise 6.3-3
Let H be the height of the heap.
Two subtleties to beware of:
• Be careful not to confuse the height of a node (longest distance from a leaf) with its depth (distance from the root).
• If the heap is not a complete binary tree (bottom level is not full), then the nodes
at a given level (depth) don't all have the same height. For example, although all
nodes at depth H have height 0, nodes at depth H − 1 can have either height 0
or height 1.
For a complete binary tree, it's easy to show that there are ⌈n/2^(h+1)⌉ nodes of
height h. But the proof for an incomplete tree is tricky and is not derived from the
proof for a complete tree.
Proof By induction on h.
For the base case (h = 0), we'll show that the # of leaves = ⌈n/2⌉.
The tree leaves (nodes at height 0) are at depths H and H − 1. They consist of
• all nodes at depth H, and
• the nodes at depth H − 1 that are not parents of depth-H nodes.
Let x be the number of nodes at depth H, that is, the number of nodes in the
bottom (possibly incomplete) level.
Note that n − x is odd, because the n − x nodes above the bottom level form a
complete binary tree, and a complete binary tree has an odd number of nodes (1
less than a power of 2). Thus if n is odd, x is even, and if n is even, x is odd.
To prove the base case, we must consider separately the case in which n is even (x is odd) and the case in which n is odd (x is even). Here are two ways to do
this: the first requires more cleverness, and the second requires more algebraic manipulation.
1. First method of proving the base case:
• If n is odd, then x is even, so all nodes have siblings, i.e., all internal
nodes have 2 children. Thus (see Exercise B.5-3), # of internal nodes =
# of leaves − 1.
So, n = # of nodes = # of leaves + # of internal nodes = 2 · # of leaves − 1. Thus, # of leaves = (n + 1)/2 = ⌈n/2⌉. (The latter equality holds because n
is odd.)
• If n is even, then x is odd, and some leaf doesn't have a sibling. If we gave
it a sibling, we would have n + 1 nodes, where n + 1 is odd, so the case
we analyzed above would apply. Observe that we would also increase the number of leaves by 1, since we added a node to a parent that already had
a child. By the odd-node case above, # of leaves + 1 = ⌈(n + 1)/2⌉ =
⌈n/2⌉ + 1. (The latter equality holds because n is even.)
In either case, # of leaves = ⌈n/2⌉.
2. Second method of proving the base case:
Note that at any depth d < H there are 2^d nodes, because all such tree levels are complete.
• If x is even, there are x/2 nodes at depth H − 1 that are parents of depth-H
nodes, hence 2^(H−1) − x/2 nodes at depth H − 1 that are not parents of depth-H
nodes. Thus,
total # of height-0 nodes = x + 2^(H−1) − x/2
= 2^(H−1) + x/2
= (2^H + x)/2
= ⌈(2^H + x − 1)/2⌉ (because x is even)
= ⌈n/2⌉.
(n = 2^H + x − 1 because the complete tree down to depth H − 1 has 2^H − 1
nodes and depth H has x nodes.)
• If x is odd, by an argument similar to the even case, we see that
# of height-0 nodes = x + 2^(H−1) − (x + 1)/2
= 2^(H−1) + (x − 1)/2
= (2^H + x − 1)/2
= n/2
= ⌈n/2⌉ (because x odd ⇒ n even).
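The ⌈n/2⌉ claim is easy to check by brute force; a small sketch (in the 1-based array representation, a node is a leaf exactly when its left child index 2i exceeds n):

```python
def count_leaves(n):
    # 1-based array heap of n nodes: node i is a leaf iff 2i > n.
    return sum(1 for i in range(1, n + 1) if 2 * i > n)

# The number of leaves is ceil(n/2) for every heap size n,
# whether or not the bottom level is full.
for n in range(1, 200):
    assert count_leaves(n) == (n + 1) // 2
```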
Inductive step: Let n_h be the number of nodes at height h in the n-node tree T.
Consider the tree T′ formed by removing the leaves of T. It has n′ = n − n₀ nodes.
We know from the base case that n₀ = ⌈n/2⌉, so n′ = n − n₀ = n − ⌈n/2⌉ = ⌊n/2⌋.
Note that the nodes at height h in T would be at height h − 1 if the leaves of the
tree were removed; that is, they are at height h − 1 in T′. Letting n′_{h−1} denote the
number of nodes at height h − 1 in T′, we have
n_h = n′_{h−1} ≤ ⌈n′/2^h⌉ = ⌈⌊n/2⌋/2^h⌉ ≤ ⌈(n/2)/2^h⌉ = ⌈n/2^(h+1)⌉,
where the middle inequality follows from the inductive hypothesis.
Solution to Exercise 6.4-1
[Figure: the sequence of heaps produced by HEAPSORT on the array ⟨5, 13, 2, 25, 7, 17, 20, 8, 4⟩; the tree diagrams are not recoverable from the extracted text.]
Solution to Exercise 6.5-2
[Figure: the sequence of heaps produced by MAX-HEAP-INSERT, starting with the new node's key set to −∞; the tree diagrams are not recoverable from the extracted text.]

Solution to Problem 6-1
a. The procedures BUILD-MAX-HEAP and BUILD-MAX-HEAP′ do not always create the same heap when run on the same input array. For example, on the input A = ⟨1, 2, 3⟩, BUILD-MAX-HEAP produces the heap ⟨3, 2, 1⟩, whereas BUILD-MAX-HEAP′ produces ⟨3, 1, 2⟩.
b. An upper bound of O(n lg n) time follows immediately from there being n − 1
calls to MAX-HEAP-INSERT, each taking O(lg n) time. For a lower bound of
Ω(n lg n), consider the case in which the input array is given in strictly increasing order. Each call to MAX-HEAP-INSERT causes HEAP-INCREASE-KEY to
go all the way up to the root. Since the depth of node i is ⌊lg i⌋, the total time is
Θ(∑_{i=1}^{n} ⌊lg i⌋) ≥ Θ((n/2) lg(n/2)) = Ω(n lg n).
In the worst case, therefore, BUILD-MAX-HEAP′ requires Θ(n lg n) time to
build an n-element heap.
Solution to Problem 6-2
a. A d-ary heap can be represented in a 1-dimensional array as follows. The root
is kept in A[1], its d children are kept in order in A[2] through A[d + 1], their
children are kept in order in A[d + 2] through A[d² + d + 1], and so on. The following two procedures map a node with index i to its parent and to its jth
child (for 1 ≤ j ≤ d), respectively.
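The index arithmetic can be sketched as follows (a hypothetical Python rendering keeping the text's 1-based indices; the closed forms are the natural ones implied by the layout described above, not quoted from the manual):

```python
def d_ary_parent(i, d):
    # ceil((i - 1) / d), written with floor division; valid for i >= 2.
    return (i - 2) // d + 1

def d_ary_child(i, j, d):
    # j-th child (1 <= j <= d) of the node at index i.
    return d * (i - 1) + j + 1
```

As a sanity check, the two maps are inverses: `d_ary_parent(d_ary_child(i, j, d), d) == i` for every valid i and j.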
c. The procedure HEAP-EXTRACT-MAX given in the text for binary heaps works
fine for d-ary heaps too. The change needed to support d-ary heaps is in
MAX-HEAPIFY, which must compare the argument node to all d children instead of just 2 children. The running time of HEAP-EXTRACT-MAX is still the running time for MAX-HEAPIFY, but that now takes worst-case time proportional to the product of the height of the heap and the number of children examined at each
node (at most d), namely Θ(d log_d n) = Θ(d lg n / lg d).
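The modified MAX-HEAPIFY might look like this (a Python sketch, 0-based, so the children of node i sit at indices d·i+1 through d·i+d):

```python
def d_ary_max_heapify(a, i, d):
    # Compare node i against all d of its children: at most d
    # comparisons per level, over at most log_d(n) levels.
    n = len(a)
    largest = i
    for j in range(d * i + 1, min(d * i + d + 1, n)):
        if a[j] > a[largest]:
            largest = j
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        d_ary_max_heapify(a, largest, d)

def d_ary_extract_max(a, d):
    # Same structure as binary HEAP-EXTRACT-MAX; O(d log_d n) overall.
    top = a[0]
    a[0] = a[-1]
    a.pop()
    if a:
        d_ary_max_heapify(a, 0, d)
    return top
```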
d. The procedure MAX-HEAP-INSERT given in the text for binary heaps works
fine for d-ary heaps too. The worst-case running time is still Θ(h), where h
is the height of the heap. (Since only parent pointers are followed, the number
of children a node has is irrelevant.) For a d-ary heap, this is Θ(log_d n) =
Θ(lg n / lg d).
e. D-ARY-HEAP-INCREASE-KEY can be implemented as a slight modification
of MAX-HEAP-INSERT (only the first couple lines are different). Increasing an element may make it larger than its parent, in which case it must be moved higher up in the tree. This can be done just as for insertion, traversing a path from the increased node toward the root. In the worst case, the entire height of the tree must be traversed, so the worst-case running time is
Θ(h) = Θ(log_d n) = Θ(lg n / lg d).
D-ARY-HEAP-INCREASE-KEY(A, i, k)
  A[i] ← max(A[i], k)
  while i > 1 and A[PARENT(i)] < A[i]
      exchange A[i] ↔ A[PARENT(i)]
      i ← PARENT(i)
Chapter 7 overview
[The treatment in the second edition differs from that of the first edition. We use
a different partitioning method, known as "Lomuto partitioning," in the second edition, rather than the "Hoare partitioning" used in the first edition. Using Lomuto partitioning helps simplify the analysis, which uses indicator random variables in the second edition.]
Quicksort
• Worst-case running time: Θ(n²).
• Expected running time: Θ(n lg n).
• Constants hidden in Θ(n lg n) are small.
• Sorts in place.
Description of quicksort
Quicksort is based on the three-step process of divide-and-conquer.
• To sort the subarray A[p..r]:
Divide: Partition A[p..r] into two subarrays A[p..q−1]
and A[q+1..r], such that each element in the first subarray A[p..q−1]
is ≤ A[q] and A[q] is ≤ each element in the second subarray A[q+1..r].
Conquer: Sort the two subarrays by recursive calls to QUICKSORT.
Combine: No work is needed to combine the subarrays, because they are sorted
in place.
• Perform the divide step by a procedure PARTITION, which returns the index q that marks the position separating the subarrays.
• PARTITION always selects the last element A[r] in the subarray A[p..r] as the
pivot (the element around which to partition).
• As the procedure executes, the array is partitioned into four regions, some of which may be empty:
Loop invariant:
1. All entries in A[p..i] are ≤ pivot.
2. All entries in A[i+1..j−1] are > pivot.
3. A[r] = pivot.
It's not needed as part of the loop invariant, but the fourth region is A[j..r−1],
whose entries have not yet been examined, and so we don't know how they compare to the pivot.
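Lomuto's PARTITION can be rendered as a Python sketch (0-based indices; the inline comment mirrors the loop invariant above):

```python
def partition(a, p, r):
    # Lomuto partitioning: the last element a[r] is the pivot.
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        # Invariant here: a[p..i] <= pivot and a[i+1..j-1] > pivot.
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    # Move the pivot between the two regions and return its index.
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1
```

On the 8-element example used later in these notes, ⟨8, 1, 6, 4, 0, 3, 9, 5⟩ with pivot 5, this returns index 4 with {1, 4, 0, 3} to the pivot's left and {8, 9, 6} to its right.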
Example: On an 8-element subarray.
[Figure: the regions at loop exit: A[p..i] known to be ≤ pivot, A[i+1..j−1] known to be > pivot.]
[The index j disappears because it is no longer needed once the for loop is exited.]
Correctness: Use the loop invariant to prove correctness of PARTITION:
Initialization: Before the loop starts, all the conditions of the loop invariant are
satisfied, because r is the pivot and the subarrays A[p..i] and A[i+1..j−1]
are empty.
Maintenance: While the loop is running, if A[j] ≤ pivot, then A[j] and A[i+1]
are swapped and then i and j are incremented. If A[j] > pivot, then increment
only j.
Termination: When the loop terminates, j = r, so all elements in A are partitioned into one of the three cases: A[p..i] ≤ pivot, A[i+1..r−1] > pivot,
and A[r] = pivot.
The last two lines of PARTITION move the pivot element from the end of the array
to between the two subarrays. This is done by swapping the pivot and the first
element of the second subarray, i.e., by swapping A[i+1] and A[r].
Time for partitioning: Θ(n) to partition an n-element subarray.
Performance of quicksort
The running time of quicksort depends on the partitioning of the subarrays:
• If the subarrays are balanced, then quicksort can run as fast as merge sort.
• If they are unbalanced, then quicksort can run as slowly as insertion sort.
Worst case
• Occurs when the subarrays are completely unbalanced.
• Have 0 elements in one subarray and n − 1 elements in the other subarray.
• Get the recurrence
T(n) = T(n − 1) + T(0) + Θ(n)
= T(n − 1) + Θ(n)
= Θ(n²).
• Same running time as insertion sort.
• In fact, the worst-case running time occurs when quicksort takes a sorted array
as input, but insertion sort runs in O(n) time in this case.
Best case
• Occurs when the subarrays are completely balanced every time.
• Each subarray has ≤ n/2 elements.
• Get the recurrence
T(n) = 2T(n/2) + Θ(n)
= Θ(n lg n).

Balanced partitioning
• Imagine that PARTITION always produces a 9-to-1 split.
• Get the recurrence
T(n) ≤ T(9n/10) + T(n/10) + Θ(n)
= O(n lg n).
• Intuition: look at the recursion tree.
• It's like the one for T(n) = T(n/3) + T(2n/3) + O(n) in Section 4.2.
• Except that here the constants are different; we get log_10 n full levels and
log_{10/9} n levels that are nonempty.
• As long as it's a constant, the base of the log doesn't matter in asymptotic notation.
• Any split of constant proportionality will yield a recursion tree of depth
Θ(lg n).
Intuition for the average case
• Splits in the recursion tree will not always be constant.
• There will usually be a mix of good and bad splits throughout the recursion tree.
• To see that this doesn't affect the asymptotic running time of quicksort, assume that levels alternate between best-case and worst-case splits.
• The extra level in the left-hand figure only adds to the constant hidden in the
Θ-notation.
• There are still the same number of subarrays to sort, and only twice as much work was done to get to that point.
• Both figures result in O(n lg n) time, though the constant for the figure on the
left is higher than that of the figure on the right.
Randomized version of quicksort
• We have assumed that all input permutations are equally likely.
• This is not always true.
• To correct this, we add randomization to quicksort.
• We could randomly permute the input array.
• Instead, we use random sampling, or picking one element at random.
• Don't always use A[r] as the pivot. Instead, randomly pick an element from the
subarray that is being sorted.
We add this randomization by not always using A[r] as the pivot, but instead
randomly picking an element from the subarray that is being sorted.
RANDOMIZED-PARTITION(A, p, r)
  i ← RANDOM(p, r)
  exchange A[r] ↔ A[i]
  return PARTITION(A, p, r)
Randomly selecting the pivot element will, on average, cause the split of the input array to be reasonably well balanced.
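Putting the pieces together as a Python sketch (Lomuto partitioning as in Section 7.1, with the pivot chosen uniformly at random from the subarray):

```python
import random

def partition(a, p, r):
    # Lomuto partitioning with pivot a[r].
    pivot, i = a[r], p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def randomized_partition(a, p, r):
    # Swap a uniformly random element into the pivot slot, then
    # partition as usual.
    i = random.randint(p, r)
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)

def randomized_quicksort(a, p, r):
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)
```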
Worst-case analysis
• Recurrence for the worst case:
T(n) = max_{0≤q≤n−1} (T(q) + T(n − q − 1)) + Θ(n).
• Guess: T(n) ≤ cn², for some c.
• Substituting our guess into the above recurrence:
T(n) ≤ max_{0≤q≤n−1} (cq² + c(n − q − 1)²) + Θ(n)
= c · max_{0≤q≤n−1} (q² + (n − q − 1)²) + Θ(n).
• The maximum value of q² + (n − q − 1)² occurs when q is either 0 or n − 1.
(Second derivative with respect to q is positive.) This means that
max_{0≤q≤n−1} (q² + (n − q − 1)²) ≤ (n − 1)² = n² − 2n + 1,
and so
T(n) ≤ cn² − c(2n − 1) + Θ(n) ≤ cn².
• Pick c so that c(2n − 1) dominates Θ(n).
• Therefore, the worst-case running time of quicksort is O(n²).
• Can also show that the recurrence's solution is Ω(n²). Thus, the worst-case
running time is Θ(n²).
Average-case analysis
• The dominant cost of the algorithm is partitioning.
• PARTITION removes the pivot element from future consideration each time.
• Thus, PARTITION is called at most n times.
• QUICKSORT recurses on the partitions.
• The amount of work that each call to PARTITION does is a constant plus the
number of comparisons that are performed in its for loop.
• Let X = the total number of comparisons performed in all calls to PARTITION.
• Therefore, the total work done over the entire execution is O(n + X).
We will now compute a bound on the overall number of comparisons.
For ease of analysis:
• Rename the elements of A as z_1, z_2, ..., z_n, with z_i being the ith smallest element.
• Define the set Z_ij = {z_i, z_{i+1}, ..., z_j} to be the set of elements between z_i and z_j, inclusive.
Now all we have to do is find the probability that two elements are compared.
• Think about when two elements are not compared.
• For example, numbers in separate partitions will not be compared.
• In the previous example, ⟨8, 1, 6, 4, 0, 3, 9, 5⟩ is partitioned around the pivot 5, so that none
of the set {1, 4, 0, 3} will ever be compared to any of the set {8, 6, 9}.
• Once a pivot x is chosen such that z_i < x < z_j, then z_i and z_j will never be compared at any later time.
• If either z_i or z_j is chosen before any other element of Z_ij, then it will be
compared to all the elements of Z_ij, except itself.
• The probability that z_i is compared to z_j is the probability that either z_i or z_j is the first element chosen from Z_ij.
• There are j − i + 1 elements, and pivots are chosen randomly and independently.
Thus, the probability that any particular one of them is the first one chosen is
1/(j − i + 1).
Therefore,
Pr{z_i is compared to z_j} = Pr{z_i or z_j is the first pivot chosen from Z_ij}
= Pr{z_i is the first pivot chosen from Z_ij}
+ Pr{z_j is the first pivot chosen from Z_ij}
= 1/(j − i + 1) + 1/(j − i + 1)
= 2/(j − i + 1).
[The second line follows because the two events are mutually exclusive.]
Substituting into the equation for E[X]:
E[X] = ∑_{i=1}^{n−1} ∑_{j=i+1}^{n} 2/(j − i + 1)
= ∑_{i=1}^{n−1} ∑_{k=1}^{n−i} 2/(k + 1)   (substituting k = j − i)
< ∑_{i=1}^{n−1} ∑_{k=1}^{n} 2/k
= ∑_{i=1}^{n−1} O(lg n)
= O(n lg n),
where the next-to-last line uses the bound on the harmonic series in equation (A.7).
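The double sum E[X] = ∑_{i<j} 2/(j − i + 1) can also be evaluated numerically to see the O(n lg n) behavior; a small sketch:

```python
import math

def expected_comparisons(n):
    # Exact expectation: sum over all pairs i < j of 2 / (j - i + 1).
    return sum(2.0 / (j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))

# The exact expectation sits below the 2n * H_n ~ 2n ln n bound.
for n in (10, 100, 1000):
    assert expected_comparisons(n) <= 2 * n * math.log(n)
```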
Solution to Exercise 7.2-3
PARTITION does a "worst-case partitioning" when the elements are in decreasing order. It reduces the size of the subarray under consideration by only 1 at each step, which we've seen has running time Θ(n²).
In particular, PARTITION, given a subarray A[p..r] of distinct elements in
decreasing order, produces an empty partition in A[p..q−1], puts the pivot
(originally in A[r]) into A[p], and produces a partition A[p+1..r] with only one fewer element than A[p..r]. The recurrence for QUICKSORT becomes T(n) =
T(n − 1) + Θ(n), which has the solution T(n) = Θ(n²).
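The quadratic behavior can be seen by counting the for-loop comparisons directly; a sketch using the Lomuto PARTITION of Section 7.1:

```python
def quicksort_count(a, p, r):
    # Sorts a[p..r] in place and returns the number of pivot
    # comparisons performed in PARTITION's for loop.
    if p >= r:
        return 0
    pivot, i = a[r], p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    q = i + 1
    return (r - p) + quicksort_count(a, p, q - 1) \
                   + quicksort_count(a, q + 1, r)

# On n distinct elements in decreasing order, each call strips off one
# element, giving (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons.
```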
Solution to Exercise 7.2-5
The minimum depth follows a path that always takes the smaller part of the partition, i.e., that multiplies the number of elements by α. One iteration reduces
the number of elements from n to αn, and i iterations reduce the number of
elements to α^i n. At a minimum-depth leaf of depth m, there is just one remaining element, and so α^m n = 1. Thus, α^m = 1/n. Taking logs, we get
m lg α = −lg n, or m = −lg n / lg α.
Similarly, maximum depth corresponds to always taking the larger part of the partition, i.e., keeping a fraction 1 − α of the elements each time. The maximum depth M is reached when there is one element left, that is, when (1 − α)^M n = 1. Thus, M = −lg n / lg(1 − α).
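The two formulas can be sanity-checked numerically; a sketch that assumes splits of exactly α and 1 − α at every level:

```python
import math

def split_depths(n, alpha):
    # Follow the smaller (alpha) side, then the larger (1 - alpha)
    # side, until one element remains; return (min_depth, max_depth).
    m, size = 0, float(n)
    while size > 1:
        size *= alpha
        m += 1
    M, size = 0, float(n)
    while size > 1:
        size *= (1 - alpha)
        M += 1
    return m, M

n, alpha = 2 ** 20, 0.25
m, M = split_depths(n, alpha)
# m matches -lg n / lg alpha = 10; M matches -lg n / lg(1 - alpha),
# rounded up to the next integer.
```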
Solution to Exercise 7.4-2
To show that quicksort's best-case running time is Ω(n lg n), we use a technique
similar to the one used in Section 7.4.1 to show that its worst-case running time
is O(n²).
Let T(n) be the best-case time for the procedure QUICKSORT on an input of size n.
We have the recurrence
T(n) = min_{1≤q≤n−1} (T(q) + T(n − q − 1)) + Θ(n).
As we'll show below, the expression q lg q + (n − q − 1) lg(n − q − 1) achieves a
minimum over the range 1 ≤ q ≤ n − 1 when q = n − q − 1, or q = (n − 1)/2, since the first derivative of the expression with respect to q is 0 when q = (n − 1)/2 and the second derivative of the expression is positive. (It doesn't matter that q is not
an integer when n is even, since we're just trying to determine the minimum value
of a function, knowing that when we constrain q to integer values, the function's
value will be no lower.)
Choosing q = (n − 1)/2 gives us the bound
T(n) ≥ 2T((n − 1)/2) + Θ(n).
The substitution method, with the guess T(n) ≥ cn lg n for a suitably small constant c > 0, shows that this recurrence has the solution T(n) = Ω(n lg n), provided the Θ(n) term dominates the
quantity 2cn + c lg(n − 1) − c. Thus, the best-case running time of quicksort is
Ω(n lg n).
Letting f(q) = q lg q + (n − q − 1) lg(n − q − 1), we now show how to find
the minimum value of this function in the range 1 ≤ q ≤ n − 1. We need to find the value of q for which the derivative of f with respect to q is 0. We rewrite this
function as
f(q) = (q ln q + (n − q − 1) ln(n − q − 1)) / ln 2,
and so its derivative is
f′(q) = (ln q − ln(n − q − 1)) / ln 2.
The derivative f′(q) is 0 when q = n − q − 1, or when q = (n − 1)/2. To verify
that q = (n − 1)/2 is indeed a minimum (not a maximum or an inflection point),
we need to check that the second derivative of f is positive at q = (n − 1)/2:
f″(q) = (1/ln 2) (1/q + 1/(n − q − 1)),
f″((n − 1)/2) = (1/ln 2) · 4/(n − 1) > 0.
Solution to Problem 7-4
a. QUICKSORT′ does exactly what QUICKSORT does, so it sorts correctly; the two differ only in how the second recursive call is made. QUICKSORT calls
itself with arguments A, p, q − 1. QUICKSORT then calls itself again, with
arguments A, q + 1, r. QUICKSORT′ instead sets p ← q + 1 and performs
another iteration of its while loop. This executes the same operations as calling
itself with A, q + 1, r, because in both cases, the first and third arguments (A
and r) have the same values as before, and p has the old value of q + 1.
b. The stack depth of QUICKSORT′ will be Θ(n) on an n-element input array if
there are Θ(n) recursive calls to QUICKSORT′. This happens if every call to PARTITION(A, p, r) returns q = r. The sequence of recursive calls in this
scenario is QUICKSORT′(A, 1, n), QUICKSORT′(A, 1, n−1), ..., QUICKSORT′(A, 1, 1), which occurs, for example, when the array is already sorted in increasing order.
c. The problem demonstrated by the scenario in part (b) is that each invocation of
QUICKSORT′ calls QUICKSORT′ again with almost the same range. To avoid such behavior, we must change QUICKSORT′ so that the recursive call is on a smaller interval of the array. The following variation of QUICKSORT′ checks which of the two subarrays returned from PARTITION is smaller and recurses
on the smaller subarray, which is at most half the size of the current array. Since the array size is reduced by at least half on each recursive call, the number of recursive calls, and hence the stack depth, is Θ(lg n) in the worst case. Note
that this method works no matter how partitioning is performed (as long as the PARTITION procedure has the same functionality as the procedure given in Section 7.1).
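The variation described above can be sketched in Python (Lomuto partitioning assumed; the while loop replaces the second recursive call, and recursion goes only to the smaller side):

```python
def partition(a, p, r):
    # Lomuto partitioning with pivot a[r].
    pivot, i = a[r], p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quicksort_bounded_stack(a, p, r):
    # Recurse only on the smaller side (size <= half the subarray),
    # and loop on the larger: stack depth is O(lg n) in the worst case.
    while p < r:
        q = partition(a, p, r)
        if q - p < r - q:
            quicksort_bounded_stack(a, p, q - 1)
            p = q + 1
        else:
            quicksort_bounded_stack(a, q + 1, r)
            r = q - 1
```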
Sorting in Linear Time
Chapter 8 overview
How fast can we sort?
We will prove a lower bound, then beat it by playing a different game
Comparison sorting
• The only operation that may be used to gain order information about a sequence
is comparison of pairs of elements.
• All sorts seen so far are comparison sorts: insertion sort, selection sort, merge sort, quicksort, heapsort, treesort.
Lower bounds for sorting
Lower bounds
• Ω(n) to examine all the input.
• All sorts seen so far are Ω(n lg n).
• We'll show that Ω(n lg n) is a lower bound for comparison sorts.
Decision tree
• Abstraction of any comparison sort.
• Represents comparisons made by
• a specific sorting algorithm
• on inputs of a given size.
• Abstracts away everything else: control and data movement.
• We're counting only comparisons.