CMSC 451 (NTU 520B): Design and Analysis of Computer Algorithms
Fall 1999 Dave Mount
Lecture 1: Course Introduction
(Thursday, Sep 2, 1999)
Reading: Chapter 1 in CLR (Cormen, Leiserson, and Rivest)
Professor Carl Smith reviewed material from Chapter 1 in CLR.
Lecture 2: Asymptotics and Summations
(Tuesday, Sep 7, 1999)
Read: Review Chapters 1, 2, and 3 in CLR (Cormen, Leiserson, and Rivest)
What is an algorithm? Our text defines an algorithm to be any well-defined computational procedure that takes some values as input and produces some values as output. Like a cooking recipe, an algorithm provides a step-by-step method for solving a computational problem. Unlike programs, algorithms are not dependent on a particular programming language, machine, system, or compiler. They are mathematical entities, which can be thought of as running on some sort of idealized computer with an infinite random access memory and an unlimited word size. Algorithm design is all about the mathematical theory behind the design of good programs.
Why study algorithm design? Programming is a very complex task. There are a number of aspects of programming that make it so complex. The first is that most programming projects are very large, requiring the coordinated efforts of many people. (This is the topic of a course like CMSC 435 in software engineering.) The next is that many programming projects involve storing and accessing large quantities of data efficiently. (This is the topic of courses on data structures and databases, like CMSC 420 and 424.) The last is that many programming projects involve solving complex computational problems, for which simplistic or naive solutions may not be efficient enough. The complex problems may involve numerical data (the subject of courses on numerical analysis, like CMSC 466), but often they involve discrete data. This is where the topic of algorithm design and analysis is important.
Although the algorithms discussed in this course will often represent only a tiny fraction of the code that is generated in a large software system, this small fraction may be very important for the success of the overall project. An unfortunately common approach to this problem is to first design an inefficient algorithm and data structure to solve the problem, and then take this poor design and attempt to fine-tune its performance. The problem is that if the underlying design is bad, then often no amount of fine-tuning is going to make a substantial difference. The focus of this course is on how to design good algorithms, and how to analyze their efficiency.
We will study a number of different techniques for designing algorithms (divide-and-conquer, dynamic programming, depth-first search), and apply them to a number of different problems.
1 Copyright, David M. Mount, 1999, Dept. of Computer Science, University of Maryland, College Park, MD, 20742. These lecture notes were prepared by David Mount for the course CMSC 451 (NTU 520B), Design and Analysis of Computer Algorithms, at the University of Maryland, College Park. Permission to use, copy, modify, and distribute these notes for educational purposes and without fee is hereby granted, provided that this copyright notice appear in all copies.
An understanding of good design techniques is critical to being able to write good programs. In addition, it is important to be able to quickly analyze the running times of these designs (without expensive prototyping and testing). We will begin with a review of the analysis techniques, which were covered in the prerequisite course, CMSC 251. See Chapters 1-3 of CLR for more information.
Asymptotics: The formulas that are derived for the running times of programs may often be quite complex. When designing algorithms, the main purpose of the analysis is to get a sense for the trend in the algorithm's running time. (An exact analysis is probably best done by implementing the algorithm and measuring CPU seconds.) We would like a simple way of representing complex functions, which captures the essential growth rate properties. This is the purpose of asymptotics.
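To see why the trend matters more than the exact formula, consider a hypothetical running-time function (the constants below are invented purely for illustration): as n grows, the leading term swamps the rest, so only the growth rate is worth tracking. A quick sketch in Python:

```python
import math

# A made-up running-time formula: T(n) = 8n^2 + 20 n log2(n) + 1000.
# For large n the 8n^2 term dominates, so T(n) grows like n^2.
def T(n):
    return 8 * n**2 + 20 * n * math.log2(n) + 1000

# The ratio T(n) / n^2 approaches the leading constant 8 as n grows,
# showing that the lower-order terms (and hence the exact formula)
# matter less and less.
for n in (10, 1000, 100000):
    print(n, T(n) / n**2)
```

At n = 100000 the ratio is already within a fraction of a percent of 8, which is the sense in which asymptotic analysis discards everything but the dominant term.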
Asymptotic analysis is based on two simplifying assumptions, which hold in most (but not all) cases. But it is important to understand these assumptions and the limitations of asymptotic analysis.
Large input sizes: We are most interested in how the running time grows for large values of n.
Ignore constant factors: The actual running time of the program depends on various constant factors in the implementation (coding tricks, optimizations in compilation, speed of the underlying hardware, etc.). Therefore, we will ignore constant factors.

The justification for considering large n is that if n is small, then almost any algorithm is fast enough. People are most concerned about running times for large inputs. For the most part, these assumptions are reasonable when making comparisons between functions that have significantly different growth rates.

[...] prefix code tree T′ made with this new set of n − 1 characters. We can convert it into a prefix code tree T for the original set of characters by undoing the previous operation and replacing z with x and y (adding a "0" bit for x and a "1" bit for y). The cost of the new tree is B(T) = B(T′) + f[x] + f[y].
Lecture 10: Greedy Algorithms: Activity Selection and Fractional Knapsack
(Tuesday, Oct 5, 1999)
Read: Sections 17.1 and 17.2 in CLR.
Activity Scheduling: Last time we showed one greedy algorithm, Huffman's algorithm. Today we consider a couple more examples. The first is called activity scheduling, and it is a very simple scheduling problem. We are given a set S = {1, 2, ..., n} of n activities that are to be scheduled to use some resource, where each activity must be started at a given start time s_i and ends at a given finish time f_i. For example, these might be lectures that are to be given in a lecture hall, where the lecture times have been set up in advance, or requests for boats to use a repair facility while they are in port.
Because there is only one resource, and some start and finish times may overlap (and two lectures cannot be given in the same room at the same time), not all the requests can be honored. We say that two activities i and j are noninterfering if their start-finish intervals do not overlap, that is, [s_i, f_i) ∩ [s_j, f_j) = ∅. The activity scheduling problem is to select a maximum-size set of mutually noninterfering activities for use of the resource. (Notice that there are many other criteria that we might have considered instead. For example, we might want to maximize the total utilization time for the facility.)
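The half-open intervals make the noninterference test a one-liner: two activities are compatible exactly when one finishes no later than the other starts. A small sketch (the function name is my own):

```python
def noninterfering(a, b):
    # a and b are (start, finish) pairs; intervals are half-open
    # [s, f), so one activity may start exactly when the other ends.
    (sa, fa), (sb, fb) = a, b
    return fa <= sb or fb <= sa

# An activity ending at time 4 does not interfere with one starting at 4.
print(noninterfering((1, 4), (4, 6)))   # True
print(noninterfering((1, 5), (4, 6)))   # False (they overlap on [4, 5))
```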
So how do we schedule the largest number of activities on the resource? Intuitively, we do not like long activities, because they occupy the resource and keep us from honoring other requests. This suggests the following greedy strategy: repeatedly select the job with the smallest duration (f_i − s_i) and schedule it, provided that it does not interfere with any previously scheduled activities. This turns out to be nonoptimal. (See Problem 17.1-3 in CLR.)
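The failure of the shortest-duration rule is easy to demonstrate concretely. In the instance below (my own, not from CLR), a short activity straddles the boundary between two long ones, so scheduling it first blocks both:

```python
# Three activities as (start, finish) pairs. Durations are 10, 2, 10.
activities = [(0, 10), (9, 11), (10, 20)]

def shortest_duration_first(acts):
    # Greedily take jobs in order of increasing duration, skipping any
    # job that interferes with an already-scheduled one.
    chosen = []
    for i in sorted(range(len(acts)), key=lambda i: acts[i][1] - acts[i][0]):
        s, f = acts[i]
        if all(f <= acts[j][0] or s >= acts[j][1] for j in chosen):
            chosen.append(i)
    return chosen

# The rule picks only activity 1 (the short one), yet activities 0 and 2
# together form a larger noninterfering schedule.
print(shortest_duration_first(activities))   # [1]
```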
Here is a simple greedy algorithm that does work. The intuition is the same. Since we do not like jobs that take a long time, let us select the job that finishes first and schedule it. Then, among all jobs that do not interfere with this first job, we schedule the one that finishes first, and so on. We begin by assuming that the activities have all been sorted by finish times, so that

    f_1 <= f_2 <= ... <= f_n

(and of course the s_i's are sorted in parallel). The pseudocode is presented below, and assumes that this sorting has already been done. The output is the list A of scheduled activities. The variable prev holds the index of the most recently scheduled activity at any time, in order to determine interferences.
Greedy Activity Scheduler

schedule(int n, int s[1..n], int f[1..n]) {
    // we assume f[1..n] is already sorted
    A = <1>; prev = 1;                // schedule activity 1 first
    for i = 2 to n {
        if (s[i] >= f[prev]) {        // no interference?
            append i to A; prev = i;  // schedule i next
        }
    }
    return A;
}
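A runnable Python version of the pseudocode above (the function name and the tuple representation are my own); unlike the pseudocode, it sorts the activities by finish time itself rather than assuming presorted input:

```python
def schedule(activities):
    # activities is a list of (start, finish) pairs; returns the indices
    # of a maximum-size set of mutually noninterfering activities.
    if not activities:
        return []
    # Sort indices by finish time (the greedy order).
    order = sorted(range(len(activities)), key=lambda i: activities[i][1])
    A = [order[0]]                    # schedule earliest-finishing activity first
    prev = order[0]
    for i in order[1:]:
        if activities[i][0] >= activities[prev][1]:   # no interference?
            A.append(i)               # schedule i next
            prev = i
    return A

print(schedule([(1, 4), (3, 5), (0, 6), (5, 7), (8, 11), (12, 14)]))
# → [0, 3, 4, 5]
```

Note that the half-open convention is reflected in the `>=` test: an activity may start exactly when the previously scheduled one finishes.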
Figure 18: Activity Scheduler. (The figure shows the input activities, then the schedule after adding activities 1, 7, and 4.)
Correctness: Our proof of correctness is based on showing that the first choice made by the algorithm is the best possible, and then using induction to show that the algorithm is globally optimal. The proof's structure is noteworthy, because many greedy correctness proofs are based on the same idea: show that any other solution can be converted into the greedy solution without increasing its cost.
Claim: Let S = {1, 2, ..., n} be a set of activities to be scheduled, sorted by increasing finish times. Then there is an optimal schedule in which activity 1 is scheduled first.
Proof: Let A be an optimal schedule. Let x be the activity in A with the smallest