Considering a Hardware Solution While every optimization problem is different, here are some general guidelines that can help you decide whether adding hardware to help you with your pe
Trang 1programs and system programs, as they attempted to add more and more features
to those available in the previous versions.5 This illustrates what I call the Iron Law
of Programming: whenever more resources are made available, a more ambitious project is attempted This means that optimal use of these resources is important no matter how fast or capacious the machine
Why Optimization Is Often Neglected
In view of this situation, why has optimization of application programs not gained more attention? I suspect that a major reason is the tremendous and continuing change in the relative costs of programming time and computer hardware To
illustrate this, let us examine two situations where the same efficiency
improvement is gained, but the second example occurs after twenty years of
technological improvement
In the early 1970's a programmer's starting salary was about $3 per hour, and an hour of timesharing connect time cost about $15 Therefore, if a program originally took one hour to run every day, and the programmer spent a week to reduce this time by 20%, in about eight weeks the increased speed would have paid for the programmer's time
In the late 1990's, the starting salary is in the vicinity of $15 per hour, and if we assume that the desktop computer costs $2500 and is amortized over three years, the weekly cost of running the unoptimized program is about $2.00 Holding the other assumptions constant, the optimization payback time is about 30 years!6 This long-term trend seems to favor hardware solutions for performance problems over the investment of programming effort However, the appropriate strategy for
performance improvement of a program depends on how much control you have over the hardware on which the program will run The less control you have over the hardware, the greater the advantage of software solutions to performance
problems, as illustrated by the situations below
Considering a Hardware Solution
While every optimization problem is different, here are some general guidelines that can help you decide whether adding hardware to help you with your
performance problems makes sense
1 If you are creating a system that includes both software and hardware, and you can improve the functioning of the program, ease maintenance, or speed
Trang 2up development at a small additional expense for extra hardware resources, you should almost certainly do so
A few years ago, I was involved in a classic example of this situation The project was to create a point-of-sale system, including specialized hardware and programs, that tied a number of terminals into a common database Since the part of the
database that had to be accessed rapidly could be limited to about 1 1/2 megabytes,
I suggested that we put enough memory in the database server machine to hold the entire database That way, the response time would be faster than even a very good indexing system could provide, and the programming effort would be greatly
reduced The additional expense came to about $150 per system, which was less than 3% of the price of the system Of course, just adding hardware doesn't always mean that no software optimization is needed In this case, as you will see below, adding the hardware was only the beginning of the optimization effort
2 If you are writing programs for your own use on one computer, and you can afford to buy a machine powerful enough to perform adequately, then you might very well purchase such a machine rather than optimizing the
program.7
3 If you are writing a program that will be running on more than one
computer, even if it is only for internal use at your company, the expense and annoyance of requiring the other users to upgrade their computers may
outweigh the difficulty of optimization
4 If your program is to run on many computers at your company, it is almost always a good idea to try to optimize it rather than to require all those users
to upgrade their hardware
5 If your program is to be sold in the open market to run on standard
hardware, you must try to optimize it Otherwise, the users are likely to reject the program
An excellent example of what happens when you try to make a number of users upgrade their computers is the relatively disappointing sales record (as of this writing) of the Windows NTTM operating system
In order to get any reasonable performance under version 4.0 of Windows NT (the latest extant as of this writing), you need at least 64 megabytes of memory and a Pentium II/233 processor Since almost the only people who have such machines
Trang 3are professional software developers, the sales of this program to other users have been much smaller than expected
Categories of Optimization
There are two broad categories of optimization: using a better algorithm, and
improving the implementation of the algorithm we are already using Generally,
we should replace an algorithm by a better one as soon as possible, as this usually does not hinder understanding or later modification of the program An example is the use of a distribution counting sort (see Chapter mail.htm) in preference to
Quicksort Of course, if we employ these efficient algorithms from the start of our programming effort, then we are not optimizing in the strict sense of our definition (changing a working program to make it more efficient) However, the end result is still a better program than we would have obtained otherwise.8 The second
category of optimization is the modification of the implementation of an existing algorithm to take advantage of peculiarities of the environment in which it is
running or of the characteristics of the data to which it is applied This type of modification often has the unfortunate side effect of making the algorithm much harder to understand or modify in the future; it also impairs portability among different hardware and software architectures Therefore, such an optimization should be postponed until the last possible moment, in order to reduce its negative effects on the development and maintenance of the program This is an application
of the First Law of Optimization: don't try to optimize a program (in the strict sense of modifying an existing program) until it is working correctly
Finding the Critical Resource
It may seem obvious that, before you can optimize a program, you have to know what is making it inefficient Of course, if you run out of memory or disk space while executing the program, this determination becomes much simpler
Depending on which language and machine you are using, there may be "profiling" tools available which allow you to determine where your program is spending most
of its time These, of course, are most useful when the problem is CPU time, but even if that is not the problem, you may still be able to find out that, e.g., your program is spending 95% of its time in the disk reading and/or writing routines This is a valuable clue to where the problem lies
However, even if no profiler is available for your system, it isn't hard to gather some useful information yourself One way to do this is to insert a call to a system
Trang 4timer routine (such as clock() in ANSI C) at the beginning of the segment to be timed and another call at the end of the segment and subtract the two times
Depending on the resolution of the timer, the length of the routine, and the speed of your processor, you may have to execute the segment more than once to gather any useful information This is illustrated in the real-life example below
Determining How Much Optimization Is Needed
Sometimes your task is simply to make the program run as fast (or take as little memory) as possible In this case, you must use the most effective optimization available, regardless of the effort involved However, you often have (or can get) a specific target, such as a memory budget of 1.4 megabytes for your data If you can achieve this goal by relatively simple means, it would be a waste of your time to try to squeeze the last few kilobytes out of the data by a fancier compression
algorithm In fact, it may be worse than a waste of time; the simpler algorithm may also have other desirable characteristics (other than its raw performance)
A good example of such a simple algorithm is the Radix40 data compression method (see Chapter superm.htm, Figures radix40.00 through radix40.03), which
is, on average, considerably less effective at reducing the size of a data file than a number of other data compression algorithms On the other hand, it is quite fast, requires very little storage, and always produces the same amount of output for the same amount of input, which means that its compression efficiency can be
calculated exactly in advance The (statistically) more effective routines such as those using arithmetic coding take more memory and more time, and they
generally produce different amounts of output for a given amount of input
(depending on what has gone before), so that their compression efficiency cannot
be predicted exactly (In fact, in rare circumstances, they produce output that is larger than the input.) This context dependence also means that they are more difficult to use in applications where random access to compressed data is needed
The moral is that you should use the simplest algorithm that will meet your needs
If you can define your needs precisely, you probably won't have to implement as sophisticated an algorithm to solve your problem, which will leave you more time
to work on other areas that need improvement
A Real-Life Example
The point-of-sale database program I mentioned earlier is an excellent example of the fact that optimization rarely follows a straight-line path The first problem was
Trang 5the speed of access to the database by multiple users Since the software is supplied with specialized hardware, it was reasonable to solve this problem by adding
enough memory to hold the portion of the database that requires rapid access My employer determined that 15,000 invoice records and 5000 customer records
would be sufficient, resulting in a memory requirement of about 1.25 megabytes The expense of this amount of memory was within reason
Unfortunately, that part of memory (conventional memory) which allows normal
program access is limited to 640 kilobytes on IBM-compatible systems running the MS-DOS operating system, the environment in which this program operated
While our actual hardware allowed for more memory, it could be referenced only
as expanded memory, which cannot be allocated as simply as conventional
memory
Luckily, the problem of using expanded memory for data storage has been
addressed by libraries of routines which allow any particular 16 Kbyte "page" of expanded memory to be loaded when the data in it are required Storing the records
in expanded memory solved the speed problem by eliminating excessive disk I/O
However, the amount of expanded memory required was very close to the total available In the very likely event of adding even a single field to the record
definition, there would not be enough room for all the records Therefore, I had to consider ways to reduce the space taken by these records
The makeup of the database is important to the solution of this new problem It consists almost entirely of 15,000 invoice records of approximately 35 bytes each and 5000 customer records of approximately 145 bytes each Fortunately, the
majority of the fields in the customer record contained only uppercase alphabetic characters, numeric digits, a few special characters (".", ",", and "-"), and spaces This limited character set allows the use of Radix40 compression, which packs three characters into two bytes (See Chapter superm.htm for more details on this algorithm)
However, the time required to convert these fields from ASCII to Radix40
representation seemed excessive Some testing disclosed that converting 5000 records containing 6 fields of 12 characters each from ASCII to Radix40 on a 33 MHz i386 took about 40 seconds!9 Although the most common operations in this system do not require wholesale conversion, it is required in such cases as
importing an old-style database into this new system, and such inefficiency was unacceptable So my space problem had become a speed problem again