PRACTICAL OPTIMIZATION
Algorithms and Engineering Applications

Andreas Antoniou and Wu-Sheng Lu
Department of Electrical and Computer Engineering
University of Victoria, Canada

Springer

Andreas Antoniou
University of Victoria, British Columbia, Canada
aantoniou@shaw.ca

Wu-Sheng Lu
University of Victoria, British Columbia, Canada
wslu@ece.uvic.ca
Library of Congress Control Number: 2007922511

Practical Optimization: Algorithms and Engineering Applications
by Andreas Antoniou and Wu-Sheng Lu

ISBN-10: 0-387-71106-6        e-ISBN-10: 0-387-71107-4
ISBN-13: 978-0-387-71106-5    e-ISBN-13: 978-0-387-71107-2

Printed on acid-free paper.

© 2007 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

springer.com
Lynne and Chi-Tang Catherine with our love
Andreas Antoniou received the Ph.D. degree in Electrical Engineering from the University of London, UK, in 1966 and is a Fellow of the IET and IEEE. He served as the founding Chair of the Department of Electrical and Computer Engineering at the University of Victoria, B.C., Canada, and is now Professor Emeritus in the same department. He is the author of Digital Filters: Analysis, Design, and Applications (McGraw-Hill, 1993) and Digital Signal Processing: Signals, Systems, and Filters (McGraw-Hill, 2005). He served as Associate Editor/Editor of the IEEE Transactions on Circuits and Systems from June 1983 to May 1987, as a Distinguished Lecturer of the IEEE Signal Processing Society in 2003, and as General Chair of the 2004 International Symposium on Circuits and Systems, and is currently serving as a Distinguished Lecturer of the IEEE Circuits and Systems Society. He received the Ambrose Fleming Premium for 1964 from the IEE (best paper award), the CAS Golden Jubilee Medal from the IEEE Circuits and Systems Society, the B.C. Science Council Chairman's Award for Career Achievement for 2000, the Doctor Honoris Causa degree from the Metsovio National Technical University of Athens, Greece, in 2002, and the IEEE Circuits and Systems Society 2005 Technical Achievement Award.

Wu-Sheng Lu received the B.S. degree in Mathematics from Fudan University, Shanghai, China, in 1964, the M.E. degree in Automation from the East China Normal University, Shanghai, in 1981, and the M.S. degree in Electrical Engineering and the Ph.D. degree in Control Science from the University of Minnesota, Minneapolis, in 1983 and 1984, respectively. He was a post-doctoral fellow at the University of Victoria, Victoria, BC, Canada, in 1985 and Visiting Assistant Professor with the University of Minnesota in 1986. Since 1987, he has been with the University of Victoria, where he is Professor. His current teaching and research interests are in the general areas of digital signal processing and application of optimization methods. He is the co-author with A. Antoniou of Two-Dimensional Digital Filters (Marcel Dekker, 1992). He served as an Associate Editor of the Canadian Journal of Electrical and Computer Engineering in 1989, and Editor of the same journal from 1990 to 1992. He served as an Associate Editor for the IEEE Transactions on Circuits and Systems, Part II, from 1993 to 1995 and for Part I of the same journal from 1999 to 2001 and from 2004 to 2005. Presently he is serving as Associate Editor for the International Journal of Multidimensional Systems and Signal Processing. He is a Fellow of the Engineering Institute of Canada and the Institute of Electrical and Electronics Engineers.
Biographies of the authors vii
Preface xv
Abbreviations xix

1 THE OPTIMIZATION PROBLEM 1
1.1 Introduction 1
1.2 The Basic Optimization Problem 4
1.3 General Structure of Optimization Algorithms 8
1.4 Constraints 10
1.5 The Feasible Region 17
1.6 Branches of Mathematical Programming 22

2.5 Necessary and Sufficient Conditions for Local Minima and Maxima 33
2.6 Classification of Stationary Points 40
2.7 Convex and Concave Functions 51
2.8 Optimization of Convex Functions 58
References 60
Problems 60

3 GENERAL PROPERTIES OF ALGORITHMS 65
3.1 Introduction 65
3.2 An Algorithm as a Point-to-Point Mapping 65
3.3 An Algorithm as a Point-to-Set Mapping 67
3.4 Closed Algorithms 68
3.5 Descent Functions 71
3.6 Global Convergence 72
References 79
Problems 79

4 ONE-DIMENSIONAL OPTIMIZATION 81
4.1 Introduction 81
4.2 Dichotomous Search 82
4.3 Fibonacci Search 85
4.4 Golden-Section Search 92
4.5 Quadratic Interpolation Method 95
4.6 Cubic Interpolation 99
4.7 The Algorithm of Davies, Swann, and Campey 101
4.8 Inexact Line Searches 106
References 114
Problems 114

5 BASIC MULTIDIMENSIONAL GRADIENT METHODS 119
5.1 Introduction 119
5.2 Steepest-Descent Method 120
5.3 Newton Method 128
5.4 Gauss-Newton Method 138
References 140
Problems 140

6 CONJUGATE-DIRECTION METHODS 145
6.1 Introduction 145
6.2 Conjugate Directions 146
6.3 Basic Conjugate-Directions Method 149
Problems 172

7 QUASI-NEWTON METHODS 175
7.1 Introduction 175
7.2 The Basic Quasi-Newton Approach 176
7.9 The Huang Family 194
7.10 Practical Quasi-Newton Algorithm 195
References 199
Problems 200

8 MINIMAX METHODS 203
8.1 Introduction 203
8.2 Problem Formulation 203
8.3 Minimax Algorithms 205
8.4 Improved Minimax Algorithms 211
References 228
Problems 228

9 APPLICATIONS OF UNCONSTRAINED OPTIMIZATION 231
9.1 Introduction 231
9.2 Point-Pattern Matching 232
9.3 Inverse Kinematics for Robotic Manipulators 237
9.4 Design of Digital Filters 247
References 260
Problems 262

10 FUNDAMENTALS OF CONSTRAINED OPTIMIZATION 265
10.1 Introduction 265
10.2 Constraints 266
10.3 Classification of Constrained Optimization Problems 273
10.4 Simple Transformation Methods 277
10.5 Lagrange Multipliers 285
10.6 First-Order Necessary Conditions 294
10.7 Second-Order Conditions 302
10.8 Convexity 308
10.9 Duality 311
References 312
Problems 313

11 LINEAR PROGRAMMING PART I: THE SIMPLEX METHOD 321
11.1 Introduction 321
11.2 General Properties 322
11.3 Simplex Method 344
References 368
Problems 368

12 LINEAR PROGRAMMING PART II: INTERIOR-POINT METHODS 373
12.1 Introduction 373
12.2 Primal-Dual Solutions and Central Path 374
12.3 Primal Affine-Scaling Method 379
12.4 Primal Newton Barrier Method 383
12.5 Primal-Dual Interior-Point Methods 388
References 402
Problems 402

13 QUADRATIC AND CONVEX PROGRAMMING 407
13.1 Introduction 407
13.2 Convex QP Problems with Equality Constraints 408
13.3 Active-Set Methods for Strictly Convex QP Problems 411
13.4 Interior-Point Methods for Convex QP Problems 417
13.5 Cutting-Plane Methods for CP Problems 428
13.6 Ellipsoid Methods 437
References 443
Problems 444

14 SEMIDEFINITE AND SECOND-ORDER CONE PROGRAMMING 449
14.1 Introduction 449
14.2 Primal and Dual SDP Problems 450
14.3 Basic Properties of SDP Problems 455
14.4 Primal-Dual Path-Following Method 458
14.5 Predictor-Corrector Method 465
14.6 Projective Method of Nemirovski and Gahinet 470
14.7 Second-Order Cone Programming 484
14.8 A Primal-Dual Method for SOCP Problems 491

16.2 Design of Digital Filters 534
16.3 Model Predictive Control of Dynamic Systems 547
16.4 Optimal Force Distribution for Robotic Systems with Closed Kinematic Loops 558
16.5 Multiuser Detection in Wireless Communication Channels 570
References 586
Problems 588

Appendices 591
A Basics of Linear Algebra 591
A.1 Introduction 591
A.2 Linear Independence and Basis of a Span 592
A.3 Range, Null Space, and Rank 593
A.4 Sherman-Morrison Formula 595
A.5 Eigenvalues and Eigenvectors 596
A.6 Symmetric Matrices 598
A.7 Trace 602
A.8 Vector Norms and Matrix Norms 602
A.9 Singular-Value Decomposition 606
A.15 Vector Spaces of Symmetric Matrices 623
A.16 Polygon, Polyhedron, Polytope, and Convex Hull 626
B.6 Time-Domain Response Using the Z Transform 635
B.7 Z-Domain Condition for Stability 635
B.8 Frequency, Amplitude, and Phase Responses 636
B.9 Design 639
Reference 644

Index 645
The rapid advancements in the efficiency of digital computers and the evolution of reliable software for numerical computation during the past three decades have led to an astonishing growth in the theory, methods, and algorithms of numerical optimization. This body of knowledge has, in turn, motivated widespread applications of optimization methods in many disciplines, e.g., engineering, business, and science, and led to problem solutions that were considered intractable not too long ago.

Although excellent books are available that treat the subject of optimization with great mathematical rigor and precision, there appears to be a need for a book that provides a practical treatment of the subject aimed at a broader audience ranging from college students to scientists and industry professionals. This book has been written to address this need. It treats unconstrained and constrained optimization in a unified manner and places special attention on the algorithmic aspects of optimization to enable readers to apply the various algorithms and methods to specific problems of interest. To facilitate this process, the book provides many solved examples that illustrate the principles involved, and includes, in addition, two chapters that deal exclusively with applications of unconstrained and constrained optimization methods to problems in the areas of pattern recognition, control systems, robotics, communication systems, and the design of digital filters. For each application, enough background information is provided to promote the understanding of the optimization algorithms used to obtain the desired solutions.
Chapter 1 gives a brief introduction to optimization and the general structure of optimization algorithms. Chapters 2 to 9 are concerned with unconstrained optimization methods. The basic principles of interest are introduced in Chapter 2. These include the first-order and second-order necessary conditions for a point to be a local minimizer, the second-order sufficient conditions, and the optimization of convex functions. Chapter 3 deals with general properties of algorithms such as the concepts of descent function, global convergence, and rate of convergence. Chapter 4 presents several methods for one-dimensional optimization, which are commonly referred to as line searches. The chapter also deals with inexact line-search methods that have been found to increase the efficiency in many optimization algorithms. Chapter 5 presents several basic gradient methods that include the steepest descent, Newton, and Gauss-Newton methods. Chapter 6 presents a class of methods based on the concept of conjugate directions such as the conjugate-gradient, Fletcher-Reeves, Powell, and Partan methods. An important class of unconstrained optimization methods known as quasi-Newton methods is presented in Chapter 7. Representative methods of this class such as the Davidon-Fletcher-Powell and Broyden-Fletcher-Goldfarb-Shanno methods and their properties are investigated. The chapter also includes a practical, efficient, and reliable quasi-Newton algorithm that eliminates some problems associated with the basic quasi-Newton method. Chapter 8 presents minimax methods that are used in many applications including the design of digital filters. Chapter 9 presents three case studies in which several of the unconstrained optimization methods described in Chapters 4 to 8 are applied to point-pattern matching, inverse kinematics for robotic manipulators, and the design of digital filters.

Chapters 10 to 16 are concerned with constrained optimization methods. Chapter 10 introduces the fundamentals of constrained optimization. The concept of Lagrange multipliers, the first-order necessary conditions known as Karush-Kuhn-Tucker conditions, and the duality principle of convex programming are addressed in detail and are illustrated by many examples. Chapters 11 and 12 are concerned with linear programming (LP) problems. The general properties of LP and the simplex method for standard LP problems are addressed in Chapter 11. Several interior-point methods including the primal affine-scaling, primal Newton-barrier, and primal dual-path following methods are presented in Chapter 12. Chapter 13 deals with quadratic and general convex programming. The so-called active-set methods and several interior-point methods for convex quadratic programming are investigated. The chapter also includes the so-called cutting-plane and ellipsoid algorithms for general convex programming problems. Chapter 14 presents two special classes of convex programming known as semidefinite and second-order cone programming, which have found interesting applications in a variety of disciplines. Chapter 15 treats general constrained optimization problems that do not belong to the class of convex programming; special emphasis is placed on several sequential quadratic programming methods that are enhanced through the use of efficient line searches and approximations of the Hessian matrix involved. Chapter 16, which concludes the book, examines several applications of constrained optimization for the design of digital filters, for the control of dynamic systems, for evaluating the force distribution in robotic systems, and in multiuser detection for wireless communication systems.
The book also includes two appendices, A and B, which provide additional support material. Appendix A deals in some detail with the relevant parts of linear algebra to consolidate the understanding of the underlying mathematical principles involved whereas Appendix B provides a concise treatment of the basics of digital filters to enhance the understanding of the design algorithms included in Chaps. 8, 9, and 16.

The book can be used as a text for a sequence of two one-semester courses on optimization. The first course, comprising Chaps. 1 to 7, 9, and part of Chap. 10, may be offered to senior undergraduate or first-year graduate students. The prerequisite knowledge is an undergraduate mathematics background of calculus and linear algebra. The material in Chaps. 8 and 10 to 16 may be used as a text for an advanced graduate course on minimax and constrained optimization. The prerequisite knowledge for this course is the contents of the first optimization course.

The book is supported by online solutions of the end-of-chapter problems under password as well as by a collection of MATLAB programs for free access by the readers of the book, which can be used to solve a variety of optimization problems. These materials can be downloaded from the book's website: http://www.ece.uvic.ca/~optimization/
We are grateful to many of our past students at the University of Victoria, in particular, Drs. M. L. R. de Campos, S. Netto, S. Nokleby, D. Peters, and Mr. J. Wong, who took our optimization courses and have helped improve the manuscript in one way or another; to Chi-Tang Catherine Chang for typesetting the first draft of the manuscript and for producing most of the illustrations; to R. Nongpiur for checking a large part of the index; and to R. Ramachandran for proofreading the entire manuscript. We would also like to thank Professors M. Ahmadi, C. Charalambous, P. S. R. Diniz, Z. Dong, T. Hinamoto, and P. P. Vaidyanathan for useful discussions on optimization theory and practice; Tony Antoniou of Psicraft Studios for designing the book cover; the Natural Sciences and Engineering Research Council of Canada for supporting the research that led to some of the new results described in Chapters 8, 9, and 16; and last but not least the University of Victoria for supporting the writing of this book over a number of years.

Andreas Antoniou and Wu-Sheng Lu
AWGN    additive white Gaussian noise
BER     bit-error rate
BFGS    Broyden-Fletcher-Goldfarb-Shanno
CDMA    code-division multiple access
CMBER   constrained minimum BER
FDMA    frequency-division multiple access
FIR     finite-duration impulse response
LCP     linear complementarity problem
LMI     linear matrix inequality
MPC     model predictive control
PAS     primal affine-scaling
SOCP    second-order cone programming
SQP     sequential quadratic programming
SVD     singular-value decomposition
TDMA    time-division multiple access
... of birds. When the circumstances were appropriate, the timing was thought to be auspicious (or optimum) for planting the crops or embarking on a war.

As the ages advanced and the age of reason prevailed, unscientific rituals were replaced by rules of thumb and later, with the development of mathematics, mathematical calculations began to be applied.

Interest in the process of optimization has taken a giant leap with the advent of the digital computer in the early fifties. In recent years, optimization techniques advanced rapidly and considerable progress has been achieved. At the same time, digital computers became faster, more versatile, and more efficient. As a consequence, it is now possible to solve complex optimization problems which were thought intractable only a few years ago.
The process of optimization is the process of obtaining the ‘best’, if it is possible to measure and change what is ‘good’ or ‘bad’. In practice, one wishes the ‘most’ or ‘maximum’ (e.g., salary) or the ‘least’ or ‘minimum’ (e.g., expenses). Therefore, the word ‘optimum’ is taken to mean ‘maximum’ or ‘minimum’ depending on the circumstances; ‘optimum’ is a technical term which implies quantitative measurement and is a stronger word than ‘best’ which is more appropriate for everyday use. Likewise, the word ‘optimize’, which means to achieve an optimum, is a stronger word than ‘improve’. Optimization theory is the branch of mathematics encompassing the quantitative study of optima and methods for finding them. Optimization practice, on the other hand, is the collection of techniques, methods, procedures, and algorithms that can be used to find the optima.

Optimization problems occur in most disciplines like engineering, physics, mathematics, economics, administration, commerce, social sciences, and even politics. Optimization problems abound in the various fields of engineering like electrical, mechanical, civil, chemical, and building engineering. Typical areas of application are modeling, characterization, and design of devices, circuits, and systems; design of tools, instruments, and equipment; design of structures and buildings; process control; approximation theory, curve fitting, solution of systems of equations; forecasting, production scheduling, quality control; maintenance and repair; inventory control, accounting, budgeting, etc. Some recent innovations rely almost entirely on optimization theory, for example, neural networks and adaptive systems.
Most real-life problems have several solutions and occasionally an infinite number of solutions may be possible. Assuming that the problem at hand admits more than one solution, optimization can be achieved by finding the best solution of the problem in terms of some performance criterion. If the problem admits only one solution, that is, only a unique set of parameter values is acceptable, then optimization cannot be applied.
Several general approaches to optimization are available, as follows:

1. Analytical methods
2. Graphical methods
3. Experimental methods
4. Numerical methods

Analytical methods are based on the classical techniques of differential calculus. In these methods the maximum or minimum of a performance criterion is determined by finding the values of parameters x1, x2, ..., xn that cause the derivatives of f(x1, x2, ..., xn) with respect to x1, x2, ..., xn to assume zero values. The problem to be solved must obviously be described in mathematical terms before the rules of calculus can be applied. The method need not entail the use of a digital computer. However, it cannot be applied to highly nonlinear problems or to problems where the number of independent parameters exceeds two or three.
A graphical method can be used to plot the function to be maximized or minimized if the number of variables does not exceed two. If the function depends on only one variable, say, x1, a plot of f(x1) versus x1 will immediately reveal the maxima and/or minima of the function. Similarly, if the function depends on only two variables, say, x1 and x2, a set of contours can be constructed. A contour is a set of points in the (x1, x2) plane for which f(x1, x2) is constant, and so a contour plot, like a topographical map of a specific region, will reveal readily the peaks and valleys of the function. For example, the contour plot of f(x1, x2) depicted in Fig. 1.1 shows that the function has a minimum at point A. Unfortunately, the graphical method is of limited usefulness since in most practical applications the function to be optimized depends on several variables, usually in excess of four.

Figure 1.1. Contour plot of f(x1, x2).
The optimum performance of a system can sometimes be achieved by direct experimentation. In this method, the system is set up and the process variables are adjusted one by one and the performance criterion is measured in each case. This method may lead to optimum or near optimum operating conditions. However, it can lead to unreliable results since in certain systems, two or more variables interact with each other, and must be adjusted simultaneously to yield the optimum performance criterion.

The most important general approach to optimization is based on numerical methods. In this approach, iterative numerical procedures are used to generate a series of progressively improved solutions to the optimization problem, starting with an initial estimate for the solution. The process is terminated when some convergence criterion is satisfied, for example, when changes in the independent variables or the performance criterion from iteration to iteration become insignificant.

Numerical methods can be used to solve highly complex optimization problems of the type that cannot be solved analytically. Furthermore, they can be readily programmed on the digital computer. Consequently, they have all but replaced most other approaches to optimization.

The discipline encompassing the theory and practice of numerical optimization methods has come to be known as mathematical programming [1]–[5]. During the past 40 years, several branches of mathematical programming have evolved, as follows:

1. Linear programming
2. Integer programming
3. Quadratic programming
4. Nonlinear programming
5. Dynamic programming
1.2 The Basic Optimization Problem

Before optimization is attempted, the problem at hand must be properly formulated. A performance criterion F must be derived in terms of n parameters x1, x2, ..., xn as

F = f(x1, x2, ..., xn)

F is a scalar quantity which can assume numerous forms. It can be the cost of a product in a manufacturing environment or the difference between the desired performance and the actual performance in a system. Variables x1, x2, ..., xn are the parameters that influence the product cost in the first case or the actual performance in the second case. They can be independent variables, like time, or control parameters that can be adjusted.

The most basic optimization problem is to adjust variables x1, x2, ..., xn in such a way as to minimize quantity F. This problem can be stated mathematically as

minimize F = f(x1, x2, ..., xn)     (1.1)

Quantity F is usually referred to as the objective or cost function.

The objective function may depend on a large number of variables, sometimes as many as 100 or more. To simplify the notation, matrix notation is usually employed. If x is a column vector with elements x1, x2, ..., xn, the transpose of x, namely, xT, can be expressed as the row vector

xT = [x1 x2 · · · xn]

On many occasions, the optimization problem consists of finding the maximum of the objective function. Since

max f(x) = −min[−f(x)]

the maximum of f(x) can be obtained by finding the minimum of −f(x), and so minimization techniques can also be used to solve maximization problems.
In many applications, a number of distinct functions of x need to be optimized simultaneously. For example, if the system of nonlinear simultaneous equations

fi(x) = 0   for i = 1, 2, ..., m

needs to be solved, a vector x is sought which will reduce all fi(x) to zero simultaneously. In such a problem, the functions to be optimized can be used to construct a vector

F(x) = [f1(x) f2(x) · · · fm(x)]T

The problem can be solved by finding a point x = x∗ such that F(x∗) = 0. Very frequently, a point x∗ that reduces all the fi(x) to zero simultaneously may not exist but an approximate solution, i.e., F(x∗) ≈ 0, may be available which could be entirely satisfactory in practice.

A similar problem arises in scientific or engineering applications when the function of x that needs to be optimized is also a function of a continuous independent parameter (e.g., time, position, speed, frequency) that can assume an infinite set of values in a specified range. The optimization might entail adjusting variables x1, x2, ..., xn so as to optimize the function of interest, say f(x, t), over a given range of the independent parameter. In such an application, the function of interest can be sampled with respect to the independent parameter, and a vector of the form

F(x) = [f(x, t1) f(x, t2) · · · f(x, tm)]T

can be constructed. A solution of such a problem can be obtained by optimizing functions fi(x) = f(x, ti) for i = 1, 2, ..., m simultaneously. Such a solution would, of course, be approximate because any variations in f(x, t) between sample points are ignored. Nevertheless, reasonable solutions can be obtained in practice by using a sufficiently large number of sample points. This approach is illustrated by the following example.
Example 1.1 The step response y(x, t) of an nth-order control system is required to satisfy a given specification as closely as possible. Construct a vector F(x) that can be used to obtain a function f(x, t) representing the approximation error.

Solution The difference between the actual and specified step responses, which constitutes the approximation error, can be expressed as

f(x, t) = y(x, t) − y0(x, t)

and if f(x, t) is sampled at t = 0, 1, 2, ..., 5, we obtain

F(x) = [f(x, 0) f(x, 1) · · · f(x, 5)]T

The problem is illustrated in Fig. 1.2. It can be solved by finding a point x = x∗ such that F(x∗) ≈ 0. Evidently, the quality of the approximation obtained for the step response of the system will depend on the density of the sampling points and the higher the density of points, the better the approximation.

Problems of the type just described can be solved by defining a suitable objective function in terms of the element functions of F(x). The objective function must be a scalar quantity and its optimization must lead to the simultaneous optimization of the element functions of F(x) in some sense. Consequently, a norm of some type must be used. An objective function can be defined in terms of an Lp norm of F(x), namely,

F = ||F(x)||p = [ |f1(x)|^p + |f2(x)|^p + · · · + |fm(x)|^p ]^(1/p)

where p is a positive integer. If p = 2, the Euclidean (L2) norm of F(x) is minimized, and if the square root is omitted, the sum of the squares is minimized. Such a problem is commonly referred to as a least-squares problem. (See Appendix A for the basic principles of linear algebra that are important to optimization.)
In the case where p = ∞, if we assume that there is a unique maximum of |fi(x)| designated F̂ such that

F̂ = max{|f1(x)|, |f2(x)|, ..., |fm(x)|}

then the Lp norm can be expressed as

F = ||F(x)||p = F̂ [ (|f1(x)|/F̂)^p + (|f2(x)|/F̂)^p + · · · + (|fm(x)|/F̂)^p ]^(1/p)

Since all the terms in the summation except one are less than unity, they tend to zero when raised to a large positive power. Therefore, we obtain

lim (p → ∞) ||F(x)||p = F̂ = max{|f1(x)|, |f2(x)|, ..., |fm(x)|}

Evidently, if the L∞ norm is used in Example 1.1, the maximum approximation error is minimized and the problem is said to be a minimax problem.
Often the individual element functions of F(x) are modified by using constants w1, w2, ..., wm as weights. For example, the least-squares objective function can be expressed as

F = [w1f1(x)]² + [w2f2(x)]² + · · · + [wmfm(x)]²

so as to emphasize important or critical element functions and de-emphasize unimportant or uncritical ones. If F is minimized, the residual errors in wifi(x) at the end of the minimization would tend to be of the same order of magnitude, i.e.,

error in |wifi(x)| ≈ ε

and so

error in |fi(x)| ≈ ε / |wi|

Consequently, if a large positive weight wi is used with fi(x), a small residual error is achieved in |fi(x)|.
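To make the choice of norm and the role of the weights concrete, the following Python sketch evaluates the weighted L1, least-squares, and minimax objectives for a given set of element-function values fi(x). It is not part of the book; the residual values and weights are made-up numbers used only for illustration.

```python
import numpy as np

def weighted_objectives(fvec, w):
    """Given element-function values f_i(x) and weights w_i, return the
    weighted L1, least-squares (L2 with the square root omitted), and
    minimax (L-infinity) objective values."""
    r = w * fvec                      # weighted residuals w_i * f_i(x)
    F1 = np.sum(np.abs(r))            # L1 objective: sum of magnitudes
    F2 = np.sum(r ** 2)               # least-squares objective: sum of squares
    Finf = np.max(np.abs(r))          # minimax objective: largest magnitude
    return F1, F2, Finf

# Made-up residuals f_i(x) at some trial point x, with one weight increased
# to emphasize the corresponding element function:
fvec = np.array([0.30, -0.15, 0.05, 0.20])
w = np.array([1.0, 1.0, 1.0, 5.0])
print(weighted_objectives(fvec, w))
```

With the larger weight on the last element function, a minimization of any of these objectives would drive that residual down harder than the others, which is precisely the effect described above.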
1.3 General Structure of Optimization Algorithms

Most of the available optimization algorithms entail a series of steps which are executed sequentially. A typical pattern is as follows:

Algorithm 1.1 General optimization algorithm
Step 3: Check if convergence has been achieved by using an appropriate criterion, e.g., by checking ∆Fk and/or ∆xk. If this is the case, continue to Step 4; otherwise, go to Step 2.

If no information about the solution is available in Step 1, an arbitrary solution may be assumed, say, x0 = 0. Steps 2 and 3 are then executed repeatedly until convergence is achieved. Each execution of Steps 2 and 3 constitutes one iteration, that is, k is the number of iterations.

When convergence is achieved, Step 4 is executed. In this step, the column vector

x∗ = [x1∗ x2∗ · · · xn∗]T and F∗ = f(x∗)

are output. The column vector x∗ is said to be the optimum, minimum, or solution point, and F∗ is said to be the optimum or minimum value of the objective function. The pair x∗ and F∗ constitute the solution of the optimization problem.
Convergence can be checked in several ways, depending on the optimization problem and the optimization technique used. For example, one might decide to stop the algorithm when the reduction in Fk between any two iterations has become insignificant, that is,

|∆Fk| = |Fk−1 − Fk| ≤ εF     (1.2)

where εF is an optimization tolerance for the objective function. Alternatively, one might decide to stop the algorithm when the changes in all variables have become insignificant, that is,

|∆xi| ≤ εx   for i = 1, 2, ..., n     (1.3)

where εx is an optimization tolerance for variables x1, x2, ..., xn. A third possibility might be to check if both criteria given by Eqs. (1.2) and (1.3) are satisfied simultaneously.
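As a rough illustration of this general structure, the Python sketch below implements the iterative pattern of Algorithm 1.1 with the two termination criteria applied simultaneously. The update rule, the quadratic test function, and the tolerance values are assumptions made only for illustration; they are not a method prescribed by the book.

```python
import numpy as np

def general_optimizer(f, update, x0, eps_F=1e-6, eps_x=1e-6, max_iter=1000):
    """Iterative pattern of Algorithm 1.1: repeat Steps 2 and 3 until the
    reduction in the objective (Eq. 1.2) and the change in the variables
    (Eq. 1.3) are both below the prescribed tolerances."""
    xk = np.asarray(x0, dtype=float)   # Step 1: initial point x0
    Fk = f(xk)
    for k in range(max_iter):
        xk1 = update(xk)               # Step 2: compute an improved point
        Fk1 = f(xk1)
        dF = abs(Fk - Fk1)             # reduction in the objective
        dx = np.max(np.abs(xk1 - xk))  # largest change in any variable
        xk, Fk = xk1, Fk1
        if dF < eps_F and dx < eps_x:  # Step 3: convergence check
            break
    return xk, Fk                      # Step 4: output x* and F* = f(x*)

# Made-up test: f(x) = (x1 - 2)^2 + x2^2 + 4 minimized with a fixed-step
# gradient update (the update rule is an assumption, not part of Algorithm 1.1).
f = lambda x: (x[0] - 2.0) ** 2 + x[1] ** 2 + 4.0
grad = lambda x: np.array([2.0 * (x[0] - 2.0), 2.0 * x[1]])
x_star, F_star = general_optimizer(f, lambda x: x - 0.1 * grad(x), [0.0, 0.0])
print(x_star, F_star)
```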
There are numerous algorithms for the minimization of an objective function. However, we are primarily interested in algorithms that entail the minimum amount of effort. Therefore, we shall focus our attention on algorithms that are simple to apply, are reliable when applied to a diverse range of optimization problems, and entail a small amount of computation. A reliable algorithm is often referred to as a ‘robust’ algorithm in the terminology of mathematical programming.
1.4 Constraints

In many optimization problems, the variables are interrelated by physical laws like the conservation of mass or energy, Kirchhoff’s voltage and current laws, and other system equalities that must be satisfied. In effect, in these problems certain equality constraints of the form

ai(x) = 0

where i = 1, 2, ..., p must be satisfied before the problem can be considered solved. In other optimization problems a collection of inequality constraints might be imposed on the variables or parameters to ensure physical realizability, reliability, compatibility, or even to simplify the modeling of the problem. For example, the power dissipation might become excessive if a particular current in a circuit exceeds a given upper limit or the circuit might become unreliable if another current is reduced below a lower limit, the mass of an element in a specific chemical reaction must be positive, and so on. In these problems, a collection of inequality constraints of the form

cj(x) ≥ 0

where j = 1, 2, ..., q must be satisfied before the optimization problem can be considered solved.

An optimization problem may entail a set of equality constraints and possibly a set of inequality constraints. If this is the case, the problem is said to be a constrained optimization problem. The most general constrained optimization problem can be expressed mathematically as

minimize f(x) for x ∈ En     (1.4a)
subject to: ai(x) = 0   for i = 1, 2, ..., p     (1.4b)
            cj(x) ≥ 0   for j = 1, 2, ..., q     (1.4c)

A problem that does not entail any equality or inequality constraints is said to be an unconstrained optimization problem.

Constrained optimization is usually much more difficult than unconstrained optimization, as might be expected. Consequently, the general strategy that has evolved in recent years towards the solution of constrained optimization problems is to reformulate constrained problems as unconstrained optimization problems. This can be done by redefining the objective function such that the constraints are simultaneously satisfied when the objective function is minimized. Some real-life constrained optimization problems are given as Examples 1.2 to 1.4 below.
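Before turning to these examples, one common way of carrying out such a reformulation is to augment the objective function with a penalty term that grows with the constraint violations. The following Python sketch illustrates the idea in its simplest quadratic form; the penalty weight and the small test problem are assumptions used only for illustration, not a method prescribed in this chapter.

```python
import numpy as np

def penalized_objective(f, eqs, ineqs, mu):
    """Turn a constrained problem into an unconstrained one with a quadratic
    penalty: equality violations a_i(x) and inequality violations min(0, c_j(x))
    are squared, scaled by mu, and added to f(x)."""
    def F(x):
        p = sum(a(x) ** 2 for a in eqs)            # equality violations
        p += sum(min(0.0, c(x)) ** 2 for c in ineqs)  # inequality violations
        return f(x) + mu * p
    return F

# Hypothetical usage: minimize x1^2 + x2^2 subject to x1 + x2 - 1 = 0.
F = penalized_objective(lambda x: x[0] ** 2 + x[1] ** 2,
                        eqs=[lambda x: x[0] + x[1] - 1.0],
                        ineqs=[], mu=1e4)
print(F(np.array([0.5, 0.5])))   # feasible point, so only f(x) contributes
```

Minimizing F with any unconstrained method then drives the constraint violations toward zero while reducing f(x), which is the essence of the strategy described above.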
Example 1.2 Consider a control system that comprises a double inverted pendulum as depicted in Fig. 1.3. The objective of the system is to maintain the pendulum in the upright position using the minimum amount of energy. This is achieved by applying an appropriate control force to the car to damp out any displacements θ1(t) and θ2(t). Formulate the problem as an optimization problem.

Figure 1.3. The double inverted pendulum.

Solution The dynamic equations of the system are nonlinear and the standard practice is to apply a linearization technique to these equations to obtain a small-signal linear model of the system as [6]

ẋ(t) = Ax(t) + fu(t)     (1.5)

where ẋ(t), θ̇1(t), and θ̇2(t) represent the first derivatives of x(t), θ1(t), and θ2(t), respectively, with respect to time, θ̈1(t) and θ̈2(t) would be the second derivatives of θ1(t) and θ2(t), and parameters α and β depend on system parameters such as the length and weight of each pendulum, the mass of the car, etc. Suppose that at instant t = 0 small nonzero displacements θ1(t) and θ2(t) occur, which would call for immediate control action in order to steer the system back to the equilibrium state x(t) = 0 at time t = T0. In order to develop a digital controller, the system model in (1.5) is discretized to become

x(k + 1) = Φx(k) + gu(k)     (1.6)

where Φ = I + ∆tA, g = ∆tf, ∆t is the sampling interval, and I is the identity matrix. Let x(0) ≠ 0 be given and assume that T0 is a multiple of ∆t, i.e., T0 = K∆t where K is an integer. We seek to find a sequence of control actions u(k) for k = 0, 1, ..., K − 1 such that the zero equilibrium state is achieved at t = T0, i.e., x(T0) = 0.

Let us assume that the energy consumed by these control actions, namely,

E = u²(0) + u²(1) + · · · + u²(K − 1)     (1.7a)

is to be minimized subject to the terminal condition

x(T0) = 0     (1.7b)

From Eq. (1.6), we know that the state of the system at t = K∆t is determined by the initial value of the state and the system model in Eq. (1.6) as

x(K∆t) = Φ^K x(0) + Φ^(K−1)gu(0) + Φ^(K−2)gu(1) + · · · + gu(K − 1)

If we let h = −Φ^K x(0) and gk = Φ^(K−k−1)g, constraint (1.7b) is equivalent to

g0u(0) + g1u(1) + · · · + gK−1u(K − 1) = h     (1.8)

If we define u = [u(0) u(1) · · · u(K − 1)]T and G = [g0 g1 · · · gK−1], then the constraint in Eq. (1.8) can be expressed as Gu = h, and the optimal control problem at hand can be formulated as the problem of finding a u that solves the minimization problem

minimize uTu
subject to: Gu = h     (1.9)

In practice, the control actions cannot be arbitrarily large in magnitude. Consequently, additional constraints are often imposed on |u(i)|, for instance, an upper bound on the magnitude of each control action.

Obviously, the problem in Eq. (1.9) fits nicely into the standard form of optimization problems given by Eq. (1.4).
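Ignoring the bounds on |u(i)|, the problem in Eq. (1.9) can be solved by the minimum-norm solution of Gu = h. The following Python sketch builds G and h exactly as defined above and computes that solution; the two-state model matrices Φ and g, the initial state, and the horizon K are made-up values for illustration, not the pendulum model of the book.

```python
import numpy as np

def min_energy_control(Phi, g, x0, K):
    """Build G = [g_0 g_1 ... g_{K-1}] with g_k = Phi^(K-k-1) g and
    h = -Phi^K x0, then return the minimum-norm u satisfying G u = h,
    which minimizes u^T u subject to the terminal constraint."""
    G = np.column_stack([np.linalg.matrix_power(Phi, K - k - 1) @ g
                         for k in range(K)])
    h = -np.linalg.matrix_power(Phi, K) @ x0
    return np.linalg.pinv(G) @ h       # minimum-norm solution of G u = h

# Made-up two-state model and initial displacement (illustration only):
Phi = np.array([[1.0, 0.1],
                [0.05, 1.0]])
g = np.array([0.0, 0.1])
x0 = np.array([0.02, 0.0])
u = min_energy_control(Phi, g, x0, K=10)
print(u, np.sum(u ** 2))               # control sequence and its energy
```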
Example 1.3 High performance in modern optical instruments depends on the quality of components like lenses, prisms, and mirrors. These components have reflecting or partially reflecting surfaces, and their performance is limited by the reflectivities of the materials of which they are made. The surface reflectivity can, however, be altered by the deposition of a thin transparent film. In fact, this technique facilitates the control of losses due to reflection in lenses and makes possible the construction of mirrors with unique properties [7][8].

As is depicted in Fig. 1.4, a typical N-layer thin-film system consists of N layers of thin films of certain transparent media deposited on a glass substrate. The thickness and refractive index of the ith layer are denoted as xi and ni, respectively. The refractive index of the medium above the first layer is denoted as n0. If φ0 is the angle of incident light, then the transmitted ray in the (i − 1)th layer is refracted at an angle φi which is given by Snell's law, namely,

ni−1 sin φi−1 = ni sin φi

Figure 1.4. An N-layer thin-film system.

Given angle φ0 and the wavelength of light, λ, the energy of the light reflected from the film surface and the energy of the light transmitted through the film surface are usually measured by the reflectance R and transmittance T, which satisfy the relation

R + T = 1

For an N-layer system, R is given by a formula (see [9] for details) that involves the quantities

ηk = nk / cos φk   for light polarized with the electric vector lying in the plane of incidence
ηk = nk cos φk   for light polarized with the electric vector perpendicular to the plane of incidence     (1.14)

The design of a multilayer thin-film system can now be accomplished as follows: Given a range of wavelengths λl ≤ λ ≤ λu and an angle of incidence φ0, find x1, x2, ..., xN such that the reflectance R(x, λ) best approximates a desired reflectance Rd(λ) for λ ∈ [λl, λu]. Formulate the design problem as an optimization problem.

Solution In practice, the desired reflectance is specified at grid points λ1, λ2, ..., λK in the interval [λl, λu]; hence the design may be carried out by selecting xi such that the objective function

F(x) = w1[R(x, λ1) − Rd(λ1)]² + w2[R(x, λ2) − Rd(λ2)]² + · · · + wK[R(x, λK) − Rd(λK)]²     (1.15)

is minimized, where x = [x1 x2 · · · xN]T and wi > 0 is a weight to reflect the importance of term [R(x, λi) − Rd(λi)]² in Eq. (1.15). If we let η = [1 ηN+1]T, e+ = [η0 1]T, and e− = [η0 −1]T, the reflectance R(x, λ) can be expressed in terms of these quantities and the layer thicknesses x1, x2, ..., xN.

Finally, we note that the thickness of each layer cannot be made arbitrarily thin or arbitrarily large and, therefore, constraints must be imposed on the layer thicknesses, and the design problem becomes

minimize F(x)     (1.18a)
subject to: xi − dil ≥ 0   for i = 1, 2, ..., N     (1.18b)
            diu − xi ≥ 0   for i = 1, 2, ..., N     (1.18c)

where dil and diu are lower and upper bounds on the thickness of the ith layer.
Example 1.4 Quantities q1, q2, ..., qm of a certain product are produced by m manufacturing divisions of a company, which are at distinct locations. The product is to be shipped to n destinations that require quantities b1, b2, ..., bn. Assume that the cost of shipping a unit from manufacturing division i to destination j is cij with i = 1, 2, ..., m and j = 1, 2, ..., n. Find the quantity xij to be shipped from division i to destination j so as to minimize the total cost of transportation, i.e.,

C = c11x11 + c12x12 + · · · + cmnxmn

Solution The quantity shipped from each division is limited by the quantity it produces, and the quantity to be shipped to a specific destination has to meet the need of that destination; in addition, each xij must be nonnegative. When the variables xij and the costs cij are assembled into vectors x and c, respectively, the problem can be stated as the minimization problem in Eq. (1.19), where cTx is the inner product of c and x. The problem in Eq. (1.19), like those in Examples 1.2 and 1.3, fits into the standard optimization problem in Eq. (1.4). Since both the objective function in Eq. (1.19a) and the constraints in Eqs. (1.19b) and (1.19c) are linear, the problem is known as a linear programming (LP) problem (see Sect. 1.6.1).
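A sketch of how a small instance of this transportation problem can be posed as a standard LP and solved with a general-purpose LP solver is given below. The supplies, demands, and unit costs are made-up data, the supply constraints are taken here as "at most the quantity produced" and the demand constraints as "at least the quantity required", and SciPy's linprog routine is used as the solver.

```python
import numpy as np
from scipy.optimize import linprog

# Made-up data for m = 2 divisions and n = 3 destinations:
q = np.array([30.0, 40.0])               # quantities produced by the divisions
b = np.array([20.0, 25.0, 25.0])         # quantities required at the destinations
c = np.array([[4.0, 6.0, 9.0],
              [5.0, 4.0, 7.0]])          # c[i, j]: cost of shipping one unit i -> j
m, n = c.shape

# Decision vector x stacks x_ij row by row: x = [x_11 ... x_1n  x_21 ... x_2n].
A_supply = np.kron(np.eye(m), np.ones(n))     # sum_j x_ij <= q_i
A_demand = -np.kron(np.ones(m), np.eye(n))    # sum_i x_ij >= b_j, written as -sum <= -b_j
res = linprog(c.ravel(),
              A_ub=np.vstack([A_supply, A_demand]),
              b_ub=np.concatenate([q, -b]),
              bounds=[(0, None)] * (m * n))   # x_ij >= 0
print(res.x.reshape(m, n), res.fun)           # optimal shipments and total cost
```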
1.5 The Feasible Region

Any point x that satisfies both the equality as well as the inequality constraints is said to be a feasible point of the optimization problem. The set of all points that satisfy the constraints constitutes the feasible region of f(x). Evidently, the constraints define a subset of En. Therefore, the feasible region can be defined as the set

R = {x : ai(x) = 0 for i = 1, 2, ..., p and cj(x) ≥ 0 for j = 1, 2, ..., q}

where R ⊂ En.

The optimum point x∗ must be located in the feasible region, and so the general constrained optimization problem can be stated as

minimize f(x) for x ∈ R

Any point x not in R is said to be a nonfeasible point.

If the constraints in an optimization problem are all inequalities, the constraints divide the points in the En space into three types of points, as follows:

1. Interior points
2. Boundary points
3. Exterior points

An interior point is a point for which cj(x) > 0 for all j. A boundary point is a point for which at least one cj(x) = 0, and an exterior point is a point for which at least one cj(x) < 0. Interior points are feasible points, boundary points may or may not be feasible points, whereas exterior points are nonfeasible points.

If a constraint cm(x) is zero during a specific iteration, the constraint is said to be active, and if cm(x∗) is zero when convergence is achieved, the optimum point x∗ is located on the boundary. In such a case, the optimum point is said to be constrained. If the constraints are all equalities, the feasible points must be located on the intersection of all the hypersurfaces corresponding to ai(x) = 0 for i = 1, 2, ..., p. The above definitions and concepts are illustrated by the following two examples.
Example 1.5 By using a graphical method, solve the following optimization problem

minimize f(x) = x1² + x2² − 4x1 + 4
subject to: c1(x) = x1 − 2x2 + 6 ≥ 0

Solution Since f(x) = (x1 − 2)² + x2², the contours of f(x) are circles of radius √f(x) centered at x1 = 2, x2 = 0. Constraints c1(x) and c2(x) dictate that the feasible points lie on one side of the corresponding constraint boundaries, respectively, while constraints c3(x) and c4(x) dictate that x1 and x2 be positive. The contours of f(x) and the boundaries of the constraints can be constructed as shown in Fig. 1.5.

The feasible region for this problem is the shaded region in Fig. 1.5. The solution is located at point A on the boundary of constraint c2(x). In effect, the solution is a constrained optimum point. Consequently, if this problem is solved by means of mathematical programming, constraint c2(x) will be active when the solution is reached.

In the absence of constraints, the minimization of f(x) would yield point B as the solution.

Figure 1.5. Graphical construction for Example 1.5.
Example 1.6 By using a graphical method, solve the optimization problem

minimize f(x) = x1² + x2² + 2x2
subject to: a1(x) = x1² + x2² − 1 = 0

Solution Since f(x) = x1² + (x2 + 1)² − 1, the contours of f(x) in the (x1, x2) plane are concentric circles with radius √(f(x) + 1), centered at x1 = 0, x2 = −1. Constraint a1(x) is a circle centered at the origin with radius 1. On the other hand, constraint c1(x) is a straight line since it is required that

x2 ≥ −x1 + 0.5

The last two constraints dictate that x1 and x2 be nonnegative. Hence the required construction can be obtained as depicted in Fig. 1.6.

In this case, the feasible region is the arc of circle a1(x) = 0 located in the first quadrant of the (x1, x2) plane. The solution, which is again a constrained optimum point, is located at point A. There are two active constraints in this example, namely, a1(x) and c3(x).

In the absence of constraints, the solution would be point B in Fig. 1.6.

Figure 1.6. Graphical construction for Example 1.6.
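Since the equality constraint confines x to the unit circle, the graphical solution of Example 1.6 can be checked numerically by scanning points on the circle and keeping only those that satisfy the inequality constraints, as in the following Python sketch. This brute-force check is not a method from the book; it merely confirms the location of point A and the active constraints.

```python
import numpy as np

f = lambda x: x[0] ** 2 + x[1] ** 2 + 2.0 * x[1]
ineqs = [lambda x: x[0] + x[1] - 0.5,   # c1(x) >= 0, i.e., x2 >= -x1 + 0.5
         lambda x: x[0],                # x1 >= 0
         lambda x: x[1]]                # x2 >= 0

t = np.linspace(0.0, 2.0 * np.pi, 20001)
pts = np.column_stack([np.cos(t), np.sin(t)])        # points on a1(x) = 0
mask = [all(c(p) >= 0.0 for c in ineqs) for p in pts]
feasible = pts[mask]                                  # the arc in Fig. 1.6
best = feasible[np.argmin([f(p) for p in feasible])]
print(best, f(best))   # approximately [1, 0] and 1, i.e., point A
```

At the computed point, a1(x) = 0 and x2 = 0 hold simultaneously, which matches the statement that a1(x) and c3(x) are the active constraints.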
In the above examples, the set of points comprising the feasible region are simply connected as depicted in Fig. 1.7a. Sometimes the feasible region may consist of two or more disjoint sub-regions, as depicted in Fig. 1.7b. If this is the case, the following difficulty may arise. A typical optimization algorithm is an iterative numerical procedure that will generate a series of progressively improved solutions, starting with an initial estimate for the solution. Therefore, if the feasible region consists of two sub-regions, say, A and B, an initial estimate for the solution in sub-region A is likely to yield a solution in sub-region A, and a better solution in sub-region B may be missed. Fortunately, however, in most real-life optimization problems, this difficulty can be avoided by formulating the problem carefully.

Figure 1.7. Examples of simply connected and disjoint feasible regions.
1.6 Branches of Mathematical Programming

Several branches of mathematical programming were enumerated in Sec. 1.1, namely, linear, integer, quadratic, nonlinear, and dynamic programming. Each one of these branches of mathematical programming consists of the theory and application of a collection of optimization techniques that are suited to a specific class of optimization problems. The differences among the various branches of mathematical programming are closely linked to the structure of the optimization problem and to the mathematical nature of the objective and constraint functions. A brief description of each branch of mathematical programming is as follows.

1.6.1 Linear programming

If the objective and constraint functions are linear and the variables are constrained to be positive, as in Example 1.4, the general optimization problem assumes the form of a linear programming (LP) problem.
1.6.2 Integer programming

In certain linear programming problems, at least some of the variables are required to assume only integer values. This restriction renders the programming problem nonlinear. Nevertheless, the problem is referred to as linear since the objective and constraint functions are linear [10].

1.6.3 Quadratic programming

If the constraint functions are linear and the objective function is quadratic with a quadratic term characterized by a matrix Q, and Q is a positive definite or semidefinite symmetric square matrix, then the constraints are linear and the objective function is quadratic. Such an optimization problem is said to be a quadratic programming (QP) problem (see Chap. 10 of [5]). A typical example of this type of problem is as follows:

minimize f(x) = ½x1² + ½x2² − x1 − 2x2
subject to: c1(x) = 6 − 2x1 − 3x2 ≥ 0
algo-The choice of optimization algorithm depends on the mathematical behaviorand structure of the objective function Most of the time, the objective function
is a well behaved nonlinear function and all that is necessary is a purpose, robust, and efficient algorithm For certain applications, however,specialized algorithms exist which are often more efficient than general-purposeones These are often referred to by the type of norm minimized, for example,
general-an algorithm that minimizes general-an L1, L2, or L ∞ norm is said to by an L1, L2, or
minimax algorithm.
In many applications, a series of decisions must be made in sequence, wheresubsequent decisions are influenced by earlier ones In such applications, anumber of optimizations have to be performed in sequence and a general strat-egy may be required to achieve an overall optimum solution For example, alarge system which cannot be optimized owing to the size and complexity ofthe problem can be partitioned into a set of smaller sub-systems that can beoptimized individually Often individual sub-systems interact with each otherand, consequently, a general solution strategy is required if an overall optimumsolution is to be achieved Dynamic programming is a collection of techniquesthat can be used to develop general solution strategies for problems of the typejust described It is usually based on the use of linear, integer, quadratic ornonlinear optimization algorithms
References
1. G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, Princeton, N.J., 1963.
2. D. M. Himmelblau, Applied Nonlinear Programming, McGraw-Hill, New York, 1972.
3. P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, Academic Press, London, 1981.
4. D. G. Luenberger, Linear and Nonlinear Programming, 2nd ed., Addison-Wesley, Reading, MA, 1984.
5. R. Fletcher, Practical Methods of Optimization, 2nd ed., Wiley, Chichester, UK, 1987.
6. B. C. Kuo, Automatic Control Systems, 5th ed., Prentice Hall, Englewood Cliffs, N.J., 1987.
7. K. D. Leaver and B. N. Chapman, Thin Films, Wykeham, London, 1971.
8. O. S. Heavens, Thin Film Physics, Methuen, London, 1970.
9. Z. Knittl, Optics of Thin Films, An Optical Multilayer Theory, Wiley, New York, 1976.
10. G. L. Nemhauser and L. A. Wolsey, Integer and Combinatorial Optimization, Wiley, New York, 1988.