PRACTICAL OPTIMIZATION
Algorithms and Engineering Applications

Andreas Antoniou and Wu-Sheng Lu
Department of Electrical and Computer Engineering
University of Victoria, Canada

Springer

Andreas Antoniou
University of Victoria, British Columbia, Canada
aantoniou@shaw.ca

Wu-Sheng Lu
University of Victoria, British Columbia, Canada
wslu@ece.uvic.ca
Library of Congress Control Number: 2007922511

Practical Optimization: Algorithms and Engineering Applications
by Andreas Antoniou and Wu-Sheng Lu

ISBN-10: 0-387-71106-6        e-ISBN-10: 0-387-71107-4
ISBN-13: 978-0-387-71106-5    e-ISBN-13: 978-0-387-71107-2

Printed on acid-free paper.

© 2007 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

springer.com
Lynne and Chi-Tang Catherine with our love
Andreas Antoniou received the Ph.D. degree in Electrical Engineering from the University of London, UK, in 1966 and is a Fellow of the IET and IEEE. He served as the founding Chair of the Department of Electrical and Computer Engineering at the University of Victoria, B.C., Canada, and is now Professor Emeritus in the same department. He is the author of Digital Filters: Analysis, Design, and Applications (McGraw-Hill, 1993) and Digital Signal Processing: Signals, Systems, and Filters (McGraw-Hill, 2005). He served as Associate Editor/Editor of the IEEE Transactions on Circuits and Systems from June 1983 to May 1987, as a Distinguished Lecturer of the IEEE Signal Processing Society in 2003, and as General Chair of the 2004 International Symposium on Circuits and Systems, and is currently serving as a Distinguished Lecturer of the IEEE Circuits and Systems Society. He received the Ambrose Fleming Premium for 1964 from the IEE (best paper award), the CAS Golden Jubilee Medal from the IEEE Circuits and Systems Society, the B.C. Science Council Chairman's Award for Career Achievement for 2000, the Doctor Honoris Causa degree from the Metsovio National Technical University of Athens, Greece, in 2002, and the IEEE Circuits and Systems Society 2005 Technical Achievement Award.

Wu-Sheng Lu received the B.S. degree in Mathematics from Fudan University, Shanghai, China, in 1964, the M.E. degree in Automation from the East China Normal University, Shanghai, in 1981, and the M.S. degree in Electrical Engineering and the Ph.D. degree in Control Science from the University of Minnesota, Minneapolis, in 1983 and 1984, respectively. He was a post-doctoral fellow at the University of Victoria, Victoria, BC, Canada, in 1985 and Visiting Assistant Professor with the University of Minnesota in 1986. Since 1987, he has been with the University of Victoria, where he is Professor. His current teaching and research interests are in the general areas of digital signal processing and application of optimization methods. He is the co-author with A. Antoniou of Two-Dimensional Digital Filters (Marcel Dekker, 1992). He served as an Associate Editor of the Canadian Journal of Electrical and Computer Engineering in 1989, and Editor of the same journal from 1990 to 1992. He served as an Associate Editor for the IEEE Transactions on Circuits and Systems, Part II, from 1993 to 1995 and for Part I of the same journal from 1999 to 2001 and from 2004 to 2005. Presently he is serving as Associate Editor for the International Journal of Multidimensional Systems and Signal Processing. He is a Fellow of the Engineering Institute of Canada and the Institute of Electrical and Electronics Engineers.
Biographies of the authors vii
Preface xv
Abbreviations xix

1 THE OPTIMIZATION PROBLEM 1
1.1 Introduction 1
1.2 The Basic Optimization Problem 4
1.3 General Structure of Optimization Algorithms 8
1.4 Constraints 10
1.5 The Feasible Region 17
1.6 Branches of Mathematical Programming 22

2.5 Necessary and Sufficient Conditions for Local Minima and Maxima 33
2.6 Classification of Stationary Points 40
2.7 Convex and Concave Functions 51
2.8 Optimization of Convex Functions 58
References 60
Problems 60

3 GENERAL PROPERTIES OF ALGORITHMS 65
3.1 Introduction 65
3.2 An Algorithm as a Point-to-Point Mapping 65
3.3 An Algorithm as a Point-to-Set Mapping 67
3.4 Closed Algorithms 68
3.5 Descent Functions 71
3.6 Global Convergence 72
References 79
Problems 79

4 ONE-DIMENSIONAL OPTIMIZATION 81
4.1 Introduction 81
4.2 Dichotomous Search 82
4.3 Fibonacci Search 85
4.4 Golden-Section Search 92
4.5 Quadratic Interpolation Method 95
4.6 Cubic Interpolation 99
4.7 The Algorithm of Davies, Swann, and Campey 101
4.8 Inexact Line Searches 106
References 114
Problems 114

5 BASIC MULTIDIMENSIONAL GRADIENT METHODS 119
5.1 Introduction 119
5.2 Steepest-Descent Method 120
5.3 Newton Method 128
5.4 Gauss-Newton Method 138
References 140
Problems 140

6 CONJUGATE-DIRECTION METHODS 145
6.1 Introduction 145
6.2 Conjugate Directions 146
6.3 Basic Conjugate-Directions Method 149
Problems 172

7 QUASI-NEWTON METHODS 175
7.1 Introduction 175
7.2 The Basic Quasi-Newton Approach 176
7.9 The Huang Family 194
7.10 Practical Quasi-Newton Algorithm 195
References 199
Problems 200

8 MINIMAX METHODS 203
8.1 Introduction 203
8.2 Problem Formulation 203
8.3 Minimax Algorithms 205
8.4 Improved Minimax Algorithms 211
References 228
Problems 228

9 APPLICATIONS OF UNCONSTRAINED OPTIMIZATION 231
9.1 Introduction 231
9.2 Point-Pattern Matching 232
9.3 Inverse Kinematics for Robotic Manipulators 237
9.4 Design of Digital Filters 247
References 260
Problems 262

10 FUNDAMENTALS OF CONSTRAINED OPTIMIZATION 265
10.1 Introduction 265
10.2 Constraints 266
10.3 Classification of Constrained Optimization Problems 273
10.4 Simple Transformation Methods 277
10.5 Lagrange Multipliers 285
10.6 First-Order Necessary Conditions 294
10.7 Second-Order Conditions 302
10.8 Convexity 308
10.9 Duality 311
References 312
Problems 313

11 LINEAR PROGRAMMING PART I: THE SIMPLEX METHOD 321
11.1 Introduction 321
11.2 General Properties 322
11.3 Simplex Method 344
References 368
Problems 368

12 LINEAR PROGRAMMING PART II: INTERIOR-POINT METHODS 373
12.1 Introduction 373
12.2 Primal-Dual Solutions and Central Path 374
12.3 Primal Affine-Scaling Method 379
12.4 Primal Newton Barrier Method 383
12.5 Primal-Dual Interior-Point Methods 388
References 402
Problems 402

13 QUADRATIC AND CONVEX PROGRAMMING 407
13.1 Introduction 407
13.2 Convex QP Problems with Equality Constraints 408
13.3 Active-Set Methods for Strictly Convex QP Problems 411
13.4 Interior-Point Methods for Convex QP Problems 417
13.5 Cutting-Plane Methods for CP Problems 428
13.6 Ellipsoid Methods 437
References 443
Problems 444

14 SEMIDEFINITE AND SECOND-ORDER CONE PROGRAMMING 449
14.1 Introduction 449
14.2 Primal and Dual SDP Problems 450
14.3 Basic Properties of SDP Problems 455
14.4 Primal-Dual Path-Following Method 458
14.5 Predictor-Corrector Method 465
14.6 Projective Method of Nemirovski and Gahinet 470
14.7 Second-Order Cone Programming 484
14.8 A Primal-Dual Method for SOCP Problems 491

16.2 Design of Digital Filters 534
16.3 Model Predictive Control of Dynamic Systems 547
16.4 Optimal Force Distribution for Robotic Systems with Closed Kinematic Loops 558
16.5 Multiuser Detection in Wireless Communication Channels 570
References 586
Problems 588

Appendices 591
A Basics of Linear Algebra 591
A.1 Introduction 591
A.2 Linear Independence and Basis of a Span 592
A.3 Range, Null Space, and Rank 593
A.4 Sherman-Morrison Formula 595
A.5 Eigenvalues and Eigenvectors 596
A.6 Symmetric Matrices 598
A.7 Trace 602
A.8 Vector Norms and Matrix Norms 602
A.9 Singular-Value Decomposition 606
A.15 Vector Spaces of Symmetric Matrices 623
A.16 Polygon, Polyhedron, Polytope, and Convex Hull 626
B.6 Time-Domain Response Using the Z Transform 635
B.7 Z-Domain Condition for Stability 635
B.8 Frequency, Amplitude, and Phase Responses 636
B.9 Design 639
Reference 644

Index 645
The rapid advancements in the efficiency of digital computers and the evolution of reliable software for numerical computation during the past three decades have led to an astonishing growth in the theory, methods, and algorithms of numerical optimization. This body of knowledge has, in turn, motivated widespread applications of optimization methods in many disciplines, e.g., engineering, business, and science, and led to problem solutions that were considered intractable not too long ago.

Although excellent books are available that treat the subject of optimization with great mathematical rigor and precision, there appears to be a need for a book that provides a practical treatment of the subject aimed at a broader audience ranging from college students to scientists and industry professionals. This book has been written to address this need. It treats unconstrained and constrained optimization in a unified manner and places special attention on the algorithmic aspects of optimization to enable readers to apply the various algorithms and methods to specific problems of interest. To facilitate this process, the book provides many solved examples that illustrate the principles involved, and includes, in addition, two chapters that deal exclusively with applications of unconstrained and constrained optimization methods to problems in the areas of pattern recognition, control systems, robotics, communication systems, and the design of digital filters. For each application, enough background information is provided to promote the understanding of the optimization algorithms used to obtain the desired solutions.
Chapter 1 gives a brief introduction to optimization and the general structure of optimization algorithms. Chapters 2 to 9 are concerned with unconstrained optimization methods. The basic principles of interest are introduced in Chapter 2. These include the first-order and second-order necessary conditions for a point to be a local minimizer, the second-order sufficient conditions, and the optimization of convex functions. Chapter 3 deals with general properties of algorithms such as the concepts of descent function, global convergence, and rate of convergence. Chapter 4 presents several methods for one-dimensional optimization, which are commonly referred to as line searches. The chapter also deals with inexact line-search methods that have been found to increase the efficiency in many optimization algorithms. Chapter 5 presents several basic gradient methods that include the steepest descent, Newton, and Gauss-Newton methods. Chapter 6 presents a class of methods based on the concept of conjugate directions such as the conjugate-gradient, Fletcher-Reeves, Powell, and Partan methods. An important class of unconstrained optimization methods known as quasi-Newton methods is presented in Chapter 7. Representative methods of this class such as the Davidon-Fletcher-Powell and Broyden-Fletcher-Goldfarb-Shanno methods and their properties are investigated. The chapter also includes a practical, efficient, and reliable quasi-Newton algorithm that eliminates some problems associated with the basic quasi-Newton method. Chapter 8 presents minimax methods that are used in many applications including the design of digital filters. Chapter 9 presents three case studies in which several of the unconstrained optimization methods described in Chapters 4 to 8 are applied to point-pattern matching, inverse kinematics for robotic manipulators, and the design of digital filters.

Chapters 10 to 16 are concerned with constrained optimization methods. Chapter 10 introduces the fundamentals of constrained optimization. The concept of Lagrange multipliers, the first-order necessary conditions known as Karush-Kuhn-Tucker conditions, and the duality principle of convex programming are addressed in detail and are illustrated by many examples. Chapters 11 and 12 are concerned with linear programming (LP) problems. The general properties of LP and the simplex method for standard LP problems are addressed in Chapter 11. Several interior-point methods including the primal affine-scaling, primal Newton-barrier, and primal dual-path following methods are presented in Chapter 12. Chapter 13 deals with quadratic and general convex programming. The so-called active-set methods and several interior-point methods for convex quadratic programming are investigated. The chapter also includes the so-called cutting-plane and ellipsoid algorithms for general convex programming problems. Chapter 14 presents two special classes of convex programming known as semidefinite and second-order cone programming, which have found interesting applications in a variety of disciplines. Chapter 15 treats general constrained optimization problems that do not belong to the class of convex programming; special emphasis is placed on several sequential quadratic programming methods that are enhanced through the use of efficient line searches and approximations of the Hessian matrix involved. Chapter 16, which concludes the book, examines several applications of constrained optimization for the design of digital filters, for the control of dynamic systems, for evaluating the force distribution in robotic systems, and in multiuser detection for wireless communication systems.
The book also includes two appendices, A and B, which provide additional support material. Appendix A deals in some detail with the relevant parts of linear algebra to consolidate the understanding of the underlying mathematical principles involved whereas Appendix B provides a concise treatment of the basics of digital filters to enhance the understanding of the design algorithms included in Chaps. 8, 9, and 16.

The book can be used as a text for a sequence of two one-semester courses on optimization. The first course, comprising Chaps. 1 to 7, 9, and part of Chap. 10, may be offered to senior undergraduate or first-year graduate students. The prerequisite knowledge is an undergraduate mathematics background of calculus and linear algebra. The material in Chaps. 8 and 10 to 16 may be used as a text for an advanced graduate course on minimax and constrained optimization. The prerequisite knowledge for this course is the contents of the first optimization course.

The book is supported by online solutions of the end-of-chapter problems under password as well as by a collection of MATLAB programs for free access by the readers of the book, which can be used to solve a variety of optimization problems. These materials can be downloaded from the book's website: http://www.ece.uvic.ca/~optimization/
We are grateful to many of our past students at the University of Victoria, in particular, Drs. M. L. R. de Campos, S. Netto, S. Nokleby, D. Peters, and Mr. J. Wong, who took our optimization courses and have helped improve the manuscript in one way or another; to Chi-Tang Catherine Chang for typesetting the first draft of the manuscript and for producing most of the illustrations; to R. Nongpiur for checking a large part of the index; and to R. Ramachandran for proofreading the entire manuscript. We would also like to thank Professors M. Ahmadi, C. Charalambous, P. S. R. Diniz, Z. Dong, T. Hinamoto, and P. P. Vaidyanathan for useful discussions on optimization theory and practice; Tony Antoniou of Psicraft Studios for designing the book cover; the Natural Sciences and Engineering Research Council of Canada for supporting the research that led to some of the new results described in Chapters 8, 9, and 16; and last but not least the University of Victoria for supporting the writing of this book over a number of years.

Andreas Antoniou and Wu-Sheng Lu
AWGN    additive white Gaussian noise
BER     bit-error rate
BFGS    Broyden-Fletcher-Goldfarb-Shanno
CDMA    code-division multiple access
CMBER   constrained minimum BER
FDMA    frequency-division multiple access
FIR     finite-duration impulse response
LCP     linear complementarity problem
LMI     linear matrix inequality
MPC     model predictive control
PAS     primal affine-scaling
SOCP    second-order cone programming
SQP     sequential quadratic programming
SVD     singular-value decomposition
TDMA    time-division multiple access
... of birds. When the circumstances were appropriate, the timing was thought to be auspicious (or optimum) for planting the crops or embarking on a war.

As the ages advanced and the age of reason prevailed, unscientific rituals were replaced by rules of thumb and later, with the development of mathematics, mathematical calculations began to be applied.

Interest in the process of optimization has taken a giant leap with the advent of the digital computer in the early fifties. In recent years, optimization techniques advanced rapidly and considerable progress has been achieved. At the same time, digital computers became faster, more versatile, and more efficient. As a consequence, it is now possible to solve complex optimization problems which were thought intractable only a few years ago.
The process of optimization is the process of obtaining the ‘best’, if it is possible to measure and change what is ‘good’ or ‘bad’. In practice, one wishes the ‘most’ or ‘maximum’ (e.g., salary) or the ‘least’ or ‘minimum’ (e.g., expenses). Therefore, the word ‘optimum’ is taken to mean ‘maximum’ or ‘minimum’ depending on the circumstances; ‘optimum’ is a technical term which implies quantitative measurement and is a stronger word than ‘best’ which is more appropriate for everyday use. Likewise, the word ‘optimize’, which means to achieve an optimum, is a stronger word than ‘improve’. Optimization theory is the branch of mathematics encompassing the quantitative study of optima and methods for finding them. Optimization practice, on the other hand, is the collection of techniques, methods, procedures, and algorithms that can be used to find the optima.

Optimization problems occur in most disciplines like engineering, physics, mathematics, economics, administration, commerce, social sciences, and even politics. Optimization problems abound in the various fields of engineering like electrical, mechanical, civil, chemical, and building engineering. Typical areas of application are modeling, characterization, and design of devices, circuits, and systems; design of tools, instruments, and equipment; design of structures and buildings; process control; approximation theory, curve fitting, solution of systems of equations; forecasting, production scheduling, quality control; maintenance and repair; inventory control, accounting, budgeting, etc. Some recent innovations rely almost entirely on optimization theory, for example, neural networks and adaptive systems.
Most real-life problems have several solutions and occasionally an infinite number of solutions may be possible. Assuming that the problem at hand admits more than one solution, optimization can be achieved by finding the best solution of the problem in terms of some performance criterion. If the problem admits only one solution, that is, only a unique set of parameter values is acceptable, then optimization cannot be applied.
Several general approaches to optimization are available, as follows:

1. Analytical methods
2. Graphical methods
3. Experimental methods
4. Numerical methods

Analytical methods are based on the classical techniques of differential calculus. In these methods the maximum or minimum of a performance criterion is determined by finding the values of parameters x1, x2, ..., xn that cause the derivatives of f(x1, x2, ..., xn) with respect to x1, x2, ..., xn to assume zero values. The problem to be solved must obviously be described in mathematical terms before the rules of calculus can be applied. The method need not entail the use of a digital computer. However, it cannot be applied to highly nonlinear problems or to problems where the number of independent parameters exceeds two or three.
A graphical method can be used to plot the function to be maximized or minimized if the number of variables does not exceed two. If the function depends on only one variable, say, x1, a plot of f(x1) versus x1 will immediately reveal the maxima and/or minima of the function. Similarly, if the function depends on only two variables, say, x1 and x2, a set of contours can be constructed. A contour is a set of points in the (x1, x2) plane for which f(x1, x2) is constant, and so a contour plot, like a topographical map of a specific region, will reveal readily the peaks and valleys of the function. For example, the contour plot of f(x1, x2) depicted in Fig. 1.1 shows that the function has a minimum at point A. Unfortunately, the graphical method is of limited usefulness since in most practical applications the function to be optimized depends on several variables, usually in excess of four.

Figure 1.1. Contour plot of f(x1, x2).
The optimum performance of a system can sometimes be achieved by direct experimentation. In this method, the system is set up and the process variables are adjusted one by one and the performance criterion is measured in each case. This method may lead to optimum or near optimum operating conditions. However, it can lead to unreliable results since in certain systems, two or more variables interact with each other, and must be adjusted simultaneously to yield the optimum performance criterion.

The most important general approach to optimization is based on numerical methods. In this approach, iterative numerical procedures are used to generate a series of progressively improved solutions to the optimization problem, starting with an initial estimate for the solution. The process is terminated when some convergence criterion is satisfied, for example, when changes in the independent variables or the performance criterion from iteration to iteration become insignificant.

Numerical methods can be used to solve highly complex optimization problems of the type that cannot be solved analytically. Furthermore, they can be readily programmed on the digital computer. Consequently, they have all but replaced most other approaches to optimization.

The discipline encompassing the theory and practice of numerical optimization methods has come to be known as mathematical programming [1]–[5]. During the past 40 years, several branches of mathematical programming have evolved, as follows:

1. Linear programming
2. Integer programming
3. Quadratic programming
4. Nonlinear programming
5. Dynamic programming
1.2 The Basic Optimization Problem

Before optimization is attempted, the problem at hand must be properly formulated. A performance criterion F must be derived in terms of n parameters x1, x2, ..., xn as

F = f(x1, x2, ..., xn)

F is a scalar quantity which can assume numerous forms. It can be the cost of a product in a manufacturing environment or the difference between the desired performance and the actual performance in a system. Variables x1, x2, ..., xn are the parameters that influence the product cost in the first case or the actual performance in the second case. They can be independent variables, like time, or control parameters that can be adjusted.

The most basic optimization problem is to adjust variables x1, x2, ..., xn in such a way as to minimize quantity F. This problem can be stated mathematically as

minimize F = f(x1, x2, ..., xn)     (1.1)

Quantity F is usually referred to as the objective or cost function.

The objective function may depend on a large number of variables, sometimes as many as 100 or more. To simplify the notation, matrix notation is usually employed. If x is a column vector with elements x1, x2, ..., xn, the transpose of x, namely, xT, can be expressed as the row vector

xT = [x1 x2 · · · xn]

On many occasions, the optimization problem consists of finding the maximum of the objective function. Since

max f(x) = −min[−f(x)]

the maximum of f(x) can be obtained by finding the minimum of −f(x), and so minimization techniques can also be used to solve maximization problems.
In many applications, a number of distinct functions of x need to be optimized simultaneously. For example, if the system of nonlinear simultaneous equations

fi(x) = 0   for i = 1, 2, ..., m

needs to be solved, a vector x is sought which will reduce all fi(x) to zero simultaneously. In such a problem, the functions to be optimized can be used to construct a vector

F(x) = [f1(x) f2(x) · · · fm(x)]T

The problem can be solved by finding a point x = x∗ such that F(x∗) = 0. Very frequently, a point x∗ that reduces all the fi(x) to zero simultaneously may not exist but an approximate solution, i.e., F(x∗) ≈ 0, may be available which could be entirely satisfactory in practice.

A similar problem arises in scientific or engineering applications when the function of x that needs to be optimized is also a function of a continuous independent parameter (e.g., time, position, speed, frequency) that can assume an infinite set of values in a specified range. The optimization might entail adjusting variables x1, x2, ..., xn so as to optimize the function of interest, say f(x, t), over a given range of the independent parameter. In such an application, the function of interest can be sampled with respect to the independent parameter, and a vector of the form

F(x) = [f(x, t1) f(x, t2) · · · f(x, tm)]T

can be constructed. A solution of such a problem can be obtained by optimizing functions fi(x) = f(x, ti) for i = 1, 2, ..., m simultaneously. Such a solution would, of course, be approximate because any variations in f(x, t) between sample points are ignored. Nevertheless, reasonable solutions can be obtained in practice by using a sufficiently large number of sample points. This approach is illustrated by the following example.
Example 1.1 The step response y(x, t) of an nth-order control system is required to satisfy a given specification as closely as possible. Construct a vector F(x) that can be used to obtain a function f(x, t) representing the approximation error.

Solution The difference between the actual and specified step responses, which constitutes the approximation error, can be expressed as

f(x, t) = y(x, t) − y0(x, t)

and if f(x, t) is sampled at t = 0, 1, 2, ..., 5, we obtain

F(x) = [f(x, 0) f(x, 1) · · · f(x, 5)]T

The problem is illustrated in Fig. 1.2. It can be solved by finding a point x = x∗ such that F(x∗) ≈ 0. Evidently, the quality of the approximation obtained for the step response of the system will depend on the density of the sampling points and the higher the density of points, the better the approximation.

Problems of the type just described can be solved by defining a suitable objective function in terms of the element functions of F(x). The objective function must be a scalar quantity and its optimization must lead to the simultaneous optimization of the element functions of F(x) in some sense. Consequently, a norm of some type must be used. An objective function can be defined in terms of an Lp norm of F(x), namely,

F = ||F(x)||p = [ |f1(x)|^p + |f2(x)|^p + · · · + |fm(x)|^p ]^(1/p)

where p is a positive integer. If p = 2, the Euclidean (L2) norm of F(x) is minimized, and if the square root is omitted, the sum of the squares is minimized. Such a problem is commonly referred to as a least-squares problem. (See Appendix A for the basic principles of linear algebra that are important to optimization.)
In the case where p = ∞, if we assume that there is a unique maximum of |fi(x)| designated F̂ such that

F̂ = max{|f1(x)|, |f2(x)|, ..., |fm(x)|}

then the Lp norm can be expressed as

F = ||F(x)||p = F̂ [ (|f1(x)|/F̂)^p + (|f2(x)|/F̂)^p + · · · + (|fm(x)|/F̂)^p ]^(1/p)

Since all the terms in the summation except one are less than unity, they tend to zero when raised to a large positive power. Therefore, we obtain

lim (p → ∞) ||F(x)||p = F̂ = max{|f1(x)|, |f2(x)|, ..., |fm(x)|}

Evidently, if the L∞ norm is used in Example 1.1, the maximum approximation error is minimized and the problem is said to be a minimax problem.
Often the individual element functions of F(x) are modified by using constants w1, w2, ..., wm as weights. For example, the least-squares objective function can be expressed as

F = [w1f1(x)]² + [w2f2(x)]² + · · · + [wmfm(x)]²

so as to emphasize important or critical element functions and de-emphasize unimportant or uncritical ones. If F is minimized, the residual errors in wifi(x) at the end of the minimization would tend to be of the same order of magnitude, i.e.,

error in |wifi(x)| ≈ ε

and so

error in |fi(x)| ≈ ε / |wi|

Consequently, if a large positive weight wi is used with fi(x), a small residual error is achieved in |fi(x)|.
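To make the choice of norm and the role of the weights concrete, the following Python sketch evaluates the weighted L1, least-squares, and minimax objectives for a given set of element-function values fi(x). It is not part of the book; the residual values and weights are made-up numbers used only for illustration.

```python
import numpy as np

def weighted_objectives(fvec, w):
    """Given element-function values f_i(x) and weights w_i, return the
    weighted L1, least-squares (L2 with the square root omitted), and
    minimax (L-infinity) objective values."""
    r = w * fvec                      # weighted residuals w_i * f_i(x)
    F1 = np.sum(np.abs(r))            # L1 objective: sum of magnitudes
    F2 = np.sum(r ** 2)               # least-squares objective: sum of squares
    Finf = np.max(np.abs(r))          # minimax objective: largest magnitude
    return F1, F2, Finf

# Made-up residuals f_i(x) at some trial point x, with one weight increased
# to emphasize the corresponding element function:
fvec = np.array([0.30, -0.15, 0.05, 0.20])
w = np.array([1.0, 1.0, 1.0, 5.0])
print(weighted_objectives(fvec, w))
```

With the larger weight on the last element function, a minimization of any of these objectives would drive that residual down harder than the others, which is precisely the effect described above.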
1.3 General Structure of Optimization Algorithms

Most of the available optimization algorithms entail a series of steps which are executed sequentially. A typical pattern is as follows:

Algorithm 1.1 General optimization algorithm
Step 3: Check if convergence has been achieved by using an appropriate criterion, e.g., by checking ∆Fk and/or ∆xk. If this is the case, continue to Step 4; otherwise, go to Step 2.

If no information about the solution is available in Step 1, an arbitrary solution may be assumed, say, x0 = 0. Steps 2 and 3 are then executed repeatedly until convergence is achieved. Each execution of Steps 2 and 3 constitutes one iteration, that is, k is the number of iterations.

When convergence is achieved, Step 4 is executed. In this step, the column vector

x∗ = [x1∗ x2∗ · · · xn∗]T and F∗ = f(x∗)

are output. The column vector x∗ is said to be the optimum, minimum, or solution point, and F∗ is said to be the optimum or minimum value of the objective function. The pair x∗ and F∗ constitute the solution of the optimization problem.
Convergence can be checked in several ways, depending on the optimization problem and the optimization technique used. For example, one might decide to stop the algorithm when the reduction in Fk between any two iterations has become insignificant, that is,

|∆Fk| = |Fk−1 − Fk| ≤ εF     (1.2)

where εF is an optimization tolerance for the objective function. Alternatively, one might decide to stop the algorithm when the changes in all variables have become insignificant, that is,

|∆xi| ≤ εx   for i = 1, 2, ..., n     (1.3)

where εx is an optimization tolerance for variables x1, x2, ..., xn. A third possibility might be to check if both criteria given by Eqs. (1.2) and (1.3) are satisfied simultaneously.
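As a rough illustration of this general structure, the Python sketch below implements the iterative pattern of Algorithm 1.1 with the two termination criteria applied simultaneously. The update rule, the quadratic test function, and the tolerance values are assumptions made only for illustration; they are not a method prescribed by the book.

```python
import numpy as np

def general_optimizer(f, update, x0, eps_F=1e-6, eps_x=1e-6, max_iter=1000):
    """Iterative pattern of Algorithm 1.1: repeat Steps 2 and 3 until the
    reduction in the objective (Eq. 1.2) and the change in the variables
    (Eq. 1.3) are both below the prescribed tolerances."""
    xk = np.asarray(x0, dtype=float)   # Step 1: initial point x0
    Fk = f(xk)
    for k in range(max_iter):
        xk1 = update(xk)               # Step 2: compute an improved point
        Fk1 = f(xk1)
        dF = abs(Fk - Fk1)             # reduction in the objective
        dx = np.max(np.abs(xk1 - xk))  # largest change in any variable
        xk, Fk = xk1, Fk1
        if dF < eps_F and dx < eps_x:  # Step 3: convergence check
            break
    return xk, Fk                      # Step 4: output x* and F* = f(x*)

# Made-up test: f(x) = (x1 - 2)^2 + x2^2 + 4 minimized with a fixed-step
# gradient update (the update rule is an assumption, not part of Algorithm 1.1).
f = lambda x: (x[0] - 2.0) ** 2 + x[1] ** 2 + 4.0
grad = lambda x: np.array([2.0 * (x[0] - 2.0), 2.0 * x[1]])
x_star, F_star = general_optimizer(f, lambda x: x - 0.1 * grad(x), [0.0, 0.0])
print(x_star, F_star)
```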
There are numerous algorithms for the minimization of an objective function. However, we are primarily interested in algorithms that entail the minimum amount of effort. Therefore, we shall focus our attention on algorithms that are simple to apply, are reliable when applied to a diverse range of optimization problems, and entail a small amount of computation. A reliable algorithm is often referred to as a ‘robust’ algorithm in the terminology of mathematical programming.
1.4 Constraints

In many optimization problems, the variables are interrelated by physical laws like the conservation of mass or energy, Kirchhoff’s voltage and current laws, and other system equalities that must be satisfied. In effect, in these problems certain equality constraints of the form

ai(x) = 0

where i = 1, 2, ..., p must be satisfied before the problem can be considered solved. In other optimization problems a collection of inequality constraints might be imposed on the variables or parameters to ensure physical realizability, reliability, compatibility, or even to simplify the modeling of the problem. For example, the power dissipation might become excessive if a particular current in a circuit exceeds a given upper limit or the circuit might become unreliable if another current is reduced below a lower limit, the mass of an element in a specific chemical reaction must be positive, and so on. In these problems, a collection of inequality constraints of the form

cj(x) ≥ 0

where j = 1, 2, ..., q must be satisfied before the optimization problem can be considered solved.

An optimization problem may entail a set of equality constraints and possibly a set of inequality constraints. If this is the case, the problem is said to be a constrained optimization problem. The most general constrained optimization problem can be expressed mathematically as

minimize f(x) for x ∈ En     (1.4a)
subject to: ai(x) = 0   for i = 1, 2, ..., p     (1.4b)
            cj(x) ≥ 0   for j = 1, 2, ..., q     (1.4c)

A problem that does not entail any equality or inequality constraints is said to be an unconstrained optimization problem.

Constrained optimization is usually much more difficult than unconstrained optimization, as might be expected. Consequently, the general strategy that has evolved in recent years towards the solution of constrained optimization problems is to reformulate constrained problems as unconstrained optimization problems. This can be done by redefining the objective function such that the constraints are simultaneously satisfied when the objective function is minimized. Some real-life constrained optimization problems are given as Examples 1.2 to 1.4 below.
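Before turning to these examples, one common way of carrying out such a reformulation is to augment the objective function with a penalty term that grows with the constraint violations. The following Python sketch illustrates the idea in its simplest quadratic form; the penalty weight and the small test problem are assumptions used only for illustration, not a method prescribed in this chapter.

```python
import numpy as np

def penalized_objective(f, eqs, ineqs, mu):
    """Turn a constrained problem into an unconstrained one with a quadratic
    penalty: equality violations a_i(x) and inequality violations min(0, c_j(x))
    are squared, scaled by mu, and added to f(x)."""
    def F(x):
        p = sum(a(x) ** 2 for a in eqs)            # equality violations
        p += sum(min(0.0, c(x)) ** 2 for c in ineqs)  # inequality violations
        return f(x) + mu * p
    return F

# Hypothetical usage: minimize x1^2 + x2^2 subject to x1 + x2 - 1 = 0.
F = penalized_objective(lambda x: x[0] ** 2 + x[1] ** 2,
                        eqs=[lambda x: x[0] + x[1] - 1.0],
                        ineqs=[], mu=1e4)
print(F(np.array([0.5, 0.5])))   # feasible point, so only f(x) contributes
```

Minimizing F with any unconstrained method then drives the constraint violations toward zero while reducing f(x), which is the essence of the strategy described above.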
Example 1.2 Consider a control system that comprises a double inverted pendulum as depicted in Fig. 1.3. The objective of the system is to maintain the pendulum in the upright position using the minimum amount of energy. This is achieved by applying an appropriate control force to the car to damp out any displacements θ1(t) and θ2(t). Formulate the problem as an optimization problem.

Figure 1.3. The double inverted pendulum.

Solution The dynamic equations of the system are nonlinear and the standard practice is to apply a linearization technique to these equations to obtain a small-signal linear model of the system as [6]

ẋ(t) = Ax(t) + fu(t)     (1.5)

where ẋ(t), θ̇1(t), and θ̇2(t) represent the first derivatives of x(t), θ1(t), and θ2(t), respectively, with respect to time, θ̈1(t) and θ̈2(t) would be the second derivatives of θ1(t) and θ2(t), and parameters α and β depend on system parameters such as the length and weight of each pendulum, the mass of the car, etc. Suppose that at instant t = 0 small nonzero displacements θ1(t) and θ2(t) occur, which would call for immediate control action in order to steer the system back to the equilibrium state x(t) = 0 at time t = T0. In order to develop a digital controller, the system model in (1.5) is discretized to become

x(k + 1) = Φx(k) + gu(k)     (1.6)

where Φ = I + ∆tA, g = ∆tf, ∆t is the sampling interval, and I is the identity matrix. Let x(0) ≠ 0 be given and assume that T0 is a multiple of ∆t, i.e., T0 = K∆t where K is an integer. We seek to find a sequence of control actions u(k) for k = 0, 1, ..., K − 1 such that the zero equilibrium state is achieved at t = T0, i.e., x(T0) = 0.

Let us assume that the energy consumed by these control actions, namely,

E = u²(0) + u²(1) + · · · + u²(K − 1)     (1.7a)

is to be minimized subject to the terminal condition

x(T0) = 0     (1.7b)

From Eq. (1.6), we know that the state of the system at t = K∆t is determined by the initial value of the state and the system model in Eq. (1.6) as

x(K∆t) = Φ^K x(0) + Φ^(K−1)gu(0) + Φ^(K−2)gu(1) + · · · + gu(K − 1)

If we let h = −Φ^K x(0) and gk = Φ^(K−k−1)g, constraint (1.7b) is equivalent to

g0u(0) + g1u(1) + · · · + gK−1u(K − 1) = h     (1.8)

If we define u = [u(0) u(1) · · · u(K − 1)]T and G = [g0 g1 · · · gK−1], then the constraint in Eq. (1.8) can be expressed as Gu = h, and the optimal control problem at hand can be formulated as the problem of finding a u that solves the minimization problem

minimize uTu
subject to: Gu = h     (1.9)

In practice, the control actions cannot be arbitrarily large in magnitude. Consequently, additional constraints are often imposed on |u(i)|, for instance, an upper bound on the magnitude of each control action.

Obviously, the problem in Eq. (1.9) fits nicely into the standard form of optimization problems given by Eq. (1.4).
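Ignoring the bounds on |u(i)|, the problem in Eq. (1.9) can be solved by the minimum-norm solution of Gu = h. The following Python sketch builds G and h exactly as defined above and computes that solution; the two-state model matrices Φ and g, the initial state, and the horizon K are made-up values for illustration, not the pendulum model of the book.

```python
import numpy as np

def min_energy_control(Phi, g, x0, K):
    """Build G = [g_0 g_1 ... g_{K-1}] with g_k = Phi^(K-k-1) g and
    h = -Phi^K x0, then return the minimum-norm u satisfying G u = h,
    which minimizes u^T u subject to the terminal constraint."""
    G = np.column_stack([np.linalg.matrix_power(Phi, K - k - 1) @ g
                         for k in range(K)])
    h = -np.linalg.matrix_power(Phi, K) @ x0
    return np.linalg.pinv(G) @ h       # minimum-norm solution of G u = h

# Made-up two-state model and initial displacement (illustration only):
Phi = np.array([[1.0, 0.1],
                [0.05, 1.0]])
g = np.array([0.0, 0.1])
x0 = np.array([0.02, 0.0])
u = min_energy_control(Phi, g, x0, K=10)
print(u, np.sum(u ** 2))               # control sequence and its energy
```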
Example 1.3 High performance in modern optical instruments depends on the quality of components like lenses, prisms, and mirrors. These components have reflecting or partially reflecting surfaces, and their performance is limited by the reflectivities of the materials of which they are made. The surface reflectivity can, however, be altered by the deposition of a thin transparent film. In fact, this technique facilitates the control of losses due to reflection in lenses and makes possible the construction of mirrors with unique properties [7][8].

As is depicted in Fig. 1.4, a typical N-layer thin-film system consists of N layers of thin films of certain transparent media deposited on a glass substrate. The thickness and refractive index of the ith layer are denoted as xi and ni, respectively. The refractive index of the medium above the first layer is denoted as n0. If φ0 is the angle of incident light, then the transmitted ray in the (i − 1)th layer is refracted at an angle φi which is given by Snell's law, namely,

ni−1 sin φi−1 = ni sin φi

Figure 1.4. An N-layer thin-film system.

Given angle φ0 and the wavelength of light, λ, the energy of the light reflected from the film surface and the energy of the light transmitted through the film surface are usually measured by the reflectance R and transmittance T, which satisfy the relation

R + T = 1

For an N-layer system, R is given by a formula (see [9] for details) that involves the quantities

ηk = nk / cos φk   for light polarized with the electric vector lying in the plane of incidence
ηk = nk cos φk   for light polarized with the electric vector perpendicular to the plane of incidence     (1.14)

The design of a multilayer thin-film system can now be accomplished as follows: Given a range of wavelengths λl ≤ λ ≤ λu and an angle of incidence φ0, find x1, x2, ..., xN such that the reflectance R(x, λ) best approximates a desired reflectance Rd(λ) for λ ∈ [λl, λu]. Formulate the design problem as an optimization problem.

Solution In practice, the desired reflectance is specified at grid points λ1, λ2, ..., λK in the interval [λl, λu]; hence the design may be carried out by selecting xi such that the objective function

F(x) = w1[R(x, λ1) − Rd(λ1)]² + w2[R(x, λ2) − Rd(λ2)]² + · · · + wK[R(x, λK) − Rd(λK)]²     (1.15)

is minimized, where x = [x1 x2 · · · xN]T and wi > 0 is a weight to reflect the importance of term [R(x, λi) − Rd(λi)]² in Eq. (1.15). If we let η = [1 ηN+1]T, e+ = [η0 1]T, and e− = [η0 −1]T, the reflectance R(x, λ) can be expressed in terms of these quantities and the layer thicknesses x1, x2, ..., xN.

Finally, we note that the thickness of each layer cannot be made arbitrarily thin or arbitrarily large and, therefore, constraints must be imposed on the layer thicknesses, and the design problem becomes

minimize F(x)     (1.18a)
subject to: xi − dil ≥ 0   for i = 1, 2, ..., N     (1.18b)
            diu − xi ≥ 0   for i = 1, 2, ..., N     (1.18c)

where dil and diu are lower and upper bounds on the thickness of the ith layer.
Example 1.4 Quantities q1, q2, ..., qm of a certain product are produced by m manufacturing divisions of a company, which are at distinct locations. The product is to be shipped to n destinations that require quantities b1, b2, ..., bn. Assume that the cost of shipping a unit from manufacturing division i to destination j is cij with i = 1, 2, ..., m and j = 1, 2, ..., n. Find the quantity xij to be shipped from division i to destination j so as to minimize the total cost of transportation, i.e.,

C = c11x11 + c12x12 + · · · + cmnxmn

Solution The quantity shipped from each division is limited by the quantity it produces, and the quantity to be shipped to a specific destination has to meet the need of that destination; in addition, each xij must be nonnegative. When the variables xij and the costs cij are assembled into vectors x and c, respectively, the problem can be stated as the minimization problem in Eq. (1.19), where cTx is the inner product of c and x. The problem in Eq. (1.19), like those in Examples 1.2 and 1.3, fits into the standard optimization problem in Eq. (1.4). Since both the objective function in Eq. (1.19a) and the constraints in Eqs. (1.19b) and (1.19c) are linear, the problem is known as a linear programming (LP) problem (see Sect. 1.6.1).
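A sketch of how a small instance of this transportation problem can be posed as a standard LP and solved with a general-purpose LP solver is given below. The supplies, demands, and unit costs are made-up data, the supply constraints are taken here as "at most the quantity produced" and the demand constraints as "at least the quantity required", and SciPy's linprog routine is used as the solver.

```python
import numpy as np
from scipy.optimize import linprog

# Made-up data for m = 2 divisions and n = 3 destinations:
q = np.array([30.0, 40.0])               # quantities produced by the divisions
b = np.array([20.0, 25.0, 25.0])         # quantities required at the destinations
c = np.array([[4.0, 6.0, 9.0],
              [5.0, 4.0, 7.0]])          # c[i, j]: cost of shipping one unit i -> j
m, n = c.shape

# Decision vector x stacks x_ij row by row: x = [x_11 ... x_1n  x_21 ... x_2n].
A_supply = np.kron(np.eye(m), np.ones(n))     # sum_j x_ij <= q_i
A_demand = -np.kron(np.ones(m), np.eye(n))    # sum_i x_ij >= b_j, written as -sum <= -b_j
res = linprog(c.ravel(),
              A_ub=np.vstack([A_supply, A_demand]),
              b_ub=np.concatenate([q, -b]),
              bounds=[(0, None)] * (m * n))   # x_ij >= 0
print(res.x.reshape(m, n), res.fun)           # optimal shipments and total cost
```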
1.5 The Feasible Region

Any point x that satisfies both the equality as well as the inequality constraints is said to be a feasible point of the optimization problem. The set of all points that satisfy the constraints constitutes the feasible region of f(x). Evidently, the constraints define a subset of En. Therefore, the feasible region can be defined as the set

R = {x : ai(x) = 0 for i = 1, 2, ..., p and cj(x) ≥ 0 for j = 1, 2, ..., q}

where R ⊂ En.

The optimum point x∗ must be located in the feasible region, and so the general constrained optimization problem can be stated as

minimize f(x) for x ∈ R

Any point x not in R is said to be a nonfeasible point.

If the constraints in an optimization problem are all inequalities, the constraints divide the points in the En space into three types of points, as follows:

1. Interior points
2. Boundary points
3. Exterior points

An interior point is a point for which cj(x) > 0 for all j. A boundary point is a point for which at least one cj(x) = 0, and an exterior point is a point for which at least one cj(x) < 0. Interior points are feasible points, boundary points may or may not be feasible points, whereas exterior points are nonfeasible points.

If a constraint cm(x) is zero during a specific iteration, the constraint is said to be active, and if cm(x∗) is zero when convergence is achieved, the optimum point x∗ is located on the boundary. In such a case, the optimum point is said to be constrained. If the constraints are all equalities, the feasible points must be located on the intersection of all the hypersurfaces corresponding to ai(x) = 0 for i = 1, 2, ..., p. The above definitions and concepts are illustrated by the following two examples.
Example 1.5 By using a graphical method, solve the following optimization problem

minimize f(x) = x1² + x2² − 4x1 + 4
subject to: c1(x) = x1 − 2x2 + 6 ≥ 0

Solution Since f(x) = (x1 − 2)² + x2², the contours of f(x) are circles of radius √f(x) centered at x1 = 2, x2 = 0. Constraints c1(x) and c2(x) dictate that the feasible points lie on one side of the corresponding constraint boundaries, respectively, while constraints c3(x) and c4(x) dictate that x1 and x2 be positive. The contours of f(x) and the boundaries of the constraints can be constructed as shown in Fig. 1.5.

The feasible region for this problem is the shaded region in Fig. 1.5. The solution is located at point A on the boundary of constraint c2(x). In effect, the solution is a constrained optimum point. Consequently, if this problem is solved by means of mathematical programming, constraint c2(x) will be active when the solution is reached.

In the absence of constraints, the minimization of f(x) would yield point B as the solution.

Figure 1.5. Graphical construction for Example 1.5.
Example 1.6 By using a graphical method, solve the optimization problem

minimize f(x) = x1² + x2² + 2x2
subject to: a1(x) = x1² + x2² − 1 = 0

Solution Since f(x) = x1² + (x2 + 1)² − 1, the contours of f(x) in the (x1, x2) plane are concentric circles with radius √(f(x) + 1), centered at x1 = 0, x2 = −1. Constraint a1(x) is a circle centered at the origin with radius 1. On the other hand, constraint c1(x) is a straight line since it is required that

x2 ≥ −x1 + 0.5

The last two constraints dictate that x1 and x2 be nonnegative. Hence the required construction can be obtained as depicted in Fig. 1.6.

In this case, the feasible region is the arc of circle a1(x) = 0 located in the first quadrant of the (x1, x2) plane. The solution, which is again a constrained optimum point, is located at point A. There are two active constraints in this example, namely, a1(x) and c3(x).

In the absence of constraints, the solution would be point B in Fig. 1.6.

Figure 1.6. Graphical construction for Example 1.6.
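Since the equality constraint confines x to the unit circle, the graphical solution of Example 1.6 can be checked numerically by scanning points on the circle and keeping only those that satisfy the inequality constraints, as in the following Python sketch. This brute-force check is not a method from the book; it merely confirms the location of point A and the active constraints.

```python
import numpy as np

f = lambda x: x[0] ** 2 + x[1] ** 2 + 2.0 * x[1]
ineqs = [lambda x: x[0] + x[1] - 0.5,   # c1(x) >= 0, i.e., x2 >= -x1 + 0.5
         lambda x: x[0],                # x1 >= 0
         lambda x: x[1]]                # x2 >= 0

t = np.linspace(0.0, 2.0 * np.pi, 20001)
pts = np.column_stack([np.cos(t), np.sin(t)])        # points on a1(x) = 0
mask = [all(c(p) >= 0.0 for c in ineqs) for p in pts]
feasible = pts[mask]                                  # the arc in Fig. 1.6
best = feasible[np.argmin([f(p) for p in feasible])]
print(best, f(best))   # approximately [1, 0] and 1, i.e., point A
```

At the computed point, a1(x) = 0 and x2 = 0 hold simultaneously, which matches the statement that a1(x) and c3(x) are the active constraints.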
In the above examples, the set of points comprising the feasible region are simply connected as depicted in Fig. 1.7a. Sometimes the feasible region may consist of two or more disjoint sub-regions, as depicted in Fig. 1.7b. If this is the case, the following difficulty may arise. A typical optimization algorithm is an iterative numerical procedure that will generate a series of progressively improved solutions, starting with an initial estimate for the solution. Therefore, if the feasible region consists of two sub-regions, say, A and B, an initial estimate for the solution in sub-region A is likely to yield a solution in sub-region A, and a better solution in sub-region B may be missed. Fortunately, however, in most real-life optimization problems, this difficulty can be avoided by formulating the problem carefully.

Figure 1.7. Examples of simply connected and disjoint feasible regions.
1.6 Branches of Mathematical Programming

Several branches of mathematical programming were enumerated in Sec. 1.1, namely, linear, integer, quadratic, nonlinear, and dynamic programming. Each one of these branches of mathematical programming consists of the theory and application of a collection of optimization techniques that are suited to a specific class of optimization problems. The differences among the various branches of mathematical programming are closely linked to the structure of the optimization problem and to the mathematical nature of the objective and constraint functions. A brief description of each branch of mathematical programming is as follows.

1.6.1 Linear programming

If the objective and constraint functions are linear and the variables are constrained to be positive, as in Example 1.4, the general optimization problem assumes the form of a linear programming (LP) problem.
1.6.2 Integer programming

In certain linear programming problems, at least some of the variables are required to assume only integer values. This restriction renders the programming problem nonlinear. Nevertheless, the problem is referred to as linear since the objective and constraint functions are linear [10].

1.6.3 Quadratic programming

If the constraint functions are linear and the objective function is quadratic with a quadratic term characterized by a matrix Q, and Q is a positive definite or semidefinite symmetric square matrix, then the constraints are linear and the objective function is quadratic. Such an optimization problem is said to be a quadratic programming (QP) problem (see Chap. 10 of [5]). A typical example of this type of problem is as follows:

minimize f(x) = ½x1² + ½x2² − x1 − 2x2
subject to: c1(x) = 6 − 2x1 − 3x2 ≥ 0
algo-The choice of optimization algorithm depends on the mathematical behaviorand structure of the objective function Most of the time, the objective function
is a well behaved nonlinear function and all that is necessary is a purpose, robust, and efficient algorithm For certain applications, however,specialized algorithms exist which are often more efficient than general-purposeones These are often referred to by the type of norm minimized, for example,
general-an algorithm that minimizes general-an L1, L2, or L ∞ norm is said to by an L1, L2, or
minimax algorithm.
In many applications, a series of decisions must be made in sequence, wheresubsequent decisions are influenced by earlier ones In such applications, anumber of optimizations have to be performed in sequence and a general strat-egy may be required to achieve an overall optimum solution For example, alarge system which cannot be optimized owing to the size and complexity ofthe problem can be partitioned into a set of smaller sub-systems that can beoptimized individually Often individual sub-systems interact with each otherand, consequently, a general solution strategy is required if an overall optimumsolution is to be achieved Dynamic programming is a collection of techniquesthat can be used to develop general solution strategies for problems of the typejust described It is usually based on the use of linear, integer, quadratic ornonlinear optimization algorithms
References
1. G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, Princeton, N.J., 1963.
2. D. M. Himmelblau, Applied Nonlinear Programming, McGraw-Hill, New York, 1972.
3. P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, Academic Press, London, 1981.
4. D. G. Luenberger, Linear and Nonlinear Programming, 2nd ed., Addison-Wesley, Reading, MA, 1984.
5. R. Fletcher, Practical Methods of Optimization, 2nd ed., Wiley, Chichester, UK, 1987.
6. B. C. Kuo, Automatic Control Systems, 5th ed., Prentice Hall, Englewood Cliffs, N.J., 1987.
7. K. D. Leaver and B. N. Chapman, Thin Films, Wykeham, London, 1971.
8. O. S. Heavens, Thin Film Physics, Methuen, London, 1970.
9. Z. Knittl, Optics of Thin Films, An Optical Multilayer Theory, Wiley, New York, 1976.
10. G. L. Nemhauser and L. A. Wolsey, Integer and Combinatorial Optimization, Wiley, New York, 1988.