David G. Luenberger, Yinyu Ye - Linear and Nonlinear Programming, International Series (Episode 2, Part 11)




15.9 Semidefinite Programming 497

transformed to semidefinite constraints, and hence the entire problem converted to a semidefinite program. This approach is useful in many applications, especially in various problems of control theory.

As in other instances of duality, the duality of semidefinite programs is weak unless other conditions hold. We state here, but do not prove, a version of the strong duality theorem.

Strong Duality in SDP. Suppose (SDP) and (SDD) are both feasible and at least one of them has an interior. Then there are optimal solutions to the primal and the dual, and their optimal values are equal.

If the non-empty interior condition of the above theorem does not hold, then the duality gap may not be zero at optimality.

Example 8. The following semidefinite program has a duality gap:

Interior-Point Algorithms for SDP

Let the primal (SDP) and dual (SDD) semidefinite programs both have interior point feasible solutions. Then the central path can be expressed as
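A standard way to write the SDP central path, mirroring the linear-programming central path of Section 15.8, is the solution set of the perturbed complementarity system below (a sketch of the usual formulation, not necessarily this book's exact notation):

```latex
\mathcal{C} = \Bigl\{ \bigl(X(\mu),\, y(\mu),\, S(\mu)\bigr) \;:\;
  A_i \bullet X = b_i,\ i = 1,\dots,m,\quad
  \sum_{i=1}^{m} y_i A_i + S = C,\quad
  X \succ 0,\ S \succ 0,\quad
  X S = \mu I,\ \mu > 0 \Bigr\}
```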

The primal-dual potential function for SDP, a descent merit function, is

ψ_{n+ρ}(X, S) = (n + ρ) log(X • S) − log(det X · det S),

where ρ ≥ 0. Note that if X and S are diagonal matrices, these definitions reduce to those for linear programming.
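The diagonal reduction can be verified numerically. The sketch below (helper names are ours, not the book's) evaluates the potential for diagonal X and S and checks that it coincides with the LP primal-dual potential (n + ρ) log(xᵀs) − Σᵢ log(xᵢsᵢ):

```python
import math

def sdp_potential(X, S, rho):
    """psi_{n+rho}(X, S) = (n+rho) log(X • S) - log(det X · det S), for diagonal X, S."""
    n = len(X)
    dot = sum(X[i][i] * S[i][i] for i in range(n))   # X • S = trace(XS) when both are diagonal
    detX = math.prod(X[i][i] for i in range(n))
    detS = math.prod(S[i][i] for i in range(n))
    return (n + rho) * math.log(dot) - math.log(detX * detS)

def lp_potential(x, s, rho):
    """LP primal-dual potential: (n+rho) log(x's) - sum_i log(x_i s_i)."""
    n = len(x)
    return (n + rho) * math.log(sum(xi * si for xi, si in zip(x, s))) \
           - sum(math.log(xi * si) for xi, si in zip(x, s))

x, s, rho = [1.0, 2.0, 0.5], [3.0, 0.5, 4.0], 2.0
X = [[x[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
S = [[s[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
assert abs(sdp_potential(X, S, rho) - lp_potential(x, s, rho)) < 1e-12
```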


498 Chapter 15 Primal-Dual Methods

Once we have an interior feasible point (X, y, S), we can generate a new iterate (X⁺, y⁺, S⁺) by solving for (D_X, d_y, D_S) from the primal-dual system of linear equations.

is toward the solution of the corresponding equality constrained problem. This method will solve a quadratic program in a finite number of steps.

For general nonlinear programming problems, many of the standard methods for solving systems of equations can be adapted to the corresponding necessary equations. One class consists of first-order methods that move in a direction related to the residual (that is, the error) in the equations. Another class of methods is based on extending the method of conjugate directions to nonpositive-definite systems. Finally, a third class is based on Newton's method for solving systems of nonlinear equations, solving a linearized version of the system at each iteration. Under appropriate assumptions, Newton's method has excellent global as well as local convergence properties, since the simple merit function

(1/2)|∇f(x) + λᵀ∇h(x)|² + (1/2)|h(x)|²

decreases in the Newton direction. An individual step of Newton's method


15.11 Exercises 499

is equivalent to solving a quadratic programming problem, and thus Newton's method can be extended to problems with inequality constraints through recursive quadratic programming.

More effective methods are developed by accounting for the special structure of the linearized version of the necessary conditions and by introducing approximations to the second-order information. In order to assure global convergence of these methods, a penalty (or merit) function must be specified that is compatible with the method of direction selection, in the sense that the direction is a direction of descent for the merit function. The absolute-value penalty function and the standard quadratic penalty function are both compatible with some versions of recursive quadratic programming.

The best of the primal-dual methods take full account of special structure, and are based on direction-finding procedures that are closely related to methods described in earlier chapters. It is not surprising, therefore, that the convergence properties of these methods are also closely related to those of other chapters. Again we find that the canonical rate is fundamental for properly designed first-order methods.

Interior point methods in the primal-dual mode are very effective for treating problems with inequality constraints, for they avoid (or at least minimize) the difficulties associated with determining which constraints will be active at the solution. Applied to general nonlinear programming problems, these methods closely parallel the interior point methods for linear programming. There is again a central path, and Newton's method is a good way to follow the path.

A relatively new class of mathematical programming problems is semidefinite programming, where the unknown is a matrix and at least some of the constraints require the unknown matrix to be positive semidefinite (or negative semidefinite). There is a variety of interesting and important practical problems that can be naturally cast in this form. Because many problems which appear nonlinear (such as quadratic problems) become essentially linear in semidefinite form, the efficient interior point algorithms for linear programming can be extended to these problems as well.

1. Solve the quadratic program

minimize x² − xy + y² − 3x
subject to x ≥ 0
y ≥ 0
x + y ≤ 4

by use of the active set method starting at x = y = 0.
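As an independent check on the answer (not a substitute for carrying out the active set iterations), the unconstrained minimizer of the objective happens to satisfy all three constraints strictly, so it solves the QP:

```python
# Stationarity: 2x - y - 3 = 0 and -x + 2y = 0, i.e. [[2, -1], [-1, 2]] [x, y]' = [3, 0]'.
# Solve the 2x2 system by Cramer's rule.
det = 2 * 2 - (-1) * (-1)                 # = 3
x = (3 * 2 - (-1) * 0) / det              # = 2.0
y = (2 * 0 - (-1) * 3) / det              # = 1.0

f = lambda x, y: x**2 - x*y + y**2 - 3*x
assert (x, y) == (2.0, 1.0)
assert x >= 0 and y >= 0 and x + y <= 4   # all constraints inactive at the optimum
assert f(x, y) == -3.0                    # optimal value
```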


Assume that L(x∗, λ∗) is positive definite and that ∇h(x∗) is of full rank.

a) Show that the real part of each eigenvalue of C is positive.

b) Show that the iterative process

x_{k+1} = x_k − α∇ₓl(x_k, λ_k)ᵀ
λ_{k+1} = λ_k + αh(x_k)

converges locally to x∗, λ∗. (That is, if started sufficiently close to x∗, λ∗, the process converges to x∗, λ∗.) Hint: Use Ostrowski's Theorem: Let A(z) be a continuously differentiable mapping from Eᵖ to Eᵖ, assume A(z∗) = 0, and let ∇A(z∗) have all eigenvalues strictly inside the unit circle of the complex plane. Then z_{k+1} = z_k + A(z_k) converges locally to z∗.

3. Let A be a real symmetric matrix. A vector x is singular if xᵀAx = 0. A pair of vectors x, y is a hyperbolic pair if both x and y are singular and xᵀAy ≠ 0. Hyperbolic pairs can be used to generalize the conjugate gradient method to the nonpositive definite case.

a) If p_k is singular, show that if p_{k+1} is defined as

p_{k+1} = Ap_k − [((Ap_k)ᵀA²p_k) / (2|Ap_k|²)] p_k,

then p_k, p_{k+1} is a hyperbolic pair.
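Part (a) can be checked numerically on a small indefinite example; the matrix and vectors below are our illustrative choices:

```python
# Numerical check of part (a) with the indefinite matrix A = diag(1, -1),
# for which p = (1, 1) is singular: p'Ap = 1 - 1 = 0.
def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1.0, 0.0], [0.0, -1.0]]
p = [1.0, 1.0]
assert dot(p, matvec(A, p)) == 0.0            # p is singular

Ap = matvec(A, p)                             # (1, -1)
A2p = matvec(A, Ap)                           # A^2 p
beta = dot(Ap, A2p) / (2.0 * dot(Ap, Ap))     # ((Ap)'A^2 p) / (2 |Ap|^2)
p_next = [Ap[i] - beta * p[i] for i in range(2)]

assert dot(p_next, matvec(A, p_next)) == 0.0  # p_{k+1} is singular
assert dot(p, matvec(A, p_next)) != 0.0       # p'A p_{k+1} != 0: a hyperbolic pair
```

In general p_kᵀA p_{k+1} = |Ap_k|², which is nonzero whenever A is nonsingular and p_k ≠ 0, so the pair is always hyperbolic.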

b) Consider a modification of the conjugate gradient process of Section 8.3, where if p_k is singular, p_{k+1} is generated as above, and the next point is then found using the pair p_k, p_{k+1}. Show that if p_{k+1} is the second member of a hyperbolic pair and r_k ≠ 0, then x_{k+2} ≠ x_{k+1}, which means the process does not get "stuck."


4. Another method for solving a system Ax = b when A is nonsingular and symmetric is the conjugate residual method. In this method the direction vectors are constructed to be an A²-orthogonalized version of the residuals r_k = b − Ax_k. The error function E(x) = |Ax − b|² decreases monotonically in this process. Since the directions are based on r_k rather than the gradient of E, which is −2Ar_k, the method extends the simplicity of the conjugate gradient method by implicit use of the fact that A² is positive definite.

The method is this: Set p₁ = r₁ = b − Ax₁ and repeat the following steps, omitting (a, b) on the first step.

Show that the directions p_k are A²-orthogonal.
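A minimal sketch of the conjugate residual iteration described above (our implementation, shown on a small symmetric positive definite example; the indefinite case needs the safeguards of Exercise 3). It records the residual norms, which should decrease monotonically, and the vectors Ap_k, which should be mutually orthogonal since the p_k are A²-orthogonal:

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def axpy(a, x, y):
    # a * x + y, componentwise
    return [a * xi + yi for xi, yi in zip(x, y)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_residual(A, b, iters):
    """Conjugate residual method for symmetric nonsingular A (sketch)."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                                  # residual b - Ax for x = 0
    p = r[:]
    Ap = matvec(A, p)
    rAr = dot(r, matvec(A, r))
    res_hist = [dot(r, r) ** 0.5]             # |Ax - b| at each step
    Ap_hist = []
    for _ in range(iters):
        alpha = rAr / dot(Ap, Ap)             # minimizes |b - A(x + alpha p)|
        x = axpy(alpha, p, x)
        r = axpy(-alpha, Ap, r)
        Ap_hist.append(Ap[:])
        res_hist.append(dot(r, r) ** 0.5)
        Ar = matvec(A, r)
        rAr_new = dot(r, Ar)
        beta = rAr_new / rAr
        p = axpy(beta, p, r)                  # p <- r + beta p
        Ap = axpy(beta, Ap, Ar)               # A p updated without an extra matrix product
        rAr = rAr_new
    return x, res_hist, Ap_hist

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, res_hist, Ap_hist = conjugate_residual(A, b, 3)

assert res_hist[-1] < 1e-8                     # solved within n = 3 steps
assert all(res_hist[i + 1] <= res_hist[i] + 1e-12
           for i in range(len(res_hist) - 1))  # E(x) decreases monotonically
assert abs(dot(Ap_hist[0], Ap_hist[1])) < 1e-8  # (Ap_i)'(Ap_j) = p_i'A^2 p_j = 0
```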

5. Consider the (n + m)-dimensional system of equations

is the first m components of x. The system can then be written

a) Assume that L is positive definite on the tangent space {x : Ax = 0}. Derive an explicit statement equivalent to this assumption in terms of the positive definiteness of some (n − m) × (n − m) matrix.

b) Solve the system in terms of the submatrices of the partitioned form.

6. Consider the partitioned square matrix M of the form

where Q = (A − BD⁻¹C)⁻¹, provided that all indicated inverses exist. Use this result to verify the rate of convergence result in Section 15.7.
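The identity behind Q can be spot-checked in the scalar case, where each block of M is 1 × 1 and the top-left entry of M⁻¹ must equal (A − BD⁻¹C)⁻¹; the same computation carries over blockwise, which is what the exercise uses:

```python
from fractions import Fraction as F

# Scalar (1x1 block) instance: for M = [[a, b], [c, d]], the top-left entry of
# M^{-1} is d / (ad - bc), which must equal (a - b d^{-1} c)^{-1}.
a, b, c, d = F(5), F(2), F(3), F(4)
det = a * d - b * c                      # = 14
inv_top_left = d / det                   # top-left entry of M^{-1}
Q = 1 / (a - b * (1 / d) * c)            # (A - B D^{-1} C)^{-1}, the Schur complement inverse
assert inv_top_left == Q                 # both equal 2/7 here
```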

7. For the problem

minimize f(x)
subject to g(x) ≤ 0,

where g(x) is r-dimensional, define the penalty function

p(x) = f(x) + c max{0, g₁(x), g₂(x), …, g_r(x)}.

Let d be a solution to the quadratic program

minimize (1/2)dᵀBd + ∇f(x)d
subject to g(x) + ∇g(x)d ≤ 0.

a) Show that if d ≠ 0 is a solution, then d is a descent direction for p.

b) If d = 0 is a solution, show that x is a critical point of p in the sense that for any

a) Under standard assumptions on the original problem, show that for sufficiently large c, the function is (locally) an exact penalty function.


c) Indicate how the function can be defined for problems with inequality constraints.

10. Let {B_k} be a sequence of positive definite symmetric matrices, and assume that there are constants a > 0, b > 0 such that a|x|² ≤ xᵀB_k x ≤ b|x|² for all x. Suppose that B is replaced by B_k in the kth step of the recursive quadratic programming procedure of the theorem in Section 15.5. Show that the conclusions of that theorem are still valid. Hint: Note that the set of allowable B_k's is closed.

11. (Central path theorem) Prove the central path theorem, Theorem 1 of Section 15.8, for convex optimization.

12. Prove the potential reduction theorem, Theorem 2 of Section 15.8, for convex quadratic programming. This theorem can be generalized to non-quadratic convex objective functions f(x) satisfying the following condition: let

Such a condition is called the scaled Lipschitz condition in {x : x > 0}.

13. Let A and B be two symmetric and positive semidefinite matrices. Prove that

15. Let X and S both be positive definite. Prove that

n log(X • S) − log(det X · det S) ≥ n log n.
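For diagonal X and S the inequality reduces, with tᵢ = xᵢsᵢ, to n log(Σᵢ tᵢ) − Σᵢ log tᵢ ≥ n log n, which is the arithmetic-geometric mean inequality; a numerical spot-check (helper name is ours):

```python
import math, random

def potential_gap(t):
    """n log(sum t_i) - log(prod t_i) - n log n, for positive t_i.
    For diagonal X, S take t_i = x_i * s_i; then X • S = sum t_i and
    det X · det S = prod t_i."""
    n = len(t)
    return n * math.log(sum(t)) - sum(math.log(ti) for ti in t) - n * math.log(n)

random.seed(0)
for _ in range(100):
    t = [random.uniform(0.1, 10.0) for _ in range(4)]
    assert potential_gap(t) >= -1e-12          # the inequality holds

assert abs(potential_gap([2.0] * 4)) < 1e-12   # equality when XS is a multiple of I
```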


16. Consider an SDP and the potential level set

Ψ(δ) = {(X, y, S) ∈ F° : ψ_{n+ρ}(X, S) ≤ δ}.

Prove that

Ψ(δ₁) ⊂ Ψ(δ₂) if δ₁ ≤ δ₂, and that for every δ, Ψ(δ) is bounded and its closure has non-empty intersection with the SDP solution set.

17. Let both (SDP) and (SDD) have interior feasible points. Then for any 0 < μ < ∞, the central path point (X(μ), y(μ), S(μ)) exists and is unique. Moreover,

i) the central path point (X(μ), y(μ), S(μ)) is bounded for 0 < μ ≤ μ₀, for any given 0 < μ₀ < ∞.

ii) For 0 < μ′ < μ,

C • X(μ′) < C • X(μ) and bᵀy(μ′) > bᵀy(μ)

if X(μ′) ≠ X(μ) and y(μ′) ≠ y(μ).

iii) (X(μ), y(μ), S(μ)) converges to an optimal solution pair for (SDP) and (SDD); the rank of the limit of X(μ) is maximal among all optimal solutions of (SDP), and the rank of the limit of S(μ) is maximal among all optimal solutions of (SDD).

REFERENCES

15.1 An early method for solving quadratic programming problems is the principal pivoting method of Dantzig and Wolfe; see Dantzig [D6]. For a discussion of factorization methods applied to quadratic programming, see Gill, Murray, and Wright [G7].

15.4 Arrow and Hurwicz [A9] proposed a continuous process (represented as a system of differential equations) for solving the Lagrange equations. This early paper showed the value of the simple merit function in attacking the equations. A formal discussion of the properties of the simple merit function may be found in Luenberger [L17]. The first-order method was examined in detail by Polak [P4]. Also see Zangwill [Z2] for an early analysis of a method for inequality constraints. The conjugate direction method was first extended to nonpositive definite cases by the use of hyperbolic pairs and then by employing conjugate residuals (see Exercises 3 and 4, and Luenberger [L9], [L11]). Additional methods with somewhat better numerical properties were later developed by Paige and Saunders [P1] and by Fletcher [F8]. It is perhaps surprising that Newton's method was analyzed in this form only recently, well after the development of the SOLVER method discussed in Section 15.3. For a comprehensive account of Newton methods, see Bertsekas, Chapter 4 [B11]. The SOLVER method was proposed by Wilson [W2] for convex programming problems and was later interpreted by Beale [B7]. Garcia-Palomares and Mangasarian [G3] proposed a quadratic programming approach to the solution of the first-order equations. See Fletcher [F10] for a good overview discussion.

15.6–15.7 The discovery that the absolute-value penalty function is compatible with recursive quadratic programming was made by Pshenichny (see Pshenichny and Danilin [P10]) and


15.8 Many researchers have applied interior-point algorithms to convex quadratic problems. These algorithms can be divided into three groups: the primal algorithm, the dual algorithm, and the primal-dual algorithm. Relations among these algorithms can be seen in den Hertog [H6], Anstreicher et al. [A6], Sun and Qi [S12], Tseng [T12], and Ye [Y3].

15.9 There have been several remarkable applications of SDP; see, for example, Goemans and Williamson [G8], Boyd et al. [B22], Vandenberghe and Boyd [V2], and Biswas and Ye [B17]. For the sensor localization problem see Biswas and Ye [B17]. For discussion of Schur complements see Boyd and Vandenberghe [B23]. The SDP example with a duality gap was constructed by Freund. The primal potential reduction algorithm for positive semidefinite programming is due to Alizadeh [A4, A3] and to Nesterov and Nemirovskii [N2]. The primal-dual SDP algorithm described here is due to Nesterov and Todd [N3].

15.11 For results similar to those of Exercises 2, 7, and 8, see Bertsekas [B11]. For discussion of Exercise 9, see Fletcher [F10].


The union of two sets S and T is denoted S ∪ T and is the set consisting of the elements that belong to either S or T. The intersection of two sets S and T is denoted S ∩ T and is the set consisting of the elements that belong to both S and T. If S is a subset of T, that is, if every member of S is also a member of T, we write S ⊂ T.

Sets of Real Numbers

a ≤ x ≤ b. A rounded, instead of square, bracket denotes strict inequality in the definition; thus (a, b] consists of all x satisfying a < x ≤ b.


508 Appendix A Mathematical Review

If S is a set of real numbers bounded above, then there is a smallest real number y such that x ≤ y for all x ∈ S. The number y is called the least upper bound, or supremum, of S and is denoted sup {x : x ∈ S}.

A matrix is a rectangular array of numbers, called elements. The matrix itself is denoted by a boldface letter. When specific numbers are not used, the elements are denoted by italicized lower-case letters having a double subscript. Thus we write

for a matrix A having m rows and n columns. Such a matrix is referred to as an m × n matrix. If we wish to specify a matrix by defining a general element, we use the notation A = [a_ij].

An m × n matrix all of whose elements are zero is called a zero matrix and denoted 0. A square matrix (a matrix with m = n) whose elements a_ij = 0 for i ≠ j, and a_ii = 1 for i = 1, 2, …, n, is said to be an identity matrix and denoted I.

The sum of two m × n matrices A and B is written A + B and is the matrix whose elements are the sums of the corresponding elements in A and B. The product of a matrix A and a scalar λ, written λA or Aλ, is obtained by multiplying each element of A by λ. The product AB of an m × n matrix A and an n × p matrix B is the m × p matrix C with elements c_ij = Σ_{k=1}^{n} a_ik b_kj.
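The element formula translates directly into a triple loop; a minimal sketch:

```python
def matmul(A, B):
    """C = AB with c_ij = sum_k a_ik * b_kj, for an m x n matrix A and n x p matrix B."""
    m, n, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]           # 3 x 2
C = matmul(A, B)         # 2 x 2 result
assert C == [[58, 64], [139, 154]]
```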

The transpose of an m × n matrix A is the n × m matrix Aᵀ with elements aᵀ_ij = a_ji. A (square) matrix A is symmetric if Aᵀ = A. A square matrix A is nonsingular if there is a matrix A⁻¹, called the inverse of A, such that A⁻¹A = I = AA⁻¹. The determinant of a square matrix A is denoted by det(A). The determinant is nonzero if and only if the matrix is nonsingular. Two square n × n matrices A and B are similar if there is a nonsingular matrix S such that B = S⁻¹AS.

Matrices having a single row are referred to as row vectors; matrices having a single column are referred to as column vectors. Vectors of either type are usually denoted by lower-case boldface letters. To economize page space, row vectors are written a = (a₁, a₂, …, aₙ). Since column vectors are used frequently, this notation avoids the necessity to


A.3 Spaces 509

display numerous columns. To further distinguish rows from columns, we write a ∈ Eⁿ if a is a column vector with n components, and we write b ∈ Eₙ if b is a row vector with n components.

It is often convenient to partition a matrix into submatrices. This is indicated by drawing partitioning lines through the matrix, as for example,

The resulting submatrices are usually denoted A_ij, as illustrated.

A matrix can be partitioned into either column or row vectors, in which case a special notation is convenient. Denoting the columns of an m × n matrix A by

operations on the components. We write x ≥ 0 if each component of x is nonnegative.

The line segment connecting two vectors x and y is denoted [x, y] and consists

The scalar product of two vectors x = (x₁, x₂, …, xₙ) and y = (y₁, y₂, …, yₙ) is defined as xᵀy = yᵀx = Σ_{i=1}^{n} x_i y_i. The vectors x and y are said to be orthogonal if xᵀy = 0. For any two vectors x and y in Eⁿ, the Cauchy-Schwarz Inequality holds: |xᵀy| ≤ (xᵀx)^{1/2}(yᵀy)^{1/2}.

A set of vectors a₁, a₂, …, a_k is said to be linearly dependent if there are scalars λ₁, λ₂, …, λ_k, not all zero, such that Σ_{i=1}^{k} λ_i a_i = 0. If no such set of scalars exists, the vectors are said to be linearly independent. A linear combination of the vectors a₁, a₂, …, a_k is a vector of the form Σ_{i=1}^{k} λ_i a_i. The set of vectors that are linear combinations of a₁, a₂, …, a_k is the set spanned by the vectors. A linearly independent set of vectors that spans Eⁿ is said to be a basis for Eⁿ. Every basis for Eⁿ contains exactly n vectors.

The rank of a matrix A is equal to the maximum number of linearly independent columns in A. This number is also equal to the maximum number of linearly independent rows in A. The m × n matrix A is said to be of full rank if the rank of A is equal to the minimum of m and n.
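The equality of row rank and column rank can be checked with a small exact-arithmetic elimination (our helper, not from the text):

```python
from fractions import Fraction

def rank(M):
    """Rank via Gaussian elimination on a copy of M, in exact rational arithmetic."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(A), len(A[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        A[r], A[pivot] = A[pivot], A[r]   # bring pivot row into position
        for i in range(rows):
            if i != r and A[i][c] != 0:
                f = A[i][c] / A[r][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def transpose(M):
    return [list(col) for col in zip(*M)]

# Third row is the sum of the first two, so the rank is 2.
A = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]
assert rank(A) == 2                   # maximum number of independent columns
assert rank(transpose(A)) == 2        # equals the number of independent rows
# A is 3 x 4, so full rank would be min(3, 4) = 3; this A is not of full rank.
```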

A subspace M of Eⁿ is a subset that is closed under the operations of vector addition and scalar multiplication; that is, if a and b are vectors in M, then αa + βb is also in M for every pair of scalars α, β. The dimension of a subspace M is equal to the maximum number of linearly independent vectors in M. If M is a subspace

