…problems that require the full line search machinery. Hence, in general, the convex simplex method may not be a bargain.
The concept of feasible direction methods is a straightforward and logical extension
of the methods used for unconstrained problems, but it leads to some subtle difficulties. These methods are susceptible to jamming (lack of global convergence) because many simple direction finding mappings and the usual line search mapping are not closed.

Problems with inequality constraints can be approached with an active set strategy. In this approach certain constraints are treated as active and the others are treated as inactive. By systematically adding and dropping constraints from the working set, the correct set of active constraints is determined during the search process. In general, however, an active set method may require that several constrained problems be solved exactly.
The most practical primal methods are the gradient projection method and the reduced gradient method. Both of these basic methods can be regarded as the method of steepest descent applied on the surface defined by the active constraints. The rate of convergence for the two methods can be expected to be approximately equal and is determined by the eigenvalues of the Hessian of the Lagrangian restricted to the subspace tangent to the active constraints. Of the two methods, the reduced gradient method seems to be best. It can be easily modified to ensure against jamming, and it requires fewer computations per iterative step; therefore, for most problems, it will probably converge in less time than the gradient projection method.
2. Sometimes a different normalizing term is used in (4). Show that the problem of finding
3. Perhaps the most natural normalizing term to use in (4) is one based on the Euclidean norm. This leads to the problem of finding d = (d₁, d₂, …, dₙ) to
4. Let Ω ⊂ Eⁿ be a given feasible region. A set Γ ⊂ E²ⁿ consisting of pairs (x, d), with x ∈ Ω and d a feasible direction at x, is said to be a set of uniformly feasible direction vectors if there is a δ > 0 such that (x, d) ∈ Γ implies that x + αd is feasible for all α, 0 ≤ α ≤ δ. The number δ is called the feasibility constant of Γ.

Let Γ ⊂ E²ⁿ be a set of uniformly feasible direction vectors for Ω, with feasibility constant δ. Define the line search map

    M(x, d) = {y : y = x + αd for some α, 0 ≤ α ≤ δ, and f(y) ≤ f(x + βd) for all β, 0 ≤ β ≤ δ}.

Show that if d ≠ 0, the map M is closed at (x, d).
5. Let Γ ⊂ E²ⁿ be a set of uniformly feasible direction vectors for Ω, with feasibility constant δ, and consider a direction-finding problem normalized by

    Σᵢ₌₁ⁿ |dᵢ| = 1.
where M is some given positive constant. For large M the ith inequality of this subsidiary problem will be active only if the corresponding inequality in the original problem is nearly active at x (indeed, note that M → ∞ corresponds to Zoutendijk's method). Show that this direction finding mapping is closed and generates uniformly feasible directions with feasibility constant 1/M.
7. Generalize the method of Exercise 6 so that it is applicable to nonlinear inequalities.
8. An alternate, but equivalent, definition of the projected gradient p is that it is the vector solving

    minimize |g − p|²
    subject to A_q p = 0.

Using the Karush–Kuhn–Tucker necessary conditions, solve this problem and thereby derive the formula for the projected gradient.
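As a numerical aside (not part of the original exercises), the projected-gradient formula p = (I − A_qᵀ(A_q A_qᵀ)⁻¹A_q)g can be checked in a few lines. For a single active constraint row a it reduces to p = g − (aᵀg / aᵀa)a; the vectors g and a below are arbitrary illustrative data.

```python
def project(g, a):
    """Project g onto the subspace {d : a^T d = 0} (one active constraint).

    Single-row case of p = (I - A^T (A A^T)^{-1} A) g,
    which reduces to p = g - (a^T g / a^T a) a.
    """
    scale = sum(gi * ai for gi, ai in zip(g, a)) / sum(ai * ai for ai in a)
    return [gi - scale * ai for gi, ai in zip(g, a)]

g = [3.0, 1.0]          # gradient at the current point (illustrative)
a = [1.0, 1.0]          # active constraint row: x1 + x2 = const
p = project(g, a)

# p must lie in the tangent subspace, i.e. a^T p = 0
residual = sum(pi * ai for pi, ai in zip(p, a))
print(p, residual)      # [1.0, -1.0] 0.0
```

The residual confirms that A_q p = 0, while g − p is a multiple of aᵀ, exactly the structure the KKT conditions of Exercise 8 predict.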
9. Show that finding the d that solves

    minimize gᵀd
    subject to A_q d = 0, |d|² = 1

gives a vector d that has the same direction as the negative projected gradient.
10. Let P be a projection matrix. Show that Pᵀ = P and P² = P.
11. Suppose A_qᵀ = [Ā_qᵀ  a], so that A_q is the matrix Ā_q with the row aᵀ adjoined. Show that (A_q A_qᵀ)⁻¹ can be found from (Ā_q Ā_qᵀ)⁻¹ from the formula … Develop a similar formula for (Ā_q Ā_qᵀ)⁻¹ in terms of (A_q A_qᵀ)⁻¹.
12. Show that the gradient projection method will solve a linear program in a finite number of steps.
13. Suppose that the projected negative gradient d is calculated satisfying

    −g = d + A_qᵀλ,

and that some component λᵢ of λ, corresponding to an inequality, is negative. Show that if the ith inequality is dropped, the projection d̄ of the negative gradient onto the remaining constraints is a feasible direction of descent.
14. Using the result of Exercise 13, it is possible to avoid the discontinuity at d = 0 in the direction finding mapping of the simple gradient projection method. At a given point let γ = −min{0, λᵢ}, with the minimum taken with respect to the indices i corresponding to the active inequalities. The direction to be taken at this point is d = −Pg if |Pg| ≥ γ, or d̄, defined by dropping the inequality i for which λᵢ = −γ, if |Pg| < γ. (In case of equality either direction is selected.) Show that this direction finding map is closed over a region where the set of active inequalities does not change.
15. Consider the problem of maximizing entropy discussed in Example 3, Section 14.4. Suppose this problem were solved numerically with two constraints by the gradient projection method. Derive an estimate for the rate of convergence in terms of the optimal pᵢ's.
16. Find the geodesics of …
17. … is such that every point is a regular point, and suppose that the sequence of points {xₖ}, k = 0, 1, 2, …, generated by geodesic descent is bounded. Prove that every limit point of the sequence satisfies the first-order necessary conditions for a constrained minimum.
18. Show that, for linear constraints, if at some point in the reduced gradient method z is zero, that point satisfies the Karush–Kuhn–Tucker first-order necessary conditions for a constrained minimum.
19. Consider the problem … (ties are broken arbitrarily); (ii) the formula for z is replaced by

    zᵢ = −rᵢ,    if rᵢ ≤ 0
    zᵢ = −xᵢrᵢ,  if rᵢ > 0.

Establish the global convergence of this algorithm.
20. Find the exact solution to the example presented in Section 12.4.
21. Find the direction of movement that would be taken by the gradient projection method if in the example of Section 12.4 the constraint x₄ = 0 were relaxed. Show that if the term −3x₄ in the objective function were replaced by −x₄, then both the gradient projection method and the reduced gradient method would move in identical directions.
22. Show that, in terms of convergence characteristics, the reduced gradient method behaves like the gradient projection method applied to a scaled version of the problem.
23. Let r be the condition number of L_M and s the condition number of CᵀC. Show that the convergence ratio of the reduced gradient method is bounded by [(sr − 1)/(sr + 1)]².
24. Formulate the symmetric version of the hanging chain problem using a single constraint. Find an explicit expression for the condition number of the corresponding CᵀC matrix (assuming y₁ is basic). Use Exercise 23 to obtain an estimate of the convergence rate of the reduced gradient method applied to this problem, and compare it with the rate obtained in Table 12.1, Section 12.7. Repeat for the two-constraint formulation (assuming y₁ and yₙ are basic).
25. Referring to Exercise 19, establish a global convergence result for the convex simplex method.
REFERENCES
12.2 Feasible direction methods of various types were originally suggested and developed by Zoutendijk [Z4]. The systematic study of the global convergence properties of feasible direction methods was begun by Topkis and Veinott [T8] and by Zangwill [Z2].
12.3–12.4 The gradient projection method was proposed and developed (more completely than discussed here) by Rosen [R5], [R6], who also introduced the notion of an active set strategy. See Gill, Murray, and Wright [G7] for a discussion of working sets and active set strategies.
12.5 This material is taken from Luenberger [L14].
12.6–12.7 The reduced gradient method was originally proposed by Wolfe [W5] for problems with linear constraints and generalized to nonlinear constraints by Abadie and Carpentier [A1]. Wolfe [W4] presents an example of jamming in the reduced gradient method. The convergence analysis given in this section is new.
12.8 The convex simplex method, for problems with linear constraints, together with a proof of its global convergence, is due to Zangwill [Z2].
Chapter 13
PENALTY AND BARRIER METHODS
Penalty and barrier methods are procedures for approximating constrained optimization problems by unconstrained problems. The approximation is accomplished in the case of penalty methods by adding to the objective function a term that prescribes a high cost for violation of the constraints, and in the case of barrier methods by adding a term that favors points interior to the feasible region over those near the boundary. Associated with these methods is a parameter c or μ that determines the severity of the penalty or barrier and consequently the degree to which the unconstrained problem approximates the original constrained problem. For a problem with n variables and m constraints, penalty and barrier methods work directly in the n-dimensional space of variables, as compared to primal methods that work in (n − m)-dimensional space.
There are two fundamental issues associated with the methods of this chapter. The first has to do with how well the unconstrained problem approximates the constrained one. This is essential in examining whether, as the parameter c is increased toward infinity, the solution of the unconstrained problem converges to a solution of the constrained problem. The other issue, most important from a practical viewpoint, is the question of how to solve a given unconstrained problem when its objective function contains a penalty or barrier term. It turns out that as c is increased to yield a good approximating problem, the corresponding structure of the resulting unconstrained problem becomes increasingly unfavorable, thereby slowing the convergence rate of many algorithms that might be applied. (Exact penalty functions also have a very unfavorable structure.) It is necessary, then, to devise acceleration procedures that circumvent this slow-convergence phenomenon.
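The worsening structure can be seen on a tiny example that is not from the text: for the hypothetical problem of minimizing x₁² + x₂² subject to x₁ = 1, the quadratic penalty objective q(c, x) = x₁² + x₂² + (c/2)(x₁ − 1)² has Hessian diag(2 + c, 2), whose condition number grows linearly in c.

```python
def penalty_condition_number(c):
    """Condition number of the Hessian diag(2 + c, 2) of
    q(c, x) = x1^2 + x2^2 + (c/2)(x1 - 1)^2."""
    eigenvalues = [2.0 + c, 2.0]
    return max(eigenvalues) / min(eigenvalues)

for c in [1.0, 100.0, 10000.0]:
    print(c, penalty_condition_number(c))   # 1.5, 51.0, 5001.0
```

Since steepest descent converges at roughly the rate [(κ − 1)/(κ + 1)]² in the condition number κ, making the approximation good (large c) makes the unconstrained subproblem slow — precisely the phenomenon the acceleration procedures mentioned above are designed to circumvent.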
Penalty and barrier methods are of great interest to both the practitioner and the theorist. To the practitioner they offer a simple, straightforward method for handling constrained problems that can be implemented without sophisticated computer programming and that possesses much the same degree of generality as primal methods. The theorist, striving to make this approach practical by overcoming its inherently slow convergence, finds it appropriate to bring into play nearly all aspects of optimization theory, including Lagrange multipliers, necessary conditions, and many of the algorithms discussed earlier in this book. The canonical rate of convergence associated with the original constrained problem again asserts its fundamental role by essentially determining the natural accelerated rate of convergence for unconstrained penalty or barrier problems.
where c is a positive constant and P is a function on Eⁿ satisfying: (i) P is continuous, (ii) P(x) ≥ 0 for all x ∈ Eⁿ, and (iii) P(x) = 0 if and only if x ∈ S.

Example 1. Suppose S is defined by a number of inequality constraints:

    S = {x : gᵢ(x) ≤ 0, i = 1, 2, …, p}.
A very useful penalty function in this case is

    P(x) = ½ Σᵢ₌₁ᵖ (max[0, gᵢ(x)])².
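This penalty function can be transcribed directly; the two constraints g₁(x) = x₁ − 1 ≤ 0 and g₂(x) = −x₂ ≤ 0 below are hypothetical illustrative data, not from the text.

```python
def P(x, constraints):
    """Quadratic penalty P(x) = (1/2) * sum_i (max(0, g_i(x)))^2."""
    return 0.5 * sum(max(0.0, g(x)) ** 2 for g in constraints)

g1 = lambda x: x[0] - 1.0    # constraint x1 <= 1
g2 = lambda x: -x[1]         # constraint x2 >= 0

print(P([0.5, 2.0], [g1, g2]))    # 0.0  (feasible: P vanishes on S)
print(P([3.0, -1.0], [g1, g2]))   # 2.5  (= 0.5 * (2^2 + 1^2))
```

Note that P is continuous, nonnegative, and zero exactly on the feasible set, matching conditions (i)–(iii) above.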
For each k solve the problem

    minimize q(cₖ, x),  (4)

obtaining a solution point xₖ.
We assume here that, for each k, problem (4) has a solution. This will be true, for example, if q(c, x) increases unboundedly as |x| → ∞. (Also see Exercise 2 to see that it is not necessary to obtain the minimum precisely.)
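A minimal sketch of the iteration on a hypothetical one-dimensional instance: minimize f(x) = x² subject to x ≥ 1, with P(x) = ½ max(0, 1 − x)². Each subproblem (4) is solved here by ternary search, which is adequate because q(c, ·) is strictly convex; the schedule cₖ = 4ᵏ is an arbitrary choice.

```python
def q(c, x):
    """Penalty objective q(c, x) = f(x) + c P(x) for f(x) = x^2,
    P(x) = 0.5 * max(0, 1 - x)^2 (constraint x >= 1)."""
    return x * x + c * 0.5 * max(0.0, 1.0 - x) ** 2

def argmin_1d(func, lo=-10.0, hi=10.0, iters=200):
    # Ternary search; adequate because q(c, .) is strictly convex.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

c, xk = 1.0, 0.0
for k in range(12):              # c_k = 4^k, tending to infinity
    xk = argmin_1d(lambda x: q(c, x))
    c *= 4.0

print(xk)   # close to the constrained minimizer x* = 1
```

For this instance each subproblem has the closed-form minimizer cₖ/(cₖ + 2), so the iterates approach x* = 1 from outside the feasible region — penalty iterates are typically infeasible until the limit.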
Convergence
The following lemma gives a set of inequalities that follow directly from the definition of xₖ and the inequality cₖ₊₁ > cₖ.

Lemma 1.

    q(cₖ, xₖ) ≤ q(cₖ₊₁, xₖ₊₁)  (5)
    P(xₖ) ≥ P(xₖ₊₁)  (6)
    f(xₖ) ≤ f(xₖ₊₁).  (7)

Proof. q(cₖ₊₁, xₖ₊₁) = f(xₖ₊₁) + cₖ₊₁P(xₖ₊₁) ≥ f(xₖ₊₁) + cₖP(xₖ₊₁) ≥ f(xₖ) + cₖP(xₖ) = q(cₖ, xₖ), which proves (5).
We also have

    f(xₖ) + cₖP(xₖ) ≤ f(xₖ₊₁) + cₖP(xₖ₊₁)  (8)
    f(xₖ₊₁) + cₖ₊₁P(xₖ₊₁) ≤ f(xₖ) + cₖ₊₁P(xₖ).  (9)

Adding (8) and (9) yields

    (cₖ₊₁ − cₖ)P(xₖ₊₁) ≤ (cₖ₊₁ − cₖ)P(xₖ),

which proves (6).

Also

    f(xₖ₊₁) + cₖP(xₖ₊₁) ≥ f(xₖ) + cₖP(xₖ),

and hence using (6) we obtain (7).
Lemma 2. Let x* be a solution to problem (1). Then for each k

    f(x*) ≥ q(cₖ, xₖ) ≥ f(xₖ).
Theorem. Let {xₖ} be a sequence generated by the penalty method. Then, any limit point of the sequence is a solution to (1).
Proof. Suppose the subsequence {xₖ}, k ∈ K, is a convergent subsequence of {xₖ} having limit x̄. Then by the continuity of f, we have

    lim_{k ∈ K} f(xₖ) = f(x̄).  (10)

Let f* be the optimal value associated with problem (1). Then according to Lemmas 1 and 2, the sequence of values q(cₖ, xₖ) is nondecreasing and bounded above by f*. Thus

    lim_{k ∈ K} q(cₖ, xₖ) = q* ≤ f*,  (11)

and, subtracting (10) from (11),

    lim_{k ∈ K} cₖP(xₖ) = q* − f(x̄).  (12)
Since P(xₖ) ≥ 0 and cₖ → ∞, (12) implies

    lim_{k ∈ K} P(xₖ) = 0.

Using the continuity of P, this implies P(x̄) = 0. We therefore have shown that the limit point x̄ is feasible for (1).

To show that x̄ is optimal we note that from Lemma 2, f(xₖ) ≤ f* for each k, and hence f(x̄) = lim_{k ∈ K} f(xₖ) ≤ f*.
refer to such a set as robust. Some examples of robust and nonrobust sets are shown in Fig. 13.2. This kind of set often arises in conjunction with inequality constraints, where S takes the form

    S = {x : gᵢ(x) ≤ 0, i = 1, 2, …, p}.
Barrier methods are also termed interior methods. They work by establishing a barrier on the boundary of the feasible region that prevents a search procedure from leaving the region. A barrier function is a function B defined on the interior of S such that: (i) B is continuous, (ii) B(x) ≥ 0, (iii) B(x) → ∞ as x approaches the boundary of S.

Fig. 13.2 Robust and nonrobust sets.
where c is a positive constant.

Alternatively, it is common to formulate the barrier method as
When formulated with c we take c large (going to infinity), while when formulated with μ we take μ small (going to zero). Either way the result is a constrained problem, and indeed the constraint is somewhat more complicated than in the original problem (13). The advantage of this problem, however, is that it can be solved by using an unconstrained search technique. To find the solution one starts at an initial interior point and then searches from that point using steepest descent or some other iterative descent method applicable to unconstrained problems. Since the value of the objective function approaches infinity near the boundary of S, the search technique (if carefully implemented) will automatically remain within the interior of S, and the constraint need not be accounted for explicitly. Thus, although problem (14) or (15) is from a formal viewpoint a constrained problem, from a computational viewpoint it is unconstrained.
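The computational viewpoint can be sketched on a hypothetical instance: minimize f(x) = x subject to x ≥ 1, with the inverse barrier B(x) = 1/(x − 1), which is continuous, nonnegative on the interior, and blows up at the boundary. Minimizing f(x) + (1/c)B(x) by ternary search strictly inside the region keeps every iterate interior; the values of c below are arbitrary.

```python
def r(c, x):
    """Barrier objective r(c, x) = x + (1/c) * 1/(x - 1), defined for x > 1."""
    return x + (1.0 / c) / (x - 1.0)

def argmin_interior(func, lo=1.0, hi=100.0, iters=200):
    # Ternary search; func is evaluated only strictly inside (lo, hi),
    # so the singularity at the boundary x = 1 is never touched.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

for c in [1.0, 100.0, 10**6]:
    xc = argmin_interior(lambda x: r(c, x))
    print(c, xc)   # the exact minimizer of r(c, .) is 1 + 1/sqrt(c)
```

In contrast to the penalty method, the iterates approach the solution x* = 1 from inside the feasible region: feasibility is maintained at every step, at the cost of needing a strictly interior starting point.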
The Method
The barrier method is quite analogous to the penalty method. Let {cₖ} be a sequence tending to infinity such that for each k, k = 1, 2, …, we have cₖ ≥ 0 and cₖ₊₁ > cₖ. Define the function

    r(cₖ, x) = f(x) + (1/cₖ)B(x).
is not given explicitly but is defined implicitly by a number of functional constraints. In these situations, the penalty or barrier function is invariably defined in terms of the constraint functions themselves; and although there are an unlimited number of ways in which this can be done, some important general implications follow from this kind of construction.
For economy of notation we consider problems of the form

    minimize f(x)
    subject to gᵢ(x) ≤ 0, i = 1, 2, …, p.  (16)

For our present purposes, equality constraints are suppressed, at least notationally, by writing each of them as two inequalities. If the problem is to be attacked with a barrier method, then, of course, equality constraints are not present even in an unsuppressed version.
Penalty Functions
A penalty function for a problem expressed in the form (16) will most naturally be expressed in terms of the auxiliary constraint functions

    gᵢ⁺(x) = max[0, gᵢ(x)], i = 1, 2, …, p.

This is because in the interior of the constraint region P(x) ≡ 0, and hence P should be a function only of violated constraints. Denoting by g⁺(x) the p-dimensional vector made up of the gᵢ⁺(x)'s, we consider the general class of penalty functions

    P(x) = γ(g⁺(x)),

where γ is a continuous function on Eᵖ. One example is

    P(x) = ½ Σᵢ₌₁ᵖ gᵢ⁺(x)² = ½|g⁺(x)|²,

which is without doubt the most popular penalty function. In this case γ is one-half times the identity quadratic form on Eᵖ, that is, γ(y) = ½|y|².