
ADJOINT-BASED PREDICTOR-CORRECTOR SEQUENTIAL CONVEX PROGRAMMING FOR PARAMETRIC NONLINEAR OPTIMIZATION

QUOC TRAN DINH, CARLO SAVORGNAN, AND MORITZ DIEHL

Abstract. This paper proposes an algorithmic framework for solving parametric optimization problems which we call adjoint-based predictor-corrector sequential convex programming. After presenting the algorithm, we prove a contraction estimate that guarantees the tracking performance of the algorithm. Two variants of this algorithm are investigated. The first can be used to treat online parametric nonlinear programming problems when the exact Jacobian matrix is available, while the second variant is used to solve nonlinear programming problems. The local convergence of these variants is proved. An application to a large-scale benchmark problem that originates from nonlinear model predictive control of a hydro power plant is implemented to examine the performance of the algorithms.

Key words. predictor-corrector path-following, sequential convex programming, adjoint-based optimization, parametric nonlinear programming, online optimization

AMS subject classifications. 49M37, 65K05, 90C31

DOI. 10.1137/110844349

1. Introduction. In this paper, we consider a parametric nonconvex optimization problem of the form

P(ξ):  min_x { f(x) | g(x) + Mξ = 0, x ∈ Ω },

where f : R^n → R is convex, g : R^n → R^m is nonlinear, Ω ⊆ R^n is a nonempty, closed convex set, and the parameter ξ belongs to a given subset P ⊆ R^p. The matrix M ∈ R^{m×p} plays the role of embedding the parameter ξ into the equality constraints in a linear way. Throughout this paper, f and g are assumed to be differentiable on their domain.

Problem P(ξ) includes many (parametric) nonlinear programming problems such as standard nonlinear programs, nonlinear second order cone programs, and nonlinear semidefinite programs [32, 39, 47]. The theory of parametric optimization has been extensively studied in many research papers and monographs; see, e.g., [7, 25, 42]. This paper deals with the efficient calculation of approximate solutions to a sequence of problems of the form P(ξ), where the parameter ξ is slowly varying.

∗Received by the editors August 15, 2011; accepted for publication (in revised form) July 6, 2012; published electronically October 9, 2012. This research was supported by Research Council KUL: CoE EF/05/006 Optimization in Engineering (OPTEC), IOF-SCORES4CHEM, GOA/10/009 (MaNet), GOA/10/11; the Flemish Government through projects G.0452.04, G.0499.04, G.0211.05, G.0226.06, G.0321.06, G.0302.07, G.0320.08, G.0558.08, G.0557.08, G.0588.09, and G.0377.09; ICCoS, ANMMM, and MLDM; IWT PhD grants; the Belgian Federal Science Policy Office: IUAP P6/04; EU: ERNSI, FP7-HDMPC, FP7-EMBOCON; AMINAL; Helmholtz-viCERP; COMET-ACCM; ERC-HIGHWIND; and ITN-SADCO.

http://www.siam.org/journals/siopt/22-4/84434.html

†Department of Electrical Engineering (ESAT-SCD) and Optimization in Engineering Center (OPTEC), K.U. Leuven, B-3001 Leuven, Belgium, and Department of Mathematics-Mechanics-Informatics, Vietnam National University, Hanoi, Vietnam (quoc.trandinh@esat.kuleuven.be).

‡Department of Electrical Engineering (ESAT-SCD) and Optimization in Engineering Center (OPTEC), K.U. Leuven, B-3001 Leuven, Belgium (carlo.savorgnan@esat.kuleuven.be, moritz.diehl@esat.kuleuven.be).


In other words, for a sequence {ξ_k}_{k≥0} such that ‖ξ_{k+1} − ξ_k‖ is small, we want to approximately solve the resulting sequence of problems P(ξ_k).

In practice, sequences of problems of the form P(ξ) arise in the framework of real-time optimization, moving horizon estimation, and online data assimilation, as well as in nonlinear model predictive control (NMPC). A practical obstacle in these applications is the time limitation imposed on solving the underlying optimization problem for each value of the parameter. Instead of solving a nonlinear program completely at each sample time [3, 4, 5, 29], several online algorithms approximately solve the underlying nonlinear optimization problem by performing the first iteration of exact Newton, sequential quadratic programming (SQP), Gauss–Newton, or interior point methods [17, 40, 54]. In [17, 40] the authors only considered the algorithms in the framework of SQP methods. This approach has been proved to be efficient in practice and is widely used in many applications [14]. Recently, Zavala and Anitescu [54] proposed an inexact Newton-type method for solving online optimization problems based on the framework of generalized equations [7, 42].

Other related work considers practical problems which possess general convexity structure such as second order cone and semidefinite cone constraints and nonsmooth convexity [21, 47]. In these applications, standard optimization methods may not perform satisfactorily. Many algorithms for nonlinear second order cone and nonlinear semidefinite programming have recently been proposed and have found many applications in robust optimal control, experimental design, and topology optimization; see, e.g., [2, 21, 23, 33, 47]. These approaches can be considered as generalizations of the SQP method. Although solving semidefinite programming problems is in general time consuming due to matrix operations, in some practical applications, problems may possess only a few expensive constraints such as second order cone or semidefinite cone constraints. In this case, handling these constraints directly in the algorithm may be more efficient than transforming them into scalar constraints, as the sketch below illustrates.
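As an illustration, the following minimal sketch (ours, not from the paper; CVXPY and all problem data are assumptions) passes a second order cone constraint directly to a conic solver instead of scalarizing it:

```python
# A minimal sketch (hypothetical data): a second order cone constraint
# handled natively by a conic solver via CVXPY, rather than being
# transformed into scalar constraints.
import cvxpy as cp
import numpy as np

n = 5
c = np.ones(n)
x = cp.Variable(n)
t = cp.Variable()

constraints = [
    cp.SOC(t, x),        # ||x||_2 <= t, kept as a single conic constraint
    t <= 10.0,
    cp.sum(x) == 1.0,
]
prob = cp.Problem(cp.Minimize(c @ x + t), constraints)
prob.solve()             # conic solvers exploit the SOC structure directly
print(x.value, t.value)
```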

Contribution. The contribution of this paper is as follows:

(a) We start our paper by proposing a generic framework, called the adjoint-based predictor-corrector sequential convex programming (APCSCP) method, for solving parametric optimization problems of the form P(ξ). The algorithm is especially suited for solving nonlinear MPC problems where the evaluations of the derivatives are time-consuming. For example, it can show advantages with respect to standard techniques when applied to problems in which the number of state variables in the dynamic system is much larger than the number of control variables.

(b) We prove the stability of the tracking error for this algorithm (Theorem 3.5).

(c) In the second part of the paper the theory is specialized to the nonparametric case where a single optimization problem is solved. The local convergence of this variant is also obtained.

(d) Finally, we present a numerical application to large-scale nonlinear model predictive control of a hydro power plant with 259 state variables and 10 controls. The performance of our algorithms is compared with a standard real-time Gauss–Newton method and a conventional MPC approach.

APCSCP is based on three main ideas: sequential convex programming, predictor-corrector path-following, and adjoint-based optimization. We briefly explain these methods in the following.


1.1. Sequential convex programming. The sequential convex programming (SCP) method is a local nonconvex optimization technique. SCP solves a sequence of convex approximations of the original problem by convexifying only the nonconvex parts and preserving the structures that can efficiently be exploited by convex optimization techniques [9, 36, 38]. Note that this method is different from SQP methods, where quadratic programs are used as approximations of the problem. This approach is useful when the problem possesses general convex structures such as conic constraints, a cost function depending on matrix variables, or convex constraints resulting from a low-level problem in multilevel settings [2, 15, 47]. Due to the complexity of these structures, standard optimization techniques such as SQP and Gauss–Newton-type methods may not be convenient to apply. In the context of nonlinear conic programming, SCP approaches have been proposed under the names sequential semidefinite programming (SSDP) or SQP-type methods [12, 21, 23, 32, 33, 47]. It has been shown in [18] that the superlinear convergence is lost if the linear semidefinite programming subproblems in the SSDP algorithm are convexified. In [35] the authors considered a nonlinear program in the framework of a composite minimization problem, where the inner function is linearized to obtain a convex subproblem which is made strongly convex by adding a quadratic proximal term.

In this paper, following the work in [21, 24, 50, 52], we apply the SCP approach to solve problem P(ξ). The nonconvex constraint g(x) + Mξ = 0 is linearized at each iteration to obtain a convex approximation. The resulting subproblems can be solved by exploiting convex optimization techniques.

We would like to point out that the term "sequential convex programming" was also used in structural optimization; see, e.g., [22, 55]. The cited papers are related to methods of moving asymptotes introduced by Svanberg [49].

1.2. Predictor-corrector path-following methods. In order to illustrate the idea of the predictor-corrector path-following method [13, 54] and to distinguish it from other "predictor-corrector" concepts, e.g., the well-known predictor-corrector interior point method proposed by Mehrotra in [37], we summarize the concept of predictor-corrector path-following methods in the case Ω ≡ R^n as follows.

The KKT system of problem P(ξ) can be written as F(z; ξ) = 0, where z = (x, y) is its primal-dual variable. The solution z*(ξ) that satisfies the KKT condition for a given ξ is in general a smooth map. By applying the implicit function theorem, the derivative of z*(·) is expressed as

dz*(ξ)/dξ = −F_z(z*(ξ); ξ)^{−1} F_ξ(z*(ξ); ξ).

In this case, linearization at (z̄, ξ̄) yields a formula that one step of a predictor-corrector path-following method needs to satisfy:

F(z̄; ξ̄) + F_z(z̄; ξ̄)(ẑ − z̄) + F_ξ(z̄; ξ̄)(ξ̂ − ξ̄) = 0.


Written explicitly, it delivers the solution guess ẑ for the next parameter ξ̂ as

ẑ = z̄ − F_z(z̄; ξ̄)^{−1} [F(z̄; ξ̄) + F_ξ(z̄; ξ̄)(ξ̂ − ξ̄)],

which amounts to one step of the exact Newton method applied at the guess z̄ if we employ the parameter embedding in the problem formulation [14].

Based on the above analysis, the predictor-corrector path-following method only performs the first iteration of the exact Newton method for each new problem. In this paper, by applying the generalized equation framework [42, 43], we generalize this idea to the case where more general convex constraints are considered. When the parameter does not enter linearly into the problem, we can always reformulate this problem as P(ξ) by using intermediate variables. In this case, the derivatives with respect to these intermediate variables contain the information of the predictor term. Finally, we notice that the real-time iteration scheme proposed in [17] can be considered as a variant of the above predictor-corrector method in the SQP context. A small numerical sketch of one such step is given below.
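The following minimal sketch (our own toy problem, not from the paper; JAX and all data are assumptions) performs one predictor-corrector step for the KKT system of an equality-constrained problem with Ω = R^n:

```python
# A minimal sketch of one predictor-corrector path-following step
#   z_hat = z_bar - F_z^{-1} [ F(z_bar; xi_bar) + F_xi (xi_hat - xi_bar) ]
# for the KKT system F(z; xi) = 0 of a hypothetical toy problem
#   min 0.5 x^T Q x  s.t.  g(x) + M xi = 0   (Omega = R^n).
import jax
import jax.numpy as jnp

Q = jnp.diag(jnp.array([2.0, 1.0, 0.5]))    # toy objective Hessian
M = jnp.array([[1.0]])                       # parameter embedding matrix

def g(x):
    # toy nonlinear equality constraint g : R^3 -> R^1
    return jnp.array([x[0] + 0.1 * jnp.sin(x[1]) - x[2] ** 2])

def F(z, xi):
    # KKT residual in the primal-dual variable z = (x, y)
    x, y = z[:3], z[3:]
    grad_L = Q @ x + jax.jacobian(g)(x).T @ y
    return jnp.concatenate([grad_L, g(x) + M @ xi])

def pc_step(z_bar, xi_bar, xi_hat):
    Fz = jax.jacobian(F, argnums=0)(z_bar, xi_bar)
    Fxi = jax.jacobian(F, argnums=1)(z_bar, xi_bar)
    rhs = F(z_bar, xi_bar) + Fxi @ (xi_hat - xi_bar)  # corrector + predictor
    return z_bar - jnp.linalg.solve(Fz, rhs)

z = jnp.zeros(4)                             # primal-dual guess (x, y)
xi = jnp.array([1.0])
z = pc_step(z, xi, jnp.array([1.1]))         # track the solution as xi moves
print(z)
```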

1.3. Adjoint-based method. From a practical point of view, most of the time spent on solving optimization problems resulting from simulation-based methods is needed to evaluate the functions and their derivatives [6]. Adjoint-based methods rely on the observation that it is not necessary to use exact Jacobian matrices of the constraints. Moreover, in some applications, the time needed to evaluate all the derivatives of the functions exceeds the time available to compute the solution of the optimization problem. The adjoint-based Newton-type methods in [19, 28, 45] can work with an inexact Jacobian matrix and only require an exact evaluation of the Lagrange gradient using adjoint derivatives to form the approximate optimization subproblems in the algorithm. This technique still allows the algorithm to converge to the exact solutions but can save valuable time in the online performance of the algorithm.

1.4. A tutorial example. The idea of the APCSCP method is illustrated in the following simple example.

Example 1.1 (tutorial example). Let us consider a simple nonconvex parametric optimization problem, (1.4), whose stationary point x* is also its unique global optimum. It is clear that problem (1.4) satisfies the strong second order sufficient condition (SSOSC) at x*.


Note that the constraint x_1^2 − x_2 + 1 ≤ 0 can be rewritten as a second order cone constraint, so that (1.4) can be cast into the form of P(ξ).

We track approximate solutions of (1.4) along a varying parameter ξ. Instead of solving the nonlinear optimization problem at each ξ_k until complete convergence, APCSCP only performs the first step of the SCP algorithm to obtain an approximate solution x_k at ξ_k. Notice that the convex subproblem needed at each iteration is the second order cone constrained program (1.5). We compare this method with two other known real-time iteration algorithms. The first is the real-time iteration with an exact SQP method, and the second is the real-time iteration with an SQP method using a projected Hessian [17, 31]. In the second algorithm, the Hessian matrix of the Lagrange function is projected onto the cone of symmetric positive semidefinite matrices to obtain a convex quadratic programming subproblem.

Figures 1.1 and 1.2 illustrate the performance of the three methods when ξ_k = 1.2 + kΔξ for k = 0, …, 9 and Δξ = 0.25. Figure 1.1 presents the approximate solution trajectories given by the three methods, while Figure 1.2 shows the tracking errors and the cone constraint violations of those methods. The initial point x_0 of the three methods is chosen at the true solution of P(ξ_0). We can see that the performance of the exact SQP and the SQP using the projected Hessian is quite similar. However, the second order cone constraint is violated by both SQP variants, whereas the SCP method preserves feasibility and better follows the exact solution trajectory. Note that the subproblem in the exact SQP method is a nonconvex quadratic program, a convex QP in the projected SQP case, and a second order cone constrained program (1.5) in the SCP method.

Fig. 1.2. The tracking error ‖x_k − x*(ξ_k)‖ and the second order cone constraint violation of the three methods, Exact-SQP, Projected-SQP, and SCP (k = 0, …, 9).


1.5. Notation. Throughout this paper, we use the notation ∇f for the gradient of a real-valued function f and g' for the Jacobian matrix of a vector-valued function g. S^n (resp., S^n_+ and S^n_{++}) denotes the set of symmetric (resp., positive semidefinite and positive definite) n×n matrices. The notation ‖·‖ stands for the Euclidean norm, B(x, r) := {y ∈ R^n | ‖y − x‖ < r} is the open ball of radius r centered at x, and B̄(x, r) is its closure.

The rest of this paper is organized as follows. Section 2 presents a generic framework of the APCSCP algorithm. Section 3 proves the local contraction estimate for APCSCP and the stability of the approximation error. Section 4 considers an adjoint-based SCP algorithm for solving nonlinear programming problems as a special case. The last section presents computational results for an application of the proposed algorithms in NMPC of a hydro power plant.

2. An APCSCP algorithm. In this section, we present a generic algorithmic framework for solving the parametric optimization problem P(ξ). Traditionally, at each sample time one would solve problem P(ξ_k) until obtaining a completely converged solution z̄(ξ_k). Exploiting the real-time iteration idea [14, 17] in our algorithm below, only one convex subproblem is solved to get an approximate solution z_k at ξ_k close to z̄(ξ_k).

Suppose that z_k := (x_k, y_k) ∈ Ω × R^m is a given approximate KKT point of P(ξ_k) (more details are given below). Given a matrix A_k approximating the Jacobian g'(x_k) and the exact Lagrange gradient ∇_x L(z_k), we form the correction vector s_k := (g'(x_k) − A_k)^T y_k, which compensates for the inconsistency between A_k and g'(x_k). Vector y_k is referred to as the multiplier. Provided that H_k is positive semidefinite, the subproblem P(z_k, A_k, H_k; ξ) is convex. Here, z_k, A_k, and H_k are considered as parameters.

Remark 1. Note that computing the term g'(x_k)^T y_k of the correction vector s_k does not require the whole Jacobian matrix g'(x_k), which is usually time-consuming to evaluate.

When implementing the algorithm, the evaluation of the directional derivative η_k := g'(x_k)^T y_k can be done by the reverse mode (or adjoint mode) of automatic differentiation. By using this technique, we can evaluate an adjoint directional derivative vector of the form g'(x_k)^T y_k without evaluating the whole Jacobian matrix g'(x_k) of the vector function g. More details about automatic differentiation can be found in the monograph [26] or at http://www.autodiff.org. A minimal sketch of this evaluation is given below.
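For instance, the following sketch (assuming JAX as the automatic differentiation tool; the constraint function g is a hypothetical stand-in) evaluates η = g'(x)^T y with a single reverse-mode sweep:

```python
# A minimal sketch: evaluating the adjoint directional derivative
# g'(x)^T y by reverse-mode AD, without forming the Jacobian g'(x).
import jax
import jax.numpy as jnp

def g(x):
    # hypothetical constraint function g : R^3 -> R^2
    return jnp.array([jnp.sum(x ** 2) - 1.0, x[0] * x[1] + jnp.sin(x[2])])

x = jnp.array([0.3, -0.7, 1.2])
y = jnp.array([2.0, -1.0])          # multiplier estimate y_k

_, vjp_g = jax.vjp(g, x)            # one forward evaluation of g is recorded
(eta,) = vjp_g(y)                   # eta = g'(x)^T y, at roughly the cost of one g-evaluation

# sanity check against the explicitly formed Jacobian (small example only)
assert jnp.allclose(eta, jax.jacobian(g)(x).T @ y)
print(eta)
```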

In the NMPC framework, the constraint function g is usually obtained from a dynamic system of the form

(2.1)  η̇(t) = G(η(t), x, t),  t_0 ≤ t ≤ t_f,  η(t_0) = η_0(x),

by applying a direct transcription, where η is referred to as a state vector and x is a parameter vector. The adjoint directional derivative vector g'(x)^T y is nothing more than the gradient vector ∂V/∂x of the function V(x) := g(x)^T y. In the dynamic system context, this function V is a special case of the general functional V(x) := e(η(t_f)) + ∫_{t_0}^{t_f} v(η, x, t) dt. By simultaneously integrating the dynamic system and its adjoint sensitivity system λ̇ = −G_η^T λ with λ(t_f) = ∇_η e(η(t_f)), we can evaluate the gradient vector of V with respect to x as dV/dx = λ(t_0)^T η_0'(x) + ∫_{t_0}^{t_f} (v_x + λ^T G_x) dt, where λ(t_0) is the solution of the adjoint system at t_0. Note that the cost of integrating the adjoint system is of the same order as integrating the forward dynamics and, crucially, independent of the dimension of x. Adjoint differentiation of dynamic systems is performed, e.g., in the open-source software package SUNDIALS [46]. For more details on adjoint sensitivity analysis of dynamic systems, see [10, 46]. A minimal sketch using this idea follows.
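The sketch below illustrates this under stated assumptions: JAX's experimental odeint (which differentiates the integration by the continuous adjoint method) and toy dynamics G and initial-value map η_0 chosen by us:

```python
# A minimal sketch (hypothetical dynamics): the gradient of
# V(x) = e(eta(t_f)) subject to eta_dot = G(eta, x, t), eta(t0) = eta0(x),
# computed by adjoint (reverse-mode) sensitivities whose cost is
# independent of the dimension of x.
import jax
import jax.numpy as jnp
from jax.experimental.ode import odeint

def G(eta, t, x):
    # toy state dynamics, parameterized by x
    return -x[0] * eta + jnp.array([x[1], 0.0])

def V(x):
    eta0 = jnp.array([1.0, 0.5]) * x[2]   # eta(t0) = eta0(x)
    ts = jnp.linspace(0.0, 1.0, 2)        # integrate from t0 = 0 to t_f = 1
    etas = odeint(G, eta0, ts, x)
    return jnp.sum(etas[-1] ** 2)         # terminal cost e(eta(t_f))

x = jnp.array([0.8, 0.2, 1.5])
grad_V = jax.grad(V)(x)                   # adjoint sensitivities dV/dx
print(grad_V)
```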

The APCSCP algorithmic framework (Algorithm 1) is described as follows.

Initialization. For a given parameter ξ_0 ∈ P, solve approximately (offline) P(ξ_0) to get an approximate KKT point z_0 := (x_0, y_0). Compute g(x_0), find a matrix A_0 which approximates g'(x_0), and choose H_0 ∈ S^n_+. Set k := 0.

Step 1. Get a new parameter value ξ_{k+1} ∈ P.

Step 2. Solve the convex subproblem P(z_k, A_k, H_k; ξ_{k+1}) to obtain a solution x_{k+1} and the corresponding multiplier y_{k+1}.

Step 3. Evaluate g(x_{k+1}), update (or recompute) the matrices A_{k+1} and H_{k+1} ∈ S^n_+, set k := k + 1, and go back to Step 1.

The core step of Algorithm 1 is to solve the convex subproblem P(z_k, A_k, H_k; ξ) at each iteration. In Algorithm 1 we do not explicitly specify the method used to solve P(z_k, A_k, H_k; ξ). In practice, to reduce the computational time, we can either implement an optimization method which exploits the structure of the problem, e.g., block structure or separable structure [22, 51, 55], or rely on several efficient methods and software tools that are available for convex optimization [9, 38, 39, 48, 53]. In this paper, we are most interested in the case where one evaluation of g is very expensive. A possible simple choice of H_k is H_k = 0 for all k ≥ 0. A minimal sketch of the resulting loop is given below.
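As an illustration only, the following sketch (our own, with hypothetical data, a box set Ω, and CVXPY as the convex solver; it keeps A_k fixed for brevity, whereas Step 3 of Algorithm 1 may update it) runs a few iterations of the scheme:

```python
# A minimal sketch of the APCSCP loop: for each new parameter value,
# solve a single convex subproblem P(z_k, A_k, H_k; xi_{k+1}) and keep
# its KKT pair as the next iterate. All data are hypothetical.
import cvxpy as cp
import numpy as np

n, m = 3, 1
c = np.array([1.0, -1.0, 0.5])               # linear objective f(x) = c^T x
M = np.array([[1.0]])                        # parameter embedding matrix

def g(x):                                    # nonlinear equality constraint
    return np.array([x[0] ** 2 + 0.5 * x[1] - x[2]])

def g_jac(x):                                # Jacobian, used for the adjoint
    return np.array([[2 * x[0], 0.5, -1.0]]) # correction term s_k

def apcscp_step(x_k, y_k, A_k, H_k, xi_next):
    s_k = (g_jac(x_k) - A_k).T @ y_k         # adjoint correction vector
    x = cp.Variable(n)
    lin_eq = g(x_k) + A_k @ (x - x_k) + M @ xi_next == 0
    obj = c @ x + s_k @ (x - x_k) + 0.5 * cp.quad_form(x - x_k, H_k)
    prob = cp.Problem(cp.Minimize(obj),
                      [lin_eq, x >= -5.0, x <= 5.0])  # Omega: a box
    prob.solve()
    return x.value, lin_eq.dual_value        # (x_{k+1}, y_{k+1})

x_k, y_k = np.zeros(n), np.zeros(m)
A_k = g_jac(x_k)                             # Jacobian approximation A_0
H_k = 1e-3 * np.eye(n)                       # simple positive definite H_k
for xi in [np.array([0.10]), np.array([0.12]), np.array([0.14])]:
    x_k, y_k = apcscp_step(x_k, y_k, A_k, H_k, xi)
    # Step 3 of Algorithm 1 would update (or recompute) A_k and H_k here.
print(x_k, y_k)
```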

The initial point z_0 is obtained by solving P(ξ_0) offline. However, as we will show later (Corollary 3.6), if we choose z_0 close to the set of KKT points Z*(ξ_0) of P(ξ_0) (not necessarily an exact solution), then the new KKT point z_1 of P(z_0, A_0, H_0; ξ_1) is still close to the set Z*(ξ_1) of P(ξ_1) provided that ‖ξ_1 − ξ_0‖ is sufficiently small. Hence, in practice, we only need to solve problem P(ξ_0) approximately to get a starting point z_0.

In the NMPC framework, the parameter ξ usually coincides with the initial state of the dynamic system at each sampling time. Note that if we choose A_k = g'(x_k), the exact Jacobian matrix of g at x_k, and H_k ≡ 0, then this algorithm collapses to the real-time SCP method considered in [52].

3. Contraction estimate. In this section, we will show that, under certain assumptions, the sequence {z_k}_{k≥0} generated by Algorithm 1 remains close to the sequence of the true KKT points {z̄_k}_{k≥0} of problem P(ξ_k). Without loss of generality, we assume that the objective function f is linear, i.e., f(x) = c^T x, where c ∈ R^n is given. Indeed, since f is convex, by using a slack variable s, we can reformulate P(ξ) as the nonlinear program min_{(x,s)} { s | g(x) + Mξ = 0, x ∈ Ω, f(x) ≤ s }, whose objective is linear.


3.1. KKT condition as a generalized equation. Let us first define the Lagrange function of problem P(ξ) as

L(x, y; ξ) := f(x) + y^T (g(x) + Mξ),

where y is the multiplier associated with the equality constraint. The KKT condition of P(ξ) can be written as

(3.1)  0 ∈ ∇_x L(x, y; ξ) + N_Ω(x),  0 = g(x) + Mξ.

Note that the first line of (3.1) implicitly includes the constraint x ∈ Ω. A pair (x̄(ξ), ȳ(ξ)) satisfying (3.1) is called a KKT point of P(ξ), and x̄(ξ) is called a stationary point of P(ξ). We denote by Z*(ξ) and X*(ξ) the set of KKT points and the set of stationary points of P(ξ), respectively. In what follows, we use the letter z for the pair (x, y), i.e., z := (x^T, y^T)^T.

Throughout this paper, we require the following assumptions, which are standard in optimization.

Assumption 1. The function g is twice differentiable on its domain.

Assumption 2. For a given ξ_0 ∈ P, problem P(ξ_0) has at least one KKT point z̄_0.

The KKT condition (3.1) can be rewritten as a parametric generalized equation as follows:

(3.4)  0 ∈ F(z) + Cξ + N_K(z),

where F(z) := (c + g'(x)^T y, g(x)), C := (0, M) (stacked so that Cξ = (0, Mξ)), K := Ω × R^m, and N_K(z) is the normal cone of K at z. Generalized equations are an essential tool to study many problems in nonlinear analysis, perturbation analysis, variational calculations, and optimization [8, 34, 43].

Suppose that for some ξ_k ∈ P, the set of KKT points Z*(ξ_k) of P(ξ_k) is nonempty. For any fixed z̄_k ∈ Z*(ξ_k), we define the following set-valued mapping:

(3.5)  L(z; z̄_k, ξ_k) := F(z̄_k) + F'(z̄_k)(z − z̄_k) + Cξ_k + N_K(z).

We also define the inverse mapping L^{−1} : R^{n+m} → R^{n+m} of L(·; z̄_k, ξ_k) as follows:

L^{−1}(δ; z̄_k, ξ_k) := { z ∈ R^{n+m} : δ ∈ L(z; z̄_k, ξ_k) }.

Now, we consider the KKT condition of the subproblem P(z_k, A_k, H_k; ξ). For given neighborhoods B(z̄_k, r_z) of z̄_k and B(ξ_k, r_ξ) of ξ_k, and for z_k ∈ B(z̄_k, r_z), ξ_{k+1} ∈ B(ξ_k, r_ξ), and given matrices A_k and H_k ∈ S^n_+, let us consider the convex subproblem P(z_k, A_k, H_k; ξ_{k+1}) with respect to the parameter (z_k, A_k, H_k, ξ_{k+1}). The KKT condition of this problem is expressed as follows:

(3.7)  0 ∈ c + m(z_k, A_k) + H_k(x − x_k) + A_k^T y + N_Ω(x),  0 = g(x_k) + A_k(x − x_k) + Mξ_{k+1},

where m(z_k, A_k) denotes the adjoint correction term. Suppose that the Slater condition holds for the subproblem P(z_k, A_k, H_k; ξ_{k+1}), i.e.,

ri(Ω) ∩ { x ∈ R^n | g(x_k) + A_k(x − x_k) + Mξ_{k+1} = 0 } ≠ ∅,

where ri(Ω) is the relative interior of Ω. Then, by convexity of Ω, a point z_{k+1} := (x_{k+1}, y_{k+1}) is a KKT point of P(z_k, A_k, H_k; ξ_{k+1}) if and only if x_{k+1} is a solution to P(z_k, A_k, H_k; ξ_{k+1}) associated with the multiplier y_{k+1}.

Since g is twice differentiable by Assumption 1 and f is linear, for a given z = (x, y) we have ∇²_x L(z) = Σ_{i=1}^m y_i ∇²g_i(x), the Hessian matrix of the Lagrange function L, where ∇²g_i(·) is the Hessian matrix of g_i (i = 1, …, m). Let us define the following matrix:

(3.9)  F̃'_k := [ H_k  A_k^T ; A_k  0 ],

in which z_k, A_k, H_k, and ξ_{k+1} are considered as parameters. Note that if A_k = g'(x_k) and H_k = ∇²_x L(z_k), then F̃'_k coincides with the Jacobian F'(z_k).

3.2. The strong regularity concept. We recall the following definition of the strong regularity concept. This definition can be considered as the strong regularity of the generalized equation (3.4) in the context of nonlinear optimization; see [42].

Definition 3.1. Let ξ_k ∈ P be such that the set of KKT points Z*(ξ_k) of P(ξ_k) is nonempty, and let z̄_k ∈ Z*(ξ_k) be a given KKT point of P(ξ_k). Problem P(ξ_k) is said to be strongly regular at z̄_k if there exist neighborhoods B(0, r̄_δ) of the origin and B(z̄_k, r̄_z) of z̄_k such that the mapping z*_k(δ) := L^{−1}(δ; z̄_k, ξ_k) ∩ B(z̄_k, r̄_z) is single-valued and Lipschitz continuous on B(0, r̄_δ).


From the definition of L^{−1}, where strong regularity holds there exists a unique solution z*_k(δ) of the inclusion δ ∈ L(z; z̄_k, ξ_k) for each sufficiently small perturbation δ. It is known that, under the linear independence constraint qualification (LICQ), strong regularity is equivalent to SSOSC [20]. In order to interpret the strong regularity condition of P(ξ_k), one can consider the perturbed linearized KKT system

(3.12)  δ ∈ F(z̄_k) + F'(z̄_k)(z − z̄_k) + Cξ_k + N_K(z).

Here, δ = (δ_c, δ_g) ∈ B(0, r̄_δ) is a perturbation. Problem P(ξ_k) is strongly regular at z̄_k if and only if (3.12) has a unique KKT point z*_k(δ) in B(z̄_k, r̄_z) and z*_k(·) is Lipschitz continuous in B(0, r̄_δ) with a Lipschitz constant γ.

Example 3.2. Let us recall Example 1.1 in section 1. The optimal multipliers associated with the constraints x_1^2 + x_2^2 − 4ξ = 0 and x_1^2 − x_2 + 1 ≤ 0 are y_1* = (2√ξ − 1)[8ξ² − ξ√ξ]^{−1} > 0 and y_2* = [8ξ² − ξ√ξ]^{−1} > 0, respectively. Since the last inequality constraint is active, while x ≥ 0 is inactive, we can easily compute the critical cone C(x*); since y_2* > 0, the strict complementarity condition holds. Therefore, problem (1.4) satisfies the strong second order sufficient condition. On the other hand, it is easy to check that the LICQ condition holds for (1.4) at x*. By applying [42, Theorem 4.1], we can conclude that (1.4) is strongly regular at (x*, y*).

The following lemma shows the nonemptiness of Z*(ξ) in a neighborhood of ξ_k.

Lemma 3.3. Suppose that the set of KKT points Z*(ξ_k) of P(ξ_k) is nonempty for a given ξ_k ∈ P. Suppose further that problem P(ξ_k) is strongly regular at z̄_k for a given z̄_k ∈ Z*(ξ_k). Then there exist neighborhoods B(ξ_k, r_ξ) of ξ_k and B(z̄_k, r_z) of z̄_k such that, for every ξ_{k+1} ∈ B(ξ_k, r_ξ), the intersection Z*(ξ_{k+1}) ∩ B(z̄_k, r_z) contains exactly one point z̄_{k+1}, and

(3.13)  ‖z̄_{k+1} − z̄_k‖ ≤ γ‖M‖‖ξ_{k+1} − ξ_k‖.

Proof. Since the KKT condition of P(ξ_k) is equivalent to the generalized equation (3.4) with ξ = ξ_k, by applying [42, Theorem 2.1] we conclude that there exist neighborhoods B(ξ_k, r_ξ) of ξ_k and B(z̄_k, r_z) of z̄_k such that Z*(ξ_{k+1}) is nonempty for all ξ_{k+1} ∈ B(ξ_k, r_ξ) and Z*(ξ_{k+1}) ∩ B(z̄_k, r_z) contains only one point z̄_{k+1}. On the other hand, since ‖F(z̄_k) + Cξ_k − F(z̄_k) − Cξ_{k+1}‖ = ‖M(ξ_k − ξ_{k+1})‖ ≤ ‖M‖‖ξ_{k+1} − ξ_k‖, by using the formula [42, (2.4)], we obtain the estimate (3.13).


3.3. A contraction estimate for APCSCP using an inexact Jacobian matrix. In order to prove a contraction estimate for APCSCP, throughout this section we make the following assumption.

Assumption 3. For a given z̄_k ∈ Z*(ξ_k), k ≥ 0, the following conditions are satisfied:
(a) There exists a constant κ ∈ [0, γ^{−1}) such that

(3.14)  ‖F̃'_k − F'(z̄_k)‖ ≤ κ,

where F̃'_k is defined by (3.9) and γ is the constant in Definition 3.1.
(b) The Jacobian mapping F'(·) is Lipschitz continuous on B(z̄_k, r_z), i.e., there exists a constant 0 ≤ ω < +∞ such that

(3.15)  ‖F'(z) − F'(z̄_k)‖ ≤ ω‖z − z̄_k‖  for all z ∈ B(z̄_k, r_z).

Note that Assumption 3 is commonly used in the theory of Newton-type and Gauss–Newton methods [13, 16], where the residual term is required to be sufficiently small in a neighborhood of the local solution. From the definition of F̃'_k, the constant κ in Assumption 3(a) depends on the norms of ∇²_x L(z̄_k) − H_k and g'(x̄_k) − A_k. These quantities measure how well H_k and A_k approximate the Hessian matrix ∇²_x L(z̄_k) and the Jacobian matrix g'(x̄_k), respectively. On the one hand, Assumption 3(a) requires the positive semidefinite matrix H_k to be a good approximation of ∇²_x L (which is not necessarily positive definite). On the other hand, it requires that the matrix A_k is a sufficiently good approximation of the Jacobian of g at the stationary point x̄_k. Note that the matrix H_k in the Newton-type method proposed in [7] is not necessarily positive definite.

Now, let us define the following mapping:

(3.16)  J_k(v) := { z ∈ R^{n+m} : v ∈ F̃'_k z + N_K(z) },

where F̃'_k is defined by (3.9). The next lemma shows that J_k is single-valued and Lipschitz continuous in a neighborhood of v̄_k := F̃'_k z̄_k − F(z̄_k) − Cξ_k.

Lemma 3.4. Suppose that Assumption 3(a) holds and that P(ξ_k) is strongly regular at z̄_k. Then there exist neighborhoods B(ξ_k, r_ξ) and B(z̄_k, r_z) such that if we take any z_k ∈ B(z̄_k, r_z) and ξ_{k+1} ∈ B(ξ_k, r_ξ), the mapping J_k defined by (3.16) is single-valued in a neighborhood B(v̄_k, r_v), where v̄_k := F̃'_k z̄_k − F(z̄_k) − Cξ_k. Moreover, the following inequality holds:

(3.17)  ‖J_k(v) − J_k(v′)‖ ≤ β‖v − v′‖  for all v, v′ ∈ B(v̄_k, r_v),

where β := γ/(1 − γκ) > 0 is a Lipschitz constant.

Proof. Let us fix a neighborhood B(v̄_k, r_v) of v̄_k. Suppose by contradiction that J_k is not single-valued in B(v̄_k, r_v); then for some v the set J_k(v) contains at least two points z and z′ such that ‖z − z′‖ ≠ 0, and both can be written as solutions of an inclusion of the form (3.20).


Now, using the strong regularity assumption of P(ξ_k) at z̄_k, it follows from (3.20) that z = z′, a contradiction. Hence J_k is single-valued in B(v̄_k, r_v).

Finally, we prove the Lipschitz continuity of J_k. Let z = J_k(v) and z′ = J_k(v′), where v, v′ ∈ B(v̄_k, r_v). Similar to (3.20), these expressions can be written equivalently as

(3.21)  δ ∈ F(z̄_k) + F'(z̄_k)(z − z̄_k) + Cξ_k + N_K(z)

and

(3.22)  δ′ ∈ F(z̄_k) + F'(z̄_k)(z′ − z̄_k) + Cξ_k + N_K(z′),

for suitable perturbations δ and δ′.


By using the strong regularity assumption again, it follows from (3.22) and (3.23) that ‖z − z′‖ ≤ γ(1 − γκ)^{−1}‖v − v′‖, which shows that J_k satisfies (3.17) with the constant β := γ/(1 − γκ) > 0.

Let us recall that if z_{k+1} is a KKT point of the convex subproblem P(z_k, A_k, H_k; ξ_{k+1}), then

0 ∈ F̃'_k(z_{k+1} − z_k) + F(z_k) + Cξ_{k+1} + N_K(z_{k+1}).

According to Lemma 3.4, if z_k ∈ B(z̄_k, r_z), then problem P(z_k, A_k, H_k; ξ_{k+1}) is uniquely solvable. We can write its KKT condition equivalently as z_{k+1} = J_k(F̃'_k z_k − F(z_k) − Cξ_{k+1}). Since z̄_{k+1} is the solution of (3.4) at ξ_{k+1}, we have 0 = F(z̄_{k+1}) + Cξ_{k+1} + ū_{k+1}, where ū_{k+1} ∈ N_K(z̄_{k+1}). Moreover, z̄_{k+1} = J_k(F̃'_k z̄_{k+1} − F(z̄_{k+1}) − Cξ_{k+1}).

The main result of this section is stated in the following theorem.

Theorem 3.5. Suppose that Assumptions 1 and 2 hold. Then, for k ≥ 0 and z̄_k ∈ Z*(ξ_k), if P(ξ_k) is strongly regular at z̄_k, then there exist neighborhoods B(z̄_k, r_z) and B(ξ_k, r_ξ) such that the following hold:
(a) The set of KKT points Z*(ξ_{k+1}) of P(ξ_{k+1}) is nonempty for any ξ_{k+1} ∈ B(ξ_k, r_ξ).
(b) If, in addition, Assumption 3(a) is satisfied, then subproblem P(z_k, A_k, H_k; ξ_{k+1}) is uniquely solvable in the neighborhood B(z̄_k, r_z).
(c) Moreover, if, in addition, Assumption 3(b) is satisfied, then the sequence {z_k}_{k≥0} generated by Algorithm 1, where ξ_{k+1} ∈ B(ξ_k, r_ξ), guarantees a contraction of the tracking error: ‖z_{k+1} − z̄_{k+1}‖ is bounded by a constant multiple (less than one) of ‖z_k − z̄_k‖ plus a term proportional to ‖ξ_{k+1} − ξ_k‖.

Proof. We prove the theorem by induction. For k = 0, Z*(ξ_0) is nonempty by Assumption 2. Now, we assume that Z*(ξ_k) is nonempty for some k ≥ 0, and we prove the statements for k. Take z̄_k ∈ Z*(ξ_k) such that P(ξ_k) is strongly regular at z̄_k. By applying Lemma 3.3 to problem P(ξ_k), we conclude that there exist neighborhoods B(z̄_k, r_z) of z̄_k and B(ξ_k, r_ξ) of ξ_k such that Z*(ξ_{k+1}) is nonempty for any ξ_{k+1} ∈ B(ξ_k, r_ξ).
