The DCA Algorithm and Nonconvex Quadratic Programming Problems



VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY

INSTITUTE OF MATHEMATICS

Hoang Ngoc Tuan

Speciality: Applied Mathematics

Speciality code: 62 46 01 12

SUMMARY OF PH.D DISSERTATION

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY IN MATHEMATICS

Supervisor:

Prof. Dr. Hab. Nguyen Dong Yen

HANOI - 2015


The dissertation was written on the basis of the author's research works carried out at the Institute of Mathematics, Vietnam Academy of Science and Technology.

Supervisor: Prof. Dr. Hab. Nguyen Dong Yen

First referee:

Second referee:

Third referee:

To be defended at the Jury of Institute of Mathematics, Vietnam Academy of Science and Technology:

on , at o’clock

The dissertation is publicly available at:

• The National Library of Vietnam

• The Library of Institute of Mathematics


Convex functions have many nice properties. For instance, a convex function, say ϕ : Rn → R, is continuous, directionally differentiable, and locally Lipschitz at any point u ∈ Rn. In addition, ϕ is Fréchet differentiable almost everywhere on Rn, i.e., the set of points where the gradient ∇ϕ(x) exists is of full Lebesgue measure.

It is also known that the subdifferential ∂ϕ(x̄) of ϕ at any point x̄ ∈ Rn is a nonempty, compact, convex set, and that x̄ is a solution of the problem

min{ϕ(x) : x ∈ Rn}

if and only if 0 ∈ ∂ϕ(x̄).
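The following small example (added here for illustration, not taken from the summary) shows how this condition works at a point of nondifferentiability.

```latex
% Example (ours): phi(x) = ||x||, the Euclidean norm on R^n. At the origin,
\partial\varphi(0) = \{\, p \in \mathbb{R}^n : \|p\| \le 1 \,\},
% so 0 \in \partial\varphi(0), hence \bar{x} = 0 is a global minimizer of phi,
% although the gradient \nabla\varphi(0) does not exist.
```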

Convex analysis is a powerful machinery for dealing with convex optimization problems. Note that convex programming is an important branch of optimization theory, which continues to draw the attention of many researchers worldwide.

If f : Rn → R is a given function, and if there exist convex functions g : Rn → R and h : Rn → R such that f(x) = g(x) − h(x) for every x ∈ Rn, then f is called a d.c. function. The abbreviation “d.c.” comes from the combination of the words “Difference (of) Convex (functions)”. More generally, a function f : Rn → R̄, where R̄ = R ∪ {±∞}, is said to be a d.c. function if there are lower semicontinuous, proper, convex functions g, h : Rn → R̄ such that f(x) = g(x) − h(x) for all x ∈ Rn. The convention (+∞) − (+∞) = +∞ is used here. Despite their (possible) nonconvexity, d.c. functions still enjoy some good properties of convex functions.
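For a concrete illustration (ours), consider a possibly nonconvex quadratic function; choosing ρ at least as large as the largest eigenvalue of the symmetric matrix A gives a d.c. decomposition of the kind exploited for quadratic objectives in Chapter 2, where ρI − A is required to be positive semidefinite.

```latex
% d.c. decomposition of a quadratic with a symmetric matrix A, assuming
% rho >= lambda_max(A), so that rho*I - A is positive semidefinite:
f(x) = \tfrac{1}{2}x^{T}Ax + b^{T}x
     = \underbrace{\Big(\tfrac{\rho}{2}\|x\|^{2} + b^{T}x\Big)}_{g(x),\ \text{convex}}
       \;-\; \underbrace{\tfrac{1}{2}x^{T}(\rho I - A)x}_{h(x),\ \text{convex}}.
% The assignment of the linear term b^T x to g rather than h is a free choice.
```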

A minimization problem with a geometrical constraint,

min{f(x) = g(x) − h(x) : x ∈ C},   (0.1)

where f, g and h are given as above and C ⊂ Rn is a nonempty closed convex set, is a typical DC programming problem. Setting

f̃(x) = (g(x) + δC(x)) − h(x),

where δC, with δC(x) = 0 for all x ∈ C and δC(x) = +∞ for all x ∉ C, is the indicator function of the set C, one can easily transform (0.1) into

min{f̃(x) : x ∈ Rn},   (0.2)

which is an unconstrained DC programming problem, with f̃(x) being a d.c. function.

DC programming and DC algorithms (DCA, for brevity) treat the problem of minimizing a function f = g − h, with g, h being lower semicontinuous, proper, convex functions on Rn, on the whole space. Usually, g and h are called d.c. components of f. The DCA are constructed on the basis of the DC programming theory and the duality theory of J. F. Toland. It was Pham Dinh Tao who suggested a general DCA theory, which has been developed intensively by him and Le Thi Hoai An, starting from their fundamental paper “Convex analysis approach to D.C. programming: Theory, algorithms and applications” (Acta Mathematica Vietnamica, Vol. 22, 1997).

Note that DC programming is among the most successful convex analysis approaches to nonconvex programming. One wishes to make an extension of convex programming that is not too wide, so that the powerful tools of convex analysis and convex optimization can still be used, but is sufficiently large to cover the most important nonconvex optimization problems. The set of d.c. functions, which is closed under the basic operations usually considered in optimization, serves this purpose well. Note that the convexity of the two components of the objective function has been employed widely in DC programming to obtain essential theoretical results and to construct efficient solution methods. The DC duality scheme of J. F. Toland is an example of such essential theoretical results. To be more precise, Toland's Duality Theorem asserts that, under mild conditions, the dual problem of a DC program is also a DC program, and the two problems have the same optimal value.

Due to their local character, DCA (i.e., DC algorithms) do not ensure the convergence of an iteration sequence to a global solution of the problem in question. However, with the help of a restart procedure, DCA applied to trust-region subproblems can yield a global solution of the problem. In practice, DCA have been successfully applied to many different nonconvex optimization problems, for which they have proved to be more robust and efficient than many standard methods; in particular, DCA work well for large-scale problems. Note also that, with appropriate decompositions of the objective functions, DCA can generate several standard algorithms in convex and nonconvex programming.

This dissertation studies the DCA applied to the minimization of a quadratic function on a Euclidean ball (the so-called trust-region subproblem) and to the minimization of a quadratic function on a polyhedral convex set. These problems play important roles in optimization theory.

Let A ∈ Rn×n be a symmetric matrix, b ∈ Rn be a given vector, and r > 0 be a real number. The nonconvex quadratic programming problem with a convex constraint

min{ f(x) := ½ xᵀAx + bᵀx : ‖x‖² ≤ r² },

where ‖x‖ = √(x₁² + ⋯ + xₙ²) denotes the Euclidean norm of x = (x₁, . . . , xₙ)ᵀ ∈ Rn and ᵀ means matrix transposition, is called the trust-region subproblem.


One encounters this problem when applying the trust-region method (see, e.g., A. R. Conn, N. I. M. Gould, and P. L. Toint, “Trust-Region Methods”, 2000) to solve the unconstrained problem of finding the minimum of a C²-smooth function ϕ : Rn → R. Having an approximate solution xᵏ at step k of the trust-region method, to get a better approximate solution xᵏ⁺¹ one finds the minimum of ϕ on a ball with center xᵏ and a radius depending on a ratio defined by some calculations on ϕ and the point xᵏ. If one replaces ϕ with its second-order Taylor expansion around xᵏ, an auxiliary problem of the form of the trust-region subproblem appears, and xᵏ⁺¹ is a solution of this problem.
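In symbols, the step just described can be sketched as follows (the notation rₖ for the trust-region radius is ours and does not appear in the summary):

```latex
% Second-order model of phi around x^k, minimized over a ball of radius r_k:
x^{k+1} \in x^{k} + \operatorname*{arg\,min}_{\|d\| \le r_{k}}
  \Big\{ \nabla\varphi(x^{k})^{T} d + \tfrac{1}{2}\, d^{T} \nabla^{2}\varphi(x^{k})\, d \Big\},
% i.e., a trust-region subproblem with A = \nabla^2 phi(x^k) and b = \nabla phi(x^k).
```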

Consider a quadratic programming problem under linear constraints of the form

min{ f(x) := ½ xᵀQx + qᵀx : Dx ≥ d },

where Q ∈ Rn×n and D ∈ Rm×n are given matrices, and q ∈ Rn and d ∈ Rm are given vectors. It is assumed that Q is symmetric. This class of optimization problems is well known and has various applications. Basic qualitative properties related to the solution existence, the structure of the solution set, first-order and second-order necessary and sufficient optimality conditions, stability, and differential stability of the problem can be found in the books of B. Bank, J. Guddat, D. Klatte, B. Kummer, and K. Tammer, “Non-Linear Parametric Optimization” (1982), R. W. Cottle, J.-S. Pang, and R. E. Stone, “The Linear Complementarity Problem” (1992), G. M. Lee, N. N. Tam, and N. D. Yen, “Quadratic Programming and Affine Variational Inequalities: A Qualitative Study” (2005), and the references therein.

The structure of the solution set and of the Karush-Kuhn-Tucker point set of this problem is quite different from that of the trust-region subproblem, since the constraint set of the trust-region subproblem is convex, compact, and has a smooth boundary.

Our aim is to study the convergence and the convergence rate of DCA applied to the two problems mentioned above. An open question and a conjecture raised in two papers by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., Vol. 49, 2011, pp. 481–495, and Vol. 53, 2012, pp. 317–329) will be completely solved in this dissertation.

By using some advanced tools, we are able to obtain complete results on the convergence of DCA sequences. Moreover, convergence rates of DCA sequences are established for the first time in this dissertation.

The results of this dissertation complement and develop the corresponding published results of T. Pham Dinh and H. A. Le Thi (SIAM J. Optim., Vol. 8, 1998), T. Pham Dinh, H. A. Le Thi, and F. Akoa (Optim. Methods Softw., Vol. 23, 2008), and H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., Vol. 49, 2011; Vol. 53, 2012).

The dissertation has three chapters and a list of references.

Chapter 1 “Preliminaries” presents basic concepts and results of a general theory on DC programming and DCA.

Chapter 2 “Minimization of a Quadratic Function on a Euclidean Ball” considers an application of DCA to trust-region subproblems. Here we present in detail a useful restart procedure that allows the algorithm to find a global solution. We also give an answer in the affirmative to the question raised by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (2012) about the convergence of DCA. Furthermore, the convergence rate of DCA is studied.

Chapter 3 “Minimization of a Quadratic Function on a Polyhedral Convex Set” investigates an application of DCA to indefinite quadratic programs under linear constraints. Here we solve in the affirmative a conjecture raised by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (2011) about the boundedness of the DCA sequences. First, by a direct proof, we obtain the boundedness of the DCA sequences for quadratic programs in R2. Then, by using some error bounds for affine variational inequalities, we establish the R-linear convergence rate of the algorithm, and hence give a complete solution to the conjecture.


The results of Chapter 2 were published in the Journal of Global Optimization [5] (a joint work with N. D. Yen) and in the Journal of Optimization Theory and Applications [2]. Chapter 3 is written on the basis of the paper [3], which was published in the Journal of Optimization Theory and Applications, and of the paper [4], which was published in the Journal of Mathematical Analysis and Applications.

The above results were reported by the author of this dissertation at the Seminar of the Department of Numerical Analysis and Scientific Computing of the Hanoi Institute of Mathematics, the 8th Vietnam-Korea Workshop “Mathematical Optimization Theory and Applications” (University of Dalat, December 8-10, 2011), the 5th International Conference on High Performance Scientific Computing (March 5-9, 2012, Hanoi, Vietnam), the Joint Congress of the French Mathematical Society (SMF) and the Vietnamese Mathematical Society (VMS) (University of Hue, August 20-24, 2012), the 8th Vietnamese Mathematical Conference (Nha Trang, August 10-14, 2013), and the 12th Workshop on Optimization and Scientific Computing (Ba Vi, April 23-25, 2014).


Chapter 1

Preliminaries

This chapter reviews some background material on DC algorithms. For more details, we refer to H. A. Le Thi and T. Pham Dinh's papers (1997, 1998), H. N. Tuan's Master's dissertation (“DC Algorithms and Applications in Quadratic Programming”, Hanoi, 2010), and the references therein.

For a function θ : Rn → R̄, its effective domain is

dom θ := {x ∈ Rn : θ(x) < +∞}.

Let Γ0(Rn) be the set of all lower semicontinuous, proper, convex functions on Rn. The conjugate function g∗ of a function g ∈ Γ0(Rn) is defined by

g∗(y) = sup{⟨x, y⟩ − g(x) : x ∈ Rn}   ∀ y ∈ Rn.

Note that g∗ : Rn → R̄ is also a lower semicontinuous, proper, convex function. In the sequel, we use the convention (+∞) − (+∞) = +∞.
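As a quick worked example (added for illustration), the conjugate of a strongly convex quadratic can be computed directly:

```latex
% Worked example (ours): for g(x) = (rho/2)*||x||^2 with rho > 0,
g^{*}(y) = \sup_{x \in \mathbb{R}^{n}} \big\{ \langle x, y\rangle - \tfrac{\rho}{2}\|x\|^{2} \big\}
         = \tfrac{1}{2\rho}\|y\|^{2},
% the supremum being attained at x = y/rho, where the gradient y - rho*x vanishes.
```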


Definition 1.1 The optimization problem

inf{f(x) := g(x) − h(x) : x ∈ Rn},   (P)

where g and h are functions belonging to Γ0(Rn), is called a DC program. The functions g and h are called d.c. components of f.

Definition 1.2 For any g, h ∈ Γ0(Rn), the DC program

inf{h∗(y) − g∗(y) : y ∈ Rn},   (D)

is called the dual problem of (P).

Proposition 1.1 (Toland's Duality Theorem) The DC programs (P) and (D) have the same optimal value.

Definition 1.3 A vector x∗ ∈ Rn is said to be a local solution of (P) if f(x∗) = g(x∗) − h(x∗) is finite (i.e., x∗ ∈ dom g ∩ dom h) and there exists a neighborhood U of x∗ such that

g(x∗) − h(x∗) ≤ g(x) − h(x)   ∀x ∈ U.

If we can choose U = Rn, then x∗ is called a (global) solution of (P).

Proposition 1.2 (First-order optimality condition) If x∗ is a local solution of (P), then ∂h(x∗) ⊂ ∂g(x∗).


(i) The sequences {(g − h)(xᵏ)} and {(h∗ − g∗)(yᵏ)} are decreasing;

(ii) Every cluster point x∗ (resp., y∗) of {xᵏ} (resp., of {yᵏ}) is a critical point of (P) (resp., of (D)).

The general DC algorithm of H. A. Le Thi and T. Pham Dinh (1997) is formulated as follows.


Check the condition ‖xᵏ⁺¹ − xᵏ‖ < ε. If it is satisfied, then terminate the computation; otherwise, go to Step 3.

• Step 3

Increase k by 1 and return to Step 2
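The scheme just listed can be condensed into the following minimal Python sketch (ours, not part of the dissertation); it assumes two user-supplied oracles, subgrad_h for ∂h and argmin_g for the convex subproblem defining xᵏ⁺¹, and uses the stopping test from Step 2.

```python
# A minimal sketch (ours) of the generic DCA scheme described above.
# It assumes two user-supplied oracles:
#   subgrad_h(x) -- returns some y in the subdifferential of h at x,
#   argmin_g(y)  -- returns a minimizer of g(x) - <y, x> over R^n
#                   (equivalently, an element of the subdifferential of g* at y).
# The stopping rule ||x^{k+1} - x^k|| < eps is the one used in Step 2.
import numpy as np

def dca(x0, subgrad_h, argmin_g, eps=1e-8, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        y = subgrad_h(x)        # y^k in the subdifferential of h at x^k
        x_next = argmin_g(y)    # x^{k+1} in the subdifferential of g* at y^k
        if np.linalg.norm(x_next - x) < eps:
            return x_next
        x = x_next
    return x
```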

Definition 1.6 Let ρ ≥ 0 and C be a convex set in the space Rn. A function

Consider the problem (P). If ρ(g) > 0 (resp., ρ(g∗) > 0), let ρ₁ (resp., ρ₁∗) be real numbers such that 0 ≤ ρ₁ < ρ(g) (resp., 0 ≤ ρ₁∗ < ρ(g∗)). If ρ(g) = 0 (resp., ρ(g∗) = 0), let ρ₁ = 0 (resp., ρ₁∗ = 0). If ρ(h) > 0 (resp., ρ(h∗) > 0), let ρ₂ (resp., ρ₂∗) be real numbers such that 0 ≤ ρ₂ < ρ(h) (resp., 0 ≤ ρ₂∗ < ρ(h∗)). If ρ(h) = 0 (resp., ρ(h∗) = 0), let ρ₂ = 0 (resp., ρ₂∗ = 0). One adopts the abbreviations dxᵏ = xᵏ⁺¹ − xᵏ and dyᵏ = yᵏ⁺¹ − yᵏ. Let

α := inf{f(x) = g(x) − h(x) : x ∈ Rn}.

Theorem 1.1 Assume that {xᵏ} and {yᵏ} are generated by the DCA. We


Chapter 2

Minimization of a Quadratic Function on a Euclidean Ball

In this chapter, we prove that any DCA sequence constructed by the Pham Dinh–Le Thi algorithm for the trust-region subproblem converges to a Karush-Kuhn-Tucker point. We also obtain sufficient conditions for the Q-linear convergence of DCA sequences. In addition, we give two examples to show that, if the sufficient conditions are not satisfied, then the sequences may not be Q-linearly convergent.

This chapter is written on the basis of the papers [2, 5]. A part of the results from [4] is used in the final section of this chapter.

Let A ∈ Rn×n be a symmetric matrix, b ∈ Rn be a given vector, and r > 0 be a real number. The trust-region subproblem corresponding to the triple {A, b, r} is the optimization problem

min{ f(x) := ½ xᵀAx + bᵀx : ‖x‖² ≤ r² }.   (2.1)

It is well known that if x ∈ E := {x ∈ Rn : ‖x‖ ≤ r} is a local minimum of (2.1), then there exists a unique Lagrange multiplier λ ≥ 0 such that

(A + λI)x = −b,   λ(‖x‖ − r) = 0,   (2.2)


where I denotes the n × n unit matrix. If x ∈ E and there exists λ ≥ 0 satisfying (2.2), then x is said to be a Karush-Kuhn-Tucker point (or a KKT point) of (2.1).
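For illustration, the KKT system (2.2) can be verified numerically by a short routine such as the following sketch (the function name and tolerance are ours):

```python
# Illustrative check (ours) of the KKT system (2.2) for the trust-region
# subproblem (2.1): (A + lam*I) x = -b, lam*(||x|| - r) = 0, with lam >= 0
# and x in E = {x : ||x|| <= r}. The tolerance tol is a numerical safeguard.
import numpy as np

def is_kkt_point(A, b, r, x, lam, tol=1e-8):
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.asarray(x, dtype=float)
    stationarity = np.linalg.norm((A + lam * np.eye(len(x))) @ x + b) <= tol
    feasibility = np.linalg.norm(x) <= r + tol
    complementarity = abs(lam * (np.linalg.norm(x) - r)) <= tol
    return (lam >= -tol) and stationarity and feasibility and complementarity
```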

Algorithm

2.2.1 The Pham Dinh–Le Thi Algorithm

Applying the DCA presented in Chapter 1 to (2.1), we have the following iterative algorithm:

1. Choose ρ > 0 so that ρI − A is a positive semidefinite matrix.

2. Fix an initial point x⁰ ∈ Rn and a constant ε ≥ 0 (a tolerance). Set k = 0.

3. If

ρ⁻¹‖(ρI − A)xᵏ − b‖ ≤ r,   (2.3)

then take


where θ₁ is the smallest eigenvalue of the matrix ρI − A. If x⁰ ∈ E, then the inequality holds for all k ≥ 0. It holds that lim_{k→∞} ‖xᵏ⁺¹ − xᵏ‖ = 0 and f(xᵏ) → β ≥ α as k → ∞, where α is the optimal value of (2.1) and β is a constant depending on the choice of x⁰. In addition, every cluster point of the sequence {xᵏ} is a KKT point.

After proving that “if A is a nonsingular matrix that has no multiple negative eigenvalues, then any DCA sequence of (2.1) converges to a KKT point”, H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., 2012) posed the following

Question. Under what conditions is the DCA sequence {xᵏ} convergent?

The next sections are aimed at completely solving the above question. It will be proved that any DCA sequence constructed by the Pham Dinh–Le Thi algorithm for the trust-region subproblem (2.1) converges to a KKT point.

2.2.2 Restart Procedure

DCA for finding global solutions of (2.1):

Start. Compute λ₁ (the smallest eigenvalue of A), λₙ (the largest eigenvalue of A), and an eigenvector u corresponding to λ₁ by a suitable algorithm.

Take ρ > max{0, λₙ}, x ∈ dom f, stop := false.
