INTERIOR AND EXTERIOR PENALTY METHODS TO SOLVE NONLINEAR OPTIMIZATION PROBLEMS
ADDIS ABABA UNIVERSITY
DEPARTMENT OF MATHEMATICS
The undersigned hereby certify that they have read and recommend to the Department of Mathematics for acceptance of this project entitled "Interior and Exterior Penalty Methods to Solve Nonlinear Programming Problems" by Kiflu Kemal, in partial fulfilment of the requirements for the degree of Master of Science in Mathematics.
Advisor: Dr Berhanu Guta
ADDIS ABABA UNIVERSITY
Author: Kiflu Kemal
Title: Interior and Exterior Penalty Methods to Solve Nonlinear Optimization Problems
I would like to express my gratitude to my advisor, Dr Berhanu Guta, for all his dedication, patience, and advice. I would also like to thank all Mathematics instructors for their motivation and guidance during the past two years at the Department of Mathematics, Addis Ababa University, as well as the library workers. My thanks also go to my brother Tilahun Blayneh and my confessor Aba Zerea Dawit, who urged me to join the Department of Mathematics at Addis Ababa University, and to all of my family and friends for their invaluable love and support.
Interior penalty (barrier) methods favour points in the interior of the feasible region over those near the boundary. For a problem with n variables and m constraints, both approaches work directly in the n-dimensional space of the variables. The discussion that follows emphasizes exterior penalty methods, recognizing that interior penalty function methods embody the same principles.
Keywords: Constrained optimization, unconstrained optimization, exterior penalty methods, interior penalty (barrier) methods, penalty parameter, penalty function, penalty term, auxiliary function, nonlinear programming.
List of Notations
∇f : gradient of real valued function f
∇^t f : transpose of the gradient
ℝ: set of real numbers
ℝ^n: n-dimensional space
ℝ^{n×m}: space of real n × m matrices
C: a cone
∂f/∂x: partial derivative of f with respect to x
H(x): Hessian matrix of a function at x
L: Lagrangian function
L(., λ, µ): Lagrangian function with Lagrange multipliers λ and µ
fµk: auxiliary function for penalty methods with penalty parameter µk
α(x): penalty function
P (x): barrier function
SDP: positive semidefinite
φµk: auxiliary function for barrier methods with penalty parameter µk
⟨λ, h⟩: inner product of vectors λ and h
f ∈ C1: f is once continuously differentiable
f ∈ C2: f is twice continuously differentiable
Table of Contents
1 Preliminary Concepts
1.1 Convex Analysis
1.2 Convex set and Convex function
2 Optimization Theory and Methods
2.1 Some Classes of Optimization Problems
2.1.1 Linear Programming
2.1.2 Quadratic Programming
2.1.3 Non Linear Programming Problems
2.2 Unconstrained Optimization
2.3 Optimality Conditions
2.4 Constrained Optimization
2.4.1 Optimality Conditions for Equality Constrained Optimization
2.4.2 Optimality Conditions for General Constrained Optimization
2.5 Methods to Solve Unconstrained Optimization Problems
3 Interior and Exterior Penalty Methods
3.1 The Concept Of Penalty Functions
3.2 Interior Penalty Function Methods
3.2.1 Algorithmic Scheme For Interior Penalty Function Methods
3.2.2 Convergence Of Interior Penalty Function Methods
3.3 Exterior Penalty Function Methods
3.3.1 Algorithmic Scheme For Exterior Penalty Function Methods
3.3.2 Convergence Of Exterior Penalty Function Methods
3.3.3 Penalty Function Methods and Lagrange Multipliers
Since the early 1960s, the idea of replacing a constrained optimization problem by a sequence of unconstrained problems parameterized by a scalar parameter µ has played a fundamental role in the formulation of algorithms (Bertsekas, 1999).
Penalty methods play a vital role in this replacement: they approximate the solution of a nonlinear constrained problem by minimizing a penalty function, for a large value of µ in the exterior case or a small value of µ in the interior case.
Generally, penalty methods can be categorized into two types: exterior penalty function methods (often simply called penalty function methods) and interior penalty (barrier) function methods.
In exterior penalty methods, some or all of the constraints are eliminated, and a penalty term prescribing a high cost to infeasible points is added to the objective function. Associated with these methods is a parameter µ, which determines the severity of the penalty and, as a consequence, the extent to which the resulting unconstrained problem approximates the original constrained problem. This can be illustrated as follows:
Minimize f(x)
subject to gi(x) ≤ 0, i = 1, …, m,
x ∈ ℝ^n   (1)
By using the exterior penalty function method, the constrained optimization problem is converted into the following unconstrained form:

Minimize f(x) + µ Σ_{i=1}^{m} (max{0, gi(x)})²   (2)
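As a concrete illustration of (2), the following sketch minimizes the auxiliary function for an increasing sequence of penalty parameters µ. It assumes SciPy is available; the objective f and constraint g are illustrative examples, not taken from this project.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data: minimize f(x) = (x1 - 2)^2 + (x2 - 1)^2
# subject to g(x) = x1 + x2 - 2 <= 0; the constrained minimizer is (1.5, 0.5).
def f(x):
    return (x[0] - 2.0)**2 + (x[1] - 1.0)**2

def g(x):
    return x[0] + x[1] - 2.0

def auxiliary(x, mu):
    # Exterior quadratic penalty of (2): f(x) + mu * (max{0, g(x)})^2
    return f(x) + mu * max(0.0, g(x))**2

x = np.zeros(2)
for mu in [1.0, 10.0, 100.0, 1000.0]:
    # Each unconstrained solve is warm-started from the previous minimizer.
    x = minimize(auxiliary, x, args=(mu,), method="BFGS").x
    print(f"mu = {mu:7.1f}   x = {x}   g(x) = {g(x):+.4f}")
```

The iterates approach the solution from outside the feasible region, the constraint violation g(x) > 0 shrinking as µ grows.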
Similar to exterior penalty functions, interior penalty functions are also used to transform a constrained problem into an unconstrained problem or into a sequence of unconstrained problems. These functions set a barrier against leaving the feasible region. We can solve problem (1) by the interior penalty function method by converting it into an unconstrained problem in the following fashion:

Minimize f(x) − Σ_{i=1}^{m} µ/gi(x), for gi(x) < 0 and i = 1, …, m   (3)
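A matching sketch for the inverse barrier of (3), on the same illustrative problem as before (again assuming SciPy; data are our own, not from the text):

```python
import numpy as np
from scipy.optimize import minimize

# Inverse barrier of (3): phi(x) = f(x) - mu / g(x), defined only where g(x) < 0.
def f(x):
    return (x[0] - 2.0)**2 + (x[1] - 1.0)**2

def g(x):
    return x[0] + x[1] - 2.0

def barrier_aux(x, mu):
    if g(x) >= 0.0:
        return np.inf  # reject any point outside the strict interior
    return f(x) - mu / g(x)

x = np.zeros(2)  # strictly feasible start: g(0, 0) = -2 < 0
for mu in [1.0, 0.1, 0.01, 0.001]:
    # Nelder-Mead tolerates the +inf values used to wall off the boundary.
    x = minimize(barrier_aux, x, args=(mu,), method="Nelder-Mead").x
    print(f"mu = {mu:6.3f}   x = {x}   g(x) = {g(x):+.5f}")
```

In contrast to the exterior method, the iterates remain strictly feasible and approach the boundary from inside as µ decreases to zero.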
In Chapter 1 we discuss some basic concepts of convex analysis and some additional preliminary concepts that help in understanding the ideas of the project.

Chapter 2 explains the theory of nonlinear optimization, both unconstrained and constrained. The chapter focuses mainly on minimization theory and the basic conditions related to this optimization point of view.

Chapter 3 discusses interior penalty function methods and exterior penalty function methods. Throughout the chapter, we describe basic concepts and properties of the methods for nonlinear optimization problems. Definitions, algorithmic schemes of the respective methods, convergence theories, and special properties of these methods are discussed in the chapter.
Chapter 1
Preliminary Concepts
1.1 Convex Analysis
1.2 Convex set and Convex function
The concepts of convex sets and convex functions play an important role in the study of optimization (W. Sun, 2007).
Definition 1.2.1 (Convex Sets) A set S ⊂ ℝ^n is said to be convex if the line segment joining any two points of S also belongs to S. In other words, if x1, x2 ∈ S, then λx1 + (1 − λ)x2 ∈ S for each λ ∈ [0, 1].
A convex combination of a finite set of vectors {x1, x2, …, xn} in ℝ^n is any vector x of the form

x = Σ_{i=1}^{n} αi xi, where Σ_{i=1}^{n} αi = 1 and αi ≥ 0 for all i = 1, 2, …, n.

The convex hull of the set S containing {x1, x2, …, xn}, denoted by conv(S), is the set of all convex combinations of S. In other words, x ∈ conv(S) if and only if x can be represented as a convex combination of {x1, x2, …, xn}.
If the non-negativity of the multipliers αi for i = 1, 2, , n is ignored, then the combination
is said to be an affine combination
A cone is a non-empty set C with the property that for all x ∈ C and all α ≥ 0, αx ∈ C; symbolically, we can write:

x ∈ C ⇒ αx ∈ C, for all α ≥ 0.
For instance, the set C ⊂ ℝ^2 defined by {(x1, x2)^t | x1 ≥ 0, x2 ≥ 0} is a cone in ℝ^2.
Note that cones are not necessarily convex
For example, the set {(x1, x2)^t | x1 ≥ 0 or x2 ≥ 0}, which encompasses three quarters of the two-dimensional plane, is a cone but is not convex.
The cone generated by {x1, x2, …, xn} is the set of all vectors x of the form

x = Σ_{i=1}^{n} αi xi, where αi ≥ 0 for all i = 1, 2, …, n.
Note that all cones of this form are convex
Definition 1.2.2 (Convex Functions) Let S be a non-empty convex set in ℝ^n. As defined in (J. Jahn, 1996), a function f : S → ℝ is said to be convex if for all x1, x2 ∈ S,

f[λx1 + (1 − λ)x2] ≤ λf(x1) + (1 − λ)f(x2)

for each λ ∈ [0, 1].
The function f is said to be strictly convex on S if
f [λx1+ (1 − λ)x2] < λf (x1) + (1 − λ)f (x2)
for each distinct x1, x2 ∈ S and for each λ ∈ (0, 1)
Theorem 1.2.1 (Jensen's Inequality) If g is a convex function on a convex set X and x = Σ_{i=1}^{n} αi xi is a convex combination of points x1, …, xn ∈ X (so that Σ_{i=1}^{n} αi = 1 and αi ≥ 0 for all i), then g(x) ≤ Σ_{i=1}^{n} αi g(xi).
Definition 1.2.3 (Concave Functions) A function f(x) is said to be concave over the region S if for any two points x1, x2 ∈ S we have
f [λx1+ (1 − λ)x2] ≥ λf (x1) + (1 − λ)f (x2)
where λ ∈ [0, 1]
The function f is strictly concave on S if
f [λx1+ (1 − λ)x2] > λf (x1) + (1 − λ)f (x2)
for each distinct x1, x2 ∈ S and for each λ ∈ (0, 1)
Similarly, we can say that if the function −f is a convex (strictly convex, uniformly convex) function on S, then f is said to be a concave (strictly concave, uniformly concave) function (W. Sun, 2006).
Lemma 1.2.1 Let S be a non-empty convex set in ℝ^n, and let f : S → ℝ be a convex function. Then the level set Sα = {x ∈ S : f(x) ≤ α}, where α is a real number, is a convex set.
Proposition 1.2.1 If g is a convex function on a convex set X, then the function g⁺(x) = max{g(x), 0} is also convex on X.
Proof 1.2.1 Suppose x, y ∈ X and λ ∈ [0, 1]. Then

g⁺(λx + (1 − λ)y) = max{g(λx + (1 − λ)y), 0}
≤ max{λg(x) + (1 − λ)g(y), 0}, since g is convex
≤ max{λg(x), 0} + max{(1 − λ)g(y), 0}
= λ max{g(x), 0} + (1 − λ) max{g(y), 0}
= λg⁺(x) + (1 − λ)g⁺(y).
Proposition 1.2.2 If h is convex and non-negative on a convex set X, then h² is also convex on X.
Proof 1.2.2 Suppose x, y ∈ X and λ ∈ [0, 1]. Then

h²(λx + (1 − λ)y) = [h(λx + (1 − λ)y)][h(λx + (1 − λ)y)]
≤ [λh(x) + (1 − λ)h(y)][λh(x) + (1 − λ)h(y)], since h is convex and non-negative
= λh²(x) + (1 − λ)h²(y) − λ(1 − λ)(h(x) − h(y))²
≤ λh²(x) + (1 − λ)h²(y).
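Taken together, Propositions 1.2.1 and 1.2.2 imply that the exterior penalty term (max{g(x), 0})² of (2) is convex whenever g is. The following spot-check (an illustrative numerical sketch with a convex g of our own choosing, not a proof) tests the defining inequality on random segments:

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):
    # An arbitrary convex function on R^2, chosen for illustration.
    return x[0]**2 + np.exp(x[1]) - 3.0

def penalty_term(x):
    # (g+)^2 = (max{g(x), 0})^2, convex by Propositions 1.2.1 and 1.2.2.
    return max(g(x), 0.0)**2

for _ in range(10_000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    lam = rng.uniform()
    lhs = penalty_term(lam * x + (1 - lam) * y)
    rhs = lam * penalty_term(x) + (1 - lam) * penalty_term(y)
    assert lhs <= rhs + 1e-12, "convexity inequality violated"
print("convexity inequality held on 10000 random segments")
```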
Chapter 2
Optimization Theory and Methods
Optimization theory and methods is a young subject in applied mathematics, computational mathematics, and operations research (W. Sun, 2006).
The subject is concerned with the optimal solution of problems that are defined mathematically; i.e., given a practical problem, the best solution to the problem can be found from many candidate schemes by means of scientific methods and tools. It involves the study of optimality conditions of the problems, the construction of model problems, the determination of algorithmic methods of solution, the establishment of convergence theory for the algorithms, and numerical experiments with typical problems and real-life problems.
The general form of an optimization problem is

Minimize f(x) subject to x ∈ X   (2.1)

where x ∈ X is the decision variable, f : ℝ^n → ℝ is the objective function, and the set X ⊆ ℝ^n is the feasible set of (2.1). Based on the form of the function f and the feasible set X, problem (2.1) can be classified as a linear, quadratic, nonlinear, or multiple-objective problem, etc.
2.1 Some Classes of Optimization Problems
2.1.1 Linear Programming

The general form of a linear programming problem is:

Minimize C^T x
subject to Ax = a,
Bx ≤ b,
x ∈ ℝ^n   (2.2)

where f(x) = C^T x and X = {x ∈ ℝ^n | Ax = a, Bx ≤ b}.
Under linear programming there are practical problems such as linear discrete problems, transportation problems, and network flow problems, and we use the simplex method, Big-M method, dual simplex method, graphical method, etc., to find solutions of those linear programming problems.
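For completeness, a small instance of (2.2) can be solved with an off-the-shelf solver. This sketch assumes SciPy's linprog; the data C, A, a, B, b are illustrative, not from the text:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative instance of (2.2): minimize C^T x subject to Ax = a, Bx <= b, x >= 0.
C = np.array([-1.0, -2.0])
A, a = np.array([[1.0, 1.0]]), np.array([4.0])    # equality: x1 + x2 = 4
B, b = np.array([[1.0, -1.0]]), np.array([2.0])   # inequality: x1 - x2 <= 2

res = linprog(C, A_ub=B, b_ub=b, A_eq=A, b_eq=a,
              bounds=[(0, None)] * 2, method="highs")
print(res.x, res.fun)  # expected: x = [0, 4], objective -8
```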
2.1.2 Quadratic Programming

The general form of a quadratic programming problem is:

Minimize (1/2)x^T Qx + q^T x + r
subject to Ax = a,
Bx ≤ b,
x ∈ ℝ^n   (2.3)

Here the objective function f(x) = (1/2)x^T Qx + q^T x + r is quadratic, while the feasible set X = {x ∈ ℝ^n | Ax = a, Bx ≤ b} is defined using linear functions, and r is a constant.
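When only the equality constraints of (2.3) are present and Q is positive definite, the optimality conditions reduce to a single linear system (see the Lagrangian conditions in Section 2.4.1). A minimal sketch with illustrative data:

```python
import numpy as np

# Equality-constrained instance of (2.3): min (1/2) x^T Q x + q^T x  s.t.  Ax = a.
# Setting the gradient of the Lagrangian to zero gives the linear KKT system
#   [Q  A^T] [x  ]   [-q]
#   [A   0 ] [lam] = [ a]
Q = np.array([[2.0, 0.5], [0.5, 1.0]])   # symmetric positive definite
q = np.array([-1.0, -2.0])
A, a = np.array([[1.0, 1.0]]), np.array([1.0])

n, l = Q.shape[0], A.shape[0]
K = np.block([[Q, A.T], [A, np.zeros((l, l))]])
sol = np.linalg.solve(K, np.concatenate([-q, a]))
x, lam = sol[:n], sol[n:]
print("x* =", x, "  lambda* =", lam)
```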
2.1.3 Non Linear Programming Problems

The general form of a non-linear optimization problem is:

Minimize f(x)
subject to hi(x) = 0, for i = 1, …, l,
gj(x) ≤ 0, for j = 1, …, m,
x ∈ ℝ^n   (2.4)

where we assume that all the functions are smooth. The feasible set of the NLPP is given by X = {x ∈ ℝ^n | hi(x) = 0 for i = 1, …, l; gj(x) ≤ 0 for j = 1, …, m}. Throughout this paper our interest is in solving non-linear programming problems by classifying them primarily as unconstrained and constrained optimization problems. In particular, if the feasible set is X = ℝ^n, the optimization problem (2.1) is called an unconstrained optimization problem, whereas problems of type (2.4) are said to be constrained optimization problems. Generally, optimization problems can thus be classified as unconstrained and constrained.
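General-purpose solvers attack (2.4) directly, whereas the penalty methods of Chapter 3 reduce it to a sequence of unconstrained problems. For comparison, a sketch using SciPy's SLSQP on illustrative data (note that SciPy writes inequality constraints as fun(x) >= 0, so g is negated):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative instance of (2.4) with one equality and one inequality constraint.
f = lambda x: (x[0] - 1.0)**2 + (x[1] + 0.5)**2
h = lambda x: x[0]**2 + x[1]**2 - 1.0      # h(x) = 0
g = lambda x: x[0] + x[1] - 1.0            # g(x) <= 0

res = minimize(f, x0=np.array([0.5, -0.5]), method="SLSQP",
               constraints=[{"type": "eq", "fun": h},
                            {"type": "ineq", "fun": lambda x: -g(x)}])
print(res.x, res.fun)
```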
2.2 Unconstrained Optimization
Unconstrained optimization problem has the following form
Minimize f(x), subject to x ∈ ℝ^n,

where f : ℝ^n → ℝ is a given function. The first task is to derive conditions that allow us to decide whether a given point is a minimizer or not.
Definition 2.2.1 i A point x∗ is a local minimizer if there is a neighbourhood η of x∗ such that f(x∗) ≤ f(x) for all x ∈ η.
ii A point x∗ is a strict local minimizer (also called a strong local minimizer) if there is a neighbourhood η of x∗ such that f(x∗) < f(x) for all x ∈ η with x ≠ x∗.
iii A point x∗ is an isolated local minimizer if there is a neighbourhood η of x∗ such that x∗
is the only local minimizer in η
iv All isolated local minimizers are strict local minimizers
v We say that x∗ is a global minimizer if
f (x∗) ≤ f (x) for all x ∈ <n
When the function f is smooth, there are efficient and practical ways to identify local minima. In particular, if f is twice continuously differentiable, we may be able to tell that x∗ is a local minimizer (and possibly a strict local minimizer) by examining just the gradient ∇f(x∗) and the Hessian ∇²f(x∗).
There is no general procedure to determine whether a local minimum is really a global minimum in a non-linear optimization problem (Kumar, 2014).
Definition 2.2.2 (Gradient) The gradient of f : ℝ^n → ℝ at x∗ ∈ ℝ^n is defined as

∇f(x∗) = (∂f(x∗)/∂x1, …, ∂f(x∗)/∂xn)^t.
The second-order conditions can be summarized as follows.

1 For one variable x:
• if ∂²f(x∗)/∂x² > 0, then f attains its minimum at x∗;
• if ∂²f(x∗)/∂x² < 0, then f attains its maximum at x∗;
• if ∂²f(x∗)/∂x² = 0, then f needs further investigation.

2 For two variables x1, x2, let r = ∂²f/∂x1², s = ∂²f/∂x1∂x2 and t = ∂²f/∂x2² at x∗; these are the entries of H(x), also called the Hessian matrix. Then:
• if rt − s² > 0 and r > 0, then f attains its minimum;
• if rt − s² > 0 and r < 0, then f attains its maximum;
• if rt − s² < 0, then x∗ is a saddle point;
• if rt − s² = 0, then f needs further investigation.

3 In general, in terms of the Hessian H = H(x∗):
• if H is positive definite, then f attains its minimum;
• if H is negative definite, then f attains its maximum;
• if H is indefinite or singular (in particular, if |H| = 0), then f needs further investigation.
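In practice the definiteness of the Hessian is checked via its eigenvalues rather than raw determinants. A small sketch (illustrative; the function name is our own):

```python
import numpy as np

def classify_stationary_point(H):
    """Classify a stationary point x* from the eigenvalues of the Hessian H(x*)."""
    eigs = np.linalg.eigvalsh(H)          # eigvalsh: for symmetric matrices
    if np.all(eigs > 0):
        return "local minimum"            # H positive definite
    if np.all(eigs < 0):
        return "local maximum"            # H negative definite
    if np.any(eigs > 0) and np.any(eigs < 0):
        return "saddle point"             # H indefinite
    return "needs further investigation"  # H singular (some zero eigenvalues)

# f(x1, x2) = x1^2 - x2^2 is stationary at the origin; its Hessian there is:
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -2.0]])))  # saddle point
```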
Definition 2.2.3 i A symmetric n × n matrix M is said to be positive semidefinite if x^t Mx ≥ 0 for all x ∈ ℝ^n; in this case we write M ≥ 0.
ii We say that M is positive definite if x^t Mx > 0 for all x ≠ 0.
iv Let M ∈ ℝ^{n×n} be a matrix. Then the eigenvalues of M are the scalars λ such that Mx = λx for some x ≠ 0.
Property 2.2.2 If M is positive definite, then M−1 is positive definite
Property 2.2.3 Let P be a symmetric n × n matrix and Q be a positive semidefinite n × n matrix. Assume that x^t Px > 0 for all x ≠ 0 satisfying x^t Qx = 0. Then there exists a scalar µ such that P + µQ is positive definite.
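Property 2.2.3 underlies penalty-type constructions: adding enough of a semidefinite term makes an indefinite matrix positive definite. A quick numerical illustration with hypothetical P and Q satisfying the hypothesis:

```python
import numpy as np

def is_positive_definite(M):
    try:
        np.linalg.cholesky(M)   # Cholesky succeeds iff M is positive definite
        return True
    except np.linalg.LinAlgError:
        return False

P = np.array([[1.0, 0.0], [0.0, -1.0]])  # symmetric but indefinite
Q = np.array([[0.0, 0.0], [0.0, 1.0]])   # positive semidefinite: x^t Q x = x2^2
# On {x : x^t Q x = 0} = {(x1, 0)}, x^t P x = x1^2 > 0, so the hypothesis holds.

mu = 1.0
while not is_positive_definite(P + mu * Q):
    mu *= 2.0                  # P + mu*Q = diag(1, mu - 1), definite once mu > 1
print(f"P + {mu} * Q is positive definite")
```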
2.3 Optimality Conditions
Proposition 2.3.1 (Necessary Optimality Conditions) Assume that x∗ is a local minimizer of f and f ∈ C1 over η. Then ∇f(x∗) = 0. If in addition f ∈ C2 over η, then ∇²f(x∗) is positive semidefinite.
Proposition 2.3.2 (Sufficient Optimality Conditions) Let f ∈ C2 over η, ∇f(x∗) = 0, and ∇²f(x∗) > 0, i.e., the function is locally convex at x∗. Then x∗ is a strict local minimizer of f.
Note that if the objective function is convex, local and global minimizers are simple to characterize.
Theorem 2.3.1 When f is convex, any local minimizer x∗ is a global minimizer of f. If in addition f is differentiable, then any stationary point x∗ (i.e., a point satisfying the condition ∇f(x∗) = 0) is a global minimizer of f.
Proof 2.3.1 Suppose that x∗ is a local but not a global minimizer. Then we can find a point z ∈ ℝ^n with f(z) < f(x∗). Consider the line segment that joins x∗ and z, that is,

x = λz + (1 − λ)x∗, for some λ ∈ (0, 1]. (2.5)

By the convexity property of f, we have

f(x) ≤ λf(z) + (1 − λ)f(x∗) < f(x∗). (2.6)
Any neighbourhood N of x∗ contains a piece of the line segment (2.5), so there will always be points x ∈ N at which (2.6) is satisfied. Hence, x∗ is not a local minimizer, which contradicts the assumption. Therefore, x∗ is a global minimizer.
For the second part of the theorem, suppose that x∗ is a stationary point but not a global minimizer, and choose z as above. Then, from convexity, we have

f(z) ≥ f(x∗) + ∇f(x∗)^t (z − x∗) = f(x∗),

since ∇f(x∗) = 0, which contradicts f(z) < f(x∗). Hence x∗ is a global minimizer.
2.4 Constrained Optimization

The points belonging to the feasible region are called feasible points. A vector d ∈ ℝ^n is a feasible direction at x ∈ X if d ≠ 0 and x + αd ∈ X for some sufficiently small α > 0. At a feasible point x, the inequality constraint gj is said to be active if gj(x) = 0. The set A(x) = {j : gj(x) = 0, j = 1, …, m} denotes the index set of the active (binding) inequality constraints at x.
Note that if the set X is convex and the objective function f is convex, then (2.8) is called a convex optimization problem.
Definition 2.4.1 Definitions of the different types of local minimizing solutions are simple extensions of the corresponding definitions for the unconstrained case, except that now we restrict consideration to the feasible points in the neighbourhood of x∗.
i A vector x∗ is a local solution of problem (2.9) if x∗ ∈ X and there is a neighbourhood η of x∗ such that f(x) ≥ f(x∗) for x ∈ η ∩ X.
ii A vector x∗ is a strict local solution (also called a strong local solution) if x∗ ∈ X and there is a neighbourhood η of x∗ such that f(x) > f(x∗) for x ∈ η ∩ X with x ≠ x∗.
iii A vector x∗ is an isolated local solution if x∗ ∈ X and there is a neighbourhood η of x∗ such that x∗ is the only local solution in η ∩ X.
Note that isolated local solutions are strict, but that the reverse is not true
Theorem 2.4.1 Assume that X is a convex set and, for some ε > 0 and x∗ ∈ X, f ∈ C1 over S(x∗; ε). If x∗ is a local minimizer, then

∇f(x∗)^t d ≥ 0, (2.10)

where d = x − x∗ is a feasible direction, for all x ∈ X. If in addition f is convex over X and (2.10) holds, then x∗ is a global minimizer.
Proof 2.4.1 Let d be a feasible direction. If ∇f(x∗)^t d < 0 (i.e., if d is a descent direction at x∗), then f(x∗ + αd) < f(x∗) for all sufficiently small α > 0 (i.e., for all α ∈ (0, ᾱ) for some ᾱ > 0). This is a contradiction, since x∗ is a local minimizer.
Theorem 2.4.2 (Weierstrass's Theorem) Let X be a non-empty, compact set in ℝ^n, and let f : X → ℝ be continuous on X. Then the problem min{f(x) : x ∈ X} attains its minimum; that is, there is a minimizing point for this problem.
2.4.1 Optimality Conditions for Equality Constrained Optimization
Consider the equality constrained problem
Minimize f(x)
subject to h(x) = 0, (2.11)

where f : ℝ^n → ℝ and h : ℝ^n → ℝ^l are given functions, and h1, …, hl are the components of h.
Definition 2.4.2 (Regular Point) Let x∗ be a vector such that h(x∗) = 0 and, for some ε > 0, h ∈ C1 on S(x∗; ε). We say that x∗ is a regular point if the gradients ∇h1(x∗), …, ∇hl(x∗) are linearly independent.
Definition 2.4.3 (Lagrangian Function) The Lagrangian function L : ℝ^{n+l} → ℝ for problem (2.11) is defined by

L(x, λ) = f(x) + ⟨λ, h(x)⟩,

where λ = (λ1, …, λl) is the Lagrange multiplier vector of h.
Proposition 2.4.1 (Karush-Kuhn-Tucker (KKT) Necessary Conditions) Let x∗ be a local minimum of (2.11) and assume that, for some ε > 0, f ∈ C1 and h ∈ C1 on S(x∗; ε), and that x∗ is a regular point. Then there exists a unique vector λ∗ ∈ ℝ^l such that

∇x L(x∗, λ∗) = 0. (2.12)

If in addition f ∈ C2 and h ∈ C2 on S(x∗; ε), then for all z ∈ ℝ^n satisfying ∇h(x∗)^t z = 0, we have

z^t ∇xx L(x∗, λ∗) z ≥ 0. (2.13)

Theorem 2.4.3 (KKT Sufficient Conditions) Let x∗ ∈ ℝ^n be such that h(x∗) = 0 and, for some ε > 0, f ∈ C2 and h ∈ C2 on S(x∗; ε). Assume that there exists a vector λ∗ ∈ ℝ^l such that

∇x L(x∗, λ∗) = 0, (2.14)

and for every z ≠ 0 satisfying ∇h(x∗)^t z = 0, we have

z^t ∇xx L(x∗, λ∗) z > 0. (2.15)

Then x∗ is a strict local minimizer of (2.11).
Remark: A point is said to be a KKT point if it satisfies all the KKT necessary conditions.
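The stationarity condition (2.12) can be solved symbolically for small problems. A sketch assuming SymPy is available, on an illustrative instance of (2.11):

```python
import sympy as sp

# Illustrative instance of (2.11): minimize f(x) = x1^2 + x2^2
# subject to h(x) = x1 + x2 - 1 = 0.
x1, x2, lam = sp.symbols("x1 x2 lam", real=True)
f = x1**2 + x2**2
h = x1 + x2 - 1
L = f + lam * h                                   # L(x, lam) = f(x) + <lam, h(x)>

stationarity = [sp.diff(L, v) for v in (x1, x2)]  # grad_x L(x*, lam*) = 0, i.e. (2.12)
print(sp.solve(stationarity + [h], [x1, x2, lam], dict=True))
# -> [{x1: 1/2, x2: 1/2, lam: -1}]
```

Here ∇xx L = 2I is positive definite, so by Theorem 2.4.3 the point (1/2, 1/2) is a strict local minimizer.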
2.4.2 Optimality Conditions for General Constrained Optimization
Consider the constrained problem involving both equality and inequality constraints
Minimize f(x)
subject to hi(x) = 0,