INTERIOR AND EXTERIOR PENALTY METHODS TO SOLVE NONLINEAR OPTIMIZATION PROBLEMS
ADDIS ABABA UNIVERSITY
DEPARTMENT OF MATHEMATICS
The undersigned hereby certify that they have read and recommend to the Department of Mathematics for acceptance of this project entitled "Interior and Exterior Penalty Methods to Solve Nonlinear Programming Problems" by Kiflu Kemal, in partial fulfilment of the requirements for the degree of Master of Science in Mathematics.
Advisor: Dr Berhanu Guta
ADDIS ABABA UNIVERSITY
Author: Kiflu Kemal
Title: Interior and Exterior Penalty Methods to Solve Nonlinear Optimization Problems
I would like to express my gratitude to my advisor, Dr Berhanu Guta, for all his dedication, patience, and advice. I would also like to thank all Mathematics instructors for their motivation and guidance during the past two years at the Department of Mathematics, Addis Ababa University, as well as the library workers. My thanks also go to my brother Tilahun Blayneh and my confessor Aba Zerea Dawit, who urged me to join the Department of Mathematics at Addis Ababa University, and to all of my family and friends for their invaluable love and support.
Interior penalty (barrier) methods favour points in the interior of the feasible region over those near the boundary. For a problem with n variables and m constraints, both approaches work directly in the n-dimensional space of the variables. The discussion that follows emphasizes exterior penalty methods, recognizing that interior penalty function methods embody the same principles.
Keywords: Constrained optimization, unconstrained optimization, exterior penalty methods, interior penalty (barrier) methods, penalty parameter, penalty function, penalty term, auxiliary function, nonlinear programming.
List of Notations
∇f : gradient of real valued function f
∇^t f : transpose of the gradient
ℝ: set of real numbers
ℝ^n: n-dimensional space
ℝ^{n×m}: space of real n × m matrices
C: a cone
∂f/∂x: partial derivative of f with respect to x
H(x): Hessian matrix of a function at x
L: Lagrangian function
L(., λ, µ): Lagrangian function with Lagrange multipliers λ and µ
fµk: auxiliary function for penalty methods with penalty parameter µk
α(x): penalty function
P (x): barrier function
SDP: positive semidefinite
φµk: auxiliary function for barrier methods with penalty parameter µk
⟨λ, h⟩: inner product of vectors λ and h
f ∈ C1: f is once continuously differentiable
f ∈ C2: f is twice continuously differentiable
Table of Contents
1 Preliminary Concepts
1.1 Convex Analysis
1.2 Convex set and Convex function
2 Optimization Theory and Methods
2.1 Some Classes of Optimization Problems
2.1.1 Linear Programming
2.1.2 Quadratic Programming
2.1.3 Non Linear Programming Problems
2.2 Unconstrained Optimization
2.3 Optimality Conditions
2.4 Constrained Optimization
2.4.1 Optimality Conditions for Equality Constrained Optimization
2.4.2 Optimality Conditions for General Constrained Optimization
2.5 Methods to Solve Unconstrained Optimization Problems
3 Interior and Exterior Penalty Methods
3.1 The Concept Of Penalty Functions
3.2 Interior Penalty Function Methods
3.2.1 Algorithmic Scheme For Interior Penalty Function Methods
3.2.2 Convergence Of Interior Penalty Function Methods
3.3 Exterior Penalty Function Methods
3.3.1 Algorithmic Scheme For Exterior Penalty Function Methods
3.3.2 Convergence Of Exterior Penalty Function Methods
3.3.3 Penalty Function Methods and Lagrange Multipliers
Since the early 1960s, the idea of replacing a constrained optimization problem by a sequence of unconstrained problems parameterized by a scalar parameter µ has played a fundamental role in the formulation of algorithms (Bertsekas, 1999).
Penalty methods play a vital role in this replacement: they approximate the solution of a nonlinear constrained problem by minimizing a penalty function, for a large value of µ in the exterior case or a small value of µ in the interior case.
Generally, penalty methods can be categorized into two types: exterior penalty function methods (often simply called penalty function methods) and interior penalty (barrier) function methods.
In exterior penalty methods, some or all of the constraints are eliminated, and a penalty term prescribing a high cost to infeasible points is added to the objective function. Associated with these methods is a parameter µ, which determines the severity of the penalty and, as a consequence, the extent to which the resulting unconstrained problem approximates the original constrained problem. This can be illustrated as follows:
Minimize f(x)
subject to gi(x) ≤ 0, i = 1, …, m,
x ∈ ℝ^n   (1)
By using the exterior penalty function method, the constrained optimization problem is converted into the following unconstrained form:

Minimize f(x) + µ Σ_{i=1}^{m} (max{0, gi(x)})²   (2)
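As a concrete illustration of (2), the following sketch minimizes the auxiliary function for an increasing sequence of penalty parameters µ. It assumes SciPy is available; the objective f and constraint g are illustrative examples, not taken from this project.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data: minimize f(x) = (x1 - 2)^2 + (x2 - 1)^2
# subject to g(x) = x1 + x2 - 2 <= 0; the constrained minimizer is (1.5, 0.5).
def f(x):
    return (x[0] - 2.0)**2 + (x[1] - 1.0)**2

def g(x):
    return x[0] + x[1] - 2.0

def auxiliary(x, mu):
    # Exterior quadratic penalty of (2): f(x) + mu * (max{0, g(x)})^2
    return f(x) + mu * max(0.0, g(x))**2

x = np.zeros(2)
for mu in [1.0, 10.0, 100.0, 1000.0]:
    # Each unconstrained solve is warm-started from the previous minimizer.
    x = minimize(auxiliary, x, args=(mu,), method="BFGS").x
    print(f"mu = {mu:7.1f}   x = {x}   g(x) = {g(x):+.4f}")
```

The iterates approach the solution from outside the feasible region, the constraint violation g(x) > 0 shrinking as µ grows.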
Similar to exterior penalty functions, interior penalty functions are also used to transform a constrained problem into an unconstrained problem or into a sequence of unconstrained problems. These functions set a barrier against leaving the feasible region. We can solve problem (1) by the interior penalty function method by converting it into an unconstrained problem in the following fashion:

Minimize f(x) − Σ_{i=1}^{m} µ/gi(x), for gi(x) < 0 and i = 1, …, m   (3)
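A matching sketch for the inverse barrier of (3), on the same illustrative problem as before (again assuming SciPy; data are our own, not from the text):

```python
import numpy as np
from scipy.optimize import minimize

# Inverse barrier of (3): phi(x) = f(x) - mu / g(x), defined only where g(x) < 0.
def f(x):
    return (x[0] - 2.0)**2 + (x[1] - 1.0)**2

def g(x):
    return x[0] + x[1] - 2.0

def barrier_aux(x, mu):
    if g(x) >= 0.0:
        return np.inf  # reject any point outside the strict interior
    return f(x) - mu / g(x)

x = np.zeros(2)  # strictly feasible start: g(0, 0) = -2 < 0
for mu in [1.0, 0.1, 0.01, 0.001]:
    # Nelder-Mead tolerates the +inf values used to wall off the boundary.
    x = minimize(barrier_aux, x, args=(mu,), method="Nelder-Mead").x
    print(f"mu = {mu:6.3f}   x = {x}   g(x) = {g(x):+.5f}")
```

In contrast to the exterior method, the iterates remain strictly feasible and approach the boundary from inside as µ decreases to zero.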
In Chapter 1 we discuss some basic concepts of convex analysis and some additional preliminary concepts that help in understanding the ideas of the project.

Chapter 2 explains the theory of nonlinear optimization, both unconstrained and constrained. The chapter focuses mainly on minimization theory and the basic conditions related to this optimization point of view.

Chapter 3 discusses interior penalty function methods and exterior penalty function methods. Throughout the chapter, we describe basic concepts and properties of the methods for nonlinear optimization problems. Definitions, algorithmic schemes of the respective methods, convergence theories, and special properties of these methods are discussed in the chapter.
Chapter 1
Preliminary Concepts
1.1 Convex Analysis
1.2 Convex set and Convex function
The concepts of convex sets and convex functions play an important role in the study of optimization (W. Sun, 2007).
Definition 1.2.1 (Convex Sets) A set S ⊂ ℝ^n is said to be convex if the line segment joining any two points of S also belongs to S. In other words, if x1, x2 ∈ S, then λx1 + (1 − λ)x2 ∈ S for each λ ∈ [0, 1].
A convex combination of a finite set of vectors {x1, x2, …, xn} in ℝ^n is any vector x of the form

x = Σ_{i=1}^{n} αi xi, where Σ_{i=1}^{n} αi = 1 and αi ≥ 0 for all i = 1, 2, …, n.

The convex hull of the set S containing {x1, x2, …, xn}, denoted by conv(S), is the set of all convex combinations of S. In other words, x ∈ conv(S) if and only if x can be represented as a convex combination of {x1, x2, …, xn}.
If the non-negativity of the multipliers αi for i = 1, 2, , n is ignored, then the combination
is said to be an affine combination
A cone is a non-empty set C with the property that for all x ∈ C and all α ≥ 0, αx ∈ C; symbolically, we can write:

x ∈ C ⇒ αx ∈ C, for all α ≥ 0.
For instance, the set C ⊂ ℝ^2 defined by {(x1, x2)^t | x1 ≥ 0, x2 ≥ 0} is a cone in ℝ^2.
Note that cones are not necessarily convex
For example, the set {(x1, x2)^t | x1 ≥ 0 or x2 ≥ 0}, which encompasses three quarters of the two-dimensional plane, is a cone but is not convex.
The cone generated by {x1, x2, …, xn} is the set of all vectors x of the form

x = Σ_{i=1}^{n} αi xi, where αi ≥ 0 for all i = 1, 2, …, n.
Note that all cones of this form are convex
Definition 1.2.2 (Convex Functions) Let S be a non-empty convex set in ℝ^n. As defined in (J. Jahn, 1996), a function f : S → ℝ is said to be convex if for all x1, x2 ∈ S,

f[λx1 + (1 − λ)x2] ≤ λf(x1) + (1 − λ)f(x2)

for each λ ∈ [0, 1].
The function f is said to be strictly convex on S if
f [λx1+ (1 − λ)x2] < λf (x1) + (1 − λ)f (x2)
for each distinct x1, x2 ∈ S and for each λ ∈ (0, 1)
Theorem 1.2.1 (Jensen's Inequality) If g is a convex function on a convex set X and x = Σ_{i=1}^{n} αi xi is a convex combination of points x1, …, xn ∈ X (so that Σ_{i=1}^{n} αi = 1 and αi ≥ 0 for all i), then g(x) ≤ Σ_{i=1}^{n} αi g(xi).
Definition 1.2.3 (Concave Functions) A function f(x) is said to be concave over the region S if for any two points x1, x2 ∈ S we have
f [λx1+ (1 − λ)x2] ≥ λf (x1) + (1 − λ)f (x2)
where λ ∈ [0, 1]
The function f is strictly concave on S if
f [λx1+ (1 − λ)x2] > λf (x1) + (1 − λ)f (x2)
for each distinct x1, x2 ∈ S and for each λ ∈ (0, 1)
Similarly, we can say that if the function −f is a convex (strictly convex, uniformly convex) function on S, then f is said to be a concave (strictly concave, uniformly concave) function (W. Sun, 2006).
Lemma 1.2.1 Let S be a non-empty convex set in ℝ^n, and let f : S → ℝ be a convex function. Then the level set Sα = {x ∈ S : f(x) ≤ α}, where α is a real number, is a convex set.
Proposition 1.2.1 If g is a convex function on a convex set X, then the function g⁺(x) = max{g(x), 0} is also convex on X.
Proof 1.2.1 Suppose x, y ∈ X and λ ∈ [0, 1]. Then

g⁺(λx + (1 − λ)y) = max{g(λx + (1 − λ)y), 0}
≤ max{λg(x) + (1 − λ)g(y), 0}, since g is convex
≤ max{λg(x), 0} + max{(1 − λ)g(y), 0}
= λ max{g(x), 0} + (1 − λ) max{g(y), 0}
= λg⁺(x) + (1 − λ)g⁺(y).
Proposition 1.2.2 If h is convex and non-negative on a convex set X, then h² is also convex on X.
Proof 1.2.2 Suppose x, y ∈ X and λ ∈ [0, 1]. Then

h²(λx + (1 − λ)y) = [h(λx + (1 − λ)y)][h(λx + (1 − λ)y)]
≤ [λh(x) + (1 − λ)h(y)][λh(x) + (1 − λ)h(y)], since h is convex and non-negative
= λh²(x) + (1 − λ)h²(y) − λ(1 − λ)(h(x) − h(y))²
≤ λh²(x) + (1 − λ)h²(y).
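Taken together, Propositions 1.2.1 and 1.2.2 imply that the exterior penalty term (max{g(x), 0})² of (2) is convex whenever g is. The following spot-check (an illustrative numerical sketch with a convex g of our own choosing, not a proof) tests the defining inequality on random segments:

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):
    # An arbitrary convex function on R^2, chosen for illustration.
    return x[0]**2 + np.exp(x[1]) - 3.0

def penalty_term(x):
    # (g+)^2 = (max{g(x), 0})^2, convex by Propositions 1.2.1 and 1.2.2.
    return max(g(x), 0.0)**2

for _ in range(10_000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    lam = rng.uniform()
    lhs = penalty_term(lam * x + (1 - lam) * y)
    rhs = lam * penalty_term(x) + (1 - lam) * penalty_term(y)
    assert lhs <= rhs + 1e-12, "convexity inequality violated"
print("convexity inequality held on 10000 random segments")
```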
Chapter 2
Optimization Theory and Methods
Optimization theory and methods is a young subject in applied mathematics, computational mathematics, and operations research (W. Sun, 2006).
The subject is concerned with the optimal solution of problems that are defined mathematically; i.e., given a practical problem, the best solution to the problem can be found from many candidate schemes by means of scientific methods and tools. It involves the study of optimality conditions of the problems, the construction of model problems, the determination of algorithmic methods of solution, the establishment of convergence theory for the algorithms, and numerical experiments with typical problems and real-life problems.
The general form of an optimization problem is

Minimize f(x) subject to x ∈ X   (2.1)

where x ∈ X is the decision variable, f : ℝ^n → ℝ is the objective function, and the set X ⊆ ℝ^n is the feasible set of (2.1). Based on the form of the function f and the feasible set X, problem (2.1) can be classified as a linear, quadratic, nonlinear, or multiple-objective problem, etc.
2.1 Some Classes of Optimization Problems
2.1.1 Linear Programming

The general form of a linear programming problem is:

Minimize C^T x
subject to Ax = a,
Bx ≤ b,
x ∈ ℝ^n   (2.2)

where f(x) = C^T x and X = {x ∈ ℝ^n | Ax = a, Bx ≤ b}.
Under linear programming there are practical problems such as linear discrete problems, transportation problems, and network flow problems, and we use the simplex method, Big-M method, dual simplex method, graphical method, etc., to find solutions of those linear programming problems.
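For completeness, a small instance of (2.2) can be solved with an off-the-shelf solver. This sketch assumes SciPy's linprog; the data C, A, a, B, b are illustrative, not from the text:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative instance of (2.2): minimize C^T x subject to Ax = a, Bx <= b, x >= 0.
C = np.array([-1.0, -2.0])
A, a = np.array([[1.0, 1.0]]), np.array([4.0])    # equality: x1 + x2 = 4
B, b = np.array([[1.0, -1.0]]), np.array([2.0])   # inequality: x1 - x2 <= 2

res = linprog(C, A_ub=B, b_ub=b, A_eq=A, b_eq=a,
              bounds=[(0, None)] * 2, method="highs")
print(res.x, res.fun)  # expected: x = [0, 4], objective -8
```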
2.1.2 Quadratic Programming

The general form of a quadratic programming problem is:

Minimize (1/2)x^T Qx + q^T x + r
subject to Ax = a,
Bx ≤ b,
x ∈ ℝ^n   (2.3)

Here the objective function f(x) = (1/2)x^T Qx + q^T x + r is quadratic, while the feasible set X = {x ∈ ℝ^n | Ax = a, Bx ≤ b} is defined using linear functions, and r is a constant.
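When only the equality constraints of (2.3) are present and Q is positive definite, the optimality conditions reduce to a single linear system (see the Lagrangian conditions in Section 2.4.1). A minimal sketch with illustrative data:

```python
import numpy as np

# Equality-constrained instance of (2.3): min (1/2) x^T Q x + q^T x  s.t.  Ax = a.
# Setting the gradient of the Lagrangian to zero gives the linear KKT system
#   [Q  A^T] [x  ]   [-q]
#   [A   0 ] [lam] = [ a]
Q = np.array([[2.0, 0.5], [0.5, 1.0]])   # symmetric positive definite
q = np.array([-1.0, -2.0])
A, a = np.array([[1.0, 1.0]]), np.array([1.0])

n, l = Q.shape[0], A.shape[0]
K = np.block([[Q, A.T], [A, np.zeros((l, l))]])
sol = np.linalg.solve(K, np.concatenate([-q, a]))
x, lam = sol[:n], sol[n:]
print("x* =", x, "  lambda* =", lam)
```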
2.1.3 Non Linear Programming Problems

The general form of a non-linear optimization problem is:

Minimize f(x)
subject to hi(x) = 0, for i = 1, …, l,
gj(x) ≤ 0, for j = 1, …, m,
x ∈ ℝ^n   (2.4)

where we assume that all the functions are smooth. The feasible set of the NLPP is given by X = {x ∈ ℝ^n | hi(x) = 0 for i = 1, …, l; gj(x) ≤ 0 for j = 1, …, m}. Throughout this paper our interest is in solving non-linear programming problems by classifying them primarily as unconstrained and constrained optimization problems. In particular, if the feasible set is X = ℝ^n, the optimization problem (2.1) is called an unconstrained optimization problem, whereas problems of type (2.4) are said to be constrained optimization problems. Generally, optimization problems can thus be classified as unconstrained and constrained.
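General-purpose solvers attack (2.4) directly, whereas the penalty methods of Chapter 3 reduce it to a sequence of unconstrained problems. For comparison, a sketch using SciPy's SLSQP on illustrative data (note that SciPy writes inequality constraints as fun(x) >= 0, so g is negated):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative instance of (2.4) with one equality and one inequality constraint.
f = lambda x: (x[0] - 1.0)**2 + (x[1] + 0.5)**2
h = lambda x: x[0]**2 + x[1]**2 - 1.0      # h(x) = 0
g = lambda x: x[0] + x[1] - 1.0            # g(x) <= 0

res = minimize(f, x0=np.array([0.5, -0.5]), method="SLSQP",
               constraints=[{"type": "eq", "fun": h},
                            {"type": "ineq", "fun": lambda x: -g(x)}])
print(res.x, res.fun)
```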
2.2 Unconstrained Optimization
Unconstrained optimization problem has the following form
Minimize f(x), subject to x ∈ ℝ^n,

where f : ℝ^n → ℝ is a given function. The first task is to derive conditions that allow us to decide whether a given point is a minimizer or not.
Definition 2.2.1 i A point x∗ is a local minimizer if there is a neighbourhood η of x∗ such that f(x∗) ≤ f(x) for all x ∈ η.
ii A point x∗ is a strict local minimizer (also called a strong local minimizer) if there is a neighbourhood η of x∗ such that f(x∗) < f(x) for all x ∈ η with x ≠ x∗.
iii A point x∗ is an isolated local minimizer if there is a neighbourhood η of x∗ such that x∗
is the only local minimizer in η
iv All isolated local minimizers are strict local minimizers
v We say that x∗ is a global minimizer if
f (x∗) ≤ f (x) for all x ∈ <n
When the function f is smooth, there are efficient and practical ways to identify local minima. In particular, if f is twice continuously differentiable, we may be able to tell that x∗ is a local minimizer (and possibly a strict local minimizer) by examining just the gradient ∇f(x∗) and the Hessian ∇²f(x∗).
There is no general procedure to determine whether a local minimum is really a global minimum in a non-linear optimization problem (Kumar, 2014).
Definition 2.2.2 (Gradient) The gradient of f : ℝ^n → ℝ at x∗ ∈ ℝ^n is defined as

∇f(x∗) = (∂f(x∗)/∂x1, …, ∂f(x∗)/∂xn)^t.
The second-order conditions can be summarized as follows.

1 For one variable x:
• if ∂²f(x∗)/∂x² > 0, then f attains its minimum at x∗;
• if ∂²f(x∗)/∂x² < 0, then f attains its maximum at x∗;
• if ∂²f(x∗)/∂x² = 0, then f needs further investigation.

2 For two variables x1, x2, let r = ∂²f/∂x1², s = ∂²f/∂x1∂x2 and t = ∂²f/∂x2² at x∗; these are the entries of H(x), also called the Hessian matrix. Then:
• if rt − s² > 0 and r > 0, then f attains its minimum;
• if rt − s² > 0 and r < 0, then f attains its maximum;
• if rt − s² < 0, then x∗ is a saddle point;
• if rt − s² = 0, then f needs further investigation.

3 In general, in terms of the Hessian H = H(x∗):
• if H is positive definite, then f attains its minimum;
• if H is negative definite, then f attains its maximum;
• if H is indefinite or singular (in particular, if |H| = 0), then f needs further investigation.
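In practice the definiteness of the Hessian is checked via its eigenvalues rather than raw determinants. A small sketch (illustrative; the function name is our own):

```python
import numpy as np

def classify_stationary_point(H):
    """Classify a stationary point x* from the eigenvalues of the Hessian H(x*)."""
    eigs = np.linalg.eigvalsh(H)          # eigvalsh: for symmetric matrices
    if np.all(eigs > 0):
        return "local minimum"            # H positive definite
    if np.all(eigs < 0):
        return "local maximum"            # H negative definite
    if np.any(eigs > 0) and np.any(eigs < 0):
        return "saddle point"             # H indefinite
    return "needs further investigation"  # H singular (some zero eigenvalues)

# f(x1, x2) = x1^2 - x2^2 is stationary at the origin; its Hessian there is:
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -2.0]])))  # saddle point
```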
Definition 2.2.3 i A symmetric n × n matrix M is said to be positive semidefinite if x^t Mx ≥ 0 for all x ∈ ℝ^n; in this case we write M ≥ 0.
ii We say that M is positive definite if x^t Mx > 0 for all x ≠ 0.
iv Let M ∈ ℝ^{n×n} be a matrix. Then the eigenvalues of M are the scalars λ such that Mx = λx for some x ≠ 0.
Property 2.2.2 If M is positive definite, then M−1 is positive definite
Property 2.2.3 Let P be a symmetric n × n matrix and Q be a positive semidefinite n × n matrix. Assume that x^t Px > 0 for all x ≠ 0 satisfying x^t Qx = 0. Then there exists a scalar µ such that P + µQ is positive definite.
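Property 2.2.3 underlies penalty-type constructions: adding enough of a semidefinite term makes an indefinite matrix positive definite. A quick numerical illustration with hypothetical P and Q satisfying the hypothesis:

```python
import numpy as np

def is_positive_definite(M):
    try:
        np.linalg.cholesky(M)   # Cholesky succeeds iff M is positive definite
        return True
    except np.linalg.LinAlgError:
        return False

P = np.array([[1.0, 0.0], [0.0, -1.0]])  # symmetric but indefinite
Q = np.array([[0.0, 0.0], [0.0, 1.0]])   # positive semidefinite: x^t Q x = x2^2
# On {x : x^t Q x = 0} = {(x1, 0)}, x^t P x = x1^2 > 0, so the hypothesis holds.

mu = 1.0
while not is_positive_definite(P + mu * Q):
    mu *= 2.0                  # P + mu*Q = diag(1, mu - 1), definite once mu > 1
print(f"P + {mu} * Q is positive definite")
```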
2.3 Optimality Conditions
Proposition 2.3.1 (Necessary Optimality Conditions) Assume that x∗ is a local minimizer of f and f ∈ C1 over η. Then ∇f(x∗) = 0. If in addition f ∈ C2 over η, then ∇²f(x∗) is positive semidefinite.
Proposition 2.3.2 (Sufficient Optimality Conditions) Let f ∈ C2 over η, ∇f(x∗) = 0, and ∇²f(x∗) > 0, i.e., the function is locally convex at x∗. Then x∗ is a strict local minimizer of f.
Note that if the objective function is convex, local and global minimizers are simple to characterize.
Theorem 2.3.1 When f is convex, any local minimizer x∗ is a global minimizer of f. If in addition f is differentiable, then any stationary point x∗ (i.e., a point satisfying the condition ∇f(x∗) = 0) is a global minimizer of f.
Proof 2.3.1 Suppose that x∗ is a local but not a global minimizer. Then we can find a point z ∈ ℝ^n with f(z) < f(x∗). Consider the line segment that joins x∗ and z, that is,

x = λz + (1 − λ)x∗, for some λ ∈ (0, 1]. (2.5)

By the convexity property of f, we have

f(x) ≤ λf(z) + (1 − λ)f(x∗) < f(x∗). (2.6)
Any neighbourhood N of x∗ contains a piece of the line segment (2.5), so there will always be points x ∈ N at which (2.6) is satisfied. Hence, x∗ is not a local minimizer, which contradicts the assumption. Therefore, x∗ is a global minimizer.
For the second part of the theorem, suppose that x∗ is a stationary point but not a global minimizer, and choose z as above. Then, from convexity, we have

f(z) ≥ f(x∗) + ∇f(x∗)^t (z − x∗) = f(x∗),

since ∇f(x∗) = 0, which contradicts f(z) < f(x∗). Hence x∗ is a global minimizer.
2.4 Constrained Optimization

The points belonging to the feasible region are called feasible points. A vector d ∈ ℝ^n is a feasible direction at x ∈ X if d ≠ 0 and x + αd ∈ X for some sufficiently small α > 0. At a feasible point x, the inequality constraint gj is said to be active if gj(x) = 0. The set A(x) = {j : gj(x) = 0, j = 1, …, m} denotes the index set of the active (binding) inequality constraints at x.
Note that if the set X is convex and the objective function f is convex, then (2.8) is called a convex optimization problem.
Definition 2.4.1 Definitions of the different types of local minimizing solutions are simple extensions of the corresponding definitions for the unconstrained case, except that now we restrict consideration to the feasible points in the neighbourhood of x∗.
i A vector x∗ is a local solution of problem (2.9) if x∗ ∈ X and there is a neighbourhood η of x∗ such that f(x) ≥ f(x∗) for x ∈ η ∩ X.
ii A vector x∗ is a strict local solution (also called a strong local solution) if x∗ ∈ X and there is a neighbourhood η of x∗ such that f(x) > f(x∗) for x ∈ η ∩ X with x ≠ x∗.
iii A vector x∗ is an isolated local solution if x∗ ∈ X and there is a neighbourhood η of x∗ such that x∗ is the only local solution in η ∩ X.
Note that isolated local solutions are strict, but that the reverse is not true
Theorem 2.4.1 Assume that X is a convex set and, for some ε > 0 and x∗ ∈ X, f ∈ C1 over S(x∗; ε). If x∗ is a local minimizer, then

∇f(x∗)^t d ≥ 0, (2.10)

where d = x − x∗ is a feasible direction, for all x ∈ X. If in addition f is convex over X and (2.10) holds, then x∗ is a global minimizer.
Proof 2.4.1 Let d be a feasible direction. If ∇f(x∗)^t d < 0 (i.e., if d is a descent direction at x∗), then f(x∗ + αd) < f(x∗) for all sufficiently small α > 0 (i.e., for all α ∈ (0, ᾱ) for some ᾱ > 0). This is a contradiction, since x∗ is a local minimizer.
Theorem 2.4.2 (Weierstrass's Theorem) Let X be a non-empty, compact set in ℝ^n, and let f : X → ℝ be continuous on X. Then the problem min{f(x) : x ∈ X} attains its minimum; that is, there is a minimizing point for this problem.
2.4.1 Optimality Conditions for Equality Constrained Optimization
Consider the equality constrained problem
Minimize f(x)
subject to h(x) = 0, (2.11)

where f : ℝ^n → ℝ and h : ℝ^n → ℝ^l are given functions, and h1, …, hl are the components of h.
Definition 2.4.2 (Regular Point) Let x∗ be a vector such that h(x∗) = 0 and, for some ε > 0, h ∈ C1 on S(x∗; ε). We say that x∗ is a regular point if the gradients ∇h1(x∗), …, ∇hl(x∗) are linearly independent.
Definition 2.4.3 (Lagrangian Function) The Lagrangian function L : ℝ^{n+l} → ℝ for problem (2.11) is defined by

L(x, λ) = f(x) + ⟨λ, h(x)⟩,

where λ = (λ1, …, λl) is the Lagrange multiplier vector of h.
Proposition 2.4.1 (Karush-Kuhn-Tucker (KKT) Necessary Conditions) Let x∗ be a local minimum of (2.11) and assume that, for some ε > 0, f ∈ C1 and h ∈ C1 on S(x∗; ε), and that x∗ is a regular point. Then there exists a unique vector λ∗ ∈ ℝ^l such that

∇x L(x∗, λ∗) = 0. (2.12)

If in addition f ∈ C2 and h ∈ C2 on S(x∗; ε), then for all z ∈ ℝ^n satisfying ∇h(x∗)^t z = 0, we have

z^t ∇xx L(x∗, λ∗) z ≥ 0. (2.13)

Theorem 2.4.3 (KKT Sufficient Conditions) Let x∗ ∈ ℝ^n be such that h(x∗) = 0 and, for some ε > 0, f ∈ C2 and h ∈ C2 on S(x∗; ε). Assume that there exists a vector λ∗ ∈ ℝ^l such that

∇x L(x∗, λ∗) = 0, (2.14)

and for every z ≠ 0 satisfying ∇h(x∗)^t z = 0, we have

z^t ∇xx L(x∗, λ∗) z > 0. (2.15)

Then x∗ is a strict local minimizer of (2.11).
Remark: A point is said to be a KKT point if it satisfies all the KKT necessary conditions.
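The stationarity condition (2.12) can be solved symbolically for small problems. A sketch assuming SymPy is available, on an illustrative instance of (2.11):

```python
import sympy as sp

# Illustrative instance of (2.11): minimize f(x) = x1^2 + x2^2
# subject to h(x) = x1 + x2 - 1 = 0.
x1, x2, lam = sp.symbols("x1 x2 lam", real=True)
f = x1**2 + x2**2
h = x1 + x2 - 1
L = f + lam * h                                   # L(x, lam) = f(x) + <lam, h(x)>

stationarity = [sp.diff(L, v) for v in (x1, x2)]  # grad_x L(x*, lam*) = 0, i.e. (2.12)
print(sp.solve(stationarity + [h], [x1, x2, lam], dict=True))
# -> [{x1: 1/2, x2: 1/2, lam: -1}]
```

Here ∇xx L = 2I is positive definite, so by Theorem 2.4.3 the point (1/2, 1/2) is a strict local minimizer.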
2.4.2 Optimality Conditions for General Constrained Optimization
Consider the constrained problem involving both equality and inequality constraints
Minimize f(x)
subject to hi(x) = 0,