Các định lý tách tập lồi và một số vấn đề liên quan.Các định lý tách tập lồi và một số vấn đề liên quan.Các định lý tách tập lồi và một số vấn đề liên quan.Các định lý tách tập lồi và một số vấn đề liên quan.Các định lý tách tập lồi và một số vấn đề liên quan.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.SEPARATION THEOREMS AND RELATED PROBLEMS.
Affine sets
Definition 1.1 (Affine set, see e.g [2]) A subset A⊂ E is called an affine set if for every a,b∈A and λ∈R we have λa + (1 − λ)b ∈ A.
Given two distinct points \(a, b \in E\), the line through these points is defined as the set \(\{x \in E \mid x = \lambda a + (1 - \lambda) b, \ \lambda \in \mathbb{R}\}\) This line is an affine set, meaning it can be expressed as a translation of a linear subspace A subset \(A \subset E\) is affine if and only if, for any two distinct points in \(A\), the entire line passing through these points is contained within \(A\) Understanding the concept of affine sets and lines is fundamental in convex analysis and geometric studies within Euclidean spaces.
Definition 1.2 (Hyperplane, see e.g [1]) A hyperplane in E is a set of form
It is also not hard to see that a hyperplane is an affine set.
Definition 1.3 (Affine hull, see e.g [2]) Given a subset A ⊂ E The affine hull of A, denoted aff(A), is the smallest affine set in E containing A (in sense of set inclusion).
The following proposition is a well-known result about the structure of the affine hull.
Proposition 1.4 (See e.g [2]) For a given subset A ⊂ E, its affine hull aff(A) coincides the set of all affine combinations of its points, i.e., aff(A) ={θ 1 x 1 + .+θ k x k | x 1 , ,x k ∈A, θ 1 + .+θ k = 1}.
Definition 1.5 (Relative interior, see e.g [3]) Given a subsetA ⊂E The relative interior of A, denoted relint(A), is the set
Roughly speaking, the relative interior of a subset of R n is the interior of that set relative to its affine hull.
Convex sets
Definition 1.6 (Convex set, see e.g [3]) A subset C ⊂E is called a convex set if for every a,b∈C and λ∈[0,1] we have λa+ (1−λ)b ∈C.
A line segment [a,b] between two distinct points a and b in a Euclidean space E is defined as the set {x ∈ E | x = λa + (1−λ)b for some λ ∈ [0,1]}, demonstrating the concept of convexity Such a line segment is inherently a convex set, meaning that any point on the segment lies within the set Moreover, a subset C of E is convex if and only if the line segment between any pair of distinct points in C is also contained within C, highlighting the fundamental property of convex sets Additionally, hyperplanes in Euclidean space E are examples of convex sets, further emphasizing the importance of convexity in geometric analysis.
Similar to the affine hull, we have the following concept.
Definition 1.7 (Convex hull, see e.g [2]) Given a subset C ⊂ E The convex hull of C, denoted conv(C), is the smallest convex set in E containing C (in sense of set inclusion).
The following proposition is a well-known result about structure of the convex hull.
Proposition 1.8 states that for any subset C within a Euclidean space, its convex hull, conv(C), is precisely the set of all convex combinations of points in C This means that convex hulls can be characterized as the collection of points formed by weighted averages of points in C, where the weights are non-negative and sum to one Additionally, the proposition highlights several useful properties of convex sets, which are fundamental in understanding convex analysis and optimization.
Proposition 1.9 (i) The closureC of any convex set C ⊂E is also convex.(ii) Let C 1 andC 2 be convex sets in E Then C 1 ∩C 2 , C 1 +C 2 , C 1 −C 2 are also convex.
Proof (i) Let λ ∈[0,1] and x,y∈C There exist sequences {x n },{y n } inC such thatx n →xandy n →yasn → ∞ SinceCis convex, we haveλx n +(1−λ)y n ∈C for all n ∈ N Taking n → ∞ we have λx + (1 − λ)y ∈ C, which shows that C is convex.
(ii) Let x 1 ,x 2 ∈C 1 ∩C 2 , andθ∈[0,1] Sincex 1 ,x 2 ∈C 1 , by convexity ofC 1 we haveθx 1 + (1−θ)x 2 ∈C1 Similarly, sincex 1 ,x 2 ∈C2, by convexity ofC2 we have θx 1 + (1−θ)x 2 ∈C2 Thus, θx 1 + (1−θ)x 2 ∈C1∩C2, which proves the convexity of C 1 ∩C 2
Let λ∈ [0,1] and u,v ∈C 1 +C 2 Since u,v∈C 1 +C 2 , there existu 1 ,v 1 ∈C 1 andu 2 ,v 2 ∈C 2 such thatu=u 1 +u 2 ,v=v 1 +v 2 Sinceu 1 ,v 1 ∈C 1 , by convexity of C 1 we have λu 1 + (1−λ)v 1 ∈ C 1 Similarly, since u 2 ,v 2 ∈C 2 , by convexity of
C 2 we have λu 2 + (1−λ)v 2 ∈C 2 Therefore we have λu+ (1−λ)v=λ(u 1 +u 2 ) + (1−λ)(v 1 +v 2 )
Thus C 1 + C 2 is convex By similar arguments we obtain convexity of the set
Additionally, the following proposition gives some non-trivial properties of convex sets in finite dimensional spaces.
Proposition 1.10 (i) Any nonempty convex set in R n has nonempty relative in- terior.
(ii) Let C1, C2⊂R n be nonempty convex sets Then we have relint(C 1 −C 2 ) = relint(C 1 )−relint(C 2 ).
For the proof of Proposition 1.10(i), we refer to Proposition 1.9 in [2] For the proof of Proposition 1.10(ii), we refer to Corollary 2.87 in [4].
The following proposition gives an additional property of points in relative interior of a convex set.
Proposition 1.11 LetC be a nonempty convex set inE, x∈relint(C), and y∈C. Then there exists t >0 for which x+t(x−y)∈C.
For any real number \( t \), the expression \( x + t(x - y) = (1 + t)x - ty \) is an affine combination of \( x \) and \( y \), as the sum of the coefficients equals 1 Since \( x \) is in the relative interior of \( C \) and \( y \) belongs to \( C \), this affine combination remains within the affine hull of \( C \) Therefore, \( x + t(x - y) \in \text{aff}(C) \), demonstrating how affine combinations preserve membership within the affine hull of convex sets.
Sincex∈relint(C), there existsr >0 such that B(x, r)∩aff(C)⊂C By choosing t such that 0< t < ∥x−y∥ r we have x+t(x−y)∈B(x, r) (1.2)
For such choice of t we have both (1.1) and (1.2), and consequently x+t(x−y)∈B(x, r)∩aff(C)⊂C.
We will need the following result in the sequel.
Lemma 1.12 Let C be a nonempty convex set in E and x¯ ∈ C\relint(C) Then there exists a sequence {x k |k ∈N} ⊂ aff(C) with x k ∈ / C and x k → x ¯ as k → ∞.
The proof begins by noting that the relative interior of C, relint(C), is non-empty according to Proposition 1.10(i), allowing us to select 0 as a point within relint(C) The main objective is to demonstrate that (1 + t)¯x − t x₀ does not belong to C for all t > 0 Assuming the opposite—that (1 + t)¯x − t x₀ belongs to C for some t > 0—leads to the conclusion that, since x₀ is in relint(C), the affine combination (t / (t + 1)) x₀ + (1 / (t + 1)) ((1 + t)¯x − t x₀) also lies in relint(C) This results in a contradiction, as it contradicts the initial assumption that ¯x is not in relint(C).
Now, by choosing t = k 1 for k = 1,2, , we obtain x k := 1 + 1 k ¯ x− 1 k x 0 ∈/ C. Eachx k is an affine combination of ¯x∈C\relint(C) and x 0 ∈relint(C), hence it is in aff(C) By letting k→ ∞, we have x k :1 + 1 k ¯ x− 1 kx 0 →x.¯
Conic sets
Definition 1.13 (See e.g [3]) (i) A subset K ⊂ E is called a cone if for every a∈K and λ≥0 we have λa∈K.
(ii) A conic combination of points x 1 , ,x k ∈E is a point of form λ1x 1 + .+λ k x k with λ 1 , , λ k ≥0.
(iii) The conic hull of a given subset C ⊂ E, denoted cone(C) is the set of all conic combinations of points in C.
Similar to the case of affine hulls, we have the following well-known result about conic hulls.
Proposition 1.14 (See e.g [3]) The conic hull cone(C) of a subset C ⊂E is the smallest convex cone containing C (in sense of set inclusion).
Projection on convex sets
Proposition 1.15 (See e.g [1]) Let C ⊂ R n be a nonempty closed convex set. Let x∈R n Then there exists uniquely a vector x ∗ ∈C such that
Proof Existence Firstly, we observe that the functionf(y) =∥x−y∥is continuous onR n Indeed, let y 0 be an arbitrary vector inR n and {y n | n ∈N} a sequence in
R n converging to y 0 , i.e.,∥y n −y 0 ∥ →0 as n→ ∞ For any n ∈N we have
It follows that f(y n )→f(y 0 ) as n→ ∞, i.e.,f(y) is continuous at y 0 Since y 0 is chosen arbitrarily in R n , we obtain the continuity off onR n
Since C is closed, so is C y∗ Clearly, C y∗ is bounded, so it is compact Since f is continuous, by Bolzano-Weierstrass theorem, f achieves its minimum on the compact setC y∗ at somex ∗ ∈C y∗ ⊂C,i.e.,
Furthermore, for anyy∈/ C y∗ , by definition of C y∗ we have ∥x−y∥>∥x−y ∗ ∥ It means that miny∈C∥x−y∥= min y∈C y∗
Uniqueness Assume that x 1 and x 2 are minimizers of f over C That means x 1 ,x 2 ∈C and
Let ¯x = 1 2 (x 1 +x 2 ) Since x 1 ,x 2 ∈ C and C is convex, we have ¯x ∈ C, thus
So we must have ∥x 1 −x 2 ∥= 0, and consequently, x 1 =x 2
Proposition 1.15 enables us to define the projection of a vector x in ℝⁿ onto a nonempty closed convex set C as the point y in C that minimizes the Euclidean distance to x, denoted by proj_C(x) This projection is fundamental in convex analysis and optimization, providing a way to find the closest point in C to any given vector x The subsequent proposition offers a key characterization of the projection onto closed convex sets, essential for understanding the properties and behaviors of projections in convex optimization problems.
Proposition 1.16 (See e.g [1]) Given a nonempty closed convex set C ⊂R n and let x∈R n A vector z ∈ C is the projection proj C (x) if and only if
Proof Sufficiency Assume that z is the projection of xonto C Since (1.3) holds with y = z, we consider an arbitrary y ∈ C\{z} Since x,z ∈ C and C is convex, for any α∈(0,1) we have z+α(y−z) =αy+ (1−α)z∈C.
Recall z= proj C (x) = argmin y∈C ∥x−y∥, we have
This inequality holds for arbitraryα∈(0,1), therefore by lettingα→0 + we obtain (1.3).
Necessity Let z∈ C satisfying (1.3) For any y ∈ C such that y ̸=z, we have
From ∥x−y∥ 2 >∥x−z∥ 2 for any y∈C\{z}, we derive z= argmin y∈C ∥x−y∥= proj C (x).
Another important property of projection mapping onto closed convex sets is given in the following proposition.
Proposition 1.17 (See e.g [1]) Let C be a closed convex set in R n Then proj C is nonexpansive in the following sense
∥proj C (x 1 )−proj C (x 2 )∥ ≤ ∥x 1 −x 2 ∥ ∀x 1 ,x 2 ∈R n (1.4) Consequently, proj C is a continuous mapping.
Proof Let x 1 , x 2 be arbitrary points in E We first observe that the inequality (1.4) holds when proj C (x 1 ) = proj C (x 2 ) Therefore, we consider the case in which the projections ofx 1 and x 2 are distinct.
In view of the inequality (1.3) with x=x 1 ,y= proj C (x 2 ), we obtain
We now apply the inequality (1.3) again withx=x 2 ,y= proj C (x 1 ), we obtain
⟨x 2 −proj C (x 2 ),proj C (x 1 )−proj C (x 2 )⟩ ≤0 (1.6) Adding (1.5) and (1.6) gives
Note that proj C (x 1 )̸= proj C (x 2 ), then by dividing both sides of above inequality by
∥proj C (x 2 )−proj C (x 1 )∥, we obtain the inequality (1.4) It means that the projection mapping proj C is nonexpansive The continuity of proj C follows as a consequence of its nonexpansiveness.
We close this section with a computational result on the distance from a point to a hyperplane in a finite dimensional space R n
Lemma 1.18 Let H :=H(a, α) = {u ∈ R n | ⟨a, u⟩ = α} be a hyperplane in R n Then for any x∈R n we have min{∥x−y∥ |y∈H}= |⟨a,x⟩ −α|
A hyperplane in \(\mathbb{R}^n\) is a closed convex set, ensuring the existence of a point in the set that minimizes distance to any given point in space According to Proposition 1.15, for any fixed \(x \in \mathbb{R}^n\), the minimum of \(\|x - y\|\) over all \(y \in H\) is always achieved Since the hyperplane \(H = H(a, \alpha)\) is defined with \(a \neq 0\), we can apply the Cauchy-Schwarz inequality to analyze the relationship between points on the hyperplane, providing key insights for optimization and geometric properties.
∥a∥ 2 a satisfying ⟨a,y ∗ ⟩=α and ∥x−y ∗ ∥= |⟨a,x⟩−α| ∥a∥ It readily follows that min{∥x−y∥ |y∈H}= |⟨a,x⟩ −α|
Convex and concave functions
Definition 1.19 (See e.g [3]) A function f :E →R∪ {+∞} is said to be convex on a convex set C ⊂E if f(λx+ (1−λ)y)≤λf(x) + (1−λ)f(y) ∀x,y∈C, λ∈[0,1].
A function g : E → R∪ {−∞} is said to be concave on a convex set C ⊂ E if −g is convex on C.
It is well-known that the pointwise infimum of a set of linear functions is concave. This result is stated more precisely in the following proposition.
Proposition 1.20 (See e.g [3]) Let C ⊂ R n be a convex set For each α in an index set I ⊂R, let f α : C →R be a linear function Then f :C →R x7→ inf α∈If α (x) is a concave function on C.
Proof For any x,y∈C and λ∈[0,1], we have f(λx+ (1−λ)y) = inf α∈If α (λx+ (1−λ)y)
= inf α∈I(λfα(x) + (1−λ)fα(y)) (since eachfα is linear)
This proves the concavity off.
We will also need the following result.
Proposition 1.21 Let g : E → R be a concave function on a convex set C ⊂ E. Let f : R → R be a concave non-decreasing function on R Then the composition function h(x) := f(g(x)) is also a concave function on C.
Proof Let x,y be arbitrary point in C and λ∈ [0,1] Since C is convex, we have λx+ (1−λ)y is also inC By concavity of g onC, we have g(λx+ (1−λ)y)≥λg(x) + (1−λ)g(y).
Since f is non-decreasing, it follows that h(λx+ (1−λ)y) = f(g(λx+ (1−λ)y))≥f(λg(x) + (1−λ)g(y)) (1.7)
By concavity off we have f(λg(x) + (1−λ)g(y))≥λf(g(x)) + (1−λ)f(g(y)) =λh(x) + (1−λ)h(y) (1.8) From (1.7) and (1.8) we obtain h(λx+ (1−λ)y)≥λh(x) + (1−λ)h(y).
This proves the concavity ofh on C.
The following proposition gives us an important and non-trivial property of con- vex and concave functions on finite dimensional spaces.
Proposition 1.22 Let C be a nonempty open convex set If f :C ⊂R n →R is a convex (or concave) function, then it is continuous on C.
The proof of Proposition 1.22 can be found in e.g [2], Proposition 2.3.
Algebraic interior and algebraic closure
In this section, we consider a general vector space E without any imposed topology It is important to note that, as established earlier, key concepts discussed here are independent of any topological structure on the vector space.
• affine sets and affine hull (Definition 1.1 and Definition 1.3),
• convex sets and convex hull (Definition 1.6 and Definition 1.7),
• cones and conic hull (Definition 1.13 and Definition 1.14),
• convex and concave functions (Definition 1.19).
In infinite-dimensional vector spaces, these concepts remain applicable; however, the definition of relative interior (as outlined in Definition 1.5) and the method of projecting onto convex sets (described in Section 1.4) are dependent on the specific norm used in the underlying vector space.
The concept of relative interior is illustrated in Figure 1.1 with an example in R² using the standard Euclidean norm Consider two distinct points, x₁ and x₂, in R², connected by the line segment A This example highlights the importance of understanding the relative interior within convex sets, emphasizing how the segment's interior points differ from those on its boundary Recognizing the relative interior helps clarify the structure of convex sets in Euclidean spaces and is fundamental in optimization and convex analysis.
The affine hull of set A, denoted as aff(A), is the line passing through points x₁ and x₂ If a point x lies within the line segment between x₁ and x₂, then it is possible to select a small radius r > 0 such that the open ball B(x, r) intersected with aff(A) is contained entirely within A, indicating that x is a relative interior point of A Conversely, if x is either of the endpoints x₁ or x₂, such a radius r does not exist, meaning that these points are not relative interior points of A.
Figure 1.1: Relative interior of a line segment.
In this example, the intersection of the ball B(x, r) with the affine hull of A, aff(A), forms an open line segment that contains the point x, highlighting the geometric relationship within the set To establish that x is a relative interior point of A, it is sufficient to ensure that, for every neighborhood around x within the affine hull, there exists an open line segment contained entirely within A Specifically, this condition can be restated as: every line ℓ in aff(A) passing through x must include an open line segment lying completely inside A, emphasizing the importance of local segment richness for relative interior characterization.
The key advantage of this new condition is that it relies solely on the algebraic structure of the underlying vector space E, making it independent of any norm or topology This allows for the generalization of the concept of relative interior to more general vector spaces, expanding its applicability beyond traditional settings.
Definition 1.23 (Relative algebraic interior and relative algebraic closure, see e.g. [1]) Let A be a subset in a general vector space E.
(i) The relative algebraic interior of A, denoted rai(A), is defined by
In case aff(A) = E, we call the above set the algebraic interior of A, and denote ai(A) instead of rai(A).
(ii) The relative algebraic closure of A, denoted rac(A), is defined by
In case aff(A) = E, we call the above set the algebraic closure of A, and denote ac(A) instead of rac(A).
Concerning the notations in Definition 1.23, for u,v∈E we define
It is worth noting that the condition (1.9) can be equivalently replaced by
The following proposition can be seen as a generalization of Proposition 1.9 (i).Proposition 1.24 For any convex set C ⊂ E we have ai(C) and ac(C) are also convex.
We will use the following useful result in the sequel chapters.
Proposition 1.25 Let C be a convex set in E and x ∈ ai(C), y ∈ ac(C) Then [x,y)⊂ai(C).
The following proposition can be seen as a generalization of Proposition 1.10 (ii).
Proposition 1.26 For any convex setsC, D ⊂E with nonempty relative algebraic interiors we have rai(C+D) = rai(C) + rai(D).
Separation between two convex sets
This chapter explores key separation theorems concerning two convex sets, beginning with foundational separation concepts in Section 2.1 In Section 2.2, we present theorems and their corollaries, initially focusing on finite-dimensional Euclidean vector spaces before extending the discussion to general vector spaces.
Separation concepts
In R n
In this discussion, we focus on the setting of \( \mathbb{R}^n \) equipped with its standard inner product and the resulting norm, providing a straightforward framework for analysis It is important to note that the concepts presented are applicable to all finite-dimensional Euclidean vector spaces, ensuring their broader relevance beyond the specific case of \( \mathbb{R}^n \).
Definition 2.1 (Half-space in R n , see e.g [1]) Let H :=H(a, ξ) be a hyperplane in R n The two following closed sets
H¯ + (a, ξ) = {x∈R n : ⟨a, x⟩ ≥ ξ}, H ¯ − (a, ξ) = {x ∈R n : ⟨a, x⟩ ≤ ξ} are called the closed half-spaces associated withH, while the two following open sets
14 are called the open half-spaces associated with H.
Definition 2.2 (Separation concepts in finite dimensional spaces, see e.g [1]). Given nonempty convex sets C, D ⊂E, and let H =H(a, ξ) be a hyperplane in R n (i) The setsCandDare said to be separated by the hyperplaneH ifC⊆H¯ + (a, ξ) and D⊆H¯ − (a, ξ), i.e.,
In this case we say that H is a separating hyperplane for C and D.
(ii) The sets C and D are said to be strictly separated by the hyperplane H if
In this case we say that H is a strictly separating hyperplane for C and D.
(iii) The sets C and D are said to be strongly separated by the hyperplane H if there exist β > ξ > γ such that C ⊆H¯ + (a, β), D⊆H¯ − (a, γ), i.e.,
In this case we say that H is a strongly separating hyperplane for C and D.
(iv) The sets C and D are said to be properly separated by the hyperplane H if the two following conditions hold:
• C and D are not both included in H.
In this case we say that H is a proper separating hyperplane for C and D.
In Figure 2.1, the set C is a closed circle (including its boundary) and the set
D is a closed square (including its boundary) in R 2 An edge of the square D is included in the hyperplane H(a, ξ) and it is tangent to the circle C In this case,
C and D are separated by the hyperplane H(a, ξ) We observe furthermore that in this caseC and D cannot be either strictly separated or strongly separated.
H¯ + (a, α) H¯ − (a, α) a Figure 2.1: Separation of two sets by a hyperplane.
In Figure 2.2, sets C and D are depicted as an open circle and an open square in \(\mathbb{R}^2\), respectively, both excluding their boundaries An edge of the square's boundary lies within the hyperplane \(H(a, \(\xi\))\) and is tangent to the boundary of the circle C This configuration indicates that the sets C and D are strictly separated by the hyperplane \(H(a, \(\xi\))\), highlighting their distinct and non-overlapping positions in the plane.
We observe furthermore that in this case C and D are separated, but cannot be strongly separated.
H + (a, α) H − (a, α) a Figure 2.2: Strict separation of two sets by a hyperplane.
In Figure 2.3, the set C is depicted as a closed circle including its boundary, while the set D is a closed square with boundary in R², with both sets being disjoint The function H(a, ξ) effectively and strongly separates the sets C and D, demonstrating both their separation and strict separation within the Euclidean plane.
Figure 2.3: Strong separation of two sets by a hyperplane.
Figure 2.4(i) illustrates two line segments, C and D, lying on the same hyperplane H(a, ξ), with the hyperplane included in both half-spaces ¯H + (a, ξ) and ¯H − (a, ξ) This demonstrates that H(a, ξ) separates C and D, as C lies in one half-space and D in the other However, since C and D do not both lie exactly within H(a, ξ), they are not properly separated by this hyperplane Instead, they can still be separated by a hyperplane orthogonal to H(a, ξ) Therefore, proper separation of convex sets depends on the presence of a separating hyperplane that effectively divides the sets, emphasizing the importance of the separating hyperplane in defining proper separation.
Figure 2.4: (i) Not proper separation (ii) Proper separation.
In Figure 2.4(ii), the set C is a line segment lying on the hyperplane H(a, ξ), while the set D is a closed square entirely contained within the half-space ¯H − (a, ξ) Since H(a, ξ) is included in the half-space ¯H + (a, ξ), it follows that C is contained in ¯H + (a, ξ), making H(a, ξ) a separating hyperplane between C and D It is evident that D is not contained within this hyperplane, highlighting that H(a, ξ) effectively separates the two sets.
D are properly separated by the hyperplaneH(a, ξ).
In general vector spaces
Let E be a general vector space.
Definition 2.3 (See e.g [1]) A subset H ⊂E is called a hyperplane if it is of the form
H ={x∈E |h(x) =ξ} for some ξ∈R and some nontrivial linear functional h:E →R.
Roughly speaking, a hyperplane in E is the level set of a nontrivial linear func- tional We denote H :=H(h, ξ) to indicate the linear functional h and the level ξ defining the hyperplane.
Definition 2.4 (See e.g [1]) Given a hyperplane H := H(h, ξ) in E The two following sets
H¯ + (h, ξ) = {x∈E :h(x)≥ξ}, H¯ − (h, ξ) = {x∈E :h(x)≤ξ} are called the (algebraically) closed half-spaces associated with H, while the two fol- lowing sets
H + (h, ξ) = {x∈E :h(x)> ξ}, H − (h, ξ) = {x∈E :h(x)< ξ} are called the (algebraically) open half-spaces associated with H.
The terms ‘closed’ and ‘open’ refer to concepts that are independent of any underlying topology in space E, highlighting their fundamental nature across different contexts These definitions emphasize the similarity of these concepts to those used in finite-dimensional spaces, ensuring consistency and clarity in their application regardless of the dimensionality of the space.
Definition 2.5 (See e.g [1]) Given a hyperplane H = H(h, ξ) in E and two nonempty convex sets C, D⊂E.
(i) The setsCandDare said to be separated by the hyperplaneH ifC ⊆H¯ + (h, ξ) and D⊆H¯ − (h, ξ), i.e., h(x)≥ξ ≥h(y) ∀x∈C,y∈D.
In this case we say that H separates C and D.
(ii) The sets C and D are said to be strictly separated by the hyperplane H if
In this case we say that H strictly separates C and D.
(iii) The sets C and D are said to be strongly separated by the hyperplane H if there exist β > ξ > γ such that C ⊆H¯ + (h, β), D⊆H¯ − (h, γ), i.e., h(x)≥γ > ξ > β ≥h(y) ∀x∈C,y∈D.
In this case we say that H strongly separates C and D.
(iv) The sets C and D are said to be properly separated by the hyperplane H if the two following conditions hold:
• C and D are not both contained in H.
In this case we say that H properly separates C and D.
The following proposition gives us an important property of hyperplanes in gen- eral vector spaces.
Proposition 2.6 Any hyperplane H⊂E is a proper maximal affine subset of E.Proof See Lemma 6.27 in [1].
Separation theorems
In R n
Although these results apply to finite-dimensional Euclidean vector spaces, we will focus on the more straightforward setting of \(\mathbb{R}^n\) equipped with the standard inner product and the induced norm, to simplify the presentation and understanding of key concepts.
It is worth noting the equivalence of the two following facts:
• Two given convex sets C, D ∈R n are separable from each other.
• The point 0 can be separated from the convex set C−D.
Therefore, in the following, we first discuss about separation of a single point from a closed convex set, and then draw the results concerning separation between convex sets.
Theorem 2.7.(See e.g [1]) IfC ⊂R n is a nonempty convex set andx¯∈/ relint(C), thenx¯ can be separated fromC That is, there exists a hyperplaneH(a, ξ)containing ¯ x such that C ⊂H¯ + (a, ξ), or equivalently, ⟨a,x⟩ ≥ ⟨a,x⟩¯ =ξ for all x∈C.
Proof SinceC is convex, by Proposition 1.9(i), its closure C is also convex Since ¯ x∈/relint(C), either ¯x∈/C or ¯x∈C\relint(C).
We first consider the former case in which ¯x∈/ C Proposition 1.16 gives us the inequality
⟨¯x−proj C (¯x),x−proj C (¯x)⟩ ≤0 ∀x∈C (2.1) Leta= proj C (¯x)−x Clearly,¯ a̸=0 since ¯x∈/ C by our assumption By rewriting x−proj C (¯x) =x−x¯−(proj C (¯x)−x) =¯ x−¯x−a, from inequality (2.1) we derive⟨a,x−¯x−a⟩ ≥0 Then⟨a,x−x⟩ ≥ ⟨a,¯ a⟩=∥a∥ 2 >
Letξ =⟨a,x⟩, then the theorem is proved in this case.¯
In the latter case ¯x∈C\relint(C), by Lemma 1.12 there exists a sequence {x k | k ∈N} of points not in C such that x k →x By (2.1), we obtain¯
Note thatx k ∈/ C,proj C (x k )∈C, so x k ̸= proj C (x k ) Hence we can define a k := 1
Since each vector a_k has a norm of 1, the sequence {a_k} is bounded in R^n, ensuring the existence of a convergent subsequence {a_{k_i}}, with a_{k_i} approaching some vector a Because the norm of a is 1, it follows that a ≠ 0 Using the continuity of the projection operator onto the set C and the convergence of the sequence x_k to x, we conclude that the projections proj_¯C(x_k) converge to proj_C(¯x) = ¯x Observing equation (2.6) and considering the limit as k_i approaches infinity, we derive the desired results.
By letting ξ=⟨a,x⟩, the theorem is proved.¯
The previous theorem implies the existence of a so-called support hyperplane to a convex set, which is defined as follows.
Definition 2.8 (Support hyperplane, see e.g [1]) A hyperplane H(a, ξ) is called a support hyperplane of a convex set C ⊂ R n at a point x ∈ C if x ∈ H(a, ξ) and
Since ¯H + (a, ξ) is closed, the condition C ⊂ H¯ + (a, ξ) in the above definition is equivalent toC ⊂H¯ + (a, ξ) With this definition, the above theorem can be restated as follows.
Theorem 2.9 (Support hyperplane theorem, see e.g [1]) For any point x ∈ C\relint(C) in which C ⊆ R n is a nonempty convex set, there exists a support hyperplane to C at x.
Now we discuss the results concerning separation between convex sets.
Theorem 2.10 (First separation theorem, see e.g [1]) Any nonempty disjoint convex sets C, D ⊂R n can be separated by a hyperplane H(a, ξ) in the sense that
Proof Let A := C −D = {x−y | x ∈ C,y ∈ D} By Proposition 1.9(ii), A is convex Since C and D are disjoint, we have 0 ∈/ A, and hence 0 ∈/ relint(A).
By Theorem 2.7, there exists a hyperplane H(a,0) containing 0 such that ⟨a,s⟩ ≥
⟨a,0⟩= 0 for all s∈A In particular,⟨a,s⟩ ≥0 for alls∈A⊂A.SinceA=C−D, it follows that ⟨a,x−y⟩ ≥ 0, or equivalently, ⟨a,x⟩ ≥ ⟨a,y⟩ for all x ∈C,y ∈ D.
By choosing ξ such that x∈Cinf⟨a,x⟩ ≥ξ ≥ sup y∈D
Theorem 2.11 (Strong separation theorem, see e.g [1]) Any nonempty disjoint closed convex setsC, D ⊂R n can be strongly separated if one of the sets is compact.
Assuming that set D is compact, the proof leverages the concept of strong separation between two convex sets to establish the existence of a separating hyperplane Specifically, there exists a hyperplane H(a, ξ) such that for all x in C, the inequality ⟨a, x⟩ > ξ holds, while for all y in D, ⟨a, y⟩ < ξ This condition confirms strong separation and is essential for the theorem's validity, highlighting the critical role of hyperplanes in convex analysis and separation theorems.
Let A := C − D, where C is closed and D is compact; under these conditions, A is also closed Consider a sequence {u_k} in A that converges to some point u in ℝⁿ; each u_k can be expressed as x_k − y_k with x_k ∈ C and y_k ∈ D Since D is compact, we can extract a convergent subsequence {y_{k_i}} with limit y ∈ D Because x_{k_i} − y_{k_i} converges to u and y_{k_i} converges to y, it follows that x_{k_i} converges to x = u + y Since C is closed, x ∈ C, so u = x − y ∈ C − D, confirming that A is closed.
Since sets C and D are disjoint, the zero vector is not an element of their union A According to Proposition 1.9(ii), this implies that A is convex Employing similar reasoning used in the initial part of the proof of Theorem 2.7, we conclude that there exists a nonzero vector \( a \) satisfying the required properties.
2∥a∥ 2 >⟨a,y⟩ for all x∈C and y∈D Then, by choosing ξ = max y∈D ⟨a,y⟩+ 1 2 ∥a∥ 2 , we obtain (2.3) and the theorem is proved.
As a remark, the compactness condition in Theorem 2.11 cannot be omitted For example, in R 2 let us consider the two convex sets
Figure 2.5 demonstrates a counter-example involving two sets, C and D, which are not compact The only hyperplane that separates these sets is the x-axis, which coincides with the boundary of set C As a result, C and D cannot be strongly separated, illustrating limitations in their separability due to their non-compactness.
An important corollary of Theorem 2.11 concerns representation of convex sets as follows.
Corollary 2.12 (See e.g [1]) Any nonempty closed convex set in R n coincides with the intersection of all closed half-spaces containing it.
Proof Let C ⊆R n be a nonempty closed convex set and define D as
We need to show that C = D Indeed, since C is contained in each half-space forming D, it is also contained in the intersection of the half-spaces Therefore
C ⊆D It remains to show that D⊆C. y x D
Figure 2.5: An example of two convex sets that cannot be strongly separated.
Since D is the intersection of closed sets, it is itself closed and convex, as confirmed by Proposition 1.9(ii) Assuming, for contradiction, that D is not a subset of C, we identify a point x₀ in D but not in C By applying Theorem 2.11 to the compact convex set {x₀} and the closed convex set C, we find a hyperplane H = H(a, ξ) separating x₀ from C, with x₀ on one side of H and C contained on the other The construction of D ensures that the relevant half-spaces define this separation, confirming the convexity and closedness of D and establishing the separation properties within convex analysis.
H¯ + (a, ξ) is one of the closed half-spaces intersected to obtain D, soD⊆H¯ + (a, ξ). Since x 0 ∈ D, we have x 0 ∈ H¯ + (a, ξ), but this contradicts x 0 ∈ H − (a, ξ) This contradiction proves the corollary.
We now come to the results concerning proper separation between convex sets. The following lemma is useful in the proof of the results.
Lemma 2.13 (See e.g [1]) Two nonempty convex setsC, D ∈R n can be properly separated if and only if 0 is properly separated from K :=C−D.
Proof Sufficiency Recall that the setK =C−D is convex thanks to Proposition 1.9(ii) Let H(a, ξ) be a hyperplane properly separating C and D such that C ⊆
H¯ + (a, ξ), D ⊆ H¯ − (a, ξ) Without loss of generality, assume that C does not lie on H(a, ξ) Then we have
⟨a,x⟩ ≥ξ ≥ ⟨a,y⟩ ∀x∈C,y∈D, and ⟨a,x 0 ⟩> ξ for some x 0 ∈C This means that
The condition ⟨a, z⟩ ≥ 0 for all z in K = C−D, combined with the existence of some y₀ in D such that ⟨a, z₀⟩ > 0 for z₀ = x₀ − y₀, ensures that the hyperplane H(a, 0) properly separates the origin from the convex set K This demonstrates that the hyperplane acts as a boundary that distinctly separates the origin from the set K, which is essential for understanding the geometric structure of convex sets in optimization problems Additionally, the necessity condition states that if a hyperplane H(a, ξ) properly separates the origin from K, then K must be contained within the closed half-space H̄⁺(a, ξ), highlighting the importance of separating hyperplanes in convex analysis and optimization theory.
⟨a,x−y⟩ ≥ξ≥0 for all x∈C,y∈D The proper separation means that
• either K is included in the hyperplane H(a, ξ) while the origin0 is not,
• or K is not included in the hyperplane H(a, ξ) (but it is still contained in the half-space ¯H + (a, ξ)).
In the former case, since 0 ∈/ H(a, ξ) we have ξ > 0, and since K ⊂H(a, ξ) we have⟨a,x−y⟩=ξ for all x∈C,y∈D In this case we obtain for any x∈C and any y∈D that
2+⟨a,y⟩>⟨a,y⟩, so the hyperplane H(a, β) with β = ξ 2 +⟨a,y⟩ properly separatesC and D.
In the latter case, sinceK is not included in the hyperplaneH(a, ξ), there exists z 0 ∈ K such that ⟨a,z 0 ⟩ > ξ Since z 0 ∈ K = C −D, there exist x 0 ∈ C and y 0 ∈D such that z 0 =x 0 −y 0 So we have
⟨a,x 0 ⟩> ξ+⟨a,y 0 ⟩ (2.4) From the fact that ⟨a,x−y⟩ ≥ξ for all x∈C,y∈D, we have
So we obtain x∈Cinf⟨a,x⟩ ≥ξ+ sup y∈D
This, together with (2.4), means that any hyperplane H(a, γ) with x∈Cinf⟨a,x⟩ ≥γ ≥ξ+ sup y∈D
In relation with Lemma 2.13 we have the following result.
Lemma 2.14 (See e.g [1]) Let C be a nonempty convex set in R n Then the origin 0 and the set C can be properly separated if and only if 0 ∈/relint(C).
Proof Necessity Since 0 ∈/ relint(C), either 0 ∈/ C or 0 ∈ C\relint(C) We first consider the former case in which 0 ∈/ C By Proposition 1.9(i), since C is convex, so is its clossure C Proposition 1.16 gives us the inequality
⟨a,x−a⟩ ≥0 ∀x∈C, (2.5) in which a = proj C (0) Clearly, a ̸= 0 since 0 ∈/ C by our assumption From the inequality (2.5), we derive ⟨a,x⟩ ≥ ⟨a,a⟩ = ∥a∥ 2 > 0 for all x ∈ C This implies that the setC and {0} are properly separated.
We now consider the latter case in which0∈C\relint(C) By Lemma 1.12, there exists a sequence {x k | k ∈ N} ⊂ aff(C) with x k ∈/ C and x k → 0 as k → ∞ By Proposition 1.16, we obtain
Note thatx k ∈/ C,proj C (x k )∈C, so x k ̸= proj C (x k ) and hence we have a k := 1
∥proj C (x k )−x k ∥ proj C (x k )−x k ̸=0, and furthermore we obtain
In the context of the convex set C, it is observed that 0 belongs to C and the affine hull of C, denoted as aff(C), forms a linear subspace of ℝⁿ Consequently, both the sequence {x_k} and their projections onto C, proj_C(x_k), reside within aff(C), which implies that each a_k in the sequence belongs to aff(C) Since each a_k has unit norm (∥a_k∥ = 1), the sequence {a_k} is bounded within aff(C) and thus contains a convergent subsequence {a_{k_i}} that converges to some a in ℝⁿ Because aff(C) is closed, this limit point a lies within aff(C), and its norm equals 1 (∥a∥ = 1), indicating that a ≠ 0 Using the continuity of the projection operator proj_C (as established in Proposition 1.17) and the fact that x_k approaches 0, we conclude that proj_C(x_k) converges to proj_C(0), which equals 0 Taking the limit as i approaches infinity, we derive the key result from equation (2.6).
Assume that ⟨a,x⟩ = 0 for all x∈ C Since a ∈ aff(C), by Proposition 1.4, a can be represented as an affine combination of some vectorsv 1 , ,v m ∈C, that is a m
X i=1 λiv i in whichλ 1 , , λ m ∈R and λ 1+ .+λ m = 1 By our assumption that ⟨a,x⟩= 0 for allx∈C, takingxasv 1 , ,v m we have ⟨a,v i ⟩= 0 for alli= 1, , m Hence we obtain
X i=1 λ i ⟨a,v i ⟩= 0, which contradicts the fact that ∥a∥= 1 Since the assumption is false, there exists x 0 ∈C such that ⟨a,x 0 ⟩>0 This shows the proper separation of the sets {0}and C.
Sufficiency Assume that {0} and C are properly separated Then there exists a ∈ R n such that ⟨a, x⟩ ≥ 0 for all x ∈ C and ⟨a, x 0 ⟩ > 0 for some x 0 ∈ C If on the contrary0∈relint(C), by Proposition 1.11, there exists t >0 such that
Then ⟨a,−tx 0 ⟩ ≥ 0, or equivalently ⟨a,x 0 ⟩ ≤ 0, which is a contradiction Hence
We come up with the following theorem on proper separation between convex sets.
Theorem 2.15 (Proper separation theorem, see e.g [1]) We can properly separate two nonempty convex sets C, D ⊂ R n if and only if their relative interiors are disjoint.
Proof Since relint(C) and relint(D) are disjoint, we have0∈/ relint(C)−relint(D).
Proposition 1.10 establishes that the relative interiors of convex sets satisfy the relation relint(C) − relint(D) = relint(C − D), implying that the origin 0 is not in relint(C − D) Since both C and D are convex, their Minkowski difference C − D is also convex, as confirmed by Proposition 1.9(ii) Utilizing Lemma 2.14, we conclude that the origin and the set C − D can be properly separated Consequently, Lemma 2.13 implies that the convex sets C and D can be properly separated as well, highlighting a key property of convex sets in separation theorems.
In general vector spaces
Throughout this subsection, E is a general vector space without any equipped topology We start with the following concept.
Definition 2.16 (See e.g [1]) Two nonempty convex sets C, D ⊂ E are called complementary convex sets if they are disjoint and C∪D=E.
Complementary convex sets C and D⊂E separate two nonempty convex sets A and B⊂E when one set is contained entirely within C and the other within D, meaning either A⊂C and B⊂D or A⊂D and B⊂C This relationship is known as complementarily convex separated (by C and D), highlighting how convex sets can be distinctly separated in a Euclidean space.
Complementary convex sets are disjoint, and when they separate two nonempty convex sets A and B in a Euclidean space, it follows that A and B are also disjoint A key lemma reveals that the reverse implication is true, establishing a fundamental connection between the separation of convex sets and their complementary sets This nontrivial result underpins the development of subsequent theorems in this section, highlighting its importance in convex analysis and geometric separation theory.
Lemma 2.17 (See e.g [1]) If two nonempty convex sets A, B ⊂ E are disjoint, then they are complementarily convex separated.
Proof Let G be the set of disjoint convex subsets (C, D) ⊂ E ×E such that
In this article, we define a partial order ⪯ on the set G, where (C, D) ⪯ (C′, D′) if and only if C ⊂ C′ and D ⊂ D′, leveraging the set inclusion relation Since set inclusion is a partial order, the relation ⪯ inherits this property, establishing a partial order on G Moreover, for any totally ordered subset F of G, the union of all sets within F serves as an upper bound, due to the similar property of set inclusion Importantly, because the elements in F are nested, the resulting upper bound forms a pair of disjoint convex sets in E Applying Zorn’s lemma, this ensures the existence of a maximal element (C*, D*) in G.
• C ∗ and D ∗ are convex and disjoint,
• if C and D are convex sets satisfying C ⊃ C ∗ and D ⊃ D ∗ , then we have
It is left to prove that C ∗ ∪D ∗ =E Indeed, assume the contrary that there exists x∈E\(C ∗ ∪D ∗ ) By the maximality of (C ∗ , D ∗ ), we have conv(C ∗ ∪ {x})∩D ∗ ̸=∅ and conv(D ∗ ∪ {x}) ∩ C ∗ ̸=∅.
Therefore we can pick y 1 ∈conv(C ∗ ∪ {x})∩D ∗ and y 2 ∈conv(D ∗ ∪ {x})∩C ∗
In this analysis, selecting points y₁ and y₂ leads to the identification of corresponding points x₁ in C* and x₂ in D*, such that y₁ lies within the segment (x, x₁) and y₂ within (x, x₂) The intersection point z, formed by the segments [x₁, y₂] and [x₂, y₁], is shown to belong to both convex sets C* and D* due to their convexity properties Consequently, this point z lies in the intersection C* ∩ D*, demonstrating the overlap of these convex sets.
C ∗ and D ∗ are not disjoint This contradicts the construction of these sets This contradiction means thatC ∗ ∪D ∗ =E as desired. x x 2 x 1 y 2 y 1 z
Figure 2.6: Illustration for the proof of Lemma 2.17.
This lemma provides valuable insights into the structure of complementary convex sets, highlighting their geometric properties It is applicable in finite-dimensional spaces, where the geometric intuition becomes especially clear Understanding this relationship is crucial for analyzing convex set configurations and their interactions within various mathematical contexts.
Lemma 2.18 (See e.g [1]) Let C and D be complementary convex sets in E. Let L := ac(C)∩ac(D) Then either L = E or L is a hyperplane in E The former case holds if and only if the algebraic interiors of C and D are both empty, or equivalently, ac(C) = ac(D) =E If the latter case holds, then the following also holds:
(i) the algebraic interiors of C and D are both nonempty,
(ii) ai(C), ai(D) are the algebraically open half-spaces associated with L,
(iii) ac(C), ac(D) are the algebraically closed half-spaces associated with L.
By Proposition 1.24, the convexity of sets C and D implies that their ac(C) and ac(D) are also convex As the intersection of these two convex sets, L is necessarily convex and nonempty, ensuring that there exists at least one point in both ac(C) and ac(D) Since both C and D are nonempty, we can select points x in C and y in D, further supporting the nonemptiness of the intersection.
C and D are disjoint, there exists z ∈ (x,y) such that [x,z) ⊂ C and (z,y] ⊂ D.
By definition of algebraic closure, we have z∈ac(C) and z∈ac(D) Hence z ∈L, which implies L̸=∅.
We now show that ac(C) =E\ai(D) (2.7)
In the context of algebraic interior and closure, selecting any point x in the set E\ai(D) reveals that there exists a vector u in E such that, for all positive r, the interval [x, x + r(u - x)) is not fully contained within D By defining v as x + r(u - x), we see that for all v in E with x lying between u and v, the semi-open interval [x, v) is contained in the complement of D, denoted as C, indicating that x belongs to the algebraic closure ac(C) Since this holds for any arbitrary x outside the algebraic interior of D, we conclude that E\ai(D) is a subset of ac(C) Conversely, for any point y in ac(C), there exists a point z in C such that the semi-open interval [z, y) is contained in C, implying that y cannot be in the algebraic interior ai(D), as that would lead to a contradiction where [y, z) is contained in D.
(y,z)⊂C∩D, contradicting the fact that C and D are disjoint So we obtain the reverse inclusion ac(C)⊂E\ai(D), and therefore (2.7) holds.
Since the sets C and D have equal roles, by similar arguments we obtain ac(D) = E\ai(C) (2.8)
It follows immediately from (2.7) and (2.8) that L = E if and only if both ai(C) and ai(D) are empty, or equivalently, ac(C) = ac(D) = E.
Now we consider the case that L⊊ E In this case we need to show that L is a hyperplane.
Firstly, we observe thatL is an affine set Indeed, letx,y are arbitrary points in
L, andz∈E such thaty∈(x,z) Assume the contrary thatz̸∈L= ac(C)∩ac(D).
If z does not belong to ac(C), then by equation (2.7), we deduce that z is in ai(D) Since x is in ac(D), Proposition 1.25 indicates that y must also be in ai(D) Consequently, according to (2.7), y is not in ac(C), which contradicts our initial assumption that y belongs to L, where L is the intersection of ac(C) and ac(D) This contradiction confirms that z must be in L, thereby demonstrating that L is an affine set.
Since we are considering the case that L ⊊ E, we can pick some p ∈/ L Since a hyperplane in E is a maximal affine set in E (cf Proposition 2.6), to show that
To establish that E equals the affine hull of L united with {p}, it suffices to show this inclusion Since p is not in L and not in the intersection of the algebraic cores of C and D, we can assume without loss of generality that p is not in ac(D) Using equation (2.8), we deduce that p belongs to the set ai(C) By selecting an arbitrary point r in L and considering q = 2r - p, we observe that r lies on the line segment between p and q If q were in ac(C), it would imply r is in ai(C), which conflicts with r's membership in L contained in ac(D) Consequently, q must be in E \ ac(C), which equals ai(D) This implies that any point x in C but not in L must satisfy that its line segment to q intersects L, placing x in the affine hull of L combined with {p} A similar argument applies for points y in D outside L, confirming that y also belongs to aff(L ∪ {p}) Therefore, every such point in C and D resides in the affine hull of L united with {p}, demonstrating the equality E = aff(L ∪ {p}).
It follows from (2.7) and (2.8) that ai(C), ai(D), L are pairwise disjoint, and their union isE The arguments (i), (ii), (iii) follows immediately.
Now we come to the first separation theorem in the setting of general vector spaces.
Theorem 2.19 (See e.g [1]) Let C, D ⊂ E be nonempty convex sets such that ai(C) ̸= ∅ Then C and D can be separated by a hyperplane H in E if and only if ai(C)∩D =∅ In this case, ai(C) is contained in one of the algebraically open half-spaces associated with H.
Proof Necessity LetC and Dbe separated by a hyperplane H in such a way that
In the context of convex analysis, since C is contained within the closure of the open half-space H+ (i.e., C ⊆ H¯ +) and D is contained within the closure of the opposite half-space H¯ − (i.e., D ⊆ H¯ −), and given that the algebraic interior of C (ai(C)) is non-empty, it follows that ai(C) must be contained within the open half-space H+ (ai(C) ⊆ H+) This is because C is not contained in the hyperplane H itself, and selecting a point y in the intersection of C and H+ confirms this Furthermore, the algebraic interior of C cannot intersect with H, and as a result, ai(C) does not intersect with D, which lies in H¯ − This reasoning ensures that ai(C) and D are disjoint, reinforcing the separation properties in convex analysis.
Sufficiency Assume that ai(C)∩D=∅ Since C is convex, by Proposition 1.24 we have ai(C) is convex Applying Lemma 2.17 for disjoint convex sets ai(C) and
D, there exists complementary convex sets C ′ and D ′ such that ai(C) ⊆ C ′ and
In the given context, since D is a subset of D′, and x belongs to the algebraic interior of C, for any y in E, there exists a u in C such that x lies in the segment (u,y) Using Proposition 2.6, we establish that the segment [x,u) is contained within the algebraic interior of C, allowing us to assume u is also in ai(C) Given that ai(C) is contained in C′ and C′ is convex, it follows that the segment [x,u) is in C′, which implies that x is in the algebraic interior of C′ Since x was arbitrarily chosen in ai(C), this leads to the conclusion that ai(C) is a subset of ai(C′).
Since C′ and D′ are complementary convex sets, Lemma 2.18 establishes that their affine hulls intersect in a hyperplane H that separates the two sets Because the images of C under the affine transformation are contained within C′, and D is contained within D′, this hyperplane H also separates ai(C) and D Without loss of generality, we can assume that ai(C) lies on the positive side of H's affine span, denoted as H¯ +, while D lies on the negative side, denoted as H¯ −.
We demonstrate that C is a subset of H̄+ by assuming the contrary—that C is not contained in H̄+—and arriving at a contradiction Since H− and H̄+ are disjoint and their union covers the entire space E, assuming C intersects H− leads to the selection of a point x in C ∩ H− and a point y in the accessible set of C, which is within H̄+ According to Proposition 1.25, the pair (y, x) contains a point z in the intersection of the accessible set of C and H− However, because the accessible set of C is contained within H̄+, its intersection with H− must be empty, contradicting the existence of z Therefore, C must be contained within H̄+.
We have shown that C ⊂ H¯ + and D ⊂ H¯ − This means that C ans D are separated byH.
Homogeneous Farkas lemma
Homogeneous Farkas Lemma addresses the solvability of finite systems of homogeneous linear inequalities, offering crucial insights into linear programming and optimization Named after Hungarian mathematician Gyula Farkas, who first proved the result, this lemma is fundamental in understanding whether a given system has solutions within Euclidean space ℝⁿ Formulated within the context of ℝⁿ equipped with the standard inner product ⟨x, y⟩, the lemma provides necessary and sufficient conditions for the existence of solutions to homogeneous linear inequalities, making it an essential tool in mathematical analysis and computational mathematics.
Lemma 3.1 (Homogeneous Farkas lemma) (See e.g [5]) Let a,a 1 , ,a m be vectors inR n \{0} Then the following system of homogeneous linear inequalities in x∈R n
⟨a i ,x⟩ ≥0 (i= 1, , m) is infeasible if and only if there exist non-negative numbers λ 1 , , λ m such that a m
The representation (3.1) indicates that a vector 'a' belongs to the conic hull of vectors a₁, , aₘ Geometrically, the homogeneous Farkas lemma illustrates this concept, as shown in Figure 3.1, where three vectors a₁, a₂, a₃ in ℝ² are depicted along with a vector x in ℝ² satisfying the inequalities ⟨a₁, x⟩ ≥ 0 and ⟨a₂, x⟩ ≥ 0.
⟨a 3 ,x⟩ ≥ 0 On the left, we have a vector a in the conic hull of vectors a 1 ,a 2 ,a 3
In this context, the system (F) is infeasible when the inner product ⟨a, x⟩ ≥ 0 is violated, indicating the first inequality doesn't hold Conversely, when a vector satisfies ⟨a, x⟩ < 0, the system becomes feasible, demonstrating that vector a is not within the convex cone generated by vectors a₁, a₂, and a₃ This relationship highlights the importance of the inner product conditions in determining the feasibility of the system and the positioning of vectors within convex cones.
Figure 3.1: Illustration of homogeneous Farkas lemma.
Proving the homogeneous Farkas lemma is a non-trivial task, unlike the straightforward illustration provided earlier In this section, we offer a proof based on the theorem of strong separation of convex sets, a fundamental concept in convex analysis To support the proof, we will utilize several key results that underpin the logic and validity of the argument, ensuring a rigorous demonstration of the lemma.
Lemma 3.2 The conic hull of any set of linearly independent vectors in R n is closed.
In this proof, we establish that the cone V generated by linearly independent vectors \( v_1, \ldots, v_\ell \) in \( \mathbb{R}^n \) is closed by demonstrating that any limit point of a sequence within V also belongs to V Since each vector \( x_k \) in the sequence can be expressed as a non-negative linear combination of \( v_1, \ldots, v_\ell \), the sequence lies within the finite-dimensional subspace \( W = \text{span}(v_1, \ldots, v_\ell) \), which is closed in \( \mathbb{R}^n \) As the sequence \( \{x_k\} \) converges to some vector \( x \), and because \( W \) is closed, \( x \) must also lie within \( W \) Finally, the linear independence of \( v_1, \ldots, v_\ell \) guarantees the existence of unique coefficients \( \xi_1, \ldots, \xi_\ell \) such that \( x = \xi_1 v_1 + \ldots + \xi_\ell v_\ell \).
Now we prove that ξ i k → ξ i for each i = 1, , ℓ In the following we will show the proof in case i = 1, the other cases of i can be shown similarly Let
Let F be the subspace spanned by vectors v₂, , v_ℓ, which makes F a finite-dimensional and closed subspace of ℝⁿ Since the vectors v₁, v₂, , v_ℓ are linearly independent, v₁ does not belong to F Defining u as v₁ minus its projection onto F (u = v₁ − proj_F(v₁)) ensures that u is non-zero because v₁ is outside F, and the projection of v₁ onto F lies within F.
Since proj F (v 1 ) ∈F and F is a subspace of R n , for any z ∈ F and λ ∈R we have proj F (v 1 ) +λz∈F Then, by definition of proj F (v 1 ) we obtain
The last equality follows from (3.3) and the fact that proj F (v 1 ) ∈ F By Cauchy- Schwartz inequality, we see furthermore that
The last equality is because of (3.3) and the fact that a 2 , ,a ℓ ∈ F Combining (3.5) with (3.4) we obtain
∥x k −x∥∥u∥ ≥ |ξ 1 k −ξ 1 |∥u∥ 2 Keeping (3.2) in mind, it follows that
As x k → x by our assumption, letting k → ∞ we have ∥x k −x∥ → 0 Together with (3.2), it follows from the above inequality that |ξ k 1 −ξ 1 | → 0 as k → ∞, or equivalently, ξ 1 k →ξ 1
Now we haveξ i k →ξ i fori= 1, , ℓ Sinceξ 1 k , , ξ ℓ k ≥0 for allk ∈N, we have ξ i ≥0 Thusx=ξ 1 v 1 + .+ξ ℓ v ℓ is a conic combination ofv 1 , ,v ℓ ,i.e., x∈V. This proves the closedness of V.
Proposition 3.3 Let K := cone(a 1 , ,a m ) Then K is a closed convex cone.
The proof demonstrates the conic property of set K by showing that if x belongs to K, then scaling x by a non-negative scalar θ also results in a point belonging to K Specifically, since x can be represented as a non-negative linear combination of vectors \(a_1, a_2, , a_m\), multiplying x by θ produces a scaled combination with non-negative coefficients \(\theta \lambda_i\) As these coefficients remain non-negative when multiplied by θ, this confirms that θx also resides in K, affirming its conic nature.
Convexity of K Let x,y ∈ K and θ ∈ [0,1] Since x,y ∈ K, they admits the following representations x=λ1a 1 + .+λma m , y=à1a 1 + .+àma m for some λ1, , λm ≥0 and à1, , àm ≥0 Then we have z=θx+ (1−θ)y
Sinceθ ∈[0,1] and λ i ≥0, à i ≥0 (i= 1, , m), we have θλ i + (1−θ)à i ≥0 for all i= 1, , m Therefore z ∈K by definition ofK, which confirms convexity of K. Closedness of K Let
The set C can be described as the union of conic hulls generated by linearly independent subsets of {a₁, , aₘ} Since the index set {1, , m} is finite, this union is finite For each subset J in this collection, the conic hull is formed by the vectors {a_j | j ∈ J}, which are contained within the original set {a₁, , aₘ}.
We demonstrate that \(K \subseteq C\) by considering an arbitrary nonzero vector \(x \in K\), which can be expressed as a nonnegative linear combination of vectors \(a_1, \ldots, a_m\) If \(x \neq 0\), some coefficients in this combination are positive, allowing us to simplify the representation to involve only the linearly independent vectors \(a_1, \ldots, a_k\) If these vectors are linearly independent, then \(x\) directly belongs to \(C\); otherwise, there exists a nontrivial combination of these vectors that sums to zero, indicating further structure within the cone.
By multiplying both sides of (3.8) with -1 if needed, we can assume furthermore that there exists at least one positive coefficient in β 1 , , β k For any s∈R, from (3.7) and (3.8) we have x=x−sã0= (ξ1a 1 + .+ξ k a k )−s(β1a 1 + .+β k a k )
Let s* be defined as the minimum value of ξi divided by βi for i in the set {1, , k}, where all βi are positive The set I* consists of the indices where this minimum is achieved Since all coefficients ξi are positive, it follows that the optimal value s* is always greater than zero This indicates that the optimal solution maintains positivity, which is essential for the stability and feasibility of the optimization process.
• For anyi∈ {1, , k}withβ i 0 ands ∗ >0, we haveξ i −s ∗ β i > 0.
• For any i∈ {1, , k}with β i = 0, since ξ i >0, we haveξ i −s ∗ β i =ξ i >0.
• For i ∈ {1, , k} with β i > 0: if i ∈ I ∗ , then ξ i −s ∗ β i = 0, otherwise ξ i −s ∗ β i >0 (by definition of s ∗ and I ∗ ).
By substituting \(s = s^*\) in equation (3.9) and removing zero coefficient terms, we express \(x\) as a conic combination of a subset of vectors from \(\{a_1, \ldots, a_k\}\) with positive coefficients Repeating this process by removing vectors that are not in the subset and that maintain linear dependence continues until the remaining vectors are linearly independent, resulting in \(x\) being represented as a conic combination of these linearly independent vectors within \(\{a_1, \ldots, a_m\}\) This demonstrates that \(x \in C\), and since \(x\) was arbitrarily chosen from \(K\), we conclude that \(K \subseteq C\).
We have proved thatC ⊆KandK ⊆C, soK =C Recall that, by construction,
The set C is composed of the union of finitely many conic hulls generated by linearly independent vectors from {a₁, , aₘ} According to Lemma 3.2, each of these conic hulls is closed Therefore, the union of these finitely many closed sets remains closed, establishing the closedness of C Since the set K equals C, it follows that K is also closed.
We are now ready for the proof of the homogeneous Farkas lemma.
‘If’ part. Assume that there exist \(\lambda_i \geq 0\) (\(i = 1, \ldots, m\)) such that \(a = \sum_{i=1}^m \lambda_i a_i\).
If the system of inequalities (F) were feasible, there would exist \(x \in \mathbb{R}^n\) satisfying \(\langle a_i, x\rangle \geq 0\) for \(i = 1, \ldots, m\) and \(\langle a, x\rangle < 0\). But then
\[
0 > \langle a, x\rangle = \sum_{i=1}^m \lambda_i \langle a_i, x\rangle \geq 0,
\]
which is a contradiction. Therefore the system (F) must be infeasible.
‘Only if’ part. By Proposition 3.3, the set
\[
K := \{\lambda_1 a_1 + \cdots + \lambda_m a_m \mid \lambda_1, \ldots, \lambda_m \geq 0\}
\]
is a closed convex cone. We need to show that \(a \in K\), i.e., that \(a\) is a non-negative linear combination of \(a_1, \ldots, a_m\). Assume, for contradiction, that \(a \notin K\). Since \(K\) is closed and convex and the singleton \(\{a\}\) is compact and convex, the strong separation theorem guarantees the existence of a vector \(e \in \mathbb{R}^n\) such that \(\langle e, a\rangle > 0\) and \(\langle e, u\rangle \leq 0\) for all \(u \in K\). Set \(x^* := -e\), so that \(\langle a, x^*\rangle < 0\) and \(\langle u, x^*\rangle \geq 0\) for all \(u \in K\).
Note that \(a_1, \ldots, a_m \in K\), so respectively replacing \(u\) by these vectors we get
\[
\langle a_i, x^*\rangle \geq 0 \quad \text{for all } i = 1, \ldots, m.
\]
This means that \(x^*\) is a solution of (F), which contradicts the infeasibility of this system. The contradiction means that \(a\) must be in \(K\), i.e., \(a = \lambda_1 a_1 + \cdots + \lambda_m a_m\) for some \(\lambda_1, \ldots, \lambda_m \geq 0\), as claimed.
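The alternative expressed by the lemma can be explored numerically. The sketch below is an added illustration, not part of the text: it assumes SciPy and arbitrarily chosen example vectors, and it either recovers non-negative coefficients with \(a = \sum_{i=1}^m \lambda_i a_i\) or exhibits a point \(x\) satisfying the system (F).

import numpy as np
from scipy.optimize import linprog

def farkas_alternative(A, a):
    """A: n-by-m matrix with columns a_1,...,a_m; a: vector in R^n.
    Returns ('combination', lam) with lam >= 0 and A @ lam = a,
    or ('witness', x) with <a_i, x> >= 0 for all i and <a, x> < 0."""
    n, m = A.shape
    # Feasibility LP: find lam >= 0 with A lam = a.
    res = linprog(c=np.zeros(m), A_eq=A, b_eq=a, bounds=[(0, None)] * m, method="highs")
    if res.success:
        return "combination", res.x
    # Otherwise look for a witness of (F); the box [-1, 1]^n only keeps the LP bounded.
    # Minimize <a, x> subject to A^T x >= 0, written as -A^T x <= 0.
    res = linprog(c=a, A_ub=-A.T, b_ub=np.zeros(m), bounds=[(-1, 1)] * n, method="highs")
    return "witness", res.x   # by the lemma, the optimal value here is negative

# a = (2, 3) lies in cone{(1,0), (0,1)}; a = (-1, 1) does not.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
print(farkas_alternative(A, np.array([2.0, 3.0])))    # ('combination', array([2., 3.]))
print(farkas_alternative(A, np.array([-1.0, 1.0])))   # ('witness', ...), e.g. x = (1, 0)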
Dual cone
In this section, we present a particular case in duality theory. For that, we recall the following concept.
Definition 3.4 (Dual cone, see e.g [1]) Given a nonempty set \(K \subseteq \mathbb{R}^n\). The set
\[
K^* := \{y \in \mathbb{R}^n \mid \langle x, y\rangle \geq 0 \ \forall x \in K\}
\]
is called the dual cone of \(K\).
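As a simple illustration of this definition (a standard example, added here for concreteness), the non-negative orthant is its own dual cone: if \(K = \mathbb{R}^n_+ = \{x \in \mathbb{R}^n \mid x_i \geq 0 \ \forall i\}\), then \(K^* = \mathbb{R}^n_+\). Indeed, if \(y_i < 0\) for some \(i\), then taking \(x = e_i \in K\) (the \(i\)-th standard basis vector) gives \(\langle x, y\rangle = y_i < 0\), so \(y \notin K^*\); conversely, if \(y \in \mathbb{R}^n_+\), then \(\langle x, y\rangle = \sum_{i=1}^n x_i y_i \geq 0\) for every \(x \in \mathbb{R}^n_+\), so \(y \in K^*\).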
The following proposition gives an important property of the concept of dual cone.
Proposition 3.5 If \(K \subseteq \mathbb{R}^n\) is a nonempty set, then its dual cone \(K^*\) is a closed convex cone.
Proof Conic property of \(K^*\). Let \(y \in K^*\) and \(\theta \geq 0\). Since \(y \in K^*\), we have \(\langle x, y\rangle \geq 0\) for all \(x \in K\), and hence
\[
\langle x, \theta y\rangle = \theta\langle x, y\rangle \geq 0 \quad \text{for all } x \in K.
\]
This means \(\theta y \in K^*\), hence \(K^*\) is conic.
Convexity of \(K^*\). Let \(y_1, y_2 \in K^*\) and \(\theta \in [0,1]\). Since \(y_1, y_2 \in K^*\), for all \(x \in K\) we have \(\langle x, y_1\rangle \geq 0\) and \(\langle x, y_2\rangle \geq 0\). Since \(\theta \in [0,1]\), we have \(\theta \geq 0\) and \(1 - \theta \geq 0\). It follows that for all \(x \in K\) we have \(\theta\langle x, y_1\rangle \geq 0\) and \((1-\theta)\langle x, y_2\rangle \geq 0\), and therefore
\[
\langle x, \theta y_1 + (1-\theta) y_2\rangle = \theta\langle x, y_1\rangle + (1-\theta)\langle x, y_2\rangle \geq 0 \quad \text{for all } x \in K.
\]
This means \(\theta y_1 + (1-\theta) y_2 \in K^*\), hence \(K^*\) is convex.
Closedness of \(K^*\). For each fixed \(x \in K\), the function \(y \mapsto \langle x, y\rangle\) is continuous. Let \(\{y_k\}\) be a sequence in \(K^*\) converging to some \(\bar{y}\). Then for every \(x \in K\) we have \(\langle x, y_k\rangle \to \langle x, \bar{y}\rangle\), and since \(\langle x, y_k\rangle \geq 0\) for all \(k\), it follows that \(\langle x, \bar{y}\rangle \geq 0\) for all \(x \in K\). Hence \(\bar{y} \in K^*\), so \(K^*\) contains the limits of its convergent sequences and is therefore closed.
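For finitely generated cones the definition can be checked directly: if \(K\) is generated by \(a_1, \ldots, a_m\) as in Proposition 3.3, then \(y \in K^*\) exactly when \(\langle a_i, y\rangle \geq 0\) for every generator, because every \(x \in K\) is a non-negative combination of the \(a_i\). A minimal numerical sketch of this membership test (an added illustration with arbitrarily chosen vectors, assuming NumPy):

import numpy as np

def in_dual_cone(generators, y, tol=1e-12):
    # y belongs to K* for K = cone{a_1,...,a_m}; `generators` holds the a_i as rows.
    return all(np.dot(a, y) >= -tol for a in generators)

# K = cone{(1,0), (1,1)} in R^2, so K* = {y | y_1 >= 0 and y_1 + y_2 >= 0}.
gens = np.array([[1.0, 0.0],
                 [1.0, 1.0]])
print(in_dual_cone(gens, np.array([0.0, 1.0])))    # True
print(in_dual_cone(gens, np.array([-1.0, 0.0])))   # False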
The main result in this section is as follows.
Theorem 3.6 If \(K \subseteq \mathbb{R}^n\) is a closed convex cone, then \(K = K^{**}\).
Proof If \(x \in K\), then for any \(y \in K^*\) we have \(\langle x, y\rangle \geq 0\) by the definition of \(K^*\), and hence \(x \in K^{**}\). This establishes \(K \subseteq K^{**}\). It remains to prove the reverse inclusion \(K^{**} \subseteq K\).
Suppose that the reverse inclusion does not hold; then there exists a point \(\bar{x} \in K^{**}\) with \(\bar{x} \notin K\). Applying the strong separation theorem to the closed convex set \(K\) and the compact convex set \(\{\bar{x}\}\), we obtain a non-zero vector \(a \in \mathbb{R}^n\) such that
\[
\langle a, \bar{x}\rangle < \langle a, z\rangle \quad \text{for all } z \in K. \qquad (3.10)
\]
Since \(K\) is a cone, \(0 \in K\). Taking \(z = 0\) in (3.10) gives \(\langle a, \bar{x}\rangle < 0\). On the other hand, for every \(z \in K\), by the conic property of \(K\) we have \(tz \in K\) for all \(t > 0\). From (3.10), for all \(t > 0\) we have
\[
\langle a, \bar{x}\rangle < \langle a, tz\rangle = t\langle a, z\rangle.
\]
Dividing both sides of the above inequality by \(t\), then letting \(t \to +\infty\), we obtain
\(\langle a, z\rangle \geq 0\) for all \(z \in K\), which implies \(a \in K^*\). Note that \(\bar{x} \in K^{**}\), so \(\langle a, \bar{x}\rangle \geq 0\), which contradicts the fact that \(\langle a, \bar{x}\rangle < 0\) obtained above. The contradiction shows that \(K^{**} \subseteq K\), and therefore \(K = K^{**}\).
Proposition 3.7 Let \(C\) be a nonempty open convex set in \(\mathbb{R}^n\). Then the distance function \(d_C(x) := \min\{\|x - y\| \mid y \in \partial C\}\) is concave on \(C\).
Proof Let \(x\) be an arbitrary point of \(C\) and let \(H^+ := \{u \in \mathbb{R}^n \mid \langle a, u\rangle > \xi\}\), for some \(a \in \mathbb{R}^n \setminus \{0\}\) and \(\xi \in \mathbb{R}\), be an arbitrary open half-space containing \(C\). It follows that
\[
\bar{H}^- := \mathbb{R}^n \setminus H^+ = \{u \in \mathbb{R}^n \mid \langle a, u\rangle \leq \xi\}
\]
is a closed half-space that does not contain any point of \(C\). Furthermore, \(H^+\) and \(\bar{H}^-\) share the common boundary hyperplane \(H := \{u \in \mathbb{R}^n \mid \langle a, u\rangle = \xi\}\).
Let \(z\) be an arbitrary point of \(H\). Since \(H \subset \bar{H}^-\) and \(\bar{H}^-\) contains no point of \(C\), we have \(z \notin C\). Because \(x \in C\) and \(z \notin C\), the line segment
\([x, z] := \{\lambda x + (1-\lambda) z \mid \lambda \in [0,1]\}\) intersects the boundary \(\partial C\) of \(C\) at some point \(y_z\). Since \(y_z \in [x, z]\), we have
\[
\|x - y_z\| \leq \|x - z\|. \qquad (3.13)
\]
This holds for an arbitrary choice of \(z\) in \(H\), hence it also holds for \(z = z^* := \operatorname{argmin}_{v \in H} \|x - v\|\).
We thus arrive at the following chain of inequalities:
\[
d_C(x) = \min\{\|x - y\| \mid y \in \partial C\} \leq \|x - y_{z^*}\| \leq \|x - z^*\| = \min_{v \in H}\|x - v\| = d_{H^+}(x),
\]
where \(d_{H^+}(x)\) denotes the distance from \(x\) to the hyperplane \(H\) bounding \(H^+\).
Since the inequality \(d_C(x) \leq d_{H^+}(x)\) holds for an arbitrary open half-space \(H^+\) containing \(C\), we obtain
\[
d_C(x) \leq \inf_{H^+ \in \mathcal{H}} d_{H^+}(x), \qquad (3.14)
\]
in which \(\mathcal{H}\) is the set of all open half-spaces containing \(C\).
Conversely, fix \(x \in C\) and let \(y_x \in \partial C\) be a closest boundary point to \(x\), so that \(d_C(x) = \|x - y_x\|\). Since \(y_x \in \partial C\), Theorem 2.9 provides a supporting hyperplane \(H_x\) to \(C\) at \(y_x\); it can be written as follows.
The hyperplane \(H_x = \{u \in \mathbb{R}^n \mid \langle b, u\rangle = \beta\}\) supports the open convex set \(C\) at \(y_x\), and the open half-space \(H_x^+ = \{u \in \mathbb{R}^n \mid \langle b, u\rangle > \beta\}\) contains \(C\). Note that \(y_x \in \partial C \cap H_x\) and that \(H_x\) is the boundary of \(H_x^+\). Since \(y_x \in H_x\), we have \(d_{H_x^+}(x) \leq \|x - y_x\| = d_C(x)\); the reverse inequality follows from (3.14), hence
\[
d_C(x) = d_{H_x^+}(x). \qquad (3.15)
\]
Since \(H_x^+ \in \mathcal{H}\), it follows from (3.14) and (3.15) that
\[
d_C(x) = \inf_{H^+ \in \mathcal{H}} d_{H^+}(x). \qquad (3.16)
\]
This equality holds for an arbitrarily chosen point \(x \in C\). Now consider any open half-space \(H^+ = \{u \in \mathbb{R}^n \mid \langle a, u\rangle > \xi\}\) containing \(C\). Since \(x \in C \subset H^+\), we have \(\langle a, x\rangle > \xi\). On the other hand,
\[
d_{H^+}(x) = \frac{|\langle a, x\rangle - \xi|}{\|a\|} = \frac{\langle a, x\rangle - \xi}{\|a\|},
\]
which is an affine (linear plus constant) function of \(x\). So the right-hand side of (3.16) is the infimum of a family of affine functions. Therefore, by Proposition 1.20, \(d_C(x)\) is a concave function on \(C\).
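Formula (3.16) can be checked numerically in a simple case. For the open unit ball \(C\) in \(\mathbb{R}^2\) one has \(d_C(x) = 1 - \|x\|\), and an open half-space \(\{u \mid \langle a, u\rangle > \xi\}\) with \(\|a\| = 1\) contains \(C\) exactly when \(\xi \leq -1\), so the infimum in (3.16) reduces to minimizing \(\langle a, x\rangle + 1\) over unit vectors \(a\). The sketch below is an added illustration (test points chosen arbitrarily, assuming NumPy):

import numpy as np

# Sample unit normals a and compare min_a (<a, x> + 1) with the exact distance 1 - ||x||.
angles = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
normals = np.column_stack([np.cos(angles), np.sin(angles)])    # unit vectors a

for x in [np.array([0.3, 0.4]), np.array([-0.7, 0.1])]:
    sampled = np.min(normals @ x + 1.0)     # infimum over the sampled half-spaces
    exact = 1.0 - np.linalg.norm(x)         # d_C(x) for the open unit ball
    print(x, sampled, exact)                # the two values agree up to discretization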
We are now ready for the main result of this section.
Theorem 3.8 Let \(C\) be a nonempty open convex set in \(\mathbb{R}^n\). Then the function \(b(x) := -\ln d_C(x)\) is a convex barrier function on \(C\).
Proof We need to show the following: (i) \(b\) is continuous on \(C\); (ii) \(b\) is convex on \(C\); (iii) \(b(x) \to +\infty\) as \(x\) approaches the boundary \(\partial C\).
According to Proposition 3.7, the function \(d_C(x)\) is concave on \(C\), and hence, by Proposition 1.22, it is continuous on \(C\). Moreover, \(d_C(x) > 0\) for every \(x \in C\) because \(C\) is open. Since \(b(x)\) is the composition of the continuous function \(-\ln(\cdot)\) on \((0, +\infty)\) with the continuous function \(d_C(\cdot)\) on \(C\), it is continuous on \(C\). This proves (i).
For (ii), recall that \(\ln(\cdot)\) is concave and non-decreasing on \((0, +\infty)\). By Proposition 1.21, the composition \(\ln d_C(x)\) of \(\ln(\cdot)\) with the concave function \(d_C(\cdot)\) is concave on \(C\). Consequently, \(b(x) = -\ln d_C(x)\) is convex on \(C\), which proves (ii).
For (iii), note that \(d_C\) vanishes on the boundary \(\partial C\), so \(d_C(x) \to 0\) as \(x \in C\) approaches \(\partial C\). Consequently \(\ln d_C(x) \to -\infty\), and therefore \(b(x) = -\ln d_C(x) \to +\infty\) as \(x\) approaches \(\partial C\). This proves (iii) and shows that \(b\) is a barrier function on \(C\), completing the proof.
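To see the barrier behaviour concretely, take again the open unit ball \(C\) in \(\mathbb{R}^2\), where \(d_C(x) = 1 - \|x\|\) and hence \(b(x) = -\ln(1 - \|x\|)\). The short sketch below is an added illustration (sample points chosen arbitrarily, assuming NumPy); it shows \(b\) blowing up along a ray approaching \(\partial C\) and checks midpoint convexity on random pairs of points in \(C\):

import numpy as np

def b(x):
    # Barrier of the open unit ball: b(x) = -ln d_C(x) = -ln(1 - ||x||).
    return -np.log(1.0 - np.linalg.norm(x))

# b(x) tends to +infinity as x approaches the boundary of C.
for t in [0.5, 0.9, 0.99, 0.999]:
    print(t, b(np.array([t, 0.0])))

# Midpoint convexity: b((x + y)/2) <= (b(x) + b(y))/2 for points of C.
rng = np.random.default_rng(0)
pts = rng.uniform(-0.6, 0.6, size=(200, 2))          # all inside the unit ball
print(all(b((x + y) / 2) <= (b(x) + b(y)) / 2 + 1e-12
          for x in pts[:100] for y in pts[100:]))    # True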
Hahn-Banach theorem
This section provides a proof of the Hahn-Banach theorem, a fundamental result in functional analysis, by leveraging the theorem on proper separation between convex and affine sets in general vector spaces.
The Hahn-Banach theorem applies to a general vector space \(E\), with no topology involved. Recall that a functional on \(E\) is a scalar-valued function \(p : E \to \mathbb{R}\). A functional \(p\) is said to be linear if it is additive and homogeneous, i.e., \(p(u + v) = p(u) + p(v)\) and \(p(\lambda v) = \lambda p(v)\) for all \(u, v \in E\) and \(\lambda \in \mathbb{R}\).
A functional \(p\) is said to be sub-additive if \(p(u + v) \leq p(u) + p(v)\) for all \(u, v \in E\).
A functional \(p\) is positively homogeneous if \(p(\lambda v) = \lambda p(v)\) for all \(v \in E\) and all \(\lambda \geq 0\). A functional that is both sub-additive and positively homogeneous is called sublinear. Every sublinear functional is convex: for \(u, v \in E\) and \(\theta \in [0,1]\),
\[
p(\theta u + (1-\theta) v) \leq p(\theta u) + p((1-\theta) v) = \theta p(u) + (1-\theta) p(v).
\]
The Hahn-Banach theorem states the following: if \(E\) is a vector space, \(F \subseteq E\) a linear subspace, \(p\) a sublinear functional on \(E\), and \(f\) a linear functional on \(F\) satisfying \(f(v) \leq p(v)\) for all \(v \in F\), then there exists a linear functional \(g\) on \(E\) such that \(g(v) = f(v)\) for all \(v \in F\) and \(g(v) \leq p(v)\) for all \(v \in E\). In other words, a linear functional dominated by a sublinear functional on a subspace can be extended to the whole space without losing the domination.
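As a concrete finite-dimensional instance (an example added for illustration, not taken from the text): let \(E = \mathbb{R}^2\), let \(p(v) = \|v\|\) (a sublinear functional), let \(F = \{(t, 0) \mid t \in \mathbb{R}\}\), and define \(f(t, 0) := t\). Then \(f(v) \leq p(v)\) on \(F\), and the linear functional \(g(v_1, v_2) := v_1\) extends \(f\) to all of \(E\) while satisfying \(g(v) \leq \|v\| = p(v)\) for every \(v \in E\), exactly as the theorem guarantees.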
Proof Consider the sets \(A := \{(x, \xi) \in E \times \mathbb{R} \mid p(x) \leq \xi\}\) (the epigraph of \(p\)) and \(B := \{(v, f(v)) \mid v \in F\}\) (the graph of \(f\) over \(F\)). Since the sublinear functional \(p\) is convex, the set \(A\) is convex. Since \(f\) is linear and \(F\) is a linear subspace, \(B\) is a linear subspace of \(E \times \mathbb{R}\), and in particular an affine set. Moreover, \(\mathrm{rai}(A) = \{(x, \xi) \in E \times \mathbb{R} \mid p(x) < \xi\}\), which is again convex. Importantly, \(\mathrm{rai}(A)\) and \(B\) are disjoint: if some \((y, \xi)\) belonged to both sets, then \(y \in F\), \(\xi = f(y)\), and \(p(y) < \xi = f(y)\), contradicting the assumption that \(f(v) \leq p(v)\) for all \(v \in F\).
Applying Theorem 2.22 to the convex set \(\mathrm{rai}(A)\) and the affine set \(B\) in the vector space \(E \times \mathbb{R}\), we obtain a hyperplane \(H\) in \(E \times \mathbb{R}\) that contains \(B\) and is disjoint from \(\mathrm{rai}(A)\). Such a hyperplane can be written in the form
\[
H = \{(x, \xi) \in E \times \mathbb{R} \mid h(x) + m\xi = 0\},
\]
which corresponds to a nonzero linear functional on \(E \times \mathbb{R}\) defined by \((x, \xi) \mapsto h(x) + m\xi\), where \(h\) is a linear functional on \(E\) and \(m \in \mathbb{R}\).
Suppose without loss of generality that \(A \subseteq \bar{H}^- := \{(x, \xi) \in E \times \mathbb{R} \mid h(x) + m\xi \leq 0\}\). Then it holds that
\[
h(x) + m f(x) = 0 \quad \forall x \in F, \qquad (3.18)
\]
and, since \(\mathrm{rai}(A)\) is disjoint from \(H\),
\[
h(x) + m\xi < 0 \quad \text{whenever } p(x) < \xi. \qquad (3.19)
\]
If \(m > 0\), we can fix \(x \in E\) and let \(\xi \to +\infty\) to obtain a contradiction with (3.19).
If \(m = 0\), then it follows from (3.19) that \(h(x) < 0\)