Mathematical analysis for machine learning and data mining


Library of Congress Cataloging-in-Publication Data

Names: Simovici, Dan A., author.

Title: Mathematical analysis for machine learning and data mining / by Dan Simovici

(University of Massachusetts, Boston, USA).

Description: [Hackensack?] New Jersey : World Scientific, [2018] |

Includes bibliographical references and index.

Identifiers: LCCN 2018008584 | ISBN 9789813229686 (hc : alk paper)

Subjects: LCSH: Machine learning Mathematics | Data mining Mathematics.

Classification: LCC Q325.5 S57 2018 | DDC 006.3/101515 dc23

LC record available at https://lccn.loc.gov/2018008584

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.

electronic or mechanical, including photocopying, recording or any information storage and retrieval

system now known or to be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

For any available supplementary material, please visit

http://www.worldscientific.com/worldscibooks/10.1142/10702#t=suppl

Desk Editors: V Vishnu Mohan/Steven Patt

Typeset by Stallion Press

Email: enquiries@stallionpress.com

Printed in Singapore

Making mathematics accessible to the educated layman, while keeping high scientific standards, has always been considered a treacherous navigation between the Scylla of professional contempt and the Charybdis of public misunderstanding.

Gian-Carlo Rota

Mathematical Analysis can be loosely described as the area of mathematics whose main object is the study of functions and of their behaviour with respect to limits. The term “function” refers to a broad collection of generalizations of real functions of real arguments, to functionals, operators, measures, etc.

There are several well-developed areas in mathematical analysis that present a special interest for machine learning: topology (with various flavors: point-set topology, combinatorial and algebraic topology), functional analysis on normed and inner product spaces (including Banach and Hilbert spaces), convex analysis, optimization, etc. Moreover, disciplines like measure and integration theory, which play a vital role in statistics, the other pillar of machine learning, are absent from the education of computer scientists. We aim to contribute to closing this gap, which is a serious handicap for people interested in research.

The machine learning and data mining literature is vast and embraces a diversity of approaches, from informal to sophisticated mathematical presentations. However, the mathematical background needed for approaching research topics is usually presented in a terse and unmotivated manner, or is simply absent. This volume contains knowledge that complements the usual presentations in machine learning and provides motivations (through its application chapters that discuss optimization, iterative algorithms, neural networks, regression, and support vector machines) for the study of mathematical aspects.

Each chapter ends with suggestions for further reading. Over 600 exercises and supplements are included; they form an integral part of the material. Some of the exercises are in reality supplemental material. For these, we include solutions. The mathematical background required for making the best use of this volume consists in the typical sequence of calculus, linear algebra, and discrete mathematics, as it is taught to Computer Science students in US universities.

Special thanks are due to the librarians of the Joseph Healy Library at the University of Massachusetts Boston whose diligence was essential in completing this project. I also wish to acknowledge the helpfulness and competent assistance of Steve Patt and D. Rajesh Babu of World Scientific. Lastly, I wish to thank my wife, Doina, a steady source of strength and loving support.

Dan A Simovici

Boston and Brookline

January 2018

1.1 Introduction

1.2 Sets and Collections

1.3 Relations and Functions

1.4 Sequences and Collections of Sets

1.5 Partially Ordered Sets

1.6 Closure and Interior Systems

1.7 Algebras and σ-Algebras of Sets

1.8 Dissimilarity and Metrics

1.9 Elementary Combinatorics

Exercises and Supplements

Bibliographical Comments

2 Linear Spaces

2.1 Introduction

2.2 Linear Spaces and Linear Independence

2.3 Linear Operators and Functionals

2.4 Linear Spaces with Inner Products

2.5 Seminorms and Norms

2.6 Linear Functionals in Inner Product Spaces

2.7 Hyperplanes

3 Algebra of Convex Sets

3.1 Introduction

3.2 Convex Sets and Affine Subspaces

3.3 Operations on Convex Sets

3.4 Cones

3.5 Extreme Points

3.6 Balanced and Absorbing Sets

3.7 Polytopes and Polyhedra

Part II Topology

4 Topology

4.1 Introduction

4.2 Topologies

4.3 Closure and Interior Operators in Topological Spaces

4.4 Neighborhoods

4.5 Bases

4.6 Compactness

4.7 Separation Hierarchy

4.8 Locally Compact Spaces

4.9 Limits of Functions

4.10 Nets

4.11 Continuous Functions

4.12 Homeomorphisms

4.13 Connected Topological Spaces

4.14 Products of Topological Spaces

4.15 Semicontinuous Functions

4.16 The Epigraph and the Hypograph of a Function

5 Metric Space Topologies

5.1 Introduction

5.2 Sequences in Metric Spaces

5.3 Limits of Functions on Metric Spaces

5.4 Continuity of Functions between Metric Spaces

5.5 Separation Properties of Metric Spaces

5.6 Completeness of Metric Spaces

5.7 Pointwise and Uniform Convergence

5.8 The Stone-Weierstrass Theorem

5.9 Totally Bounded Metric Spaces

5.10 Contractions and Fixed Points

5.11 The Hausdorff Metric Hyperspace of Compact Subsets

5.12 The Topological Space (R, O)

5.13 Series and Schauder Bases

5.14 Equicontinuity

6 Topological Linear Spaces

6.1 Introduction

6.2 Topologies of Linear Spaces

6.3 Topologies on Inner Product Spaces

6.4 Locally Convex Linear Spaces

6.5 Continuous Linear Operators

6.6 Linear Operators on Normed Linear Spaces

6.7 Topological Aspects of Convex Sets

6.8 The Relative Interior

6.9 Separation of Convex Sets

6.10 Theorems of Alternatives

6.11 The Contingent Cone

6.12 Extreme Points and Krein-Milman Theorem

Part III Measure and Integration

7 Measurable Spaces and Measures

7.1 Introduction

7.2 Measurable Spaces

7.3 Borel Sets

7.4 Measurable Functions

7.5 Measures and Measure Spaces

7.6 Outer Measures

7.7 The Lebesgue Measure on $R^n$

7.8 Measures on Topological Spaces

7.9 Measures in Metric Spaces

7.10 Signed and Complex Measures

7.11 Probability Spaces

8 Integration

8.1 Introduction

8.2 The Lebesgue Integral

8.2.1 The Integral of Simple Measurable Functions

8.2.2 The Integral of Non-negative Measurable Functions

8.2.3 The Integral of Real-Valued Measurable Functions

8.2.4 The Integral of Complex-Valued Measurable Functions

8.3 The Dominated Convergence Theorem

8.4 Functions of Bounded Variation

8.5 Riemann Integral vs Lebesgue Integral

8.6 The Radon-Nikodym Theorem

8.7 Integration on Products of Measure Spaces

8.8 The Riesz-Markov-Kakutani Theorem

8.9 Integration Relative to Signed Measures and Complex Measures

8.10 Indefinite Integral of a Function

8.11 Convergence in Measure

8.12 $\mathcal{L}^p$ and $L^p$ Spaces

8.13 Fourier Transforms of Measures

8.14 Lebesgue-Stieltjes Measures and Integrals

8.15 Distributions of Random Variables

8.16 Random Vectors

Part IV Functional Analysis and Convexity

9.2 Banach Spaces: Examples

9.3 Linear Operators on Banach Spaces

9.4 Compact Operators

9.5 Duals of Normed Linear Spaces

9.6 Spectra of Linear Operators on Banach Spaces

10 Differentiability of Functions Defined on Normed Spaces

10.1 Introduction

10.2 The Fréchet and Gâteaux Differentiation

10.3 Taylor’s Formula

10.4 The Inverse Function Theorem in $R^n$

10.5 Normal and Tangent Subspaces for Surfaces in $R^n$

11 Hilbert Spaces

11.1 Introduction

11.2 Hilbert Spaces: Examples

11.3 Classes of Linear Operators in Hilbert Spaces

11.3.1 Self-Adjoint Operators

11.3.2 Normal and Unitary Operators

11.3.3 Projection Operators

11.4 Orthonormal Sets in Hilbert Spaces

11.5 The Dual Space of a Hilbert Space

11.6 Weak Convergence

11.7 Spectra of Linear Operators on Hilbert Spaces

11.8 Functions of Positive and Negative Type

11.9 Reproducing Kernel Hilbert Spaces

11.10 Positive Operators in Hilbert Spaces

12 Convex Functions

12.2 Convex Functions: Basics

12.3 Constructing Convex Functions

12.4 Extrema of Convex Functions

12.5 Differentiability and Convexity

12.6 Quasi-Convex and Pseudo-Convex Functions

12.7 Convexity and Inequalities

12.8 Subgradients

Part V Applications

13 Optimization

13.1 Introduction

13.2 Local Extrema, Ascent and Descent Directions

13.3 General Optimization Problems

13.4 Optimization without Differentiability

13.5 Optimization with Differentiability

13.6 Duality

13.7 Strong Duality

14 Iterative Algorithms

14.1 Introduction

14.2 Newton’s Method

14.3 The Secant Method

14.4 Newton’s Method in Banach Spaces

14.5 Conjugate Gradient Method

14.6 Gradient Descent Algorithm

14.7 Stochastic Gradient Descent

15 Neural Networks

15.2 Neurons

15.3 Neural Networks

15.4 Neural Networks as Universal Approximators

15.5 Weight Adjustment by Back Propagation

16 Regression

16.1 Introduction

16.2 Linear Regression

16.3 A Statistical Model of Linear Regression

16.4 Logistic Regression

16.5 Ridge Regression

16.6 Lasso Regression and Regularization

17 Support Vector Machines

17.1 Introduction

17.2 Linearly Separable Data Sets

17.3 Soft Support Vector Machines

17.4 Non-linear Support Vector Machines

17.5 Perceptrons

The membership of x in a set S is denoted by x ∈ S; if x is not a member of the set S, we write x ∉ S.

Throughout this book, we use standardized notations for certain important sets of numbers:

C: the set of complex numbers

R: the set of real numbers

$R_{\geq 0}$: the set of non-negative real numbers

$\hat{R}_{\leq 0}$: the set $R_{\leq 0} \cup \{-\infty\}$

$\hat{R}_{>0}$: the set $R_{>0} \cup \{+\infty\}$

Q: the set of rational numbers

I: the set of irrational numbers

Z: the set of integers

N: the set of natural numbers

The usual order of real numbers is extended to the set $\hat{R}$ by −∞ < x < +∞ for every x ∈ R. Addition and multiplication are extended by

x + ∞ = ∞ + x = +∞ and x − ∞ = −∞ + x = −∞

for every x ∈ R. Also, if x ≠ 0 we assume that

$$x \cdot \infty = \infty \cdot x = \begin{cases} +\infty & \text{if } x > 0, \\ -\infty & \text{if } x < 0. \end{cases}$$

Additionally, we assume that 0 · ∞ = ∞ · 0 = 0 and 0 · (−∞) = (−∞) · 0 = 0.

Note that ∞ − ∞ and −∞ + ∞ are undefined.

Division is extended by x/∞ = x/(−∞) = 0 for every x ∈ R.

The set of complex numbers C is extended by adding a single “infinity” element ∞. The sum ∞ + ∞ is not defined in the complex case.
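The conventions for the extended reals differ in one place from IEEE floating-point arithmetic, where the product of 0 and ∞ is "not a number". The following Python sketch illustrates this, assuming that extended reals are represented by Python floats together with `math.inf`; the helper name `ext_mul` is ours and does not come from the text.

```python
import math

def ext_mul(x: float, y: float) -> float:
    """Multiply two extended reals using the convention 0 * (+/-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    return x * y

# Plain floating-point arithmetic yields NaN for 0 * inf ...
assert math.isnan(0.0 * math.inf)
# ... whereas the convention adopted in the text gives 0.
assert ext_mul(0.0, math.inf) == 0.0
assert ext_mul(-2.0, math.inf) == -math.inf
# inf - inf stays undefined (NaN), in agreement with the text.
assert math.isnan(math.inf - math.inf)
# Division by an infinite value gives 0 (-0.0 compares equal to 0.0).
assert 5.0 / math.inf == 0.0 and 5.0 / -math.inf == 0.0
```

Only multiplication needs special handling here; the sums and quotients listed above already behave as described when computed with floats.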

If S is a ﬁnite set, we denote by |S| the number of elements of S.

1.2 Sets and Collections

We assume that the reader is familiar with elementary set operations: union, intersection, difference, etc., and with their properties. The empty set is denoted by ∅.

We give, without proof, several properties of union and intersection of sets:

for all sets S, T, U.

The associativity of union and intersection allows us to denote unambiguously the union of three sets S, T, U by S ∪ T ∪ U and the intersection of three sets S, T, U by S ∩ T ∩ U.

Deﬁnition 1.1 The sets S and T are disjoint if S ∩ T = ∅.

Sets may contain other sets as elements. For example, the set

If C and D are two collections, we say that C is included in D, or that C is a subcollection of D, if every member of C is a member of D. This is denoted by C ⊆ D.

If C = {S, T}, we have x ∈ ⋃C if and only if x ∈ S or x ∈ T, and x ∈ ⋂C if and only if x ∈ S and x ∈ T. The union and the intersection of this two-set collection are denoted by S ∪ T and S ∩ T and are referred to as the union and the intersection of S and T, respectively.

The difference of two sets S, T is denoted by S − T. When T is a subset of S we write $\bar{T}$ for S − T, and we refer to the set $\bar{T}$ as the complement of T with respect to S or simply the complement of T.

The relationship between set difference and set union and intersection is well-known: for every set S and non-empty collection C of sets, we have

S − ⋃C = ⋂{S − C | C ∈ C} and S − ⋂C = ⋃{S − C | C ∈ C}.

For any sets S, T, U, we have

S − (T ∪ U) = (S − T) ∩ (S − U) and S − (T ∩ U) = (S − T) ∪ (S − U).

We adopt the convention ⋂∅ = S for the empty collection of subsets of S. This is consistent with the fact that ∅ ⊆ C implies ⋂C ⊆ S.

The symmetric difference of sets, denoted by ⊕, is defined by U ⊕ V = (U − V) ∪ (V − U) for all sets U, V.
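The identities relating set difference to unions and intersections of a collection, as well as the definition of the symmetric difference, can be checked on small finite sets. The sketch below uses Python's built-in `set` type; the particular sets and the helper name `sym_diff` are illustrative assumptions, not part of the text.

```python
from functools import reduce

S = set(range(10))
collection = [{1, 2, 3}, {2, 3, 4}, {3, 5, 8}]   # a non-empty collection of sets

union_all = set().union(*collection)
inter_all = reduce(set.intersection, collection)

# S - (union of the collection) = intersection of the sets S - C
assert S - union_all == reduce(set.intersection, [S - c for c in collection])
# S - (intersection of the collection) = union of the sets S - C
assert S - inter_all == set().union(*[S - c for c in collection])

def sym_diff(u: set, v: set) -> set:
    """U (+) V = (U - V) union (V - U)."""
    return (u - v) | (v - u)

U, V = {1, 2, 3}, {3, 4}
assert sym_diff(U, V) == {1, 2, 4}
assert sym_diff(U, V) == U ^ V          # Python's built-in operator agrees
```

A finite check of this kind is not a proof, but it is a quick way to catch a misremembered identity.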

We leave to the reader to verify that for all sets U, V, T we have

Proof. Suppose that {{x, y}, {x}} = {{u, v}, {u}}.

If x = y, the collection {{x, y}, {x}} consists of a single set, {x}, so the collection {{u, v}, {u}} also consists of a single set. This means that {u, v} = {u}, which implies u = v. Therefore, x = u, which gives the desired conclusion because we also have y = v.

If x ≠ y, then neither {x, y} nor {u, v} is a singleton. However, both collections contain exactly one singleton, namely {x} and {u}, respectively, so x = u. They also contain the two-element sets {x, y} and {u, v}, which must be equal. Since v ∈ {x, y} and v ≠ u = x, we conclude that v = y.

Definition 1.3 An ordered pair is a collection of sets {{x, y}, {x}}.

Theorem 1.1 implies that for an ordered pair {{x, y}, {x}}, x and y are uniquely determined. This justifies the following definition.

Definition 1.4 Let p = {{x, y}, {x}} be an ordered pair. Then x is the first component of p and y is the second component of p.

From now on, an ordered pair {{x, y}, {x}} is denoted by (x, y). If both x, y ∈ S, we refer to (x, y) as an ordered pair on the set S.

Deﬁnition 1.5 Let X, Y be two sets Their product is the set X × Y that consists of all pairs of the form (x, y), where x ∈ X and y ∈ Y

The set product is often referred to as the Cartesian product of sets.

Example 1.1 Let X = {a, b, c} and let Y = {1, 2} The Cartesian product

X × Y is given by

X × Y = {(a, 1), (b, 1), (c, 1), (a, 2), (b, 2), (c, 2)}.
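The set-theoretic encoding of ordered pairs and the Cartesian product of Example 1.1 can be reproduced with immutable sets. This is a minimal sketch: the function name `kpair` is hypothetical, and Python tuples are used for the product itself because they are the idiomatic representation of pairs.

```python
from itertools import product

def kpair(x, y) -> frozenset:
    """Encode the ordered pair (x, y) as the collection {{x, y}, {x}}."""
    return frozenset({frozenset({x, y}), frozenset({x})})

# The encoding distinguishes the order of the components ...
assert kpair("a", 1) != kpair(1, "a")
# ... and collapses to the single set {{x}} when both components coincide.
assert kpair("a", "a") == frozenset({frozenset({"a"})})

X, Y = {"a", "b", "c"}, {1, 2}
cartesian = set(product(X, Y))           # pairs as Python tuples
assert len(cartesian) == len(X) * len(Y) == 6
assert ("b", 2) in cartesian
# Distinct pairs on X x Y receive distinct encodings.
assert len({kpair(x, y) for x, y in cartesian}) == 6
```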


Deﬁnition 1.6 Let C and D be two collections of sets such that C =

Since we have (a, b) ⊆ (a, ∞) for every a, b ∈ R such that a < b, it follows that D is a refinement of C.

Deﬁnition 1.7 A collection of sets C is hereditary if U ∈ C and W ⊆ U implies W ∈ C.

Example 1.3 Let S be a set. The collection of subsets of S, denoted by P(S), is a hereditary collection of sets since a subset of a subset T of S is itself a subset of S.

The set of subsets of S that contain k elements is denoted by $P_k(S)$. Clearly, for every set S, we have $P_0(S) = \{\emptyset\}$ because there is only one subset of S that contains 0 elements, namely the empty set. The set of all finite subsets of a set S is denoted by $P_{fin}(S)$. It is clear that $P_{fin}(S) = \bigcup_{k \in N} P_k(S)$.

Example 1.4 If S = {a, b, c}, then P(S) consists of the following eight sets: ∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}. For the empty set, we have P(∅) = {∅}.

Definition 1.8 Let C be a collection of sets and let U be a set. The trace of the collection C on the set U is the collection $C_U = \{U \cap C \mid C \in C\}$.
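For a finite set, the collections of subsets of a given size, the full power set, and the trace of a collection can be computed directly. The sketch below is one possible rendering with `frozenset` members; the function names are our own choices.

```python
from itertools import combinations

def subsets_of_size(s, k):
    """The subsets of S that have exactly k elements."""
    return {frozenset(c) for c in combinations(s, k)}

def power_set(s):
    """P(S) as the union of the size-k subset collections, k = 0, ..., |S|."""
    return set().union(*(subsets_of_size(s, k) for k in range(len(s) + 1)))

def trace(collection, u):
    """The trace of a collection on U: all sets U & C for C in the collection."""
    return {frozenset(u) & c for c in collection}

S = {"a", "b", "c"}
assert subsets_of_size(S, 0) == {frozenset()}
assert len(power_set(S)) == 8
assert power_set(set()) == {frozenset()}
# The trace of P(S) on U = {a, b} is P(U).
assert trace(power_set(S), {"a", "b"}) == power_set({"a", "b"})
```

The last assertion also illustrates why P(S) is hereditary: intersecting its members with U never leaves the collection of subsets of S.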

We conclude this presentation of collections of sets with two more operations on collections of sets.

Definition 1.9 Let C and D be two collections of sets. The collections C ∨ D, C ∧ D, and C − D are given by

C ∨ D = {C ∪ D | C ∈ C and D ∈ D},

C ∧ D = {C ∩ D | C ∈ C and D ∈ D},

C − D = {C − D | C ∈ C and D ∈ D}.

We have

C ∨ D = {{x, y}, {y, z}, {x, y, z}, {u, y, z}, {u, x, y, z}},

C ∧ D = {∅, {x}, {y}, {x, y}, {y, z}},

C − D = {∅, {x}, {z}, {x, z}},

D − C = {∅, {u}, {x}, {y}, {u, z}, {u, y, z}}.

Unlike “∪” and “∩”, the operations “∨” and “∧” between collections of sets are not idempotent. Indeed, we have, for example,

D ∨ D = {{y}, {x, y}, {u, y, z}, {u, x, y, z}} ≠ D.

The trace $C_K$ of a collection C on K can be written as $C_K = C \wedge \{K\}$.
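The operations on collections can be sketched in a few lines, assuming (as the identity for the trace suggests) that “∨”, “∧” and “−” are formed from all pairwise unions, intersections and differences of members. The collection `E` below is a hypothetical example chosen for brevity, not the one used in the text.

```python
def join(c, d):
    """All pairwise unions of a member of c with a member of d."""
    return {a | b for a in c for b in d}

def meet(c, d):
    """All pairwise intersections of a member of c with a member of d."""
    return {a & b for a in c for b in d}

def diff(c, d):
    """All pairwise differences of a member of c and a member of d."""
    return {a - b for a in c for b in d}

fs = frozenset
E = {fs({"a"}), fs({"b"})}            # hypothetical collection, not from the text

# The join is not idempotent: joining E with itself adds the set {a, b}.
assert join(E, E) == {fs({"a"}), fs({"b"}), fs({"a", "b"})}
# The meet is not idempotent either: it adds the empty set.
assert meet(E, E) == {fs({"a"}), fs({"b"}), fs()}
assert diff(E, E) == {fs(), fs({"a"}), fs({"b"})}
# The trace on K coincides with the meet of the collection and {K}.
K = fs({"a", "c"})
assert meet(E, {K}) == {K & c for c in E}
```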

We conclude this section by introducing a special type of collection of subsets of a set.

Definition 1.10 A partition of a non-empty set S is a collection π of non-empty subsets of S that are pairwise disjoint and whose union equals S. The members of π are referred to as the blocks of the partition π.
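The three requirements in the definition of a partition (non-empty blocks, pairwise disjointness, and covering S) translate into a short test. This is a sketch for finite sets; `is_partition` is a hypothetical helper, and blocks are passed as a list, so a block that is listed twice is reported as an overlap.

```python
def is_partition(blocks, s) -> bool:
    """Check that `blocks` is a partition of the non-empty finite set `s`."""
    blocks = [set(b) for b in blocks]
    if not s or any(len(b) == 0 for b in blocks):
        return False                      # S and every block must be non-empty
    union = set().union(*blocks)
    covers = union == set(s)
    # Blocks are pairwise disjoint exactly when their sizes add up to |union|.
    disjoint = sum(len(b) for b in blocks) == len(union)
    return covers and disjoint

S = {1, 2, 3, 4, 5}
assert is_partition([{1, 2}, {3}, {4, 5}], S)
assert not is_partition([{1, 2}, {2, 3}, {4, 5}], S)   # blocks overlap
assert not is_partition([{1, 2}, {4, 5}], S)           # 3 is not covered
assert not is_partition([{1, 2, 3, 4, 5}, set()], S)   # empty block
```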

The collection of partitions of a set S is denoted by PART(S). A partition is finite if it has a finite number of blocks. The set of finite partitions

1.3 Relations and Functions

Definition 1.11 Let X, Y be two sets. A relation on X, Y is a subset ρ of the set product X × Y.

If X = Y = S we refer to ρ as a relation on S.

The relation ρ on S is:

• reflexive if (x, x) ∈ ρ for every x ∈ S;

• irreflexive if (x, x) ∉ ρ for every x ∈ S;

• symmetric if (x, y) ∈ ρ implies (y, x) ∈ ρ for all x, y ∈ S;

• antisymmetric if (x, y) ∈ ρ and (y, x) ∈ ρ imply x = y for all x, y ∈ S;

• transitive if (x, y) ∈ ρ and (y, z) ∈ ρ imply (x, z) ∈ ρ for all x, y, z ∈ S.

Denote by REFL(S), SYMM(S), ANTISYMM(S) and TRAN(S) the set of reflexive relations, the set of symmetric relations, the set of antisymmetric relations, and the set of transitive relations on S, respectively.

A partial order on S is a relation ρ that belongs to REFL(S) ∩ ANTISYMM(S) ∩ TRAN(S), that is, a relation that is reflexive, antisymmetric, and transitive.

Example 1.7 Let δ be the relation that consists of those pairs (p, q) of natural numbers such that q = pk for some natural number k. We have (p, q) ∈ δ if p evenly divides q. Since (p, p) ∈ δ for every p, it is clear that δ is reflexive.

Suppose that we have both (p, q) ∈ δ and (q, p) ∈ δ. Then q = pk and p = qh. If either p or q is 0, then the other number is clearly 0. Assume that neither p nor q is 0. Then 1 = hk, which implies h = k = 1, so p = q, which proves that δ is antisymmetric.

Finally, if (p, q), (q, r) ∈ δ, we have q = pk and r = qh for some k, h ∈ N, which implies r = p(hk), so (p, r) ∈ δ, which shows that δ is transitive.
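The properties established for the divisibility relation can be confirmed by brute force on a finite initial segment of the natural numbers. The sketch below represents a relation as a Python set of pairs; restricting to the numbers 0 through 12 is our assumption, made so that the relation is finite.

```python
from itertools import product

def is_reflexive(rel, s):
    return all((x, x) in rel for x in s)

def is_symmetric(rel):
    return all((y, x) in rel for (x, y) in rel)

def is_antisymmetric(rel):
    return all(x == y for (x, y) in rel if (y, x) in rel)

def is_transitive(rel):
    return all((x, w) in rel
               for (x, y) in rel for (z, w) in rel if y == z)

S = range(0, 13)
# (p, q) is in delta when q = p * k for some natural number k in the range.
delta = {(p, q) for p, q in product(S, S) if any(q == p * k for k in S)}

assert is_reflexive(delta, S)
assert is_antisymmetric(delta)
assert is_transitive(delta)
assert not is_symmetric(delta)          # (2, 4) is in delta but (4, 2) is not
```

Reflexivity, antisymmetry and transitivity together make the restricted relation a partial order on this finite set.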

Example 1.8 Define the relation λ on R as the set of all ordered pairs (x, y) such that y = x + t, where t is a non-negative number. We have (x, x) ∈ λ because x = x + 0 for every x ∈ R, so λ is reflexive. If (x, y) ∈ λ and (y, x) ∈ λ we have y = x + t and x = y + s for two non-negative numbers t, s, which implies 0 = t + s, so t = s = 0. This means that x = y, so λ is antisymmetric. Finally, if (x, y), (y, z) ∈ λ, we have y = x + u and z = y + v for two non-negative numbers u, v, which implies z = x + u + v, so (x, z) ∈ λ; thus, λ is transitive.

In current mathematical practice, we often write xρy instead of (x, y) ∈ ρ, where ρ is a relation on S and x, y ∈ S. Thus, we write pδq and xλy instead of (p, q) ∈ δ and (x, y) ∈ λ. Furthermore, we shall use the standard notations “|” and “≤” for δ and λ, that is, we shall write p | q and x ≤ y if p divides q and x is less than or equal to y. This alternative way to denote the fact that (x, y) belongs to ρ is known as the infix notation.

Example 1.9 Let P(S) be the set of subsets of S. It is easy to verify that the inclusion between subsets “⊆” is a partial order relation on P(S). If U, V ∈ P(S), we denote the inclusion of U in V by U ⊆ V using the infix notation.

Functions are special relations that enjoy the property described in the next definition.

Definition 1.12 Let X, Y be two sets. A function (or a mapping) from X to Y is a relation f on X, Y such that (x, y), (x, y′) ∈ f implies y = y′.

In other words, the first component of a pair (x, y) ∈ f determines uniquely the second component of the pair. We denote the second component of a pair (x, y) ∈ f by f(x) and say, occasionally, that f maps x to y.

If f is a function from X to Y we write f : X −→ Y.

Deﬁnition 1.13 Let X, Y be two sets and let f : X −→ Y

The domain of f is the set

Dom(f ) = {x ∈ X | y = f(x) for some y ∈ Y }.

The range of f is the set

Ran(f ) = {y ∈ Y | y = f(x) for some x ∈ X}.

Definition 1.14 Let S be a set and let L be a subset of S. The characteristic function of L is the function $1_L : S \longrightarrow \{0, 1\}$ defined by:

$$1_L(x) = \begin{cases} 1 & \text{if } x \in L, \\ 0 & \text{otherwise.} \end{cases}$$

It is easy to see that:

$$1_{P \cap Q}(x) = 1_P(x) \cdot 1_Q(x),$$

$$1_{P \cup Q}(x) = 1_P(x) + 1_Q(x) - 1_P(x) \cdot 1_Q(x),$$

$$1_{\bar{P}}(x) = 1 - 1_P(x),$$

for every P, Q ⊆ S and x ∈ S.
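The three identities for characteristic functions can be verified pointwise on a small set. In this sketch a characteristic function is modelled as a Python callable returning 0 or 1; the name `indicator` and the sets used are illustrative assumptions.

```python
def indicator(subset):
    """Return the characteristic function of `subset` as a Python callable."""
    return lambda x: 1 if x in subset else 0

S = set(range(8))
P, Q = {0, 1, 2, 3}, {2, 3, 4}
one_P, one_Q = indicator(P), indicator(Q)

for x in S:
    assert indicator(P & Q)(x) == one_P(x) * one_Q(x)
    assert indicator(P | Q)(x) == one_P(x) + one_Q(x) - one_P(x) * one_Q(x)
    assert indicator(S - P)(x) == 1 - one_P(x)      # complement of P in S
```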

Theorem 1.2 Let X, Y, Z be three sets and let f : X −→ Y and g : Y −→ Z be two functions. The relation gf : X −→ Z that consists of all pairs (x, z) such that y = f(x) and g(y) = z for some y ∈ Y is a function.

Proof. Let $(x, z_1), (x, z_2) \in gf$. There exist $y_1, y_2 \in Y$ such that

$$y_1 = f(x),\quad y_2 = f(x),\quad g(y_1) = z_1,\quad \text{and}\quad g(y_2) = z_2.$$

The first two equalities imply $y_1 = y_2$; the last two yield $z_1 = z_2$, so gf is a function.

Note that the composition of the functions f and g has been denoted in Theorem 1.2 by gf rather than by the relation product fg. This manner of denoting function composition is applied throughout this book.

Definition 1.15 Let X, Y be two sets and let f : X −→ Y. If U is a subset of X, the restriction of f to U is the function g : U −→ Y defined by g(u) = f(u) for u ∈ U.

The restriction of f to U is denoted by $f\restriction_U$.

Example 1.10 Let f be the function defined by f(x) = |x| for x ∈ R. Its restriction to $R_{<0}$ is given by $f\restriction_{R_{<0}}(x) = -x$.

Definition 1.16 A function f : X −→ Y is:

(i) injective or one-to-one if $f(x_1) = f(x_2)$ implies $x_1 = x_2$ for $x_1, x_2 \in \mathrm{Dom}(f)$;

(ii) surjective or onto if Ran(f) = Y;

(iii) total if Dom(f) = X.

If f is both injective and surjective, then it is a bijective function.

Theorem 1.3 A function f : X −→ Y is injective if and only if there exists a function g : Y −→ X such that g(f(x)) = x for every x ∈ Dom(f).

Proof. Suppose that f is an injective function. For x ∈ Dom(f) and y = f(x) define g(y) = x. Note that g is well defined, for if $y = f(x_1) = f(x_2)$ then $x_1 = x_2$ due to the injectivity of f. It follows immediately that g(f(x)) = x for x ∈ Dom(f).

Conversely, suppose that there exists a function g : Y −→ X such that g(f(x)) = x for every x ∈ Dom(f). If $f(x_1) = f(x_2)$, then $x_1 = g(f(x_1)) = g(f(x_2)) = x_2$, which proves that f is indeed injective.

The function g whose existence is established by Theorem 1.3 is said to be the left inverse of f. Thus, a function f : X −→ Y is injective if and only if it has a left inverse.
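The construction used in the proof of Theorem 1.3 is effective for finite functions. The sketch below stores a function as a Python `dict` (an assumption of ours) and builds a left inverse defined on the range of f; the helper name `left_inverse` is hypothetical.

```python
def left_inverse(f: dict) -> dict:
    """Build g with g[f[x]] == x for every x in Dom(f); f must be injective."""
    g = {}
    for x, y in f.items():
        if y in g:                        # two arguments share the value y
            raise ValueError("f is not injective, so no left inverse exists")
        g[y] = x
    return g

f = {1: "a", 2: "b", 3: "c"}              # an injective function given as a dict
g = left_inverse(f)
assert all(g[f[x]] == x for x in f)

try:
    left_inverse({1: "a", 2: "a"})
except ValueError:
    pass                                  # expected: the function is not injective
else:
    raise AssertionError("a non-injective function has no left inverse")
```

The well-definedness argument in the proof corresponds to the `if y in g` check: it fails precisely when two distinct arguments are mapped to the same value.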

To prove a similar result concerning surjective functions we need to state a basic axiom of set theory.

The Axiom of Choice: Let $C = \{C_i \mid i \in I\}$ be a collection of non-empty sets indexed by a set I. There exists a function $\varphi : I \longrightarrow \bigcup C$ (known as a choice function) such that $\varphi(i) \in C_i$ for each i ∈ I.

Theorem 1.4 A function f : X −→ Y is surjective if and only if there exists a function h : Y −→ X such that f(h(y)) = y for every y ∈ Y.

Proof. Suppose that f is a surjective function. The collection $\{f^{-1}(y) \mid y \in Y\}$ indexed by Y consists of non-empty sets. By the Axiom of Choice there exists a choice function for this collection, that is, a function $h : Y \longrightarrow \bigcup_{y \in Y} f^{-1}(y)$ such that $h(y) \in f^{-1}(y)$, or f(h(y)) = y for y ∈ Y.

Conversely, suppose that there exists a function h : Y −→ X such that f(h(y)) = y for every y ∈ Y. Then, f(x) = y for x = h(y), which shows that f is surjective.

The function h whose existence is established by Theorem 1.4 is said to be the right inverse of f. Thus, a function has a right inverse if and only if it is surjective.

Corollary 1.1 A function f : X −→ X is a bijection if and only if there exists a function k : X −→ X that is both a left inverse and a right inverse of f.

Proof. This statement follows from Theorems 1.3 and 1.4.

Indeed, if f is a bijection, there exists a left inverse g : X −→ X such that g(f(x)) = x and a right inverse h : X −→ X such that f(h(x)) = x for every x ∈ X. Since h(x) ∈ X we have g(f(h(x))) = h(x), which implies g(x) = h(x) because f(h(x)) = x for x ∈ X. This yields g = h.

The relationship between the subsets of a set and characteristic functions defined on that set is discussed next.

Theorem 1.5 There is a bijection Ψ : P(S) −→ (S −→ {0, 1}) between the set of subsets of S and the set of characteristic functions defined on S.

Proof. For P ∈ P(S) define $\Psi(P) = 1_P$. The mapping Ψ is one-to-one. Indeed, assume that $1_P = 1_Q$, where P, Q ∈ P(S). We have x ∈ P if and only if $1_P(x) = 1$, which is equivalent to $1_Q(x) = 1$. This happens if and only if x ∈ Q; hence, P = Q, so Ψ is one-to-one.

Let f : S −→ {0, 1} be an arbitrary function. Define the set $T_f = \{x \in S \mid f(x) = 1\}$. It is easy to see that f is the characteristic function of the set $T_f$; hence, $\Psi(T_f) = f$, which shows that the mapping Ψ is also onto; hence, it is a bijection.

Definition 1.17 A set S is indexed by a set I if there exists a surjection f : I −→ S. In this case we refer to I as an index set.

If S is indexed by the function f : I −→ S we write the element f(i) just as $s_i$, if there is no risk of confusion.

Definition 1.18 A sequence of length n on a set X is a function x : {0, . . . , n − 1} −→ X.

An ordered pair (x, y) ∈ X × Y can be regarded as a sequence p of length 2 on X ∪ Y such that $p_0 = x$ and $p_1 = y$. This point of view allows an immediate generalization of set products.

Definition 1.19 An infinite sequence, or, simply, a sequence on a set X is a function x : N −→ X. The set of infinite sequences on X is denoted by Seq(X).

Definition 1.20 Let $S_0, \ldots, S_{n-1}$ be n sets. An n-tuple on $S_0, \ldots, S_{n-1}$ is a function $t : \{0, \ldots, n-1\} \longrightarrow S_0 \cup \cdots \cup S_{n-1}$ such that $t(i) \in S_i$ for $0 \leq i \leq n-1$. The n-tuple t is denoted by $(t(0), \ldots, t(n-1))$.

The set of all n-tuples on $S_0, \ldots, S_{n-1}$ is referred to as the Cartesian product of $S_0, \ldots, S_{n-1}$ and is denoted by $S_0 \times \cdots \times S_{n-1}$.

The Cartesian product of a finite number of sets can be generalized to arbitrary families of sets. Let $S = \{S_i \mid i \in I\}$ be a collection of sets indexed by the set I. The Cartesian product of S is the set of all functions of the form $s : I \longrightarrow \bigcup S$ such that $s(i) \in S_i$ for every i ∈ I. This set is denoted by $\prod_{i \in I} S_i$.¹

The notion of projection is closely associated to Cartesian products.

¹ This construction implicitly uses the axiom of choice, which implies the existence of functions $s : I \longrightarrow \bigcup_{i \in I} S_i$ such that $s(i) \in S_i$ for every i ∈ I.

Definition 1.21 Let $S = \{S_i \mid i \in I\}$ be a collection of sets indexed by the set I. The jth projection (for j ∈ I) is the mapping $p_j : \prod_{i \in I} S_i \longrightarrow S_j$ defined by $p_j(s) = s(j)$ for j ∈ I.

Definition 1.22 Let X, Y be two sets and let f : X −→ Y be a function.

If U ⊆ Dom(f), the image of U under f is the set

f(U) = {y ∈ Y | y = f(u) for some u ∈ U}.

If T ⊆ Y, the pre-image of T under f is the set $f^{-1}(T) = \{x \in X \mid f(x) \in T\}$.

Conversely, suppose that $x \in f^{-1}(Y - V)$, so f(x) ∉ V, hence $x \notin f^{-1}(V)$, which means that $x \in X - f^{-1}(V)$, which completes the proof.

Theorem 1.7 Let f : X −→ Y be a function. If U, V ⊆ Dom(f) we have

f(U ∪ V) = f(U) ∪ f(V) and f(U ∩ V) ⊆ f(U) ∩ f(V).

If S, T ⊆ Ran(f), then $f^{-1}(S \cup T) = f^{-1}(S) \cup f^{-1}(T)$ and $f^{-1}(S \cap T) = f^{-1}(S) \cap f^{-1}(T)$.

Proof. This statement is a direct consequence of Definition 1.22.

Theorem 1.8 Let f : X −→ Y be a function. We have $U \subseteq f^{-1}(f(U))$ for every subset U of X and $f(f^{-1}(V)) \subseteq V$ for every subset V of Y.

Proof. Let u ∈ U. Since f(u) ∈ f(U), it follows that $u \in f^{-1}(f(U))$, so $U \subseteq f^{-1}(f(U))$.

If $y \in f(f^{-1}(V))$ there exists $x \in f^{-1}(V)$ such that y = f(x). Since f(x) ∈ V, it follows that y ∈ V, so $f(f^{-1}(V)) \subseteq V$.
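The statements of Theorems 1.7 and 1.8 can be observed on a small non-injective function. In the sketch below a function is again stored as a `dict`; the helper names `image` and `preimage` and the sample data are our own.

```python
def image(f: dict, u):
    """The image of u under f: all values f(x) for x in u that lie in Dom(f)."""
    return {f[x] for x in u if x in f}

def preimage(f: dict, t):
    """The pre-image of t under f: all x with f(x) in t."""
    return {x for x, y in f.items() if y in t}

f = {1: "a", 2: "a", 3: "b", 4: "c"}      # not injective: 1 and 2 share a value
U, V = {1, 3}, {2, 3}

assert image(f, U | V) == image(f, U) | image(f, V)
assert image(f, U & V) <= image(f, U) & image(f, V)
assert image(f, U & V) != image(f, U) & image(f, V)   # the inclusion is strict here
S, T = {"a", "b"}, {"b", "c"}
assert preimage(f, S | T) == preimage(f, S) | preimage(f, T)
assert preimage(f, S & T) == preimage(f, S) & preimage(f, T)
assert U <= preimage(f, image(f, U))                   # Theorem 1.8, first part
assert image(f, preimage(f, {"a", "z"})) <= {"a", "z"} # Theorem 1.8, second part
```

The strict inclusion for the image of an intersection arises because f is not injective; for an injective function the two sides coincide.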

Theorem 1.9 Let X, Y be two sets and let f : X −→ Y be a function.

If U ⊆ X, then Ran(f) − f(U) ⊆ f(Dom(f) − U). If f is injective, then Ran(f) − f(U) = f(Dom(f) − U).

Proof. Let y ∈ Ran(f) − f(U). Then, there exists x ∈ Dom(f) such that y = f(x); however, since y ∉ f(U), there is no $x_1 \in U$ such that $y = f(x_1)$. Thus, x ∈ Dom(f) − U and we have the desired inclusion.

Suppose now that f is injective and let y ∈ f(Dom(f) − U). Then, y = f(x) for some x such that x ∉ U. Clearly, y ∈ Ran(f). Furthermore, y ∉ f(U), for otherwise we would have $y = f(x_1)$ for some $x_1 \in U$, $x \neq x_1$ (because x ∉ U and $x_1 \in U$), and this would contradict the injectivity of f. Thus, y ∈ Ran(f) − f(U) and we obtain the desired equality.

Corollary 1.2 Let f : X −→ X be a bijection. We have $f(\bar{U}) = \overline{f(U)}$ for every subset U of X.

Proof. Since f is a bijection, we have Ran(f) = X, so X − f(U) = f(X − U) by Theorem 1.9, which amounts to $\overline{f(U)} = f(\bar{U})$.

Theorem 1.10 Let f : X −→ Y be a function. If V, W ⊆ Y, then

Suppose that $x \in f^{-1}(V) - f^{-1}(U)$, that is, $x \in f^{-1}(V)$ and $x \notin f^{-1}(U)$. This means that f(x) ∈ V and f(x) ∉ U, so f(x) ∈ V − U. Thus, $x \in f^{-1}(V - U)$, which concludes the proof of the

Conversely, let $x \in \bigcap_{i \in I} f^{-1}(C_i)$. We have $x \in f^{-1}(C_i)$ for every i ∈ I, so $f(x) \in C_i$ for i ∈ I. Therefore, $f(x) \in \bigcap_{i \in I} C_i$, hence $x \in f^{-1}\left(\bigcap_{i \in I} C_i\right)$.

We leave it to the reader to prove the third equality of the theorem.

Definition 1.23 Let X, Y be two finite non-empty and disjoint sets and let ρ be a relation, ρ ⊆ X × Y. A perfect matching for ρ is an injective mapping f : X −→ Y such that if y = f(x), then (x, y) ∈ ρ.

Note that a perfect matching for ρ is an injective mapping f such that f ⊆ ρ.

For a subset A of X define the set ρ[A] as

ρ[A] = {y ∈ Y | (x, y) ∈ ρ for some x ∈ A}.

Theorem 1.11 (Hall’s Perfect Matching Theorem) Let X, Y be two finite non-empty and disjoint sets and let ρ be a relation, ρ ⊆ X × Y. There exists a perfect matching for ρ if and only if for every A ∈ P(X) we have |ρ[A]| ≥ |A|.
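For small instances, both sides of Hall's theorem can be tested exhaustively. The sketch below checks the condition on every subset of X and searches for a perfect matching by brute force; it illustrates the statement only, it is not the inductive construction of the proof, and all function names and sample relations are assumptions of ours.

```python
from itertools import combinations, permutations

def rho_image(rho, a):
    """The set of all y such that (x, y) is in rho for some x in a."""
    return {y for (x, y) in rho if x in a}

def hall_condition(rho, xs) -> bool:
    """Check that the image of every subset A of X has at least |A| elements."""
    xs = list(xs)
    return all(len(rho_image(rho, set(a))) >= len(a)
               for k in range(len(xs) + 1) for a in combinations(xs, k))

def find_perfect_matching(rho, xs, ys):
    """Exhaustive search for an injective f: X -> Y contained in rho."""
    xs = list(xs)
    for choice in permutations(list(ys), len(xs)):
        if all((x, y) in rho for x, y in zip(xs, choice)):
            return dict(zip(xs, choice))
    return None

X, Y = {"x1", "x2", "x3"}, {"a", "b", "c"}
rho = {("x1", "a"), ("x1", "b"), ("x2", "a"), ("x3", "b"), ("x3", "c")}
assert hall_condition(rho, X)
assert find_perfect_matching(rho, X, Y) is not None

bad = {("x1", "a"), ("x2", "a"), ("x3", "b")}   # x1 and x2 compete for a
assert not hall_condition(bad, X)
assert find_perfect_matching(bad, X, Y) is None
```

Both routines take exponential time in the size of X, so they are suitable only for checking toy examples.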

Proof. The proof is by induction on |X|. If |X| = 1, the statement is immediate.

Suppose that the statement holds for |X| ≤ n and consider a set X with |X| = n + 1. We need to consider two cases: either |ρ[A]| > |A| for every non-empty proper subset A of X, or there exists a non-empty proper subset A of X such that |ρ[A]| = |A|.

In the first case, let x ∈ X. Since |ρ[{x}]| > 1, there exists y ∈ Y such that (x, y) ∈ ρ. Let X′ = X − {x}, Y′ = Y − {y}, and let ρ′ = ρ ∩ (X′ × Y′). Note that for every non-empty B ⊆ X′ we have |ρ[B]| ≥ |B| + 1, and deleting the single element y from ρ[B] still leaves at least |B| elements in this set, so |ρ′[B]| ≥ |B|. By the inductive hypothesis, there exists a perfect matching f′ for ρ′. This matching extends to a matching f for ρ by defining f(x) = y.

In the second case, let X′ = A, Y′ = ρ[A], X″ = X − A, Y″ = Y − ρ[A], and consider the relations ρ′ = ρ ∩ (X′ × Y′) and ρ″ = ρ ∩ (X″ × Y″). We shall prove that there are perfect matchings f′ and f″ for the relations ρ′ and ρ″. A perfect matching for ρ will be given by f′ ∪ f″.

Since A is a non-empty proper subset of X we have both |A| ≤ n and |X − A| ≤ n. For any subset B of A we have ρ′[B] = ρ[B], so ρ′ satisfies the condition of the theorem and a perfect matching f′ for ρ′ exists.

Suppose that there exists C ⊆ X″ such that |ρ″[C]| < |C|. This would imply |ρ[C ∪ A]| < |C ∪ A| because ρ[C ∪ A] = ρ″[C] ∪ ρ[A], which is impossible. Thus, ρ″ also satisfies the condition of the theorem and a perfect matching f″ for ρ″ exists.

1.4 Sequences and Collections of Sets

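Definition 1.24 A sequence of sets $\mathbf{S} = (S_0, S_1, \ldots, S_n, \ldots)$ is expanding if i < j implies $S_i \subseteq S_j$ for every i, j ∈ N.

If i < j implies $S_j \subseteq S_i$ for every i, j ∈ N, then we say that $\mathbf{S}$ is a contracting sequence of sets.

A sequence of sets is monotone if it is expanding or contracting.

Definition 1.25 Let $\mathbf{S}$ be an infinite sequence of subsets of a set S, where $\mathbf{S}(i) = S_i$ for i ∈ N.

The set $\bigcup_{i=0}^{\infty} \bigcap_{j=i}^{\infty} S_j$ is the lower limit of $\mathbf{S}$ and the set $\bigcap_{i=0}^{\infty} \bigcup_{j=i}^{\infty} S_j$ is the upper limit of $\mathbf{S}$. These two sets are denoted by $\liminf \mathbf{S}$ and $\limsup \mathbf{S}$, respectively.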

If $x \in \liminf \mathbf{S}$, then there exists i such that $x \in \bigcap_{j=i}^{\infty} S_j$; in other words, x belongs to all but finitely many sets $S_i$.

If $x \in \limsup \mathbf{S}$, then, for every i there exists j ≥ i such that $x \in S_j$; in this case x belongs to infinitely many sets of the sequence.

Clearly, we have $\liminf \mathbf{S} \subseteq \limsup \mathbf{S}$.

Definition 1.26 A sequence of sets $\mathbf{S}$ is convergent if $\liminf \mathbf{S} = \limsup \mathbf{S}$.

In this case the set $L = \liminf \mathbf{S} = \limsup \mathbf{S}$ is said to be the limit of the sequence $\mathbf{S}$ and is denoted by $\lim \mathbf{S}$.

Example 1.11. Every expanding sequence of sets is convergent. Indeed, since S is expanding we have ⋂_{j=i}^∞ S_j = S_i. Therefore, lim inf S = ⋃_{i=0}^∞ S_i. On the other hand, ⋃_{j=i}^∞ S_j ⊆ ⋃_{i=0}^∞ S_i and, therefore, lim sup S ⊆ lim inf S. This shows that lim inf S = lim sup S, that is, S is convergent.

A similar argument can be used to show that S is convergent when S is contracting.
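Lower and upper limits cannot be computed for an arbitrary infinite sequence, but they can be for a sequence that is eventually periodic. The sketch below assumes such a sequence, described by a finite prefix followed by a non-empty cycle of sets repeated forever; this representation and the function names are illustrative assumptions. For such a sequence the elements that belong to all but finitely many sets are exactly those in every set of the cycle, and the elements that belong to infinitely many sets are those in at least one set of the cycle; the prefix plays no role.

```python
def liminf_limsup_periodic(prefix, cycle):
    """Return (lim inf, lim sup) of the sequence prefix + cycle + cycle + ...

    prefix: list of sets (possibly empty); cycle: non-empty list of sets.
    """
    if not cycle:
        raise ValueError("cycle must contain at least one set")
    # x is in all but finitely many S_i  <=>  x is in every set of the cycle.
    lim_inf = set(cycle[0]).intersection(*cycle[1:])
    # x is in infinitely many S_i  <=>  x is in some set of the cycle.
    lim_sup = set().union(*cycle)
    return lim_inf, lim_sup


def is_convergent_periodic(prefix, cycle):
    """The sequence converges when its lower and upper limits coincide."""
    lim_inf, lim_sup = liminf_limsup_periodic(prefix, cycle)
    return lim_inf == lim_sup
```

A sequence alternating between {1, 2} and {2, 3} has lower limit {2} and upper limit {1, 2, 3}, so it is not convergent; an expanding sequence that becomes constant is described by a one-set cycle and is convergent, in agreement with Example 1.11.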

In this chapter we will use the notion of set countability discussed, for example, in [56].

Definition 1.27. Let 𝒞 be a collection of subsets of a set S. The collection 𝒞_σ consists of all countable unions of members of 𝒞.

The collection 𝒞_δ consists of all countable intersections of members of 𝒞.

Observe that by taking C_n = C ∈ 𝒞 for n ≥ 0 it follows that 𝒞 ⊆ 𝒞_σ and 𝒞 ⊆ 𝒞_δ. Furthermore, if 𝒞, 𝒞′ are two collections of sets such that 𝒞 ⊆ 𝒞′, then 𝒞_σ ⊆ 𝒞′_σ and 𝒞_δ ⊆ 𝒞′_δ.

The operations σ and δ can be applied iteratively. We denote sequences of applications of these operations by subscripts adorning the affected collection. The order of application coincides with the order of these symbols in the subscript. For example, (𝒞)_σδσ means ((𝒞_σ)_δ)_σ. Thus, Theorem 1.12 can be restated as the equalities 𝒞_σσ = 𝒞_σ and 𝒞_δδ = 𝒞_δ.

Observe that if C = (C0, C1, ) is a sequence of sets, then lim sup C =

1.5 Partially Ordered Sets

If ρ is a partial order on S, we refer to the pair (S, ρ) as a partially ordered set or as a poset.

A strict partial order, or a strict order on S, is a relation ρ ⊆ S × S that is irreflexive and transitive.

Note that if ρ is a partial order on S, the relation

ρ_1 = ρ − {(x, x) | x ∈ S}

is a strict partial order on S.

From now on we shall denote by “≤” a generic partial order on a set S; thus, a generic partially ordered set is denoted by (S, ≤).

Example 1.12. Let δ = {(m, n) | m, n ∈ N, n = km for some k ∈ N}. Since n = 1 · n it follows that (n, n) ∈ δ for every n ∈ N, so δ is a reflexive relation. Suppose that (m, n) ∈ δ and (n, m) ∈ δ, so n = mk and m = nh for some k, h ∈ N. This implies n(1 − kh) = 0. If n = 0, it follows that m = 0. If n ≠ 0 we have kh = 1, which means that k = h = 1 because k, h ∈ N, so again, m = n. Thus, δ is antisymmetric. Finally, if (m, n), (n, p) ∈ δ we have n = rm and p = sn for some r, s ∈ N, so p = srm, which implies (m, p) ∈ δ. This shows that δ is also transitive and, therefore, it is a partial order relation on N.
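The three properties verified in Example 1.12 can also be checked mechanically on an initial segment of N. The following sketch assumes that N contains 0, as the case n = 0 in the example suggests; the function names and the choice of the segment {0, ..., 12} are illustrative only, and a finite check of this kind is a sanity test, not a proof.

```python
def divides(m, n):
    """(m, n) belongs to delta: n = k * m for some natural number k."""
    if m == 0:
        return n == 0  # k * 0 is always 0
    return n % m == 0


def is_partial_order(elements, rel):
    """Check reflexivity, antisymmetry and transitivity of rel on a finite set."""
    elems = list(elements)
    reflexive = all(rel(a, a) for a in elems)
    antisymmetric = all(
        a == b or not (rel(a, b) and rel(b, a)) for a in elems for b in elems
    )
    transitive = all(
        rel(a, c) or not (rel(a, b) and rel(b, c))
        for a in elems
        for b in elems
        for c in elems
    )
    return reflexive and antisymmetric and transitive
```

Calling is_partial_order(range(13), divides) returns True, whereas the strict order "<" fails the test because it is not reflexive.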

Example 1.13. Let π, σ be two partitions in PART(S). We define π ≤ σ if each block C of σ is a π-saturated set.

It is clear that “≤” is a reflexive relation. Suppose that π ≤ σ and σ ≤ τ, where π, σ, τ ∈ PART(S). Then each block D of τ is a union of blocks of σ, and each block of σ is a union of blocks of π. Thus, D is a union of blocks of π and, therefore, π ≤ τ.

Suppose now that π ≤ σ and σ ≤ π. Then, each block C of σ is a union of π-blocks, C = ⋃_{i ∈ I} B_i, and every π-block is a union of σ-blocks. Since no block of a partition can be a subset of another block of the same partition, it follows that each block of σ coincides with a block of π, that is, π = σ.

If K^s ≠ ∅, we say that the set K is bounded above. Similarly, if K^i ≠ ∅, we say that K is bounded below. If K is both bounded above and bounded below we will refer to K as a bounded set.

If K^s = ∅ (K^i = ∅), then K is said to be unbounded above (below).

Theorem 1.13. Let (S, ≤) be a poset and let U and V be two subsets of S. If U ⊆ V, then we have V^i ⊆ U^i and V^s ⊆ U^s.

Also, for every subset T of S, we have T ⊆ (T^s)^i and T ⊆ (T^i)^s.

Proof. The argument for both statements of the theorem amounts to a

Note that for every subset T of a poset S, we have both

T^i = ((T^i)^s)^i   (1.1)

and

T^s = ((T^s)^i)^s.   (1.2)

Indeed, since T ⊆ (T^s)^i, by the first part of Theorem 1.13 we have ((T^s)^i)^s ⊆ T^s. By the second part of the same theorem applied to T^s, we have the reverse inclusion T^s ⊆ ((T^s)^i)^s, which yields T^s = ((T^s)^i)^s. Equality (1.1) is obtained in the same way.

Theorem 1.14. For any subset K of a poset (S, ρ), the sets K ∩ K^s and K ∩ K^i contain at most one element.

Proof. Suppose that y_1, y_2 ∈ K ∩ K^s. Since y_1 ∈ K and y_2 ∈ K^s, we have (y_1, y_2) ∈ ρ. Reversing the roles of y_1 and y_2 (that is, considering now that y_2 ∈ K and y_1 ∈ K^s), we obtain (y_2, y_1) ∈ ρ. Therefore, we may conclude that y_1 = y_2 because of the antisymmetry of the relation ρ, which shows that K ∩ K^s contains at most one element.

A similar argument can be used for the second part of the proposition;

Definition 1.29. Let (S, ≤) be a poset. The least (greatest) element of the subset K of S is the unique element of the set K ∩ K^i (K ∩ K^s, respectively) if such an element exists.

If K is unbounded above, then it is clear that K has no greatest element. Similarly, if K is unbounded below, then K has no least element.

Applying Definition 1.29 to the set S, the least (greatest) element of the poset (S, ≤) is an element a of S such that a ≤ x (x ≤ a, respectively) for all x ∈ S.

It is clear that if a poset has a least element u, then u is the unique minimal element of that poset. A similar statement holds for the greatest and the maximal elements.

Definition 1.30. The subset K of the poset (S, ≤) has a least upper bound u if K^s ∩ (K^s)^i = {u}.

K has the greatest lower bound v if K^i ∩ (K^i)^s = {v}.

We note that a set can have at most one least upper bound and at most one greatest lower bound. Indeed, we have seen above that for any set U the set U ∩ U^i may contain an element or be empty. Applying this remark to the set K^s, it follows that the set K^s ∩ (K^s)^i may contain at most one element, which shows that K may have at most one least upper bound. A similar argument can be made for the greatest lower bound.

If the set K has a least upper bound, we denote it by sup K. The greatest lower bound of a set will be denoted by inf K. These notations come from the Latin terms supremum and infimum used alternatively for the least upper bound and the greatest lower bound, respectively.
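On a finite poset the sets K^s and K^i, and with them sup K and inf K, can be computed directly from the set-theoretic descriptions K^s ∩ (K^s)^i and K^i ∩ (K^i)^s. The sketch below assumes that the poset is given by a finite set S and a Boolean function leq(a, b) meaning a ≤ b; these names, and the convention of returning None when no supremum or infimum exists, are our own choices.

```python
def upper_bounds(S, leq, K):
    """K^s: the elements of S that are greater than or equal to every element of K."""
    return {s for s in S if all(leq(k, s) for k in K)}


def lower_bounds(S, leq, K):
    """K^i: the elements of S that are less than or equal to every element of K."""
    return {s for s in S if all(leq(s, k) for k in K)}


def supremum(S, leq, K):
    """Return the unique element of K^s ∩ (K^s)^i, or None if that set is empty."""
    Ks = upper_bounds(S, leq, K)
    candidates = Ks & lower_bounds(S, leq, Ks)
    return next(iter(candidates)) if candidates else None


def infimum(S, leq, K):
    """Return the unique element of K^i ∩ (K^i)^s, or None if that set is empty."""
    Ki = lower_bounds(S, leq, K)
    candidates = Ki & upper_bounds(S, leq, Ki)
    return next(iter(candidates)) if candidates else None
```

With S = {1, ..., 12} ordered by divisibility, the set {4, 6} has supremum 12 and infimum 2, while {5, 7} is unbounded above in S and has no supremum. In the poset {1, 2, 3, 12, 18} ordered by divisibility the set {2, 3} has the upper bounds 12 and 18 but no least upper bound.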

Lemma 1.1. Let U, V be two subsets of a poset (S, ≤). If U ⊆ V, then V^s ⊆ U^s and V^i ⊆ U^i.

Theorem 1.15. Let (S, ≤) be a poset and let K, L be two subsets of S such that K ⊆ L. If sup K and sup L exist, then sup K ≤ sup L; if inf K and inf L exist, then inf L ≤ inf K.

Proof. By Lemma 1.1 we have L^s ⊆ K^s and L^i ⊆ K^i. By the same lemma, we have (K^s)^i ⊆ (L^s)^i and (K^i)^s ⊆ (L^i)^s.

Let a = sup K and b = sup L. Since {a} = K^s ∩ (K^s)^i and (K^s)^i ⊆ (L^s)^i, it follows that a ∈ (L^s)^i. Since b ∈ L^s, this implies a ≤ b.

If c = inf K and d = inf L, taking into account that {c} = K^i ∩ (K^i)^s, we have c ∈ (L^i)^s because K^i ∩ (K^i)^s ⊆ (L^i)^s. Since d ∈ L^i we have d ≤ c.


Example 1.14. A two-element subset {m, n} of the poset (N, δ) introduced in Example 1.12 has both an infimum and a supremum. Indeed, let p be the least common multiple of m and n. Since (n, p), (m, p) ∈ δ, it is clear that p is an upper bound of the set {m, n}. On the other hand, if k is an upper bound of {m, n}, then k is a multiple of both m and n. In this case, k must also be a multiple of p because otherwise we could write k = pq + r with 0 < r < p by dividing k by p. This would imply r = k − pq; hence, r would be a multiple of both m and n because both k and p have this property. However, this would contradict the fact that p is the least multiple that m and n share! This shows that the least common multiple of m and n coincides with the supremum of the set {m, n}. Similarly, inf {m, n} equals the greatest common divisor of m and n.
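The claim of Example 1.14 can be tested numerically for positive integers. In the sketch below, which uses only math.gcd from the standard library, the helper names and the search bound limit are assumptions made for the illustration; the check is restricted to m, n ≥ 1 and to upper bounds not exceeding limit, so it is a sanity test and not a proof.

```python
from math import gcd


def lcm(m, n):
    """Least common multiple of two positive integers."""
    return m * n // gcd(m, n)


def is_sup_for_divisibility(m, n, candidate, limit):
    """Check that candidate is an upper bound of {m, n} for divisibility
    and that it divides every upper bound found in 1..limit."""
    upper = [k for k in range(1, limit + 1) if k % m == 0 and k % n == 0]
    return candidate in upper and all(k % candidate == 0 for k in upper)


def is_inf_for_divisibility(m, n, candidate):
    """Check that candidate is a lower bound of {m, n} for divisibility
    and that every lower bound divides it."""
    lower = [d for d in range(1, min(m, n) + 1) if m % d == 0 and n % d == 0]
    return candidate in lower and all(candidate % d == 0 for d in lower)
```

For m = 4 and n = 6 the checks succeed with lcm(4, 6) = 12 and gcd(4, 6) = 2; the common multiple 24 is rejected as a supremum because the smaller upper bound 12 is not a multiple of it.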

Example 1.15. Let π, σ be two partitions in PART(S). It is easy to see that the collection θ = {B ∩ C | B ∈ π, C ∈ σ, B ∩ C ≠ ∅} is a partition of S such that θ ≤ π and θ ≤ σ; furthermore, if τ is a partition of S such that τ ≤ π and τ ≤ σ, then each block B of π and each block C of σ is a τ-saturated set, and, therefore, each non-empty intersection B ∩ C is a τ-saturated set, that is, τ ≤ θ. This shows that θ = inf {π, σ}.

The partition θ will be denoted by π ∧ σ.
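The partition θ of Example 1.15 is easy to compute for partitions of a finite set. The sketch below represents a partition as a Python set of frozensets; this representation and the names meet, is_saturated and leq are illustrative assumptions. The function leq implements the order of Example 1.13: π ≤ σ when every block of σ is a π-saturated set.

```python
def meet(pi, sigma):
    """Return the collection of non-empty intersections B ∩ C, B in pi, C in sigma."""
    return {B & C for B in pi for C in sigma if B & C}


def is_saturated(C, pi):
    """C is pi-saturated if every block of pi is inside C or disjoint from C."""
    return all(B <= C or not (B & C) for B in pi)


def leq(pi, sigma):
    """pi ≤ sigma: each block of sigma is a pi-saturated set."""
    return all(is_saturated(C, pi) for C in sigma)
```

For π = {{1, 2}, {3, 4}} and σ = {{1}, {2, 3}, {4}} the function meet returns the partition into singletons, which is below both π and σ in this order, while π and σ themselves are not comparable.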

Definition 1.31. A minimal element of a poset (S, ≤) is an element x ∈ S such that {x}^i = {x}. A maximal element of (S, ≤) is an element y ∈ S such that {y}^s = {y}.

In other words, x is a minimal element of the poset (S, ≤) if there is no element less than or equal to x other than itself; similarly, x is maximal if there is no element greater than or equal to x other than itself.

For the poset (R, ≤), it is possible to give more specific descriptions of the supremum and infimum of a subset when they exist.

Theorem 1.16. If T ⊆ R, then u = sup T if and only if u is an upper bound of T and, for every ε > 0, there is t ∈ T such that u − ε < t ≤ u. The number v is inf T if and only if v is a lower bound of T and, for every ε > 0, there is t ∈ T such that v ≤ t < v + ε.

Proof. We prove only the first part of the theorem; the argument for the second part is similar and is left to the reader.

Suppose that u = sup T; that is, {u} = T^s ∩ (T^s)^i. Since u ∈ T^s, it is clear that u is an upper bound for T. Suppose that there is ε > 0 such that no t ∈ T exists such that u − ε < t ≤ u. This means that u − ε is also an upper bound for T, and in this case u cannot be a lower bound for the set of upper bounds of T. Therefore, no such ε may exist.

Conversely, suppose that u is an upper bound of T and for every ε > 0, there is t ∈ T such that u − ε < t ≤ u. Suppose that u does not belong to (T^s)^i. This means that there is another upper bound u′ of T such that u′ < u. Choosing ε = u − u′, we would have no t ∈ T such that u − ε = u′ < t ≤ u because this would prevent u′ from being an upper bound of T, which contradicts the hypothesis. This implies u ∈ (T^s)^i, so u = sup T.

Theorem 1.17. In the extended poset of real numbers (R̂, ≤) every subset has a supremum and an infimum.

Proof. If a set is bounded then the existence of the supremum and infimum is established by the Completeness Axiom. Suppose that a subset S of R̂ has no upper bound in R. Then x ≤ ∞ for every x ∈ S, so ∞ is an upper bound of S in R̂. Moreover, ∞ is the unique upper bound of S, so sup S = ∞. Similarly, if S has no lower bound in R, then inf S = −∞ in R̂.

The definitions of infimum and supremum of the empty set in (R̂, ≤) are

sup ∅ = −∞ and inf ∅ = ∞,

in order to remain consistent with Theorem 1.15.

A very important axiom for the set R is given next.

The Completeness Axiom for R: If T is a non-empty subset of R that is bounded above, then T has a supremum.

A statement equivalent to the Completeness Axiom for R follows.

Theorem 1.18. If T is a non-empty subset of R that is bounded below, then T has an infimum.

Proof. Note that the set T^i is not empty. If s ∈ T^i and t ∈ T, we have s ≤ t, so the set T^i is bounded above. By the Completeness Axiom v = sup T^i exists and {v} = (T^i)^s ∩ ((T^i)^s)^i = (T^i)^s ∩ T^i by equality (1.1). Thus, v = inf T.


Theorem 1.19 (Dedekind’s Theorem). Let U and V be non-empty subsets of R such that U ∪ V = R and x ∈ U, y ∈ V imply x < y. Then, there exists a ∈ R such that if x > a, then x ∈ V, and if x < a, then x ∈ U.

Proof. Observe that U ≠ ∅ and V ⊆ U^s. Since V ≠ ∅, it means that U is bounded above, so by the Completeness Axiom sup U exists. Let a = sup U. Clearly, u ≤ a for every u ∈ U. Since V ⊆ U^s, it also follows that a ≤ v for every v ∈ V.

If x > a, then x ∈ V because otherwise we would have x ∈ U since U ∪ V = R and this would imply x ≤ a. Similarly, if x < a, then x ∈ U.

Using the previously introduced notations, Dedekind’s theorem can be stated as follows: if U and V are non-empty subsets of R such that U ∪ V = R, U^s ⊆ V, V^i ⊆ U, then there exists a such that {a}^s ⊆ V and {a}^i ⊆ U.

One can prove that Dedekind’s theorem implies the Completeness Axiom. Indeed, let T be a non-empty subset of R that is bounded above. Therefore V = T^s ≠ ∅. Note that U = (T^s)^i ≠ ∅ and U ∪ V = R. Moreover, U^s = ((T^s)^i)^s = T^s = V and V^i = (T^s)^i = U. Therefore, by Dedekind’s theorem, there is a ∈ R such that {a}^s ⊆ V = T^s and {a}^i ⊆ U = (T^s)^i. Note that a ∈ {a}^s ∩ {a}^i ⊆ T^s ∩ (T^s)^i, which proves that a = sup T.

By adding the symbols +∞ and −∞ to the set R, one obtains the set R̂. The partial order ≤ defined on R can now be extended to R̂ by −∞ ≤ x and x ≤ +∞ for every x ∈ R.

Note that, in the poset (R̂, ≤), the sets T^i and T^s are non-empty for every T ∈ P(R̂) because −∞ ∈ T^i and +∞ ∈ T^s for any subset T of R̂.

Theorem 1.20. For every set T ⊆ R̂, both sup T and inf T exist in the poset (R̂, ≤).

Proof. We present the argument for sup T. If sup T exists in (R, ≤), then it is clear that the same number is sup T in (R̂, ≤).

Assume now that sup T does not exist in (R, ≤). By the Completeness Axiom for R, this means that the set T does not have an upper bound in (R, ≤). Therefore, the set of upper bounds of T in (R̂, ≤) is T^ŝ = {+∞}. It follows immediately that in this case sup T = +∞ in (R̂, ≤).

Theorem 1.21. Let I be a partially ordered set and let {x_i | i ∈ I} be a subset of R̂ indexed by I. For i ∈ I let S_i = {x_j | j ∈ I and i ≤ j}. We have

sup inf S_i ≤ inf sup S_i.

Proof. Note that if i ≤ h, then S_h ⊆ S_i for i, h ∈ I. As we saw earlier, each set S_i has both an infimum y_i and a supremum z_i in the poset (R̂, ≤).

It is clear that if i ≤ h, then y_i ≤ y_h ≤ z_h ≤ z_i. We claim that

sup{y_i | i ∈ I} ≤ inf{z_h | h ∈ I}.

Indeed, since y_i ≤ z_h for all i, h such that i ≤ h, we have y_i ≤ inf{z_h | h ∈ I} for all i ∈ I. Therefore,

sup{y_i | i ∈ I} ≤ inf{z_h | h ∈ I},

which can be written as

sup inf S_i ≤ inf sup S_i.

Let S be a set and let f : S → R̂. The image of S under f is the set f(S) = {f(x) | x ∈ S}. Since f(S) ⊆ R̂, sup f(S) exists. Furthermore, if there exists u ∈ S such that f(u) = sup f(S), then we say that f attains its supremum at u. This is not always the case as the next example shows.

Example 1.16. Let f : (0, 1) → R̂ be defined by f(x) = 1/(1 − x). It is clear that sup f((0, 1)) = ∞. However, there is no u ∈ (0, 1) such that f(u) = ∞, so f does not attain its supremum on (0, 1).

Let X, Y be two sets and let f : X × Y → R̂ be a function. We have

sup_{x ∈ X} inf_{y ∈ Y} f(x, y) ≤ inf_{y ∈ Y} sup_{x ∈ X} f(x, y).   (1.4)

Indeed, note that inf_{y ∈ Y} f(x, y) ≤ f(x, y) for every x ∈ X and y ∈ Y by the definition of the infimum. Note that the left member of the inequality depends only on x. The last inequality implies sup_{x ∈ X} inf_{y ∈ Y} f(x, y) ≤ sup_{x ∈ X} f(x, y), by the monotonicity of sup, and now the current left member is a lower bound of the set {z = sup_{x ∈ X} f(x, y) | y ∈ Y}. This implies immediately the inequality (1.4).
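When X and Y are finite, the suprema and infima in inequality (1.4) are maxima and minima, and the inequality can be observed directly. The sketch below assumes finite, non-empty X and Y and a real-valued f given as a Python function; the names sup_inf and inf_sup and the sample table are ours.

```python
def sup_inf(f, X, Y):
    """Left member of (1.4): the maximum over x of the minimum over y of f(x, y)."""
    return max(min(f(x, y) for y in Y) for x in X)


def inf_sup(f, X, Y):
    """Right member of (1.4): the minimum over y of the maximum over x of f(x, y)."""
    return min(max(f(x, y) for x in X) for y in Y)


# A sample function on {0, 1} × {0, 1}, given by a table: f(x, y) = table[x][y].
table = [[1, 3],
         [4, 2]]


def f(x, y):
    return table[x][y]
```

For this table the row minima are 1 and 2, so the left member equals 2; the column maxima are 4 and 3, so the right member equals 3. The inequality is strict here, which shows that equality in (1.4) is a genuine restriction on f.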

If instead of inequality (1.4) the function f satisfies the equality:

sup_{x ∈ X} inf_{y ∈ Y} f(x, y) = inf_{y ∈ Y} sup_{x ∈ X} f(x, y),   (1.5)

then the common value of both sides is a saddle value for f.

Since inf_{y ∈ Y} f(x, y) is a function h(x) of x and sup_{x ∈ X} f(x, y) is a function g(y) of y, the existence of a saddle value for f implies that sup_{x ∈ X} h(x) = inf_{y ∈ Y} g(y) = v.


If both sup_{x ∈ X} h(x) and inf_{y ∈ Y} g(y) are attained, that is, there are x_0 ∈ X and y_0 ∈ Y such that

is referred to as a saddle point for f

Conversely, if there exists a saddle point (x_0, y_0) such that f(x, y_0) ≤ f(x_0, y_0) ≤ f(x_0, y), then f : X × Y → R has a saddle value. Indeed, in

x ∈X yinf∈Y f (x, y).

Since a saddle value exists, these inequalities become equalities and we have a saddle value.
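For finite X and Y, saddle points can be found by testing the two inequalities f(x, y_0) ≤ f(x_0, y_0) ≤ f(x_0, y) at every pair. The sketch below makes that assumption; the function names and the two sample tables are illustrative and are not taken from the text.

```python
def saddle_points(f, X, Y):
    """Return all pairs (x0, y0) with f(x, y0) <= f(x0, y0) <= f(x0, y)
    for every x in X and every y in Y."""
    X, Y = list(X), list(Y)
    return [
        (x0, y0)
        for x0 in X
        for y0 in Y
        if all(f(x, y0) <= f(x0, y0) for x in X)
        and all(f(x0, y0) <= f(x0, y) for y in Y)
    ]


def sup_inf(f, X, Y):
    """Maximum over x of the minimum over y of f(x, y)."""
    return max(min(f(x, y) for y in Y) for x in X)


def inf_sup(f, X, Y):
    """Minimum over y of the maximum over x of f(x, y)."""
    return min(max(f(x, y) for x in X) for y in Y)


with_saddle = [[2, 3],
               [1, 0]]
without_saddle = [[1, 3],
                  [4, 2]]
```

For the first table, with f(x, y) = with_saddle[x][y], the only saddle point is (0, 0) and both members of (1.5) equal f(0, 0) = 2. For the second table no pair passes the test, and the two members differ (2 and 3).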

Definition 1.32. Let (S, ≤) be a poset. A chain of (S, ≤) is a subset T of S such that for every x, y ∈ T such that x ≠ y we have either x < y or y < x. If the set S is a chain, we say that (S, ≤) is a totally ordered set and the relation ≤ is a total order.
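Whether a finite subset of a poset is a chain can be decided by comparing its elements pairwise. The sketch below assumes that the order is given by a Boolean function leq(a, b) meaning a ≤ b; the name is_chain and the divisibility examples are ours.

```python
def is_chain(T, leq):
    """Return True if any two distinct elements of T are comparable under leq."""
    items = list(T)
    return all(
        leq(x, y) or leq(y, x)
        for i, x in enumerate(items)
        for y in items[i + 1:]
    )


def divides(a, b):
    """Divisibility order on the positive integers: a ≤ b when a divides b."""
    return b % a == 0
```

Under divisibility the set {1, 2, 4, 8} is a chain, whereas {2, 3, 6} is not because 2 and 3 are incomparable; with the usual order on numbers every finite set of reals is a chain.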
