Trinity University
Digital Commons @ Trinity
1-2005
The Asymptotic Optimal Partition and Extensions
of the Nonsubstitution Theorem
Follow this and additional works at: https://digitalcommons.trinity.edu/math_faculty
This Post-Print is brought to you for free and open access by the Mathematics Department at Digital Commons @ Trinity. It has been accepted for inclusion in Mathematics Faculty Research by an authorized administrator of Digital Commons @ Trinity. For more information, please contact jcostanz@trinity.edu.
Repository Citation
Hasfura-Buenaga, J.-R., Holder, A., & Stuart, J. (2005). The asymptotic optimal partition and extensions of the nonsubstitution theorem. Linear Algebra and Its Applications, 394, 145-167. doi:10.1016/j.laa.2004.05.018
The Asymptotic Optimal Partition and Extensions of The Nonsubstitution Theorem

Julio-Roberto Hasfura-Buenaga†, Allen Holder††∗, and Jeffrey Stuart†††

March 13, 2002
Abstract

The data describing an asymptotic linear program rely on a single parameter, usually referred to as time, and unlike parametric linear programming, asymptotic linear programming is concerned with the steady state behavior as time increases to infinity. The fundamental result of this work shows that the optimal partition for an asymptotic linear program attains a steady state for a large class of functions. Consequently, this allows us to define an asymptotic center solution. We show that this solution inherits the analytic properties of the functions used to describe the feasible region. Moreover, our results allow significant extensions of an economics result known as the Nonsubstitution Theorem.

Key Words: Asymptotic Linear Programming, Analytic Matrix Theory, Optimal Partition, Mathematical Economics, Nonsubstitution Theorem
† Department of Mathematics, Trinity University, San Antonio, TX, USA
††Hearin Center for Enterprise Science, School of Business Administration, The
University of Mississippi, University, MS, USA
††† Department of Mathematics, Pacific Lutheran University, Tacoma, WA,
USA
∗ Research supported by ONR grant N00014-01-1-0917. Research conducted at Trinity University.
1 Introduction
The data describing many business and economic linear programs depend on a single parameter t, usually viewed as time. As such, understanding the dynamics of a solution as time progresses is important, and steady-state properties are often desired. A property stabilizes if it attains a steady-state for all sufficiently large t (typical properties are feasibility and boundedness).
The foundational work on asymptotic linear programming was done by Jeroslow in [15] and [16], where the author assumes that the data functions are rational. In [15], the author shows that an optimal basis becomes stable for sufficiently large t, and that the number of basic optimal solutions stabilizes. This article also shows how to use the simplex method to produce a steady-state optimal basis. The continuity properties of a basic optimal solution near its poles are investigated in [16]. Bernard [3, 4] has studied the complexity of updating a basis in the special case of the data being linear in t. Economic models are developed and analyzed in [2] and [4].
Throughout, we are concerned with the asymptotic linear program
LP(t): min{c(t)^T x : A(t)x = b(t), x ≥ 0},

and its associated dual

LD(t): max{b(t)^T y : A(t)^T y + s = c(t), s ≥ 0},

where A(t) : IR → IR^{m×n}, b(t) : IR → IR^m, and c(t) : IR → IR^n. For any t ∈ IR, the data instance defining LP(t) is (A(t), b(t), c(t)). The feasible region for LP(t) is denoted by P(t), and the strict interior is P^o(t) = {x ∈ P(t) : x > 0}. Similarly, the dual feasible region is D(t), and its strict interior is D^o(t) = {(y, s) ∈ D(t) : s > 0}. The primal and dual optimal sets are denoted by P∗(t) and D∗(t), respectively. The necessary and sufficient optimality conditions for this primal-dual pair are

A(t)x = b(t), x ≥ 0, (1)
A(t)^T y + s = c(t), s ≥ 0, (2)
Xs = 0. (3)
Although the ellipsoid algorithm settled the computational complexity of linear programming, the practical performance of this algorithm was disappointing. As such, the mathematical programming community's focus remained on the simplex algorithm. This changed in 1986 when Karmarkar [17] claimed to have an interior point algorithm that outperformed the simplex algorithm. This claim was heavily scrutinized by the academic community, and we now understand that interior point algorithms are not just viable alternatives to the simplex algorithm, but that they do indeed outperform simplex based procedures on large problems.
The most prevalent interior point algorithms are called path-following interior point algorithms, and these algorithms follow an infinitely smooth curve, called the central path, towards optimality. Our succinct development of the central path is adequate for our purposes, but interested readers are directed to the three texts of Roos, Terlaky, and Vial [23], Wright [27], and Ye [28] for a complete development. The central path is constructed by replacing the complementarity constraint in (3) with

Xs = µe, (4)
where X is the diagonal matrix of x, µ is positive, and e is the vector of ones. Notice that this constraint requires an x and a (y, s) such that x > 0 and s > 0, and hence, it requires that the primal and dual strict interiors be nonempty, i.e., P^o(t) ≠ ∅ and D^o(t) ≠ ∅. Because we are interested in the solutions provided by path-following interior point algorithms, we make the following assumption.
Assumption 1 For sufficiently large t ∈ IR, the strict interiors of the primal and dual feasible regions are nonempty.
Assumption 1 is equivalent to assuming that the primal and dual optimal sets are bounded for large t [27], and without loss in generality we assume throughout that t is large enough to satisfy this assumption. The x and s components of a solution to the system (1), (2), and (4) are unique and are denoted by x(µ, t) and s(µ, t) (see any of [21, 23, 27, 28]). The reason that y is not guaranteed to be unique is that y and s are not guaranteed to be related in a one-to-one fashion, i.e., A(t) is not guaranteed to have full row rank. To overcome this difficulty, we set y(µ, t) = (A^T(t))^+ (c(t) − s(µ, t)), where (A^T(t))^+ is the Moore-Penrose pseudo inverse of A^T(t). We make the following naming conventions for a fixed t.
The central path at time t: {(x(µ, t), y(µ, t), s(µ, t)) : µ > 0}
The primal central path at time t: {x(µ, t) : µ > 0}
The dual central path at time t: {(y(µ, t), s(µ, t)) : µ > 0}
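The recovery of y from s through the pseudo inverse can be checked numerically. The following sketch uses illustrative matrix values (not from the paper), with numpy's pinv playing the role of (A^T(t))^+:

```python
import numpy as np

# Illustrative data (not from the paper): A has linearly dependent rows,
# so A^T lacks full column rank and y is not uniquely determined by
# A^T y + s = c.
A = np.array([[1.0, 0.0, 1.0],
              [2.0, 0.0, 2.0]])   # second row is twice the first
c = np.array([3.0, 1.0, 3.0])
s = np.array([0.0, 1.0, 0.0])     # a hypothetical dual slack

# y(mu, t) = (A^T(t))^+ (c(t) - s(mu, t)): the minimum-norm solution
# of A^T y = c - s.
y = np.linalg.pinv(A.T) @ (c - s)

# When c - s lies in col(A^T), the dual equation is recovered exactly.
assert np.allclose(A.T @ y + s, c)
```

Here any y with y1 + 2y2 = 3 satisfies the dual equation; the pseudo inverse picks out the single minimum-norm representative y = (0.6, 1.2), which is why the convention yields a well-defined y(µ, t).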
The central path has a unique limit, called the center solution, which is in the strict interior of the optimal set. Denoting this limit by (x∗(t), y∗(t), s∗(t)), we have for sufficiently large t that

lim_{µ↓0} x(µ, t) = x∗(t) ∈ P∗(t), and
lim_{µ↓0} (y(µ, t), s(µ, t)) = (y∗(t), s∗(t)) ∈ D∗(t).
Unlike a basic optimal solution, the analytic center solution is always strictly complementary, meaning that (x∗(t))^T s∗(t) = 0 and x∗(t) + s∗(t) > 0. (An early result due to Goldman and Tucker guarantees that every solvable linear program has such a solution [7].) Any strictly complementary solution induces the optimal partition, which for sufficiently large t is defined by

B(t) = {i : x∗_i(t) > 0}, and
N(t) = {1, 2, 3, ..., n} \ B(t).
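Given any strictly complementary pair, the partition can be read off componentwise; a minimal sketch with hypothetical numbers (the function name and tolerance are ours, not the paper's):

```python
import numpy as np

def optimal_partition(x_star, s_star, tol=1e-8):
    """Read the optimal partition (B|N) off a strictly complementary
    primal-dual pair: B collects indices with x*_i > 0, N the rest.
    (Indices are 0-based here; the paper uses 1-based sets.)"""
    B = {i for i, xi in enumerate(x_star) if xi > tol}
    N = set(range(len(x_star))) - B
    # Strict complementarity: every index outside B has s*_i > 0.
    assert all(s_star[i] > tol for i in N)
    return B, N

# Hypothetical strictly complementary pair for a 4-variable problem.
x = np.array([0.5, 0.0, 1.2, 0.0])
s = np.array([0.0, 2.0, 0.0, 0.3])
B, N = optimal_partition(x, s)
```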
The set B(t) indicates the collection of primal variables allowed to be positive at optimality, and the set N(t) indicates the collection of primal variables that are zero in every optimal solution. The roles of B(t) and N(t) are reversed for the dual problem, so N(t) indexes the dual slack variables allowed to be positive at optimality, and B(t) indicates the collection of dual slack variables forced to be zero at optimality. Allowing a set subscript on a vector (matrix) to be the subvector (submatrix) corresponding with the components (columns) indexed by the set, we have that the optimal partition characterizes the optimal sets as follows,
P∗(t) = {x ∈ P(t) : x_{N(t)} = 0}
      = {x : A_{B(t)}(t) x_{B(t)} = b(t), x_{B(t)} ≥ 0, x_{N(t)} = 0}, (5)

and

D∗(t) = {(y, s) ∈ D(t) : s_{B(t)} = 0}
      = {(y, s) : A_{B(t)}^T(t) y = c_{B(t)}(t), A_{N(t)}^T(t) y + s_{N(t)} = c_{N(t)}(t), s_{N(t)} ≥ 0}. (6)

The strict interiors of the optimal sets are
(P∗(t))^o = {x ∈ P∗(t) : x_{B(t)} > 0}, and
(D∗(t))^o = {(y, s) ∈ D∗(t) : s_{N(t)} > 0}.

The primal center solution is the analytic center of P∗(t), and the dual center solution is the analytic center of D∗(t). This means that x∗(t) is the unique solution to
max{ Σ_{i∈B(t)} ln x_i : A_{B(t)}(t) x_{B(t)} = b(t), x_{B(t)} > 0, x_{N(t)} = 0 }. (7)

The necessary and sufficient Lagrange conditions for the mathematical program in (7) are the existence of a ρ and a γ such that

A_{B(t)}(t) x_{B(t)} = b(t), A_{B(t)}^T(t) ρ = γ, X_{B(t)} γ = e. (8)

If A_{B(t)}(t) does not have full row rank, the correspondence between ρ and γ is not one-to-one. Subsequently, ρ is unique only if A_{B(t)}(t) has full row rank. We later use the fact that A_{B(t)}(t) and b(t) could have been replaced in (7) by a submatrix of A_{B(t)}(t) having full row rank and a corresponding subvector of b(t), i.e., via row reduction. If such a substitution were undertaken, we have that the solution to (8) is unique and that x∗_{B(t)}(t) remains uniquely optimal (but γ and ρ are different). Similar conditions are available for the dual center solution.
Our goal is to revisit the topics first investigated by Jeroslow, but instead of dealing with basic optimal solutions, we deal with the optimal partition and the center solution. We note that our approach is more general for the following two reasons. First, if LP(t) and LD(t) have unique solutions for sufficiently large t, the center solution is basic. Since we show in Section 2 that the center solution stabilizes, our results include the case of a unique optimal basis, i.e., our results reduce to Jeroslow's results when the optimal solution is unique for all sufficiently large t. Second, our analysis is more general because it does not require that the data be rational in t (asymptotic linear programs in the literature have been built with rational functions [15, 16] and linear functions [2, 3, 4, 29]). In fact, the only restriction made on A(t), b(t), and c(t) is that they adhere to Assumption 2.
Assumption 2 We assume that the triple (A(t), b(t), c(t)) is well-behaved, meaning that there exists a time T, such that for t ≥ T, the functions A(t), b(t), and c(t) are continuous and have the property that the determinants of all square submatrices of

[ A(t)  0       b(t) ]
[ 0     A^T(t)  c(t) ]

are either constant or have no roots.
For example, if (A(t), b(t), c(t)) is rational, the determinants of the square submatrices are rational and Assumption 2 is satisfied. However, the class of functions with which we deal is substantially larger than the set of rational functions.

We are interested in properties that reach a steady state or stabilize as time attains sufficiently large values. One of the main results of this paper shows that there exists a time T, such that for all t ≥ T, the optimal partition stabilizes. In other words, we show that there exists a time T, such that the components of an optimal solution required to be zero at T are precisely the decision variables that must be zero for each t ≥ T. Hence, the collection of variables that must be zero in an optimal solution stabilizes.
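For rational data, each square-submatrix determinant in Assumption 2 is a rational function of t, hence constant or with finitely many real roots, so a suitable T can always be taken beyond the largest root. A numpy sketch for one illustrative determinant (the matrix behind it is ours, not from the paper):

```python
import numpy as np

# Illustrative determinant of a square submatrix of the stacked data
# matrix: det M(t) = 1 - t^2, written as coefficients of -t^2 + 0*t + 1.
det_coeffs = [-1.0, 0.0, 1.0]
roots = np.roots(det_coeffs)
real_roots = roots[np.isreal(roots)].real

# Any T beyond the largest real root gives "no roots for t >= T".
T = real_roots.max() + 1.0
assert all(abs(np.polyval(det_coeffs, t)) > 0 for t in (T, T + 5, T + 50))
```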
The paper proceeds as follows. In Section 2 we present a simple argument showing that the optimal partition stabilizes. Using this result, we develop some analytic properties in Section 3. In Section 4 we show that the results of Section 2 have economic implications by extending a famous economics result called The Nonsubstitution Theorem. Conclusions and directions for future research are located in Section 5.
Some brief notes on notation are warranted before we begin our development. A superscript + on a matrix indicates the Moore-Penrose pseudo inverse (a good reference is Campbell and Meyer [5]). Capitalizing a vector variable forms a diagonal matrix whose main diagonal is comprised of the elements of the vector. So, if x and γ are vectors, X = diag(x1, x2, ..., xn) and Γ = diag(γ1, γ2, ..., γn). The rank, column space, and null space of a matrix A are denoted rank(A), col(A), and null(A), respectively. The determinant of the matrix A is det(A). The collection of real valued functions having n continuous derivatives is denoted C^n, and we use the standard notation that C^0 is the set of continuous functions. For notational ease, we say that the matrix function M(t) is in C^n if every component function of M(t) is in C^n. Other notation is standard within the mathematical programming community and may be found in the Mathematical Programming Glossary [8].
2 The Asymptotic Optimal Partition

The main objective of this section is to establish that the optimal partition stabilizes, and we define the asymptotic optimal partition to be the optimal partition that attains a steady-state. The following example clarifies our objectives.
Example 1 Consider data (A(t), b(t), c(t)) in which c(t) is defined in terms of 1/t and tan−1(t). Let ˆx(t) be an optimal solution at time t. Then, we have that the components forced to be zero at optimality change with every solution to tan(t) = 1 + 1/e^t. Since this equation has an unbounded sequence of solutions, the desired stability does not exist. Notice that for this c, we have ||c(t)|| = 1/t, which is monotonically decreasing. Hence, component functions that provide monotonic norms are not sufficient. We also point out that the optimal partition exists for t = ∞ (assuming t is in IR∗ = IR ∪ {∞}). In this case we have that A(∞) = [1, 1], b(∞) = (1), and c(∞) = (0, 0)^T, which implies that (B(∞)|N(∞)) = ({1, 2}|∅). We mention this to distinguish the difference between behavior at ∞, which we are not investigating, and asymptotic behavior, which we are investigating. In this last situation we have that the optimal partition does not stabilize because for every t1 we can find a larger t2 such that the optimal partitions are different. However, the partition does exist for t = ∞.
Let {(B1|N1), (B2|N2), ..., (B_{2^n}|N_{2^n})} be all possible two set partitions of {1, 2, ..., n}. For any fixed time, one of these partitions is the optimal partition for LP(t). We relate t to a partition by defining φ(t) : IR → {1, 2, ..., 2^n}, such that the optimal partition of LP(t) is (B_{φ(t)}, N_{φ(t)}). We note that φ is well defined because the optimal partition is unique. The goal of this section may now be stated as showing that there exists T such that φ(t) is constant for t ≥ T.
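The indexing of the 2^n partitions can be made concrete with bitmasks; a small illustrative sketch (the indexing convention is ours, not necessarily the paper's):

```python
def two_set_partitions(n):
    """Enumerate all 2^n ordered partitions (B_j | N_j) of {1, ..., n}
    via bitmasks, mirroring the indexing behind the map phi(t)."""
    parts = []
    for mask in range(2 ** n):
        B = {i + 1 for i in range(n) if mask & (1 << i)}
        N = set(range(1, n + 1)) - B
        parts.append((B, N))
    return parts

parts = two_set_partitions(3)   # 8 partitions for n = 3
```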
For j = 1, 2, ..., 2^n, let vj = (v1^T, v2^T, v3^T)^T be partitioned as (x_{Bj}^T, y^T, s_{Nj}^T)^T. Define

Hj(t) = [ A_{Bj}(t)   0            0 ]
        [ 0           A_{Bj}^T(t)  0 ]
        [ 0           A_{Nj}^T(t)  I ]

and hj(t) = (b(t)^T, c_{Bj}(t)^T, c_{Nj}(t)^T)^T, so that Hj(t)vj = hj(t) collects the equality constraints in (5) and (6). We say that vj is sufficiently positive, written vj >| 0, if v1 > 0 and v3 > 0. Observe that ˆv_{φ(t)} = (ˆv1^T, ˆv2^T, ˆv3^T)^T relates to the optimal sets through (5) and (6).
Lemma 1 Let M(t) be a matrix function whose component functions have the property that there exists a time T, such that for all t ≥ T, the determinants of all square submatrices are either constant or have no roots. Then, rank(M(t)) stabilizes.

Proof: Let T be such that for all t ≥ T, the determinants of all square submatrices of M(t) have either become constant or have no roots. Let S(T) be a maximal square submatrix of M(T) with nonzero determinant. Then, all larger square submatrices have a determinant of zero for t ≥ T. Since det(S(t)) ≠ 0 for t ≥ T, we have that rank(M(t)) = rank(S(t)) for t ≥ T.
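Lemma 1 can be observed numerically: once t passes the last root of the relevant subdeterminants, the rank no longer changes. A sketch with an illustrative M(t) (ours, not the paper's):

```python
import numpy as np

def M(t):
    # Illustrative matrix function: det M(t) = t/3 - 1 has its only
    # root at t = 3, so the rank stabilizes for all t > 3.
    return np.array([[1.0, 1.0],
                     [1.0, t / 3.0]])

assert np.linalg.matrix_rank(M(3.0)) == 1   # rank drops at the root
ranks = [np.linalg.matrix_rank(M(t)) for t in (4.0, 40.0, 4000.0)]
```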
The second lemma shows that the optimal partition remains constant over a neighborhood provided that hj(t) remains in the column space of Hj(t), and that the Moore-Penrose pseudo inverse of Hj(t) is continuous. The continuity of Hj^+(t) might appear self-serving, but as we shall see, this condition is tied closely to the rank of Hj(t), which is easier to deal with.

Lemma 2 Let t0 be large enough to satisfy Assumption 1, and set j = φ(t0). Let N be a neighborhood of t0 such that Hj^+(t) is continuous over N and that hj(t) ∈ col(Hj(t)) for t ∈ N. Then, the optimal partition is constant over some neighborhood about t0.
Proof: Let vj(t0) be a sufficiently positive solution to Hj(t0)vj = hj(t0). Then, vj(t0) = Hj^+(t0)hj(t0) + q(t0), where q(t0) ∈ null(Hj(t0)). Let

vj(t) = Hj^+(t)hj(t) + (I − Hj^+(t)Hj(t))(q(t0) + Hj^+(t0)hj(t0) − Hj^+(t)hj(t)).

The proof follows once we show that for t sufficiently close to t0, vj(t) is a sufficiently positive solution to Hj(t)vj = hj(t). First, since

(I − Hj^+(t)Hj(t))(q(t0) + Hj^+(t0)hj(t0) − Hj^+(t)hj(t)) ∈ null(Hj(t)),

we have

Hj(t)vj(t) = Hj(t)Hj^+(t)hj(t) = hj(t),

where the last equality follows because hj(t) ∈ col(Hj(t)). Second, because both Hj^+(t) and hj(t) are continuous at t0, Hj^+(t0)hj(t0) − Hj^+(t)hj(t) → 0 as t → t0. Hence, as t → t0,

(I − Hj^+(t)Hj(t))(q(t0) + Hj^+(t0)hj(t0) − Hj^+(t)hj(t)) → (I − Hj^+(t0)Hj(t0))q(t0) = q(t0),

where the last equality follows because q(t0) ∈ null(Hj(t0)). We now have that

vj(t) = Hj^+(t)hj(t) + (I − Hj^+(t)Hj(t))(q(t0) + Hj^+(t0)hj(t0) − Hj^+(t)hj(t)) → Hj^+(t0)hj(t0) + q(t0) >| 0,

which completes the proof.
Lemma 2 connects the local stability of the optimal partition with the continuity of Hj^+(t), and Lemma 3 shows that the Moore-Penrose pseudo inverse is continuous so long as rank is preserved. This result, together with Lemma 1, allows us to use the steady-state behavior of the rank of Hj(t) to show that the optimal partition stabilizes. A proof of the following result is found in [5].

Lemma 3 The matrix function Hj^+(t) is continuous at t0 if, and only if, rank(Hj(t0)) = rank(Hj(t)), for t sufficiently close to t0.
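Lemma 3's rank condition is easy to see numerically: the pseudo inverse jumps exactly where the rank drops. An illustrative sketch (the matrix is ours, not from the paper):

```python
import numpy as np

def H(eps):
    # Illustrative: rank(H(eps)) = 2 for eps != 0 but 1 at eps = 0.
    return np.array([[1.0, 0.0],
                     [0.0, eps]])

# As eps -> 0 the pseudo inverse blows up (its (2,2) entry is 1/eps),
# yet at eps = 0 it is bounded: pinv is discontinuous exactly where
# the rank changes, as Lemma 3 states.
near = np.linalg.pinv(H(1e-8))
at_zero = np.linalg.pinv(H(0.0))
```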
We are now ready to establish that the optimal partition of LP(t) and LD(t) stabilizes for sufficiently large t.

Theorem 1 Assume that (A(t), b(t), c(t)) satisfies Assumptions 1 and 2. Then, there exists a T, such that for all t ≥ T, (B(t)|N(t)) = (B_{φ(T)}|N_{φ(T)}).
Proof: We first note that Hj(t)vj = hj(t) has a solution if, and only if, rank(Hj(t)) = rank([Hj(t)|hj(t)]). From Assumption 2 and Lemma 1 we have that there is a T1 such that for all t ≥ T1 and all j ∈ {1, 2, ..., 2^n},

rank(Hj(T1)) = rank(Hj(t)) and
rank([Hj(T1)|hj(T1)]) = rank([Hj(t)|hj(t)]).

Assumption 1 implies that there exists T2 > T1 such that for t ≥ T2, there exists a sufficiently positive solution to H_{φ(t)}(t)v_{φ(t)} = h_{φ(t)}(t). Let T > T2. Then, rank(H_{φ(t)}(t)) is constant and h_{φ(t)}(t) ∈ col(H_{φ(t)}(t)), for t ≥ T. From Lemma 2 we have that there is an open neighborhood, N1, about T such that T2 ∉ N1 and (B(t)|N(t)) = (B(T)|N(T)), for t ∈ N1. Let

N2 = {T + ˆδ : (B(T + δ)|N(T + δ)) = (B(T)|N(T)), δ ∈ [0, ˆδ]}.

Again, from Lemma 2 we have that for any t ∈ N1 ∪ N2, there is an open neighborhood about t over which the optimal partition is stable, which means that N1 ∪ N2 is open. Now, let

ˆt = inf{t > T : (B(t)|N(t)) ≠ (B(T)|N(T))}.

Suppose for the sake of attaining a contradiction that ˆt < ∞. Since N1 ∪ N2 is open, we have that (B(T)|N(T)) ≠ (B(ˆt)|N(ˆt)). From Lemma 2 we know that there exists an open neighborhood, N3, about ˆt such that (B(t)|N(t)) = (B(ˆt)|N(ˆt)) for t ∈ N3. However, (N1 ∪ N2) ∩ N3 ≠ ∅, and for any t ∈ (N1 ∪ N2) ∩ N3 we have the contradiction that

(B(T)|N(T)) = (B(t)|N(t)) = (B(ˆt)|N(ˆt)).

Hence, (B(t)|N(t)) = (B(T)|N(T)) for all t ≥ T.
Theorem 1 shows that the optimal partition stabilizes, and this result allows us to make the following definitions.

Definition 1 Assuming the data functions adhere to Assumptions 1 and 2, we define the asymptotic optimal partition to be the unique partition for which there exists T such that (B(t)|N(t)) = (B(T)|N(T)), for all t ≥ T. We denote this partition by ( ¯B| ¯N ), and we set T to be a sufficiently large time so that (B(T)|N(T)) = ( ¯B| ¯N ).
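A direct (if crude) way to observe the steady state is to solve LP(t) on an increasing grid of times and record which variables are positive. The sketch below uses scipy's linprog on an illustrative family (not the paper's example); note that a generic solver returns some optimal solution, which reveals the partition here only because the optimum is unique:

```python
import numpy as np
from scipy.optimize import linprog

def partition_at(t, tol=1e-8):
    """Approximate the optimal partition of an illustrative LP(t):
        min (1/t) x1 + (2/t) x2  s.t.  x1 + x2 = 1, x >= 0.
    For every t > 0 the unique optimum is x = (1, 0), so B = {1}."""
    c = [1.0 / t, 2.0 / t]
    res = linprog(c, A_eq=[[1.0, 1.0]], b_eq=[1.0], bounds=(0, None))
    return frozenset(i + 1 for i, xi in enumerate(res.x) if xi > tol)

# phi(t) is already in steady state on this grid of times.
partitions = {partition_at(t) for t in (10.0, 100.0, 1000.0)}
```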
Definition 2 Under Assumptions 1 and 2, and for t ≥ T, the asymptotic center solution, x∗(t) = (x∗_ ¯B(t), x∗_ ¯N(t)) = (x∗_ ¯B(t), 0), is defined so that x∗_ ¯B(t) is the unique solution to

max{ Σ_{i∈ ¯B} ln x_i : A_ ¯B(t) x_ ¯B = b(t), x_ ¯B > 0 }.
In this section we have established, under mild assumptions, that the optimal partition attains a steady-state as time proceeds to infinity. This means that the collection of variables that are zero in every optimal solution becomes invariant for sufficiently large time. Using this information, we defined the asymptotic optimal partition and subsequently defined the asymptotic center solution. Properties of this unique solution are studied in the next section.
3 Analytic Properties of the Asymptotic Analytic Center

In this section we exploit the fact that the optimal partition stabilizes to attain analytic properties of the asymptotic center solution. For a fixed t ≥ T, the analytic properties of the central path and the center solution are well studied. For example, the elements of the central path are analytic functions of µ, b, and c, a fact first recognized by Sonnevend [25]. Differential properties of the central path with respect to µ are important for algorithm design and are found in [1, 10, 11, 13, 26, 30]. Analytic properties of the center solution with respect to b and c are studied in [13] and [14]. However, all of these results assume that the coefficient matrix is fixed, and the only papers that consider the more difficult situation of perturbing A are [6] and [9]. Since each of A, b, and c depend on t in the asymptotic linear program, the results of this section are significantly different than those in the literature. Because the results of this section are asymptotic, we assume for linguistic simplicity that t ≥ T throughout this section.
The main result of this section states that the asymptotic center solution inherits the analytic properties of A(t) and b(t). So, since both A(t) and b(t) are continuous, x∗(t) is continuous, and if A(t) and b(t) are differentiable, x∗(t) is differentiable. The proofs establishing the continuity and differentiability of x∗(t) are handled separately. The reason for the separate arguments is that the vehicle of proof for differentiability is the implicit function theorem, which is not applicable unless the data functions are themselves differentiable. The continuity of x∗(t) is proven through an adaptation of an argument in [6]. To explain this approach, we introduce some notation and generalize the definition of the analytic center. Let {U(t), u(t)} be matrix functions in IR^{m×n} × IR^m, and for each t, suppose that P(U(t), u(t)) defined by {x : U(t)x ≤ u(t)} is bounded. For x ∈ P(U(t), u(t)), define s = u(t) − U(t)x and let I = {i : s_i > 0 for some x ∈ P(U(t), u(t))}. The analytic center of P(U(t), u(t)) is x^c(U(t), u(t)) and is the unique solution to

max{ Σ_{i∈I} ln(s_i) : x ∈ P(U(t), u(t)) }.
The following small example illustrates the difficulty of dealing with a nonconstant coefficient matrix. In particular, it shows that the analytic center need not be continuous even if U(t) and u(t) are smooth.

For t ≠ 100 we have that I = {4}, but for t = 100, I = {3, 4}. It is easy to check that x^c(U(t), u(t)) = (0, 1), for all t ≠ 100 (in fact this is the only element in P(U(t), u(t))), but that x^c(U(100), u(100)) = (1/2, 1/2).
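The analytic center itself is computable by maximizing the log-barrier over the polytope. As a minimal sketch, we compute the analytic center of the unit box (an illustrative polytope, not the paper's example), whose center is (1/2, 1/2):

```python
import numpy as np
from scipy.optimize import minimize

# The analytic center of {x : Ux <= u} maximizes sum_i ln(s_i) over the
# slacks s = u - Ux.  Illustrative polytope: the unit box in IR^2.
U = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
u = np.array([0.0, 0.0, 1.0, 1.0])

def neg_log_barrier(x):
    s = u - U @ x
    return np.inf if np.any(s <= 0) else -np.sum(np.log(s))

# Start from any interior point; the barrier is strictly convex here.
res = minimize(neg_log_barrier, x0=np.array([0.3, 0.7]), method='Nelder-Mead')
```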
From this example we see that the analytic center is not necessarily continuous with respect to changes in the matrix coefficients. An important observation is that the first two constraints are implied equalities for t = 100, but that the first three constraints are implied for t ≠ 100. Moreover, notice that the rank of the matrix formed by the implied equalities is 2 for t ≠ 100 and 1 for t = 100. What the authors of [6] were able to show is that the analytic center is continuous with respect to matrix perturbations at t0, so long as the rank of the matrix formed by the implied equalities at t0 is constant over some sufficiently small neighborhood of t0. To state this precisely, we partition the rows of U(t) and u(t) at t = t0 as indicated,
U(t) = [ A_{t0}(t) ]        u(t) = [ a_{t0}(t) ]
       [ B_{t0}(t) ],              [ b_{t0}(t) ],

where A_{t0}(t0)x = a_{t0}(t0) for all x ∈ P(U(t0), u(t0)) and B_{t0}(t0)x < b_{t0}(t0) for some x ∈ P(U(t0), u(t0)), i.e., I indexes the rows of the submatrix B_{t0}(t). For example, consider {U(t), u(t)} from the previous example, and let t0 = 100. Then, the first two inequalities form the collection of implied equalities at t0, which means that

A_{t0}(t) = [ ...     ]                 [ 1  ]                    [ 0 ]
            [ 0   −1  ],   a_{t0}(t) =  [ −1 ],  and b_{t0}(t) =  [ 0 ]