$$E\big[(Y_t - X_t'\beta)^2\big] = E\big[\big(Y_t - \mu(X_t) + \mu(X_t) - X_t'\beta\big)^2\big] = E\big[(Y_t - \mu(X_t))^2\big] + E\big[(\mu(X_t) - X_t'\beta)^2\big].$$
The final equality follows from the fact that, for all $\beta$,
$$E\big[(Y_t - \mu(X_t))\big(\mu(X_t) - X_t'\beta\big)\big] = E\Big[E\big[(Y_t - \mu(X_t))\big(\mu(X_t) - X_t'\beta\big) \mid X_t\big]\Big] = E\Big[E\big[(Y_t - \mu(X_t)) \mid X_t\big]\big(\mu(X_t) - X_t'\beta\big)\Big] = 0,$$
because $E[(Y_t - \mu(X_t)) \mid X_t] = 0$. Thus,
$$E\big[(Y_t - X_t'\beta)^2\big] = E\big[(Y_t - \mu(X_t))^2\big] + E\big[(\mu(X_t) - X_t'\beta)^2\big] = \sigma_*^2 + \int \big(\mu(x) - x'\beta\big)^2\, dH(x), \tag{3}$$
where $dH$ denotes the joint density of $X_t$ and $\sigma_*^2$ denotes the "pure PMSE", $\sigma_*^2 \equiv E[(Y_t - \mu(X_t))^2]$.
From (3) we see that the PMSE can be decomposed into two components: the pure PMSE $\sigma_*^2$, associated with the best possible prediction (that based on $\mu$), and the approximation mean squared error (AMSE), $\int (\mu(x) - x'\beta)^2\, dH(x)$, for $x'\beta$ as an approximation to $\mu(x)$. The AMSE is weighted by $dH$, the joint density of $X_t$, so that the squared approximation error is more heavily weighted in regions where $X_t$ is likely to be observed and less heavily weighted in areas where $X_t$ is less likely to be observed. This weighting forces the optimal approximation to be better in more frequently observed regions of the distribution of $X_t$, at the cost of being less accurate in less frequently observed regions of that distribution.
It follows that to minimize PMSE it is necessary and sufficient to minimize AMSE. That is, because $\beta^*$ minimizes PMSE, it also satisfies
$$\beta^* = \arg\min_{\beta \in \mathbb{R}^k} \int \big(\mu(x) - x'\beta\big)^2\, dH(x).$$
This shows that $\beta^*$ is the vector delivering the best possible approximation of the form $x'\beta$ to the PMSE-best predictor $\mu(x)$ of $Y_t$ given $X_t = x$, where the approximation is best in the sense of AMSE. For brevity, we refer to this as the "optimal approximation property".
Note that AMSE is nonnegative. It is minimized at zero if and only if for some $\beta^o$, $\mu(x) = x'\beta^o$ (a.s.-$H$), that is, if and only if $\mathcal{L}$ is correctly specified. In this case, $\beta^* = \beta^o$.
An especially convenient property of $\beta^*$ is that it can be represented in closed form. The first order conditions for $\beta^*$ from problem (2) can be written as
$$E\big(X_t X_t'\big)\beta^* - E(X_t Y_t) = 0.$$
Define $M \equiv E(X_t X_t')$ and $L \equiv E(X_t Y_t)$. If $M$ is nonsingular, then we can solve for $\beta^*$ to obtain the desired closed form expression
$$\beta^* = M^{-1}L.$$
The optimal point forecast based on the linear model $\mathcal{L}$ given predictors $X_t$ is then given simply by
$$Y_t^* = l\big(X_t, \beta^*\big) = X_t'\beta^*.$$
In forecasting applications we typically have a sample of data that we view as representative of the underlying population distribution generating the data (the joint distribution of $Y_t$ and $X_t$), but the population distribution is itself unknown. Typically, we do not even know the expectations $M$ and $L$ required to compute $\beta^*$, so the optimal point forecast $Y_t^*$ is also unknown. Nevertheless, we can obtain a computationally convenient estimator of $\beta^*$ from the sample data using the "plug-in principle". That is, we replace the unknown $M$ and $L$ by sample analogs $\hat{M} \equiv n^{-1}\sum_{t=1}^n X_t X_t' = X'X/n$ and $\hat{L} \equiv n^{-1}\sum_{t=1}^n X_t Y_t = X'Y/n$, where $X$ is the $n \times k$ matrix with rows $X_t'$, $Y$ is the $n \times 1$ vector with elements $Y_t$, and $n$ is the number of sample observations available for estimation. This yields the estimator
$$\hat{\beta} \equiv \hat{M}^{-1}\hat{L},$$
which we immediately recognize to be the ordinary least squares (OLS) estimator.
To keep the scope of our discussion tightly focused on the more practical aspects of the subject at hand, we shall not pay close attention to technical conditions underlying the statistical properties of $\hat{\beta}$ or the other estimators we discuss, and we will not state formal theorems here. Nevertheless, any claimed properties of the methods discussed here can be established under mild regularity conditions relevant for practical applications. In particular, under conditions ensuring that the law of large numbers holds (i.e., $\hat{M} \to M$ a.s., $\hat{L} \to L$ a.s.), it follows that as $n \to \infty$, $\hat{\beta} \to \beta^*$ a.s., that is, $\hat{\beta}$ consistently estimates $\beta^*$. Asymptotic normality can also be straightforwardly established for $\hat{\beta}$ under conditions sufficient to ensure the applicability of a suitable central limit theorem. [See White (2001, Chapters 2–5) for treatment of these issues.]
For clarity and notational simplicity, we operate throughout with the implicit understanding that the underlying regularity conditions ensure that our data are generated by an essentially stationary process that has suitably controlled dependence. For cross-section or panel data, it suffices that the observations are independent and identically distributed (i.i.d.). In time series applications, stationarity is compatible with considerable dependence, so we implicitly permit only as much dependence as is compatible with the availability of suitable asymptotic distribution theory. Our discussion thus applies straightforwardly to unit root time-series processes after first differencing or other suitable transformations, such as those relevant for cointegrated processes. For simplicity, we leave explicit discussion of these cases aside here. Relaxing the implicit stationarity assumption to accommodate heterogeneity in the data generating process is straightforward, but the notation necessary to handle this relaxation is more cumbersome than is justified here.
Returning to our main focus, we can now define the point forecast based on the linear model $\mathcal{L}$ using $\hat{\beta}$ for an out-of-sample predictor vector, say $X_{n+1}$. This is computed simply as
$$\hat{Y}_{n+1} = X_{n+1}'\hat{\beta}.$$
We italicized "out-of-sample" just now to emphasize the fact that in applications, forecasts are usually constructed based on predictors $X_{n+1}$ not in the estimation sample, as the associated target variable ($Y_{n+1}$) is not available until after $X_{n+1}$ is observed, as we discussed at the outset. The point of the forecasting exercise is to reduce our uncertainty about the as yet unavailable $Y_{n+1}$.
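To make the plug-in construction concrete, here is a minimal numpy sketch of the OLS estimator $\hat{\beta} = \hat{M}^{-1}\hat{L}$ and the resulting out-of-sample forecast. The data-generating function `mu` and all sample values are hypothetical stand-ins; in practice $\mu$ is unknown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample: Y_t = mu(X_t) + noise, with the constant in X_t's first column.
n, k = 500, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
mu = lambda X: 1.0 + 0.5 * X[:, 1] - 0.25 * X[:, 2] ** 2   # unknown in practice
Y = mu(X) + rng.normal(scale=0.5, size=n)

# Plug-in principle: replace M = E(X_t X_t') and L = E(X_t Y_t) by sample analogs.
M_hat = X.T @ X / n
L_hat = X.T @ Y / n
beta_hat = np.linalg.solve(M_hat, L_hat)    # the OLS estimator M_hat^{-1} L_hat

# Out-of-sample point forecast for a new (hypothetical) predictor vector X_{n+1}.
X_next = np.array([1.0, 0.3, -1.2])
Y_hat_next = X_next @ beta_hat
```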
2.2 Nonlinearity
A nonlinear parametric model is generated from a nonlinear parameterization. For this, let $\ell$ be a finite integer and let the parameter space $\Theta$ be a subset of $\mathbb{R}^\ell$. Let $f$ be a function mapping $\mathbb{R}^k \times \Theta$ into $\mathbb{R}$. This generates the parametric model
$$\mathcal{N} \equiv \big\{ m : \mathbb{R}^k \to \mathbb{R} \mid m(x) = f(x, \theta),\ \theta \in \Theta \big\}.$$
The parameterization $f$ (equivalently, the parametric model $\mathcal{N}$) can be nonlinear in the predictors only, nonlinear in the parameters only, or nonlinear in both. Models that are nonlinear in the predictors are of particular interest here, so for convenience we call the forecasts arising from such models "nonlinear forecasts". For now, we keep our discussion at the general level and later pay more particular attention to the special cases.
Completely parallel to our discussion of linear models, we have that solving problem (1) with $\mathcal{M} = \mathcal{N}$, that is, solving
$$\min_{m \in \mathcal{N}} E\big[(Y_t - m(X_t))^2\big],$$
yields the optimal forecasting function $f(\cdot, \theta^*)$, where
$$\theta^* = \arg\min_{\theta \in \Theta} E\big[(Y_t - f(X_t, \theta))^2\big]. \tag{4}$$
Here $\theta^*$ is the PMSE-optimal coefficient vector. This delivers not only the best forecast for $Y_t$ given $X_t$ based on the nonlinear model $\mathcal{N}$, but also the optimal nonlinear approximation to $\mu$ [see, e.g., White (1981)]. Now we have
$$\theta^* = \arg\min_{\theta \in \Theta} \int \big(\mu(x) - f(x, \theta)\big)^2\, dH(x).$$
The demonstration is completely parallel to that for $\beta^*$, simply replacing $x'\beta$ with $f(x, \theta)$. Now $\theta^*$ is the vector delivering the best possible approximation of the form $f(x, \theta)$ to the PMSE-best predictor $\mu(x)$ of $Y_t$ given $X_t = x$, where, as before, the approximation is best in the sense of AMSE, where the weight is again $dH$, the density of the $X$'s.
The optimal point forecast based on the nonlinear model $\mathcal{N}$ given predictors $X_t$ is thus given explicitly by
$$Y_t^* = f\big(X_t, \theta^*\big).$$
The advantage of using a nonlinear model $\mathcal{N}$ is that nonlinearity in the predictors can afford greater flexibility and thus, in principle, greater forecast accuracy. Provided the nonlinear model nests the linear model (i.e., $\mathcal{L} \subset \mathcal{N}$), it follows that
$$\min_{m \in \mathcal{N}} E\big[(Y_t - m(X_t))^2\big] \le \min_{m \in \mathcal{L}} E\big[(Y_t - m(X_t))^2\big],$$
that is, the best PMSE for the nonlinear model is always at least as good as the best PMSE for the linear model. (The same relation also necessarily holds for AMSE.)
A simple means of ensuring that $\mathcal{N}$ nests $\mathcal{L}$ is to include a linear component in $f$, for example, by specifying
$$f(x, \theta) = x'\alpha + g(x, \beta),$$
where $g$ is some function nonlinear in the predictors.
Against the advantage of theoretically better forecast accuracy, using a nonlinear model has a number of potentially serious disadvantages relative to linear models: (1) the associated estimators can be much more difficult to compute; (2) nonlinear models can easily overfit the sample data, leading to inferior performance in practice; and (3) the resulting forecasts may appear more difficult to interpret. It follows that the more appealing nonlinear methods will be those that retain the advantage of flexibility but that mitigate or eliminate these disadvantages relative to linear models. We now discuss considerations involved in constructing forecasts with these properties.
3 Linear, nonlinear, and highly nonlinear approximation
When a parameterization is nonlinear in the parameters, there generally does not exist a closed form expression for the PMSE-optimal coefficient vector $\theta^*$. One can nevertheless apply the plug-in principle in such cases to construct a potentially useful estimator $\hat{\theta}$ by solving the sample analog of the optimization problem (4) defining $\theta^*$, which yields
$$\hat{\theta} \equiv \arg\min_{\theta \in \Theta} \frac{1}{n}\sum_{t=1}^n \big(Y_t - f(X_t, \theta)\big)^2.$$
The point forecast based on the nonlinear model $\mathcal{N}$ using $\hat{\theta}$ for an out-of-sample predictor vector $X_{n+1}$ is computed simply as
$$\hat{Y}_{n+1} = f\big(X_{n+1}, \hat{\theta}\big).$$
The challenge posed by attempting to use $\hat{\theta}$ is that its computation generally requires an iterative algorithm that may require considerable fine-tuning and that may or may not behave well, in that the algorithm may or may not converge, and, even with considerable effort, the algorithm may well converge to a local optimum instead of to the desired global optimum. These are the computational difficulties alluded to above.
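As a concrete illustration of the iterative approach, here is a sketch of computing $\hat{\theta}$ with a general-purpose optimizer. The logistic specification of $f$, the simulated data, and the starting value are all illustrative assumptions; as just noted, convergence to the global optimum is not guaranteed, so trying several starting values is common practice.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Illustrative parameterization nonlinear in theta:
# f(x, theta) = theta0 + theta1 / (1 + exp(-theta2 * x)).
def f(x, theta):
    return theta[0] + theta[1] / (1.0 + np.exp(-theta[2] * x))

x = rng.normal(size=400)
y = f(x, np.array([0.5, 2.0, 1.5])) + rng.normal(scale=0.3, size=400)

# Sample analog of problem (4): average squared forecast error as a function of theta.
def sample_mse(theta):
    return np.mean((y - f(x, theta)) ** 2)

# Iterative minimization from one arbitrary starting point; the result may be a
# local optimum, and different starting points can give different answers.
theta_hat = minimize(sample_mse, x0=np.array([0.0, 1.0, 0.5]), method="BFGS").x
```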
As the advantage of flexibility arises entirely from nonlinearity in the predictors and the computational challenges arise entirely from nonlinearity in the parameters, it makes sense to restrict attention to parameterizations that are “series functions” of the form
$$f(x, \theta) = x'\alpha + \sum_{j=1}^q \psi_j(x)\beta_j, \tag{5}$$
where $q$ is some finite integer and the "basis functions" $\psi_j$ are nonlinear functions of $x$. This provides a parameterization nonlinear in $x$, but linear in the parameters $\theta \equiv (\alpha', \beta')'$, $\beta \equiv (\beta_1, \ldots, \beta_q)'$, thus delivering flexibility while simultaneously eliminating the computational challenges arising from nonlinearity in the parameters. The method of OLS can now deliver the desired sample estimator $\hat{\theta}$ for $\theta^*$.
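Because (5) is linear in $\theta$, estimation reduces to OLS on a regressor matrix augmented with the basis-function terms. A minimal sketch, where the choice $\psi_j(x) = x^{j+1}$ (simple polynomial terms) and the simulated data are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical scalar-predictor sample.
n = 400
x = rng.uniform(-2.0, 2.0, size=n)
y = np.sin(2.0 * x) + 0.3 * x + rng.normal(scale=0.2, size=n)

# Series function (5): f(x, theta) = alpha0 + alpha1 * x + sum_j psi_j(x) beta_j,
# here with illustrative polynomial basis functions psi_j(x) = x^(j+1), j = 1..q.
q = 4
Z = np.column_stack([np.ones(n), x] + [x ** (j + 1) for j in range(1, q + 1)])

# Linear in the parameters, so OLS delivers theta_hat = (alpha_hat, beta_hat).
theta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
```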
Restricting attention to parameterizations having the form (5) thus reduces the problem of choosing a forecasting model to the problem of jointly choosing the basis functions $\psi_j$ and their number, $q$. With the problem framed in this way, an important next question is, "What choices of basis functions are available, and when should one prefer one choice to another?"
There is a vast range of possible choices of basis functions; below we mention some of the leading possibilities. Choosing among these depends not only on the properties of the basis functions, but also on one's prior knowledge about $\mu$, and one's empirical knowledge about $\mu$, that is, the data.
Certain broad requirements help narrow the field. First, given that our objective is to obtain as good an approximation to $\mu$ as possible, a necessary property for any choice of basis functions is that this choice should yield an increasingly better approximation to $\mu$ as $q$ increases. Formally, this is the requirement that the span (the set of all linear combinations) of the basis functions $\{\psi_j,\ j = 1, 2, \ldots\}$ should be dense in the function space inhabited by $\mu$. Here, this space is $\mathcal{M} \equiv L_2(\mathbb{R}^{k-1}, dH)$, the separable Hilbert space of functions $m$ on $\mathbb{R}^{k-1}$ for which $\int m(x)^2\, dH(x)$ is finite. (Recall that $x$ contains the constant unity, so there are only $k - 1$ variables.) Second, given that we are fundamentally constrained by the amount of data available, it is also necessary that the basis functions should deliver a good approximation using as small a value for $q$ as possible.
Although the denseness requirement narrows the field somewhat, there is still an overwhelming variety of choices for $\{\psi_j\}$ that have this property. Familiar examples are algebraic polynomials in $x$ of degree dependent on $j$, and in particular the related special polynomials, such as Bernstein, Chebyshev, or Hermite polynomials; and trigonometric polynomials in $x$, that is, sines and cosines of linear combinations of $x$ corresponding to pre-specified (multi-)frequencies, delivering Fourier series. Further, one can combine different families, as in Gallant's (1981) flexible Fourier form, which includes polynomials of first and second order, together with sine and cosine terms for a range of frequencies.
Important and powerful extensions of the algebraic polynomials are the classes of piecewise polynomials and splines [e.g., Wahba and Wold (1975), Wahba (1990)]. Well-known types of splines are linear splines, cubic splines, and B-splines.
The basis functions for the examples given so far are either orthogonal or can be made so with straightforward modifications. Orthogonality is not a necessary requirement, however. A particularly powerful class of basis functions that need not be orthogonal is the class of "wavelets", introduced by Daubechies (1988, 1992). These have the form $\psi_j(x) = \Psi(A_j(x))$, where $\Psi$ is a "mother wavelet", a given function satisfying certain specific conditions, and $A_j(x)$ is an affine function of $x$ that shifts and rescales $x$ according to a specified dyadic schedule analogous to the frequencies of Fourier analysis. For a treatment of wavelets from an economics perspective, see Gencay, Selchuk and Whitcher (2001).
Recall that a vector space is linear if (among other things) for any two elements $f$ and $g$ of the space, all linear combinations $af + bg$ also belong to the space, where $a$ and $b$ are any real numbers. All of the basis functions mentioned so far define spaces of functions $g_q(x, \beta) \equiv \sum_{j=1}^q \psi_j(x)\beta_j$ that are linear in this sense, as taking a linear combination of two elements of this space gives
$$a\left(\sum_{j=1}^q \psi_j(x)\beta_j\right) + b\left(\sum_{j=1}^q \psi_j(x)\gamma_j\right) = \sum_{j=1}^q \psi_j(x)\,[a\beta_j + b\gamma_j],$$
which is again a linear combination of the first $q$ of the $\psi_j$'s.
Significantly, the second requirement mentioned above, namely that the basis should deliver a good approximation using as small a value for $q$ as possible, suggests that we might obtain a better approximation by not restricting ourselves to the functions $g_q(x, \beta)$, which force the inclusion of the $\psi_j$'s in a strict order (e.g., zero order polynomials first, followed by first order polynomials, followed by second order polynomials, and so on), but instead consider functions of the form
$$g_\Lambda(x, \beta) \equiv \sum_{j \in \Lambda} \psi_j(x)\beta_j,$$
where $\Lambda$ is a set of natural numbers ("indexes") containing at most $q$ elements, not necessarily the integers $1, \ldots, q$. The functions $g_\Lambda$ are more flexible than the functions $g_q$, in that $g_\Lambda$ admits $g_q$ as a special case. The key idea is that by suitably choosing which basis functions to use in any given instance, one may obtain a better approximation for a given number of terms $q$.
The functions $g_\Lambda$ define a nonlinear space of functions, in that linear combinations of the form $ag_\Lambda + bg_K$, where $K$ also has $q$ elements, generally have up to $2q$ terms, and are therefore not contained in the space of $q$-term linear combinations of the $\psi_j$'s. Consequently, functions of the form $g_\Lambda$ are called nonlinear approximations in the approximation theory literature. Note that the nonlinearity referred to here is the nonlinearity of the function spaces defined by the functions $g_\Lambda$. For given $\Lambda$, these functions are still linear in the parameters $\beta$, which preserves their appeal for us here.
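The gain from choosing $\Lambda$ rather than taking the first $q$ terms can be seen in a small sketch. Everything here is illustrative: an orthonormal cosine basis evaluated on a grid, a target $\mu$ with a kink, and inner products computed by crude quadrature; $g_\Lambda$ keeps the $q$ terms with the largest coefficients instead of the first $q$.

```python
import numpy as np

# Illustrative orthonormal cosine basis on [0, 1], evaluated on a fine grid.
grid = np.linspace(0.0, 1.0, 2001)
J = 60
Psi = np.array([np.sqrt(2.0) * np.cos(np.pi * j * grid) for j in range(1, J + 1)])

mu = np.abs(grid - 0.3)                       # a target with a kink at 0.3
coef = Psi @ mu / len(grid)                   # crude quadrature for <psi_j, mu>

q = 10
# Linear approximation g_q: the first q basis terms in their natural order.
g_q = coef[:q] @ Psi[:q]
# Nonlinear approximation g_Lambda: the q terms with the largest |coefficients|.
Lam = np.argsort(-np.abs(coef))[:q]
g_Lam = coef[Lam] @ Psi[Lam]

# Under orthonormality the best q-term choice is never worse in squared error.
print(np.mean((mu - g_q) ** 2), np.mean((mu - g_Lam) ** 2))
```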
Recent developments in the approximation theory literature have provided considerable insight into the question of which functions are better approximated using linear approximation (functions of the form $g_q$), and which functions are better approximated using nonlinear approximation (functions of the form $g_\Lambda$). The survey of DeVore (1998) is especially comprehensive and deep, providing a rich catalog of results permitting a comparison of these approaches. Given sufficient a priori knowledge about the function of interest, $\mu$, DeVore's results may help one decide which approach to take.
To gain some of the flavor of the issues and results treated by DeVore (1998) that are relevant in the present context, consider the following approximation root mean squared errors:
$$\sigma_q(\mu, \psi) \equiv \inf_\beta \left[\int \big(\mu(x) - g_q(x, \beta)\big)^2\, dH(x)\right]^{1/2},$$
$$\sigma_\Lambda(\mu, \psi) \equiv \inf_{\Lambda, \beta} \left[\int \big(\mu(x) - g_\Lambda(x, \beta)\big)^2\, dH(x)\right]^{1/2}.$$
These are, for linear and nonlinear approximation respectively, the best possible approximation root mean squared errors (RMSEs) using $q$ $\psi_j$'s. (For simplicity, we are ignoring the linear term $x'\alpha$ previously made explicit; alternatively, imagine we have absorbed it into $\mu$.) DeVore devotes primary attention to one of the central issues of approximation theory, the "degree of approximation" question: "Given a positive real number $a$, for what functions $\mu$ does the degree of approximation (as measured here by the above approximation RMSEs) behave as $O(q^{-a})$?" Clearly, the larger is $a$, the more quickly the approximation improves with $q$.
In general, the answer to the degree of approximation question depends on the smoothness and dimensionality ($k - 1$) of $\mu$, quantified in precisely the right ways. For linear approximation, the smoothness conditions typically involve the existence of a number of derivatives of $\mu$ and the finiteness of their moments (e.g., second moments), such that more smoothness and smaller dimensionality yield quicker approximation. The answer also depends on the particular choice of the $\psi_j$'s; suffice it to say that the details can be quite involved.
In the nonlinear case, familiar notions of smoothness in terms of derivatives generally no longer provide the necessary guidance. To describe the smoothness notion relevant in this context, suppose for simplicity that $\{\psi_j\}$ forms an orthonormal basis for the Hilbert space in which $\mu$ lives. Then the optimal coefficients $\beta_j^*$ are given by
$$\beta_j^* = \int \psi_j(x)\mu(x)\, dH(x).$$
As DeVore (1998, p. 135) states, "smoothness for [nonlinear] approximation should be viewed as decay of the coefficients with respect to the basis [i.e., the $\beta_j^*$'s]" (emphasis added). In particular, let $\tau = 1/(a + 1/2)$. Then according to DeVore (1998, Theorem 4), $\sigma_\Lambda(\mu, \psi) = O(q^{-a})$ if and only if there exists a finite constant $M$ such that $\#\{j: |\beta_j^*| > z\} \le M^\tau z^{-\tau}$. For example, $\sigma_\Lambda(\mu, \psi) = O(q^{-1/2})$ if for some $M$ we have $\#\{j: |\beta_j^*| > z\} \le M z^{-1}$.
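The counting condition can be checked directly for any given coefficient sequence. A small sketch using a hypothetical sequence with $|\beta_j^*| = 1/j$, for which $\#\{j: |\beta_j^*| > z\} \approx z^{-1}$, matching the $\tau = 1$ (i.e., $a = 1/2$) case:

```python
import numpy as np

beta_star = 1.0 / np.arange(1, 100001)        # hypothetical coefficients, |beta*_j| = 1/j

def count_above(z):
    # The counting function #{j : |beta*_j| > z} in DeVore's condition.
    return int(np.sum(np.abs(beta_star) > z))

# With |beta*_j| = 1/j the count grows like 1/z, so tau = 1 and a = 1/2:
# the best q-term approximation error decays like O(q^{-1/2}).
for z in (0.1, 0.01, 0.001):
    print(z, count_above(z))                  # 9, 99, 999
```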
An important and striking aspect of this view of smoothness is that it is relative to the basis. A function that is not at all smooth with respect to one basis may be quite smooth with respect to another. Another striking feature of results of this sort is that the dimensionality of $\mu$ no longer plays an explicit role, seemingly suggesting that nonlinear approximation may somehow hold in abeyance the "curse of dimensionality" (the inability to well approximate functions in high-dimensional spaces without inordinate amounts of data). A more precise interpretation of this situation seems to be that smoothness with respect to the basis also incorporates dimensionality, such that a given decay rate for the optimal coefficients is a stronger condition in higher dimensions.
In some cases, theory alone can inform us about the choice of basis functions. For example, it turns out, as DeVore (1998, p. 106) discusses, that with respect to nonlinear approximation, rational polynomials have approximation properties essentially equivalent to those of piecewise polynomials. In this sense, there is nothing to gain or lose in selecting one of these bases over the other. In other cases, the helpfulness of the theory in choosing a basis depends on having quite specific knowledge about $\mu$, for example, that it is very smooth (in the familiar sense) in some places and very rough in others, or that it has singularities or discontinuities. For example, Dekel and Leviatan (2003) show that in this sense, wavelet approximations do not perform well in capturing singularities along curves, whereas nonlinear piecewise polynomial approximations do.
Usually, however, we economists have little prior knowledge about the familiar smoothness properties of $\mu$, let alone its smoothness with respect to any given basis. As a practical matter, then, it may make sense to consider a collection of different bases, and let the data guide us to the best choice. Such a collection of bases is called a library. An example is the wavelet packet library proposed by Coifman and Wickerhauser (1992).
Alternatively, one can choose the $\psi_j$'s from any suitable subset of the Hilbert space. Such a subset is called a dictionary; the idea is once again to let the data help decide which elements of the dictionary to select. Artificial neural networks (ANNs) are an example of a dictionary, generated by letting $\psi_j(x) = \Phi(x'\gamma_j)$ for a given "activation function" $\Phi$, such as the logistic cdf ($\Phi(z) = 1/(1 + \exp(-z))$), and with $\gamma_j$ any element of $\mathbb{R}^k$. For a discussion of artificial neural networks from an econometric perspective, see Kuan and White (1994). Trippi and Turban (1992) contains a collection of papers applying ANNs to economics and finance.
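A sketch of the ANN dictionary idea: draw a pool of candidate $\gamma_j$'s, form the dictionary terms $\psi_j(x) = \Phi(x'\gamma_j)$ with the logistic cdf, and estimate the $\beta_j$'s by OLS with the $\gamma_j$'s held fixed. Drawing the $\gamma_j$'s at random is an illustrative shortcut; data-driven selection of dictionary elements is taken up below.

```python
import numpy as np

rng = np.random.default_rng(3)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sample with a constant and two predictors.
n, k = 400, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = np.sin(X[:, 1]) * np.cos(X[:, 2]) + rng.normal(scale=0.1, size=n)

# ANN dictionary elements psi_j(x) = logistic(x' gamma_j), gamma_j in R^k,
# here drawn at random rather than selected by a data-driven rule.
q = 25
Gamma = rng.normal(size=(k, q))
Psi = logistic(X @ Gamma)                     # n x q matrix of dictionary terms

# With the gamma_j's fixed, the model is linear in beta, so OLS applies.
Z = np.column_stack([X, Psi])
coef_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
```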
Approximating a function $\mu$ using a library or dictionary is called highly nonlinear approximation, as not only is there the nonlinearity associated with choosing $q$ basis functions, but there is the further choice of the basis itself or of the elements of the dictionary. Section 8 of DeVore's (1998) comprehensive survey is devoted to a discussion of the so far somewhat fragmentary degree of approximation results for approximations of this sort. Nevertheless, some powerful results are available. Specifically, for sufficiently rich dictionaries $\mathcal{D}$ (e.g., artificial neural networks as above), DeVore and Temlyakov (1996) show [see DeVore (1998, Theorem 7)] that for $a \le 1/2$ and sufficiently smooth functions $\mu$,
$$\sigma_q(\mu, \mathcal{D}) \le C_a q^{-a},$$
Trang 9where C a is a constant quantifying the smoothness of μ relative to the dictionary, and,
analogous to the case of nonlinear approximation, we define
σ q (μ, D) ≡ inf
D,β
μ(x) − g D (x, β)2
dH (x)
1/2
,
g D (x, β)≡
ψ j ∈D
ψ j (x)β j ,
where $D$ is a $q$-element subset of $\mathcal{D}$. DeVore and Temlyakov's result generalizes an earlier result for $a = 1/2$ of Maurey [see Pisier (1980)]. Jones (1992) provides a "greedy algorithm" and a "relaxed greedy algorithm" achieving $a = 1/2$ for a specific dictionary and class of functions $\mu$, and DeVore (1998) discusses further related algorithms.
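A schematic version of a greedy step may help fix ideas: at each of $q$ iterations, select the dictionary element most correlated with the current residual, then refit on everything selected so far. This is a generic orthogonal-greedy sketch in the spirit of the algorithms cited, not Jones's exact procedure.

```python
import numpy as np

def greedy_select(D, y, q):
    """Orthogonal greedy selection: D is an n x J matrix whose columns are
    dictionary elements evaluated at the sample points, y is the target;
    returns the indices of the q selected elements."""
    selected = []
    residual = y.astype(float).copy()
    for _ in range(q):
        # Pick the element most correlated (in absolute value) with the residual.
        scores = np.abs(D.T @ residual)
        scores[selected] = -np.inf            # do not reselect an element
        selected.append(int(np.argmax(scores)))
        # Refit y on all selected elements and update the residual.
        coef, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)
        residual = y - D[:, selected] @ coef
    return selected
```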
The cases discussed so far by no means exhaust the possibilities. Among other notable choices for the $\psi_j$'s relevant in economics are radial basis functions [Powell (1987), Lendasse et al. (2003)] and ridgelets [Candes (1998, 1999a, 1999b, 2003)]. Radial basis functions arise by taking
$$\psi_j(x) = \Phi\big(p_2(x, \gamma_j)\big),$$
where $p_2(x, \gamma_j)$ is a polynomial of (at most) degree 2 in $x$ with coefficients $\gamma_j$, and $\Phi$ is typically taken to be such that, with the indicated choice of $p_2(x, \gamma_j)$, $\Phi(p_2(x, \gamma_j))$ is proportional to a density function. Standard radial basis functions treat the $\gamma_j$'s as free parameters, and restrict $p_2(x, \gamma_j)$ to have the form
$$p_2(x, \gamma_j) = -(x - \gamma_{1j})'\gamma_{2j}(x - \gamma_{1j})/2,$$
where $\gamma_j \equiv (\gamma_{1j}, \gamma_{2j})$, so that $\gamma_{1j}$ acts as a centering vector, and $\gamma_{2j}$ is a $k \times k$ symmetric positive semi-definite matrix acting to scale the departures of $x$ from $\gamma_{1j}$. A common choice for $\Phi$ is $\Phi = \exp$, which delivers $\Phi(p_2(x, \gamma_j))$ proportional to the multivariate normal density with mean $\gamma_{1j}$ and with $\gamma_{2j}$ a suitable generalized inverse of a given covariance matrix. Thus, standard radial basis functions have the form of a linear combination of multivariate densities, accommodating a mixture of densities as a special case. Treating the $\gamma_j$'s as free parameters, we may view the radial basis functions as a dictionary, as defined above.
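A sketch of a standard radial basis function with $\Phi = \exp$, i.e., a Gaussian bump centered at $\gamma_{1j}$ and scaled by $\gamma_{2j}$; the centers and scaling matrix below are hypothetical dictionary elements.

```python
import numpy as np

def gaussian_rbf(x, center, scale):
    """psi_j(x) = exp(p2(x, gamma_j)), with
    p2(x, gamma_j) = -(x - gamma_1j)' gamma_2j (x - gamma_1j) / 2."""
    d = x - center
    return np.exp(-0.5 * d @ scale @ d)

# Hypothetical dictionary: three centers sharing one symmetric psd scaling matrix.
centers = [np.array([0.0, 0.0]), np.array([1.0, -1.0]), np.array([-1.0, 0.5])]
scale = np.diag([2.0, 2.0])                   # gamma_2j

x = np.array([0.25, -0.5])
psi_values = np.array([gaussian_rbf(x, c, scale) for c in centers])
```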
Candes’s ridgelets can be thought of as a very carefully constructed special case of ANNs Ridgelets arise by taking
ψ j (x) = γ 1j −1/2
˜xγ 2j − γ0j/γ 1j
,
where˜x denotes the vector of nonconstant elements of x (i.e., x = (1, ˜x)), γ0jis real,
γ 1j > 0, and γ2j belongs to S k−2, the unit sphere inR k−1 The activation function
is taken to belong to the space of rapidly decreasing functions (Schwartz space, a subset of C∞) and to satisfy a specific admissibility property on its Fourier transform
[seeCandles (1999a, Definition 1)], essentially equivalent to the moment conditions
z j (z) dz = 0, j = 0, , k/2 − 1.
This condition ensures that $\Phi$ oscillates, has zero average value, zero average slope, etc. For example, $\Phi = D^h\phi$, the $h$th derivative of the standard normal density $\phi$, is readily verified to be admissible with $h = k/2$.
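The moment conditions are easy to verify numerically for this example. A sketch for $k = 4$, so $h = k/2 = 2$ and admissibility requires the $j = 0$ and $j = 1$ moments of $\Phi = \phi''$ to vanish; crude Riemann-sum quadrature on a wide grid suffices here.

```python
import numpy as np

# Second derivative of the standard normal density: phi''(z) = (z^2 - 1) phi(z).
def Phi(z):
    return (z ** 2 - 1.0) * np.exp(-z ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

z = np.linspace(-10.0, 10.0, 200001)
dz = z[1] - z[0]
for j in (0, 1):
    moment = np.sum(z ** j * Phi(z)) * dz
    print(j, moment)                          # both approximately zero
```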
The admissibility of the activation function has a number of concrete benefits, but the chief benefit for present purposes is that it leads to the explicit specification of a countable sequence $\{\gamma_j = (\gamma_{0j}, \gamma_{1j}, \gamma_{2j})\}$ such that any function $f$ square integrable on a compact set has an exact representation of the form
$$f(x) \equiv \sum_{j=1}^\infty \psi_j(x)\beta_j^*.$$
The representing coefficients $\beta_j^*$ are such that good approximations can be obtained using $g_q(x, \beta)$ or $g_\Lambda(x, \beta)$ as above. In this sense, the ridgelet dictionary that arises by letting the $\gamma_j$'s be free parameters (as in the usual ANN approach) can be reduced to a countable subset that delivers a basis with appealing properties.
As Candes (1999b) shows, ridgelets turn out to be optimal for representing otherwise smooth multivariate functions that may exhibit linear singularities, achieving a rate of approximation of $O(q^{-a})$ with $a = s/(k - 1)$, provided the $s$th derivatives of $f$ exist and are square integrable. This is in sharp contrast to Fourier series or wavelets, which can be badly behaved in the presence of singularities. Candes (2003) provides an extensive discussion of the properties of ridgelet regression estimators, and, in particular, certain shrinkage estimators based on thresholding coefficients from a ridgelet regression. (By thresholding is meant setting to zero estimated coefficients whose magnitude does not exceed some pre-specified value.) In particular, Candes (2003) discusses the superiority in multivariate contexts of ridgelet methods to kernel smoothing and wavelet thresholding methods.
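Hard thresholding as just described is simple to express; a generic sketch (not Candes's specific estimator), with hypothetical coefficient estimates:

```python
import numpy as np

def hard_threshold(coef, cutoff):
    """Set to zero every coefficient whose magnitude does not exceed the cutoff."""
    coef = np.asarray(coef, dtype=float)
    return np.where(np.abs(coef) > cutoff, coef, 0.0)

beta_hat = np.array([0.9, -0.04, 0.3, 0.01, -0.6])    # hypothetical estimates
beta_shrunk = hard_threshold(beta_hat, cutoff=0.05)    # -> [0.9, 0, 0.3, 0, -0.6]
```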
InDeVore’s (1998)survey, Candes’s papers, and the references cited there, the inter-ested reader can find a wealth of further material describing the approximation
prop-erties of a wide variety of different choices for the ψ j’s From a practical standpoint, however, these results do not yield hard and fast prescriptions about how to choose
the ψ j’s, especially in the circumstances commonly faced by economists, where one may have little prior information about the smoothness of the function of interest Nev-ertheless, certain helpful suggestions emerge Specifically:
(i) nonlinear approximations are an appealing alternative to linear approximations;
(ii) using a library or dictionary of basis functions may prove useful;
(iii) ANNs, and ridgelets in particular, may prove useful.
These suggestions are simply things to try. In any given instance, the data must be the final arbiter of how well any particular approach works. In the next section, we provide a concrete example of how these suggestions may be put into practice and how they interact with other practical concerns.