Chapter 6
Nonlinear Regression
6.1 Introduction
Up to this point, we have discussed only linear regression models. For each such model, the regression model consists of all DGPs for which the expectation of the dependent variable, conditional on the relevant information set, is linear in the parameters, and for which the error terms satisfy certain requirements, such as being IID. Since, as we saw in Section 1.3, the elements of an information set may themselves be nonlinear functions of the underlying explanatory variables, many types of nonlinearity can be handled within the framework of the linear regression model. However, many other types of nonlinearity cannot be handled within this framework. In order to deal with them, we often need to work with nonlinear regression models, in which the regression function is a nonlinear function of the parameters.
A typical nonlinear regression model can be written as

y_t = x_t(β) + u_t,   u_t ∼ IID(0, σ²),   t = 1, . . . , n,   (6.01)

where y_t is an observation on the dependent variable, and β is a k-vector of parameters to be estimated. The scalar regression function x_t(β) depends on β and, implicitly, on a number of explanatory variables. These explanatory variables, which may include lagged dependent variables, can differ from observation to observation. The regression function may therefore vary across observations because it depends on explanatory variables, but it can also occur because the functional form of the regression function actually changes over time. The number of explanatory variables on which x_t(β) depends need bear no particular relation to k, the number of parameters.
The error terms in (6.01) are specified to be IID. By this, we mean something very similar to, but not precisely the same as, the two conditions in (4.48). In order for the error terms to be identically distributed, the distribution of each error term, conditional not only on Ω_t but also on all the other error terms, should be the same for every t. In order for them to be independent, that distribution should not depend on the other error terms.
Another way to write the nonlinear regression model (6.01) is

y = x(β) + u,   u ∼ IID(0, σ²I),   (6.02)

where y and u are n-vectors with typical elements y_t and u_t, and x(β) is an n-vector with typical element x_t(β). The vector x(β) is the nonlinear analog of the vector Xβ in the linear case.
As a very simple example of a nonlinear regression model, consider the model (6.03). Many nonlinear regression models, like (6.03), can be expressed as linear regression models in which the parameters must satisfy one or more nonlinear restrictions.
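To make the point concrete, here is a hypothetical model of this type; it is an illustration only, and not necessarily the model labelled (6.03) above.

```latex
% Hypothetical illustration (assumed example, not necessarily the book's (6.03)):
y_t = \beta_1 + \beta_2 z_{t1} + \beta_1\beta_2 z_{t2} + u_t .
% Defining \gamma_1 \equiv \beta_1, \gamma_2 \equiv \beta_2, \gamma_3 \equiv \beta_1\beta_2
% turns this into the linear model
y_t = \gamma_1 + \gamma_2 z_{t1} + \gamma_3 z_{t2} + u_t ,
% subject to the single nonlinear restriction \gamma_3 = \gamma_1\gamma_2 .
```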
The Linear Regression Model with AR(1) Errors
We now consider a particularly important example of a nonlinear regression model that is also a linear regression model subject to nonlinear restrictions on the parameters. In Section 5.5, we briefly mentioned the phenomenon of serial correlation, in which nearby error terms in a regression model are (or appear to be) correlated. Serial correlation is very commonly encountered in applied work using time-series data, and many techniques for dealing with it have been proposed. One of the simplest and most popular ways of dealing with serial correlation is to assume that the error terms follow the first-order autoregressive, or AR(1), process

u_t = ρ u_{t−1} + ε_t,   ε_t ∼ IID(0, σ_ε²).   (6.04)

According to this model, the error at time t is equal to ρ times the error at time t − 1, plus a new error term, or innovation, ε_t, which is homoskedastic and independent of all past and future innovations. We see from (6.04) that, under this specification, part of the error at time t is the error at time t − 1,
shrunk somewhat toward zero and possibly changed in sign, and part is the innovation ε_t. We will have more to say about the AR(1) process, and other autoregressive processes, in Chapter 7. At present, we are concerned solely with the nonlinear regression model that results when the errors of a linear regression model are assumed to follow an AR(1) process.
If we combine (6.04) with the linear regression model

y_t = X_t β + u_t,

we obtain the nonlinear regression model

y_t = ρ y_{t−1} + X_t β − ρ X_{t−1} β + ε_t,   ε_t ∼ IID(0, σ_ε²).

Because the lagged dependent variable y_{t−1} appears on the right-hand side, this is a dynamic model. As with the other dynamic models that are treated later in the book, estimation must be based on observations 2 through n whenever pre-sample observations are assumed not to be available. The model is linear in the regressors but nonlinear in the parameters β and ρ, and it therefore needs to be estimated by nonlinear least squares or some other nonlinear estimation method.
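To illustrate what such an estimation might look like in practice, the following minimal sketch minimizes the sum of squared residuals of the combined model numerically. The data arrays y and X and the use of scipy.optimize.least_squares are assumptions made for the illustration; nothing here is prescribed by the text.

```python
# Minimal sketch (not the book's code): NLS estimation of the AR(1)-restricted model
#   y_t = rho*y_{t-1} + X_t beta - rho*X_{t-1} beta + eps_t,
# assuming y (length n) and X (n x k) are NumPy arrays supplied by the user.
import numpy as np
from scipy.optimize import least_squares

def ar1_residuals(params, y, X):
    """Residuals eps_t for t = 2, ..., n given params = (beta_1, ..., beta_k, rho)."""
    beta, rho = params[:-1], params[-1]
    fitted = rho * y[:-1] + X[1:] @ beta - rho * (X[:-1] @ beta)
    return y[1:] - fitted

def fit_ar1_nls(y, X):
    k = X.shape[1]
    # Start from the OLS estimate of beta and rho = 0 (an arbitrary but sensible choice).
    start = np.r_[np.linalg.lstsq(X, y, rcond=None)[0], 0.0]
    result = least_squares(ar1_residuals, start, args=(y, X))
    return result.x[:-1], result.x[-1]   # beta_hat, rho_hat
```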
In the next section, we study estimators for nonlinear regression models generated by the method of moments, and we establish conditions for asymptotic identification, asymptotic normality, and asymptotic efficiency. Then, in Section 6.3, we show that, under the assumption that the error terms are IID, the most efficient MM estimator is nonlinear least squares, or NLS. In Section 6.4, we discuss various methods by which NLS estimates may be computed. The method of choice in most circumstances is some variant of Newton's Method. One commonly used variant is based on an artificial linear regression called the Gauss-Newton regression. We introduce this artificial regression in Section 6.5 and show how to use it to compute NLS estimates and estimates of their covariance matrix. In Section 6.6, we introduce the important concept of one-step estimation. Then, in Section 6.7, we show how to use the Gauss-Newton regression to compute hypothesis tests. Finally, in Section 6.8, we introduce a modified Gauss-Newton regression suitable for use in the presence of heteroskedasticity of unknown form.
6.2 Method of Moments Estimators for Nonlinear Models
In Section 1.5, we derived the OLS estimator for linear models from the method of moments by using the fact that, for each observation, the mean of the error term in the regression model is zero conditional on the vector of explanatory variables. This implied that

E(X_t⊤u_t) = E(X_t⊤(y_t − X_t β)) = 0.

The sample analog of the middle expression here is n⁻¹X⊤(y − Xβ). Setting this expression equal to zero yields the moment conditions

X⊤(y − Xβ) = 0,   (6.08)

which are just the normal equations that define the OLS estimator. Suppose now that we want to employ the same type of argument for nonlinear models.
The error terms of the nonlinear model are assumed to have mean zero conditional on an information set Ω_t, and the explanatory variables on which x_t(β) depends belong to it. But, since the realization of any deterministic function of these variables is known as soon as the variables themselves are known, the information set must contain not only the variables that characterize it but also all deterministic functions of those variables. In Exercise 6.1, readers are asked to show that the conditional expectation of a random variable is also its expectation conditional on the set of all deterministic functions of the conditioning variables.
Let W denote an n × k matrix of variables, each of which belongs to the corresponding information set Ω_t. By analogy with (6.08), the moment conditions for the nonlinear model can be written as

W⊤(y − x(β)) = 0.   (6.10)

Since β has k elements, there are just as many unknowns as there are equations in (6.10). These equations can, in principle, be solved to yield an estimator of the k-vector β. Geometrically, the moment conditions (6.10) require that the vector of residuals should be orthogonal to all the columns of the matrix W.
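As a rough sketch of what solving (6.10) means numerically, one might proceed as follows. The regression function x_func, the instrument matrix W, and the use of scipy.optimize.root are assumptions made purely for illustration.

```python
# Sketch: solve the moment conditions W'(y - x(beta)) = 0 for beta.
# x_func(beta) must return the n-vector of regression-function values;
# y is an n-vector and W an n x k matrix (user-supplied assumptions).
import numpy as np
from scipy.optimize import root

def mm_estimate(y, W, x_func, beta_start):
    moments = lambda beta: W.T @ (y - x_func(beta))   # k moment conditions
    solution = root(moments, beta_start)
    return solution.x
```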
How should we choose W? There are infinitely many possibilities. Almost any choice for which the elements of W belong to the appropriate information sets will yield a consistent estimator of β. However, these estimators will in general have different asymptotic covariance matrices, and it is therefore of interest to see if any particular choice of W leads to an estimator with smaller asymptotic variance than the others. Such a choice would then lead to an efficient estimator, judged by the criterion of the asymptotic variance.
Identification and Asymptotic Identification
For what follows, we require that the parameter vector β of the model (6.01) is asymptotically identified. In general, a vector of parameters is said to be identified by a given data set and a given estimation method if, for that data set, the estimation method provides a unique way to determine the parameter estimates. In the present case, β is identified by a given data set if equations (6.10) have a unique solution.
For the parameters of a model to be asymptotically identified by a given estimation method, we require that the estimation method provide a unique way to determine the parameter estimates in the limit as the sample size n tends to infinity. In the present case, asymptotic identification can be studied by examining the behavior of the moment conditions, scaled by n⁻¹, as n → ∞. Suppose that the true DGP is a special case of the model (6.02) with parameter vector β_0. Then

n⁻¹W⊤(y − x(β_0)) = n⁻¹ Σ_{t=1}^{n} W_t⊤u_t,   (6.11)

where W_t denotes the t-th row of W. By (6.09), every term in the sum above has mean 0, and the IID assumption in (6.02) is enough to allow us to apply a law of large numbers to that sum. It follows that the right-hand side, and therefore also the left-hand side, of (6.11) tends to zero in probability as n → ∞.
Let us now define the k-vector of deterministic functions α(β) as follows:

α(β) ≡ plim_{n→∞} n⁻¹W⊤(y − x(β)).   (6.12)

We assume that a law of large numbers can be applied to the right-hand side of (6.12) whatever the value of β, thus showing that the components of α are deterministic. In the preceding paragraph, we saw in effect that α(β_0) = 0. The parameters are then asymptotically identified if β_0 is the unique solution of the equations α(β) = 0.
Although most parameter vectors that are identified by data sets of reasonable size are also asymptotically identified, neither of these concepts implies the other. It is possible for an estimator to be asymptotically identified without being identified by many data sets, and it is possible for an estimator to be identified by every data set of finite size without being asymptotically identified. To see this, consider the following two examples.
As an example of the first possibility, suppose that one of the explanatory variables is a random variable which follows the Bernoulli distribution. Such a random variable is often called a binary variable, because there are only two possible values that it can take on, 0 and 1. If, in a given sample, every realization of this variable happens to take the same value, its coefficient cannot be identified by that data set. Nevertheless, the coefficient is identified asymptotically. As n → ∞, a law of large numbers guarantees that the proportion of realizations taking each value tends to its positive probability, so that both values eventually appear in the sample.
As an example of the second possibility, consider the model (3.20), discussed in Chapter 3. The regressors of that model are linearly independent in any sample of size at least 2, and so the parameters are identified by any data set with at least two observations. Suppose that W is chosen so that the MM estimator is the same as the OLS estimator. Then, using the definition (6.12), we obtain the expression (6.13). As n → ∞, the right-hand side of (6.13) simplifies in such a way that it no longer determines the parameters uniquely, so that they are not asymptotically identified, and the estimator is not consistent. The simultaneous failure of consistency and asymptotic identification in this example is not a coincidence: It will turn out that asymptotic identification is a necessary and sufficient condition for consistency.
Consistency
Suppose that the DGP is a special case of the model (6.02) with true parameter vector β_0. A rigorous proof that the MM estimator defined by (6.10) is consistent whenever β is asymptotically identified would have to deal with a number of technical issues that are beyond the scope of this book. See Amemiya (1985, Section 4.3) or Davidson and MacKinnon (1993, Section 5.3) for more detailed treatments. However, an intuitive, heuristic proof is not at all hard to provide. If we are willing to assume that the estimator has a deterministic probability limit, the result follows easily. What makes a formal proof more difficult is showing that such a probability limit actually exists.
For all finite samples large enough for β to be identified by the data, we have, by the definition of the estimator β̂,

n⁻¹W⊤(y − x(β̂)) = 0.   (6.14)

If we take the limit of this as n → ∞, we have 0 on the right-hand side. On the left-hand side, if β̂ tends to a deterministic limiting value, say β_∞, then the limit of the left-hand side is the same as the limit of

n⁻¹W⊤(y − x(β_∞)),

which, by (6.12), is just α(β_∞). If β_∞ were different from β_0, asymptotic identification would imply that α(β_∞) is nonzero, and this contradicts the fact that the limits of both sides of (6.14) are equal, since the limit of the right-hand side is 0.
This argument shows that asymptotic identification is sufficient for consistency. Although we will not attempt to prove it, asymptotic identification is also necessary for consistency. The key to a proof is showing that, if the parameters of a model are not asymptotically identified by a given estimation method, then no deterministic limit of the sort required for consistency exists; see also Exercise 6.2.
The identifiability of a parameter vector, whether asymptotic or by a data set, depends on the estimation method used. In the present context, this means that some choices of W identify the parameters of a model like (6.01), while others do not. We can gain some intuition about this matter by looking a little more closely at the limiting functions α(β) defined in (6.12).
Under the DGP characterized by β_0, we have y = x(β_0) + u, so that α(β) is the probability limit of n⁻¹W⊤u plus the probability limit of n⁻¹W⊤(x(β_0) − x(β)); the first of these is zero. Therefore, for asymptotic identification, and so also for consistency, the last term must be nonzero whenever β differs from β_0. Evidently, a necessary condition for asymptotic identification is that there be no two distinct parameter vectors that give rise to the same regression function, at least asymptotically. This condition is the nonlinear counterpart of the requirement of linearly independent regressors for linear regression models.
We can now see that this requirement is in fact a condition necessary for the identification of the model parameters, both by a data set and asymptotically. Suppose that, for a linear regression model, the columns of the regressor matrix X are linearly dependent. This implies that there is a nonzero vector b such that Xb = 0; recall the discussion in Section 2.2. Then it follows that Xβ = X(β + b) for every β, so that two distinct parameter vectors yield the same regression function, in violation of the necessary condition stated at the beginning of this paragraph. For a linear regression model, linear independence of the regressors is both necessary and sufficient for identification by any data set. We saw above that it is necessary, and sufficiency follows from the fact, discussed in Section 2.2, that, when the columns of X are linearly independent, the OLS estimator exists and is unique for any y, and this is precisely what is meant by identification by any data set.
For nonlinear models, however, things are more complicated. In general, more conditions are needed than in the linear case, and they are most easily stated after we have derived the asymptotic covariance matrix of the estimator defined by (6.10), and so we postpone study of them until later.
It is worth remarking that the estimator can be shown to be consistent under considerably weaker assumptions about the error terms than those we have made. The key to the consistency proof is the requirement that the error terms satisfy the condition

plim_{n→∞} n⁻¹W⊤u = 0.   (6.16)

Under reasonable assumptions, it is not difficult to show that this condition holds in a wide variety of circumstances. However, if the error terms are serially correlated, the covariance between the error term and a lagged dependent variable is nonzero in general. Therefore, in this circumstance, condition (6.16) will not hold whenever W includes lagged dependent variables, and such MM estimators will generally not be consistent.
Asymptotic Normality
The MM estimator defined by the moment conditions (6.10) is asymptotically normal under appropriate conditions. As we discussed in Chapter 4, this means that the vector n^{1/2}(β̂ − β_0) tends, as n → ∞, to a random vector that follows the multivariate normal distribution with mean vector 0 and a covariance matrix that will be determined shortly.
Before we start our analysis, we need some notation, which will be used extensively in the remainder of this chapter. In formulating the generic nonlinear regression model (6.01), we deliberately used notation that makes it easy to see the close connection between the nonlinear and linear regression models. We now define X(β) to be the n × k matrix of partial derivatives of the regression functions, with typical element ∂x_t(β)/∂β_i. In the linear case, x(β) = Xβ and X(β) = X. The big difference between the linear and nonlinear cases is that, in the nonlinear case, both x(β) and X(β) depend on β.
The next step is to apply Taylor's Theorem to the components of the vector x(β̂) that appears in the moment conditions (6.17), expanding each around the true parameter vector β_0. For each component, the expansion (6.18) involves an intermediate parameter vector which, like the analogous vectors in (5.45), satisfies the condition (6.19) of being at least as close to β_0 as β̂ is. Substituting the Taylor expansion (6.18) into (6.17) yields an expression in which, in general, a different intermediate vector appears in each row. Because all of these vectors satisfy (6.19), it is not necessary to make this fact explicit in the notation. Thus here, and in subsequent chapters, we will refer to a typical intermediate vector simply as β̄. Doing this, and rearranging factors of powers of n so as to work only with quantities which have suitable probability limits, yields the result (6.20). This result is the starting point for all our subsequent analysis.
We need to apply a law of large numbers to the first factor of the second term in (6.20). Under reasonable regularity conditions, not unlike those needed for (3.17) to hold, we have the limiting result (6.21). A sufficient condition for the parameter vector β to be asymptotically identified by the given choice of W is that the probability limit appearing in (6.21) should have full rank. To see this, observe that (6.21) implies that this limiting matrix possesses an inverse whenever it has full rank, and we can then multiply both sides of (6.22) by this inverse to obtain a well-defined expression, (6.23), for n^{1/2}(β̂ − β_0). This full-rank condition is sometimes called strong asymptotic identification; it is a sufficient but not necessary condition for ordinary asymptotic identification. The second factor on the right-hand side of (6.23) is a vector to which we should, under appropriate regularity conditions, be able to apply a central limit theorem in order to show that it is asymptotically multivariate normal, with mean vector 0 and a finite covariance matrix. To do this, we can use exactly the same reasoning as was used in Section 4.5 to show that the vector v of (4.53) is asymptotically multivariate normal. Since, by (6.23), n^{1/2}(β̂ − β_0) is asymptotically a set of linear combinations of the components of a vector that follows the multivariate normal distribution, it too is asymptotically normally distributed with mean vector zero and a finite covariance matrix.
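To fix ideas, the limiting result being described can be sketched as follows. The notation X_0 ≡ X(β_0) and the explicit sandwich arrangement are reconstructions consistent with the surrounding discussion, not quotations of the missing displays (6.20) to (6.25). The block assumes the amsmath package.

```latex
% Reconstruction sketch (assumes amsmath); X_0 \equiv X(\beta_0), u is the error vector.
n^{1/2}(\hat\beta - \beta_0)
  \;\approx\;
  \Bigl(\operatorname*{plim}_{n\to\infty} n^{-1} W^{\top} X_0\Bigr)^{-1} n^{-1/2} W^{\top} u ,
\qquad
n^{1/2}(\hat\beta - \beta_0)
  \;\xrightarrow{\;d\;}\;
  \mathrm{N}\Bigl(0,\;
    \sigma_0^2 \operatorname*{plim}_{n\to\infty}
    \bigl(n^{-1}W^{\top}X_0\bigr)^{-1}
    \bigl(n^{-1}W^{\top}W\bigr)
    \bigl(n^{-1}X_0^{\top}W\bigr)^{-1}
  \Bigr).
```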
Asymptotic Efficiency
The asymptotic covariance matrix of n^{1/2}(β̂ − β_0), which is just the covariance matrix of the limiting distribution of the right-hand side of (6.23), is, by arguments exactly like those in (4.54), given by expression (6.25). If P_W denotes the orthogonal projection on to the subspace spanned by the columns of W, expression (6.25) can be rewritten as expression (6.26). Expression (6.26) is the asymptotic covariance matrix of n^{1/2}(β̂ − β_0); it is often referred to, a little loosely, as the asymptotic covariance matrix of β̂ itself, and we will use that terminology when no confusion can result.
It is clear from the result (6.26) that the asymptotic covariance matrix of the estimator depends on the choice of W, and that an arbitrary choice of W will lead to an inefficient estimator by the criterion of the asymptotic covariance matrix, as we would be led to suspect by the fact that (6.25) has the form of a sandwich; see Section 5.5. An efficient estimator by that criterion is obtained when the choice of W minimizes the asymptotic covariance matrix, in the sense used in the Gauss-Markov theorem. Recall that one covariance matrix is said to be "greater" than another if the difference between it and the other is a positive semidefinite matrix.
It is often easier to establish efficiency by reasoning in terms of the precision matrix, that is, the inverse of the covariance matrix, rather than in terms of the covariance matrix itself. Since the difference between the precision matrix associated with the choice W = X_0 ≡ X(β_0) and the precision matrix associated with any other admissible choice of W can be written as σ_0⁻² times a matrix of the form A⊤A, which is a positive semidefinite matrix, it follows at once that the precision of the estimator based on X_0 is at least as great as that of the estimator obtained by using any other choice of W.
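A sketch of the precision-matrix comparison just described, with P_W the orthogonal projection on to the columns of W and M_W = I − P_W; the explicit display is a reconstruction, not the book's own equation.

```latex
% Reconstruction sketch of the precision-matrix comparison (assumes amsmath).
\sigma_0^{-2}\operatorname*{plim}_{n\to\infty} n^{-1} X_0^{\top} X_0
  \;-\;
\sigma_0^{-2}\operatorname*{plim}_{n\to\infty} n^{-1} X_0^{\top} P_W X_0
  \;=\;
\sigma_0^{-2}\operatorname*{plim}_{n\to\infty} n^{-1} (M_W X_0)^{\top}(M_W X_0),
% which is positive semidefinite, since it is the probability limit of a matrix
% of the form A^T A with A = M_W X_0.
```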
The efficient choice W = X_0, however, is not feasible, because X(β_0) depends on the unknown true parameter vector β_0. In the next section, we will see how to overcome this difficulty. The nonlinear least squares estimator that we will obtain will turn out to have exactly the same asymptotic properties as the infeasible MM estimator.
6.3 Nonlinear Least Squares
There are at least two ways in which we can approximate the asymptotically efficient but infeasible MM estimator. The obvious way is to proceed in two steps: first obtain a preliminary consistent estimate of β, and then use the matrix of derivatives evaluated at that estimate in place of W. A more subtle approach is to recognize that the above procedure estimates the same parameter vector twice, and to compress the two estimation procedures into one. Consider the moment conditions

X⊤(β)(y − x(β)) = 0.   (6.27)
The estimator defined by these conditions is called the nonlinear least squares, or NLS, estimator. The name comes from the fact that the moment conditions (6.27) are just the first-order conditions for the minimization with respect to β of the sum-of-squared-residuals (or SSR) function. The SSR function is defined just as in (1.49), but for a nonlinear regression function:

SSR(β) = Σ_{t=1}^{n} (y_t − x_t(β))².
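To see why these are the first-order conditions, one can differentiate the SSR function. The following display is a standard derivation sketch rather than a reproduction of the book's own equations.

```latex
% Standard derivation sketch of the NLS first-order conditions:
\frac{\partial\,\mathrm{SSR}(\beta)}{\partial\beta}
  = -2\sum_{t=1}^{n}\frac{\partial x_t(\beta)}{\partial\beta}\,
      \bigl(y_t - x_t(\beta)\bigr)
  = -2\,X^{\top}\!(\beta)\bigl(y - x(\beta)\bigr),
% so setting this gradient equal to zero reproduces the moment conditions (6.27).
```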
Like the conditions (6.08) for the linear case, equations (6.27) can be interpreted as orthogonality conditions: They require that the columns of the matrix of derivatives of x(β) with respect to β should be orthogonal to the vector of residuals. There are, however, two major differences between (6.27) and (6.08). The first difference is that, in the nonlinear case, X(β) is a matrix of functions that depend on the explanatory variables and on β, instead of simply a matrix of explanatory variables. The second difference is that equations (6.27) are nonlinear in β, because both x(β) and X(β) are, in general, nonlinear functions of β. Thus there is no closed-form expression for the NLS estimator, and this means that it is substantially more difficult to compute NLS estimates than it is to compute OLS ones.
Consistency of the NLS Estimator
Because the NLS estimator is the MM estimator obtained by using the matrix of derivatives in place of W, the consistency argument of Section 6.2 applies to it directly. Since it has been assumed that every variable on which x_t(β) depends belongs to the corresponding information set Ω_t, the derivatives of x_t(β), which are deterministic functions of those same variables, belong to Ω_t as well. The NLS estimator is therefore consistent, provided that it is asymptotically identified. We will have more to say in the next section about identification and the NLS estimator.
Asymptotic Normality of the NLS Estimator
The discussion of asymptotic normality in the previous section needs to be modified slightly for the NLS estimator. Equation (6.20), which resulted from a Taylor expansion of the moment conditions, must be modified because W is replaced by X(β), which, unlike W, depends on the parameter vector β. When we take account of this fact, we obtain a rather messy additional term in (6.20) that depends on the second derivatives of x(β). However, it can be shown that this extra term vanishes asymptotically. Therefore, equation (6.23) continues to hold with W replaced by X_0 ≡ X(β_0); that is, for NLS, the analog of equation (6.23) is

n^{1/2}(β̂ − β_0) ≈ (plim_{n→∞} n⁻¹X_0⊤X_0)⁻¹ n^{−1/2}X_0⊤u.
It follows that a consistent estimator of the covariance matrix of β̂, in the usual asymptotic sense, is provided by σ̂²(X̂⊤X̂)⁻¹, where X̂ ≡ X(β̂) and σ̂² is any consistent estimator of the error variance σ². The simplest estimator of σ², namely SSR(β̂)/n, is one that we might reasonably use. Another possibility is to use the estimator (6.33), which divides the SSR by n − k instead of by n. However, we will see shortly that (6.33) has particularly attractive properties.
NLS Residuals and the Variance of the Error Terms
Not very much can be said about the finite-sample properties of nonlinear least squares. The techniques that we used in Chapter 3 to obtain the finite-sample properties of the OLS estimator simply cannot be used for the NLS one. However, it is easy to show that, if the DGP is a special case of the model being estimated, then SSR(β̂) can be no larger than the sum of squared error terms, because β̂ minimizes the SSR function, and the result follows immediately. Thus, just like OLS residuals, NLS residuals have variance less than the variance of the error terms. This means that the NLS residuals are too small. Therefore, by analogy with the exact results for the OLS case that were discussed in Section 3.6, it seems plausible to divide the sum of squared residuals by n − k rather than by n when estimating the error variance. As we are about to show, there is an even stronger justification for doing this.
A more detailed asymptotic analysis of the residuals implies that, for the entire vector of residuals, we have a sum of squares whose expectation is approximately (n − k)σ_0² when n is large. This suggests replacing the definition of s² that was given in Chapter 3 only for linear regression models. The new definition, (6.39), applies to both linear and nonlinear regression models, since it reduces to the old one when the regression function is linear, with x(β̂) and X(β̂) playing the roles of Xβ̂ and X, respectively. This asymptotic result for NLS looks very much like the exact result for OLS that was discussed in Section 3.6.
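In practice, these quantities are easy to compute once β̂ is available. The following sketch assumes that the residual vector and the Jacobian matrix X(β̂) are available as NumPy arrays; the code illustrates the formulas discussed above and is not code from the book.

```python
# Sketch: NLS error-variance and covariance-matrix estimates, assuming
# u_hat = y - x(beta_hat) and X_hat = X(beta_hat) are NumPy arrays
# supplied by the user.
import numpy as np

def nls_covariance(u_hat, X_hat):
    n, k = X_hat.shape
    s2 = u_hat @ u_hat / (n - k)               # s^2 = SSR(beta_hat) / (n - k)
    cov = s2 * np.linalg.inv(X_hat.T @ X_hat)  # s^2 (X_hat' X_hat)^{-1}
    return s2, cov
```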
6.4 Computing NLS Estimates
We have not yet said anything about how to compute nonlinear least squares estimates. This is by no means a trivial undertaking. Computing NLS estimates is always much more expensive than computing OLS ones for a model with the same number of observations and parameters. Moreover, there is a risk that the program may fail to converge or may converge to values that do not minimize the SSR. However, with modern computers and well-written software, NLS estimation is usually not excessively difficult.
In order to find NLS estimates, we need to minimize the sum-of-squared-residuals function SSR(β) with respect to β. Since SSR(β) is not a quadratic function of β, there is no analytic solution like the classic formula (1.46) for the linear regression case. What we need is a general algorithm for minimizing a sum of squares with respect to a vector of parameters. In this section, we discuss methods for unconstrained minimization of a smooth function Q(β). It is easiest to think of Q(β) as being equal to SSR(β), but much of the discussion will be applicable to minimizing any sort of criterion function. Since minimizing Q(β) is equivalent to maximizing −Q(β), it will also be applicable to maximizing any sort of criterion function, such as the loglikelihood functions that we will encounter in Chapter 10.
We will give an overview of how numerical minimization algorithms work, but we will not discuss many of the important implementation issues that can substantially affect the performance of these algorithms when they are incorporated into computer programs. Useful references on the art and science of numerical optimization, especially as it applies to nonlinear regression problems, include Bard (1974), Gill, Murray, and Wright (1981), Quandt (1983), Bates and Watts (1988), Seber and Wild (1989, Chapter 14), and Press et al. (1992a, 1992b, Chapter 10).
There are many algorithms for minimizing a smooth function Q(β). Most of these operate in essentially the same way. The algorithm goes through a series of iterations, or steps, at each of which it starts with a particular value of β and tries to find a better one. It first chooses a direction in which to search and then decides how far to move in that direction. After completing the move, it checks to see whether the current value of β is sufficiently close to a local minimum of Q(β). If it is, the algorithm stops. Otherwise, it chooses another direction in which to search, and so on. There are three principal differences among minimization algorithms: the way in which the direction to search is chosen, the way in which the size of the step in that direction is determined, and the stopping rule that is employed. Numerous choices for each of these are available.
Newton’s Method
All of the techniques that we will discuss are based on Newton's Method. Suppose that we wish to minimize a function Q(β), where β is a k-vector and Q(β) is assumed to be twice continuously differentiable. Given any initial value of the parameter vector, we can take a second-order Taylor expansion of Q(β) around that value. Here g(β), the gradient of Q(β), is a column vector of length k with typical element ∂Q(β)/∂β_i, and H(β), the Hessian of Q(β), is the k × k matrix of second derivatives. The value of β that minimizes the quadratic approximation with respect to β can be written as

β_{(j+1)} = β_{(j)} − H_{(j)}⁻¹ g_{(j)},   (6.42)

where β_{(j)} denotes the value reached at iteration j, g_{(j)} ≡ g(β_{(j)}), and H_{(j)} ≡ H(β_{(j)}). Equation (6.42) is the heart of Newton's Method. If the quadratic approximation is a good one, a single step of (6.42) takes us close to a minimum of Q(β); if Q(β) is actually quadratic, the approximation is exact. When Q(β) is approximately quadratic, as all sum-of-squares functions are when sufficiently close to their minima, Newton's Method generally converges very quickly.
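A minimal sketch of the Newton iteration (6.42) may help fix ideas. The callables grad and hess, and the particular convergence test, are assumptions made for the illustration rather than the book's algorithm.

```python
# Sketch of the Newton iteration (6.42) for a smooth criterion function Q(beta).
# grad and hess are user-supplied callables returning the gradient vector and
# Hessian matrix; the quadratic-form stopping rule mirrors the one discussed below.
import numpy as np

def newton_minimize(beta0, grad, hess, tol=1e-8, max_iter=100):
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        g = grad(beta)
        H = hess(beta)
        step = np.linalg.solve(H, g)   # solves H step = g, i.e. step = H^{-1} g
        if g @ step < tol:             # stop when g' H^{-1} g is sufficiently small
            break
        beta = beta - step             # beta_(j+1) = beta_(j) - H_(j)^{-1} g_(j)
    return beta
```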
Figure 6.1 illustrates how Newton's Method works for a function of two parameters. It shows the contours of the criterion function, along with the points visited by the algorithm. Notice that these contours are not precisely elliptical, as they would be if the function were quadratic. The algorithm starts at the point marked "0" and then jumps to the point marked "1". On the next step, it goes in almost exactly the right direction, but it goes too far, moving to "2". It then retraces part of that step, moving to "3". After one more step, which is too small to be shown in the figure, it has essentially converged.
Although Newton's Method works very well in this example, there are many cases in which it fails to work at all, especially if Q(β) is not convex in the region where the algorithm starts. Some of the ways in which it can fail are illustrated in Figure 6.2.
Trang 18
•
. .
.
0
1
2 3
O
Figure 6.1 Newton’s Method in two dimensions
The one-dimensional function shown in Figure 6.2 is concave in places instead of convex, and this causes Newton's Method to head off in the wrong direction whenever the algorithm starts at a point where the second derivative is negative.

One important feature of Newton's Method and algorithms based on it is that they must start with an initial value of β. It is impossible to perform a Taylor expansion like the one underlying (6.42) without a point to expand around, and where the algorithm starts may determine how well it performs, or whether it converges at all. In most cases, it is up to the econometrician to specify the starting values.
Quasi-Newton Methods
Most effective nonlinear optimization techniques for minimizing smooth criterion functions are variants of Newton's Method. These quasi-Newton methods attempt to retain the good qualities of Newton's Method while surmounting problems like those illustrated in Figure 6.2. They replace (6.42) by the slightly more complicated formula
β_{(j+1)} = β_{(j)} − α_{(j)} D_{(j)}⁻¹ g_{(j)},   (6.43)

where α_{(j)} is a scalar that determines the length of the step, and D_{(j)} is a matrix that plays the role of the Hessian H_{(j)} but is constructed so that it is always positive definite.
Figure 6.2 Cases for which Newton's Method will not work
In contrast to quasi-Newton methods, Newton's Method itself uses the actual Hessian, which may fail to be positive definite when Q(β) is not convex.
Quasi-Newton algorithms involve three operations at each step. Let us denote the current value of the parameter vector by β_{(j)}; for j = 0, this is the chosen starting value, and otherwise, it is the value reached at iteration j. The three operations are to compute the matrix D_{(j)} and hence a search direction, to choose the step length α_{(j)}, and to decide whether the algorithm has converged.
Because they construct D(β) in such a way that it is always positive definite, quasi-Newton algorithms can handle problems where the function to be minimized is not globally convex. The various algorithms choose D(β) in a number of ways, some of which are quite ingenious and may be tricky to implement on a digital computer. As we will shortly see, however, for sum-of-squares functions there is a very easy and natural way to choose D(β).
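For a sum-of-squares function, the reason such a choice exists can be sketched as follows; this is a standard computation, not the book's own display.

```latex
% Sketch: Hessian of the SSR function.
\frac{\partial^2\,\mathrm{SSR}(\beta)}{\partial\beta\,\partial\beta^{\top}}
  = 2\,X^{\top}\!(\beta)X(\beta)
    \;-\; 2\sum_{t=1}^{n}\bigl(y_t - x_t(\beta)\bigr)
          \frac{\partial^2 x_t(\beta)}{\partial\beta\,\partial\beta^{\top}} .
% Dropping the second term leaves 2 X^T(beta) X(beta), which is positive
% semidefinite by construction; this is the familiar Gauss-Newton choice of D(beta).
```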
One natural way to choose α_{(j)} is to minimize Q(β_{(j)} − α D_{(j)}⁻¹ g_{(j)}), regarded as a one-dimensional function of α. It is fairly clear that, for the example in Figure 6.1, choosing α in this way would produce even faster convergence than setting α = 1. Some algorithms do not actually minimize this one-dimensional function with great precision; instead, they simply choose a value of α that makes sure that the algorithm will always make progress at each step. The best algorithms, which are designed to economize on computing time, may choose the step length quite crudely when far from a minimum and more carefully as the minimum is approached.
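A minimal sketch of one quasi-Newton step of the form (6.43), with a crude backtracking choice of α; the halving rule and the assumption that D is already positive definite are illustrative choices, not the book's algorithm.

```python
# Sketch of one quasi-Newton step (6.43) with a simple backtracking line search.
# Q and grad are user-supplied callables; D is assumed already positive definite.
import numpy as np

def quasi_newton_step(beta, Q, grad, D, alpha=1.0, shrink=0.5, max_tries=20):
    direction = np.linalg.solve(D, grad(beta))   # D^{-1} g
    for _ in range(max_tries):
        candidate = beta - alpha * direction     # beta - alpha * D^{-1} g
        if Q(candidate) < Q(beta):               # insist on making progress
            return candidate, alpha
        alpha *= shrink                          # otherwise shorten the step
    return beta, 0.0                             # no progress could be made
```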
Stopping Rules
In practice, no algorithm of this type can ever locate a minimum exactly. Without a rule telling it when to stop, the algorithm will just keep on going forever. There are many possible stopping rules. We could, for example, stop when the change in β from one iteration to the next is small, when the change in Q(β) is small, or when the gradient g(β) is small. However, none of these rules is entirely satisfactory, in part because they depend on the magnitude of the parameters. This means that they will yield different results if the units of measurement of any variable are changed or if the model is reparametrized in some other way. A more logical rule is to stop when

g⊤(β) D⁻¹(β) g(β) < ε,   (6.44)

where ε, the convergence tolerance, is a small positive number chosen by the user. The advantage of (6.44) is that it weights the various components of the gradient in a manner inversely proportional to the precision with which the corresponding parameters are estimated. We will see why this is so in the next section.
Of course, any stopping rule may work badly if ε is chosen incorrectly. If ε is too large, the algorithm may stop too soon, well away from the minimum; if it is too small, the algorithm may keep going long after the estimates are accurate enough for any practical purpose, or it may never satisfy the rule at all because of round-off error. It may therefore be a good idea to experiment with the value of ε to see how sensitive the results are to it. If the estimates change materially when ε is reduced, then either the first value of ε was too large, or the algorithm is having trouble finding an accurate minimum.
Local and Global Minima
Numerical optimization methods based on Newton's Method generally work well when Q(β) is globally convex. For such a function, there can be at most one local minimum, which will also be the global minimum. When Q(β) is not globally convex but has only a single local minimum, these methods also work reasonably well in many cases. However, if there is more than one local minimum, optimization methods of this type often run into trouble. They will generally converge to a local minimum, but there is no guarantee that it will be the global one. In such cases, the choice of the starting values, that is, the point from which the algorithm begins its search, can be critically important.
Figure 6.3 A criterion function with multiple minima
This problem is illustrated in Figure 6.3. The one-dimensional criterion function shown there has more than one local minimum, one of which is also the global minimum. However, if a Newton or quasi-Newton algorithm is started in the neighborhood of one of the other local minima, it is likely to converge to that minimum rather than to the global one.
In practice, the usual way to guard against finding the wrong local minimum when the criterion function is known, or suspected, not to be globally convex is to minimize Q(β) several times, starting at a number of different starting values. Ideally, these should be quite dispersed over the interesting regions of the parameter space. This is easy to achieve in a one-dimensional case like the one shown in Figure 6.3. However, it is not feasible when β has more than a few elements: If we want to try just 10 starting values for each of k parameters, we need 10^k starting points in all, and even then the starting values will cover only a very small fraction of the parameter space. Nevertheless, if several different starting values all lead to the same parameter estimates, we can be reasonably confident, although never certain, that those estimates correspond to a point that is actually the global minimum.
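A minimal sketch of such a multi-start strategy; the box of starting values, the use of scipy.optimize.minimize with BFGS, and the number of starts are all illustrative assumptions.

```python
# Sketch: minimize the criterion from several dispersed starting values and
# keep the best local minimum found. Q is the criterion function; bounds gives
# a box over the "interesting" region of the parameter space (user-supplied).
import numpy as np
from scipy.optimize import minimize

def multistart_minimize(Q, bounds, n_starts=10, seed=0):
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(bounds, dtype=float).T
    best = None
    for _ in range(n_starts):
        start = rng.uniform(lower, upper)        # random point inside the box
        result = minimize(Q, start, method="BFGS")
        if best is None or result.fun < best.fun:
            best = result                        # keep the lowest minimum so far
    return best.x, best.fun
```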
Numerous more formal methods of dealing with multiple minima have been proposed. See, among others, Veall (1990), Goffe, Ferrier, and Rogers (1994), Dorsey and Mayer (1995), and Andrews (1997). In difficult cases, one or more of these methods should work better than simply using a number of starting values. However, they tend to be computationally expensive, and none of them works well in every case.
Many of the difficulties of computing NLS estimates are related to the identification of the model parameters by different data sets. The identification condition for NLS is rather different from the identification condition for the MM estimators discussed in Section 6.2. For NLS, it is simply the requirement that the function SSR(β) should have a unique minimum with respect to β. This is not at all the same requirement as the condition that the moment conditions (6.27) should have a unique solution. In the example of Figure 6.3, the moment conditions, which for NLS are first-order conditions, are satisfied at every local minimum and local maximum of the criterion function, but only the global minimum is the value defined by the NLS estimator.
The analog for NLS of the strong asymptotic identification condition that was introduced in Section 6.2 is obtained by replacing W with the matrix of derivatives X(β). The strong condition for identification by a given data set is simply that the Hessian of SSR(β) should be positive definite at the minimum. It is not hard to see that this condition is just the sufficient second-order condition for a minimum of the SSR function.
The Geometry of Nonlinear Regression
For nonlinear regression models, it is not possible, in general, to draw faithful geometrical representations of the estimation procedure in just two or three dimensions, as we can for linear models. Nevertheless, it is often useful to illustrate the concepts involved in nonlinear estimation geometrically, as we do now for a model with a single parameter. We suppose for the purposes of the figure that, as the scalar parameter β varies, x(β) traces out a curve that we can visualize in the plane of the page. If the model were linear, x(β) would trace out a straight line rather than a curve. In the same way, the dependent variable y is represented by a point in the plane of the page, or, more accurately, by the vector in that plane joining the origin to that point.
For NLS, we seek the point on the curve generated by x(β) that is closest in Euclidean distance to y. We see from the figure that, although the moment, or first-order, conditions are satisfied at three points, only one of them yields the NLS estimator. Geometrically, the sum-of-squares function is just the square of the Euclidean distance from y to x(β). Its global minimum is achieved at the point on the curve nearest to y. At any point satisfying the first-order conditions, the residual vector should be orthogonal to w, the tangent direction of the curve at that point. It can be seen that this condition is satisfied only at the three points just mentioned, and at the global minimum we can draw this residual vector so as to show that it is indeed orthogonal to w. There are