Academic year 2003-2004
Advanced Econometrics
Panel data econometrics
and GMM estimation
Alban Thomas
MF 102, thomas@toulouse.inra.fr
Purpose of the course
Present recent developments in econometrics that allow for
a consistent treatment of the impact of unobserved heterogeneity
on model predictions: Panel data analysis
Present a convenient econometric framework for dealing with
restrictions imposed by theory: Method of Moments estimation
Deal with discrete-choice models with unobserved heterogeneity
Two keywords: unobserved heterogeneity and endogeneity
Methods:
- Fixed Effects Least Squares
- Generalized Least Squares
- Instrumental Variables
- Maximum Likelihood estimation for Panel Data models
- Generalized Method of Moments for Time Series
- Generalized Method of Moments for Panel Data
- Heteroskedasticity-consistent estimation
- Dynamic Panel Data models
- Logit and Probit models for Panel Data
- Simulation-based inference
- Nonparametric and Semiparametric estimation
I Panel Data Models
1.1 Gains in pooling cross section and time series
1.1.1 Discrimination between alternative models
1.1.2 Examples
1.1.3 Less collinearity between explanatory variables
1.1.4 May reduce bias due to missing or unobserved variables
1.2 Analysis of variance
1.3 Some definitions
2 The linear model
2.1 Notation
2.1.1 Model notation
2.1.2 Standard matrices and operators
2.1.3 Important properties of operators
2.2 The One-Way Fixed Effects model
2.2.1 The estimator in terms of the Frisch-Waugh-Lovell theorem
2.2.2 Interpretation as a covariance estimator
2.2.3 Comments
2.2.4 Testing for poolability and individual effects
2.3 The Random Effects model
2.3.1 Notation and assumptions
2.3.2 GLS estimation of the Random-effect model
2.3.3 Comparison between GLS, OLS and Within
2.3.4 Fixed individual effects or error components?
2.3.5 Example: Wage equation, Hausman (1978)
2.3.6 Best Quadratic Unbiased Estimators (BQU) of variances
3.1 The Two-way panel data model
3.1.1 The Two-way fixed-effect model
3.1.2 Example: Production function (Hoch 1962)
3.2 More on non-spherical disturbances
3.2.1 Heteroskedasticity in individual effect
3.2.2 "Typical" heteroskedasticity
3.3 Unbalanced panel data models
3.3.1 Introduction
3.3.2 Fixed effect models for unbalanced panels
4.1 Introduction
4.2 Choice between Within and GLS
4.3 An important test for endogeneity
4.4 Instrumental Variable estimation: Hausman-Taylor
4.4.4 More efficient procedures: Amemiya-MaCurdy and Breusch-Mizon-Schmidt
4.5 Computation of variance-covariance matrix for IV estimators
4.5.1 Full IV-GLS estimation procedure
4.6 Example: Wage equation
4.6.1 Model specification
4.7 Application: returns to education
4.7.1 Variables related to job status
4.7.2 Variables related to characteristics of household heads
5 Dynamic panel data models
5.1 Motivation
5.1.1 Dynamic formulations from dynamic programming problems
5.1.2 Euler equations and consumption
5.1.3 Long-run relationships in economics
5.2 The dynamic fixed-effect model
5.2.1 Bias in the Fixed-Effects estimator
5.2.2 Instrumental-variable estimation
5.3 The Random-effects model
5.3.1 Bias in the ML estimator
5.3.2 An equivalent representation
5.3.3 The role of initial conditions
5.3.4 Possible inconsistency of GLS
II Generalized Method of Moments estimation
6.1 Moment conditions and the method of moments
6.1.1 Moment conditions
6.1.2 Example: Linear regression model
6.1.3 Example: Gamma distribution
6.1.4 Method of moments estimation
6.1.5 Example: Poisson counting model
6.1.6 Comments
6.2 The Generalized Method of Moments (GMM)
6.2.1 Introduction
6.2.2 Example: Just-identified IV model
6.2.3 A definition
6.2.4 Example: The IV estimator again
6.3 Asymptotic properties of the GMM estimator
6.3.1 Consistency
6.3.2 Asymptotic normality
6.4 Optimal and two-step GMM
6.5 Inference with GMM
6.6 Extension: optimal instruments for GMM
6.6.1 Conditional moment restrictions
6.6.2 A first feasible estimator
6.6.3 Nearest-neighbor estimation of optimal instruments
6.6.4 Generalizing the approach: other nonparametric estimators
7 GMM estimators for time series models
7.1 GMM and Euler equation models
7.1.2 GMM estimation
7.2 GMM Estimation of MA models
7.2.1 A simple estimator
7.2.2 A more efficient estimator
7.2.3 Example: The Durbin estimator
7.3 GMM Estimation of ARMA models
7.3.1 The ARMA(1,1) model
7.3.2 IV estimation
7.4 Covariance matrix estimation
7.4.1 Example 1: Conditional homoskedasticity
7.4.2 Example 2: Conditional heteroskedasticity
7.4.3 Example 3: Covariance stationary process
7.4.4 The Newey-West estimator
7.4.5 Weighted autocovariance estimators
7.4.6 Weighted periodogram estimators
8 GMM estimators for dynamic panel data
8.1 Introduction
8.2 The Arellano-Bond estimator
8.2.1 Model assumptions
8.2.2 Implementation of the GMM estimator
8.3 More efficient procedures (Ahn-Schmidt)
8.3.1 Additional assumptions
8.4 The Blundell-Bond estimator
8.5 Dynamic models with Multiplicative effects
8.5.1 Multiplicative individual effects
8.5.2 Mixed structure
8.6 Example: Wage equation
III Discrete choice models
9.1 Brief review of binary discrete-choice models
9.1.1 Linear Probability model
9.1.2 Logit model
9.1.3 Probit model
9.2 Logit models for panel data
9.2.1 Sufficient statistics
9.2.2 Conditional probabilities
9.2.3 Example: T = 2
9.3 Probit models
9.4 Semiparametric estimation of discrete-choice models
9.4.1 The binary choice model
9.4.2 The IV estimator
9.5 SML estimation of selection models
9.5.1 The GHK simulator
9.5.2 Example
Appendix 1 Maximum-Likelihood estimation of the
Appendix 2 The two-way random effects model
Appendix 3 The one-way unbalanced random effects
Appendix 4 ML estimation of dynamic panel models
Appendix 6 A framework for simulation-based
Panel Data Models
Panel data: Sequential observations on a number of
units (individuals, firms)
Also called cross-sections over time, longitudinal data or pooled
cross-section time-series data
1.1 Gains in pooling cross section and time series
1.1.1 Discrimination between alternative models
Many economic models are of the form:
$$F(Y, X, Z, \beta) = 0,$$
where $Y$: individual control variables (workers, firms); $X$: (public
policy or principal's) variables; $Z$: (fixed) individual attributes;
$\beta$: parameters
Linear model:
$$Y = \beta_0 + \beta_x X + \beta_z Z + u.$$
Alternative views concerning this model:
Policy variables have a significant impact whatever the individual
characteristics, or
Differences across individuals are due to idiosyncratic individual
features, not included in $Z$
In practice, observed differences across individuals may be due
to both inter-individual differences and the impact of policy
variables
1.1.2 Examples
a) $\text{WAGE} = \beta_0 + \beta_1\,\text{EDUCATION} + \beta_2 Z$
People with a higher education level have higher wages because
firms value those people more;
People have higher education because they have higher ability
(expected productivity) anyway, and firms value worker ability
more
b) $\text{SALES} = \beta_0 + \beta_1\,\text{ADVERTISEMENT} + \beta_2 Z$
Advertisement expenditures boost sales;
More efficient firms enjoy more sales, and thus have more money
for advertisement expenditures
c) $\text{OUTPUT} = \beta_0 + \beta_1\,\text{REGULATION} + \beta_2 Z$
Regulatory control affects firm output;
Firms with higher output are more regulated on average
d) $\text{WAGE} = \beta_0 + \beta_1\,\mathbf{1}(\text{UNION}) + \beta_2 Z$
Firms react to higher wages imposed by unions by hiring
higher-quality workers, and $\mathbf{1}(\text{UNION})$ is a proxy for worker quality
1.1.3 Less collinearity between explanatory variables
In consumer or production economics, input, output or consumer
prices are difficult to use, because:
Time-series: Aggregated macro price indexes are highly
correlated;
Cross-sections: Not enough price variation across individuals
or firms
With panel data, variations across individuals and across time
periods are accounted for
Time-series: no information on the impact of individual
characteristics (socioeconomic variables, ...);
Cross-sections: no information on adjustment dynamics.
Estimates may reflect inter-individual differences inherent in
comparisons of different people or firms
1.1.4 May reduce bias due to missing or unobserved
variables
With panel data, it is easy to control for unobserved heterogeneity
across individuals. This is critical in practice, and explains why panel
data models are now so popular in micro- and macro-econometrics
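The bias from ignoring unobserved heterogeneity can be illustrated with a small simulation. This is only a sketch under assumed values: the sample sizes, the true slope `beta`, the strength of the correlation between the regressor and the individual effect `alpha`, and all variable names are hypothetical choices, not taken from the course.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 500, 5        # many units, few periods (illustrative sizes)
beta = 1.0           # true slope (assumed value)

alpha = rng.normal(size=N)                            # unobserved individual effect
x = 0.8 * alpha[:, None] + rng.normal(size=(N, T))    # regressor correlated with the effect
y = alpha[:, None] + beta * x + rng.normal(size=(N, T))

# Pooled least squares: deviations from the overall means, effect ignored
xc, yc = x - x.mean(), y - y.mean()
beta_pooled = (xc * yc).sum() / (xc ** 2).sum()

# Within transformation: deviations from each individual's own time mean
xw = x - x.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
beta_within = (xw * yw).sum() / (xw ** 2).sum()

print(f"pooled: {beta_pooled:.3f}  within: {beta_within:.3f}  true: {beta}")
```

In this design the pooled slope is pushed away from the true value because the regressor is correlated with the omitted individual effect, while the slope computed from within-individual deviations is not affected by it.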
Example: Output supply function under perfect competition
1Q) (Quadratic)
Cobb-Douglas case: $\log Q = \frac{1}{\delta - 1}(\log p - \log A)$
From equilibrium condition to estimable equation: Observations $(Q_{it}, p_{it})$,
unobserved heterogeneity $\alpha_i$, firm $i$, period $t$
logQ
it
=1
(logp
itlog
it, a
Empirical issue: possible correlation between output price $p_{it}$
and efficiency term $\alpha_i$
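$$y_{it} = \alpha_i + \beta_i x_{it} + \varepsilon_{it},$$
where $\alpha_i$ and $\beta_i$ are parameters, and $T_i$: number of periods observed for individual $i$
Useful first-order empirical moments are
$$\bar{y}_i = \frac{1}{T_i}\sum_{t=1}^{T_i} y_{it}, \qquad \bar{x}_i = \frac{1}{T_i}\sum_{t=1}^{T_i} x_{it},$$
$$S_{xy,i} = \sum_{t=1}^{T_i}(x_{it}-\bar{x}_i)(y_{it}-\bar{y}_i), \qquad S_{xx,i} = \sum_{t=1}^{T_i}(x_{it}-\bar{x}_i)^2, \qquad i = 1, 2, \ldots, N.$$
Least-squares parameter estimates are computed as
$$\hat{\beta}_i = \frac{S_{xy,i}}{S_{xx,i}}, \qquad \hat{\alpha}_i = \bar{y}_i - \hat{\beta}_i \bar{x}_i.$$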
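Consider now a restricted model with constant slopes and intercepts:
$$y_{it} = \alpha + \beta x_{it} + \varepsilon_{it}.$$
Under these restrictions, least-squares parameter estimates would be
$$\hat{\beta} = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T_i}(x_{it}-\bar{x})(y_{it}-\bar{y})}{\sum_{i=1}^{N}\sum_{t=1}^{T_i}(x_{it}-\bar{x})^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x},$$
where $\bar{x}$ and $\bar{y}$ are the sample means over all $\sum_i T_i$ observations.
The Residual Sum of Squares is
$$RSS = \sum_{i=1}^{N}\sum_{t=1}^{T_i}(y_{it}-\bar{y})^2 - \frac{\left[\sum_{i=1}^{N}\sum_{t=1}^{T_i}(y_{it}-\bar{y})(x_{it}-\bar{x})\right]^2}{\sum_{i=1}^{N}\sum_{t=1}^{T_i}(x_{it}-\bar{x})^2}.$$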
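As a numerical check of the restricted (pooled) least-squares formulas, the sketch below computes the slope, the intercept and the Residual Sum of Squares from sums of cross-products, then compares the closed-form RSS with the one obtained directly from the residuals. The unbalanced panel sizes, the data-generating values and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T_i = [3, 5, 4, 6]   # hypothetical unbalanced panel: periods per unit

# Simulated data with a common intercept and slope (assumed values)
x = [rng.normal(size=t) for t in T_i]
y = [2.0 + 0.5 * xi + rng.normal(scale=0.3, size=xi.size) for xi in x]

# Stack all units: the restricted model treats the sample as one regression
x_all, y_all = np.concatenate(x), np.concatenate(y)
x_bar, y_bar = x_all.mean(), y_all.mean()

# Sums of squares and cross-products around the overall means
s_xy = ((x_all - x_bar) * (y_all - y_bar)).sum()
s_xx = ((x_all - x_bar) ** 2).sum()
s_yy = ((y_all - y_bar) ** 2).sum()

beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# RSS from the closed-form expression and from the fitted residuals
rss_formula = s_yy - s_xy ** 2 / s_xx
rss_direct = ((y_all - alpha_hat - beta_hat * x_all) ** 2).sum()

print(beta_hat, alpha_hat, rss_formula, rss_direct)
```

The two RSS values agree up to floating-point error, which is a quick way to confirm that a reconstructed formula matches the regression it describes.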
For a majority of applications, the first model is too general and
estimation would require a great number of time observations. If
unobserved heterogeneity is additive in the model, we might
consider the following specification with constant slope and different
intercepts:
$$y_{it} = \alpha_i + \beta x_{it} + \varepsilon_{it}.$$
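Minimizing $\sum_i\sum_t (y_{it} - \alpha_i - \beta x_{it})^2$ with respect to $\alpha_i$ and $\beta$, we
obtain the first-order conditions
$$\sum_t (y_{it} - \alpha_i - \beta x_{it}) = 0, \qquad i = 1, \ldots, N,$$
$$\sum_i\sum_t x_{it}(y_{it} - \alpha_i - \beta x_{it}) = 0,$$
so that $\hat{\alpha}_i = \bar{y}_i - \hat{\beta}\bar{x}_i$ and
$$\hat{\beta} = \frac{\sum_i\sum_t x_{it}(y_{it} - \bar{y}_i)}{\sum_i\sum_t x_{it}(x_{it} - \bar{x}_i)},$$
where $\bar{y}_i$ and $\bar{x}_i$ are the time means for individual $i$.
The Residual Sum of Squares now has $\sum_i T_i - (N+1)$ degrees of freedom
($N+1$ parameters are estimated)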
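The estimator with individual intercepts and a common slope can be sketched numerically as follows. The simulated unbalanced panel, the true parameter values and the variable names are assumptions for illustration; the dummy-variable regression at the end is added only as a cross-check.

```python
import numpy as np

rng = np.random.default_rng(2)
T_i = [4, 6, 3, 5, 7]                     # hypothetical periods per individual
N = len(T_i)
ids = np.repeat(np.arange(N), T_i)        # individual index of each observation

alpha_true = rng.normal(size=N)           # assumed individual intercepts
beta_true = 1.5                           # assumed common slope
x = alpha_true[ids] + rng.normal(size=ids.size)
y = alpha_true[ids] + beta_true * x + rng.normal(scale=0.2, size=ids.size)

# Individual time means of x and y
counts = np.bincount(ids)
x_bar_i = np.bincount(ids, weights=x) / counts
y_bar_i = np.bincount(ids, weights=y) / counts

# Common slope from the first-order conditions, then the N intercepts
beta_hat = (x * (y - y_bar_i[ids])).sum() / (x * (x - x_bar_i[ids])).sum()
alpha_hat = y_bar_i - beta_hat * x_bar_i

# Residual variance with sum(T_i) - (N + 1) degrees of freedom
resid = y - alpha_hat[ids] - beta_hat * x
dof = ids.size - (N + 1)
sigma2_hat = (resid ** 2).sum() / dof

# Cross-check: least squares on N individual dummies plus x
D = (ids[:, None] == np.arange(N)).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([D, x]), y, rcond=None)

print(beta_hat, coef[-1], dof)
```

The last coefficient of the dummy-variable regression coincides with `beta_hat`, and the first `N` coefficients coincide with `alpha_hat`, so both routes solve the same minimization problem.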
This is the most popular model encountered in empirical applications
1.3 Some definitions
Typical panel: when number of units (individuals) N is large,
and number of time periods (T) is small
Short (long) panel: when # periods T is small (large)
Balanced panel: same # periods for every unit (individual)
Rotating panel: A subset of individuals is replaced every
period. Rotating panels can be balanced or unbalanced
Pseudo panel: when one is pooling cross-sections made of
different individuals for every period
Attrition: with long panels, the probability that an individual
remains in the sample decreases as the number of periods increases
(non response, moving, death, etc.)
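The definition of a balanced panel can be checked mechanically on a list of (unit, period) observations. The helper names `periods_per_unit` and `is_balanced` and the toy data are hypothetical, introduced only for this sketch.

```python
from collections import Counter

def periods_per_unit(observations):
    """Count the number of periods observed for each unit."""
    return Counter(unit for unit, _period in observations)

def is_balanced(observations):
    """A panel is balanced when every unit has the same number of periods."""
    counts = periods_per_unit(observations)
    return len(set(counts.values())) <= 1

# Toy data: each observation is a (unit, period) pair
balanced = [("a", 1), ("a", 2), ("b", 1), ("b", 2)]
unbalanced = [("a", 1), ("a", 2), ("a", 3), ("b", 1)]

print(is_balanced(balanced), is_balanced(unbalanced))
print(dict(periods_per_unit(unbalanced)))
```

With attrition, the counts returned by `periods_per_unit` would differ across units, and the panel would be reported as unbalanced.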
The linear model
individuals
Component of dependent variable that is unexplained by $x_{it}$:
$$u_{it} = \alpha_i + \lambda_t + \varepsilon_{it},$$
where $\varepsilon_{it}$ is the i.i.d. component
One-way error-component model: $u_{it} = \alpha_i + \varepsilon_{it}$
Two-way error-component model: $u_{it} = \alpha_i + \lambda_t + \varepsilon_{it}$
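The one-way and two-way error structures can be built directly from their components. The sketch below assumes a balanced panel, and the names `alpha` (individual effect), `lam` (time effect) and `eps` (i.i.d. component) are labels chosen for this illustration. It checks that taking deviations from individual means removes the individual effect, and that the two-way deviation removes both effects.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 6, 4                          # illustrative balanced panel

alpha = rng.normal(size=(N, 1))      # individual effects, constant over t
lam = rng.normal(size=(1, T))        # time effects, constant over i
eps = rng.normal(size=(N, T))        # i.i.d. component

u_one_way = alpha + eps              # one-way error component
u_two_way = alpha + lam + eps        # two-way error component

def within_individual(a):
    """Deviations from each individual's time mean."""
    return a - a.mean(axis=1, keepdims=True)

def within_two_way(a):
    """Deviations from individual and period means, plus the overall mean."""
    return (a - a.mean(axis=1, keepdims=True)
              - a.mean(axis=0, keepdims=True) + a.mean())

one_way_ok = np.allclose(within_individual(u_one_way), within_individual(eps))
two_way_ok = np.allclose(within_two_way(u_two_way), within_two_way(eps))
print(one_way_ok, two_way_ok)
```

Both comparisons hold exactly in a balanced panel because the transformations are linear and map any term that is constant over $t$ (or over $i$) to zero.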
Allows several predictions of $y_{it}$ given $x_{it}$:
$E(y_{it} \mid x_{it}) = x_{it}$