AN ENDOGENOUS SWITCHING SIMULTANEOUS EQUATION SYSTEM OF
EMPLOYMENT, INCOME AND CAR OWNERSHIP
Chandra R. Bhat
Research Assistant Professor
Transportation Center
Northwestern University
Evanston, Illinois 60208
and
Frank S. Koppelman
Professor of Civil Engineering and Transportation
Northwestern University
Evanston, Illinois 60208
Abstract
The research presented here makes an advance toward the inclusion of employment and income within a transportation framework, based on the conceptual framework developed by the authors in a preceding paper. Employment and income are important determinants of travel behavior. They have been used as exogenous variables in travel forecasting models such as trip generation models, car ownership models, and mode choice models. This paper proposes a fundamental change in the current view of employment and income as exogenous variables in travel demand models. In particular, we emphasize the need, from both a forecasting and an estimation point of view, to include employment and income as endogenous variables within a disaggregate travel demand modeling framework. The paper formulates and estimates an integrated model of employment, income and car ownership which takes account of interdependencies among these variables and their structural relationships with relevant exogenous variables.
Introduction
Traditional trip-based travel analyses consider the number of workers in a household and household income as exogenous variables. Data on employment and income are obtained from supplementary demographic forecasts. Such supplementary demographic forecasts are, in general, of an aggregate nature and do not support reliable disaggregate travel behavior analysis. This paper argues for the consideration of employment and income within a disaggregate travel demand framework and formulates and estimates a joint model system of employment, income and car ownership.
The next section discusses the methodological need to model employment and income within a transportation context. The third section describes the data source and the sample used for the empirical analysis. The fourth section advances a structure for the integrated model system and presents the estimation methodology. The fifth section presents the empirical specification and results of the model system. A brief summary and conclusions are presented in the final section.
Need to Model Employment and Income within Travel Demand Framework
In an earlier paper, we emphasized the need to model employment and income from an activity-based perspective to travel demand modeling (Bhat and Koppelman, 1993). Here we argue that the need to model employment and income is also important from a trip-based approach to travel demand modeling.
The number of workers in a household and household income are very important variables in travel demand models such as car ownership models (Golob, 1989; Kitamura, 1988), trip generation models (Meurs, 1989), and mode choice models (Beggan, 1988). Despite their fundamental importance as determinants of travel behavior, the forecasting of employment and income has been treated outside the framework of the transportation planning cycle. Employment and income forecasts are relegated to simple aggregate-level side-calculations rather than being based on causal models that address the behavioral factors underlying employment decisions and income-earning potential. Such aggregate-level forecasts fail to adequately represent the distribution of changes in employment and income across various socio-demographic groups. This is likely to lead to inconsistent employment and income forecasts and, consequently, misleading and inaccurate forecasting of travel-related variables. A causal disaggregate model of employment and income, using readily available transportation planning data, can be used as part of an overall transportation planning process and will support reliable travel behavior analysis.1
In addition to the need to obtain reliable forecasts of employment and income, consideration of employment and income within a travel framework is also important for consistent parameter estimation of travel demand models. There are two sources of potential inconsistency in traditional estimation procedures. The first arises because traditional methods ignore the correlation in unobserved factors that may affect both the employment decision of individuals in a household and the travel-related variable under consideration (car ownership in this paper). The second source of inconsistency arises from the manner in which traditional methods treat grouped (or interval-level) income data. Traditional procedures handle grouped income data by using midpoints of class intervals. Open-ended groups (i.e., the two groups at either extreme of the income spectrum) are assigned values on an even more ad hoc basis. Such a method will, in general, not result in consistent parameter estimates of travel demand models (Hsiao, 1983).2
We use an endogenous switching equation system of employment, income and car ownership to overcome the two sources of inconsistency discussed above.
Data Source and Sample Formation
The data source used in the present study is the Dutch National Mobility Panel (Van Wissen and Meurs, 1989). This panel was instituted in 1984 and involves weekly travel diaries and household and personal questionnaires collected at biannual and annual intervals. Ten waves (a wave refers to cross-sectional data at one time point) were collected between March 1984 and March 1989. Data for our analysis are obtained from waves 1, 3, 5, 7 and 9 of the panel, collected during the spring of each year between (and including) 1984 and 1988. The data were screened to include only nuclear family households3 in which the husband is employed. We removed households in which the husband was unemployed because there were too few of them to undertake any meaningful analysis of the husband's employment. Households in which adults are self-employed were excluded because the concept of income is not clearly defined for such individuals. Households with seniors over 60 years and/or disabled persons were removed from the sample due to their low rate of employment. The resulting sample includes 2279 observations of nuclear family households. We do not account in this paper for biases in the standard errors due to repeated measurements on households which occur in more than one wave.
1 The introduction of car ownership as a component of the demand forecasting cycle emerged about two decades ago from a similar need for disaggregate causal modeling of car ownership (Lerman and Ben-Akiva, 1975). Today, car ownership modeling is considered an integral part of the disaggregate forecasting process.
2 See also Gaudry (1979) for a discussion of the importance of the joint treatment of work-related variables and travel demand variables from an equilibrium-oriented demand framework.
3 A nuclear family household comprises two adults, a male and a female, with one or more children below the age of 18.
Model System
The endogenous variables in our model are husband's income, wife's employment choice, wife's income and household car ownership. In this section, we develop the equation system of the model and also present the econometric procedure used in estimation. We use a limited information maximum likelihood procedure to estimate the system. In this limited information procedure, each equation is estimated individually after appropriately accounting for the limited dependent nature of the endogenous variable. The income variables occurring on the right hand side of other equations are replaced by their imputed values obtained from the estimation of their respective equations (these imputed income values are unbiased estimators of the actual income values). In the following presentation, the subscript $i$ denotes observations (or households) and all references to income are in real value terms.
Husband’s Income
The first equation in the model system is husband's income. We use a logarithmic transformation of income and express this transformed variable as a linear function of independent variables (an extensive treatment of the theoretical appropriateness of a log-normal form for the income distribution is available in Aitchison and Brown, 1976, and Mincer, 1974). The grouped nature of income is addressed by defining a continuous index function (also referred to as a latent function) for the logarithm of husband's income, $I^*_{hi}$:

$$I^*_{hi} = \pi_h' \omega_{hi} + v_{hi}, \qquad I_{hi} = j \ \text{ if } \ \frac{a_{j-1}}{p_i} < I^*_{hi} \le \frac{a_j}{p_i} \tag{1}$$

where $v_{hi}$ is a normal random error term with mean 0 and variance $\sigma_h^2$, $\omega_{hi}$ is a vector of exogenous variables affecting husband's (log) income and $\pi_h$ is a corresponding vector of parameters. The $a_j$'s in the equation represent known threshold values for each income category, and $A_{j,i} = a_j / p_i$ denotes the corresponding real income censoring bounds. Since the price index $p_i$ varies among observations, the thresholds are not fixed. The $J$ income intervals exhaust the real line and hence we assume $a_0 / p_i = -\infty$ and $a_J / p_i = +\infty$. Representing the cumulative standard normal by $\Phi$, the probability that husband's income falls in category $j$ may be written

$$\text{Prob}(I_{hi} = j) = \Phi\left[\frac{A_{j,i} - \pi_h' \omega_{hi}}{\sigma_h}\right] - \Phi\left[\frac{A_{j-1,i} - \pi_h' \omega_{hi}}{\sigma_h}\right] \tag{2}$$

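Defining a set of dummy variables

$$M_{ij} = \begin{cases} 1 & \text{if } I_{hi} \text{ falls in the } j\text{th category} \\ 0 & \text{otherwise} \end{cases} \qquad (i = 1, 2, \ldots, N; \ j = 1, 2, \ldots, J), \tag{3}$$

the likelihood function for estimation of the parameters $\pi_h$ and $\sigma_h$ is

$$L_h = \prod_{i=1}^{N} \prod_{j=1}^{J} \left\{ \Phi\left[\frac{A_{j,i} - \pi_h' \omega_{hi}}{\sigma_h}\right] - \Phi\left[\frac{A_{j-1,i} - \pi_h' \omega_{hi}}{\sigma_h}\right] \right\}^{M_{ij}} \tag{4}$$
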
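The logarithm of equation (4) can be evaluated directly. The following is a minimal sketch, not the authors' implementation: the function name `grouped_loglik` and its argument layout are hypothetical, and each observation is assumed to carry the real bounds of its observed category (so that only the term with $M_{ij} = 1$ contributes).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf()

def grouped_loglik(pi_h, sigma_h, omega, lower, upper):
    """Log of equation (4) for interval-coded (log) income.

    pi_h    : coefficient vector (hypothetical layout: one value per covariate)
    sigma_h : standard deviation of the error term
    omega   : list of covariate vectors, one per observation
    lower   : A_{j-1,i} for the observed category (use -inf for the lowest group)
    upper   : A_{j,i}   for the observed category (use +inf for the highest group)
    """
    total = 0.0
    for x, lo, hi in zip(omega, lower, upper):
        mean = sum(p * v for p, v in zip(pi_h, x))
        p_hi = 1.0 if hi == math.inf else STD_NORMAL.cdf((hi - mean) / sigma_h)
        p_lo = 0.0 if lo == -math.inf else STD_NORMAL.cdf((lo - mean) / sigma_h)
        # Guard against log(0) when an interval has negligible probability.
        total += math.log(max(p_hi - p_lo, 1e-300))
    return total

# Hypothetical data: intercept plus one covariate, two households.
omega = [[1.0, 0.0], [1.0, 1.0]]
print(grouped_loglik([10.0, 0.3], 0.5, omega, [-math.inf, 10.0], [10.0, 10.6]))
```

Passing this function (negated) to any general-purpose optimizer would give the direct maximization described in the text; the choice of optimizer is left open here.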
Initial start values for maximum likelihood (ML) iterations are obtained by assigning to each income observation its conditional expectation based on the marginal distribution of $I^*_{hi}$ and then regressing these conditional expectations on the vector of exogenous variables.4
An imputed value for husband's (log) income is computed from the estimation of equation (4) as $\hat{I}^*_{hi} = \hat{\pi}_h' \omega_{hi}$ and is used for husband's (log) income in subsequent equations.
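The conditional expectation used for the start values is the mean of a normal variable truncated to the observed interval. The sketch below assumes the marginal distribution of the latent (log) income is normal with mean `mu` and standard deviation `sigma`; these two quantities, and the function name `interval_mean`, are assumptions introduced for illustration and would themselves have to be estimated from the grouped data.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def interval_mean(mu, sigma, lower, upper):
    """E[I* | lower < I* <= upper] for I* ~ Normal(mu, sigma^2).

    Open-ended groups are handled by passing -inf or +inf as a bound.
    """
    alpha = (lower - mu) / sigma
    beta = (upper - mu) / sigma
    pdf_a = 0.0 if math.isinf(alpha) else STD_NORMAL.pdf(alpha)
    pdf_b = 0.0 if math.isinf(beta) else STD_NORMAL.pdf(beta)
    cdf_a = 0.0 if alpha == -math.inf else STD_NORMAL.cdf(alpha)
    cdf_b = 1.0 if beta == math.inf else STD_NORMAL.cdf(beta)
    return mu + sigma * (pdf_a - pdf_b) / (cdf_b - cdf_a)

# Hypothetical values: a closed interval and the open-ended lowest group.
print(interval_mean(10.0, 0.5, 9.8, 10.4))
print(interval_mean(10.0, 0.5, -math.inf, 9.5))
```

Regressing these interval means on the exogenous variables by ordinary least squares then yields start values for $\pi_h$, with the residual standard deviation serving as a start value for $\sigma_h$.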
4 In a recent paper, Stern (1991) maximizes equation (4) using a two-step procedure rather than a direct maximization procedure. The two-step procedure is not only inefficient but also tedious compared to the direct and computationally simple maximization procedure used here. His procedure also does not provide an estimate of $\sigma_h$ and assumes that the thresholds (the $A_{j,i}$'s in equation 4) are fixed across all observations.
Wife's Employment
The second equation in our model system is the wife's employment decision. Wife's employment choice is a function of exogenous variables and household assets or unearned income. In our model, husband's (log) income is treated as unearned income to the wife; that is, the wife regards her husband as an "income producing asset" which affects her work decision (Cogan, 1980).
We define a latent continuous function $E^*_i$ denoting the wife's employment propensity and view the discrete employment decision $E_i$ as a reflection of this underlying propensity. If this propensity exceeds zero, the wife will work. Otherwise, she will not work. We may write the relationship between the latent employment propensity and the discrete employment decision in equation form as follows:

$$E^*_i = \pi_e' \omega_{ei} + \gamma_e \hat{I}^*_{hi} + v_{ei}, \qquad E_i = \begin{cases} 1 & \text{if } E^*_i > 0 \\ 0 & \text{if } E^*_i \le 0 \end{cases} \tag{5}$$

where the vector $\omega_{ei}$ represents a vector of exogenous variables affecting wife's employment. We assume a normal distribution for the random error term $v_{ei}$ with mean zero and unit variance. This will be recognized as the familiar probit model. The parameters $\pi_e$ and $\gamma_e$ are estimated using a univariate probit procedure.
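As a small illustration of equation (5), the employment probability is $\Phi(\pi_e' \omega_{ei} + \gamma_e \hat{I}^*_{hi})$ and the probit log-likelihood sums the log of this probability (or its complement) over households. The sketch below uses hypothetical function names and made-up coefficient values; it is not the estimation code used in the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def employment_probability(pi_e, gamma_e, omega_e, husband_log_income):
    """P(E_i = 1) under the probit model of equation (5)."""
    index = sum(p * x for p, x in zip(pi_e, omega_e)) + gamma_e * husband_log_income
    return STD_NORMAL.cdf(index)

def probit_loglik(pi_e, gamma_e, rows):
    """rows: iterable of (omega_e, husband_log_income, employed) tuples."""
    total = 0.0
    for omega_e, income, employed in rows:
        p = employment_probability(pi_e, gamma_e, omega_e, income)
        total += math.log(p) if employed else math.log(1.0 - p)
    return total

# Hypothetical households: intercept plus one covariate; imputed husband log income.
rows = [([1.0, 0.4], 10.2, 1), ([1.0, 0.1], 10.9, 0)]
print(probit_loglik([2.0, 0.5], -0.2, rows))
```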
Wife’s Income
Wife's income is conditional on her employment status. In addition, it is available only in grouped form. We specify an index function of wife's income and assume a lognormal distribution for this function. Defining the index function for wife's (log) income as $I^*_{wi}$ and the observed categorical wife's income data as $I_{wi}$, we write

$$I^*_{wi} = \pi_w' \omega_{wi} + \gamma_w \hat{I}^*_{hi} + v_{wi}, \qquad I_{wi} = l \ \text{ if } \ \frac{d_{l-1}}{p_i} < I^*_{wi} \le \frac{d_l}{p_i}, \ \text{ observed only if } E^*_i > 0 \tag{6}$$

where $l$ is an index for categories ($l = 1, 2, \ldots, L$), $d_l$ represents the thresholds of absolute income and $p_i$ is the price index. The variable vector $\omega_{wi}$ contains exogenous variables affecting wife's income and $v_{wi}$ is a normal random error term with mean 0 and variance $\sigma_w^2$.5
Wife's income (in log form) is a censored grouped variable (the censoring is based on employment). Limiting our attention to observations in the uncensored portion and estimating parameters by a grouped data method similar to the one employed for the husband's income equation is subject to problems of selection bias (Heckman, 1979; Greene, 1983). Assuming a bivariate normal distribution for the error terms of the underlying latent employment and income functions of the wife, and defining

$$\tilde{\omega}_{ei} = (\omega_{ei}', \hat{I}^*_{hi})', \quad \tilde{\omega}_{wi} = (\omega_{wi}', \hat{I}^*_{hi})', \quad \tilde{\pi}_e = (\pi_e', \gamma_e)', \quad \tilde{\pi}_w = (\pi_w', \gamma_w)',$$

the appropriate maximum likelihood estimation (MLE) procedure for estimation of the parameters is shown in the following equation:6

$$L = \prod_{i=1}^{N} \left[1 - \Phi(\tilde{\pi}_e' \tilde{\omega}_{ei})\right]^{1 - E_i} \prod_{l=1}^{L} \left\{ \Phi_2\left[\frac{D_{l,i} - \tilde{\pi}_w' \tilde{\omega}_{wi}}{\sigma_w}, \ \tilde{\pi}_e' \tilde{\omega}_{ei}, \ -\rho_{ew}\right] - \Phi_2\left[\frac{D_{l-1,i} - \tilde{\pi}_w' \tilde{\omega}_{wi}}{\sigma_w}, \ \tilde{\pi}_e' \tilde{\omega}_{ei}, \ -\rho_{ew}\right] \right\}^{E_i T_{il}} \tag{7}$$

where $\rho_{ew}$ is the correlation between the error terms $v_e$ and $v_w$ in the wife's employment and income equations respectively, $D_{l,i} = d_l / p_i$ represents the real income thresholds associated with each income category $l$ and observation $i$, $\Phi_2$ is the cumulative standard bivariate normal function, and

$$T_{il} = \begin{cases} 1 & \text{if } I_{wi} \text{ falls in the } l\text{th category} \\ 0 & \text{otherwise} \end{cases} \qquad (i = 1, 2, \ldots, N; \ l = 1, 2, \ldots, L) \tag{8}$$

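To make the bracketed term of equation (7) concrete, the sketch below evaluates the joint probability that a wife is employed and that her (log) income falls in a given interval. It is an illustration under stated assumptions rather than the authors' code: the bivariate normal CDF is approximated here by one-dimensional Simpson integration (a stand-in for any library routine), the function names are hypothetical, and the sign $-\rho_{ew}$ follows from rewriting $E^*_i > 0$ as $-v_{ei} < \tilde{\pi}_e' \tilde{\omega}_{ei}$.

```python
import math
from statistics import NormalDist

N01 = NormalDist()

def bvn_cdf(a, b, rho, steps=2000):
    """Approximate P(X <= a, Y <= b) for a standard bivariate normal, |rho| < 1.

    Uses P = integral over x <= a of pdf(x) * Phi((b - rho*x) / sqrt(1 - rho^2)),
    evaluated with Simpson's rule on [-8, min(a, 8)] (steps must be even).
    """
    lo, hi = -8.0, min(a, 8.0)
    if hi <= lo:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)

    def f(x):
        return N01.pdf(x) * N01.cdf((b - rho * x) / scale)

    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for m in range(1, steps):
        total += (4 if m % 2 else 2) * f(lo + m * h)
    return total * h / 3.0

def employed_income_prob(d_lower, d_upper, income_index, sigma_w, employ_index, rho_ew):
    """P(I_wi = l, E_i = 1): the term in braces in equation (7).

    d_lower, d_upper : real thresholds D_{l-1,i} and D_{l,i} (may be -inf / +inf)
    income_index     : value of the wife's income index (pi_w-tilde' omega_w-tilde)
    employ_index     : value of the employment index (pi_e-tilde' omega_e-tilde)
    """
    upper = bvn_cdf((d_upper - income_index) / sigma_w, employ_index, -rho_ew)
    lower = bvn_cdf((d_lower - income_index) / sigma_w, employ_index, -rho_ew)
    return upper - lower

# Hypothetical thresholds and indices: three income groups for one household.
cuts = [-math.inf, 9.0, 10.0, math.inf]
probs = [employed_income_prob(cuts[l], cuts[l + 1], 9.5, 0.4, 0.3, 0.4) for l in range(3)]
print(probs, sum(probs))
```

A useful check on any implementation is that the category probabilities for an employed wife sum to the marginal employment probability $\Phi(\tilde{\pi}_e' \tilde{\omega}_{ei})$, which is what the final line of the example displays.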
5 Husband’s income is expected to have a negative effect on wife’s hours of work due to the positive income effect of an increase in unearned income on wife’s leisure (Killingsworth, 1983) Since wife's income is related to her hours of work, husband’s income appears in equation (6).
6 We are not aware of any application of this variant of sample selection in the econometric literature. The probit model with sample selection of Van de Ven and Van Praag (1981) is a special case of this structure.
The maximization of the logarithm of the likelihood function in equation (7) provides estimates of the wife's income equation. The employment equation (5) is estimated directly and, as we will see, will be estimated again in conjunction with the car ownership equations. Consequently, there is a multiplicity of employment estimates. All these estimates are consistent and were found to be very close empirically. We use the univariate probit estimates of the wife's employment parameters for interpretation. Maximum likelihood estimation of equation (7) is carried out to obtain consistent estimates of parameters for wife's income and, similarly, for car ownership.
Initial start values for the maximum likelihood iterations are obtained by a modification of the procedure adopted for husband's income estimation. We assign to each observation in the uncensored region its conditional expectation based on the marginal distribution of the underlying latent continuous variable $I^*_{wi}$. We then treat these values as the actual continuous income values and apply Heckman's two-step method for sample selection models to obtain start values for the parameters.
An imputed value of the wife's (log) income for employed wives is computed from the final MLE parameters as

$$\hat{I}^*_{wi} = E(I^*_{wi} \mid E_i = 1) = E(I^*_{wi} \mid E^*_i > 0) = \hat{\pi}_w' \omega_{wi} + \hat{\gamma}_w \hat{I}^*_{hi} + \hat{\rho}_{ew} \hat{\sigma}_w \hat{\lambda}_i \tag{9}$$

where $\hat{\pi}_w$, $\hat{\gamma}_w$, $\hat{\rho}_{ew}$, and $\hat{\sigma}_w$ are estimated values obtained from the maximization of equation (7), and $\hat{\lambda}_i$ is an estimate of the familiar selectivity correction term (see Heckman, 1979). This imputed value serves as an unbiased estimate of income for employed wives and is used in the car ownership equation.
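Equation (9) can be sketched in a few lines. The example assumes the standard Heckman form of the selectivity correction term, $\lambda_i = \phi(z_i) / \Phi(z_i)$ with $z_i$ the estimated employment index; that explicit form is an assumption here, since the text only refers to Heckman (1979), and the function names and numbers are hypothetical.

```python
from statistics import NormalDist

N01 = NormalDist()

def inverse_mills(z):
    """Selectivity correction term phi(z) / Phi(z) (assumed standard Heckman form)."""
    return N01.pdf(z) / N01.cdf(z)

def imputed_wife_log_income(income_index, rho_ew, sigma_w, employ_index):
    """Equation (9): conditional mean of latent (log) income for an employed wife.

    income_index : pi_w-hat' omega_wi + gamma_w-hat * imputed husband log income
    employ_index : estimated employment index pi_e-tilde' omega_e-tilde
    """
    return income_index + rho_ew * sigma_w * inverse_mills(employ_index)

# Hypothetical estimates for one employed wife.
print(imputed_wife_log_income(9.4, 0.35, 0.6, 0.2))
```

With $\hat{\rho}_{ew} = 0$ the correction vanishes and the imputed value reduces to the linear index, which is the case in which selection on employment can be ignored.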
Household Car Ownership
The household car ownership choice is modeled as a two-equation switching system with wife's employment behaving as the endogenous switch. We postulate a latent variable representing household motivation or intention to own cars in each switching regime. The observable information is the categorical car ownership variable. Assuming a normal distribution for the latent car ownership intention, an ordered response probit correspondence is established in each switching car ownership regime. The resulting two-equation switching car ownership system is as follows:
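
$$\begin{aligned} C^*_i &= \pi_{c1}' \omega_{ci} + \gamma_{c1} (\hat{I}^*_{hi} + \hat{I}^*_{wi}) + v_{ci} && \text{if } E^*_i > 0 \\ C^*_i &= \pi_{c0}' \omega_{ci} + \gamma_{c0} \hat{I}^*_{hi} + v_{ci} && \text{if } E^*_i \le 0 \\ C_i &= k \ \text{ if } \ \psi_{k-1} < C^*_i \le \psi_k, && k = 0, 1, \ldots, K; \ \psi_{-1} = -\infty, \ \psi_K = +\infty \end{aligned} \tag{10}$$
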
where $v_{ci}$ is a random error term associated with the car ownership equations. The $\psi$'s are thresholds that determine the correspondence between the observed car ownership choice and the latent propensity to own cars. These are estimated along with the other parameters. Wife's (log) income and husband's (log) income have identical coefficients in the "wife-employed" regime. Wife's (log) income does not appear in the second equation. Statistical tests for the equality of the income effects ($\gamma_{c1}$ and $\gamma_{c0}$) and of elements of the coefficient vectors $\pi_{c1}$ and $\pi_{c0}$ can be performed during estimation.
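The regime structure of the car ownership system can be illustrated with a short sketch. It computes the latent index for each regime and the ordered probit category probabilities given that index; it deliberately ignores the correlation between the employment and car ownership error terms (treating the switch as exogenous) and assumes a unit-variance error. All names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

N01 = NormalDist()

def latent_index(employed, omega_c, husband_inc, wife_inc,
                 pi_c1, gamma_c1, pi_c0, gamma_c0):
    """Systematic part of the latent car ownership propensity in each regime."""
    if employed:
        # Wife-employed regime: both (log) incomes share the coefficient gamma_c1.
        return sum(p * x for p, x in zip(pi_c1, omega_c)) + gamma_c1 * (husband_inc + wife_inc)
    # Wife-not-employed regime: only husband's (log) income enters.
    return sum(p * x for p, x in zip(pi_c0, omega_c)) + gamma_c0 * husband_inc

def category_probs(index, interior_thresholds):
    """Ordered probit probabilities; the outer thresholds are -inf and +inf."""
    cuts = [-math.inf] + list(interior_thresholds) + [math.inf]
    cdf = [N01.cdf(c - index) for c in cuts]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]

# Hypothetical household with three car ownership levels (two interior thresholds).
idx = latent_index(True, [1.0], 10.0, 9.0, [-1.5], 0.1, [-1.0], 0.12)
print(category_probs(idx, [-0.5, 1.2]))
```

Accounting for endogenous switching replaces each univariate normal CDF above with a bivariate normal CDF that also involves the employment index.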
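We treat the car ownership equations as a switching ordered probit system with wife's employment behaving as the endogenous switch.7 This switching system accommodates possible correlation in unmeasured tastes that affect car ownership and wife's work choice. Defining

$$\tilde{\pi}_{c1} = (\pi_{c1}', \gamma_{c1}, \gamma_{c1})', \quad \tilde{\omega}_{c1i} = (\omega_{ci}', \hat{I}^*_{hi}, \hat{I}^*_{wi})', \quad \tilde{\pi}_{c0} = (\pi_{c0}', \gamma_{c0})', \quad \tilde{\omega}_{c0i} = (\omega_{ci}', \hat{I}^*_{hi})', \tag{11}$$

the likelihood function for the car ownership system is

$$\begin{aligned} L_c = \prod_{i=1}^{N} & \left[ \prod_{k=0}^{K} \left\{ \Phi_2(\psi_k - \tilde{\pi}_{c1}' \tilde{\omega}_{c1i}, \ \tilde{\pi}_e' \tilde{\omega}_{ei}, \ -\rho_{ec}) - \Phi_2(\psi_{k-1} - \tilde{\pi}_{c1}' \tilde{\omega}_{c1i}, \ \tilde{\pi}_e' \tilde{\omega}_{ei}, \ -\rho_{ec}) \right\}^{W_{ik}} \right]^{E_i} \\ \times & \left[ \prod_{k=0}^{K} \left\{ \Phi_2(\psi_k - \tilde{\pi}_{c0}' \tilde{\omega}_{c0i}, \ -\tilde{\pi}_e' \tilde{\omega}_{ei}, \ \rho_{ec}) - \Phi_2(\psi_{k-1} - \tilde{\pi}_{c0}' \tilde{\omega}_{c0i}, \ -\tilde{\pi}_e' \tilde{\omega}_{ei}, \ \rho_{ec}) \right\}^{W_{ik}} \right]^{1 - E_i} \end{aligned}$$

where $\rho_{ec}$ is the correlation between the error terms of the wife's employment and car ownership equations, and $W_{ik} = 1$ if the $i$th observation belongs to the $k$th car category and 0 otherwise.
[...] correlation term $\rho_{ec}$ is set to zero.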
Empirical Specification and Results
The choice of variables and the specification adopted in the model was guided by conceptual arguments, empirical evidence provided by earlier labor economic and car ownership studies, and considerations of parsimony in representation. Table 1 provides a list of exogenous variables used in the model and their definitions. The variable termed "work acceptability" is the ratio of total female labor force (that is, all females who are employed, or not employed but seeking jobs) to total active female population in each municipality.8 It represents the degree to which wife's working is considered acceptable or appropriate in each community.9
Price levels are assumed constant across regions in this analysis. The Netherlands is a small country and it may not be unreasonable to assume constant price levels in such a compact geographic area (Killingsworth, 1983). Thus, variations in the price index arise in this study from time series or wave differences in price level.
8 The data used in the computation of this index was obtained from the Central Bureau of Statistics (CBS), Netherlands.
9 We recognize alternative interpretations of the work acceptability index which may represent a combination of location attributes Viewed from this perspective, the index may be considered as a parsimonious representation for the set of local factors influencing wife's work participation.
The estimation results for each equation are presented and discussed in the following sections.
Husband’s Income Equation
The unit of measure used for the husband's income is real income in guilders per year. Three sets of variables are considered in the husband's income equation. These relate to the husband's age, husband's education and wave dummy variables. The results of the grouped data ML estimation of husband's income (in log form) are shown in Table 2a.
Age has a positive impact on husband's income, presumably because it is a proxy for experience (Hausman and Wise, 1976, 1977); however, the magnitude of the age effect declines beyond age 35 from +0.025 to +0.010, possibly attributable to decreasing returns to scale of experience and/or deterioration in efficiency and productivity (Mincer, 1974). The effect of age beyond 45 is more complicated. For individuals with a low education, (log) income decreases beyond the age of 45 at a rate of –0.011 (= 0.025 – 0.015 – 0.021). However, for individuals with medium to high education, the net effect is near zero (–0.011 + 0.009 = –0.002). These results indicate a differential effect of age on productivity based on education level.
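The age effects quoted above combine additively, which the following short sketch makes explicit. The coefficient values are those cited in the text; the constant and function names are hypothetical labels introduced for illustration, not the variable names of Table 2a.

```python
# Coefficient values quoted in the text (names are illustrative labels only).
AGE = 0.025                      # base effect of age on (log) income
AGE_OVER_35 = -0.015             # change in the age effect beyond 35
AGE_OVER_45 = -0.021             # further change beyond 45
AGE_OVER_45_MED_HIGH_EDU = 0.009 # offset beyond 45 for medium to high education

def age_slope(age, medium_or_high_education):
    """Net marginal effect of one more year of age on husband's (log) income."""
    slope = AGE
    if age > 35:
        slope += AGE_OVER_35
    if age > 45:
        slope += AGE_OVER_45
        if medium_or_high_education:
            slope += AGE_OVER_45_MED_HIGH_EDU
    return slope

for age, edu in [(30, False), (40, False), (50, False), (50, True)]:
    print(age, edu, round(age_slope(age, edu), 3))
```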
We introduce two dummy variables corresponding to secondary and high education levels (using primary education as the base category) to represent the effect of education on income earnings. Table 2a shows that there is a strong positive influence of the education dummy variables on husband's income, with high education having a greater influence than secondary education.
The wave dummy variables capture temporal variations in (real) income earning potential. Such temporal variations may arise from differences in the state of the economy, e.g., changes in costs of living and/or absolute income earnings.
The marginal effects of exogenous variables on husband's income (computed at mean variable values) provide additional insights into the estimation results and are presented in Table 2.
Wife’s Employment Equation
The exogenous variable vector in the wife's work participation equation includes a dummy variable for husband's high education, wife's age and education variables, variables pertaining