Title: Lecture 4: Use of Dummy Variables
Author: Nguyen Hoang Bao
Field: Applied Econometrics
Type: Lecture notes
Year of publication: 2004

Applied Econometrics

Lecture 4: Use of Dummy Variables

‘Pure and complete sorrow is as impossible as pure and complete joy’

1) Introduction

The independent variables used in regression equations are usually quantitative and take values over some continuous range. Frequently, however, one may wish to include qualitative independent variables, often called dummy variables, in the regression model in order to (i) capture the presence or absence of a ‘quality’, such as male or female, poor or rich, urban or rural area, college degree or no college degree, different stages of development, or different periods of time; (ii) capture the interaction between such qualities; or (iii) allow a variable to take on one or more distinct values.

2) Intercept Dummy

An intercept dummy is a variable, say D, that takes the value of either 0 or 1. It is normally used as a regressor in the model.

For example, the consumption function (C) can be written as follows:

C = b0 + b1Y + b2D

where

Y is the gross national income

D is equal to 1 for developing countries and 0 for developed countries

Then,

If D = 0, C = b0 + b1Y

If D = 1, C = b0 + b1Y + b2D = (b0 + b2) + b1Y

[Figure: C plotted against Y. Two parallel lines, C = b0 + b1Y and C = (b0 + b2) + b1Y, separated by the vertical distance b2.]
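The intercept shift can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming simulated data; the sample size, the coefficient values and the noise level are illustrative assumptions, not figures from the lecture.

```python
import numpy as np

# Hypothetical data: income Y, and D = 1 for developing, 0 for developed countries.
rng = np.random.default_rng(0)
n = 200
Y = rng.uniform(10, 100, size=n)
D = (np.arange(n) % 2).astype(float)

# Assumed "true" coefficients, chosen only for illustration.
b0, b1, b2 = 5.0, 0.8, -3.0
C = b0 + b1 * Y + b2 * D + rng.normal(0, 1, size=n)

# Design matrix [1, Y, D]; ordinary least squares via lstsq.
X = np.column_stack([np.ones(n), Y, D])
coef, *_ = np.linalg.lstsq(X, C, rcond=None)
b0_hat, b1_hat, b2_hat = coef

# Both groups share the slope; only the intercept differs, by b2_hat.
print("developed  (D=0): intercept", b0_hat, "slope", b1_hat)
print("developing (D=1): intercept", b0_hat + b2_hat, "slope", b1_hat)
```

The estimated coefficient on D is the vertical distance between the two fitted lines.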

Illustrative example 1 (Maddala, 308)

D1 = 1 if gender is male, 0 if gender is female

D2 = 1 if age < 25, 0 otherwise

D3 = 1 if 25 ≤ age < 50, 0 otherwise

D4 = 1 if education < high school degree, 0 otherwise

D5 = 1 if high school degree ≤ education < college degree, 0 otherwise

Then we run the following regression equation:

C = α + βY + γ1D1 + γ2D2 + γ3D3 + γ4D4 + γ5D5

The assumption made in the dummy variable method is that only the intercept changes from group to group, not the slope coefficient of Y.
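As a rough sketch of how the five dummies of illustrative example 1 might be built from raw records, the Python snippet below assumes age bands of under 25, 25 to under 50, and 50 or over, and a hypothetical education code (0 = below high school, 1 = high school but below college, 2 = college degree). The records themselves are made up for illustration.

```python
import numpy as np

# Hypothetical household records: (gender, age, education code).
# Education codes are an assumption: 0 = below high school,
# 1 = high school but below college, 2 = college degree.
records = [("male", 22, 0), ("female", 37, 1), ("male", 61, 2), ("female", 24, 2)]

def dummies(gender, age, educ):
    """Return [D1, D2, D3, D4, D5] for one record."""
    d1 = 1 if gender == "male" else 0   # base group: female
    d2 = 1 if age < 25 else 0           # base group: age 50 or over
    d3 = 1 if 25 <= age < 50 else 0
    d4 = 1 if educ == 0 else 0          # base group: college degree
    d5 = 1 if educ == 1 else 0
    return [d1, d2, d3, d4, d5]

D = np.array([dummies(*rec) for rec in records])
print(D)
```

Each qualitative variable with m categories contributes m - 1 columns, so a record in the base group of every variable has all five dummies equal to zero.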

Illustrative example 2 (Maddala, 309)

The dummy variable method is also used to take care of seasonal factors. For example, if we have quarterly data on C and Y, we fit the regression equation

C = α + βY + λ1D1 + λ2D2 + λ3D3

where D1, D2, and D3 are seasonal dummies defined by:

D1 = 1 for the first quarter, 0 for others

D2 = 1 for the second quarter, 0 for others

D3 = 1 for the third quarter, 0 for others
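A small sketch of the seasonal regression in Python with NumPy follows. The quarterly income figures and the coefficient values are hypothetical, and consumption is generated without noise so that least squares recovers the assumed coefficients exactly.

```python
import numpy as np

# Hypothetical quarterly data for two years (quarter labels 1..4).
quarter = np.array([1, 2, 3, 4, 1, 2, 3, 4])
Y = np.array([100.0, 104.0, 103.0, 110.0, 108.0, 113.0, 111.0, 120.0])

# Seasonal dummies; the fourth quarter is the base (all three dummies are 0).
D1 = (quarter == 1).astype(float)
D2 = (quarter == 2).astype(float)
D3 = (quarter == 3).astype(float)

# Assumed coefficients, used to generate noise-free consumption data.
alpha, beta, lam1, lam2, lam3 = 10.0, 0.9, -2.0, 1.0, 3.0
C = alpha + beta * Y + lam1 * D1 + lam2 * D2 + lam3 * D3

# Regress C on a constant, Y and the three seasonal dummies.
X = np.column_stack([np.ones(len(Y)), Y, D1, D2, D3])
coef, *_ = np.linalg.lstsq(X, C, rcond=None)
print(np.round(coef, 6))
```

Each λ measures the shift of the intercept in that quarter relative to the fourth quarter, which has no dummy of its own.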

3) Slope Dummy

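The slope dummy is defined as an interactive variable, the product of the dummy D and the regressor Y:

DY = D x Y

where D is equal to 1 for developing countries and 0 for developed countries.

The model can be written as follows:

C = b0 + b1Y + b2DY

Then,

If D = 0, C = b0 + b1Y

If D = 1, C = b0 + b1Y + b2DY = b0 + (b1 + b2)Y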

[Figure: C plotted against Y. Two lines with the common intercept b0 and different slopes: C = b0 + b1Y and C = b0 + (b1 + b2)Y.]
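The slope dummy can be illustrated in the same way. This is a sketch under assumed values (simulated data, made-up coefficients): the regressor DY is simply the element-wise product of D and Y.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
Y = rng.uniform(10, 100, size=n)
D = (np.arange(n) % 2).astype(float)   # 1 = developing, 0 = developed (hypothetical)
DY = D * Y                             # interactive (slope) dummy

# Assumed coefficients for the simulation.
b0, b1, b2 = 5.0, 0.8, -0.2
C = b0 + b1 * Y + b2 * DY + rng.normal(0, 1, size=n)

# Regress C on a constant, Y and DY.
X = np.column_stack([np.ones(n), Y, DY])
(b0_hat, b1_hat, b2_hat), *_ = np.linalg.lstsq(X, C, rcond=None)

slope_developed = b1_hat               # slope when D = 0
slope_developing = b1_hat + b2_hat     # slope when D = 1
print(slope_developed, slope_developing)
```

Here the intercept is common to both groups, and the coefficient on DY is the difference between the two slopes.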

4) Combination of Slope and Intercept Dummies

We may include both slope and intercept dummies in a regression model. As before,

DY = D x Y

where D is equal to 1 for developing countries and 0 for developed countries.

The general model can be written as follows:

C = b0 + b1Y + b2D + b3DY

Then,

If D = 0, C = b0 + b1Y

If D = 1, C = b0 + b1Y + b2D + b3DY = (b0 + b2) + (b1 + b3)Y

[Figure: C plotted against Y. Two lines that differ in both intercept and slope: C = b0 + b1Y and C = (b0 + b2) + (b1 + b3)Y.]
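With both dummies included, the pooled regression gives the same two fitted lines as running a separate regression of C on Y within each group. The sketch below checks this on simulated data; the data and coefficient values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
Y = rng.uniform(10, 100, size=n)
D = (np.arange(n) % 2).astype(float)   # hypothetical group indicator

# Assumed coefficients: both the intercept and the slope differ by group.
C = 5.0 + 0.8 * Y + 2.0 * D - 0.2 * D * Y + rng.normal(0, 1, size=n)

# Pooled regression with intercept dummy D and slope dummy D*Y.
X = np.column_stack([np.ones(n), Y, D, D * Y])
(b0, b1, b2, b3), *_ = np.linalg.lstsq(X, C, rcond=None)

# Separate simple regressions within each group (polyfit returns slope first).
slope0, icpt0 = np.polyfit(Y[D == 0], C[D == 0], 1)
slope1, icpt1 = np.polyfit(Y[D == 1], C[D == 1], 1)

print("D=0:", (b0, b1), "vs", (icpt0, slope0))
print("D=1:", (b0 + b2, b1 + b3), "vs", (icpt1, slope1))
```

The advantage of the pooled form is that b2 and b3 appear as coefficients in their own right, so their significance can be tested directly.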

5) Piecewise Linear Regression Model

Most of the econometric models we have studied so far have been continuous, with small changes in one variable having a measurable effect on another variable. A piecewise linear regression allows the slope of this relationship to change at a threshold value.

If we want to explain investment (I) as a function of the interest rate (r), the relationship can be modelled by the two segments of a piecewise linear regression, which meet at the threshold r*.

The general model can be written as follows:

I = b0 + b1r + b2(r - r*)D

If r < r*, then D = 0: I = b0 + b1r

If r ≥ r*, then D = 1: I = (b0 - b2r*) + (b1 + b2)r

where the threshold r* is chosen by plotting the dependent variable against the explanatory variable and observing whether there seems to be a sharp change in the relation after a given value r*.

[Figure: I plotted against r. Two line segments that join at r = r*.]
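The following sketch fits the piecewise model on simulated data and confirms that the two fitted segments meet at r*. The interest-rate grid, the threshold of 8 and the coefficient values are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
r = np.linspace(1.0, 15.0, 60)         # hypothetical interest rates (%)
r_star = 8.0                           # assumed threshold, e.g. read off a scatter plot
D = (r >= r_star).astype(float)

# Assumed coefficients for the simulation.
b0, b1, b2 = 50.0, -1.0, -3.0
I = b0 + b1 * r + b2 * (r - r_star) * D + rng.normal(0, 0.5, size=r.size)

# Regress I on a constant, r and (r - r*)D.
X = np.column_stack([np.ones(r.size), r, (r - r_star) * D])
(b0_hat, b1_hat, b2_hat), *_ = np.linalg.lstsq(X, I, rcond=None)

# Evaluate both fitted segments at the threshold.
left_at_knot = b0_hat + b1_hat * r_star
right_at_knot = (b0_hat - b2_hat * r_star) + (b1_hat + b2_hat) * r_star
print(left_at_knot, right_at_knot)
```

Because the extra regressor is (r - r*)D rather than D or rD, it is zero at r = r*, so the fitted line has a kink at the threshold but no jump.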

6) Summary

If a qualitative variable has m categories, we include (m - 1) dummy variables in the model. The coefficients attached to the dummy variables must always be interpreted in relation to the base category, that is, the group that gets the value zero.

The use of dummy variables associated with two or more categorical variables allows us to study partial association and interaction effects in the context of multiple regression. Interactive dummies are obtained by multiplying dummies corresponding to the different categorical variables. This allows us to test formally whether interaction is present or not.
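One way to see why only (m - 1) dummies are included, assuming the model also contains an intercept, is that the m dummies sum to the column of ones. The design matrix then loses full column rank and the coefficients cannot be estimated separately. A short sketch with a hypothetical three-category variable:

```python
import numpy as np

# Hypothetical categorical variable with m = 3 categories.
category = np.array(["a", "b", "c", "a", "b", "c", "a", "c"])
ones = np.ones(len(category))
Da = (category == "a").astype(float)
Db = (category == "b").astype(float)
Dc = (category == "c").astype(float)

X_all = np.column_stack([ones, Da, Db, Dc])  # intercept + m dummies
X_ok = np.column_stack([ones, Da, Db])       # intercept + (m - 1) dummies; "c" is the base

rank_all = np.linalg.matrix_rank(X_all)
rank_ok = np.linalg.matrix_rank(X_ok)
print(rank_all, "of", X_all.shape[1], "columns")
print(rank_ok, "of", X_ok.shape[1], "columns")
```

Dropping one category restores full column rank, and the omitted category becomes the base against which the remaining coefficients are read.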

References

Bao, Nguyen Hoang (1995), ‘Applied Econometrics’, Lecture notes and Readings, Vietnam-Netherlands Project for MA Program in Economics of Development.

Maddala, G.S. (1992), ‘Introduction to Econometrics’, Macmillan Publishing Company, New York.

Mukherjee, Chandan, Howard White and Marc Wuyts (1998), ‘Econometrics and Data Analysis for Developing Countries’, Routledge, London, UK.

Wonnacott, Thomas H. and Ronald J. Wonnacott (1990), ‘Introductory Statistics’, John Wiley and Sons, Inc., United States of America.

Workshop 4: Use of Dummy Variables

1) To help firms determine which of their executive salaries might be out of line, a management consultant fitted the following multiple regression equation from a database of 270 executives under the age of 40:

SAL = 43.3 + 1.23 EXP + 3.60 EDUC + 0.74 MALE

(SE) (0.30) (1.20) (1.10)

residual standard deviation s = 16.4

where

SAL = the executive’s annual salary ($000)

EDUC = number of years of post-secondary education

EXP = number of years of experience

MALE = dummy variable, coded 1 for male, 0 for female
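To show how such an equation is used, here is a minimal sketch that computes a fitted salary and the distance of an actual salary from it, measured in residual standard deviations. The executive in the example is hypothetical and is not one of the cases discussed in the questions that follow.

```python
def fitted_salary(exp_years, educ_years, male):
    """Fitted annual salary in $000 from the estimated equation above."""
    return 43.3 + 1.23 * exp_years + 3.60 * educ_years + 0.74 * male

S = 16.4  # residual standard deviation reported with the regression

# Hypothetical executive (not from the text): 10 years of experience,
# 4 years of post-secondary education, female, actual salary $90,000.
fitted = fitted_salary(exp_years=10, educ_years=4, male=0)
z = (90.0 - fitted) / S   # actual minus fitted, in residual standard deviations
print(round(fitted, 2), round(z, 2))
```

For this made-up case the fitted salary is about $70,000, so an actual salary of $90,000 lies roughly 1.2 residual standard deviations above it.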

1.1) From this regression, a firm can calculate the fitted salary of each of its executives. If the actual salary is much lower or higher, it can be reviewed to see whether it is appropriate. Fred Kopp, for example, is a 32-year-old vice president of a large restaurant chain. He has been with the firm since he obtained a 2-year MBA at age 25, following a 4-year degree in economics. He now earns $126,000 annually.

1.1.1) What is Fred’s fitted salary?

1.1.2) How many standard deviations is his actual salary away from his fitted salary? Would you therefore call his salary exceptional?

1.1.3) Closer inspection of Fred’s record showed that he had spent two years studying at Oxford as a Rhodes Scholar before obtaining his MBA. In light of this information, recalculate your answers to 1.1.1) and 1.1.2).

1.2) In addition to identifying unusual salaries in specific firms, the regression can be used to answer questions about the economy-wide structure of executive salaries in all firms. For example,

1.2.1) Is there evidence of sex discrimination?

1.2.2) Is it fair to say that each year’s education (beyond high school) increases the income of the average executive by $3,600 a year?

2) In an environmental study of 1072 men, a multiple regression was calculated to show how lung function was related to several factors, including some hazardous occupations (Lefcoe and Wonnacott, 1974):

AIRCAP = 4500 – 39 AGE – 9.0 SMOK – 350 CHEMW – 380 FARMW – 180 FIREW

(SE) (1.8) (2.2) (46) (53) (54)

where

AIRCAP = air capacity (milliliters) that the worker can expire in one second

AGE = age (years)

SMOK = amount of current smoking (cigarettes per day)

CHEMW = 1 if subject is a chemical worker, 0 if not

FARMW = 1 if subject is a farm worker, 0 if not

FIREW = 1 if subject is a firefighter, 0 if not

A fourth occupation, physician, served as the reference group, and so did not need a dummy. Assuming these 1072 people were a random sample,

2.1) Calculate the 95% confidence interval for each coefficient

Fill in the blanks, and choose the correct word in square brackets:

2.2) Other things being equal (things such as _), chemical workers on average have AIRCAP values that are _ milliliters [higher, lower] than physicians

2.3) Other things being equal, chemical workers on average have AIRCAP values that are _ milliliters [higher, lower] than farm workers

2.4) Other things being equal, on average a man who is 1 year older has an AIRCAP value that is _ milliliters [higher, lower]

2.5) Other things being equal, on average a man who smokes one pack (20 cigarettes) a day has an AIRCAP value that is _ milliliters [higher, lower]

2.6) As far as AIRCAP is concerned, we estimate that smoking one pack a day is roughly equivalent to aging _ years. But this estimate may be biased because of _

3) In an observational study to determine the effect of a drug on blood pressure, it was noticed that the treated group (taking the drug) tended to weigh more than the control group. Thus, when the treated group had higher blood pressure on average, was it because of the treatment or their weight? To untangle this knot, some regressions were computed, using the following variables:

D = 1 if taking the drug, 0 otherwise

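The data set is given by (the first row is D; judging from the problem statement, the second row appears to be weight in lbs and the third blood pressure):

D: 0 0 0 0 0 0 0 0 1 1 1 1 1 1

Weight (lbs): 180 150 210 140 160 160 150 200 160 190 240 200 180 190 220

Blood pressure: 81 75 83 74 72 80 78 80 74 85 102 95 86 100 90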

3.1) How much higher on average would the blood pressure be:

a) For someone of the same weight who is on the drug?

b) For someone on the same treatment who is 10 lbs heavier?

3.2) How would the simple regression coefficient compare to the multiple regression coefficient for weight? Why?

4) Use data file SRINA

4.1) Regress Ip on Ig

4.2) Repeat the regression using (i) an intercept dummy; (ii) a slope dummy; and (iii) both slope and intercept dummies. Select the break point by looking at the scatter plot of Ip against Ig.

4.3) Draw the scatter plot and fitted line for each regression

4.4) Comment on your results

5) Use data file LEACCESS

5.1) Regress LE on Y

5.2) Repeat the regression using (i) an intercept dummy; (ii) a slope dummy; and (iii) both slope and intercept dummies. Use a t test to check whether they are significant or not. Select the break point by looking at the scatter plot of LE against Y.

5.3) Draw the scatter plot and fitted line for each regression

5.4) Comment on your results

6) Use data file AIDSAV

6.1) Regress S/Y on A/Y

6.2) Repeat the regression using a dummy variable to take on the distinct value

6.3) Draw the scatter plot and fitted line for each regression

6.4) Comment on your results

7) Use data file TOT

7.1) Regress ln(TOT) on t

7.2) Repeat the regression using an appropriate dummy

7.3) Draw the time graph of TOT (not logged), showing your two fitted lines

7.4) Comment on your results

8) Use data file INDIA

8.1) Does your conclusion confirm that gender matters in explaining earning differences?

8.2) Does your conclusion confirm that educational level matters in explaining earning differences?

8.3) Regress ln(WI) on gender, education, and age using the appropriate dummy variables.

Date posted: 27/01/2014, 11:20
