
Econometrics essay: The factors affecting weekly working time in 1975


CONTENT

INTRODUCTION

PART 1: DATA DESCRIPTION
  I GENERAL DATA DESCRIPTION
  II DATA DESCRIPTION IN DETAILS
    1 Time worked per week in 1975
    2 Age in 1975
    3 Educational level in 1975
    4 Health status in 1975
    5 Gender
    6 Marital status in 1975
    7 Time of sleeping per week in 1975

PART 2: REGRESSION ANALYSIS
  I THE RELATIONSHIP BETWEEN VARIABLES – STATISTICAL CORRELATION
  II ESTIMATE THE REGRESSION MODEL BY OLS METHOD
    1 Population regression function
    2 Sample regression function
    3 Analysis of Parameters in the Sample Regression Model
  III MISTAKE TESTS OF THE MODEL
    1 Testing multicollinearity
    2 Testing heteroskedasticity
    3 Cure for heteroskedasticity
  IV HYPOTHESES TESTS
    1 Testing overall significance of the regression
    2 Testing significance of the regression coefficients
    3 Testing exclusion restrictions

PART 3: CONSTRUCTING FINAL REGRESSION MODEL
  I ESTIMATE THE REGRESSION MODEL BY OLS METHOD
    1 Population regression function
    2 Sample regression function
    3 Analysis of parameters in the sample regression model
  II MISTAKE TESTS OF THE MODEL
    1 Testing multicollinearity
    2 Testing heteroskedasticity
    3 Cure for heteroskedasticity
  III HYPOTHESES TESTS
    1 Testing the overall significance of regression
    2 Testing the significance of the regression coefficients

CONCLUSION

APPENDIX
  1 Result of using command ‘tab totwrk75’
  2 Result of using command ‘tab slpnap75’


TABLE OF FIGURES

Figure 1: The result of using command 'des'
Figure 2: The result of using command 'des' for variables chosen
Figure 3: The result of using command 'sum'
Figure 4: The result of using command 'tab totwrk75' (full version in appendix)
Figure 5: The result of using command 'tab age75'
Figure 6: The result of using command 'tab educ75'
Figure 7: The result of using command 'tab gdhlth75'
Figure 8: The result of using command 'tab male75'
Figure 9: The result of using command 'tab marr75'
Figure 10: The result of using command 'tab slpnap75' (full version in appendix)
Figure 11: The result of using command 'corr' in STATA
Figure 12: The result of using command 'reg' in STATA (6 variables)
Figure 13: The result of using command 'vif' after using 'reg' in STATA
Figure 14: The result of using 'imtest, white' in STATA
Figure 15: The result of using command robust in STATA
Figure 16: The result of command 'test' (after using robust)
Figure 17: The result of using command 'reg' (2 variables)
Figure 18: The result of using command 'test' for 4 variables above - after robust
Figure 19: The result of using command 'reg' after omitting 4 variables
Figure 20: The result of using 'corr' with 3 variables
Figure 21: The result of using command 'vif' after 'reg totwrk75 male slpnap75'
Figure 22: The result of using command 'imtest, white' for new function
Figure 23: The result of using 'reg robust'
Figure 24: The result of using command 'test male slpnap75'


The success and final outcome of this assignment required a lot of support from others, and we are extremely fortunate to have had it throughout the completion of our work. We would like to express our gratitude to Mrs Dinh Thi Thanh Binh, our Econometrics lecturer, for the excellent expertise and supportive guidance she provided us throughout the process. Without such help, we might not have been able to complete this assignment.

We are really grateful that we managed to complete the assignment on time, which could not have been done without the effort and co-operation of our group members. Last but not least, we would like to thank all of our friends for their kind support and willingness to spend time helping us finish the documents.

Group 11

INTRODUCTION

Research has shown that various factors influence workers' working time. For instance, older workers tend to work less than younger ones. The same holds for female workers who are married and have a family to take care of. The influence of these factors also differs from person to person.

Therefore, after taking everything into consideration, we decided to study the topic “The factors affecting weekly working time in 1975”. In this project, we analyze the factors that had a major impact on workers' working time in 1975, using econometric methods. Econometrics is a social science in which the tools of economic, mathematical, and statistical theory are used to estimate economic relationships, test economic theories, and evaluate and implement government and business policy. It is based upon the development of statistical methods to forecast economic issues.

In this paper, we consider six factors that may affect workers' weekly working time: age, educational level, health status (good or poor), gender (male or female), marital status (married or single), and time of sleeping.

Throughout the project, we used STATA as the tool for econometric analysis of the data set “11.DTA”.

We hope that the arguments and statistics in this project will be helpful to anyone who is interested in the topic.


PART 1: DATA DESCRIPTION

I GENERAL DATA DESCRIPTION

1 Chosen Variables for Research

We obtained the following result by using the command ‘des’:

Figure 1: The result of using command 'des'

The data set was created on August 18, 1999, and contains 20 variables and 239 observations.

After considering the meaning of the variables in file 11.dta, our group decided to use the following variables in the regression model:

Dependent variable: totwrk75

Independent variables: age75, educ75, gdhlth75, male, marr75, slpnap75.

2 General Description of Chosen Data

We obtained the following result by using command ‘des’ for variables analyzed:


Figure 2: The result of using command 'des' for variables chosen

From the above result, we can see that age75, educ75, slpnap75, and totwrk75 are quantitative variables, while gdhlth75, male, and marr75 are qualitative variables. The variables are explained in detail below:

Variables Display Format Meaning Unit

Using the command ‘sum totwrk75 age75 educ75 gdhlth75 male marr75 slpnap75’, we obtain the number of observations, mean, standard deviation, minimum, and maximum of each variable (age75, educ75, gdhlth75, male, marr75, slpnap75, totwrk75).


Figure 3: The result of using command 'sum'

II DATA DESCRIPTION IN DETAILS

To describe the variables in detail, we used the command ‘tab’ for each variable:

1 Time worked per week in 1975

Figure 4: The result of using command 'tab totwrk75' (full version in appendix)


Weekly working time ranges from 0 to 4805 minutes. The most frequent value is 0 minutes, with 10 observations (4.18%), followed by 2325 minutes, with 4 observations (1.67%).

2 Age in 1975

Figure 5: The result of using command 'tab age75'

The age of workers in 1975 ranges from 23 to 65 years. The most frequent age is 33, with 14 observations (5.86%). The least frequent ages are 49, 63, and 64, with only 1 observation each (0.42%).

3 Educational level in 1975


Years of education range from 1 to 17. Twelve years of education has the highest number of observations (98 observations, 41%), while 1 year of education has the lowest (1 observation, 0.42%).

Figure 6: The result of using command 'tab educ75'

4 Health status in 1975

Figure 7: The result of using command 'tab gdhlth75'

- gdhlth75 = 1 (good health in 1975): 211 observations, accounting for 88.28%

- gdhlth75 = 0 (poor health in 1975): 28 observations, accounting for 11.72%


5 Gender

- male = 1 (male): 144 observations, accounting for 60.25%

- male = 0 (female): 95 observations, accounting for 39.75%

Figure 8: The result of using command 'tab male75'

6 Marital status in 1975

Figure 9: The result of using command 'tab marr75'

- marr75 = 1 (married in 1975): 179 observations, accounting for 74.9%

- marr75 = 0 (single in 1975): 60 observations, accounting for 25.1%
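The shares reported for the three dummy variables can be re-derived from the counts and the sample size of 239. The following is a minimal sketch; the helper name `share` is ours and is not part of the STATA output.

```python
# Sample size reported by 'des' for data set 11.dta
N = 239

# (count with value 1, count with value 0), as read from the 'tab' results
counts = {
    "gdhlth75": (211, 28),
    "male": (144, 95),
    "marr75": (179, 60),
}


def share(count, total=N):
    """Relative frequency in percent, rounded to two decimals."""
    return round(100 * count / total, 2)


for name, (ones, zeros) in counts.items():
    # Each pair of counts must add up to the full sample
    assert ones + zeros == N
    print(name, share(ones), share(zeros))
```

The same helper applies to any row of a 'tab' result, for example the frequency of a single age or a single value of working time.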


7 Time of sleeping per week in 1975

Weekly sleeping time, including naps, ranges from 2053 to 6110 minutes. The most frequent values are 3195, 3353, and 3518 minutes, with 3 observations each (1.26%).

Figure 10: The result of using command 'tab slpnap75' (full version in appendix)


PART 2: REGRESSION ANALYSIS

I THE RELATIONSHIP BETWEEN VARIABLES – STATISTICAL CORRELATION

Figure 11: The result of using command ‘corr’ in STATA

The correlations between the dependent variable totwrk75 and the independent variables (age75, educ75, gdhlth75, male, marr75, slpnap75) differ in strength. In absolute value, they range from |r(totwrk75, educ75)| = 0.0813 to |r(totwrk75, male)| = 0.3822.

r(totwrk75, age75) = -0.1327: totwrk75 and age75 are negatively correlated. The sign was expected to be negative.

r(totwrk75, educ75) = 0.0813: totwrk75 and educ75 are positively correlated. The sign was expected to be positive.

r(totwrk75, gdhlth75) = 0.1555: totwrk75 and gdhlth75 are positively correlated. The sign was expected to be positive.

r(totwrk75, male) = 0.3822: totwrk75 and male are positively correlated. The sign was expected to be positive.

r(totwrk75, marr75) = 0.1042: totwrk75 and marr75 are positively correlated. However, the sign was expected to be negative.

r(totwrk75, slpnap75) = -0.3538: totwrk75 and slpnap75 are negatively correlated. The sign was expected to be negative.
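To make the comparison of correlation strengths explicit, the reported coefficients can be ranked by absolute value. The sketch below uses the six values quoted above; the `pearson` helper is an illustrative pure-Python version of the sample correlation and is not what STATA runs internally.

```python
import math

# Correlations with totwrk75 as reported in Figure 11
r = {"age75": -0.1327, "educ75": 0.0813, "gdhlth75": 0.1555,
     "male": 0.3822, "marr75": 0.1042, "slpnap75": -0.3538}

# Rank regressors by strength of linear association, ignoring the sign
ranked = sorted(r, key=lambda v: abs(r[v]))
weakest, strongest = ranked[0], ranked[-1]
print(weakest, strongest)


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Ranking by absolute value matters here because slpnap75 has the second strongest association even though its coefficient is negative.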


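II ESTIMATE THE REGRESSION MODEL BY OLS METHOD

1 Population regression function

(PRF): $\text{totwrk75} = \beta_1\,\text{age75} + \beta_2\,\text{educ75} + \beta_3\,\text{gdhlth75} + \beta_4\,\text{male} + \beta_5\,\text{marr75} + \beta_6\,\text{slpnap75} + \beta_0 + u$

The variable u, called the error term or disturbance, represents factors other than age75, educ75, gdhlth75, male, marr75, and slpnap75 that affect totwrk75.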

2 Sample regression function

By using STATA, we have the following result:

Figure 12: The result of using command 'reg' in STATA (6 variables)

From the above result, we obtain the estimated regression function:

(SRF): $\widehat{\text{totwrk75}} = -8.061648\,\text{age75} - 19.7368\,\text{educ75} + 231.5114\,\text{gdhlth75} + 670.8464\,\text{male} - 25.161\,\text{marr75} - 0.5949014\,\text{slpnap75} + 4172.318$
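As an illustration of how the estimated function is read, the sketch below evaluates it for one set of regressor values. The coefficients are those of the estimated function above; the respondent and the names `predict`, `man`, and `woman` are hypothetical and chosen only for the example.

```python
# Estimated coefficients from Figure 12 (OLS, six regressors)
COEF = {"age75": -8.061648, "educ75": -19.7368, "gdhlth75": 231.5114,
        "male": 670.8464, "marr75": -25.161, "slpnap75": -0.5949014}
INTERCEPT = 4172.318


def predict(person):
    """Fitted weekly working time in minutes for one set of regressor values."""
    return INTERCEPT + sum(COEF[name] * value for name, value in person.items())


# Hypothetical respondent (values chosen for illustration only)
man = {"age75": 33, "educ75": 12, "gdhlth75": 1,
       "male": 1, "marr75": 1, "slpnap75": 3353}
woman = dict(man, male=0)

print(round(predict(man), 2))
print(round(predict(man) - predict(woman), 4))
```

Holding every other regressor fixed, the difference between the two fitted values equals the coefficient on male, which is the ceteris paribus reading of that coefficient.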

3 Analysis of Parameters in the Sample Regression Model

F(6, 232) = 14.32 and Prob > F = 0.0000 are evidence that at least one of the independent variables (age75, educ75, gdhlth75, male, marr75, slpnap75) helps to explain the dependent variable (totwrk75).
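The reported F statistic can be cross-checked from the R-squared of the same regression (0.2702), the sample size, and the number of regressors. This is a sketch of the usual R-squared form of the overall F test, which assumes homoskedastic errors; the function name `f_overall` is ours.

```python
def f_overall(r2, n, k):
    """F statistic for H0: all k slope coefficients are zero, from R-squared."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# R-squared, sample size and number of regressors from Figure 12
f_stat = f_overall(r2=0.2702, n=239, k=6)
print(round(f_stat, 2))
```

The degrees of freedom are k = 6 in the numerator and n - k - 1 = 232 in the denominator, matching F(6, 232).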


The coefficient of determination (R-squared = 0.2702) is interpreted as the fraction of the sample variation in y that is explained by x. In this model, age75, educ75, gdhlth75, male, marr75, and slpnap75 explain 27.02% of the variation in totwrk75.

Adjusted R-squared ($\bar{R}^2$ = 0.2513) increases when a group of variables is added to a regression if, and only if, the F statistic for joint significance of the new variables is greater than unity.

The residual sum of squares (RSS = 147852932) measures the sample variation in the residuals $\hat{u}_i$.
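The goodness-of-fit figures are linked by two standard identities, which the sketch below checks numerically. It assumes the total sum of squares TSS = 202597441 quoted in Part 3 applies here as well, since the dependent variable and sample are the same.

```python
RSS = 147852932   # residual sum of squares, six-regressor model
TSS = 202597441   # total sum of squares of totwrk75 (quoted in Part 3)
n, k = 239, 6     # observations and regressors

# R-squared is the share of total variation not left in the residuals
r2 = 1 - RSS / TSS

# Adjusted R-squared penalises each additional regressor
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(r2, 4), round(adj_r2, 4))
```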

III MISTAKE TESTS OF THE MODEL

1 Testing multicollinearity

1.1 Correlation matrix

The correlation matrix (Figure 11) shows that no pairwise correlation |r_ij| (i, j = 1, ..., 6; i ≠ j) is greater than 0.8; therefore, there is no sign of serious multicollinearity.

1.2 Variance Inflation factors (VIF) method

Figure 13: The result of using command 'vif' after using 'reg' in STATA

As VIF(i) < 10 (i = 1, ..., 6), we can conclude that multicollinearity is not a serious problem in this model.
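For reference, the variance inflation factor of a regressor is computed from the R-squared of an auxiliary regression of that regressor on all the others. The sketch below shows the formula only; the auxiliary R-squared values are hypothetical and are not taken from Figure 13.

```python
def vif(r2_aux):
    """Variance inflation factor from the R-squared of the auxiliary
    regression of one regressor on all the other regressors."""
    if not 0 <= r2_aux < 1:
        raise ValueError("auxiliary R-squared must lie in [0, 1)")
    return 1 / (1 - r2_aux)


# Illustrative auxiliary R-squared values (hypothetical)
for r2_aux in (0.0, 0.5, 0.9):
    print(r2_aux, round(vif(r2_aux), 2))
```

Under this formula, the rule of thumb VIF < 10 is equivalent to requiring that less than 90% of a regressor's variation is explained by the other regressors.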


2 Testing heteroskedasticity

Figure 14: The result of using 'imtest, white' in STATA

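The p-value of White's test is smaller than α = 0.05, which means that we reject the null hypothesis of homoskedasticity: heteroskedasticity exists in this model.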
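As a reminder of what the test computes, White's test can be sketched as follows, assuming the general form in which the squared OLS residuals are regressed on the regressors, their squares, and their cross-products. The symbols $\delta$, $v_i$, and $q$ are our notation; $q$ is the number of non-redundant regressors in the auxiliary regression.

```latex
\hat{u}_i^2 = \delta_0 + \sum_{j} \delta_j x_{ij}
            + \sum_{j \le l} \delta_{jl}\, x_{ij} x_{il} + v_i,
\qquad
LM = n \cdot R^2_{\hat{u}^2} \;\overset{a}{\sim}\; \chi^2_q
\quad \text{under } H_0 : \operatorname{Var}(u \mid x) = \sigma^2
```

A large value of the statistic, or equivalently a p-value below α, leads to rejecting homoskedasticity.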


3 Cure for heteroskedasticity

To deal with heteroskedasticity, we re-estimate the model with heteroskedasticity-robust standard errors (the 'robust' option):

Figure 15: The result of using command robust in STATA
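Robust estimation leaves the OLS coefficients unchanged and only replaces the standard errors. The sketch below illustrates the idea for a simple regression with one regressor, using the sandwich formula with the n/(n - k) small-sample correction often labelled HC1, which to our understanding is the correction behind STATA's 'robust' option. The toy data are hypothetical and are not drawn from 11.dta.

```python
import math


def ols_hc1(x, y):
    """Slope of y = a + b*x by OLS, with its HC1 robust standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Sandwich estimator: each observation is weighted by its squared residual
    meat = sum((xi - mx) ** 2 * ui ** 2 for xi, ui in zip(x, resid))
    # Two estimated parameters (intercept and slope), hence n / (n - 2)
    var_hc1 = (n / (n - 2)) * meat / sxx ** 2
    return b, math.sqrt(var_hc1)


# Toy data (hypothetical)
slope, robust_se = ols_hc1([0, 1, 2, 3], [1, 3, 2, 6])
print(slope, robust_se)
```

Because the squared residuals enter observation by observation, the standard error stays valid when the error variance differs across observations.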

IV HYPOTHESES TESTS

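1 Testing overall significance of the regression

Hypothesis: H0: $\beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = 0$; H1: at least one $\beta_j \neq 0$ (j = 1, ..., 6)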

Figure 16: The result of command 'test' (after using robust)


2 Testing significance of the regression coefficients

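For each coefficient, we test H0: $\beta_j = 0$ against H1: $\beta_j \neq 0$ at α = 0.05. The values in the second column (P > |t|) are taken from Figure 15 (the result of using robust in STATA).

Intercept ($\hat{\beta}_0$ = 4172.318): reject H0, accept H1. The intercept is statistically significant.

$\hat{\beta}_1$ = -8.061648 (age75): P > |t| = 0.137 > α = 0.05. Cannot reject H0; age75 does not have a statistically significant effect on totwrk75.

$\hat{\beta}_2$ = -19.7368 (educ75): P > |t| = 0.314 > α = 0.05. Cannot reject H0; educ75 does not have a statistically significant effect on totwrk75.

$\hat{\beta}_3$ = 231.5114 (gdhlth75): P > |t| = 0.233 > α = 0.05. Cannot reject H0; gdhlth75 does not have a statistically significant effect on totwrk75.

$\hat{\beta}_4$ = 670.8464 (male): P > |t| = 0.000 < α = 0.05. Reject H0; male has a statistically significant effect on totwrk75. $\hat{\beta}_4$ = 670.8464 means that a male's working time is on average 670.8464 minutes per week higher than a female's, ceteris paribus.

$\hat{\beta}_5$ = -25.161 (marr75): P > |t| = 0.838 > α = 0.05. Cannot reject H0; marr75 does not have a statistically significant effect on totwrk75.

$\hat{\beta}_6$ = -0.5949014 (slpnap75): P > |t| = 0.000 < α = 0.05. Reject H0; slpnap75 has a statistically significant effect on totwrk75. $\hat{\beta}_6$ = -0.5949014 means that one additional minute of sleep per week corresponds to a decrease in working time per week of 0.5949014 minutes, ceteris paribus.

In conclusion, only male and slpnap75 have a statistically significant effect on totwrk75 at the 5% level.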
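The decision rule applied to each coefficient is the same comparison of P > |t| with α. A minimal sketch, using the p-values quoted in this section (the intercept is left out because its p-value is not quoted):

```python
ALPHA = 0.05

# P > |t| from the robust regression, as quoted in the text
p_values = {"age75": 0.137, "educ75": 0.314, "gdhlth75": 0.233,
            "male": 0.000, "marr75": 0.838, "slpnap75": 0.000}

# Reject H0: beta_j = 0 whenever the p-value is below the significance level
significant = [name for name, p in p_values.items() if p < ALPHA]
print(significant)
```

Note that a reported p-value of 0.000 is a rounded figure, meaning the value is below 0.0005 rather than exactly zero.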


3 Testing exclusion restricstions

From the above analysis, age75, educ75, gdhlth75, and marr75 can be omitted. In this step, we test multiple linear restrictions on those variables (q = 4), with null hypothesis H0: $\beta_1 = \beta_2 = \beta_3 = \beta_5 = 0$. This means we construct a regression function with two variables: slpnap75 and male.

Figure 17: The result of using command 'reg' (2 variables)

Figure 18: The result of using command 'test' for 4 variables above - after robust

Since F = 1.02 < F_{0.05}(4, 232) = 2.41, we cannot reject H0. Therefore, age75, educ75, gdhlth75, and marr75 are jointly insignificant once male and slpnap75 have been controlled for, and they should be excluded from the model.
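As a rough cross-check, the exclusion test can also be computed in its R-squared form from the two regressions: R-squared = 0.2702 with six regressors and R-squared = 0.2546 with male and slpnap75 only. This version assumes homoskedasticity, so it is not expected to reproduce the robust statistic of 1.02 exactly; the function name `f_exclusion` is ours.

```python
def f_exclusion(r2_ur, r2_r, n, k, q):
    """F statistic for q exclusion restrictions, R-squared form.
    k is the number of regressors in the unrestricted model."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))


# Unrestricted model: six regressors; restricted model: male and slpnap75 only
f_nonrobust = f_exclusion(r2_ur=0.2702, r2_r=0.2546, n=239, k=6, q=4)
F_CRIT = 2.41  # 5% critical value of F(4, 232) used in the text

print(round(f_nonrobust, 2), f_nonrobust < F_CRIT)
```

Both the robust and the non-robust statistic fall below the critical value, so the conclusion does not depend on which version is used.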


PART 3: CONSTRUCTING FINAL REGRESSION MODEL

I ESTIMATE THE REGRESSION MODEL BY OLS METHOD

1 Population regression function

(PRF): $\text{totwrk75} = \beta_0 + \beta_1\,\text{male} + \beta_2\,\text{slpnap75} + u$

The variable u, called the error term or disturbance, represents factors other than male and slpnap75 that affect totwrk75.

2 Sample regression function

By using STATA, we have the following result:

Figure 19: The result of using command 'reg' after omitting 4 variables

From the above result, we obtain the estimated regression function:

3 Analysis of parameters in the sample regression model

F(2, 236) = 40.31 and Prob > F = 0.0000 are evidence that at least one of the independent variables (male, slpnap75) helps to explain the dependent variable (totwrk75).

The coefficient of determination (R-squared = 0.2546) is interpreted as the fraction of the sample variation in y that is explained by x. In this model, male and slpnap75 explain 25.46% of the variation in totwrk75. The new regression model's R-squared is smaller than the previous model's (0.2702).
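With two regressors and 239 observations, the degrees of freedom of the overall F test are k = 2 and n - k - 1 = 236. The sketch below recomputes the F statistic and the adjusted R-squared from the reported R-squared of 0.2546; small differences from the STATA output are to be expected because R-squared is rounded to four decimals.

```python
n, k = 239, 2          # observations and regressors (male, slpnap75)
r2 = 0.2546            # R-squared of the two-regressor model

df_num, df_den = k, n - k - 1
f_stat = (r2 / df_num) / ((1 - r2) / df_den)
adj_r2 = 1 - (1 - r2) * (n - 1) / df_den

print((df_num, df_den), round(f_stat, 2), round(adj_r2, 4))
```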


Adjusted R-squared ($\bar{R}^2$ = 0.2483) increases when a group of variables is added to a regression if, and only if, the F statistic for joint significance of the new variables is greater than unity. We use $\bar{R}^2$ to decide whether a certain independent variable (or set of variables) belongs in a model.

The total sum of squares (TSS = 202597441) is a measure of the total sample variation in totwrk75.

II MISTAKE TESTS OF THE MODEL

1 Testing multicollinearity

1.1 Correlation matrix

Figure 20: The result of using 'corr' with 3 variables

1.2 Variance Inflation factors (VIF) method

Figure 21: The result of using command 'vif' after 'reg totwkr75 male slpnap75'

As VIF(i) < 10 (i = 1, 2), we can conclude that multicollinearity is not a serious problem in this model.


Date posted: 22/06/2020, 21:30
