
Econometrics essay: Factors that determine housing prices


FACULTY OF INTERNATIONAL ECONOMICS

-o0o-

Topic: Factors that determine housing prices

Student Name – ID: Nguyen Hoang Dang – 1815520160

Supervisor: Dr Tu Thuy Anh, Dr Chu Thi Mai Phuong

Hanoi, 2018


Table of Contents

I Introduction
II Literature overview
  1 Questions of interest
  2 Procedure and program used
III Economic model
  1 Specifying the object for modeling
  2 Defining the target for modeling by the choice of the variables to analyze, denoted x_i
  3 Embedding that target in a general unrestricted model (GUM)
IV Econometric model
V Data collection
  1 Data overview
  2 Data description
VI Estimation of econometric model
  1 Checking the correlation among variables
  2 Regression run
VII Check multicollinearity and heteroscedasticity
  1 Multicollinearity
  2 Heteroskedasticity
VIII Hypothesis postulated
  1 The impact of neighborhood factors
  2 The impact of accessibility factors
IX Result analysis & Policy implication
X Conclusion
XI References

Exhibit 1: Definition of variables in the Housing Price model
Exhibit 2: Statistic indicators of variables in the Housing Price model
Exhibit 3: Correlation matrix
Exhibit 4: Scatterplot of variables in the Housing Price model
Exhibit 5: Regression model
Exhibit 6: Multicollinearity test
Exhibit 7: Heteroskedasticity test
Exhibit 8: Residual-versus-fitted plot of the Housing Price model
Exhibit 9: Correcting heteroskedasticity
Exhibit 10: Hypothesis testing of multiple regression model of neighborhood factors
Exhibit 11: Hypothesis testing of multiple regression model of accessibility factors

I Introduction

Economics is a science that bears on social development in general and national growth in particular, and econometrics is the use of statistical techniques to understand economic issues and test economic theories. Without evidence, economic theories are abstract and might have no bearing on reality, even if they are completely rigorous. Econometrics is a set of tools we can use to confront theory with real-world data.

Given the data set, our group, which includes three members (Nguyen Ha Trang, Nguyen Mai Thuy Tien, and Nguyen Thi Lan Huong), follows an eight-step econometric methodology to analyze the data. Note that because information on the data set is limited, all interpretations of abbreviations and similar details are based on assumptions and our own research. We have therefore tried to show the logic and reasoning of our analysis clearly.

Given the limits of purpose and resources, this report still has deficiencies, but we look forward to providing readers with a decent overall view of the given data set and of the knowledge we have gained through Dr Dinh Thanh Binh’s Econometrics course.

II Literature overview

1 Questions of interest

“Why do housing prices differ among locations and regions?” This is the basic question that this report aims to answer. Although a variety of factors might affect housing prices, they can be divided into four main categories: structure, neighborhood, accessibility, and air pollution. Consequently, elements that represent each of these categories are examined to find out whether they have a statistically significant impact on housing prices.

In the following parts, models are built, data are used to run the regression, and the results are analyzed to answer the question of interest above.

2 Procedure and program used

• Procedure

Step 1: Questions of interest

Step 2: Economic model

Step 3: Econometric model

Step 4: Data collection

Step 5: Estimation of econometric model

Step 6: Check multicollinearity and heteroscedasticity

Step 7: Hypothesis postulated

Step 8: Result analysis & Policy implication

• Program: Stata is primarily used to analyze the data and run the regressions


III Economic model

As data are provided up front, the economic model used in this report is an empirical one. The underlying model is mathematical; in an empirical model, however, data are gathered for the variables and accepted statistical techniques are used to estimate the model's parameters.

Empirical model discovery and theory evaluation are suggested to involve five key steps, but given the limits of purpose and resources, this part of the report follows only three of them: (1) specifying the object for modeling, (2) defining the target for modeling, and (3) embedding that target in a general unrestricted model.

1 Specifying the object for modeling

$$price = f(x_i) \qquad (1)$$

As such, this report examines the relationship between housing price, which is the object for modeling, and each of the related factors: structure, neighborhood, accessibility, and air pollution.

2 Defining the target for modeling by the choice of the variables to analyze, denoted x_i

As mentioned above, four main categories are expected to affect housing prices: structure, neighborhood, accessibility, and air pollution. Hence, the x_i are chosen as variables that represent these categories. After thorough research, the factors have been narrowed down to eight significant ones: (structure) number of rooms; (neighborhood) crimes, property tax, the percentage of people of low status, and student-teacher ratio; (accessibility) distances to employment centers and accessibility to radial highways; and (air pollution) nitrous oxide.

3 Embedding that target in a general unrestricted model (GUM)

In its simplest acceptable representation (which will later be specified in the econometric model), the GUM is determined to be:

$$lprice = f(crime, nox, rooms, dist, radial, proptax, stratio, lowstat)$$

A brief description of each variable is given in Exhibit 1.

Exhibit 1: Definition of variables in the Housing Price model

Variable   Definition
lprice     logarithm of median housing price, $
crime      crimes committed per capita
nox        nitrous oxide, parts per 100 million
rooms      average number of rooms per house
dist       weighted distances to 5 employment centers
radial     accessibility index to radial highways
proptax    property tax per $1000
stratio    average student-teacher ratio
lowstat    percentage of people of low status

IV Econometric model

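To demonstrate the relationship between housing price and the other factors, the regression functions are constructed as follows.

Population regression function (PRF):

$$lprice_i = \beta_0 + \beta_1 crime_i + \beta_2 nox_i + \beta_3 rooms_i + \beta_4 dist_i + \beta_5 radial_i + \beta_6 proptax_i + \beta_7 stratio_i + \beta_8 lowstat_i + u_i$$

Sample regression function (SRF):

$$lprice_i = \hat{\beta}_0 + \hat{\beta}_1 crime_i + \hat{\beta}_2 nox_i + \hat{\beta}_3 rooms_i + \hat{\beta}_4 dist_i + \hat{\beta}_5 radial_i + \hat{\beta}_6 proptax_i + \hat{\beta}_7 stratio_i + \hat{\beta}_8 lowstat_i + \hat{u}_i$$

where:

$\beta_0$ is the intercept of the regression model

$\beta_j$ is the slope coefficient of the independent variable $x_j$

$u_i$ is the disturbance of the regression model

$\hat{\beta}_0$ is the estimator of $\beta_0$

$\hat{\beta}_j$ is the estimator of $\beta_j$

$\hat{u}_i$ is the residual (the estimator of $u_i$)

From this model, this report is interested in explaining lprice in terms of each of the eight independent variables (crime, nox, rooms, dist, radial, proptax, stratio, lowstat).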

V Data collection

1 Data overview

• This data set is secondary, as it was collected from a given source.

• Data source: Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, by D.A. Belsley, E. Kuh, and R. Welsch, 1990. New York: Wiley.

• Structure of the economic data: cross-sectional data.

2 Data description

To obtain summary statistics of the variables, the following command is used in Stata:

sum lprice crime nox rooms dist radial proptax stratio lowstat

The result is shown in Exhibit 2

Exhibit 2: Statistic indicators of variables in the Housing Price model

Variable   Obs       Mean    Std. Dev.        Min       Max
lprice     506   9.941057    .4092549    8.517193   10.8198
crime      506   3.611536    8.590247        .006    88.976
nox        506   5.549783    1.158395        3.85      8.71
rooms      506   6.284051    .7025938        3.56      8.78
dist       506   3.795751    2.106137        1.13     12.13
radial     506   9.549407    8.707259           1        24
proptax    506   40.82372    16.85371        18.7      71.1
stratio    506   18.45929     2.16582        12.6        22
lowstat    506   12.70148    7.238066        1.73     39.07

where:

Obs is the number of observations

Mean is the sample mean of the variable

Std. Dev. is the standard deviation of the variable

Min is the minimum value of the variable

Max is the maximum value of the variable
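Because lprice is a logarithm, the figures in Exhibit 2 are easier to read once converted back to dollars. The sketch below assumes that lprice is the natural logarithm of price; this is an assumption, although the minimum and maximum convert to round dollar amounts under it. The names `lprice_stats` and `price_dollars` are illustrative only.

```python
import math

# Summary statistics of lprice copied from Exhibit 2.
lprice_stats = {"mean": 9.941057, "min": 8.517193, "max": 10.8198}

# Assumption: lprice = ln(price), so exp() recovers dollars.
# Note that exp(mean of logs) is the geometric mean of price,
# which is generally below the arithmetic mean.
price_dollars = {name: math.exp(value) for name, value in lprice_stats.items()}

for name, dollars in price_dollars.items():
    print(f"{name}: about ${dollars:,.0f}")
```

Under this assumption the median housing prices in the sample range from roughly $5,000 to roughly $50,000.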

VI Estimation of econometric model

1 Checking the correlation among variables

First, the correlation between lprice and each of crime, nox, rooms, dist, radial, proptax, stratio and lowstat is checked by calculating the correlation coefficients among these variables. The correlation coefficient r measures the strength and direction of the linear relationship between two variables. In Stata, the correlation matrix is generated with the command:

corr lprice crime nox rooms dist radial proptax stratio lowstat

The result is shown in Exhibit 3

Exhibit 3: Correlation matrix

           lprice    crime      nox    rooms     dist   radial  proptax  stratio  lowstat
lprice     1.0000
crime     -0.5275   1.0000
nox       -0.5088   0.4212   1.0000
rooms      0.6329  -0.2188  -0.3028   1.0000
dist       0.3420  -0.3799  -0.7702   0.2054   1.0000
radial    -0.4810   0.6254   0.6103  -0.2098  -0.4951   1.0000
proptax   -0.5597   0.5828   0.6670  -0.2921  -0.5344   0.9102   1.0000
stratio   -0.4976   0.2887   0.1869  -0.3540  -0.2293   0.4642   0.4542   1.0000
lowstat   -0.7914   0.4470   0.5856  -0.6096  -0.4956   0.4760   0.5276   0.3654   1.0000

From the matrix, it can be inferred that the correlation between lprice and each of the independent variables is sizeable enough to run the regression model. Specifically:

- lprice and crime have a moderate downhill relationship (r = -0.5275)

- lprice and nox have a moderate downhill relationship (r = -0.5088)

- lprice and rooms have a moderate uphill relationship (r = 0.6329)

- lprice and dist have a weak uphill relationship (r = 0.3420)

- lprice and radial have a moderate downhill relationship (r = -0.4810)

- lprice and proptax have a moderate downhill relationship (r = -0.5597)

- lprice and stratio have a moderate downhill relationship (r = -0.4976)

- lprice and lowstat have a strong downhill relationship (r = -0.7914)
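The verbal labels for each correlation can be reproduced mechanically from the first column of Exhibit 3. The following is a minimal sketch; the cut-offs of 0.4 and 0.7 are assumptions chosen for illustration (the report does not state its thresholds), and `describe_correlation` is a hypothetical helper name.

```python
def describe_correlation(r: float) -> str:
    """Label a correlation coefficient by strength and direction."""
    # Assumed cut-offs (not stated in the report):
    # |r| < 0.4 weak, 0.4 <= |r| < 0.7 moderate, |r| >= 0.7 strong.
    size = abs(r)
    if size < 0.4:
        strength = "weak"
    elif size < 0.7:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "uphill" if r > 0 else "downhill"
    return f"{strength} {direction}"


# Correlations with lprice, copied from the first column of Exhibit 3.
corr_with_lprice = {
    "crime": -0.5275, "nox": -0.5088, "rooms": 0.6329, "dist": 0.3420,
    "radial": -0.4810, "proptax": -0.5597, "stratio": -0.4976,
    "lowstat": -0.7914,
}

labels = {name: describe_correlation(r) for name, r in corr_with_lprice.items()}
for name, label in labels.items():
    print(f"lprice and {name}: {label}")
```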

The correlation between each pair of variables can be visualized using the scatter command in Stata. The result is shown in Exhibit 4.

Exhibit 4: Scatterplot of variables in the Housing Price model

2 Regression run

Having checked the correlations among the variables, we can now run the regression. In Stata, this is done with the command:

reg lprice crime nox rooms dist radial proptax stratio lowstat

The result is shown in Exhibit 5

Exhibit 5: Regression model

Source     |         SS    df          MS       Number of obs =     506
Model      | 64.8618936     8   8.1077367       F(8, 497)     =  204.33
Residual   | 19.7203314   497  .039678735       Prob > F      =  0.0000
Total      |  84.582225   505  .167489554       R-squared     =  0.7669
                                                Adj R-squared =  0.7631
                                                Root MSE      =   .1992

lprice     |      Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
crime      |  -.0111825    .0013614    -8.21    0.000    -.0138573   -.0085078
nox        |  -.0754564    .0146936    -5.14    0.000    -.1043256   -.0465873
rooms      |   .0996545    .0167697     5.94    0.000     .0667061    .1326028
dist       |  -.0463708    .0067557    -6.86    0.000    -.0596441   -.0330975
radial     |   .0133694    .0026525     5.04    0.000      .008158    .0185808
proptax    |  -.0062133    .0013807    -4.50    0.000     -.008926   -.0035006
stratio    |  -.0413327    .0050633    -8.16    0.000    -.0512807   -.0313846
lowstat    |  -.0280384    .0019154   -14.64    0.000    -.0318016   -.0242752
_cons      |   11.19507    .2037294    54.95    0.000     10.79479    11.59535
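Each t statistic in Exhibit 5 is the coefficient divided by its standard error, which gives a quick consistency check on the table. The sketch below uses the values as read from Exhibit 5; the critical value 1.96 is the large-sample normal approximation to the 5% two-sided t critical value with 497 degrees of freedom, which is an assumption made here for simplicity.

```python
# (coefficient, standard error) as read from Exhibit 5.
estimates = {
    "crime":   (-0.0111825, 0.0013614),
    "nox":     (-0.0754564, 0.0146936),
    "rooms":   (0.0996545, 0.0167697),
    "dist":    (-0.0463708, 0.0067557),
    "radial":  (0.0133694, 0.0026525),
    "proptax": (-0.0062133, 0.0013807),
    "stratio": (-0.0413327, 0.0050633),
    "lowstat": (-0.0280384, 0.0019154),
    "_cons":   (11.19507, 0.2037294),
}

# Assumption: 1.96 approximates the 5% two-sided critical value of t(497).
CRITICAL_VALUE = 1.96

# t statistic for H0: coefficient = 0 is simply coef / se.
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}
significant = {name: abs(t) > CRITICAL_VALUE for name, t in t_stats.items()}

for name, t in t_stats.items():
    print(f"{name:8s} t = {t:7.2f}  significant at 5%: {significant[name]}")
```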

From the result, it can be inferred that:

• crime, nox, rooms, dist, radial, proptax, stratio and lowstat all have statistically significant effects on lprice at the 5% significance level (all p-values are smaller than 0.05). Because the dependent variable is in logarithms, each slope coefficient multiplied by 100 is the approximate percentage change in housing price for a one-unit change in the regressor, holding the other regressors fixed. In particular, the effects can be read from the regression coefficients as follows:

- $\hat{\beta}_0 = 11.1951$: When all the independent variables are zero, the expected value of lprice is 11.1951, which corresponds to a housing price of about $e^{11.1951}$, or roughly 72,800 dollars

- $\hat{\beta}_1 = -0.0112$: When the number of crimes committed per capita increases by one, the expected value of housing price decreases by about 1.12%

- $\hat{\beta}_2 = -0.0755$: When nitrous oxide increases by one part per 100 million, the expected value of housing price decreases by about 7.55%

- $\hat{\beta}_3 = 0.0997$: When the number of rooms increases by one, the expected value of housing price increases by about 9.97%

- $\hat{\beta}_4 = -0.0464$: When the weighted distance to 5 employment centers increases by one unit, the expected value of housing price decreases by about 4.64%

- $\hat{\beta}_5 = 0.0134$: When the accessibility index to radial highways increases by one unit, the expected value of housing price increases by about 1.34%

- $\hat{\beta}_6 = -0.0062$: When the property tax per $1000 increases by $1, the expected value of housing price decreases by about 0.62%

- $\hat{\beta}_7 = -0.0413$: When the student-teacher ratio increases by one unit, the expected value of housing price decreases by about 4.13%

- $\hat{\beta}_8 = -0.0280$: When the percentage of people of low status increases by one percentage point, the expected value of housing price decreases by about 2.80%
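Reading a coefficient times 100 as a percentage change is a first-order approximation; in a log-level model the exact change is 100·(exp(b) − 1). The sketch below compares the two for the slope coefficients as read from Exhibit 5, assuming lprice is the natural logarithm of price. The dictionary names are illustrative only.

```python
import math

# Slope coefficients as read from Exhibit 5 (dependent variable: lprice).
coefficients = {
    "crime": -0.0111825, "nox": -0.0754564, "rooms": 0.0996545,
    "dist": -0.0463708, "radial": 0.0133694, "proptax": -0.0062133,
    "stratio": -0.0413327, "lowstat": -0.0280384,
}

# Assumption: lprice = ln(price). A one-unit change in x then changes price by
# 100 * (exp(b) - 1) percent; 100 * b is the usual approximation for small b.
approx_pct = {name: 100 * b for name, b in coefficients.items()}
exact_pct = {name: 100 * (math.exp(b) - 1) for name, b in coefficients.items()}

for name in coefficients:
    print(f"{name:8s} approx {approx_pct[name]:6.2f}%   exact {exact_pct[name]:6.2f}%")
```

The gap is largest for the larger coefficients (rooms and nox) and negligible for small ones such as proptax, so the approximate reading used in the text is adequate here.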

• The coefficient of determination R-squared = 0.7669: the independent variables (crime, nox, rooms, dist, radial, proptax, stratio, lowstat) jointly explain 76.69% of the variation in the dependent variable (lprice); the remaining 23.31% of the variation in lprice is left to factors not included in the model

• Other indicators:

- Adjusted coefficient of determination adj R-squared = 0.7631

- Total Sum of Squares TSS = 84.5822

- Explained Sum of Squares ESS = 64.8619

- Residual Sum of Squares RSS = 19.7203

- Degrees of freedom of the model: Dfm = 8

- Degrees of freedom of the residual: Dfr = 497
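The goodness-of-fit figures reported by Stata follow directly from the sums of squares and degrees of freedom listed above. As a minimal sketch, the standard formulas can be applied to the reported values; the variable names are illustrative only.

```python
import math

# Values reported in the regression output.
ess = 64.8618936   # explained (model) sum of squares
rss = 19.7203314   # residual sum of squares
df_model = 8       # number of slope coefficients
df_resid = 497     # n - k - 1 = 506 - 8 - 1

tss = ess + rss                 # total sum of squares
df_total = df_model + df_resid  # n - 1 = 505

r_squared = ess / tss
adj_r_squared = 1 - (rss / df_resid) / (tss / df_total)
f_stat = (ess / df_model) / (rss / df_resid)   # overall F test
root_mse = math.sqrt(rss / df_resid)           # standard error of the regression

print(f"TSS           = {tss:.4f}")
print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"F({df_model}, {df_resid})     = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.4f}")
```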

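• Based on the regression table, the sample regression function (coefficients rounded to two decimals) is:

$$lprice_i = 11.20 - 0.01\,crime_i - 0.08\,nox_i + 0.10\,rooms_i - 0.05\,dist_i + 0.01\,radial_i - 0.01\,proptax_i - 0.04\,stratio_i - 0.03\,lowstat_i + \hat{u}_i$$

where $\hat{u}_i$ is the residual.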

VII Check multicollinearity and heteroscedasticity

1 Multicollinearity

Multicollinearity is a high degree of correlation among the explanatory variables. It makes it difficult to separate the effects of the individual regressors: standard errors are inflated and t-values depressed. Multicollinearity can be detected by examining the correlation matrix of the regressors and by carrying out auxiliary regressions among them. In Stata, the vif command, which stands for variance inflation factor, is used after the regression.

Exhibit 6 shows the result.

Exhibit 6: Multicollinearity test

Variable   |   VIF      1/VIF
proptax    |  6.89   0.145103
radial     |  6.79   0.147301
nox        |  3.69   0.271206
dist       |  2.58   0.388106
lowstat    |  2.45   0.408804
rooms      |  1.77   0.565985
crime      |  1.74   0.574531
stratio    |  1.53   0.653369
Mean VIF   |  3.43

All VIF values are lower than 10 (the largest is 6.89, for proptax), indicating that multicollinearity is not too worrisome a problem for this data set.
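The VIF of a regressor equals 1 / (1 − R²_j), where R²_j is the R-squared of the auxiliary regression of that regressor on all the others, so the 1/VIF column of Exhibit 6 is 1 − R²_j. The sketch below recovers the VIFs and auxiliary R² values from that column; the threshold of 10 is the rule of thumb used in the text, and the variable names are illustrative.

```python
# Tolerances (1/VIF) copied from Exhibit 6.
tolerance = {
    "proptax": 0.145103, "radial": 0.147301, "nox": 0.271206,
    "dist": 0.388106, "lowstat": 0.408804, "rooms": 0.565985,
    "crime": 0.574531, "stratio": 0.653369,
}

# VIF_j = 1 / (1 - R2_j), hence tolerance_j = 1 - R2_j.
vif = {name: 1 / tol for name, tol in tolerance.items()}
aux_r_squared = {name: 1 - tol for name, tol in tolerance.items()}
mean_vif = sum(vif.values()) / len(vif)

VIF_THRESHOLD = 10  # rule of thumb used in the text

for name in tolerance:
    flag = "high" if vif[name] >= VIF_THRESHOLD else "ok"
    print(f"{name:8s} VIF = {vif[name]:5.2f}  auxiliary R2 = {aux_r_squared[name]:.3f}  {flag}")
print(f"Mean VIF = {mean_vif:.2f}")
```

The two largest VIFs belong to proptax and radial, which is consistent with their pairwise correlation of 0.9102 in Exhibit 3.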

2 Heteroskedasticity

Heteroskedasticity means that the variance of the error term is not constant. The least squares estimators are then no longer efficient, and the usual t tests and F tests may be misleading. Heteroskedasticity can be detected informally by plotting the residuals against the fitted values or against each of the regressors, and formally by a test, most popularly White’s test. It can be remedied by respecifying the model, for example by looking for missing variables. In Stata, the imtest, white command is used; imtest stands for information matrix test.

Exhibit 7 shows the result.

Exhibit 7: Heteroskedasticity test

imtest, white

White's test for Ho: homoskedasticity

against Ha: unrestricted heteroskedasticity

chi2(44) = 235.31

Prob > chi2 = 0.0000

Cameron & Trivedi's decomposition of IM-test

Source               |    chi2    df        p
Heteroskedasticity   |  235.31    44   0.0000
Skewness             |   34.20     8   0.0000
Kurtosis             |   12.38     1   0.0004
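The 44 degrees of freedom in White's test can be understood by counting the terms in its auxiliary regression: the 8 regressors, their 8 squares, and their 28 pairwise cross-products. The sketch below reproduces that count and compares the reported statistic with an approximate 5% critical value. The count assumes none of these terms is redundant, and the critical value uses the Wilson-Hilferty approximation (a stand-in for an exact chi-square quantile); the function names are hypothetical.

```python
from statistics import NormalDist


def white_test_df(k: int) -> int:
    """Terms in White's auxiliary regression: levels, squares, cross-products."""
    # Assumption: no term is redundant (e.g. no dummy variables among regressors).
    return k + k + k * (k - 1) // 2


def chi2_critical_approx(df: int, alpha: float = 0.05) -> float:
    """Approximate upper-alpha chi-square quantile (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1 - alpha)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


n_regressors = 8
white_stat = 235.31  # chi2(44) reported in Exhibit 7

df = white_test_df(n_regressors)
critical = chi2_critical_approx(df)
reject_homoskedasticity = white_stat > critical

print(f"degrees of freedom    = {df}")
print(f"approx 5% critical    = {critical:.2f}")
print(f"reject homoskedasticity: {reject_homoskedasticity}")
```

Since the statistic is far above the critical value, the null hypothesis of homoskedasticity is rejected, in line with the reported p-value of 0.0000.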
