FOREIGN TRADE UNIVERSITY
FACULTY OF INTERNATIONAL ECONOMICS
---***---
ECONOMETRICS MID-TERM REPORT
FACTORS AFFECTING GROSS REGIONAL DOMESTIC PRODUCT PER CAPITA OF VIET NAM IN 2018
Instructor: Ph.D Đinh Thị Thanh Bình
Class ID: KTEE 309.2
Group number: 10
Group members:
TABLE OF CONTENTS
INTRODUCTION
I LITERATURE REVIEW
1 Question of interest
2 Procedure and program
II DATA COLLECTION
1 Data type
2 Data collection
III STATISTICAL DESCRIPTION OF VARIABLES
1 Running DES function
2 Running SUM function
3 Running TAB function
IV QUANTITATIVE ANALYSIS
1 OLS method and assumptions
2 Regression and correlation
V TESTING PROBLEM
1 Omitted variable test
2 Multicollinearity testing
3 Heteroskedasticity testing
4 Normality (of u) testing
VI STATISTICAL HYPOTHESIS TESTING
1 Critical value method
2 Confidence interval method
VII CONCLUSIONS AND POLICY IMPLICATION
1 Conclusions
2 Policy implication
VIII REFERENCE
IX APPENDIX
INTRODUCTION
Economics is a science that shapes social development and national growth. As economic research has developed, econometrics has become an important subject that helps people study economic issues and find ways to develop the economy. Econometrics is based on the development of statistical methods for estimating economic relationships, testing economic theories, and evaluating government and business policies. It is a useful and indispensable tool for economists to measure economic relationships. We therefore recognize the importance of understanding econometrics and applying this knowledge to analyze statistical problems logically. Thanks to econometrics, people can gain a clear view of economic policies, theories and phenomena.
Similar to GDP, GRDP per capita is an index that reveals the level of economic development, but for smaller regions such as cities, towns or provinces. Many people wonder which factors affect this index and how strongly. In this report, we try to clarify the “Factors affecting gross regional domestic product of Viet Nam” using econometric methodology and the STATA program.
We sincerely thank our econometrics instructor, Ph.D Dinh Thi Thanh Binh, for helping us complete this report. Mistakes are inevitable in our work, and we hope that you can comment on it and give us advice to help us improve.
I LITERATURE REVIEW
1 Question of interest
Gross domestic product (GDP) is a statistic that measures the size of an economy. GDP per capita, when measured in real terms, is useful for capturing growth in real output per person, since inflationary effects have been removed. It is therefore the most widely used measure of real income. However, we believe that income differs considerably across the regions of a country, so we chose GRDP per capita (gross regional domestic product per capita) as the main object of our research. GRDP per capita is one of the most important indexes for rating the economic growth of a region. Our group therefore raised the question: “What factors affect GRDP per capita, and what is their impact?”
Although many factors affect GRDP per capita, we focus on four: population density, high school graduation rate, labor participation rate and FDI. We examine these factors to find out what impact each has on the GRDP per capita of Viet Nam and whether that impact is statistically significant.
Each of these factors affects economic growth in its own way, which shows up in indexes such as GDP and CPI. That is why we consider that they can also affect GRDP and GRDP per capita.
Based on Anna Ek's study in 2007, the theoretical framework shows that FDI has a positive
impact on economic growth because it serves as a channel through which new technology is
transferred from one country to another, and thereby it increases output and GDP/GRDP in the recipient country
Regarding density, an excessively high population density decreases the natural endowment per capita but eases the development of infrastructure, which implies that an optimal population density for economic growth exists (Yegorov, 2009).
The Alliance for Excellent Education (2015) released data outlining the economic benefits of a
high school diploma The “Graduation Effect” data shows how increasing the high school
graduation rate to 90 percent creates new jobs, increases consumer spending, boosts tax revenue, and increases the GDP/GRDP
According to research published by the Federal Reserve Bank of Philadelphia in 2017, a falling labor participation rate can slow the growth of a region's GRDP, since fewer people are contributing to the region’s output of goods and services. Additionally, a lower participation rate can lead to higher tax rates, since the government has a narrower tax base from which to draw revenue, the authors noted.
In the following parts, we use the data to estimate a regression model and analyze the results in order to answer the question of interest.
2 Procedure and program
Econometrics refers to a branch of business analytics, modeling, and forecasting techniques for modeling the behavior of, or forecasting, certain business, financial, economic, physical science, and other variables. In this report, the Stata program is used to analyze the data and run the regression model.
A basic tool of econometrics is the multiple linear regression model. Econometric theory uses statistical theory and mathematical statistics to evaluate and develop econometric methods. Econometricians try to find estimators that have desirable statistical properties, including unbiasedness, efficiency, and consistency.
There are 8 steps to conduct an empirical analysis:
Step 1: Question of interest based on economic theories.
In principle, econometric methods can be used to answer a wide range of questions, such as testing some aspect of an economic theory or evaluating the effects of a government policy. When we need to test an economic theory, a formal economic model is constructed. An economic model consists of mathematical equations that describe various relationships. For example, individual consumption decisions, subject to a budget constraint, are described by mathematical methods.
Step 2: Set up mathematical model
The mathematical model reflects the exact relationship between variables
Step 3: Set up econometric model
An econometric model can be derived from a mathematical model by allowing for uncertainty. The error term (disturbance) in an econometric model represents factors that are not included in the model but can affect the dependent variable.
Step 4: Data collection
Data can be divided into 2 types: primary and secondary data.
Economic data come in several structures: cross-sectional data, time-series data and pooled data. Pooled data can be further categorized into pooled cross-sectional data and panel data.
Step 5: Estimate parameters of the model
Parameter estimates (also called coefficients) measure the change in the response associated with a one-unit change in a predictor, all other predictors being held constant. The unknown model parameters are estimated by least squares.
Step 6: Test for problems in the model
The assumptions of the model can be violated when there is high multicollinearity, heteroskedasticity or autocorrelation.
Step 7: Test hypothesis
The Fisher (F), Durbin-Watson, Lagrange multiplier and Hausman tests can be used to test the appropriateness of the model and of the estimated parameters.
Step 8: Analyze the estimated results and forecasting/ policy implication
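To make Steps 2 and 3 concrete, the two models can be written side by side using the variables studied later in this report. This is only an illustrative sketch: the parameters $\beta_0, \dots, \beta_4$ and the error term $u$ follow the notation of section IV, and the linear form is an assumption of the model, not a result.

```latex
% Step 2: mathematical (deterministic) model, an exact relationship
GRDP = \beta_0 + \beta_1\, Grad + \beta_2\, Inv + \beta_3\, Dens + \beta_4\, Rate

% Step 3: econometric model, the same relationship plus an error term u
% that collects the factors not included in the model
GRDP = \beta_0 + \beta_1\, Grad + \beta_2\, Inv + \beta_3\, Dens + \beta_4\, Rate + u
```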
II DATA COLLECTION
1 Data type
- The model is estimated on cross-sectional data.
- A cross-sectional data set consists of a sample of individuals, households, firms, cities or other units taken at a given point in time. The analysis disregards differences in time and usually consists of comparing the differences among the selected subjects. The data in this report were collected by each province/city of Vietnam.
2 Data collection
- Data in this report is secondary data, as they are collected from a given source
- Collected in 2018, from 62 provinces of Vietnam
- Source of data: General Statistics Office of Vietnam (link: gso.gov.vn)
- The meanings of each variable:
GRDP: GRDP per capita (Mil VND/ Capita/ year)
Grad: High school graduation rate (%)
Inv: Foreign Direct Investment (Mil USD)
Dens: Population density (people/km²)
Rate: Labor participation rate (%)
III STATISTICAL DESCRIPTION OF VARIABLES
1 Running DES function
The most important information after using the DES function is the variables’ label.
des grdp grad inv dens rate

              storage   display    value
variable name   type    format     label      variable label
dens            int     %8.0g                 Population density
rate            double  %10.0g                Labor participation rate
The DES function provides the meaning and the unit of measurement of the 5 variables below:
Grdp: stands for Gross regional domestic product per capita (unit: mil VND/capita/year). Grdp is a quantitative variable.
Grad: stands for High school graduation rate (unit: percent). Grad is a quantitative variable.
Inv: stands for Foreign direct investment (unit: mil USD). Inv is a quantitative variable.
Dens: stands for Population density (unit: people/km²). Dens is a quantitative variable.
Rate: stands for Labor participation rate (unit: percent). Rate is a quantitative variable.
2 Running SUM function
The SUM function reports the number of observations, mean, standard deviation, minimum and maximum of each variable.
sum grdp grad inv dens rate
Variable | Obs Mean Std Dev Min Max
Obs is the number of observations
Std Dev is the standard deviation of the variable
Min/ Max is the minimum/ maximum value of the variable
By using the SUM function, we have:
Grdp: With 62 observations, the mean value is 55.393 and the Std. Dev. is 27.996. The minimum value is 20.7 and the maximum value is 154.84.
Grad: With 62 observations, the mean value is 94.593 and the Std. Dev. is 2.396. The minimum value is 85.36 and the maximum value is 99.4.
Inv: With 62 observations, the mean value is 678.366 and the Std. Dev. is 1571.956. The minimum value is 0.1 and the maximum value is 8669.7.
Dens: With 62 observations, the mean value is 516.0645 and the Std. Dev. is 667.7978. The minimum value is 51 and the maximum value is 4363.
Rate: With 62 observations, the mean value is 58.166 and the Std. Dev. is 3.808. The minimum value is 50.4 and the maximum value is 68.8.
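The statistics reported by SUM can be reproduced by hand. The following is a minimal Python sketch, not the Stata implementation; the function name `summarize` and the sample values are hypothetical and are not the report's data. It uses the sample standard deviation (divisor n - 1), which is the convention Stata's summarize follows.

```python
import math


def summarize(values):
    """Return Obs, Mean, Std. Dev., Min and Max for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: squared deviations from the mean divided by n - 1
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return {
        "obs": n,
        "mean": mean,
        "std_dev": math.sqrt(variance),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical labor participation rates (%), for illustration only
sample_rate = [50.4, 55.0, 58.2, 60.1, 68.8]
stats = summarize(sample_rate)
print(stats)
```

A large standard deviation relative to the mean, as reported for Inv above, signals that the observations are widely dispersed.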
3 Running TAB function
Using the TAB function on each variable in turn gives the frequency and percentage of each of its values.
Gross regional domestic product per capita ranges from 20.7 to 154.84 (mil VND/capita/year).
93.56% of the observations have a gross regional domestic product per capita below 100 mil VND/capita/year.
Analyzing information from the table above:
High school graduation rate ranges from 85.36% to 99.4%
Analyzing information from the table above:
Foreign direct investment ranges from 0.1 to 8669.7 mil USD
About 87.11% of the observations have foreign direct investment above 1 mil USD
Analyzing information from the table above:
Population density ranges from 51 people/km² to 4363 people/km²
         62 |          1        1.61       85.48
       62.7 |          1        1.61       87.10
       63.1 |          1        1.61       88.71
       63.2 |          1        1.61       90.32
       63.6 |          1        1.61       91.94
       64.2 |          1        1.61       93.55
       64.7 |          1        1.61       95.16
         65 |          1        1.61       96.77
       65.9 |          1        1.61       98.39
       68.8 |          1        1.61      100.00
      Total |         62      100.00
Analyzing information from the table above:
Labor participation rate ranges from 50.4% to 68.8%
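The frequency, percent and cumulative percent columns of a TAB table can be computed as in the Python sketch below. It is an illustration under stated assumptions: `frequency_table` is a hypothetical helper and the sample values are made up. With 62 observations, a value that occurs once accounts for 1/62, or about 1.61%, which matches the percent column in the table above.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent), sorted by value."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]
        rows.append(
            (value, counts[value], 100 * counts[value] / total, 100 * running / total)
        )
    return rows


# Hypothetical observations, for illustration only
sample = [62, 62.7, 63.1, 63.1, 68.8]
for value, freq, pct, cum in frequency_table(sample):
    print(f"{value:>8} | {freq:>5} {pct:>8.2f} {cum:>8.2f}")
```

The cumulative column is what supports statements such as "x% of the observations lie below a given value".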
IV QUANTITATIVE ANALYSIS
1 OLS method and assumptions
a OLS method
Ordinary least squares (OLS) regression is a statistical method of analysis that estimates the relationship between one or more independent variables and a dependent variable. The method estimates the relationship by minimizing the sum of the squared differences between the observed values of the dependent variable and the values predicted by a straight line.
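For a single regressor, minimizing the sum of squared residuals has a closed-form solution: the slope is the ratio of the sample covariance of x and y to the sample variance of x, and the intercept makes the line pass through the point of means. The Python sketch below illustrates this; the function names and the data are hypothetical, and it is not how Stata computes its estimates internally.

```python
def ols_simple(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared differences between observed and predicted y."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_simple(x, y)
ssr_ols = sum_squared_residuals(x, y, b0, b1)
# Any other line, here one with a slightly different slope, fits worse
ssr_other = sum_squared_residuals(x, y, b0, b1 + 0.1)
print(b0, b1, ssr_ols, ssr_other)
```

Comparing `ssr_ols` with `ssr_other` shows the defining property of OLS: no other straight line gives a smaller sum of squared residuals on the same data.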
b Assumptions
There are seven assumptions in the OLS method:
Assumption 1 - Linear in parameters: In the PRF, the dependent variable, Y, is related to the independent variable, X, and the error term, u, as
$Y = \beta_0 + \beta_1 X + u$
Assumption 2 - Random sampling: We have a random sample of size n.
Assumption 3 - Sample variation in the explanatory variable: The sample outcomes on X, namely $\{X_i,\ i = 1, \dots, n\}$, are not all the same value.
Assumption 4 - No perfect collinearity: In the sample, there are no exact linear relationships among the independent variables.
Assumption 5 - Zero conditional mean: The error term has an expected value of zero given any value of the explanatory variable. In other words, $E(u \mid X) = 0$.
This assumption simply says that the factors not explicitly included in the model, and therefore subsumed in $u_i$, do not systematically affect the mean value of Y; the positive $u_i$ values cancel out the negative $u_i$ values so that their average effect on Y is zero.
Assumption 6 - Homoskedasticity: The error term $u_i$ has the same variance given any value of the independent variable. In other words,
$\mathrm{Var}(u_i \mid X_i) = E\big[(u_i - E(u_i \mid X_i))^2 \mid X_i\big] = E(u_i^2 \mid X_i) = \sigma^2$
The conditional variance of u reflects the spread of Y around its conditional mean $E(Y \mid X)$. This assumption means that the values of Y corresponding to different values of X have the same variance: the variance around the regression line is the same across the X values and neither increases nor decreases as X varies.
Assumption 7 - Normality: The population error u is independent of the explanatory variables X and normally distributed:
$u \sim N(0, \sigma^2)$
If these assumptions hold true, the OLS procedure creates the best possible estimates. In statistics, estimators that produce unbiased estimates with the smallest variance are referred to as “efficient.” Efficiency is a statistical concept that compares the quality of the estimates calculated by different procedures while holding the sample size constant. OLS is the most efficient linear unbiased estimator when the assumptions hold true. Another benefit of satisfying these assumptions is that as the sample size increases to infinity, the coefficient estimates converge to the actual population parameters.
2 Regression and correlation
a Set up model
The relationship between the dependent variable (Y) and independent variables (X)
is illustrated by regression in the following form:
$\widehat{GRDP} = \hat{\beta}_0 + \hat{\beta}_1 \times Grad + \hat{\beta}_2 \times Inv + \hat{\beta}_3 \times Dens + \hat{\beta}_4 \times Rate$
Where:
GRDP (dependent variable): Gross regional domestic product per capita
Grad (independent variable): High school graduation rate
Inv (independent variable): Foreign direct investment
Dens (independent variable): Population density
Rate (independent variable): Labor participation rate
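With several regressors, the OLS coefficients solve the normal equations (X'X)b = X'y, where X is the data matrix with a leading column of ones for the intercept. The Python sketch below shows the idea using only the standard library. It is a hedged illustration: the helper names are hypothetical, the data are constructed rather than taken from the report, and only two regressors are used for brevity, although the same function accepts the four regressors of the model above.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: regressors may be perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_multiple(rows, y):
    """rows: one list of regressor values per observation. Returns [b0, b1, ..., bk]."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Constructed data that satisfy y = 1 + 2*x1 - 3*x2 exactly (illustration only)
rows = [[1, 0], [0, 1], [1, 1], [2, 1], [3, 5]]
y = [3, -2, 0, 2, -8]
coefficients = ols_multiple(rows, y)
print(coefficients)  # approximately [1, 2, -3], up to floating-point error
```

The error raised for a singular matrix corresponds to a violation of Assumption 4: with perfectly collinear regressors, the normal equations have no unique solution.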
b Analyzing the correlation between independent variables
Running function: corr grdp grad inv dens rate
We have the following result:
corr grdp grad inv dens rate
Grad & GRDP: The higher the high school graduation rate is, the higher the gross regional domestic product per capita is.
Inv & GRDP: The higher the foreign direct investment is, the higher the gross regional domestic product per capita is.
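Statements of this kind are read from the sign of the pairwise correlation coefficient: a positive value means the two variables tend to move together, a negative value that they move in opposite directions. The Python sketch below computes the Pearson coefficient that a correlation table reports. The function names and the data are hypothetical and are not the report's observations.

```python
import math


def pearson_corr(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """columns: dict mapping a variable name to its list of values."""
    names = list(columns)
    return {
        a: {b: pearson_corr(columns[a], columns[b]) for b in names} for a in names
    }


# Hypothetical values, for illustration only
data = {
    "grdp": [30.0, 42.0, 55.0, 61.0, 90.0],
    "grad": [90.1, 92.5, 94.0, 95.2, 98.3],
    "inv": [5.0, 120.0, 300.0, 80.0, 2500.0],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

A correlation coefficient only describes the linear association between two variables; it does not hold the other variables constant, which is what the regression coefficients are for.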