TABLE OF CONTENTS
Structure II/ Firm’s behaviors III/ Market Failure
IV/ Guest Speaker’s Reflection
V/ References &
Appendix
Team Report
Course Code: ECON1193
Subject Name: Business Statistics 1
Lecturer: Nguyen Thi Tuong Chau
Word Count: 3166 words
Group members:
1 Nguyen Vu Hai Binh – S3878194
2 Tran Minh Phuong – S3877542
3 Ha Thien Loc – S3878211
4 Dao Thuy Minh Anh – S3878425
5 Nguyen Ngoc Phuong Mai – S3868202
TABLE OF CONTRIBUTIONS

First Name    Student ID   Parts Contributed   Contribution %   Signature
Hai Binh      S3878194     Part 3, 4, 7        100%
Minh Phuong   S3877542     Part 5, 6, 7        100%
Thien Loc     S3878211     Part 3, 4           100%
Phuong Mai    S3868202     Part 1, 2           100%
Minh Anh      S3878425     Part 5, 6           100%
I/ Data Collection
II/ Descriptive Statistics
III/ Regression
IV/ Regression Conclusion
V/ Time Series
VI/ Time Series Conclusion
VII/ Overall Conclusion
I/ DATA COLLECTION
The data, which were collected from the World Bank, are included in the Excel file.
Our team collected 2014 data for the majority of countries in two regions, Asia and The America.
The variables to be included in the data set should be:
Because Asia has a larger population and more countries than The America, the data-collection process is affected: more countries were collected in Asia (40) than in The America (29). Since the countries of each region fall into different income categories, we sampled each income category proportionately rather than evenly (Appendix 1.1). For instance, most countries in Asia are in the middle-income segment, so we chose more samples from this segment, and we did the same for The America's samples.
Some countries' data were unavailable, which is usually related to the recognition of sovereignty. During the data-cleaning process, data-cleaning methods were used to locate duplicates within a file or across sets of files and to remove missing and unstructured data for these countries (Winkler 2002).
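The cleaning step described above can be sketched as follows. This is a minimal illustration only: the field names, the sample rows and the helper `clean_records` are hypothetical and are not taken from the team's Excel file.

```python
def clean_records(records, required_fields):
    """Drop exact duplicate rows and rows with a missing required value."""
    seen = set()
    cleaned = []
    for row in records:
        # A value counts as missing if it is None or an empty string.
        if any(row.get(field) in (None, "") for field in required_fields):
            continue
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:  # exact duplicate of an earlier row
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical rows, for illustration only (not the team's data).
rows = [
    {"country": "A", "gdp_gr": 3.1},
    {"country": "A", "gdp_gr": 3.1},   # duplicate
    {"country": "B", "gdp_gr": None},  # missing value
    {"country": "C", "gdp_gr": 1.2},
]
cleaned = clean_records(rows, ["country", "gdp_gr"])
print([row["country"] for row in cleaned])
```

Rows are compared only on the required fields, so two rows that differ in some other column would still count as duplicates under this assumption.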
II/ DESCRIPTIVE STATISTICS
According to the table above, no mode was detected in either region, hence it cannot be used to examine the GDP-GR. Furthermore, as there are outliers in the dataset, the mean is not an appropriate measure. Since the median is not affected by the outliers' values in its calculation, it is considered the best descriptive measure for analysing the central tendency.
As shown in Table 2.1, the median GDP-GR in region A is 3.222%, about 1.8 times that in region B, which is only 1.783%. This means that in 2014, GDP-GR in the Asia region was typically higher than in The America region.
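To illustrate why the median is preferred when outliers are present, the short sketch below compares the mean and the median of a small series before and after adding one extreme value. The series is hypothetical and is not the team's dataset; the last lines only divide the two medians quoted from Table 2.1.

```python
from statistics import mean, median

# Hypothetical growth rates (%), for illustration only.
sample = [1.5, 2.0, 2.5, 3.0, 3.5]
with_outlier = sample + [25.0]  # one extreme value added

print(mean(sample), median(sample))
print(mean(with_outlier), median(with_outlier))

# Ratio of the two medians reported in Table 2.1 (Asia / The America).
median_ratio = 3.222 / 1.783
print(round(median_ratio, 2))
```

The single outlier moves the mean far more than the median, which is the behaviour the paragraph above relies on.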
The IQR is a reasonable measure for comparing variability in the two regions' datasets because it avoids the effects of outliers on the results. All the other measures of variability are affected by outliers except the coefficient of variation (CV), which is only suitable when the two datasets differ greatly in mean value and sample size. As shown in the table above, the IQR of Asia, 3.141%, is 1.126 times that of The America, which is 2.789%. This indicates that in 2014, GDP-GR varied more amongst countries in Asia.
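As a minimal sketch of the IQR calculation, the function below uses the inclusive quartile method from Python's `statistics` module. The quartile convention is an assumption (other conventions give slightly different values) and the demo values are hypothetical; only the last ratio uses figures reported in this section.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range Q3 - Q1, using inclusive quartiles."""
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical values, for illustration only.
demo = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(iqr(demo))

# Ratio of the two IQRs reported above (Asia / The America).
iqr_ratio = 3.141 / 2.789
print(round(iqr_ratio, 3))
```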
The box-and-whisker plot is a suitable choice because it clearly shows the median and the quartile values on the chart, which makes the comparison between the two regions more accurate. As inferred from Figure 1.1, both plots are left-skewed.
Because both the Min and the Max of Asia are higher than those of The America, GDP-GR of region A ranges higher and wider than that of region B in 2014. Also, the Q1 of Asia approximately equals the Q2 of The America, which means that just 25% of countries in Asia have a low GDP-GR (<1.7%), while 50% of countries in The America suffer from it. Moreover, the Q3 of The America (2.905%) is smaller than the Q2 of Asia (3.222%), indicating that 50% of countries in Asia witnessed higher GDP-GR than 75% of countries in The America.
III/ REGRESSION
REGION A/ ASIA
a) Regression Final Output
After applying the backward elimination method (Appendix 3.12), the final regression output is illustrated below:
From the above, both scatter plots look alike, indicating weak relationships among the three variables: GDP-GR (annual %), GDP per capita (current US$) and Population ages 15-64 (% of total population) in Asia.
b) Regression Equation
As can be seen from Figure 3.1, there are two significant variables, hence the regression equation is written as:
$$\hat{Y} = -6.596 - 0.0001X_1 + 0.166X_2$$
where Y is GDP-GR (annual %), X1 is GDP per capita (current US$) and X2 is population ages 15-64 (% of total population).
Here, if GDP per capita is zero (X1 = 0) and the percentage of working-age people (15-64 years old) is zero (X2 = 0), b0 = -6.596, which is unlikely to make sense. However, the intercept simply indicates that, over the sample selected, the portion of GDP-GR not explained by GDP per capita and the percentage of population ages 15-64 is -6.596%. Moreover, X1 = 0 is outside the range of observed values. Consequently, the relationship inferred from the equation does not match the relationships presented in the scatter plots.
c) Interpret the regression coefficient of the significant independent variables
b1 = -0.0001 means that GDP-GR decreases slightly, on average by 0.0001 percentage points, for every $1 increase in GDP per capita, holding the percentage of population ages 15-64 constant.
b2 = 0.166 means that GDP-GR increases, on average, by 0.166 percentage points for each 1 percentage-point increase in the share of population ages 15-64, holding GDP per capita constant.
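The interpretation of the two coefficients can be checked with a small sketch that evaluates the fitted equation. It uses the rounded coefficients reported in this section (b0 = -6.596, b1 = -0.0001, b2 = 0.166); the input values and the function name are hypothetical and are chosen only to show the "holding the other variable constant" reading.

```python
# Rounded coefficients reported for Asia in this section.
B0, B1, B2 = -6.596, -0.0001, 0.166


def predict_gdp_gr_asia(gdp_per_capita_usd, working_age_pct):
    """Predicted GDP-GR (annual %) from the fitted equation."""
    return B0 + B1 * gdp_per_capita_usd + B2 * working_age_pct


# Hypothetical inputs, for illustration only.
base = predict_gdp_gr_asia(10_000, 65.0)
plus_one_point = predict_gdp_gr_asia(10_000, 66.0)

print(round(base, 3))
# Raising the working-age share by one point, with GDP per capita fixed,
# changes the prediction by exactly b2.
print(round(plus_one_point - base, 3))
```

Because the coefficients are rounded, predictions from this sketch will differ slightly from those of the full regression output.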
d) Interpret the Coefficient of Determination
The Coefficient of Determination is R² = 0.397 = 39.7%, which means that 39.7% of the variation in GDP-GR (Y) can be explained by the variation in GDP per capita (current US$) and population ages 15-64 (% of total population). The remaining 60.3% may be due to other factors.
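For readers who want to see where such a figure comes from, the sketch below computes the coefficient of determination, assuming the standard definition R² = 1 - SSE/SST. The observed and fitted values are hypothetical and are not the regression data used in this report.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SSE / SST for observed and fitted values."""
    mean_y = sum(actual) / len(actual)
    sse = sum((y - p) ** 2 for y, p in zip(actual, predicted))  # unexplained
    sst = sum((y - mean_y) ** 2 for y in actual)                # total
    return 1 - sse / sst


# Hypothetical observed and fitted values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
r2 = r_squared(actual, predicted)
print(round(r2, 3))

# Share of variation left unexplained when R^2 = 0.397, as reported above.
print(round(1 - 0.397, 3))
```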
REGION B/ THE AMERICA
a) Regression Final Output
After applying the backward elimination method (Appendix 3.26), the final regression output is illustrated below:
Inferred from the scatter plot above, there is a weak relationship between GDP-GR (annual %) and GDP per capita (current US$), because the points lie far from the fitted line.
b) Regression Equation
As can be seen from Figure 3.4, there is only one significant variable, hence the regression equation is:
$$\hat{Y} = 4.488 - 0.00007X_1$$
where Y is GDP-GR (annual %) and X1 is GDP per capita (current US$).
b0 is the estimated average value of Y when the value of X1 is zero (if X1 = 0 is in the range of observed X1 values).
Here, if GDP per capita is zero (X1 = 0), b0 = 4.488, which appears nonsensical. However, the intercept simply indicates that, over the sample selected, the portion of GDP-GR not explained by GDP per capita is 4.488%. Moreover, X1 = 0 is outside the range of observed values. Hence, the equation is consistent with the weak relationship found in the visual analysis above.
c) Interpret the regression coefficient of the significant independent variables
b1 = -0.00007 means that GDP-GR decreases slightly, on average by 0.00007 percentage points, for every $1 increase in GDP per capita.
d) Interpret the Coefficient of Determination
The Coefficient of Determination is R² = 0.375 = 37.5%, which means that 37.5% of the variation in GDP-GR (Y) can be explained by the variation in GDP per capita (current US$). The remaining 62.5% may be due to other factors.
IV/ REGRESSION CONCLUSION
As Part 2 analysed, during 2014 Asia's GDP-GR was generally higher than that of The America, as the median of Asia's dataset was about 1.8 times greater and its countries' GDP-GR ranged higher. Since GDP-GR measures economic growth status, Asia was believed to have higher economic growth than The America. Additionally, the IQR shows that GDP-GR varied more between nations in Asia than in The America. Therefore, a regression analysis was carried out to determine which major factors had an influence on GDP-GR.
The two regions show a different number of significant variables based on the regression summary. Asia has two significant independent variables, GDP per capita (current US$) and population ages 15-64 (% of total population), whereas The America has only one, GDP per capita (current US$).
Looking at the coefficients (the larger the absolute value, the greater the impact of that variable) of the two significant variables in Asia's regression output, population ages 15-64 has a greater impact on Asia's GDP-GR than GDP per capita (|0.166| > |-0.0001|). For The America, there is only one significant independent variable, GDP per capita, so its influence on GDP-GR cannot be compared with that of any other variable. Between the two regions, the coefficient of GDP per capita is larger in absolute value in Asia than in The America (|-0.0001| > |-0.00007|), which means it exerts more influence in Asia than in The America. However, both values are extremely small, illustrating that the influence is minor, which is consistent with Upreti (2015).
V/ TIME SERIES
1 Trend model
To predict the trend of GDP-GR in the two regions, we selected two countries from each region, in different income categories, over 1990-2015, to obtain a comprehensive and objective assessment. To determine each country's income segment, we used its average GNI over the period, as the numbers fluctuated considerably.
According to the income classifications (Table 5.1, Appendix 5.1), we selected Korea, Rep. (High-Income) and India (Low-Income) for the Asia region. In The America region, we chose Canada (High-Income) and Nicaragua (Lower-Middle-Income, since almost all the countries in this region are above Low-Income).
All four countries have at least one negative value in their dataset (Appendix 5.2). Therefore, it is not possible to produce the exponential model, so no exponential trend in GDP-GR is fitted for any of the four countries during 1990-2015.
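The reason negative values rule out the exponential model can be shown with a short sketch. It assumes the exponential trend is fitted in the usual way, by regressing the logarithm of the series on time, which requires every value to be positive. The series and the helper name are hypothetical.

```python
import math


def can_fit_exponential_trend(values):
    """The log-transformed fit needs every observation to be > 0."""
    return all(value > 0 for value in values)


# Hypothetical series, for illustration only.
all_positive = [2.1, 3.4, 1.8]
has_negative = [2.1, -0.5, 1.8]

print(can_fit_exponential_trend(all_positive))
print(can_fit_exponential_trend(has_negative))

# The logarithm of a negative growth rate is undefined.
try:
    math.log10(-0.5)
except ValueError as err:
    print("log of a negative value fails:", err)
```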
A) C1: Korea (HI)
a.1 Linear Trend
Formula and Coefficient explanation:
β0 = 7.841 indicates that when T = 0, i.e. the year before 1990, GDP-GR in Korea is predicted to be approximately 7.841%.
B) C2: India (LI)
b.1 Linear Trend
Formula and Coefficient explanation:
β0 = 2.743 indicates that when T = 0, i.e. the year before 1990, GDP-GR in India is predicted to be approximately 2.743%.
C) C3: Canada (HI)
c.1 Linear Trend
c.2 Quadratic Trend
Formula and Coefficient explanation:
As can be inferred, when T increases, GDP-GR in Canada is predicted to change by 0.591 - 0.04(T). Moreover, as β2 has a negative value, the quadratic trend model has a concave curve shape.
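The link between the sign of β2 and the shape of the curve can be written out for a generic quadratic trend model. The symbols below follow the β notation used in this part; no coefficient values are assumed.

```latex
\hat{Y} = \beta_0 + \beta_1 T + \beta_2 T^2
\qquad\Longrightarrow\qquad
\frac{d\hat{Y}}{dT} = \beta_1 + 2\beta_2 T,
\qquad
\frac{d^2\hat{Y}}{dT^2} = 2\beta_2
```

When β2 < 0 the second derivative is negative for every T, so the curve is concave and the predicted change per period shrinks as T grows.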
D) C4: Nicaragua (LMI)
d.1 Linear Trend
Formula and Coefficient explanation:
β0 = -0.355 indicates that when the time period is 0 (T = 0), i.e. the year before 1990, GDP-GR in Nicaragua is predicted to be approximately -0.355%.
β1 = 0.163 means that GDP-GR in Nicaragua (1990-2015) is predicted to increase by 0.163 percentage points each year, indicating an upward trend.
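A minimal sketch of how this linear trend can be evaluated for a given year is shown below. It uses the rounded coefficients reported above and assumes that T = 1 corresponds to 1990, which follows from T = 0 being described as the year before 1990; the function name is hypothetical.

```python
BASE_YEAR = 1989        # assumption: T = 0 is the year before 1990
B0, B1 = -0.355, 0.163  # rounded coefficients reported for Nicaragua


def linear_trend(year):
    """Predicted GDP-GR (%) for a calendar year under the linear trend."""
    t = year - BASE_YEAR
    return B0 + B1 * t


print(round(linear_trend(1990), 3))
print(round(linear_trend(2015), 3))
```

The same mapping from year to T applies to forecasts beyond the sample, for example T = 32 for 2021 under this assumption.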
d.2 Quadratic Trend
2 Recommended trend model
a Region A:
Looking at Table 5.2, the linear trend model of India has lower MAD and SSE values, which means it produces smaller errors than the linear trend model of Korea, so it forecasts more accurately. Therefore, in region A, the linear trend model of India is recommended to predict GDP-GR, which is:
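For reference, the two error measures used in this comparison can be sketched as follows, assuming the usual definitions: MAD is the mean absolute deviation between actual and fitted values, and SSE is the sum of squared errors. The numbers are hypothetical and are not the values in Table 5.2.

```python
def mad(actual, fitted):
    """Mean absolute deviation between actual and fitted values."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def sse(actual, fitted):
    """Sum of squared errors between actual and fitted values."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted))


# Hypothetical values, for illustration only.
actual = [2.0, 4.0, 6.0]
fitted = [1.0, 4.0, 8.0]
print(mad(actual, fitted))
print(sse(actual, fitted))
```

A model with lower MAD and SSE on the same series fits that series more closely.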
3 Predictions for GDP per capita growth rate in region A and B in 2021, 2022, 2023:
Based on the recommended trend models in part 2, we predict the GDP-GRs in region A and region B according to Table 5.4 above.
As can be inferred, GDP-GR in region A is much higher than that in region B during 2021 to 2023. Moreover, over those three years, region A is predicted to have an increase of 0.258% while region B may suffer a decrease of 1.399%, indicating that region B's rate changes faster than region A's.
VI/ TIME SERIES CONCLUSION
To predict GDP-GR in the two regions, Asia (A) and The America (B), datasets from two countries in different income categories in each region (Korea, Rep. and India; Canada and Nicaragua) during 1990-2015 were collected and illustrated in the line charts below.
The graph illustrates GDP-GR data for Asia, represented by Korea, Rep. (High-Income) and India (Low-Income). Over the given period, Korea showed a downward tendency, while India had an upward trend. Korea's rate fluctuated considerably: there was an abnormally low GDP-GR in 1998 due to its financial crisis in 1997 (Kihwan 2006), followed by a spectacular recovery that reached its peak in 1999. Thereafter, Korea moved in a downward direction. India, by contrast, was less volatile. In 1999, both countries reached their peak, but Korea's rate (10.677%) was about 1.5 times that of India (7.042%), and Korea's lowest point was also approximately 6 times lower than India's, meaning that GDP-GR in Korea ranges wider than that in India. Before 2003, India's GDP-GR was generally below Korea's; India's starting rate was about 2.6 times lower than Korea's (3.336% versus 8.8%). After 2003, however, the situation reversed: India tended to surpass Korea, and in 2015 India's rate was nearly triple that of Korea (6.101% versus 2.268%).
This chart gives information about The America's GDP-GR, represented by Canada (High-Income) and Nicaragua (Lower-Middle-Income). As illustrated, both datasets are highly volatile. The ranges of the two countries are fairly similar: in Canada, the peak and the lowest point are 5.836% and -4.796% in 2007 and 1997 respectively, while in Nicaragua they are 5.312% and -4.607% in 1999 and 2009. However, in 1997 and from 2010, Nicaragua experienced a much higher GDP-GR than Canada. Generally, Canada had a downward non-linear trend, whereas Nicaragua had an increasing trend.
Comparing both graphs, during 1990-2015 Asia mostly witnessed positive GDP-GR values while The America suffered negative values for several years. Moreover, Asia's rate ranges higher and wider than The America's, since in Asia it could reach above 10%. The HI and LI countries of Asia also have higher rates than the HI and LMI countries of The America respectively. It is noticeable that in 2008 and 2009 all four countries' rates declined sharply, owing to the effects of the 2008 global financial crisis (Bartmann 2017). Thereafter, the LI and LMI countries of both regions tended to surpass the HI countries in GDP-GR.
These findings are consistent with the Regression part, which found that the population ages 15-64 has a positive relationship with Asia's GDP-GR and that GDP per capita (current US$) has a negative relationship with GDP-GR in both regions. During the given period, the population ages 15-64 of India increased significantly, and the GDP per capita (current US$) of India and Nicaragua was much lower than that of Korea and Canada (Appendix 6.1, 6.2), resulting in the rise of the LI and LMI countries in GDP-GR.
From the line graphs, it can be concluded that during 1990-2015 the two regions witnessed considerable volatility; in particular, when global issues emerge, the GDP-GR of all countries may suffer their effects. However, Asia's GDP-GR seems to be higher and more varied than The America's.
Based on the specific analysis above, India's trend model is chosen to represent Asia's GDP-GR, and Canada's is the representative for The America. Therefore, by looking at the