ASSIGNMENT COVER PAGE
Title of Assignment: Individual case study - Inferential Statistics
Number of pages (including this one): 12 pages
Part 1: Introduction
Nowadays, the internet plays an essential role in people's lives. Connecting billions of people across the world, it is a core part of today's information society. The worldwide penetration rate grew from almost 17% in 2005 to over 53% in 2019. However, because certain parts of the globe have reached saturation levels, worldwide growth rates have slowed in recent years (ITU Telecommunication Development Bureau 2019).
As an information distribution system, the internet can deliver education and knowledge to everyone. It also creates substantial new economic prospects as well as the potential for more environmentally friendly choices in the marketplace. Furthermore, the internet can help enterprises in developing countries join the development mainstream, and it holds great promise for easing the delivery of essential services such as health and education, which are now unevenly distributed (United Nations Department of Economic and Social Affairs 2007). To illustrate, most individuals in developed countries are online (around 87%), whereas in the least developed countries (LDCs) just 19% of individuals had an internet connection in 2019. Europe has the highest internet usage rates, while Africa has the lowest (ITU Telecommunication Development Bureau 2019). In addition, the United Nations has created the 2030 Agenda with 17 Sustainable Development Goals (SDGs), one of which, Goal 8, is “promoting sustained, inclusive, and sustainable economic growth, full” (United Nations Department of Economic and Social Affairs, n.d). For these reasons, keeping track of who is using the internet is critical to accomplishing the United Nations' SDG 8.
According to Chong, Liew and Suhaimi (2012), there is a significant long-run and short-run connection between gross national income and the internet usage rate. Specifically, investing in Information and Communications Technology (ICT) infrastructure, particularly to encourage increased internet usage, helps raise gross national income per capita. Therefore, increasing internet usage should be included as one of the critical components of the New Economy Model if the policy's vision and purpose are to be realized in the future.
Part 2: Descriptive Statistics and Probability
- 33 nations are divided into three groups according to their gross national income (GNI):
+ Low-Income countries (LI): countries with a per capita GNI of less than $1,000.
+ Middle-Income countries (MI): countries with a per capita GNI of between $1,000 and $12,500.
+ High-Income countries (HI): countries with a per capita GNI of more than $12,500.
- After being divided into three categories, these nations are further split into two groups: “low usage of internet” countries (L), in which individuals using the internet make up no more than 40% of the population, and “high usage of internet” countries (H), in which they make up more than 40%.
A. Probability
Country category | Low usage of internet (L) | High usage of internet (H) | Total
Low-Income countries (LI) | 4 | 0 | 4
Middle-Income countries (MI) | 6 | 13 | 19
High-Income countries (HI) | 0 | 10 | 10
Total | 10 | 23 | 33
Table 1: Contingency table of internet usage statistics for each nation category
a. To determine whether income and internet usage are statistically independent events, we compare the conditional probability of low internet usage given a low-income nation, P(L | LI), with the unconditional probability of low internet usage, P(L). Here L is the event under examination and LI is the conditioning event.
Furthermore, the probability that a nation has low internet usage is P(L):
$$P(L) = \frac{10}{33} \approx 0.303$$
$$P(L \mid LI) = \frac{P(L \cap LI)}{P(LI)} = \frac{4/33}{4/33} = 1$$
Because the probability of low internet usage given a low-income nation, P(L | LI) = 1, differs from the unconditional probability of low internet usage, P(L) ≈ 0.303, income and internet usage are statistically dependent events. The same holds for the other two country groups. As a result, the gross national income of a country and its individuals' use of the internet are dependent on each other.
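b. To determine which country category has the higher internet usage, we calculate the probability of high internet usage for each of the three categories and compare the results:
$$P(H \mid LI) = \frac{0/33}{4/33} = 0$$
$$P(H \mid MI) = \frac{13/33}{19/33} = \frac{13}{19} \approx 0.684$$
$$P(H \mid HI) = \frac{10/33}{10/33} = 1$$
$$P(H \mid LI) < P(H \mid MI) < P(H \mid HI)$$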
- As a result, the chance of a low-income nation having high internet usage is 0%. In contrast, the probability of high internet usage is 100% for high-income countries, compared with 68.4% for middle-income countries. Countries with a higher internet usage rate therefore tend to have a higher GNI, and governments should raise the percentage of citizens using the internet to improve GNI and boost the economy.
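The conditional probabilities in parts a and b can be recomputed with a short script. This is a minimal sketch rather than the method used in the report: the cell counts are inferred from the fractions quoted above (4 low-income, 19 middle-income and 10 high-income countries, with 13 middle-income countries in the high-usage group), and the function names are illustrative.

```python
from fractions import Fraction

# ASSUMPTION: counts inferred from the fractions quoted in the text.
counts = {
    "LI": {"L": 4, "H": 0},
    "MI": {"L": 6, "H": 13},
    "HI": {"L": 0, "H": 10},
}


def marginal(table, usage):
    """P(usage): share of all countries in the given usage group."""
    total = sum(sum(row.values()) for row in table.values())
    return Fraction(sum(row[usage] for row in table.values()), total)


def conditional(table, usage, income):
    """P(usage | income): share of an income group in the given usage group."""
    row = table[income]
    return Fraction(row[usage], sum(row.values()))


def is_independent(table, usage, income):
    """Two events are independent only if P(usage | income) equals P(usage)."""
    return conditional(table, usage, income) == marginal(table, usage)


p_low = marginal(counts, "L")                   # 10/33
p_low_given_li = conditional(counts, "L", "LI")  # 1
dependent = not is_independent(counts, "L", "LI")

# Probability of high usage for each income group, in increasing order.
ranking = sorted(counts, key=lambda group: conditional(counts, "H", group))
```

Using `Fraction` keeps the results exact, so the comparison between P(L | LI) and P(L) is not affected by rounding.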
B. Descriptive statistics
Min | <, =, > | Lower bound | Max | <, =, > | Upper bound | Result
Table 2: Measures of identifying outliers of each country on usage of internet
To obtain an accurate descriptive analysis, the data set must first be examined carefully for outliers. Table 2 shows that there are no extreme values in any of the three country categories.
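Table 2 compares each group's minimum and maximum with a lower and an upper bound. The sketch below shows one common way to obtain such bounds, the 1.5 × IQR rule. The report does not state which rule it used, so the rule, the quartile method and the sample values are all assumptions made for illustration.

```python
import statistics


def iqr_bounds(values):
    """Return (lower, upper) bounds using the 1.5 * IQR rule (assumed rule)."""
    # statistics.quantiles uses the "exclusive" method by default.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """List the values that fall outside the bounds."""
    lower, upper = iqr_bounds(values)
    return [v for v in values if v < lower or v > upper]


# Hypothetical internet-usage percentages, not the report's data.
usage = [10, 20, 30, 40, 50, 60, 70]
lower, upper = iqr_bounds(usage)

# Same check as the table layout: Min against lower bound, Max against upper bound.
has_outliers = min(usage) < lower or max(usage) > upper
```

A group has no outliers when its minimum is not below the lower bound and its maximum is not above the upper bound.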
a. Measurement of Central Tendency
Table 3: Measurements of Central Tendency of each country on usage of internet in 2017
The mean is the most suitable measure of central tendency here. It is calculated from all the values in the data and can be treated further mathematically. Moreover, it is simple for non-technical audiences to comprehend (Gholba 2012). Furthermore, although the mean is sensitive to outliers, there are no extreme values in the dataset (Table 2).
Table 3 shows that low-income nations have a lower mean than middle-income countries, and middle-income countries have a lower mean than high-income ones. This indicates an association between internet usage and GNI: countries with higher income have a higher mean percentage of individuals using the internet.
b. Measurement of Variation
Table 4: Measures of Variation of each country on usage of internet in 2017
The most suitable measure of variation is the standard deviation, S, which is also the most widely used one. It describes the spread around the mean and is calculated from all the values in the dataset. Furthermore, its units are the same as those of the original data (Descriptive Statistics 2021). Although S is sensitive to extreme values, there are no outliers in this dataset (Table 2).
Table 4 shows that the S of middle-income countries is the highest (19.131%), followed by high-income and low-income countries at 8.709% and 6.313%, respectively. In other words, the values for middle-income countries are more spread out around their mean than those of the other two categories, so internet usage in an MI country may lie further (higher or lower) from the group mean. In contrast, HI countries have a smaller S than MI countries, so their usage rates cluster more tightly around a high mean (82.647%, with a spread of only 8.709%). Consequently, countries should promote internet usage to stimulate GNI and the economy.
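The mean and the sample standard deviation S discussed above can be computed as in the following minimal sketch. The group values are hypothetical placeholders, since the country-level data are not reproduced here, and the function name is illustrative.

```python
import statistics


def summarize(values):
    """Return (mean, sample standard deviation S) for one country group."""
    # statistics.stdev divides by n - 1, i.e. it is the sample standard deviation.
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical internet-usage percentages, not the report's data.
groups = {
    "MI": [20.0, 45.0, 70.0],
    "HI": [75.0, 80.0, 85.0],
}

summary = {name: summarize(values) for name, values in groups.items()}

# The group with the largest S is the one most spread out around its mean.
most_spread = max(summary, key=lambda name: summary[name][1])
```

Because S has the same units as the data, the two numbers for each group can be read together, for example as a mean of 80% with a spread of 5 percentage points.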
Part 3: Confidence Intervals
a. Calculating the confidence interval for the worldwide average of individuals using the Internet (percent of population)
- The confidence level in this part is set to 95%, since it is the most commonly used confidence level (Hazra 2017).
Sample size n: 33
Table 5: Summary of data regarding the global average of people who use the internet
$$\text{Confidence interval} = \bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$$
$$42.806 \le \mu \le 62.502$$
We are 95% confident that the world average of internet users is between 42.806 and 62.502 percent of the population.
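The interval above can be checked with a short calculation. This is a sketch under stated assumptions: the sample mean (52.654) and the t-critical value (2.0369) are the values used in part 4, while the sample standard deviation is not given in the surviving text, so it is backed out of the reported interval rather than taken from the data. The function name is illustrative.

```python
import math


def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Return (low, high) for mean +/- t * S / sqrt(n)."""
    margin = t_critical * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


n = 33
sample_mean = 52.654
t_critical = 2.0369  # two-tailed, alpha = 0.05, 32 degrees of freedom

# ASSUMPTION: S is implied by the reported interval, not read from the data.
reported_low, reported_high = 42.806, 62.502
implied_margin = (reported_high - reported_low) / 2
implied_sd = implied_margin * math.sqrt(n) / t_critical

low, high = t_confidence_interval(sample_mean, implied_sd, n, t_critical)
```

The implied sample standard deviation is a little under 28 percentage points, which is consistent with the test statistic reported in part 4.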
b. Assumption
Since the sample size is large enough (n = 33 > 30), the Central Limit Theorem (CLT) applies regardless of whether the population is normally distributed: the sampling distribution of the sample mean can be approximated by a normal distribution.
No additional assumption about the shape of the population distribution is required.
c. Assume we know the worldwide standard deviation of Internet users.
In this case the population standard deviation σ is provided, so we use the z-table instead of the t-table. In other words, the t-critical value used in part 3a is replaced by the z-critical value, and the new formula is:
$$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
Critical z-values are smaller than critical t-values at any given confidence level. A smaller critical value gives a narrower confidence interval; a wider interval, on the other hand, is a more cautious one (McEvoy 2018). Furthermore, the width of a confidence interval is determined by its margin of error, so when the interval narrows, the margin of error decreases as well, which leads to more precise results (Simundic 2008).
If σ, the world standard deviation of individuals using the internet, is known, the confidence interval will be narrower and more precise.
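The effect of switching from the t-critical to the z-critical value can be illustrated numerically. In this sketch the standard deviation of 27.77 is a hypothetical placeholder, and σ is assumed equal to S purely so that the two margins are comparable; the t-critical value is the one quoted in part 4.

```python
import math
from statistics import NormalDist


def margin_of_error(critical_value, sd, n):
    """Half-width of the interval: critical value * sd / sqrt(n)."""
    return critical_value * sd / math.sqrt(n)


n = 33
sd = 27.77  # ASSUMPTION: illustrative value, with sigma taken equal to S

z_critical = NormalDist().inv_cdf(0.975)  # two-tailed 95% z value, about 1.96
t_critical = 2.0369                       # two-tailed 95% t value, 32 df

z_margin = margin_of_error(z_critical, sd, n)
t_margin = margin_of_error(t_critical, sd, n)
narrower_with_z = z_margin < t_margin
```

With the same standard deviation and sample size, the only difference between the two margins is the critical value, so the z-based interval is the narrower one.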
Part 4: Hypothesis Testing
a. Hypothesis Testing (CV approach)
In 2016, the population mean for internet users was 44.7% of the population, according to a World Health Organization survey. Based on the calculations in part 3, we are 95 percent confident that the global average of internet users is between 42.806 and 62.502 percent. The 2016 figure also lies within this interval, so it is unclear whether individuals' use of the internet will increase, decrease, or remain unchanged in the upcoming years. We therefore start with a two-tailed test to check whether internet usage has changed or remained unchanged.
Population standard deviation
Sample standard deviation
Table 5: Summary of data regarding the global average of people who use the internet
- Step 1: Check for CLT
The sample contains 33 countries, so the sample size n = 33 is greater than 30. Therefore, the CLT is applicable and the sampling distribution of the mean is approximately normal.
- Step 2: Determine the null hypothesis H0 and the alternative hypothesis H1
H0: μ = 44.7
H1: μ ≠ 44.7
- Step 3: Determine the kind of test
Because H1 in step 2 contains ≠, this is a two-tailed test.
- Step 4: Choose which table to use
The t-table is used because the population standard deviation is unknown.
- Step 5: Determine critical values (CV)
With α = 0.05, 32 degrees of freedom, and a two-tailed test, the t-critical values are ±2.0369.
- Step 6: Calculate the test statistic t
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{52.654 - 44.7}{S/\sqrt{33}} = 1.645$$
- Step 7: Make statistical decision
[Figure: t-distribution curve showing the rejection and non-rejection regions]
As the curve shows, the t value lies in the non-rejection region (−2.0369 < 1.645 < 2.0369). Consequently, we do not reject the null hypothesis H0.
- Step 8: Make managerial decision
As H0 is not rejected, at the 5% significance level there is insufficient evidence to conclude that the world average of individuals using the Internet (percentage of population) differs from 44.7 percent.
- Step 9: Discuss the possible error
Since H0 is not rejected, a Type II error might have been committed.
In this context the error means that we conclude the world average of individuals using the Internet (percentage of population) is 44.7% when, in fact, it might not be 44.7% in the upcoming years.
The world average of people using the internet may change (drop or rise) in the future.
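Steps 5 to 7 can be sketched in a few lines of code. The sample standard deviation is not given in the surviving text, so the standard error is backed out of the confidence interval reported in part 3; that derivation and the function names are assumptions made for illustration.

```python
def one_sample_t(sample_mean, hypothesized_mean, standard_error):
    """Test statistic t = (sample mean - hypothesized mean) / standard error."""
    return (sample_mean - hypothesized_mean) / standard_error


def two_tailed_decision(t_stat, t_critical):
    """Reject H0 only when |t| exceeds the critical value."""
    return "reject H0" if abs(t_stat) > t_critical else "do not reject H0"


t_critical = 2.0369  # alpha = 0.05, 32 degrees of freedom, two-tailed

# ASSUMPTION: standard error implied by the interval 42.806 to 62.502.
standard_error = (62.502 - 42.806) / (2 * t_critical)

t_stat = one_sample_t(52.654, 44.7, standard_error)
decision = two_tailed_decision(t_stat, t_critical)
```

The statistic rounds to the 1.645 reported in step 6 and lies between the two critical values, which matches the decision in step 7.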
b. Consider the influence of doubling the number of nations in the dataset on the hypothesis testing findings.
- If the number of nations in the dataset is doubled, the sample size n doubles, which directly affects the degrees of freedom.
- In hypothesis testing, the standard error, which is determined by the sample size, describes the width of the sampling distribution (The University of Texas, n.d). In other words, the standard error represents the dispersion of the sampling distribution. As the sample size increases, this dispersion decreases and the sample mean tends to lie closer to the population mean (Central Limit Theorem). The standard error is therefore inversely related to the sample size, falling in proportion to the square root of n (Zijing Zhu 2020). With a larger sample, the t-distribution has a narrower curve, so the t-critical values move closer to the centre and the non-rejection region shrinks. The sample standard deviation S also becomes a more stable estimate of σ, although it does not shrink systematically as n grows. With the new sample, the test statistic becomes:
$$t' = \frac{\bar{X}' - \mu}{S'/\sqrt{n'}}$$
Whether t' falls in the rejection region depends on how the new sample mean compares with 44.7%. If the sample mean moves towards the hypothesised value, as expected when H0 is true, t' stays inside the non-rejection region and the statistical conclusion does not change. If the sample mean and S stayed at their current values, however, t' would grow by a factor of √2 to about 2.33, which exceeds the critical value of roughly ±2.00 for 65 degrees of freedom, so the conclusion is not guaranteed to remain the same.
- Furthermore, statistical power increases with sample size. Increasing the sample size enhances power by lowering the standard error, which raises the test statistic for a given difference (Introduction to Hypothesis Testing n.d). In other words, the sample is more representative of the general population when the standard error is lower. The standard error is inversely related to the sample size: the larger the sample, the smaller the standard error, as the statistic approaches the true value. The standard error is an inferential statistic that indicates the standard deviation of the sample mean. It measures the spread of that estimate, and the estimate is more precise when the spread is smaller (Kenton and Mansa 2020). Consequently, the result will be more accurate.
- The power of the test is 1 − β, where β is the probability of a Type II error, that is, of failing to reject the null hypothesis when it is false: P(Reject H0 | H0 is false) = 1 − P(Fail to reject H0 | H0 is false) = 1 − β. With higher power, we are less likely to commit a Type II error.
Decreasing β by increasing the sample size increases the power of the test, 1 − β. In other words, the lower β is, the higher the statistical power (Zijing Zhu 2020).
For these reasons, increasing the sample size n makes the results of hypothesis testing more accurate, and the statistical decision is expected to remain unchanged as long as the new sample mean stays close to 44.7%.
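A rough numerical sketch of the sample-size effect follows. It assumes, only for illustration, that the sample mean and S keep their current values when n doubles; a real second sample would change both. The value of S is not given in the surviving text, so 27.77 is a hypothetical placeholder, and the function names are illustrative.

```python
import math


def standard_error(sample_sd, n):
    """Standard error of the mean: S / sqrt(n)."""
    return sample_sd / math.sqrt(n)


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """One-sample t statistic for the given summary values."""
    return (sample_mean - hypothesized_mean) / standard_error(sample_sd, n)


sample_sd = 27.77  # ASSUMPTION: illustrative value for S

se_33 = standard_error(sample_sd, 33)
se_66 = standard_error(sample_sd, 66)

# ASSUMPTION: mean and S held fixed while n doubles from 33 to 66.
t_33 = t_statistic(52.654, 44.7, sample_sd, 33)
t_66 = t_statistic(52.654, 44.7, sample_sd, 66)
growth = t_66 / t_33
```

Under this fixed-mean assumption the standard error falls by a factor of √2 and the test statistic rises by the same factor, so whether the decision changes depends on how far the new sample mean lies from 44.7%.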
Part 5: Conclusion
Key findings from the analysis and calculation of individuals using the internet at three different income levels in 33 countries are listed below.
- In part 1, there is an upward trend in internet use across the whole world, and there is a link between internet usage and gross national income: in high- and middle-income countries, the share of individuals using the internet is higher than in low-income countries. GNI indicates the state of a country's economy. So, governments should encourage citizens to access more education and pay attention to developing the internet (for example, high-speed internet, 5G, and so on) to boost the economy.
- To strengthen this relationship, part 2 concludes that GNI and internet usage are dependent events, because the conditional probability of low internet usage given a low-income country is not equal to the probability of low internet usage (P(L | LI) ≠ P(L)). In addition, the probability of high internet usage is 100% for high-income nations, compared with 68.4% and 0% for middle- and low-income nations, respectively (P(H | LI) < P(H | MI) < P(H | HI)). The descriptive statistics also show the link between GNI and internet usage. For low-, middle-, and high-income countries, the mean of individuals using the internet (% of the population) is 9.519%, 45.949%, and 82.647%, respectively, which demonstrates the substantial difference in internet usage between the three groups of countries.
- Besides, in part 3, we are 95% confident that the world average of individuals using the internet is between 42.806% and 62.502% of the population. The 2016 figure of 44.7% lies within this confidence interval, and the hypothesis test in part 4 found insufficient evidence that the rate differs from 44.7%. Moreover, the sample size in this case is 33 countries; increasing that number (doubling it, for instance) would make the results more accurate, because a larger sample size n decreases the Type II error probability β and so increases the power of the test, 1 − β.
In conclusion, the above analysis shows that the share of individuals who use the internet has a significant link with gross national income (GNI): the greater the access to the internet, the higher the income. One reason is that people who can access the internet can also access higher education. This supports sustainable economic development, which is one of the Sustainable Development Goals of the United Nations. Therefore, countries should invest in advancing the internet and encourage its use in order to develop the economy sustainably.