Subject’s Code: ECON1193

Subject’s Name: Business Statistics

Location & Campus: RMIT Vietnam, Hanoi

Student’s Name & Number: Vu Quang Hung - s3743984

Lecturer’s

This report contains 5 parts:

1 Summary descriptive statistics

2 Confidence intervals

3 Hypothesis testing

4 Regression analysis

5 Overall conclusion

Part 1: Summarize descriptive statistics

From my Assignment 1 results, I concluded that the crude death rate (CDR) and national income are related. Below I list the analyses I carried out previously to show why these two variables are related.

Firstly, Iceland has a lower death rate than Uganda. Iceland is a more developed country (GNI of 49960 in 2015), whereas Uganda has a much lower income (GNI of 670 in 2015) and is still a developing country.

(CDR from 1980 to 2016: Iceland & Uganda)

Taking a larger sample of 35 countries, the results from the contingency table suggest that high- and middle-income countries have a higher probability of having a low crude death rate.

(CDR Contingency Table)

Moreover, as can be seen in Table 1 below, CDR and income are not independent, which means that the death rate might be affected by national income and vice versa.

(Table 1)

All of the evidence above suggests that there is a relationship between the income and the death rate of a country.

Part 2: Confidence Intervals

Death rate, crude (per 1,000 people)

Firstly, I choose a significance level of 5%, so the confidence level is 95%: α = 0.05, α/2 = 0.025

The population standard deviation (σ) is unknown, so I substitute the sample standard deviation S for σ and use the t-table to calculate the confidence interval.

Sample size n = 35

d.f. = n - 1 = 35 - 1 = 34

t34,0.025 = 2.0322

(Distribution plot)

(Table 2)

Point estimate X̄ = 8.22 (mean value in Table 2)

S = 2.26

-> Confidence interval estimate: µ = X̄ ± t34,0.025 × S/√n = {8.22 - 0.7766; 8.22 + 0.7766} = {7.444; 8.997} -> 7.444 < µ < 8.997

This means that I can be 95% confident that the true mean crude death rate falls between 7.444 and 8.997 (per 1000 people)
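The interval above can be re-derived with a few lines of Python. This is only a sketch: the helper name `t_confidence_interval` is illustrative, and the critical value is copied from the t-table figure quoted above rather than computed. Because the inputs (8.22 and 2.26) are rounded, the result may differ from the reported bounds in the third decimal place.

```python
import math

def t_confidence_interval(mean, s, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin

# Values reported in this section; t_crit is the t-table value
# for d.f. = 34 and α/2 = 0.025 quoted above.
lower, upper = t_confidence_interval(mean=8.22, s=2.26, n=35, t_crit=2.0322)
print(f"{lower:.3f} < mu < {upper:.3f}")
```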

Gross National Income (GNI) per capita

Apply the same method:

+) Level of significance: 5% (α = 0.05)

+) The calculated confidence interval estimate is µ = {11408.137; 27545.577} -> 11408.137 < µ < 27545.577

+) I am 95% confident that the average Gross National Income per capita is between 11408.137 and 27545.577 (US dollars)

Domestic general government health expenditure per capita, PPP

+) Level of significance: 5% (α = 0.05)

+) The calculated confidence interval is µ = {854.93; 2065.128} -> 854.93 < µ < 2065.128

+) That means I can be 95% confident that the average domestic general government health expenditure per capita is between 854.93 and 2065.128 (current international $)

Assumptions:

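No assumption about the shape of the population distribution is required in this case: since the sample size (n = 35) is greater than 30, the Central Limit Theorem applies, the sampling distribution of the mean is approximately normal, and the t-table can be used.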

Suppose the number of countries doubles:

o Firstly, the confidence interval width will decrease if the number of countries is doubled.

According to the confidence interval formula, the margin of error is t × S/√n, so it shrinks as the square root of the number of measurements (n) grows. Therefore, when we double the number of countries from 35 to 70 (doubling n), the standard error S/√n decreases and the confidence interval range becomes smaller.

Overall, the confidence interval will become narrower around the estimate of the true mean, hence increasing the precision of the results. Moreover, increasing the number of countries involved in the test will make the result more representative of all 195 countries.
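A quick numerical check of this argument is sketched below. It assumes, purely for illustration, that the sample standard deviation stays at S = 2.26 when the sample grows; in a real sample of 70 countries S would change, and the t critical value would also fall slightly.

```python
import math

def standard_error(s, n):
    """Standard error of the mean, s / sqrt(n)."""
    return s / math.sqrt(n)

s = 2.26  # sample standard deviation from Part 2, assumed unchanged
se_35 = standard_error(s, 35)
se_70 = standard_error(s, 70)

print(f"SE at n = 35: {se_35:.4f}")
print(f"SE at n = 70: {se_70:.4f}")
# Doubling n divides the standard error by sqrt(2), not by 2.
print(f"ratio: {se_35 / se_70:.4f}")
```

So doubling the sample narrows the interval by roughly 29%, not by half.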

Part 3: Hypothesis Testing

a Prediction:

As calculated in Part 2 above, I am 95% confident that the world average crude death rate is between 7.444 and 8.997. Given that the sample mean X̄ was 8.2 in 2015, while the mean of the population in 2013 was 7.7, it seems that the world average crude death rate has increased to some extent.

However, for a sounder judgement, hypothesis testing is required to determine whether the crude death rate per thousand people has increased or not.

b Hypothesis testing:

Level of significance: α = 0.05

Sample size n = 35 > 30, so the Central Limit Theorem is applicable and the sampling distribution of the mean is approximately normal. Because σ is unknown, I will use the t-value to do the test.

Null hypothesis and alternative hypothesis:

H0 : µ ≤ 7.7 (the average crude death rate has not increased)

H1 : µ > 7.7 (the average crude death rate has increased)

This is a one-tailed (upper-tailed) test, since the alternative hypothesis H1 is focused on the upper tail, above the mean of 7.7.

α = 0.05

d.f. = 34

Upper-tailed test

-> t34,0.05 = 1.6909

Compute the test statistic (sample mean X̄ = 8.22, µ = 7.7, sample standard deviation S = 2.26, n = 35):

t = (X̄ - µ) / (S/√n) = (8.22 - 7.7) / (2.26/√35)

-> t = 1.36

t = 1.36 < 1.6909, so the test statistic falls into the non-rejection region; therefore we do not reject the null hypothesis H0.

As H0 is not rejected, at the 5% level of significance there is not sufficient evidence to conclude that the average crude death rate has increased; the data are consistent with it having decreased or stayed the same.
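The test statistic and the decision can be checked with the short sketch below. The function name `one_sample_t` is illustrative, and the critical value is the t-table figure quoted above (d.f. = 34, α = 0.05, upper tail), not something the code computes.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """t statistic for a one-sample test of the mean with unknown sigma."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

t_stat = one_sample_t(sample_mean=8.22, mu0=7.7, s=2.26, n=35)
t_crit = 1.6909  # from the t-table, as quoted in the text

# Upper-tailed test: reject H0 only if t exceeds the critical value.
reject_h0 = t_stat > t_crit
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```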

By not rejecting H0, I might have committed a Type II error (failing to reject a false null hypothesis).

This means there is some probability that the average world crude death rate has in fact increased (H0 is false) while I still conclude that it has decreased or remained unchanged (H0 not rejected).

The risk of a Type II error can be reduced by choosing a larger sample size (n). Increasing the sample size makes the hypothesis test more sensitive, which means that it is more likely to reject the null hypothesis when it is, in fact, false. Another option is to increase the level of significance α, which makes the rejection region larger and so makes it less likely that a false null hypothesis goes unrejected, at the cost of a higher risk of a Type I error.
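To make the effect of n and α concrete, the sketch below approximates the power of the upper-tailed test (power = 1 - probability of a Type II error). Everything in it is an assumption made for illustration: it uses a normal approximation instead of the t distribution, treats σ as equal to S = 2.26, and supposes a hypothetical true mean of 8.2. The function name `approx_power` is not from the report.

```python
import math
from statistics import NormalDist

def approx_power(mu0, mu_true, sigma, n, alpha):
    """Approximate power of an upper-tailed test of H0: mu <= mu0
    (normal approximation, sigma treated as known)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                      # critical z value
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))   # standardized effect
    return 1 - z.cdf(z_crit - shift)

# Hypothetical scenario: true mean 8.2, sigma close to S = 2.26.
for n, alpha in [(35, 0.05), (70, 0.05), (35, 0.10)]:
    power = approx_power(mu0=7.7, mu_true=8.2, sigma=2.26, n=n, alpha=alpha)
    print(f"n={n}, alpha={alpha}: power ~ {power:.2f}, "
          f"P(Type II) ~ {1 - power:.2f}")
```

Under these assumptions both a larger n and a larger α raise the power, which matches the two remedies described above.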

Part 4: Regression analysis

a Dependent variable and independent variables:

Dependent variable: Death rate, crude (per 1,000 people)

Independent variables:

- GNI per capita, Atlas method (current US$)

- Domestic general government health expenditure per capita, PPP (current international $)

- Immunization, measles (% of children ages 12-23 months)

- Smoking prevalence, total (age 15+)

b Regression analysis:

1 CDR & GNI

Expected result: As I concluded in the previous report, I expect the relationship between CDR and GNI to be negative and linear; higher-income countries tend to have a lower crude death rate.

Scatter plot of CDR and GNI:

(Scatter plot "CDR & GNI": vertical axis from 0 to 14; horizontal axis GNI per capita (current US$), from 0 to 100,000)

(Regression model)

Comment on scatter plot:

- According to the graph, there seems to be no linear relationship between these two variables: many countries in this plot have similar incomes but very different CDRs.

Regression output (from excel):

- Based on the summary output, the Simple Linear Regression equation is

- Regression coefficient (slope): b1 = 0.00001 is the estimated average increase in the crude death rate (per 1,000 people) when gross national income per capita grows by $1

- Coefficient of determination R square: 2%. This means that 2% of the variation in the crude death rate is explained by the country's income

- Test the significance of the independent variable:

H0 : β1 = 0 (no linear relationship)

H1 : β1 ≠ 0 (linear relationship does exist)

o Using t-value method (two-tailed test):

α = 0.05 -> α/2 = 0.025

d.f. = n - 2 = 35 - 2 = 33

t critical value (from t-table): t33,0.025 = ±2.0345

Test statistic t = 0.813

-2.0345 < 0.813 < 2.0345, so the t statistic is in the non-rejection region -> Do not reject H0

o Using p-value method:

Pb1 = 0.422; α = 0.05

Pb1 > α -> Do not reject H0

There is not sufficient evidence to conclude that a linear relationship exists between crude death rate and gross national income.
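The two decision rules used here (critical value and p-value) can be written as one small helper, sketched below. The name `two_tailed_decision` is illustrative; the t statistic, critical value and p-value are the figures reported above from the Excel output and the t-table, not recomputed from data.

```python
def two_tailed_decision(t_stat, t_crit, p_value, alpha=0.05):
    """Apply both rules for H0: beta1 = 0; the two results should agree."""
    reject_by_t = abs(t_stat) > t_crit   # critical-value rule
    reject_by_p = p_value < alpha        # p-value rule
    return reject_by_t, reject_by_p

# Reported values for CDR regressed on GNI (d.f. = 33).
by_t, by_p = two_tailed_decision(t_stat=0.813, t_crit=2.0345, p_value=0.422)
print("Reject H0 (t-value method):", by_t)
print("Reject H0 (p-value method):", by_p)
```

The same helper applies unchanged to the other three independent variables by substituting their reported t statistics and p-values.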

2 CDR & Domestic health expenditure

Expected result: Negative linear relationship. Countries that spend more money on domestic health programs should have fewer people dying from health problems and thus a lower CDR.

Scatter plot of CDR & Domestic general government health expenditure:

(Scatter plot "CDR & Domestic health expenditure": vertical axis from 0 to 14; horizontal axis domestic health expenditure (current international $))

Comment on the scatter plot: This scatter plot shows that CDR and domestic health expenditure do not have a linear relationship. Almost half of the 35 countries spend approximately the same amount on health (less than $500 per capita), but these countries have very different death rates, which suggests that money spent on health programs has no clear effect on CDR.

Regression output:

- Based on the regression statistics, linear equation is

- Regression slope b1 = 0.00023 -> For every additional dollar per capita the government spends on healthcare, the estimated death rate increases by 0.00023 per 1,000 people

- Coefficient of determination R square = 0.033 -> 3.3% of the variation in CDR is explained by domestic healthcare expenditure

H0 : β1 = 0 (no linear relationship)

H1 : β1 ≠ 0 (linear relationship does exist)

o Using t-value method:

α = 0.05 -> α/2 = 0.025

t critical value (from t-table): t = ±2.0345

Test statistic t = 1.0623

The t statistic is in the non-rejection region -> Do not reject H0

o Using p-value method:

Pb1 = 0.2958; α = 0.05

Pb1 > α -> Do not reject H0

There is not sufficient evidence to conclude that a linear relationship exists between crude death rate and domestic general government health expenditure.

3 CDR & Immunization, measles

Expected result: Negative linear relationship. Children aged 12-23 months who are immunized are less likely to contract measles in the future, which might reduce the number of people dying from this disease. Therefore, the overall death rate may decrease.

Scatter plot of CDR and Immunization, measles (%)

(Scatter plot "CDR & Immunization, measles": vertical axis from 0 to 14; horizontal axis Immunization, measles (% of children ages 12-23 months))

Comment on the scatter plot: This scatter plot shows that CDR and the measles immunization rate do have a linear relationship. The more children are immunized, the lower the death rate, hence this relationship is negative. However, since the actual values lie quite far from the trend line, this is a weak relationship.

Regression output:

- Regression slope b1 = -0.06871: the estimated death rate decreases by 0.06871 (per 1,000 people) for each 1 percentage point increase in the proportion of children aged 12-23 months immunized against measles.

- Coefficient of determination R square = 0.126 -> 12.6% of the variation in CDR is explained by the measles immunization percentage, while the remaining 87.4% is due to other factors

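H0 : β1 = 0 (no linear relationship)

H1 : β1 ≠ 0 (linear relationship does exist)

o Using t-value method:

α = 0.05 -> α/2 = 0.025

t critical value (from t-table): t = ±2.0345

Test statistic t = -2.182

-2.182 < -2.0345, so the t statistic is in the rejection region -> Reject H0

o Using p-value method:

Pb1 = 0.036; α = 0.05

Pb1 < α -> Reject H0

There is sufficient evidence to conclude that a linear relationship exists between crude death rate and the proportion of children aged 12-23 months who have received measles immunization.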

4 CDR & Smoking prevalence

Expected result: Positive linear relationship. Smoking causes numerous deadly diseases; therefore, the death rate is expected to rise if smoking prevalence among people aged 15 and over rises.

Scatter plot of CDR & Smoking prevalence:

(Scatter plot "CDR & Smoking prevalence": vertical axis from 0 to 14; horizontal axis Smoking prevalence (%))

Comment on the scatter plot: This scatter plot shows that CDR and smoking prevalence (SP) do not have a linear relationship. It can be seen that the data points scatter randomly around the trend line.

Regression output:

- Regression slope b1 = 0.0467. This means that an increase of 1 percentage point in smoking prevalence is associated with a rise of 0.0467 in the crude death rate (per 1,000 people)

- Coefficient of determination R square = 0.040 -> 4% of the variation in CDR is explained by smoking prevalence

H0 : β1 = 0 (no linear relationship)

H1 : β1 ≠ 0 (linear relationship does exist)

o Using t-value method:

α = 0.05 -> α/2 = 0.025

t critical value (from t-table): t = ±2.0345

Test statistic t = 1.177

The t statistic is in the non-rejection region -> Do not reject H0

o Using p-value method:

Pb1 = 0.248; α = 0.05

Pb1 > α -> Do not reject H0

There is not sufficient evidence to conclude that a linear relationship exists between crude death rate and smoking prevalence.

c Variable recommended for further research on crude death rate:

After considering all 4 independent variables above, I recommend immunization, measles (% of children ages 12-23 months) for further research on crude death rate. The main reason is that, among the 4 given variables, it has the strongest correlation with CDR (R square = 0.126), although the relationship is weak. Also, based on the regression analysis, it is the only variable for which I can conclude that a linear relationship with CDR exists.
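As a cross-check on the four regressions, the sketch below uses the identity that holds for simple linear regression, R² = t² / (t² + d.f.) with d.f. = n - 2, to recover each R square from the reported t statistic. The dictionary simply restates the t statistics and p-values given in section b; the helper name `r_squared_from_t` is illustrative.

```python
def r_squared_from_t(t_stat, df):
    """Simple linear regression identity: R^2 = t^2 / (t^2 + df)."""
    return t_stat**2 / (t_stat**2 + df)

# (t statistic, p-value) as reported in section b; d.f. = 35 - 2 = 33.
results = {
    "GNI per capita": (0.813, 0.422),
    "Health expenditure": (1.0623, 0.2958),
    "Immunization, measles": (-2.182, 0.036),
    "Smoking prevalence": (1.177, 0.248),
}

for name, (t_stat, p_value) in results.items():
    r2 = r_squared_from_t(t_stat, df=33)
    print(f"{name}: R^2 ~ {r2:.3f}, significant at 5%: {p_value < 0.05}")
```

The recovered values agree with the reported R square figures to rounding, and only the immunization variable passes the 5% test.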

Part 5: Overall conclusion.

According to the findings of this report and the results from the previous report, I have drawn some conclusions about the world average crude death rate. Based on the confidence interval in Part 2, I can be 95% confident that the world average CDR is between 7.444 and 8.997 (per 1,000 people). At the same 95% confidence level, the hypothesis test in Part 3 did not provide sufficient evidence that the average CDR increased from the 2013 mean of 7.7 per 1,000, even though the sample mean was 8.2 in 2015. In Part 4 (Regression analysis), I found that only independent variable X3 (immunization, measles) has a statistically significant linear relationship with the dependent variable CDR at the 95% confidence level, and it accounts for only a small 12.6% of the total variation in CDR, while income, expenditure on healthcare and smoking prevalence seem to have no linear relationship with CDR. Therefore, apart from these 4 variables, there are probably other important factors that have a stronger impact on CDR and are worth researching, but that have not been covered in this report.

To sum up, by looking at the crude death rate, we can better measure how well human well-being is being ensured (Soares 2007). Our mission is to expand our studies to find other factors affecting the death rate, with the ultimate aim of reducing the world average CDR.

Reference list:

1. Soares, R.R. 2007, 'On the Determinants of Mortality Reductions in the Developing World', Population and Development Review, vol. 33, pp. 247-287, viewed 23 December 2018, Wiley Online Library database.
