
Best Trend Model for Predicting the World's Number of Deaths Due to COVID-19


DOCUMENT INFORMATION

Basic information

Title: Best Trend Model for Predicting The World's Number Of Deaths Due To Covid-19
Supervisor: Mr. Nguyen Thanh Liem
School: RMIT International University Vietnam
Major: Business Statistics
Type: Assignment
Year of publication: 2021
City: Saigon
Number of pages: 26


RMIT International University Vietnam ECON1193B – BUSINESS STATISTICS 1 Assignment 3: Team Report (35%)

Course name Business Statistics 1

Assignment #3: Team Report

Members & sID

Nguyen Thu Huong s3878242
Nguyen Thi Thuy Trang s3892060
Nguyen Dinh Khanh Uyen s3891815
Tran Lam Kim Ngan s3891636
Nguyen Ton Minh Nhat s3878695

Word count 3964 (content only)

Number of Pages 26

TABLE OF CONTENTS

PART 1: DATA COLLECTION

PART 2: DESCRIPTIVE STATISTICS

1 Measures Of Central Tendency

2 Measures Of Variation

3 Measures Of Shape

PART 3:

REGION A: NORTH AMERICA

REGION B: ASIA

PART 4: TEAM REGRESSION CONCLUSION

PART 5: TIME SERIES

REGION A – NORTH AMERICA

REGION B – ASIA

PART 6: TIME SERIES CONCLUSION

1 Line Chart

2 Best Trend Model For Predicting The World's Number Of Deaths Due To Covid-19

PART 7: OVERALL TEAM CONCLUSION

1 The Main Factor That Impacts The Number Of Deaths Due To Covid-19?

2 Predict The Number Of Deaths On October 31

3 The Prediction In The Number Of Covid-19 Deaths At The End Of 2021

4 Two Variables Might Impact The Global Deaths Case.

REFERENCES

APPENDIX

PART CONTRIBUTION

First name   Student ID   Parts contributed   Contribution %   Signature

Part 1: Data collection

The data analyzed in this report were collected from 20 countries in Asia and 20 countries in North America and comprise six categories in the Excel file. These figures represent each country's population (in millions) in 2021, the estimated total number of deaths from Covid-19 from April 1 to July 31, 2021 (per million population), the average rainfall (in millimeters) and average temperature (in degrees Celsius) from 1991 to 2020, and the total number of hospital beds and medical doctors (per 10,000 people) in 2017. The figures are collected from the World Bank database, Our World in Data, the World Health Organization (WHO), and government sources with high reliability. Because some countries are missing information, this report analyzes only 20 countries in each region to ensure that all of them have enough information for the other parts.

Part 2: Descriptive Statistics

1 Measures of Central Tendency

Min   >,<,=   Lower bound   Max   >,<,=   Upper bound   Result

Table 1: Test of outliers of total deaths (per million population) due to COVID-19 in Asia from April 1st to July 31st, 2021

To compute the descriptive statistics properly, each region's data is first tested for outliers. The test reveals no outliers in Asia's data and one outlier in North America's data.
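The outlier test summarized in Table 1 can be sketched as follows. This is a minimal illustration, assuming the common 1.5 × IQR rule and inclusive quartiles; the sample values and the function name `outlier_test` are hypothetical and are not the report's data.

```python
import statistics


def outlier_test(values):
    """Flag outliers with the 1.5 * IQR rule (an assumed convention)."""
    # method="inclusive" interpolates between the observed minimum and maximum.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr   # values below this bound count as outliers
    upper = q3 + 1.5 * iqr   # values above this bound count as outliers
    outliers = [v for v in values if v < lower or v > upper]
    return {"lower": lower, "upper": upper, "outliers": outliers}


# Hypothetical death rates per million population, not the report's data.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
result = outlier_test(sample)
print(result)
```

Comparing the minimum with the lower bound and the maximum with the upper bound, as the table header suggests, gives the same verdict as scanning every value.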


Type of Measures   Region A: Asia   Comparison (>,<,=)   Region B: North America

Table 2: Measures of Central Tendency of total deaths (per million population) due to COVID-19 in two regions from April 1st to July 31st, 2021

Table 2 shows that neither region has a Mode. Outliers in the sample influence the Mean (Frost n.d.), whereas the Median is not affected by extreme values (Frost n.d.). The Median is therefore the better measure for comparing and evaluating the overall death statistics. Asia had a median of 49.58 COVID-19 deaths per million population versus 75.84 in North America. COVID-19 has had a significant impact on both regions, which Sandoiu (2020) links to climate change.

2 Measures of Variation

Table 3: Measures of Variation of total deaths (per million population) due to COVID-19 in two regions from April 1st to July 31st, 2021

The Interquartile Range (IQR) is the best choice for comparing the Asian and North American datasets since it removes the impact of outliers. Because the IQR measures the spread of the middle 50% of the data, that is, the distance between the upper and lower quartiles, it is ideal for skewed distributions (Bhandari 2020). Table 3 indicates that North America had an IQR of 77.76 deaths per million population whereas Asia had 44.38. When each IQR is compared with its region's Median, the two regions show a similar relative spread.
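As a worked check, the IQR values can be recomputed from the quartiles quoted in the box plot discussion of this report (35.63 and 80.01 for Asia, 47.26 and 125.03 for North America) and set against the medians reported for Table 2. The ratio of IQR to median is only one possible reading of "using the Median to compare IQR"; the variable names are illustrative.

```python
# Quartiles and medians as quoted in the report (deaths per million population).
regions = {
    "Asia": {"q1": 35.63, "q3": 80.01, "median": 49.58},
    "North America": {"q1": 47.26, "q3": 125.03, "median": 75.84},
}

summary = {}
for name, stats in regions.items():
    iqr = stats["q3"] - stats["q1"]      # spread of the middle 50% of the data
    relative = iqr / stats["median"]     # IQR expressed relative to the median
    summary[name] = (round(iqr, 2), round(relative, 2))
    print(f"{name}: IQR = {iqr:.2f}, IQR / median = {relative:.2f}")
```

The North American result differs from the reported 77.76 in the second decimal place, presumably because the quoted quartiles are themselves rounded.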

3 Measures of Shape

Box-and-whisker Plot

Figure 1: Box-and-whisker plot of total deaths (per million population) due to COVID-19 in Asia from April 1st to July 31st, 2021

Table 4: Comparison of Asia's COVID-19 death rate

Figure 2: Box-and-whisker plot of total deaths (per million population) due to COVID-19 in North America from April 1st to July 31st, 2021

Table 5: Comparison of North America's COVID-19 death rate

This plot is preferred because outliers are visible on the chart. Although death rates vary, Asia and North America have similarly shaped distributions: both are asymmetric and positively skewed. The box for North America, spanning 47.26 to 125.03, is wider and sits higher than the box for Asia (35.63 to 80.01). COVID-19 reportedly had a greater impact on North America than on Asia.

Part 3

Region A: North America

Figure 3: Final regression model of Region A: North America

2 Regression Equation

In this equation, the units are:

Estimated total number of deaths due to COVID-19 (per million population)

Population (in millions) in 2021

Average temperature (in degrees Celsius) from 1991 to 2020

3 Regression coefficients:

Average temperature coefficient = 0.01: the total number of deaths due to COVID-19 in North America (NA) rises by 0.01 deaths per million population for every 1 degree Celsius increase in average temperature, assuming the population remains constant.

Intercept = 0.399: under the assumption that the average temperature is 0 degrees Celsius and the population is zero, the total number of deaths attributable to COVID-19 in NA between April 1 and July 31, 2021 is 0.399 per million population. This implies that about 0.399 deaths per million are independent of average temperature and population.

Population coefficient = 9.03: the total number of deaths attributable to COVID-19 in NA between April 1 and July 31, 2021 rises by 9.03 deaths per million population for every 1 million increase in the country's population, assuming the average temperature remains constant.
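Putting the three stated coefficients together gives one plausible reading of the fitted equation. This is a reconstruction from the coefficients quoted in the text, not the report's own typeset formula; the symbols $X_1$ (population, in millions) and $X_2$ (average temperature, in degrees Celsius) are assumed labels.

```latex
\hat{Y} = 0.399 + 9.03\,X_1 + 0.01\,X_2
```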

4 Coefficients of determination:

According to the coefficient of determination (R Square = 0.39), variations in average temperature and population account for 39% of the variation in total COVID-19 deaths in NA between April 1 and July 31, 2021, while other factors account for the remaining 61%. This regression model therefore has moderate goodness of fit and explanatory power, with 61 percent of the variation left unexplained by the regression line.
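The coefficient of determination compares the variation left unexplained by the model with the total variation. The sketch below illustrates the calculation; the observed and fitted values are hypothetical, and `r_squared` is an illustrative helper rather than the spreadsheet routine used in the report.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained variation
    sst = sum((a - mean_actual) ** 2 for a in actual)           # total variation
    return 1 - sse / sst


# Hypothetical observed and fitted death rates, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
fitted = [12.0, 18.0, 33.0, 37.0]
print(round(r_squared(observed, fitted), 3))
```

An R Square of 0.39 means the ratio SSE / SST is 0.61, which is the 61% of variation attributed to other factors.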

Region B: Asia

1 Regression Analysis:

After backward elimination, the linear regression for Asia retains only one significant variable, hospital beds, at the 0.05 level of significance.

Figure 4: Final regression model of Region B: Asia

2 Regression Equation

In this equation, units are:

Estimated total number of deaths due to COVID-19 (per million population)

Hospital beds (per 10,000 people)

3 Regression coefficients:

With the Y-intercept b0 = 0.465, when the number of hospital beds is zero, the estimated number of deaths is 0.465 per million population. That implies that about 0.465 deaths per million do not depend on hospital beds.

The slope coefficient of hospital beds, b1 = 0.01, indicates that for every additional hospital bed per 10,000 people, the overall number of deaths rises by 0.01 per million population.

→ Hospital beds significantly affect the overall number of deaths caused by COVID-19 in Asian nations. Additionally, the relationship is positive, since a greater number of hospital beds corresponds to a greater number of COVID-19 deaths.

However, a positive relationship between hospital beds and deaths is unrealistic in practice, so this conclusion does not make sense.

4 Coefficients of determination:

R Square = 0.02 indicates that hospital beds explain approximately 2% of the variance in the total number of COVID-19 deaths. Other variables account for the remaining 98% of the variance.

Part 4: Team Regression conclusion

Part 3 shows that the two regions' regression models have distinct significant independent variables. In North America, two independent variables, the average temperature (in degrees Celsius) from 1991 to 2020 and the population (in millions) in 2021, can be used to reflect changes in the total deaths due to COVID-19 between April 1 and July 31, 2021; in Asia, only one significant independent variable, the number of hospital beds per 10,000 people, can be used.

The significant independent variables explain 39 percent of the total variation in total deaths due to COVID-19 in NA. Comparing the regression coefficients of the two variables, the team concludes that the average temperature has a larger effect than population on the number of people who died of COVID-19 in the period. In Asia, only 2% of the overall variance in the total number of COVID-19 deaths can be explained by the number of hospital beds in our research.

Following a comparison of central tendency by median and of skewness by box plot, Part 2 finds that the number of deaths due to COVID-19 in NA is considerably higher than in Asia. Because NA's regression coefficients are larger in absolute value than Asia's, total COVID-19 deaths in NA are, in our study, more affected by the significant independent variables than those in Asia. Overall, North America has been hit harder than Asia by the pandemic.

Part 5: TIME SERIES

REGION A – North America

Linear trend Quadratic trend Exponential trend


Table 6: P-value and R2 of the three trend models.

1 Trend Models

The P-values of the linear, quadratic, and exponential trends are collected in the table above. In the next stage, those numbers are used to test for trends in the number of people who died of Covid-19 from April 1st, 2021 to July 31st, 2021.

H0: β1 = 0 (There is no trend in the total number of deaths in North America)

H1: β1 ≠ 0 (There is a trend in the total number of deaths in North America)

Linear trend: According to Table 6, the p-value is 2.474 × 10^-29 < α = 0.05, so H0 is rejected. Therefore, with 95% confidence, North American nations are seeing a linear trend in the daily number of deaths caused by COVID-19.

Quadratic trend: Based on Table 6, we reject H0 because the p-value (0.000209075) is lower than α (0.05). As a result, with 95% confidence, there is a quadratic trend in the number of COVID-19 deaths per day in North America from April 1st to July 31st.

Exponential trend: We reject H0 because the p-value = 2.373 × 10^-31 < α = 0.05 (Table 6). Thus, with 95% confidence, North American countries had an exponential trend in the daily deaths caused by COVID-19 from April 1st to July 31st.
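The decision rule applied to each trend model can be written out in a few lines. This sketch simply compares the p-values quoted in the text with α = 0.05; the dictionary layout and variable names are illustrative.

```python
ALPHA = 0.05

# P-values quoted in the text for North America (Table 6).
p_values = {
    "linear": 2.474e-29,
    "quadratic": 0.000209075,
    "exponential": 2.373e-31,
}

# Reject H0 (no trend) whenever the p-value falls below alpha.
decisions = {model: p < ALPHA for model, p in p_values.items()}

for model, significant in decisions.items():
    verdict = "reject H0" if significant else "do not reject H0"
    print(f"{model}: p = {p_values[model]:.3e} -> {verdict}")
```

Because all three models pass this test, significance alone cannot pick a winner, which is why error measures are needed to choose between them.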

Regression Output – QUA

Figure 5: Quadratic trend regression output of North America.


The figure above shows the quadratic regression output for North American nations (Region A). Because all three trend models are significant, we compare their R2 values to identify the best model for forecasting the daily number of deaths due to COVID-19. On this measure, the Quadratic trend model is the most effective of the three for estimating future deaths in North American nations, since it shows the least error.

Formula & Coefficient Explanation

$\hat{Y} = 2.6935 - 0.0255T + 0.0001T^2$

$\frac{d\hat{Y}}{dT} = -0.0255 + 0.0001 \times 2T$

When T = 0, $\hat{Y}$ = 2.6935: the estimated number of daily COVID-19 deaths across North American nations is 2.6935. However, because T = 0 is an unobservable value, this interpretation is illogical.

In contrast, $\frac{d\hat{Y}}{dT} = -0.0255 + 0.0001 \times 2T$ shows that as each day passes, T rises, and the number of daily COVID-19 deaths across North America changes by -0.0255 + 0.0001 × 2T.

2 Recommended Trend Model

Table 7: SSE & MAD results of the three trend models.

To evaluate which trend model is best for estimating the number of COVID-19 deaths in North America, we computed two error measures, SSE and MAD, for each model. As Table 7 shows, the Exponential trend model produces the smallest SSE and MAD. Therefore, the Exponential model is the most suitable trend model for forecasting the number of deaths caused by COVID-19 in North America.
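Both error measures are straightforward to compute once each model's fitted values are available. The sketch below uses hypothetical actual and fitted values and hypothetical model names; it only illustrates how the model with the smallest SSE and MAD would be selected.

```python
def sse(actual, predicted):
    """Sum of squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def mad(actual, predicted):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily deaths and two competing sets of fitted values.
actual = [2.0, 1.8, 1.5, 1.3]
fits = {
    "model_a": [2.1, 1.7, 1.6, 1.2],
    "model_b": [2.4, 1.5, 1.9, 1.0],
}

scores = {name: (sse(actual, fit), mad(actual, fit)) for name, fit in fits.items()}
best = min(scores, key=lambda name: scores[name])  # smallest SSE, ties broken by MAD
print(best, scores[best])
```

SSE penalizes large errors more heavily than MAD, so checking that both measures agree makes the choice less sensitive to a few extreme days.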

3 Prediction for the number of COVID-19 deaths

The following formulas, with T counted in days from April 1 (so that September 28 is T = 181), are used to estimate the number of deaths on September 28, September 29, and September 30:

September 28: Ŷ = 2.531 × 0.992^181 = 0.591

September 29: Ŷ = 2.531 × 0.992^182 = 0.587

September 30: Ŷ = 2.531 × 0.992^183 = 0.582

The calculations above predict that North America will record 0.591, 0.587, and 0.582 new COVID-19 deaths per million people on September 28, September 29, and September 30, respectively.
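These forecasts can be reproduced with a short script. The coefficients 2.531 and 0.992 are the ones quoted in the text; coding April 1, 2021 as T = 1 is an assumption inferred from the exponents 181 to 183, and the function names are illustrative.

```python
from datetime import date

B0 = 2.531   # intercept of the exponential trend, as quoted in the text
B1 = 0.992   # daily growth factor, as quoted in the text
START = date(2021, 4, 1)  # assumed to be coded as T = 1


def time_index(day):
    """Number of days since the start of the series, counting April 1 as T = 1."""
    return (day - START).days + 1


def forecast(day):
    """Exponential trend forecast: Y-hat = B0 * B1 ** T."""
    return B0 * B1 ** time_index(day)


for d in (date(2021, 9, 28), date(2021, 9, 29), date(2021, 9, 30)):
    print(d.isoformat(), time_index(d), round(forecast(d), 3))
```

Because the growth factor is below 1, each forecast is 0.8% lower than the one for the previous day.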

REGION B – Asia

           Linear trend   Quadratic trend                                  Exponential trend
P-value    0.852          T: 4.191 × 10^-14; T-Squared: 1.312 × 10^-14     0.146
R2         0.029%         39.416%                                          1.754%

Table 9: P-value and R2 of the three trend models.

1 Trend Models

As for Region A, a table of P-values for the three trend models is compiled for further analysis. Based on those outcomes, we can determine the significant trend model for the total number of COVID-19 deaths in Asia between April 01 and July 31, 2021.

H0: β1 = 0 (There is no trend in the total number of deaths in Asia)

H1: β1 ≠ 0 (There is a trend in the total number of deaths in Asia)

Linear Trend: Since the p-value is larger than α (0.852 > 0.05) (Table 9), we do not reject H0. Hence, at the 5% significance level, there is no evidence of a linear trend in the total number of daily deaths due to COVID-19 in Asia from April 01 to July 31, 2021.

Quadratic Trend: As can be seen from Table 9, the p-values of both T (4.191 × 10^-14) and T-squared (1.312 × 10^-14) are smaller than α (0.05). Consequently, we reject H0. With 95% confidence, the total number of daily COVID-19 deaths in Asia between April 01 and July 31, 2021 follows a quadratic trend.

Exponential Trend: As α is lower than the p-value (0.05 < 0.146), we do not reject H0. At the 5% significance level, there is no evidence that the total number of daily deaths due to COVID-19 in Asia from April 01 to July 31, 2021 follows an exponential trend.

According to the findings above, the total number of daily deaths due to COVID-19 in Asian countries follows a quadratic trend model. In addition, the smaller the R2, the less well the regression model fits the observations (Statistics By Jim n.d.). The Quadratic trend model has the highest R2 (Table 9), which reinforces our conclusion that Quadratic is the significant trend model for the total number of COVID-19 deaths in Asia.

Regression Output – QUA

Figure 6: Quadratic trend regression output of the total number of daily COVID-19 deaths in Asia.

Formula & Coefficient Explanation

$\hat{Y} = 0.4017 + 0.0192T - 0.0002T^2$

$\frac{d\hat{Y}}{dT} = 0.0192 - 0.0002 \times 2T$

When T = 0, $\hat{Y}$ = 0.4017: the estimated total number of daily COVID-19 deaths in Asia is 0.4017. Nevertheless, T = 0 is a value that cannot be observed, so this interpretation is unreasonable.

Conversely, $\frac{d\hat{Y}}{dT} = 0.0192 - 0.0002 \times 2T$ indicates that as each day goes by, T increases, and the total number of daily COVID-19 deaths in Asia changes by 0.0192 - 0.0002 × 2T per day.
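The rate of change of a quadratic trend depends on T, so it helps to evaluate it at a few points. The sketch below uses the rounded Asian coefficients quoted in the text; the function names and the chosen values of T are illustrative, and results based on rounded coefficients are approximate.

```python
# Coefficients of the Asian quadratic trend as quoted (rounded) in the text.
B0, B1, B2 = 0.4017, 0.0192, -0.0002


def trend(t):
    """Fitted daily deaths: Y-hat = B0 + B1*T + B2*T**2."""
    return B0 + B1 * t + B2 * t ** 2


def slope(t):
    """Rate of change of the fitted trend: dY-hat/dT = B1 + 2*B2*T."""
    return B1 + 2 * B2 * t


turning_point = -B1 / (2 * B2)  # value of T at which the slope is zero
print(turning_point, trend(turning_point), slope(1), slope(100))
```

With these rounded coefficients the slope is positive early in the period and negative later, so the fitted curve rises to a peak near T = 48 and then declines.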

2 Recommended Trend Model

Date posted: 10 May 2022, 08:49
