
CASE STUDY FACTORS AFFECTING NUMBER OF DEATHS DUE TO COVID 19


RMIT University Vietnam [ECON1193] Business Statistics

CASE STUDY

FACTORS AFFECTING NUMBER

OF DEATHS DUE TO COVID-19

Lecturer: Pham Thi Minh Thuy

Students and Work Allocation:

Student Name | Student ID | Parts Contributed | Contribution % | Signature


I. Data Collection

The data are collected and shown in the Excel File.

II. Descriptive Statistics

1. Measures of Central Tendency

Table 1: Measures of Central Tendency for the numbers of COVID-19 deaths by April 23rd, 2020, distributed by region

As can be seen from Table 1, the mean number of COVID-19 deaths in the EU, 3347.519 deaths, is higher than the figure for the Middle East (ME), 421.634. Hence, on average there are more deaths in EU countries than in ME countries. Regarding the median, which is the middle value of the data set when the data are put in ascending order, half of the EU countries have more than 225 deaths, whereas half of the ME countries have more than only 13. In terms of the mode, there is no mode in the EU, while in the ME the mode is 7 deaths, meaning that the most frequent number of deaths among ME countries is 7.

In this situation, outliers are present in both categories (see Figure 1). Therefore the mean, which is affected by outliers, cannot be used to compare the two categories. Moreover, because most values in each group differ from one another, the mode is not suitable either. Thus, the median is the best measure. The EU's median is about 17 times higher than that of the ME (225 versus 13), indicating that COVID-19 has had a more negative impact on the population of EU countries than on that of ME countries.
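The sensitivity of the mean to outliers can be illustrated with a minimal sketch. The death counts below are hypothetical, chosen only so that one extreme value sits among small ones; they are not the case-study data.

```python
from statistics import mean, median, mode

# Hypothetical death counts for five countries (NOT the case-study data):
# four small values and one extreme outlier.
deaths = [7, 7, 13, 20, 5000]

avg = mean(deaths)    # pulled far upward by the single outlier
mid = median(deaths)  # middle value of the sorted data
most = mode(deaths)   # most frequent value

print(f"mean={avg}, median={mid}, mode={most}")
```

In this toy sample the mean is far above every value except the outlier, while the median stays close to the typical country, which is the reason for preferring the median when outliers are present.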

2. Measures of Variation

Table 2: Measures of Variation for the numbers of COVID-19 deaths by April 23rd, 2020, distributed by region


Measures of variation indicate how data are spread or, in this case, how the numbers of COVID-19 deaths among countries in each region vary from one another. While the range of the EU, that is, the gap between the minimum and maximum values of the data set, is around 25 thousand deaths, that of the Middle East is only slightly more than one fifth of the EU's. This means that the numbers of deaths in EU countries are spread very widely, much more so than the already wide range of the Middle East. Apart from that, for sample variance, standard deviation and coefficient of variation, the larger the value, the more widely the data are spread around the mean, and standard deviation is the most commonly used of the three. Again, the standard deviations of both the EU and the Middle East are quite large, and that of the EU is substantially higher than that of the Middle East, which supports the same conclusion as the ranges.

However, as the data on the number of COVID-19 deaths contain outliers (see Figure 1), the above measures might not describe the variation well because they are affected by outliers. The interquartile range, which is not affected by outliers, should therefore be the primary measure to consider. The interquartile range is the range of the middle 50% of values in an ordered data set. Although the interquartile range of the EU's data is still high at nearly 2000, that of the Middle East is small at 76 deaths. In other words, it is very likely that the COVID-19 mortality numbers of EU countries differ considerably from one another, in contrast with the moderately comparable values of the Middle East countries. This might result from the lack of solidarity and the political games among EU countries that prevent them from making joint efforts to deal with the coronavirus pandemic (Szucs 2020), while the Middle East countries maintain good communication and risk management between states (Pietromarchi 2020).
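As a rough illustration of why the interquartile range is preferred here, the sketch below compares the range, the sample standard deviation and the interquartile range before and after one extreme value is added. The numbers are hypothetical, not the case-study data, and the "inclusive" quartile method is an assumption; spreadsheet software may use a slightly different quartile rule.

```python
from statistics import quantiles, stdev

def spread(values):
    """Return (range, sample standard deviation, interquartile range)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return max(values) - min(values), stdev(values), q3 - q1

# Hypothetical death counts (NOT the case-study data).
base = [2, 5, 7, 13, 20, 40, 90]
with_outlier = base + [5000]  # same sample plus one extreme country

range_a, sd_a, iqr_a = spread(base)
range_b, sd_b, iqr_b = spread(with_outlier)

print(f"range: {range_a} -> {range_b}")
print(f"stdev: {sd_a:.1f} -> {sd_b:.1f}")
print(f"IQR:   {iqr_a} -> {iqr_b}")
```

In this toy sample the range and the standard deviation grow by more than an order of magnitude when the outlier is added, whereas the interquartile range changes far less.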

3. Measures of Shape

Figure 1: Box and Whisker Plots for the numbers of COVID-19 deaths by April 23rd, 2020, distributed by region

Figure 1 illustrates the distributions of the total number of deaths due to COVID-19 for countries in the European Union and the Middle East. It shows a highly right-skewed distribution in both regions, caused by the extremely high totals of a few countries. This indicates that most countries in both groups have lower-than-average numbers of deaths due to COVID-19, which is positive news for both regions. Considering the position of the mean relative to the boxes and whiskers, the EU's mean death toll lies in the fourth quartile, meaning that more than 75% of countries in the EU have total deaths below the region's average. Meanwhile, the Middle East has a less severe situation, with only one country having total deaths higher than the region's average, as the mean falls outside the Middle East's boxplot. Moreover, the box-and-whisker plot of the Middle East lies below the second quartile of the EU, which indicates that 13 out of 14 Middle East countries, all except Iran, have a lower COVID-19 death toll than over 50% of the EU countries. This also shows that the COVID-19 situation is much more serious in the EU than in the Middle East.

III. Multiple Regression

All dataset cases are applied:

● Dependent variable (DV): Total number of deaths due to COVID-19 between January 22 and April 23, 2020

● Independent variables (IV):

- Average rainfall (in mm)

- Average temperature (in Celsius)

- Hospital beds (per 10,000 people)

- Population of the country (in 1000s)

- Medical doctors (per 10,000)

Since the significance level is α = 0.05:

- When p-value > 0.05, the variable is insignificant

- When p-value < 0.05, the variable is significant

According to Upton and Cook (2014), backward elimination, which is the opposite of forward selection, is a search procedure in which the initial model contains all variables and ineffective variables are removed one by one. At each step, one variable is removed from the model, and the procedure continues until all remaining variables have a p-value below a given threshold, which is 0.05 in this case.

Regarding the given case, we start the backward elimination process with the full model, which consists of all predictors: average rainfall, average temperature, hospital beds, country population, and medical doctors.
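The elimination loop described above can be sketched as follows. This is a minimal illustration, not the spreadsheet workflow used for the report: the function name `backward_eliminate` and the short variable labels are hypothetical, and instead of refitting a regression the sketch replays the p-values reported for the EU at each step of this section through a lookup table.

```python
def backward_eliminate(variables, fit_pvalues, alpha=0.05):
    """Drop the variable with the highest p-value until all are below alpha.

    fit_pvalues is any callable that takes the current list of variables
    and returns a dict {variable: p-value} for the refitted model.
    """
    remaining = list(variables)
    removed = []
    while remaining:
        pvalues = fit_pvalues(remaining)
        worst = max(remaining, key=lambda v: pvalues[v])
        if pvalues[worst] < alpha:
            break  # every remaining variable is significant
        remaining.remove(worst)
        removed.append(worst)
    return remaining, removed

# P-values reported for the EU models in this section, keyed by the set of
# variables in each model (a stand-in for actually refitting the regression).
EU_PVALUES = {
    frozenset({"rainfall", "temperature", "beds", "population", "doctors"}): {
        "population": 0.0000029, "beds": 0.01206,
        "rainfall": 0.643, "doctors": 0.627, "temperature": 0.474,
    },
    frozenset({"temperature", "beds", "population", "doctors"}): {
        "population": 0.0000014, "beds": 0.01005,
        "doctors": 0.591, "temperature": 0.422,
    },
    frozenset({"temperature", "beds", "population"}): {
        "population": 0.000001, "beds": 0.01001, "temperature": 0.508,
    },
    frozenset({"beds", "population"}): {
        "population": 0.000001, "beds": 0.0102,
    },
}

def eu_fit(variables):
    return EU_PVALUES[frozenset(variables)]

kept, dropped = backward_eliminate(
    ["rainfall", "temperature", "beds", "population", "doctors"], eu_fit
)
print("kept:", kept)
print("dropped in order:", dropped)
```

With real data, `fit_pvalues` would refit the regression on the remaining variables at each step; the loop itself stays the same.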


A. EU countries:

1. Regression output 1 - All variables

Significant variables: Population (p-value: 0.0000029) and Hospital beds (p-value: 0.01206)

Insignificant variables: Average rainfall (highest p-value: 0.643), Medical doctors (p-value: 0.627), Average temperature (p-value: 0.474)

2. Regression output 2 - Exclude Average rainfall


Significant variables: Population (p-value: 0.0000014) and Hospital beds (p-value: 0.01005)

Insignificant variables: Medical doctors (highest p-value: 0.591), Average temperature (p-value: 0.422)

3. Regression output 3 - Exclude Medical doctors

Significant variables: Population (p-value: 0.000001) and Hospital beds (p-value: 0.01001)

Insignificant variables: Average temperature (p-value: 0.508).

4. Regression FINAL Output - Exclude Average temperature


Significant variables: Population (p-value: 0.000001) and Hospital beds (p-value: 0.0102)

Insignificant variables: None

- b_hospital beds = -151.453 is the coefficient of the number of hospital beds (per 10,000 people). For 1 extra hospital bed per 10,000 people, the number of COVID-19 deaths decreases by approximately 151 people, assuming that the other variables are constant.

- R² = 0.665 is the coefficient of determination: 66.5% of the variation in total COVID-19 deaths in the EU can be explained by the total population of EU countries and the number of hospital beds.

B. Middle East countries:

1. Regression Output 1 - All variables

Significant variables: Population (p-value: 0.0093)

Insignificant variables: Average temperature (p-value: 0.951), Average rainfall (highest p-value: 0.959), Hospital beds (p-value: 0.859), and Medical doctors (p-value: 0.635)


2. Regression Output 2 - Exclude Rainfall

Significant variables: Population (p-value: 0.0037)

Insignificant variables: Average temperature (highest p-value: 0.971), Hospital beds (p-value: 0.855), and Medical doctors (p-value: 0.568)

3. Regression Output 3 - Exclude Temperature

Significant variables: Population (p-value: 0.001)

Insignificant variables: Hospital beds (highest p-value: 0.804), and Medical doctors (p-value: 0.540)


4. Regression Output 4 - Exclude Hospital Beds

Significant variables: Population (p-value: 0.00032)

Insignificant variables: Medical doctors (highest p-value: 0.316).

5. Regression FINAL Output - Exclude Medical Doctors

Significant variables: Population (p-value: 0.00027)

Insignificant variables: None.

6. Regression Equation

Total death = -613.6408143 + 0.00005597 × (Population)


7. Interpretation

- b_population = 0.00005597 indicates that the total number of deaths due to COVID-19 increases by 1 person when the population of a Middle East country increases by approximately 17,867 (= 1/0.00005597) people.

- R² = 71.5% (= 0.715). This means that 71.5% of the variation in the total number of deaths due to COVID-19 is explained by the population of the countries in the Middle East. The relationship between the total number of deaths and the population of Middle East countries is therefore relatively strong.
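The arithmetic behind the Middle East regression equation can be reproduced with a short sketch. The intercept and slope are taken from the regression equation above; the function name and the example population of 30,000,000 are hypothetical, and the population is assumed to be given in the same unit the interpretation uses (people).

```python
# Coefficients from the Middle East regression equation.
INTERCEPT = -613.6408143
SLOPE = 0.00005597  # extra deaths per additional person

def predict_total_deaths(population):
    """Predicted total COVID-19 deaths for a given population."""
    return INTERCEPT + SLOPE * population

# Population increase associated with one additional predicted death.
people_per_extra_death = 1 / SLOPE

# Hypothetical country of 30 million people (illustrative only).
example = predict_total_deaths(30_000_000)

print(f"people per extra death: {people_per_extra_death:.0f}")
print(f"predicted deaths for 30,000,000 people: {example:.1f}")
```

Because the intercept is negative, the equation returns negative totals for small populations, so it is only meaningful within the range of populations used to fit it.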

IV. Team Regression Conclusion

From the previous section, at the 5% significance level, total COVID-19 deaths in the EU are related to two independent variables, population and hospital beds (per 10,000 people), while those in the Middle East are related only to total population. Moreover, the regression model of the Middle East gives a better fit, with a higher coefficient of determination (R² = 0.744) than that of the EU (R² = 0.671), which indicates that more of the variation in total deaths is explained by the independent variables in the Middle East (74%) than in the EU (67%).

Based on the regression analysis, the EU appears to be more affected by this pandemic. Both regions are partly influenced by total population. However, the EU's slope of population (0.0002) is higher than that of the Middle East (0.00005), indicating that EU countries experience a larger change in total deaths when total population changes. This is understandable: the bigger the population of a country, the more widely the virus is able to spread, resulting in more infected people and therefore more deaths. In addition, the EU's total deaths from COVID-19 are related to another variable, hospital beds (per 10,000 people). The Middle East countries have fewer COVID-19 infections and deaths, so they might manage to take care of their COVID-19 patients with the hospital beds they already have. Meanwhile, the situation might be different for the EU members, as they have been burdened by a lack of hospital beds due to the large number of infections and deaths (Furlong & Hirsch 2020). As a result, the number of hospital beds might matter to the EU while it might not matter much to the Middle East. This second variable also makes the EU more sensitive to the impact of COVID-19, because its death toll is affected by more factors than that of the Middle East.

Therefore, in order to reduce the number of COVID-19 deaths, especially in these regions, it is advisable that they focus on limiting the spread of the virus in the community, as well as on increasing the number of hospital beds.


V. Time Series

1. Significant Models

According to the significance of the three trend models for each region as shown below, we conclude that all three models, the Linear, Quadratic and Exponential trends, are significant in both regions, since all models have p-value < 0.05.

a. EU Countries

● Linear Model

Formula: y = -445.068 + 46.804 × (number of days since 1st death)

● Quadratic Model


Formula: y = -975.628 + 87.617 × (number of days since 1st death) - 0.530 × (number of days since 1st death)²

● Exponential Model

Formula: y = 0.002 × 1.291^(number of days since 1st death)

b. Middle East Countries

● Linear Model

Formula: y = 24.745 + 1.897 × (number of days since 1st death)

● Quadratic Model


Formula: y = -54.540 + 8.414 × (number of days since 1st death) - 0.091 × (number of days since 1st death)²

● Exponential Model

Formula: y = 2.82 × 1.07^(number of days since 1st death)

2. Recommendation on the Best Model

To determine the best forecasting model, we measured the forecast errors by calculating SSE and MAD. The sum of squared errors (SSE) measures how far the observations are scattered from the regression line or, in other words, the errors we make when predicting the observations with that line. When the observations are scattered widely around the line, SSE is high, and vice versa. However, SSE is affected by outliers (observations that lie far from the rest of the data), because each error is squared. The mean absolute deviation (MAD) measures how far, on average, each observation is from its predicted value. Unlike SSE, it is less sensitive to extreme observations.
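The two error measures can be sketched as below, taking MAD as the average absolute difference between observed and predicted values. The observed and predicted numbers are hypothetical and only serve to show how a single large error weighs on each measure; they are not the values behind Table 3.

```python
def sse(observed, predicted):
    """Sum of squared errors between observations and model predictions."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted))

def mad(observed, predicted):
    """Mean absolute deviation of the observations from the predictions."""
    return sum(abs(y - p) for y, p in zip(observed, predicted)) / len(observed)

# Hypothetical daily deaths and model predictions (NOT the case-study data).
# The last observation is an outlier that the model misses by 20.
observed = [2, 4, 6, 30]
predicted = [3, 4, 7, 10]

total_sse = sse(observed, predicted)
mean_abs = mad(observed, predicted)

print(f"SSE = {total_sse}, MAD = {mean_abs}")
```

In this example the single outlier contributes 400 of the 402 units of SSE but 20 of the 22 units of absolute error, which is why SSE reacts more strongly to extreme observations than MAD does.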

Table 3: SSE and MAD Values of each Time Model for each Region

The trend model with the smallest errors is considered the most suitable trend model. Regarding the EU, although the MAD of the quadratic model is slightly higher than that of the linear model, the quadratic model's SSE is significantly smaller than the linear one's, and the differences between adjacent EU death numbers are not too large, so the errors caused by outliers are moderate. In the Middle East, both the SSE and the MAD of the quadratic model are the smallest among the three models. Therefore, the quadratic model is chosen as the best trend model for both the EU and the Middle East to predict the number of deaths due to COVID-19 (NCD).

- On May 30th - day number 106 since the first confirmed death in the EU

Predicted NCD = −975.628 + 81.617 × 106 − 0.530 × 106² = 1,720.694 ≈ 1,721 (people)

- On May 31st - day number 107 since the first confirmed death in the EU

Predicted NCD = −975.628 + 81.617 × 107 − 0.530 × 107² = 1,689.421 ≈ 1,689 (people)

Based on the formula above, the numbers of deaths due to COVID-19 on May 29th, 30th and 31st are predicted to be approximately 1,751, 1,721 and 1,689 people respectively. Hence, these results suggest that the number of deaths due to COVID-19 in the EU will follow a downward trend.
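The EU forecasts and the downward trend can be checked with a short sketch. It uses the slope 81.617 that appears in the worked forecasts; the quadratic formula in the Significant Models subsection prints 87.617, so the coefficient used here should be read as an assumption. The function and variable names are hypothetical.

```python
# Coefficients as used in the worked EU forecasts (slope is an assumption,
# see the note above).
INTERCEPT, SLOPE, CURVATURE = -975.628, 81.617, -0.530

def eu_quadratic(day):
    """Predicted daily deaths on a given day since the first EU death."""
    return INTERCEPT + SLOPE * day + CURVATURE * day ** 2

# Days 105, 106 and 107 correspond to May 29th, 30th and 31st.
forecasts = {day: eu_quadratic(day) for day in (105, 106, 107)}

# A downward-opening parabola peaks where its derivative is zero:
# SLOPE + 2 * CURVATURE * day = 0.
peak_day = -SLOPE / (2 * CURVATURE)

for day, value in forecasts.items():
    print(f"day {day}: {value:.3f}")
print(f"fitted curve peaks around day {peak_day:.1f}")
```

Since the fitted curve peaks well before day 105, every later day lies on its falling side, which is consistent with the declining forecasts for the last three days of May.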

b. Middle East Countries

The daily number of COVID-19 deaths on a particular day in the Middle East can then be predicted in the same way:

- On May 31st - day number 102 since the first confirmed death in the Middle East

Predicted NCD = −54.540 + 8.414 × 102 − 0.091 × 102² = −143.076 ≈ −143 (people)

After applying the quadratic formula, the predicted numbers of new confirmed deaths in the Middle East on the last three days of May are -124, -133 and -143 respectively. However, these figures are negative, which does not make sense because the number of daily confirmed deaths must be greater than or equal to 0. Hence, the results may suggest that the Middle East will not have any new COVID-19 deaths on May 29, May 30, and May 31.
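The negative Middle East forecasts can be examined with a short sketch. Reporting a negative prediction as zero is one simple convention, adopted here as an assumption in line with the reading above; the function names are hypothetical. The sketch also locates the day after which the fitted quadratic falls below zero.

```python
import math

# Coefficients of the Middle East quadratic trend model.
A, B, C = -54.540, 8.414, -0.091

def me_quadratic(day):
    """Raw model prediction for a given day since the first ME death."""
    return A + B * day + C * day ** 2

def clamp_to_zero(value):
    """Daily deaths cannot be negative, so report negatives as zero."""
    return max(0.0, value)

# Days 100, 101 and 102 correspond to May 29, 30 and 31.
raw = {day: me_quadratic(day) for day in (100, 101, 102)}
reported = {day: clamp_to_zero(value) for day, value in raw.items()}

# Larger root of C*day^2 + B*day + A = 0: beyond it the curve is negative.
last_positive_day = (-B - math.sqrt(B ** 2 - 4 * C * A)) / (2 * C)

for day in raw:
    print(f"day {day}: raw {raw[day]:.3f}, reported {reported[day]}")
print(f"model turns negative after about day {last_positive_day:.1f}")
```

Days 100 to 102 lie beyond the point where the fitted curve crosses zero, so the negative values come from extrapolating the quadratic past the range it describes rather than from the data themselves.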

VI. Time Series Conclusion

Figure 2: Daily Recorded Number of COVID-19 Deaths in 2 Regions since the 1st recorded death in each Region

Figure 2 provides an overview of the daily recorded number of COVID-19 deaths in the EU and the Middle East. These line charts make the level of error easy to see. Although there are two extreme values in the data set of the Middle East, the remaining values do not show much

Date posted: 10/05/2022, 08:49
