(Essay) Group Mid-Term Exam, Course: Mathematical Statistics, Topic: Hypothesis Testing


MINISTRY OF EDUCATION AND TRAINING
NATIONAL ECONOMICS UNIVERSITY

GROUP MID-TERM EXAM
Course: Mathematical Statistics
Topic: Hypothesis Testing
Lecturer: Trần Thị Bích

Class: Finance Economics – FE64

Group: 1

Members:
Mai Thái Sơn – 11225626
Phan Mạnh Hùng – 11222590
Phạm Ngọc Hải – 11222029
Trần Mạnh Hào – 11222180
Phạm Văn Lâm – 11223240
Hoàng Phú Khánh – 11223030


CONTENTS

PART I: SUMMARIZE THE ARTICLE

1 The article

2 The issue of interest

3 The technique

4 Additional references of Hypothesis testing

5 Viewpoint

PART II: DATA ANALYSIS TO A SPECIFIC ORGANIZATIONAL PROBLEM

1 Overall view: Insurance claims and Dataset

2 The purpose

3 Descriptive Statistics for Variables

4 Data analysis


PART I: SUMMARIZE THE ARTICLE

1 The article

Stress management through regulation of blood pressure among college students

Source: https://content.iospress.com/articles/work/wor2308

2 The issue of interest

Stress is a pervasive condition that affects individuals on multiple levels, disrupting their sleep, work, and overall quality of life. Extensive research has identified job-related and academic-related factors as major contributors to stress. However, the field lacks studies investigating immediate remedies or first aid measures to alleviate stress. This article introduces the concept of Deep Breathing Technique (DBT) and explores its potential application as a means of stress management by regulating blood pressure among Indian college engineering students. A comparative analysis is conducted between DBT and Ordinary Breathing Technique (OBT) to assess their effectiveness.

The primary objective of this article is to investigate whether deep breathing techniques can effectively control blood pressure and subsequently reduce stress levels. By examining the impact of different breathing techniques, the article aims to provide recommendations based on the findings.

3 The technique

This academic article used hypothesis testing to analyze data relating to stress management. In data science and statistics, hypothesis testing is an important step, as it involves verifying an assumption about a statistical parameter. Analysts use hypothesis testing to assess whether a hypothesis is plausible given the data.

To perform a hypothesis test, we first establish two hypotheses, the null hypothesis and the alternative hypothesis. Then we collect data, perform an appropriate statistical test, and use the result to decide whether or not to reject the null hypothesis.

Thus, as organizational managers, we see this as a useful tool for making reliable decisions based on sample data. It helps us move forward knowing that no possibility that could matter in the future has been overlooked.

4 Additional references of Hypothesis testing

a The first source

Hypothesis and Hypothesis Testing in the Clinical Trial

https://www.psychiatrist.com/wp-content/uploads/2021/02/13947_hypothesis-hypothesis-testing-clinical-trial.pdf

b The second source

Testing a Hypothesis—Plant Growth

https://fathom.concord.org/resources/tutorials/testing-a-hypothesis-plant-growth/

5 Viewpoint

To better understand the Deep Breathing Technique (DBT) and its applications, such as controlling blood pressure and stress levels, it is essential that we examine the technique carefully so that we can draw the most suitable conclusions.

For the research, a total of 123 students were selected. Students were screened via an initial questionnaire on academic stress, and those who reported high mental stress during the interview were chosen for the main drills. The total data set was divided into two groups, a control group and an experimental group. In the control group, the first readings were recorded as “before the drill readings” for Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP). The second readings were recorded as “after Ordinary Breathing Technique (OBT)”.


Table 1 Hypothesis to be tested

Using the t test formula, the mean of the differences and their standard deviation can be calculated, as shown in Table 2. As listed in Table 2 (for the control group), the average DBP is 87.75 before the OBT drill and 87.27 after it, with variances of 7.54 and 9.34 respectively. Based on the t test, the P-values (0.089 and 0.274) are greater than α. Therefore, both H01 and H02 are not rejected at the 1% level of significance.
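The calculation described here, a t statistic built from the mean and standard deviation of the before/after differences, can be sketched as follows. This assumes the readings are paired per student, which the summary implies but does not state outright. The readings below are hypothetical values chosen for illustration, not data from the article, and `paired_t` is an illustrative helper name.

```python
import math
import statistics


def paired_t(before, after):
    """Return (mean difference, SD of differences, t) for paired readings."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD, n - 1 divisor
    t = mean_d / (sd_d / math.sqrt(n))
    return mean_d, sd_d, t


# Hypothetical DBP readings (not from the article), before and after a drill.
before = [88, 90, 86, 87, 89]
after = [87, 90, 85, 88, 88]

mean_d, sd_d, t_stat = paired_t(before, after)
print(f"mean difference = {mean_d:.2f}, SD = {sd_d:.3f}, t = {t_stat:.3f}")
```

The resulting t is then compared with the t distribution on n - 1 degrees of freedom to obtain a P-value like those quoted in the text.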

Table 2 t test (Control Group)

Turning to Table 3 (for the experimental group), the DBP is 87.27 before the DBT drill and 79.89 after it, which is significantly lower and closer to the desired level of 80. As is evident, the P-value is less than α (P-value = 0.000 < α = 0.01). The conclusion is “H03 is rejected”, and we can say that DBT has a significant positive effect on DBP.

Likewise, the SBP was 128.82 before the DBT drill and 121.03 after the drill, again indicating a significant positive effect, as the desired level is 120.


The P-value of 0.000 (lower than α at 0.01) in the t test confirms this (H04 is rejected).

Table 3 t test (Experimental Group)

Table 4 shows the status of each hypothesis for the control and experimental groups after testing and statistical analysis.

Table 4 Hypothesis status

Based on the results of the hypothesis testing, we conclude that the Deep Breathing Technique has a significant effect on students' blood pressure. It is recommended that people use this technique to improve their health.

PART II: DATA ANALYSIS TO A SPECIFIC ORGANIZATIONAL PROBLEM

1 Overall view: Insurance claims and Dataset

Leveraging customer information is of paramount importance for most businesses.

In the case of an insurance company, consumer characteristics such as those listed below might be critical in making business decisions. We have therefore collected data on 1338 customers (676 male and 662 female), referred to below as policy holders or insurance holders.


Dataset source: https://www.kaggle.com/code/yogidsba/insurance-claims-eda-hypothesis-testing

2 The purpose

The objective of this testing is to find the profile of customers that will benefit the company most, through 2 questions:

- Do insurance holders with no kids pay lower charges than average, at the 90% confidence level?

- Are the medical charges of people who smoke greater than those of people who don't, at the 90% confidence level?

DATASET:

- Age: This is an integer indicating the age of the primary beneficiary (excluding those above 64 years, since they are generally covered by the government).

- Sex: This is the policy holder's gender, either male or female.

- BMI: This is the body mass index (BMI), which provides a sense of how over- or under-weight a person is relative to their height. BMI is equal to weight (in kilograms) divided by height (in meters) squared. An ideal BMI is within the range of 18.5 to 24.9.

- Children: This is an integer indicating the number of children / dependents covered by the insurance plan.

- Smoker: This is yes or no depending on whether the insured regularly smokes tobacco.

- Region: This is the beneficiary's place of residence in the U.S., divided into four geographic regions: northeast, southeast, southwest, or northwest.

- Charges: Individual medical costs billed to health insurance.

Above are all the variables in the data that we collected. However, to answer the 2 questions of this test, we will only use the "Children", "Smoker" and "Charges" variables in our calculations.


3 Descriptive Statistics for Variables

As you can see from the table above, the dataset consists of 1338 observations, all valid, with no missing values.

4 Data analysis

a Question 1

- State the hypothesis

Null hypothesis: the average insurance charge of people who don’t have kids is $13270

H0: µ = 13270

Alternative hypothesis: the average insurance charge of people who don’t have kids is less than $13270

Ha: µ < 13270


- Compute the test statistic

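This measures (in standardized units) how far our sample average is from the hypothesized µ:

$$t = \frac{\bar{x} - \mu_0}{S / \sqrt{n}}$$

where $\bar{x}$ is the sample mean, $\mu_0 = 13270$ is the hypothesized mean, $S$ is the sample standard deviation and $n$ is the sample size.

Then, we used the One Sample T-Test to compare the means. In the “Test Value” field, enter 13270, the null-hypothesis value for the insurance charge of people who don’t have children.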
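As a rough illustration of what the One Sample T-Test computes, the sketch below evaluates the t statistic directly from a sample. The charges listed are hypothetical values, not rows from the dataset, and `one_sample_t` is an illustrative helper name rather than an SPSS function.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t test of H0: mu = mu0."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, n - 1 divisor
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical charges, for illustration only (not taken from the dataset).
charges = [11000.0, 12500.0, 9800.0, 14200.0, 13100.0, 10400.0]

t_stat, df = one_sample_t(charges, 13270)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A negative t means the sample mean lies below the test value, which is the direction the alternative hypothesis looks for.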

The outcomes:

- Make decisions

As we can see in the table, the average insurance charge of people who don’t have kids in this sample is 12365.975. We can apply a decision rule using the rejection region, the p-value found from the appropriate distribution (standard normal), or the confidence interval approach.

• With rejection region

We want to be 90% certain. This means a 10% chance of rejecting H0 when it is true. According to our alternative hypothesis, the lower 10% tail of the standard normal will be our rejection region.

From the output, the test statistic is t = -1.801 with 573 degrees of freedom.


The t-value is negative, so significance can only be found in the lower tail of the one-tailed t test. The lower-tail critical value at the 10% level of a standard normal is about -1.282. Our test statistic, -1.801, lies below it (and also below -1.646, roughly the stricter 5% lower-tail cutoff), so we reject the null hypothesis in favor of the alternative hypothesis.
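The rejection-region rule can be checked numerically. The sketch below computes the lower-tail cutoff of the standard normal at the 10% level and compares it with the reported statistic. Using the normal in place of the t distribution is an approximation that is reasonable here only because the degrees of freedom (573) are large.

```python
from statistics import NormalDist

alpha = 0.10
t_stat = -1.801  # test statistic reported in the text

# Lower-tail critical value: P(Z < z_crit) = alpha
z_crit = NormalDist().inv_cdf(alpha)

# Reject H0 when the statistic falls in the lower-tail rejection region.
reject_h0 = t_stat < z_crit
print(f"critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Because the alternative hypothesis is one-sided (µ < 13270), the whole 10% sits in the lower tail rather than being split between two tails.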

• P-value

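We compare Sig. (2-tailed) with the significance level. Sig. (2-tailed) = 0.072 is smaller than α = 0.1. Because the test is one-tailed and the t-value falls on the side named by the alternative hypothesis, the one-tailed p-value is half of this, 0.036, which is also below α. Therefore, we reject the null hypothesis and conclude that the average insurance charge of people who don't have kids is less than $13270.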
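SPSS reports a two-tailed significance, while the alternative hypothesis here is one-sided. A minimal sketch of the usual conversion follows, assuming the reported values of 0.072 and t = -1.801. The variable names are illustrative.

```python
sig_two_tailed = 0.072  # Sig. (2-tailed) reported in the text
t_stat = -1.801         # reported test statistic
alpha = 0.10

# Halve the two-tailed value only when the statistic lies on the side named
# by Ha (negative here); otherwise the one-tailed p-value is 1 - half.
if t_stat < 0:
    p_one_tailed = sig_two_tailed / 2
else:
    p_one_tailed = 1 - sig_two_tailed / 2

reject_h0 = p_one_tailed < alpha
print(f"one-tailed p = {p_one_tailed:.3f}, reject H0: {reject_h0}")
```

The decision is the same with either p-value in this case, since both 0.072 and 0.036 are below 0.1.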

• Confidence Interval (CI) approach

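The 90% confidence interval for the difference µ − 13270 is: -1730.820 < µ − 13270 < -77.231, which is equivalent to 11539.180 < µ < 13192.769.

Because µ = 13270 doesn’t lie within this CI (equivalently, the interval for the difference doesn’t contain 0), we reject the null and conclude that the average insurance charge of people who don’t have kids is less than $13270.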
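The reported bounds are both negative, which suggests they describe the difference between the mean and the test value of 13270 rather than µ itself. Under that assumption, the conversion back to the scale of µ can be sketched as:

```python
test_value = 13270
diff_lower, diff_upper = -1730.820, -77.231  # bounds reported in the text

# Shift the interval for (mu - test_value) back to the scale of mu.
mu_lower = test_value + diff_lower
mu_upper = test_value + diff_upper

# H0 is rejected when the hypothesized value falls outside the interval.
contains_null = mu_lower <= test_value <= mu_upper
print(f"{mu_lower:.3f} < mu < {mu_upper:.3f}, contains 13270: {contains_null}")
```

The midpoint of the shifted interval is about 12365.975, which matches the sample mean quoted earlier and supports reading the bounds as a difference.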

Conclusion: Insurance holders with no kids pay lower charges than average at the 90% confidence level.

b Question 2

- State the hypothesis

Null hypothesis: smokers have the same medical charges as non-smokers

H0: µ1 = µ2

Alternative hypothesis: smokers have greater medical charges than non-smokers

Ha: µ1 > µ2

- Compute the test statistic: run the Independent Samples T-Test in SPSS


- Make decisions

We can see in the table that the 1st group, smokers, has a sample size of 274 and mean medical charges of $32050.232, while the group of non-smokers has a sample size of 1064 and mean medical charges of $8434.268. From observation, non-smokers pay far less than smokers. However, the two groups differ in size, so we need to use the table below for testing.

• Tests for equal variances:

Null hypothesis: H0: σ1² = σ2²

Alternative hypothesis: Ha: σ1² ≠ σ2²

We use the Sig. value, which is the p-value; it is almost equal to 0 (.000). It is smaller than the significance level of 1%. Therefore, we reject the null hypothesis "H0: σ1² = σ2²" and conclude "Ha: σ1² ≠ σ2²". This means that in the next step, when testing for equal means, we will use the figures in the "Equal variances not assumed" row.


• Test for equal means:

We can see that Sig. (2-tailed) is also equal to 0.000, smaller than the 1% significance level. We can reject the null hypothesis and conclude that the average medical charges of smokers and non-smokers are different, and that non-smokers pay less than smokers on average, at the 99% confidence level.
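The "Equal variances not assumed" row corresponds to Welch's form of the two-sample t test. The sketch below shows that calculation from summary statistics. The group sizes and means are the ones reported above, but the two standard deviations are placeholder values assumed for illustration, since the report does not quote them. `welch_t` is an illustrative helper name.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance two-sample t test."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Means and sample sizes are from the text; the standard deviations
# (11500 and 6000) are ASSUMED placeholders, not reported values.
t_stat, df = welch_t(32050.232, 11500.0, 274, 8434.268, 6000.0, 1064)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

With a gap between group means this large, any plausible pair of standard deviations gives a very large positive t, consistent with the reported Sig. of 0.000.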

Title	Hypothesis Testing
Authors	Mai Thái Sơn, Phan Mạnh Hùng, Phạm Ngọc Hải, Trần Mạnh Hào, Phạm Văn Lâm, Hoàng Phú Khánh
Supervisor	Trần Thị Bích
Institution	National Economics University
Major	Mathematical Statistics
Type	Group Mid-Term Exam
Year of publication	2024
City	Hanoi
