MINISTRY OF EDUCATION AND TRAINING
NATIONAL ECONOMICS UNIVERSITY
-*** -
GROUP MID-TERM EXAM
Course: Mathematical Statistics
TOPIC: Hypothesis testing
Lecturer: Trần Thị Bích
Class: Finance Economics – FE64
Group: 1
Members: Mai Thái Sơn – 11225626
Phan Mạnh Hùng – 11222590
Phạm Ngọc Hải – 11222029
Trần Mạnh Hào – 11222180
Phạm Văn Lâm – 11223240
Hoàng Phú Khánh – 11223030
CONTENTS
PART I: SUMMARIZE THE ARTICLE
1 The article
2 The issue of interest
3 The technique
4 Additional references of Hypothesis testing
5 Viewpoint
PART II: DATA ANALYSIS TO A SPECIFIC ORGANIZATIONAL PROBLEM
1 Overall view: Insurance claims and Dataset
2 The purpose
3 Descriptive Statistics for Variables
4 Data analysis
PART I: SUMMARIZE THE ARTICLE
1 The article
Stress management through regulation of blood pressure among college students
Source: https://content.iospress.com/articles/work/wor2308
2 The issue of interest
Stress is a pervasive condition that affects individuals on multiple levels, disrupting their sleep, work, and overall quality of life. Extensive research has identified job-related and academic factors as major contributors to stress. However, the field lacks studies investigating immediate remedies or first-aid measures to alleviate stress. This article introduces the Deep Breathing Technique (DBT) and explores its potential use for stress management by regulating blood pressure among Indian engineering college students. It compares DBT with the Ordinary Breathing Technique (OBT) to assess their effectiveness.
The primary objective of this article is to investigate whether deep breathing can effectively control blood pressure and thereby reduce stress levels. By examining the impact of the different breathing techniques, the article aims to provide recommendations based on its findings.
3 The technique
This academic article uses hypothesis testing to analyze data on stress management. In data science and statistics, hypothesis testing is an important step because it checks an assumption about a statistical parameter against data. Analysts use it to judge whether a hypothesis about that parameter is plausible.
To perform a hypothesis test, we first establish two hypotheses: the null hypothesis and the alternative hypothesis. We then collect data, perform an appropriate statistical test, and use its result to decide whether or not to reject the null hypothesis.
Thus, as organizational managers, we see hypothesis testing as a useful tool for making reliable decisions from sample data. It lets us move forward knowing how likely we are to be wrong, rather than overlooking an effect that may matter in the future.
4 Additional references of Hypothesis testing
a The first source
Hypothesis and Hypothesis Testing in the Clinical Trial
https://www.psychiatrist.com/wp-content/uploads/2021/02/13947_hypothesis-hypothesis-testing-clinical-trial.pdf
b The second source
Testing a Hypothesis—Plant Growth
https://fathom.concord.org/resources/tutorials/testing-a-hypothesis-plant-growth/
5 Viewpoint
To better understand the Deep Breathing Technique (DBT) and its applications, such as controlling blood pressure and stress levels, we need to examine the study carefully so that we can draw well-founded conclusions.
For the research, a total of 123 students were selected. Students were first screened with a questionnaire on academic stress, and those who reported high mental stress during the interview were chosen for the main drills. The sample was divided into two groups, a control group and an experimental group. In the control group, the first readings of Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP) were recorded as “before the drill readings”, and the second readings were recorded as “after Ordinary Breathing Technique (OBT)”.
Table 1 Hypothesis to be tested
Using the t test formula, the mean and standard deviation of the differences can be calculated; they are shown in Table 2. As listed in Table 2 (control group), the average DBP is 87.75 before the OBT drill and 87.27 after it, with variances of 7.54 and 9.34 respectively. Based on the t test, the P-values (0.089 and 0.274) are greater than α. Therefore, neither H01 nor H02 is rejected at the 1% level of significance.
Table 2 t test (Control Group)
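As an illustration of the "mean of the differences" calculation described above, here is a minimal sketch of a paired t statistic in Python. It assumes the article's before/after comparison is a paired t test. The function name `paired_t` and the readings are made up for illustration and are not data from the article.

```python
import math
import statistics


def paired_t(before, after):
    """Return (mean difference, sd of differences, t statistic) for paired readings."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # One difference per student: reading before the drill minus reading after it.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t_stat


# Hypothetical diastolic readings for five students (NOT taken from the article).
before_drill = [88, 90, 86, 87, 89]
after_drill = [80, 81, 79, 80, 80]

mean_diff, sd_diff, t_stat = paired_t(before_drill, after_drill)
print(f"mean difference = {mean_diff:.2f}, sd = {sd_diff:.2f}, t = {t_stat:.3f}")
```

The t statistic is then compared with a t distribution with n - 1 degrees of freedom to obtain a P-value like those reported in the tables.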
Turning to Table 3 (experimental group), the DBP is 87.27 before the DBT drill and 79.89 after it, which is significantly lower and closer to the desired level of 80. The P-value is below α (P-value = 0.000 < α = 0.01), so H03 is rejected, and we can say that DBT has a significant positive effect on DBP.
Similarly, the SBP was 128.82 before the DBT drill and 121.03 after it, again indicating a significant positive effect, as the desired level is 120. The P-value of 0.000 in the t test (lower than α = 0.01) confirms this, so H04 is rejected.
Table 3 t test (Experimental Group)
Table 4 shows the status of each hypothesis for the control and experimental groups after testing and statistical analysis.
Table 4 Hypothesis status
Based on the results of the hypothesis tests, we conclude that the Deep Breathing Technique significantly lowers students' SBP and DBP, whereas ordinary breathing shows no significant effect. We recommend that people use this technique to improve their health.
PART II: DATA ANALYSIS TO A SPECIFIC ORGANIZATIONAL PROBLEM
1 Overall view: Insurance claims and Dataset
Leveraging customer information is of paramount importance for most businesses. For an insurance company, customer characteristics such as those listed below can be critical in making business decisions. We therefore collected data on 1338 of our customers (676 male and 662 female), referred to below as policy holders or insurance holders.
Dataset source: https://www.kaggle.com/code/yogidsba/insurance-claims-eda-hypothesis-testing
2 The purpose
The objective of this testing is to identify the profile of customers who benefit the company most, through two questions:
- Do insurance holders with no kids pay lower charges than average, at the 90% confidence level?
- Are the medical charges of people who smoke greater than those of people who don’t, at the 90% confidence level?
DATASET:
- Age: This is an integer indicating the age of the primary beneficiary (excluding those above 64 years, since they are generally covered by the government).
- Sex: This is the policy holder's gender, either male or female.
- BMI: This is the body mass index (BMI), which provides a sense of how over- or under-weight a person is relative to their height. BMI is equal to weight (in kilograms) divided by height (in meters) squared. An ideal BMI is within the range of 18.5 to 24.9.
- Children: This is an integer indicating the number of children / dependents covered by the insurance plan.
- Smoker: This is yes or no depending on whether the insured regularly smokes tobacco.
- Region: This is the beneficiary's place of residence in the U.S., divided into four geographic regions: northeast, southeast, southwest, or northwest.
- Charges: Individual medical costs billed to health insurance.
Above are all the variables in the data that we collected. However, to answer the 2 questions of this test, we will only use the "Children", "Smoker" and "Charges" variables in our calculations.
3 Descriptive Statistics for Variables
As the table above shows, the dataset consists of 1338 observations, all of them valid, with no missing values.
4 Data analysis
a Question 1
- State the hypothesis
Null hypothesis: the average insurance charge of people who don’t have kids is $13270
H0: µ = 13270
Alternative hypothesis: the average insurance charge of people who don’t have kids is less than $13270
Ha: µ < 13270
- Compute the test statistic
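This measures (in standardized units) how far our sample average is from the hypothesized µ:
$$t = \frac{\bar{x} - \mu_0}{S / \sqrt{n}}$$
where $\bar{x}$ is the sample mean, $\mu_0 = 13270$ is the hypothesized mean, $S$ is the sample standard deviation and $n$ is the sample size.
Then, we used the One-Sample T-Test to compare the means. In the “Test Value” section, enter 13270, the null-hypothesis value for the insurance charge of people who don’t have children.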
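The test statistic can also be computed directly from raw values. The sketch below is only an illustration: the function name `one_sample_t` and the five charge values are hypothetical, not rows from the insurance dataset.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t_stat = (x_bar - mu0) / (s / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical charges in thousands of dollars (NOT taken from the dataset).
charges = [10.0, 12.0, 9.0, 11.0, 8.0]
t_stat, df = one_sample_t(charges, mu0=12.0)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A negative t statistic means the sample mean is below the hypothesized value, which is the direction of interest for a "less than" alternative.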
The outcomes:
- Make decisions
As we can see in the table, the average insurance charge of people who don’t have kids in this sample is 12365.975. We can apply a decision rule using the rejection region, the p-value from the appropriate distribution (the t distribution, which is close to the standard normal for a sample this large), or the confidence interval approach.
• With rejection region
We want to be 90% certain. This means a 10% chance of rejecting H0 when it is true. Since the alternative hypothesis is one-sided (µ < 13270), the rejection region is the lower 10% tail of the distribution.
From the output, the test statistic is t = -1.801 with 573 degrees of freedom.
The t-value is negative, so significance can only be found in the lower tail of a one-tailed t-test. The critical value for the lower 10% tail is about -1.283; the value -1.646 is the stricter cutoff for the lower 5% tail. Our test statistic, -1.801, lies below both, so we reject the null hypothesis in favor of the alternative hypothesis.
• P-value
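We compare the p-value with the significance level. The output reports Sig. (2-tailed) = 0.072. Because our test is one-tailed and the test statistic falls on the side of the alternative hypothesis, the one-tailed p-value is half of that, 0.036. Either value is smaller than α = 0.1; therefore, we reject the null hypothesis and conclude that the average insurance charge of people who don't have kids is less than $13270.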
• Confidence Interval (CI) approach
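The 90% confidence interval in the output is for the difference µ − 13270: -1730.820 < µ − 13270 < -77.231, which is equivalent to 11539.180 < µ < 13192.769.
Because µ = 13270 does not lie within this interval (equivalently, 0 does not lie within the interval for the difference), we reject the null hypothesis and conclude that the average insurance charge of people who don’t have kids is less than $13270.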
Conclusion: Insurance holders with no kids pay lower charges than average at the 90% confidence level.
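The three decision rules for Question 1 can be checked side by side. The sketch below uses only the figures reported above (t = -1.801, Sig. (2-tailed) = 0.072, interval -1730.820 to -77.231). It makes two assumptions: the critical value is taken from the standard normal as an approximation to the t distribution with 573 degrees of freedom, and the reported interval is read as an interval for the difference µ − 13270, since its endpoints are negative.

```python
from statistics import NormalDist

ALPHA = 0.10                     # significance level for a 90% confidence level
TEST_VALUE = 13270.0             # hypothesized mean charge under H0
T_STAT = -1.801                  # test statistic reported in the output
SIG_TWO_TAILED = 0.072           # two-tailed p-value reported in the output
CI_DIFF = (-1730.820, -77.231)   # reported 90% interval, read as mu - TEST_VALUE

# 1) Rejection region: lower-tail critical value (normal approximation, df = 573).
critical = NormalDist().inv_cdf(ALPHA)
reject_by_region = T_STAT < critical

# 2) P-value: halve the two-tailed value when t points toward the alternative.
p_one_tailed = SIG_TWO_TAILED / 2 if T_STAT < 0 else 1 - SIG_TWO_TAILED / 2
reject_by_p = p_one_tailed < ALPHA

# 3) Confidence interval: shift the interval for the difference back to mu.
ci_mu = (TEST_VALUE + CI_DIFF[0], TEST_VALUE + CI_DIFF[1])
reject_by_ci = not (ci_mu[0] <= TEST_VALUE <= ci_mu[1])

print(f"critical value = {critical:.3f}, reject: {reject_by_region}")
print(f"one-tailed p = {p_one_tailed:.3f}, reject: {reject_by_p}")
print(f"CI for mu = ({ci_mu[0]:.3f}, {ci_mu[1]:.3f}), reject: {reject_by_ci}")
```

A two-sided 90% interval leaves 5% in each tail, so the interval check is stricter than a one-tailed test at the 10% level. Here all three rules lead to the same decision.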
b Question 2
- State the hypothesis
Null hypothesis: smokers have the same medical charges as non-smokers
H0: µ1 = µ2
Alternative hypothesis: smokers have greater medical charges than non-smokers
Ha: µ1 > µ2
- Compute the test statistic: run the Independent-Samples T-Test in SPSS
- Make decisions
We can see in the table that the first group, smokers, has a sample size of 274 and mean medical charges of $32050.232, while the group of non-smokers has a sample size of 1064 and mean medical charges of $8434.268. From observation, non-smokers pay far less than smokers; however, the two samples differ in size, so we use the table below to test the difference formally.
• Tests for equal variances:
Null hypothesis: H0: σ1² = σ2²
Alternative hypothesis: Ha: σ1² ≠ σ2²
We use the value of Sig., which is the p-value; it is almost equal to 0 (.000), which is smaller than the significance level of 1%. Therefore, we reject the null hypothesis "H0: σ1² = σ2²" and conclude "Ha: σ1² ≠ σ2²". This means that in the next step, when testing for equal means, we use the figures in the "Equal variances not assumed" row.
• Test for equal means:
The value of Sig. (2-tailed) is also reported as 0.000, smaller than the 1% significance level (the one-tailed p-value is half of it, so smaller still). We reject the null hypothesis and conclude that the average medical charges of smokers and non-smokers are different, and that smokers pay more than non-smokers on average, at the 99% confidence level and therefore also at the 90% level stated in the question.
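When equal variances are not assumed, the two-sample statistic is usually computed with Welch's formula, which is what the "Equal variances not assumed" row is understood to report. The sketch below is an illustration under that assumption. The function name `welch_t` and both samples are hypothetical, not values from the insurance dataset.

```python
import math
import statistics


def welch_t(x, y):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1  # squared standard error contributed by x
    b = statistics.variance(y) / n2  # squared standard error contributed by y
    t_stat = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_stat, df


# Hypothetical charges in thousands of dollars (NOT taken from the dataset).
smokers = [30.0, 34.0, 32.0, 36.0]
non_smokers = [8.0, 9.0, 7.0, 8.0, 8.0, 8.0]

t_stat, df = welch_t(smokers, non_smokers)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Because each sample keeps its own variance, the degrees of freedom are generally not a whole number and fall between the smaller sample size minus one and n1 + n2 - 2.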