Exercises on Business Statistics
Problem:
To study the rate of deaths from heart-related diseases, a research group at an American university collected data from several states across the U.S. on the number of deaths and on some related socio-economic variables. The data are given in the table below:
State | The number of deaths | Age 65 | Income | The rate of color
- Age 65: percentage of the population aged 65 years and older
- Income: per capita income, measured in thousands of dollars
- The rate of color: percentage of the population who are people of color
- Region: the states are divided into two research areas, Region 1 and Region 2
Please use the above data to answer the following questions:
1. Use appropriate descriptive statistics to comment on the variables in the data.
2. Use appropriate graphs and correlation coefficients to comment on the relationship between the number of deaths due to cardiovascular diseases and each of the remaining variables. From this, identify which of the remaining variables could affect the dependent variable if a linear regression model were set up with the number of deaths as the dependent variable (no need to distinguish by region).
3. Estimate the confidence interval for the average number of deaths for the states in Region 1 and in Region 2.
4. Compare the average number of deaths for the states in Region 1 and Region 2 (by hypothesis testing). Is the result similar when the same comparison is made for income?
5. Estimate a linear regression model in which the dependent variable is the number of deaths and the independent variables are the remaining variables (by region):
a. Explain the meaning of the regression coefficients and of the R2 coefficient.
b. Use an appropriate test to determine which independent variables do and do not affect the dependent variable. From this, what can be said about the factors that can affect the rate of deaths due to cardiovascular diseases? Are there other factors that could affect this mortality rate?
c. Use the F test to check whether the model is significant. State the meaning of the result obtained.
d. Predict the death rate in a state where the independent variables are, respectively:
15% aged 65 years or older, average income of 25,000 USD, and 4% black.
Explain the result obtained.
Answer:
To answer the questions, the members of our group used the MegaStat software to analyze the data and then used the results from the software to answer each question.
1 Use appropriate descriptive statistics to comment on the variables in the data
1.1 The number of deaths related to cardiovascular disease.
From MegaStat / Descriptive Statistics, we enter the number-of-deaths data and obtain the following table.
Descriptive statistics
The number of deaths
From the above table we find:
- The number of states studied: 50.
- The average number of deaths related to cardiovascular disease in an American state is 259 per 100,000 population. The median is 265.1, so in 50% of the states studied the number of deaths is lower than 265.1 and in the other 50% it is higher than 265.1. The mean is close to the median, which shows that the sample distribution is fairly symmetrical.
- The sample standard deviation is 56.496, which measures the spread of the distribution.
- Some states have the same number of deaths related to cardiovascular disease; the most common value (maximum frequency) is 226 per 100,000 population. The lowest number of deaths in a state is 90.9 per 100,000 population and the highest is 377.5 per 100,000 population, so the range is 286.6.
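The summary measures discussed above (count, mean, median, mode, sample standard deviation, minimum, maximum, range) can be reproduced outside MegaStat. The following is only a minimal sketch using Python's standard library; the eight values in `deaths` are hypothetical and are not the report's data set, which is not reproduced here.

```python
import statistics

# Hypothetical death rates per 100,000 for eight states (NOT the report's data).
deaths = [226.0, 310.5, 226.0, 265.1, 198.4, 377.5, 90.9, 281.2]

summary = {
    "count": len(deaths),
    "mean": statistics.mean(deaths),
    "median": statistics.median(deaths),
    "mode": statistics.mode(deaths),        # most frequent value
    "std_dev": statistics.stdev(deaths),    # sample standard deviation (n - 1)
    "minimum": min(deaths),
    "maximum": max(deaths),
    "range": max(deaths) - min(deaths),
}

for name, value in summary.items():
    print(f"{name:>8}: {value:.3f}")
```

Comparing the printed mean with the median gives the same quick symmetry check that the text applies to the MegaStat output.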
The chart shows the frequency distribution of the number of deaths related to cardiovascular disease in an American state.
From MegaStat / Frequency Distribution / Quantitative, we enter the number-of-deaths data and obtain the following table:
Frequency Distribution - Quantitative
(table values not recovered; surviving column headers: width, frequency, percent, frequency, percent)
The frequency distribution of the number of deaths is fairly balanced and concentrated in the middle. However, the skewness of the distribution is -0.482 < 0, which indicates that the distribution is skewed to the left.
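The sign of the skewness coefficient can be checked by hand. The sketch below assumes the adjusted sample skewness formula n / ((n - 1)(n - 2)) * sum(((x - mean) / s)^3), which is the formula used by Excel's SKEW function; whether MegaStat uses exactly this formula is an assumption. Both data sets are hypothetical.

```python
import statistics

def sample_skewness(values):
    """Adjusted sample skewness (same formula as Excel's SKEW function)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical data: one very low value produces a long left tail.
left_tailed = [1.0, 8.0, 9.0, 9.0, 10.0, 10.0, 10.0]
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]

print(sample_skewness(left_tailed))  # negative: skewed to the left
print(sample_skewness(symmetric))    # approximately zero
```

A negative value means the long tail is on the left, which is how the -0.482 reported above is read.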
1.2. Percentage of population aged 65 years and older
From MegaStat / Descriptive Statistics, we enter the data on the percentage of the population aged 65 years and older and obtain the following table:
3rd quartile 13.475
interquartile range 1.775
Comment:
From the above table we find:
- The number of states studied: 50.
- The average percentage of the population aged 65 years or older in a U.S. state is 12.538%. The median is 12.75%, so in 50% of the states studied the percentage is less than 12.75% and in the other 50% it is greater than 12.75%. The mean is close to the median, which shows that the sample distribution is fairly symmetrical.
- The sample standard deviation is 1.905%, which measures the spread of the distribution.
- Some states have the same percentage of the population aged 65 years or older; the most common value (maximum frequency) is 12.1%. The lowest percentage in a state is 5.7% and the highest is 17.6%, so the range is 11.9%.
The chart shows the frequency distribution of the percentage of the population aged 65 years or older in a U.S. state.
Frequency Distribution - Quantitative
(table values not recovered; surviving column headers: lower, upper, midpoint, width, frequency, percent, frequency, percent)
The frequency distribution of the percentage of the population aged 65 years or older is fairly balanced and concentrated in the center. However, the skewness of the distribution is -0.741 < 0, which indicates that the distribution is skewed to the left.
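A frequency distribution of this kind groups the observations into equal-width classes. The sketch below shows one way to build such a table; the function name `frequency_distribution`, the class limits, the cumulative columns, and the ten values in `ages` are all illustrative assumptions, not MegaStat's implementation or the report's data.

```python
import math

def frequency_distribution(values, lower, width, n_bins):
    """Count values in equal-width classes [lower + i*width, lower + (i+1)*width)."""
    n = len(values)
    counts = [0] * n_bins
    for x in values:
        index = math.floor((x - lower) / width)
        if not 0 <= index < n_bins:
            raise ValueError(f"{x} lies outside the class limits")
        counts[index] += 1

    rows, cumulative = [], 0
    for i, count in enumerate(counts):
        cumulative += count
        class_lower = lower + i * width
        rows.append({
            "lower": class_lower,
            "upper": class_lower + width,
            "midpoint": class_lower + width / 2,
            "frequency": count,
            "percent": 100 * count / n,
            "cum_frequency": cumulative,
            "cum_percent": 100 * cumulative / n,
        })
    return rows

# Hypothetical percentages of population aged 65+ (NOT the report's data).
ages = [5.7, 9.9, 11.3, 12.1, 12.1, 12.8, 13.2, 13.5, 14.6, 17.6]
table = frequency_distribution(ages, lower=4.0, width=2.0, n_bins=7)

for row in table:
    print(f'{row["lower"]:5.1f} - {row["upper"]:5.1f}  '
          f'{row["frequency"]:3d}  {row["percent"]:5.1f}%')
```

With these hypothetical values most observations fall in the 12 to 14 class, which mirrors the "concentrated in the center" pattern described above.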
1.3. Per capita income, in thousands of dollars
From MegaStat / Descriptive Statistics, we enter the income data and obtain the following table.
From the above table we find:
- The number of states studied: 50.
- The average per capita income in a U.S. state is $28.824 thousand. The median is $27.85 thousand, so 50% of the states studied have an average income below $27.85 thousand and the other 50% have an average income above $27.85 thousand. The mean is higher than the median, which shows that the sample distribution is not quite symmetrical.
- The sample standard deviation is 6.19, which measures the spread of the distribution.
- The average income differs from state to state (there is no common value). The lowest state average is $20.993 thousand and the highest is $59.685 thousand, so the range is $38.692 thousand.
The chart shows the frequency distribution of per capita income in a U.S. state.
Frequency Distribution - Quantitative
(table values not recovered; surviving column headers: lower, upper, midpoint, width, frequency, percent, frequency, percent)
Conclusion: Based on the results of the investigation, the authorities may consider setting public service fees in line with the average income in each state, or making policies on fees and other costs related to treatment accordingly.
The frequency distribution of the average income variable tends to be concentrated in the middle. However, some states have much higher income than the others, in the range of $56,000 to $60,000.
1.4 Percentage of the population who are people of color.
From MegaStat / Descriptive Statistics, we enter the rate-of-color data and obtain the following table:
Descriptive statistics
The number of deaths
From the above table we find:
- The number of states studied: 50.
- The average percentage of the population who are people of color in a U.S. state is 9.9%. The median is 6.75%, so in 50% of the states studied the percentage is less than 6.75% and in the other 50% it is greater than 6.75%. The mean is greater than the median, which shows that the sample distribution is skewed to the right.
- The sample standard deviation is 9.58%, which measures the spread of the distribution.
- Some states have the same percentage of the population who are people of color; the most common value (maximum frequency) is 3.5%. The lowest percentage in a state is 0.3% and the highest is 36.3%, so the range is 36%.
The chart shows the frequency distribution of the percentage of the population who are people of color in a U.S. state.
Frequency Distribution - Quantitative
(table values not recovered; surviving column headers: lower, midpoint, width, frequency, percent)
Conclusion: Based on the results of the investigation, the authorities may consider welfare policies that give priority to people of color.
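The frequency distribution of the percentage of the population who are people of color is concentrated at the low end, with a long tail toward high values (the mean of 9.9% exceeds the median of 6.75%), so the distribution is skewed to the right.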
2 Use graphs and correlation coefficients to examine the relationship between the number of deaths due to cardiovascular diseases and the remaining variables:
2.1 Graph of the relationship between the number of deaths due to cardiovascular diseases and the percentage of the population aged 65 years or older
From MegaStat / Correlation / Regression / Scatter Plot, we enter the data on the number of deaths and the percentage of the population aged 65 years or older and obtain the following chart.
Based on the above chart, we find that the scatter plot has a linear form. Thus, the number of deaths due to cardiovascular diseases and the percentage of the population aged 65 years and older have a positive linear relationship.
2.2 Graph of the relationship between the number of deaths due to cardiovascular diseases and per capita income
From MegaStat / Correlation / Regression / Scatter Plot, we enter the data on the number of deaths and income and obtain the following chart.
Based on the above chart, we find that the scatter plot is not linear. Thus, there is no clear linear relationship between the number of deaths due to cardiovascular diseases and per capita income.
2.3 Graph of the relationship between the number of deaths due to cardiovascular diseases and the percentage of the population who are people of color
From MegaStat / Correlation / Regression / Scatter Plot, we enter the data on the number of deaths and the percentage of the population who are people of color and obtain the following chart.
Based on the above chart, we find that the scatter plot has a linear form. Thus, the number of deaths from cardiovascular diseases and the rate of color are related to each other.
The correlation coefficient between the variables
From MegaStat / Correlation / Regression / Correlation Matrix, we enter all the data on the number of deaths, Age 65, income, and the rate of color, with the following results:
Correlation Matrix
                         Number of deaths   Age 65   Income   Rate of colored people
Number of deaths                    1.000
Age 65                               .788    1.000
Income                              -.044      ...    1.000
Rate of colored people               .312    -.095    -.093                    1.000
Based on the above table we find:
- The correlation between the number of deaths and the percentage of the population aged 65 years and older is 0.788.
- The correlation between the number of deaths and the average income is -0.044.
- The correlation between the number of deaths and the percentage of the population who are people of color is 0.312.
Thus, the percentage of the population aged 65 years and older has the strongest association with the number of deaths related to cardiovascular disease, followed by the percentage of the population who are people of color. Average income has almost no linear correlation with the number of deaths related to cardiovascular disease.
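Each entry of the correlation matrix is a Pearson correlation coefficient. The sketch below computes it from its definition; the function name `pearson_r` and the five-state data set are hypothetical and serve only to illustrate how a strong and a weak correlation appear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for five states (NOT the report's data).
age65 = [9.0, 11.0, 12.0, 13.0, 15.0]         # % aged 65 and older
deaths = [180.0, 240.0, 250.0, 275.0, 330.0]  # deaths per 100,000
income = [30.0, 24.0, 35.0, 22.0, 29.0]       # thousands of dollars

print(f"deaths vs age65:  {pearson_r(deaths, age65):.3f}")
print(f"deaths vs income: {pearson_r(deaths, income):.3f}")
```

A value near +1 or -1 indicates a strong linear association, while a value near 0, such as the -0.044 reported for income, indicates almost none.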
3 Confidence interval estimate for the average number of deaths for the states in Region 1 and Region 2
3.1 The average number of deaths for the states in Region 1
a Basic statistical description of the number of deaths in the states in Region 1
From MegaStat / Descriptive Statistics, we enter the number-of-deaths data for Region 1 and obtain the following table:
Descriptive statistics
The number of deaths
b. Estimate the average number of deaths for the states in Region 1
From the data in the table above, we use MegaStat / Confidence interval - mean, enter the data, and obtain the following table:
Confidence interval – mean
95% confidence level
257.138 mean
57.981 std. dev.
2.060 t (df = 25)
23.419 half-width
280.557 upper confidence limit
233.719 lower confidence limit
Based on the above results, the 95% confidence interval for the average number of deaths related to cardiovascular diseases in the states of Region 1 is (233.719; 280.557). In other words, we can be 95% confident that the average number of deaths related to cardiovascular diseases for the states in Region 1 lies between 233.7 and 280.6 per 100,000 people.
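The half-width reported above follows from the formula t * s / sqrt(n). The sketch below recomputes it from the reported summary figures; the sample size n = 26 is inferred from df = 25 and is not stated in the output, and the critical value 2.060 is taken as printed (rounded), so the result agrees with MegaStat only up to rounding.

```python
import math

# Summary figures reported for Region 1.
mean = 257.138
std_dev = 57.981
df = 25
n = df + 1            # assumption: n inferred from df = n - 1
t_critical = 2.060    # rounded critical value as printed in the output

standard_error = std_dev / math.sqrt(n)
half_width = t_critical * standard_error
lower = mean - half_width
upper = mean + half_width

print(f"half-width: {half_width:.3f}")
print(f"95% CI: ({lower:.3f}; {upper:.3f})")
```

The small difference from the reported 23.419 comes from using the rounded t value rather than the full-precision one.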
3.2 The average number of deaths for the states in Region 2
We proceed in the same way as for Region 1.
a. Basic statistical description of the number of deaths in the states in Region 2.
b. Estimate the average number of deaths for the states in Region 2.
Confidence interval - mean
The 95% confidence interval for the average number of deaths related to cardiovascular diseases in the states of Region 2 is (237.266; 284.576). In other words, we can be 95% confident that the average number of deaths related to cardiovascular diseases for the states in Region 2 lies between 237.3 and 284.6 per 100,000 people.
4.1 Comparison of the average number of deaths for the states in Region 1 and Region 2
From MegaStat / Hypothesis Tests / Compare Two Independent Groups, we enter the number-of-deaths data for Region 1 and Region 2 and obtain the following table:
Hypothesis Test: Independent Groups (t-test, pooled variance)
16.1489 standard error of difference
0 hypothesized difference
.8158 p-value (two-tailed)