Exercises in Business Statistics (MBA)
Question 1.
The S&P 500 is a large group of 500 companies listed in the United States. In 2006, the average annual return of the stocks in the S&P 500 was 13.62%, with a standard deviation of approximately 20%. According to historical data, the returns of these stocks are normally distributed. Use the above parameters to evaluate several probabilities by answering the following questions:
a. What is the probability that a stock in the S&P 500 gains at least 25%? At least 50%?
b. What is the probability that a stock in the S&P 500 loses at least 25%? At least 50%?
c. Using the 3-sigma rule, within what range will the profit and loss of a stock in the S&P 500 fluctuate?
Note:
For each question, students use the MegaStat add-in to do the exercise and state clearly how the data were entered into the software. After the software has performed the calculation, students include its results in the answer to each question.
Solution.
a. What is the probability that a stock in the S&P 500 gains at least 25%? At least 50%?
I use the MegaStat add-in to answer the questions. In MegaStat, choose Probability / Normal Distribution and enter the data; this gives the following results.
1.1 Probability that a stock in the S&P 500 gains at least 25%.
Đỗ Quốc Việt
From the problem statement we have: μ = 13.62; σ = 20
[MegaStat output: Normal distribution]
Thus the probability that a stock in the S&P 500 gains at least 25% is P = 0.2843.
1.2 Probability that a stock in the S&P 500 gains at least 50%.
[MegaStat output: Normal distribution]
The probability that a stock in the S&P 500 gains at least 50% is P = 0.0344.
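The two upper-tail probabilities can be cross-checked without MegaStat. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `prob_at_least` is illustrative and not part of the original exercise. Results may differ from the MegaStat figures in the last decimal place, depending on how the inputs are rounded.

```python
from statistics import NormalDist

# Parameters from the problem statement (annual return, in percent).
returns = NormalDist(mu=13.62, sigma=20)

def prob_at_least(x):
    """Upper-tail probability P(X >= x) for the assumed normal model."""
    return 1 - returns.cdf(x)

p_gain_25 = prob_at_least(25)
p_gain_50 = prob_at_least(50)

print(f"P(X >= 25) = {p_gain_25:.4f}")
print(f"P(X >= 50) = {p_gain_50:.4f}")
```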
b. What is the probability that a stock in the S&P 500 loses at least 25%? At least 50%?
2.1 Probability that a stock in the S&P 500 loses at least 25%.
Normal distribution
P(lower)  P(upper)  z      X    mean  std.dev
.0268     .9732     -1.93  -25  14    20
The probability that a stock in the S&P 500 loses at least 25% is P = 0.0268.
2.2 Probability that a stock in the S&P 500 loses at least 50%.
Normal distribution
P(lower)  P(upper)  z      X    mean  std.dev
.0007     .9993     -3.18  -50  14    20
The probability that a stock in the S&P 500 loses at least 50% is P = 0.0007.
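A loss of at least 25% (or 50%) corresponds to the lower tail of the return distribution, P(X ≤ -25) and P(X ≤ -50). A minimal sketch of that calculation, again with the standard-library `NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 13.62, 20          # from the problem statement, in percent
returns = NormalDist(mu, sigma)

# Standardised value for a 25% loss: z = (x - mu) / sigma.
z_25 = (-25 - mu) / sigma

# Lower-tail probabilities P(X <= x).
p_loss_25 = returns.cdf(-25)
p_loss_50 = returns.cdf(-50)

print(f"z for X = -25: {z_25:.2f}")
print(f"P(X <= -25) = {p_loss_25:.4f}")
print(f"P(X <= -50) = {p_loss_50:.4f}")
```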
c. Using the 3-sigma rule, within what range will the profit and loss of a stock in the S&P 500 fluctuate?
We have μ - 3σ = 13.62 - 3 × 20 = -46.38
μ + 3σ = 13.62 + 3 × 20 = 73.62
Let X be the return. By the 3-sigma rule, the profit or loss of a stock will almost certainly lie in the range μ - 3σ < X < μ + 3σ, so we compute P(μ - 3σ < X < μ + 3σ).
Normal distribution
P(lower)  P(upper)  z      X       mean   std.dev
.0013     .9987     -3.00  -46.38  13.62  20.00
.9987     .0013     3.00   73.62   13.62  20.00
[Figure: standard normal density f(z) plotted against z from -3 to 3]
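The probability that the return lies between the two limits is P(-46.38 < X < 73.62) = 0.9987 - 0.0013 = 0.9974. With a probability of about 99.7%, the profit or loss of a stock is therefore in the range -46.38% to 73.62%.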
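The 3-sigma limits and the probability mass between them can be checked directly. This is a minimal sketch under the same normal model; the probability inside the interval is the difference of the two cumulative probabilities at the limits.

```python
from statistics import NormalDist

mu, sigma = 13.62, 20
returns = NormalDist(mu, sigma)

# 3-sigma limits around the mean.
lower = mu - 3 * sigma
upper = mu + 3 * sigma

# Probability that the return falls between the limits.
p_within = returns.cdf(upper) - returns.cdf(lower)

print(f"3-sigma range: {lower:.2f} to {upper:.2f}")
print(f"P(lower < X < upper) = {p_within:.4f}")
```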
Question 2.
A market research team for an electronics company is investigating the TV viewing habits of the population in a region. A random sample of 40 respondents was surveyed, with the following results:
- The average TV viewing time per week of the 40 respondents is 15.3 hours, with a standard deviation of 3.8 hours.
- Of the 40 people, 27 watched the show "Who Wants to Be a Millionaire" during the week.
a. Construct a 95% confidence interval for the average weekly TV viewing time of local people.
b. Has the average TV viewing time increased over the past 5 years? (A survey at that time found an average of 13 hours per week; use hypothesis testing to check.)
c. Find the 95% confidence interval for the proportion of people who watch the show "Who Wants to Be a Millionaire". Explain the meaning of the result.
Suppose the researchers want to carry out a survey in a region. Determine:
d. How many people must be surveyed so that, with 95% confidence, the estimate of the average TV viewing time has a margin of error of less than 2 hours around the sample mean (assuming a population standard deviation of 5 hours)?
e. How many people must be surveyed to estimate the proportion of people who watch "Who Wants to Be a Millionaire" with a 95% confidence level and a margin of error of 0.035?
a. Construct a 95% confidence interval for the average weekly TV viewing time of local people.
In MegaStat, choose Confidence Intervals / Sample Size and enter the data; this gives the following result.
Confidence interval - mean
95% confidence level
15.30 mean
3.80 std dev
40.00 n
1.960 z
1.178 half-width
16.478 upper confidence limit
14.122 lower confidence limit
With 95% confidence, the average weekly TV viewing time of people in the region is between 14.122 hours and 16.478 hours.
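The interval above follows from mean ± z × s / √n. A minimal sketch of that arithmetic, assuming (as the MegaStat output does) a z-based interval rather than a t-based one:

```python
from math import sqrt
from statistics import NormalDist

mean, std_dev, n = 15.3, 3.8, 40       # sample results from the problem
confidence = 0.95

# Two-sided critical value: z such that P(-z < Z < z) = 0.95.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

half_width = z * std_dev / sqrt(n)
lower = mean - half_width
upper = mean + half_width

print(f"z = {z:.3f}, half-width = {half_width:.3f}")
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```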
b. Has the average TV viewing time increased over the past 5 years? (A survey at that time found an average of 13 hours per week; use hypothesis testing to check.)
We use a hypothesis test. In MegaStat, choose Hypothesis Tests / Mean vs. Hypothesized Value and enter the data as in the following table.
Hypothesis Test: Mean vs Hypothesized Value
13.00 hypothesized value
15.30 mean (time)
3.80 std dev
0.60 std error
40.00 n
3.83 z
0.0001 p-value (one-tailed, upper)
Conclusion: We test H0: μ ≤ 13 against H1: μ > 13. Since the p-value < α = 5%, we reject H0. Thus the average TV viewing time in the region has increased, and the stated claim is supported.
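The test statistic and the one-tailed p-value in the table come from z = (x̄ - μ0) / (s / √n). A minimal sketch of that calculation, assuming a z-test as in the MegaStat output:

```python
from math import sqrt
from statistics import NormalDist

mu0 = 13.0                             # hypothesized mean (hours per week)
mean, std_dev, n = 15.3, 3.8, 40       # sample results
alpha = 0.05

std_error = std_dev / sqrt(n)
z = (mean - mu0) / std_error

# Upper-tailed test: p-value = P(Z >= z).
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < alpha

print(f"std error = {std_error:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```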
c. Find the 95% confidence interval for the proportion of people who watch the show "Who Wants to Be a Millionaire". Explain the meaning of the result.
In MegaStat, choose Confidence Intervals / Sample Size / Confidence Interval - Proportion and enter the data; this gives the following result.
Confidence interval - proportion
95% confidence level
0.675 proportion
40 n
1.960 z
0.145 half-width
0.820 upper confidence limit
0.530 lower confidence limit
With 95% confidence, the proportion of people who watch the show "Who Wants to Be a Millionaire" is at least 53% and at most 82%.
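The proportion interval uses p̂ ± z × √(p̂(1 - p̂)/n) with p̂ = 27/40. A minimal sketch of that normal-approximation interval; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

watchers, n = 27, 40
p_hat = watchers / n                   # sample proportion

z = NormalDist().inv_cdf(0.975)        # two-sided 95% critical value
half_width = z * sqrt(p_hat * (1 - p_hat) / n)

lower = p_hat - half_width
upper = p_hat + half_width

print(f"p_hat = {p_hat:.3f}, half-width = {half_width:.3f}")
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```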
d. How many people must be surveyed so that, with 95% confidence, the estimate of the average TV viewing time has a margin of error of less than 2 hours around the sample mean (assuming a population standard deviation of 5 hours)?
In MegaStat, choose Confidence Intervals / Sample Size and enter the data; this gives the following result.
Sample size - mean
2.00 E, error tolerance
5.00 standard deviation
95% confidence level
1.960 z
24.009 sample size
25.00 rounded up
With 95% confidence and a population standard deviation of 5 hours, we need a sample of at least 25 people.
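The required sample size for estimating a mean follows from n = (z × σ / E)², rounded up to the next whole person. A minimal sketch:

```python
from math import ceil
from statistics import NormalDist

error_tolerance = 2.0      # E, in hours
sigma = 5.0                # assumed population standard deviation
z = NormalDist().inv_cdf(0.975)

n_raw = (z * sigma / error_tolerance) ** 2
n_required = ceil(n_raw)   # always round up so the tolerance is met

print(f"sample size = {n_raw:.3f}, rounded up = {n_required}")
```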
e. How many people must be surveyed to estimate the proportion of people who watch "Who Wants to Be a Millionaire" with a 95% confidence level and a margin of error of 0.035?
To find the number of people to survey, we use the confidence interval approach and determine the sample size for estimating a proportion p.
In MegaStat, choose Confidence Intervals / Sample Size / Sample Size - p and enter the data; this gives the following result.
Sample size - proportion
0.035 E, error tolerance
0.50 estimated population proportion
95% confidence level
1.960 z
783.971 sample size
784.00 rounded up
From the above results, we need to survey at least 784 people.
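For a proportion, the sample size formula is n = z² × p(1 - p) / E². The table uses p = 0.50, the most conservative choice because it maximises p(1 - p). A minimal sketch:

```python
from math import ceil
from statistics import NormalDist

error_tolerance = 0.035    # E
p_estimate = 0.50          # conservative value used in the table
z = NormalDist().inv_cdf(0.975)

n_raw = z ** 2 * p_estimate * (1 - p_estimate) / error_tolerance ** 2
n_required = ceil(n_raw)

print(f"sample size = {n_raw:.3f}, rounded up = {n_required}")
```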
Question 3
The assessed value of a house depends on its area (size of the house) and its age. The following data were collected from 15 homes.
Assessed Value ($000)
Area (1000s feet square)
Age (years)
a. Estimate the multiple regression model.
b. Explain the meaning of the estimated coefficients.
c. Give the projected price of a house with an area of 1,750 square feet that is 10 years old.
d. Use a t-test to check whether the area and the age of the house really affect its price.
e. Determine the coefficient of determination R² and explain its meaning.
a. Estimate the multiple regression model.
1.1 Effect of the area on the price of the house. In MegaStat, choose Correlation / Regression / Scatterplot and enter the data to obtain the scatter plot.
The scatter plot shows that the larger the area of the house, the higher the price. That is, area affects the value of the house in the same direction.
1.2 Effect of the age of the house on its value.
The scatter plot shows that the older the house, the lower the price, and vice versa. Thus, the age of the house affects its price in the opposite direction.
1.3 The relationship between size, age and price.
In MegaStat, choose Correlation / Regression / Correlation Matrix and enter the data; this gives the following results.
Correlation Matrix
                          Assessed Value ($000)  Area (1000s feet square)  Age (years)
Assessed Value ($000)     1.00                   0.81                      -0.80
Area (1000s feet square)  0.81                   1.00                      -0.58
Age (years)               -0.80                  -0.58                     1.00
15 sample size
± .514 critical value .05 (two-tail)
± .641 critical value .01 (two-tail)
The relationship between house price and both area and age is fairly strong, and the two factors have roughly equal influence (correlations of 0.81 and -0.80, both beyond the ± .641 critical value). Area affects the price in the same direction, and age affects the price in the opposite direction.
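The critical values printed under the correlation matrix can be reproduced from Student's t critical values through r = t / √(t² + df), with df = n - 2 = 13. The sketch below assumes the t values 2.160 and 3.012, which are standard t-table entries typed in by hand rather than MegaStat output; the function name is illustrative.

```python
from math import sqrt

n = 15
df = n - 2

# Two-tailed t critical values for df = 13, taken from a standard t-table
# (assumption of this sketch, not part of the original output).
T_CRIT_05 = 2.160
T_CRIT_01 = 3.012

def critical_r(t_crit, df):
    """Smallest |r| that is significant for the given t critical value."""
    return t_crit / sqrt(t_crit ** 2 + df)

r_crit_05 = critical_r(T_CRIT_05, df)
r_crit_01 = critical_r(T_CRIT_01, df)

# Correlations of assessed value with area and age, from the matrix.
correlations = {"Area": 0.81, "Age": -0.80}
significant_01 = {k: abs(r) > r_crit_01 for k, r in correlations.items()}

print(f"critical r (.05) = {r_crit_05:.3f}, critical r (.01) = {r_crit_01:.3f}")
print(significant_01)
```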
b. Explain the meaning of the estimated coefficients.
We analyze the relationship between the assessed value and the area and age of the house.
In MegaStat, choose Correlation / Regression / Regression Analysis and enter the data; this gives the following results.
Regression Analysis
R² 0.826
Std. Error 2.168
Dep. Var. Assessed Value ($000)
ANOVA table
variables  coefficients  std. error  t (df=12)  p-value   95% lower  95% upper
Intercept  163.7751      5.4072      30.288     1.05E-12  151.9939   175.5563
Area (1000s feet
From the table above we have Y = β0 + β1 × X1 + β2 × X2 = 163.7751 + 10.725 × Area - 0.284 × Age
Meaning: - If the age of the house does not change, then when the area increases by 1 unit (1,000 square feet), the average assessed value increases by $10,725.
- If the area of the house does not change, then when the age of the house increases by 1 unit (one year), the average assessed value decreases by $284.
c. Give the projected price of a house with an area of 1,750 square feet that is 10 years old.
We use the estimated multiple linear regression model for the calculation.
In MegaStat, choose Correlation / Regression / Regression Analysis and enter the data; this gives the following results.
Regression Analysis
R² 0.83   Adjusted R² 0.80   n 15
R 0.91    k 2                Std. Error 2.17   Dep. Var. Assessed Value ($000)
ANOVA table
Source      SS      df  MS      F      p-value
Regression  268.72  2   134.36  28.58  0.00
Residual    56.41   12  4.70
Total       325.14  14
variables                 coefficients  std. error  t (df=12)  p-value  95% lower  95% upper
Intercept                 163.78        5.41        30.29      0.00     151.99     175.56
Area (1000s feet square)  10.73         3.01        3.56       0.00     4.16       17.29
Age (years)               -0.28         0.08        -3.40      0.01     -0.47      -0.10
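Predicted values for: Assessed Value ($000)
Area (1000s feet square)  Age (years)  Predicted  95% CI lower  95% CI upper  95% PI lower  95% PI upper  Leverage
1,750.00                  10.00        18,930.00  7,447.23      30,412.78     7,447.23      30,412.78     5,908,389.59
This output was obtained by entering the area as 1,750. Because the area is measured in thousands of square feet, a house of 1,750 square feet corresponds to Area = 1.75, and the very large leverage shows that the value 1,750 lies far outside the range of the data. With Area = 1.75 and Age = 10, the model gives Y = 163.7751 + 10.725 × 1.75 - 0.284 × 10 ≈ 179.70, so the projected assessed value of the house is about $179,700.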
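Because the area variable is measured in thousands of square feet, a house of 1,750 square feet corresponds to Area = 1.75. The following minimal sketch evaluates the fitted equation from part b at both inputs to show how much the choice of unit matters; the function name `predict_value` is illustrative, and the coefficients are the ones stated in part b.

```python
# Fitted coefficients from part b (assessed value in $000).
INTERCEPT = 163.7751
COEF_AREA = 10.725      # per 1,000 square feet
COEF_AGE = -0.284       # per year of age

def predict_value(area_thousand_sqft, age_years):
    """Predicted assessed value in thousands of dollars."""
    return INTERCEPT + COEF_AREA * area_thousand_sqft + COEF_AGE * age_years

# Area entered in the model's unit (1.75 thousand square feet).
value_correct_units = predict_value(1.75, 10)

# Area entered as raw square feet, which reproduces the 18,930 figure.
value_raw_sqft = predict_value(1750, 10)

print(f"Area = 1.75: {value_correct_units:.2f} ($000)")
print(f"Area = 1750: {value_raw_sqft:.2f} ($000)")
```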
d. Use a t-test to check whether the area and the age of the house really affect its price.
• Test the hypothesis: H0: β1 = 0 (H1: β1 ≠ 0)
In this test we take α = 5%. The p-value = 0.39% < α, so we reject H0 and conclude that β1 ≠ 0: the area has a statistically significant effect on the price of the house.
• Test the hypothesis: H0: β2 = 0 (H1: β2 ≠ 0)
In this test we take α = 5%. The p-value = 0.53% < α, so we reject H0 and conclude that β2 ≠ 0: the age of the house has a statistically significant effect on its price.
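The same decisions can be reached by comparing each reported t statistic with the two-tailed critical value for 12 degrees of freedom. In the sketch below, the critical value 2.179 is a standard t-table entry typed in by hand (an assumption of this sketch), and the t statistics are the ones printed in the regression output of part c.

```python
# Two-tailed 5% critical value of Student's t with df = 12
# (standard t-table value; assumption of this sketch).
T_CRIT = 2.179

# t statistics reported in the regression output.
t_stats = {"Area": 3.56, "Age": -3.40}

# Reject H0: beta = 0 when |t| exceeds the critical value.
reject_h0 = {name: abs(t) > T_CRIT for name, t in t_stats.items()}

for name, t in t_stats.items():
    decision = "reject H0" if reject_h0[name] else "do not reject H0"
    print(f"{name}: t = {t:.2f}, {decision}")
```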
e. Determine the coefficient of determination R² and explain its meaning.
From part c we have R² = 0.826, which means that about 82.6% of the variation in assessed value is explained by the area and the age of the house. The adjusted R² (0.80) additionally accounts for the number of explanatory variables and the sample size. The larger R² is, the greater the influence of age and area on the house price, and vice versa. R² = 0.826 is relatively large, so the model fits relatively well.
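R², adjusted R² and the F statistic can all be recomputed from the sums of squares in the ANOVA table of part c. A minimal sketch, using R² = SSR / SST and adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1); the variable names are illustrative.

```python
# Sums of squares from the ANOVA table in part c.
ss_regression = 268.72
ss_residual = 56.41
ss_total = 325.14

n = 15   # observations
k = 2    # explanatory variables (area, age)

r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# F = mean square regression / mean square residual.
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))

print(f"R^2 = {r_squared:.3f}")
print(f"Adjusted R^2 = {adj_r_squared:.2f}")
print(f"F = {f_stat:.2f}")
```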
References.
1. Management Decision-Making curriculum, Dr. Nguyen The Manh, PGSM
2. MegaStat software