FOREIGN TRADE UNIVERSITY
FACULTY OF INTERNATIONAL ECONOMICS
Vũ Nam Khánh - 1814450045
Vũ Minh Hồng - 1816450041
ABSTRACT
Statistics and Probability is a subject with a long history of development. It has been acknowledged as one of the foundation subjects for first-year economics students around the world because of its immense applicability.
Therefore, our group has decided to conduct in-depth research on the determinants that influence the Statistics and Probability scores of economics students.
Our study considers 5 factors that are presumed to shape Statistics and Probability scores: Advanced Math scores, self-studying time per day, interest in the subject, classroom participation and attention to the lesson. After analyzing the data in Stata, we conclude that only attention to the lesson does not have a strong impact on Statistics and Probability scores, whereas the four remaining factors do. At the end of the report, some solutions and recommendations are given to help improve economics freshmen's Statistics and Probability scores.
INTRODUCTION
There is a general consensus about what leads to differences in the performance of economics students. Virtually all accredited business schools require their students to take one or more courses in both mathematics and business statistics. In addition, most introductory business statistics courses require one or more math courses to provide the necessary mathematical foundation for statistics. However, despite these prerequisite math courses, many students do poorly in their business and economics statistics (hereafter, business statistics) course. It has even been alleged that "…Business Statistics is the most hated, most unpopular course in the business program." Potential reasons cited for poor student performance include statistics anxiety, inadequate statistics instruction, inadequate math preparation before matriculation and inadequate math prerequisites prior to taking the statistics course.
In this study, we focus on the importance of math prerequisites for student performance in the business statistics course. Specifically, we use an ordered probit model to examine the relationship between alternative math course sequences and the grades earned by students the first time they complete the business statistics course. We then show how imposing a minimum grade requirement of C- for the prerequisite math course would be expected to affect student performance in business statistics.
Several studies have previously examined the impacts of mathematics skills and topics on student performance in business statistics. To our knowledge, however, this is the first study to examine the effect of alternative prerequisite math course sequences on student performance. It is also the first study to demonstrate the effect on student success in business statistics of imposing a minimum grade requirement for the prerequisite math course.
I Overview of the topic (Review of economic theories and statement of research hypotheses)
1 Foundation for variables and model choosing
1.1. Foundation of choosing variables
Our assumption is that Statistics and Probability scores are affected by the following variables: Advanced Math scores, self-studying time per day, interest in the subject, classroom participation and attention to the lesson.
- Advanced Math score: Because Advanced Math provides the skills and knowledge needed to study Statistics and Probability, we expect that higher scores in Advanced Math will lead to higher scores in Statistics and Probability.
- Self-study hours per day on the Statistics and Probability subject: Self-study is a method that students can use to enhance their learning experience. Using self-study, students can go beyond simply learning what their textbooks and instructors teach. By practicing self-study, they are encouraged to explore more topics that interest them, developing stronger research skills. Therefore, the more time students spend on self-study to review and practice the subject, the higher their score will be.
- Interest in the Statistics and Probability subject: Students who are interested in the subject make more effort to study it and to learn more about it. Therefore, the higher the interest in Statistics and Probability, the higher the score in this course.
- Attention in class: It is believed that the more attention students pay to the lesson, the higher the score will be.
- Class contribution: A successful lesson is built on student contributions. In addition, contributing to the lesson by asking questions requires students to think logically and helps them understand the lesson deeply. Therefore, the more contributions a student makes, the higher the subject score.
1.2. Foundation of choosing models
- Multiple regression model: an extension of simple linear regression. It is used when we want to predict the value of a variable based on the values of two or more other variables. The variable we want to predict is called the dependent variable (or sometimes the outcome, target or criterion variable). The variables we use to predict the value of the dependent variable are called the independent variables (or sometimes the predictor, explanatory or regressor variables) (statistics.laerd.com).
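As a rough illustration of what a multiple regression estimates, the sketch below fits ordinary least squares by solving the normal equations in plain Python. The function name `ols` and the toy data are our own assumptions for illustration; this is not the Stata routine used in the report, and the numbers are not the survey data.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows of regressors (without the constant term);
    an intercept column is added here. Returns [b0, b1, ..., bk].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for idx in range(k):
            if idx != col:
                f = A[idx][col]
                A[idx] = [vr - f * vc for vr, vc in zip(A[idx], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical toy data generated exactly from y = 1 + 2*x1 + 3*x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 3]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
coefs = ols(X, y)  # intercept first, then one slope per regressor
```

Because the toy data contain no noise, the estimated coefficients recover the intercept and the two slopes used to generate them; with real survey data the fit would only be approximate.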
2 Definitions
2.1 Statistics and Probability subject
Statistics is the study of a wide range of areas, including the analysis, interpretation, presentation and organization of data. When statistics is applied in science, industry or social issues, it usually starts with studying a statistical population or a statistical model.
The word probability is derived from the Latin word probare, which means "to prove, to verify". Put simply, probability is one of many words referring to uncertain facts or knowledge, aimed at defining "possibility". Statistics and probability are two related but separate academic disciplines. Statistical analysis often uses probability distributions, and the two topics are often studied together.
In studying probability we work with trials, which are treated as experiments, as well as random quantities and real-life random processes. When solving a problem we often make assumptions, and we then need to check how far each assumption holds by performing a test. Testing such a hypothesis is called a statistical hypothesis test, and its result is calculated from actual data.
2.2. Advanced Mathematics in college
Advanced math is a subject at a more advanced level than the high school math we have studied, and it is intended for undergraduate students. It builds on basic knowledge of general mathematics, such as spatial geometry, probability and statistics, or mathematical quantities, and raises it to more difficult levels, hence the name advanced mathematics. Advanced mathematics is a difficult subject that requires students to study hard to be able to do their exercises. In fact, advanced math is often used in business majors such as business administration, finance, accounting, and so on.
2.3. Self-study
Self-studying is a learning method where students direct their own studying, outside the classroom and without direct supervision. Since students are able to take control of what (and how) they are learning, self-study can be a very valuable way for many students to learn. It helps students learn and retain information better, boosting comprehension, grades, and motivation.
Using self-study, students are able to go beyond simply learning what their class textbooks and instructors teach them. By practicing self-study, they are encouraged to further explore topics they are interested in, developing stronger study skills as a result.
2.4. Interest
Interest is the state of wanting to know or learn about something or someone. Interest in the Statistics and Probability subject is the feeling of wanting to give more attention and time to learning or researching this subject.
2.5. Attention focus
Paying attention means listening to, watching, or considering something or someone very carefully. Here, it means that students focus carefully on the Statistics and Probability lessons.
2.6. Class contribution
Class contribution is a combination of three modes of assessment: individual assessments (a student's development and progress during the term), comparative assessments (what members of the same section, or class, demonstrate is possible), and contextual assessments (what students whose work has been evaluated over the years suggest about the full spectrum of class contribution performances). It is also defined as regularly attending class and not just filling a seat.
II Model Specification
1 Literature review
The purpose of the present study was to identify factors that may contribute to the difficulty economics students have in introductory and advanced statistics courses.
Probability and statistics are the branches of mathematics concerned with the laws governing random events, including the collection, analysis, interpretation, and display of numerical data. Probability has its origin in the study of gambling and insurance in the 17th century, and it is now an indispensable tool of both the social and natural sciences. Statistics may be said to have its origin in census counts taken thousands of years ago; as a distinct scientific discipline, however, it was developed in the early 19th century as the study of populations, economies, and moral actions, and later in that century as the mathematical tool for analyzing such numbers.
2 Object
For economics students, countless factors influence the test score in Statistics and Probability. The factors that stand out are the time spent studying the subject on one's own, how effectively one listens to the teachers' lectures during the learning process, and the Advanced Math score.
Sometimes students feel that their self-study time is not reflected in their test scores, and see this as unfair when students who study a lot on their own are compared with those who study little. So we ask the question: "Do all of the aforementioned factors affect the Statistics and Probability scores of economics students?"
3 Constructing economics model
spscore = f(amscore, interest, class, study, attention)
in which:
amscore: score in the Advanced Math subject
interest: student's interest in the Statistics and Probability subject
class: contribution to the Statistics and Probability classes
study: self-study hours per day on the Statistics and Probability subject
attention: attention paid to the lecturers
4 Specifying economics model
spscore = β0 + β1 amscore + β2 interest + β3 attention + β4 study + β5 class + μ
In which:
β0: the intercept of the regression model
βi (i = 1, ..., 5): the slope coefficient of the i-th independent variable
μ: the disturbance of the regression model
III Estimated model and statistical inferences
GPA, or course score, is the main goal of students when they enter tertiary education, and it requires sustained effort over a long time. Along the way, several factors, both good and bad, affect the GPA and the class of degree obtained. In modern life, students are often distracted by external factors that have an adverse effect on their studies, and these factors are constantly increasing. They affect the factors mentioned above. However, many universities manage to avoid such situations. For example, in the period 2007-2011, 4.22% of students of Pedagogy University (Da Nang University) had average results. At the University of Foreign Language (Da Nang University), only 4.8% of graduates had average results. At Da Nang University of Science and Technology, 82% of graduates of the 2006-2011 course received a good degree or higher. At Duy Tan University, students receiving good or higher degrees accounted for 94.5%. Another example is the 58th cohort (2008-2012) of Hanoi National University of Education: among 1,547 students, only 9 graduated with an average degree (0.58%). In addition, at Ho Chi Minh City University of Technology, 27.6% of students received good or excellent degrees. At Van Hien University, the graduation ceremony at the end of 2012 awarded degrees to 1,155 graduates, of whom only 27 received excellent degrees and 386 received good degrees, together accounting for 36%. It is clear that GPA and Statistics and Probability scores at FTU have also changed.
This report is intended to clarify this problem.
1 Data overview
- This data set is primary data, collected from our own survey of FTU students. After cleaning the data, we obtained 150 qualified observations.
- We use the des command to get a general description of the variables. The most important information obtained from the des command is the meaning of the variables. Below is the result our group obtained for the dependent and independent variables by running "des spscore amscore interest attention study class":
variable name   storage type   display format   value label   variable label
spscore         float          %8.0g
amscore         float          %8.0g
interest        byte           %8.0g
attention       byte           %8.0g
study           float          %8.0g
class           byte           %8.0g
Based on the above result, we summarize the variables in the following table:
Variable    Explanation                                                     Type of variable                        Nature         Format
spscore     Score in the Statistics and Probability subject                 Dependent variable                      Quantitative   %8.0g
amscore     Score in the Advanced Math subject                              Independent variable                    Quantitative   %8.0g
interest    Interest in Statistics and Probability classes                  Independent variable (dummy variable)   Qualitative    %8.0g
class       Contribution to the Statistics and Probability classes          Independent variable (dummy variable)   Qualitative    %8.0g
study       Self-study hours per day on Statistics and Probability          Independent variable                    Quantitative   %8.0g
attention   Attention paid to the lectures                                  Independent variable (dummy variable)   Qualitative    %8.0g
The description: We run the sum command in Stata to get summary statistics of the variables.
After processing, the result we have:
Variable    Obs   Mean        Std. Dev.   Min   Max
spscore     150   8.03        1.272093    6     10
amscore     150   7.36        1.455284    4     10
interest    150   0.56        0.498099    0     1
class       150   0.5333333   0.500559    0     1
study       150   0.4033333   0.3603534   0     2.5
In which:
Obs is the number of observations
Mean is the average value of the variable
Std. Dev. is the standard deviation of the variable
Min is the minimum value of the variable
Max is the maximum value of the variable
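The indicators in the summary table can be sketched in a few lines of Python. The helper `summarize` and the sample scores are hypothetical and only mirror the columns reported above; they are not the survey data. The standard deviation uses the sample formula (division by n - 1), as Stata's summarize does.

```python
import statistics


def summarize(values):
    """Return the indicators shown in the summary table for one variable."""
    return {
        "obs": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


# Hypothetical scores, not the survey data.
sample_scores = [6, 7, 8, 8, 9, 10]
stats = summarize(sample_scores)
```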
2 Estimation of econometrics model
According to the hypotheses stated above, we expect β1, β2, β3, β4 and β5 to be positive (+).
3 Building the experimental model
3.1 Checking correlation among variables
First and foremost, we analyze the correlation among the variables: we determine the correlation coefficients and then consider whether there is multicollinearity among the variables in the model. Using the corr command in Stata, we have:
(obs=150)
             spscore   amscore   interest   attention   study    class
spscore      1.0000
amscore      0.7464    1.0000
interest     0.7519    0.4422    1.0000
attention    0.5217    0.7190    0.3310     1.0000
study        0.8153    0.6011    0.6028     0.4052      1.0000
class        0.7125    0.4901    0.5976     0.2811      0.5482   1.0000
The table illustrates that:
The correlation coefficient between spscore and amscore is 0.7464
The correlation coefficient between spscore and interest is 0.7519
The correlation coefficient between spscore and attention is 0.5217
The correlation coefficient between spscore and study is 0.8153
The correlation coefficient between spscore and class is 0.7125
All coefficients are positive and below 1, so no variable is perfectly correlated with spscore. Each independent variable is moderately to strongly correlated with the score, with study showing the strongest association (0.8153) and attention the weakest (0.5217).
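The coefficients reported by corr are Pearson correlation coefficients. A minimal sketch of that calculation is given below; the function name `pearson` and the two short lists are hypothetical illustrations, not the survey data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical self-study hours and scores for five students.
hours = [0.0, 0.5, 1.0, 1.5, 2.0]
marks = [6.0, 7.0, 7.5, 9.0, 9.5]
r = pearson(hours, marks)
```

The coefficient always lies between -1 and 1; values near 1 indicate a strong positive linear association, as in the study row of the table.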
3.2 Regression run
Using the reg command in Stata, we obtain the sample regression model:
F(5, 144)       = 197.58
Prob > F        = 0.0000
R-squared       = 0.8728
Adj R-squared   = 0.8684
Root MSE        = 0.46154

spscore      Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
attention    0.0270072   0.1169929   0.23    0.818   -0.2042381   0.2582525
study        1.217141    0.1512262   8.05    0.000   0.918231     1.516051
class        0.5057395   0.1017541   4.97    0.000   0.3046149    0.7068642
_cons        4.869813    0.2761929   17.63   0.000   4.323897     5.415729
Analysis of regression coefficients (in each case, other determinants are held constant):
β1 = 0.2693326 (amscore): When the Advanced Math score increases (decreases) by one point, the Statistics and Probability score increases (decreases) by 0.2693326 points.
β2 = 0.7293491 (interest): The Statistics and Probability scores of students who are interested in this subject are higher by 0.7293491 points than those of students who are not.
β3 = 0.0270072 (attention): The Statistics and Probability scores of students who pay attention to the lectures in this subject are higher by 0.0270072 points than those of students who do not pay attention.
β4 = 1.217141 (study): When the number of hours spent per day studying Statistics and Probability increases (decreases) by one hour, the Statistics and Probability score (spscore) increases (decreases) by 1.217141 points.
β5 = 0.5057395 (class): The Statistics and Probability scores of students who contribute to classes in this subject are higher by 0.5057395 points than those of students who do not contribute.
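To make the ceteris paribus reading concrete, the sketch below plugs the coefficients from the regression output of this report into the fitted equation. The function `predict_spscore` and the example student are hypothetical; only the coefficient values come from the report.

```python
# Coefficients as reported in the regression output of this report.
COEF = {
    "_cons": 4.869813,
    "amscore": 0.2693326,
    "interest": 0.7293491,
    "attention": 0.0270072,
    "study": 1.217141,
    "class": 0.5057395,
}


def predict_spscore(amscore, interest, attention, study, class_):
    """Fitted Statistics and Probability score for one student."""
    return (COEF["_cons"]
            + COEF["amscore"] * amscore
            + COEF["interest"] * interest
            + COEF["attention"] * attention
            + COEF["study"] * study
            + COEF["class"] * class_)


# Hypothetical student: Advanced Math score 8, interested, attentive,
# half an hour of self-study per day, contributes in class.
fitted = predict_spscore(amscore=8, interest=1, attention=1, study=0.5, class_=1)

# Same student without interest in the subject: the gap is the interest coefficient.
baseline = predict_spscore(amscore=8, interest=0, attention=1, study=0.5, class_=1)
interest_effect = fitted - baseline
```

Changing one input while holding the others fixed shifts the fitted score by exactly the corresponding coefficient, which is what each interpretation above states.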
4 Multicollinearity and heteroskedasticity testing
4.1 Multicollinearity Testing
- Using corr command:
             amscore   interest   attention   study    class
amscore      1.0000
interest     0.4422    1.0000
attention    0.7190    0.3310     1.0000
study        0.6011    0.6028     0.4052      1.0000
class        0.4901    0.5976     0.2811      0.5482   1.0000
Based on the result, the largest pairwise correlation among the independent variables is 0.7190 (between amscore and attention). None of the correlations is close to 1, so we find no sign of severe multicollinearity in the model.
- Variance Inflation Factor (VIF): we also run the vif command after the regression.
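For reference, the VIF of a regressor is 1 / (1 - R²), where R² comes from regressing that regressor on all the other regressors. The sketch below applies this definition to the simplest case of two regressors, where that R² equals the squared pairwise correlation. It uses the amscore-attention correlation of 0.7190 from the correlation matrix; the function name is our own, and the result is only a lower bound for the VIF in the full five-regressor model, since adding regressors cannot lower the auxiliary R².

```python
def vif_from_r2(r_squared):
    """Variance inflation factor from the auxiliary-regression R-squared."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R-squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Two-regressor illustration: auxiliary R-squared = squared pairwise correlation.
r_amscore_attention = 0.7190  # from the correlation matrix in this report
vif_two_regressors = vif_from_r2(r_amscore_attention ** 2)
```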
4.2 Heteroskedasticity Testing
Step 1: Run regression model
F(5, 144)       = 197.58
Prob > F        = 0.0000
R-squared       = 0.8728
Adj R-squared   = 0.8684
Root MSE        = 0.46154

Source      SS           df    MS
Residual    30.6744795   144   0.213017219
Step 2: Run rvfplot command
Based on the residual-versus-fitted plot, the points are not evenly scattered, which is a sign of possible heteroskedasticity.
- Apply White test:
Running the imtest, white command, we have the following result:
White's test for Ho: homoskedasticity
against Ha: unrestricted heteroskedasticity
chi2(17)      = 120.01
Prob > chi2   = 0.0000
Cameron & Trivedi's decomposition of IM-test
Prob > chi2 = 0.0000 < α = 0.05, so we reject H0 (homoskedasticity).
There is heteroskedasticity in this set of data
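The same decision can be reached by comparing the test statistic with a critical value rather than using the p-value. The sketch below assumes the upper 5% critical value of the chi-square distribution with 17 degrees of freedom taken from standard tables (27.587); the function name is hypothetical.

```python
# Upper 5% critical value of chi-square with 17 degrees of freedom,
# taken from standard statistical tables (assumption, not from the report).
CHI2_CRIT_17_DF_5PCT = 27.587


def reject_homoskedasticity(statistic, critical_value):
    """Reject H0 (homoskedasticity) when the statistic exceeds the critical value."""
    return statistic > critical_value


white_statistic = 120.01  # chi2(17) reported by imtest, white
reject_h0 = reject_homoskedasticity(white_statistic, CHI2_CRIT_17_DF_5PCT)
```

Since the reported statistic is far above the critical value, both approaches lead to rejecting homoskedasticity.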
5 Coefficients testing
We test each coefficient to know whether it is meaningful to the model; in other words, we test the significance of each independent variable for the dependent one (spscore). The two hypotheses for each test are:
H0: βi = 0
H1: βi ≠ 0
5.1 P-value
If the p-value of an independent variable is smaller than the significance level, we reject H0 and accept H1. This means the variable has a significant effect on spscore.
Test for significance of β1 (amscore):
Prob (β1) = 0.000 < 0.05, so we reject H0 at significance level α = 5%. Therefore, β1 is statistically significant at 5%.
Test for significance of β2 (interest):
Prob (β2) = 0.000 < 0.05, so we reject H0 at significance level α = 5%. Therefore, β2 is statistically significant at 5%.
Test for significance of β3 (attention):
Prob (β3) = 0.818 > 0.05, so we do not reject H0 at significance level α = 5%. Therefore, β3 is not statistically significant at 5%.
Test for significance of β4 (study):
Prob (β4) = 0.000 < 0.05, so we reject H0 at significance level α = 5%. Therefore, β4 is statistically significant at 5%.
Test for significance of β5 (class):
Prob (β5) = 0.000 < 0.05, so we reject H0 at significance level α = 5%. Therefore, β5 is statistically significant at 5%.
In conclusion, attention does not have a significant impact on spscore, while study, interest, class and amscore do.
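The p-value rule used in this section amounts to a single comparison per variable. A minimal sketch, using the p-values reported in the tests of this section and a hypothetical dictionary layout:

```python
ALPHA = 0.05  # significance level used in the report

# p-values as reported in the coefficient tests of this section.
P_VALUES = {
    "amscore": 0.000,
    "interest": 0.000,
    "attention": 0.818,
    "study": 0.000,
    "class": 0.000,
}

# A variable is significant when its p-value is below the significance level.
significant = {name: p < ALPHA for name, p in P_VALUES.items()}
not_significant = [name for name, flag in significant.items() if not flag]
```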
5.2 Confidence Interval
Variables   Coefficient   Significance level   Confidence interval
const       B0            5%                   (4.323897; 5.415729)
amscore     B1            5%                   (0.1818351; 0.3568301)
interest    B2            5%                   (0.5229854; 0.9357129)
attention   B3            5%                   (-0.2042381; 0.2582525)
study       B4            5%                   (0.918231; 1.516051)
class       B5            5%                   (0.3046149; 0.7068642)
For every coefficient except B3 (attention), 0 does not belong to the confidence interval, so we reject H0 for those coefficients; they are statistically significant at the 95% confidence level. The interval for B3 contains 0, so we cannot reject H0 for attention, in line with the p-value test.
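Each confidence interval is the coefficient plus or minus a critical t value times the standard error. The sketch below reproduces two of the intervals from the regression output; the critical value 1.9766 (two-sided 5%, 144 residual degrees of freedom) is our assumption, chosen because it matches the reported interval widths, and the function name is hypothetical.

```python
# Assumed two-sided 5% critical value of the t distribution with 144 degrees of freedom.
T_CRIT = 1.9766


def conf_interval(coef, std_err, t_crit=T_CRIT):
    """95% confidence interval: coefficient +/- t_crit * standard error."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


# Coefficients and standard errors from the regression output of this report.
lo_study, hi_study = conf_interval(1.217141, 0.1512262)
lo_attention, hi_attention = conf_interval(0.0270072, 0.1169929)

# A coefficient is not significant at 5% when its interval contains zero.
attention_contains_zero = lo_attention <= 0.0 <= hi_attention
study_contains_zero = lo_study <= 0.0 <= hi_study
```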
Rejection rule: reject H0 when |tqs| > t(150; 0.025) = 1.985
Variables   Coefficient   tqs
interest    B2            6.99
attention   B3            0.23
study       B4            8.05
class       B5            4.97
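The t statistics in this table can be reproduced as coefficient divided by standard error. The sketch below does this for the three variables whose standard errors appear in the regression output, and compares each with the critical value 1.985 quoted in the table; the dictionary layout and function name are hypothetical.

```python
T_CRITICAL = 1.985  # critical value quoted in the report's t-test table

# (coefficient, standard error) pairs from the regression output of this report.
ESTIMATES = {
    "attention": (0.0270072, 0.1169929),
    "study": (1.217141, 0.1512262),
    "class": (0.5057395, 0.1017541),
}


def t_statistic(coef, std_err):
    """t statistic for H0: coefficient = 0."""
    return coef / std_err


# For each variable: rounded t statistic and whether H0 is rejected.
t_results = {
    name: (round(t_statistic(c, se), 2), abs(t_statistic(c, se)) > T_CRITICAL)
    for name, (c, se) in ESTIMATES.items()
}
```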