NATIONAL UNIVERSITY OF SINGAPORE
DEPARTMENT OF STATISTICS AND APPLIED PROBABILITY
Some New Methods for Comparing Several Sets of Regression Coefficients under Heteroscedasticity
DONE BY: YONG YEE MAY HT090765W
SUPERVISOR: ASSOC PROF ZHANG JIN-TING
Acknowledgement
List of Tables
Abstract
Chapter 1 Introduction
1.1 Motivation
1.2 Organization of the Thesis
Chapter 2 Literature Review
2.1 Preliminaries on Regression Analysis
2.2 Conerly and Manfield’s Approximate Test
2.3 Watt’s Wald Test
Chapter 3 Models and Methodology
3.1 Generalized Modified Chow’s Test
3.2 Wald-type Test
3.2.1 2-sample case
3.2.2 k-sample case
3.2.3 ADF Test
3.3 Parametric Bootstrap Test
Chapter 4 Simulation Studies
4.1 Simulation A: Two sample cases
4.2 Simulation B: Multi-sample cases
4.3 Conclusions
Chapter 5 Real Data Application
5.1 Application for 2-sample case: abundance of selected animal species
5.2 Application for 10-sample case: investment of 10 large American corporations
Chapter 6 Conclusion
Bibliography
Appendix: Matlab Codes for Simulations
Acknowledgement
I would like to take this opportunity to express my heartfelt gratitude to everyone who
has provided me with their support, guidance and advice while completing this thesis.
First and foremost, I would like to express my gratitude to my project supervisor,
Professor Zhang Jin-Ting, for offering me this research project and spending his valuable time
guiding me during my graduate study and research. His knowledge and expertise have greatly
benefitted me. During this process, I have gained valuable knowledge and experience from him,
and I greatly appreciate it.
I am also grateful to the Department of Statistics and Applied Probability in the Faculty of
Science at the National University of Singapore (NUS) for giving me this opportunity to work on
this research study.
Lastly, I am highly grateful to my family and friends for their continuous support
throughout this period.
List of Tables
Table 4.1 Parameter configurations for simulations
Table 4.2 Empirical sizes and powers for 2-sample test (p=2)
Table 4.3 Empirical sizes and powers for 2-sample test (p=5)
Table 4.4 Empirical sizes and powers for 2-sample test (p=10)
Table 4.5 Empirical sizes and powers for 3-sample test (p=2)
Table 4.6 Empirical sizes and powers for 3-sample test (p=5)
Table 4.7 Empirical sizes and powers for 3-sample test (p=10)
Table 4.8 Empirical sizes and powers for 5-sample test (p=2)
Table 4.9 Empirical sizes and powers for 5-sample test (p=5)
Table 5.1 Test Results
Table 5.2 Test Results
Abstract
Chow’s test was proposed to test the equivalence of the coefficients of two linear regression models under the assumption of equal variances. However, studies have shown that
the test may produce inaccurate results in the presence of heteroscedasticity. Subsequently,
Conerly and Manfield modified the test to cater for unequal variances of two linear regression
models. We generalize this modified Chow’s test to the k-sample case. Zhang has also proposed a
Wald-type statistic, namely the approximate degrees of freedom (ADF) test, to test the equality of the
coefficients of k linear regression models with unequal variances. A parametric bootstrap (PB)
approach is proposed to test the equivalence of the coefficients of k linear models in the
heteroscedastic case. Simulation studies and real data applications are presented to compare and
examine the performance of these test statistics.
Keywords: linear models; Chow’s test; heteroscedasticity; approximate degrees of freedom test;
Wald statistic; parametric bootstrap
Chapter 1
Introduction
Regression analysis has gained much popularity in recent years. The normal linear
regression model has been widely applied to establish financial, economic or statistical
relationships. As such, many analysts are interested to know whether such relationships remain
stable across different time periods, or whether the same relationship applies to different
populations. Statistically, these questions can be answered by testing whether the sets of
observations belong to the same regression model.
1.1 Motivation
For testing the equality of regression coefficients, a widely used test is Chow’s test
(1960). The test assumes that the error variances are equal, both within
each model and between models. In reality, this assumption is unlikely to be satisfied.
In addition, Toyoda (1974) and Schmidt and Sickles (1977) showed that Chow’s test may become severely
biased when the equality of the covariance matrices does not hold. As a result, Conerly and Manfield (1988, 1989) modified the test using Satterthwaite’s
(1946) approximation to compare heteroscedastic regression models.
Watt (1979) also proposed a Wald test for the equality of the coefficients of
regression models with unequal variances. However, studies have shown that this test has its
drawbacks. Subsequently, Zhang (2010) proposed an approximate degrees of freedom (ADF) test to
compare several heteroscedastic regression models. When the variances of the
regression models are the same, the usual Wald-type test statistic follows the usual F distribution. When
the equality of the variances is not satisfied, the test statistic may give
misleading results. However, the usual F approximation can still be retained by adjusting its
degrees of freedom. This test is known as the ADF test.
In this thesis, a parametric bootstrap (PB) approach for comparing several heteroscedastic
regression models is proposed. This method is similar to the PB approach proposed by
Krishnamoorthy and Lu (2010) for the comparison of several normal mean vectors with unknown
and arbitrary positive definite covariance matrices.
1.2 Organization of the Thesis
The thesis is organized as follows. In Chapter 2, we review the existing
methods for testing the equivalence of the coefficients of two linear models. The generalization of these
methods to k-sample cases and the proposed PB test are outlined in Chapter 3. Comparisons
of the empirical power of the different methodologies via simulation studies and real data
analysis are presented in Chapter 4 and Chapter 5 respectively. Finally, some concluding remarks
are given in Chapter 6.
Chapter 2
Literature Review
[…] populations. For the case of homogeneity, Chow proposed a method for the comparison of
two linear regression models in 1960. The drawback is that the condition of homogeneity is
seldom satisfied. Since then, several modifications and new testing methods have been published
in the literature. In this chapter, some of the tests used for such
comparisons are reviewed.
2.1 Preliminaries on Regression Analysis
Consider two independent regression models based on $n_1$ and $n_2$ observations:
$$Y_i = X_i\beta_i + \varepsilon_i, \quad i = 1, 2 \qquad (2.1)$$
where $Y_i$ is an $n_i \times 1$ vector of observations on the dependent variable, $X_i$ is an $n_i \times p$ matrix of observed values on the $p$ explanatory variables, $\beta_i$ is the $p \times 1$ coefficient vector and $\varepsilon_i$ is an $n_i \times 1$ vector of errors. It is assumed that the errors are independent normal random variables
with zero mean and variances $\sigma_1^2$ and $\sigma_2^2$. The hypothesis for testing the equivalence of two sets
of coefficient vectors can be formally stated as
$$H_0: \beta_1 = \beta_2 \quad \text{versus} \quad H_1: \beta_1 \neq \beta_2 \qquad (2.2)$$
2.2 Conerly and Manfield’s Approximate Test
Under the null hypothesis, the model can be combined as
[…]
$P_{X_i} = X_i(X_i^TX_i)^{-1}X_i^T$ denotes the “hat” matrix for data set $i = 1, 2$. The sum of the squared
errors for the model (2.8) becomes
$$e_1^Te_1 + e_2^Te_2 = Y_1^T(I - P_{X_1})Y_1 + Y_2^T(I - P_{X_2})Y_2 = Y^T(I - P_{X^*})Y = SSE_F \qquad (2.12)$$
where
$$X^* = \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix}$$
Since F is a ratio of independent quadratic forms, Satterthwaite’s approximation is applied
to the numerator and denominator independently. Specifically, the distributions of the numerator
and denominator may each be approximated by $a\chi^2_f$, where $a$ and $f$ are determined by matching the first two moments of the approximation with those of the exact distribution.
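As a brief illustration of this moment matching, consider a generic quadratic form $Q = Z^TAZ$ with $Z \sim N(0, I)$ and $A$ symmetric and positive semi-definite. The symbols $Q$, $Z$ and $A$ are illustrative assumptions and are not necessarily the exact quantities used in this chapter. Since $E(a\chi^2_f) = af$ and $\mathrm{Var}(a\chi^2_f) = 2a^2f$, while $E(Q) = \mathrm{tr}(A)$ and $\mathrm{Var}(Q) = 2\,\mathrm{tr}(A^2)$, equating the first two moments gives:

```latex
\begin{align*}
a f &= \operatorname{tr}(A), & 2a^2 f &= 2\operatorname{tr}(A^2), \\
a &= \frac{\operatorname{tr}(A^2)}{\operatorname{tr}(A)}, &
f &= \frac{\operatorname{tr}^2(A)}{\operatorname{tr}(A^2)}.
\end{align*}
```

As a sanity check under these assumptions, $A = I_p$ gives $a = 1$ and $f = p$, which recovers the exact $\chi^2_p$ distribution.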
Toyoda (1974) showed that the denominator can be approximated by
[…]
By combining this with the previous results, the approximate
distribution of the F statistic becomes
[…]
Conerly and Manfield (1988) further developed a test which
introduces an alternative denominator that gives a more accurate approximation. A modified
Chow statistic, $C^*$, is constructed by using $\theta_1\hat\sigma_1^2 + \theta_2\hat\sigma_2^2$ as the denominator, where the constants $\theta_1$
and $\theta_2$ are chosen to improve the approximation. By matching the moments of $\theta_1\hat\sigma_1^2 + \theta_2\hat\sigma_2^2$ to
[…]
$$\mathrm{Var}(\theta_1\hat\sigma_1^2 + \theta_2\hat\sigma_2^2) = \frac{2\theta_1^2\sigma_1^4}{n_1 - p} + \frac{2\theta_2^2\sigma_2^4}{n_2 - p} \qquad (2.23)$$
[…] changes in the variance ratio $\sigma_1^2/\sigma_2^2$. For that reason, the effect of $f_1$ and $f_2$ on the test significance level will not be significant even as the variance ratio changes. The rate of change of the
multiplier $a_1f_1/(a_2f_2p)$ will have to be minimized in order to stabilize the approximation.
Consequently, Conerly and Manfield (1988) suggested that $\theta_1 = (1 - \lambda)$ and $\theta_2 = \lambda$ since
[…]
This method is relatively easy to implement. In later chapters, the impact of this
estimation on the approximation is discussed in comparison with other testing methods.
2.3 Watt’s Wald Test
Another test for the equality of coefficients under
heteroscedasticity, namely the Wald test, was subsequently proposed by Watt (1979). The Wald test statistic is
[…]
Chapter 3
Models and Methodology
In many situations, one may be interested in comparing $k$ sets of regression coefficients,
where $k > 2$. In this chapter, the methods mentioned previously are generalized to $k$-sample cases. Following that, a parametric bootstrap test is proposed.
3.1 Generalized Modified Chow’s Test
Consider $k$ independent regression models based on $n_1, n_2, \ldots, n_k$ observations:
$$Y_i = X_i\beta_i + \varepsilon_i, \quad i = 1, 2, \ldots, k \qquad (3.1)$$
where $Y_i$ is an $n_i \times 1$ vector of observations on the dependent variable, $X_i$ is an $n_i \times p$ matrix of observed values on the $p$ explanatory variables, $\beta_i$ is the $p \times 1$ coefficient vector and $\varepsilon_i$ is an
$n_i \times 1$ vector of errors. It is assumed that the errors are independent normal random variables with zero mean and variances $\sigma_i^2$. The hypothesis for testing the equivalence of $k$ sets of coefficient vectors can be formally stated as
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_k \quad \text{versus} \quad H_1: H_0 \text{ is not true} \qquad (3.2)$$
Under the null hypothesis, the model can be combined as
[…]
Under the alternative hypothesis, the model may be written as
[…]
in a similar way as mentioned in Section 2.2.
Note that the fundamental idea of the modified Chow tests, for example Conerly and
Manfield’s test, is to match the first two moments of the F-type test statistic with those of some $\chi^2$ distribution. Since this particular method has not been generalized to the $k$-sample case, a modified Chow test statistic for $k$-sample cases based on the same methodology is
constructed in this section.
For simplicity, the degrees of freedom are omitted and the numerator of the modified
Chow’s test becomes $Y^T(P_{X^*} - P_X)Y$, where $P_X = X(X^TX)^{-1}X^T$ and $P_{X^*} = X^*(X^{*T}X^*)^{-1}X^{*T}$.
[…]
Theorem 3.1
[…]
Applying concepts similar to those used by Conerly and Manfield (1988, 1989), if one equates the
first two moments of the numerator and the denominator, the scalar multipliers of the F distribution
cancel out. Let
[…]
$$E(Z^TAZ) = E(S) = \sum_{i=1}^{k} \sigma_i^2\,\mathrm{tr}(Q_i^TQ_i) \qquad (3.10)$$
Since the equivalence of their expectations holds, taking $S$ as the denominator of the test statistic
greatly simplifies the computation. As $S$ takes the form $\theta_1\hat\sigma_1^2 + \theta_2\hat\sigma_2^2 + \cdots + \theta_k\hat\sigma_k^2$, it can be approximated by a $\chi^2$ distribution. For the computation of its degrees of freedom, equation (2.25) can be generalized to the $k$-sample case.
The modified Chow’s test for the multiple-sample case can be constructed as
[…]
The degrees of freedom $f_2$ can be found by solving (3.15) and (3.16) simultaneously. In practice,
the approximate degrees of freedom $f_1, f_2$ can be obtained by replacing the unknown variances
$\sigma_i^2$, $i = 1, 2, \ldots, k$, by their estimators $\hat\sigma_i^2$, $i = 1, 2, \ldots, k$, given earlier. We examine and compare
the performance of this test statistic via simulation and data application in Chapters 4 and 5
respectively.
3.2 Wald-type Test
3.2.1 2-sample case
Recall that the hypothesis for testing the equivalence of two sets of coefficient vectors
can be expressed as $H_0: \beta_1 = \beta_2$ versus $H_1: \beta_1 \neq \beta_2$. This hypothesis can be rewritten as a special case of the general linear hypothesis testing (GLHT) problem
[…] with $C = [I_p, -I_p]_{p \times 2p}$ and $\beta = [\beta_1^T, \beta_2^T]^T$. The GLHT problem is very general as both $\beta$
and $C$ can be chosen to suit the hypothesis. For illustration, if we are
interested in testing whether $\beta_1 = 4\beta_2$, we can choose $C = [I_p, -4I_p]_{p \times 2p}$. Hence, the Wald-type test is
more flexible and can be used in more general testing problems.
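The choice of the matrix $C$ can be illustrated with a small sketch. The Python code below is only an illustration under the assumption that the two-sample hypothesis takes the form $C\beta = 0$ with $C = [I_p, -mI_p]$; the function names `contrast_matrix` and `apply_matrix` and the numerical values are hypothetical.

```python
def contrast_matrix(p, multiplier=1.0):
    """Return the p x 2p matrix C = [I_p, -multiplier * I_p] as nested lists."""
    rows = []
    for i in range(p):
        left = [1.0 if i == j else 0.0 for j in range(p)]
        right = [-multiplier if i == j else 0.0 for j in range(p)]
        rows.append(left + right)
    return rows


def apply_matrix(c, beta):
    """Compute the matrix-vector product C * beta."""
    return [sum(cij * bj for cij, bj in zip(row, beta)) for row in c]


# Hypothetical coefficient vectors satisfying beta1 = 4 * beta2.
beta2 = [0.5, -1.0, 2.0]
beta1 = [4.0 * b for b in beta2]
c = contrast_matrix(3, multiplier=4.0)
print(apply_matrix(c, beta1 + beta2))  # stacked vector [beta1; beta2]
```

With `multiplier=1.0` the same function gives the matrix for the plain equality hypothesis $\beta_1 = \beta_2$.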
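The ordinary least squares estimator of $\beta_i$ and the unbiased estimator of $\sigma_i^2$ for $i = 1, 2$ are
$$\hat\beta_i = (X_i^TX_i)^{-1}X_i^TY_i, \qquad \hat\sigma_i^2 = \frac{Y_i^T(I - P_{X_i})Y_i}{n_i - p} \sim \frac{\sigma_i^2\chi^2_{n_i - p}}{n_i - p}$$
When the homogeneity assumption of $\sigma_1^2$ and $\sigma_2^2$ is valid, i.e. $\sigma_1^2 = \sigma_2^2 = \sigma^2$, it is natural
to estimate $\sigma^2$ by the pooled estimator $\hat\sigma^2_{pool} = \sum_{i=1}^{2}(n_i - p)\hat\sigma_i^2/(n_1 + n_2 - 2p)$. Let
$D = \mathrm{diag}[(X_1^TX_1)^{-1}, (X_2^TX_2)^{-1}]$. Under the above assumption, $\Sigma_\beta$ can be estimated by $\hat\sigma^2_{pool}D$. It
is easy to see that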
[…] often violated. Because of this, Zhang (2010) proposed the ADF test, which is based on the
Wald-type test, to test for the equivalence of the coefficients of linear heteroscedastic regression
models.
[…]
Under the null hypothesis, we have $Z \sim N_q(0, I_q)$. For most cases, the exact distribution of $W$ is
complicated and not tractable.
To approximate the distribution of $W$, $C$ can be decomposed into $k$ blocks of size
$q \times p$ so that $C = [C_1, \ldots, C_k]$. Set $H_i = (C\Sigma_\beta C^T)^{-1/2}C_i$ and $H = (C\Sigma_\beta C^T)^{-1/2}C$. It follows that
$W = H\hat\Sigma_\beta H^T = \sum_{i=1}^{k} W_i$, where $W_i = \hat\sigma_i^2H_i(X_i^TX_i)^{-1}H_i^T$, $i = 1, 2, \ldots, k$. For general $k$-samples, the
approximate distribution of $W$ can be derived through the following theorem.
[…], where the unknown parameters $d$ and $G$ are determined via
matching the first moment and the total variation of $W$ and $R$. Here,
$X \overset{d}{=} Y$ means that $X$ and
$Y$ have the same distribution. Zhang has shown that $G = I_q$ and
[…]
[…] exceeds this critical value.
3.3 Parametric Bootstrap Test
This parametric bootstrap (PB) approach is based on a similar test proposed by
Krishnamoorthy and Lu (2010) for testing MANOVA under heteroscedasticity. The PB test
involves sampling from the estimated models. This means that samples or sample statistics are
generated from parametric models with the parameters replaced by their estimates, and the
generated samples are used to approximate the null distribution of a test statistic.
Recall that $\hat\beta \sim N_{kp}(\beta, \Sigma_\beta)$ where $\Sigma_\beta = \mathrm{diag}[\sigma_1^2(X_1^TX_1)^{-1}, \ldots, \sigma_k^2(X_k^TX_k)^{-1}]$. Under the null
hypothesis, $C\hat\beta \sim N_q(0, C\Sigma_\beta C^T)$. It is also well known that
$$\hat\sigma_i^2 \sim \frac{\sigma_i^2\chi^2_{n_i - p}}{n_i - p}$$
[…] For a given dimension $p$, value of $k$ and sample sizes $n_1, n_2, \ldots, n_k$,
1. Compute the observed value $T_0$ using equation (3.19).
[…] The proportion of times $T_B$ exceeds the observed value $T_0$ is an estimate of the p-value defined in (3.27).
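The general idea of the PB p-value can be sketched in code. The following Python sketch is a simplified two-sample illustration and not the thesis implementation: it assumes a Wald-type statistic of the form $T = (\hat\beta_1 - \hat\beta_2)^T[\hat\sigma_1^2(X_1^TX_1)^{-1} + \hat\sigma_2^2(X_2^TX_2)^{-1}]^{-1}(\hat\beta_1 - \hat\beta_2)$, and it generates bootstrap statistics by simulating normal responses from the fitted error variances rather than by drawing a pivot directly. All function names and the example data are hypothetical.

```python
import random


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(u * v for u, v in zip(row, col)) for col in bt] for row in a]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        scale = m[c][c]
        m[c] = [v / scale for v in m[c]]
        for r in range(n):
            if r != c:
                factor = m[r][c]
                m[r] = [v - factor * w for v, w in zip(m[r], m[c])]
    return [row[n:] for row in m]


def ols(x, y):
    """Return (beta_hat, sigma2_hat, (X'X)^-1) for one sample."""
    n, p = len(x), len(x[0])
    xtx_inv = inverse(matmul(transpose(x), x))
    xty = [sum(x[r][j] * y[r] for r in range(n)) for j in range(p)]
    beta = [sum(xtx_inv[j][l] * xty[l] for l in range(p)) for j in range(p)]
    rss = sum((y[r] - sum(x[r][j] * beta[j] for j in range(p))) ** 2 for r in range(n))
    return beta, rss / (n - p), xtx_inv


def wald_statistic(fit1, fit2):
    """Assumed form: T = d' [s1^2 (X1'X1)^-1 + s2^2 (X2'X2)^-1]^-1 d, d = b1 - b2."""
    (b1, s1, v1), (b2, s2, v2) = fit1, fit2
    p = len(b1)
    d = [u - v for u, v in zip(b1, b2)]
    cov = [[s1 * v1[i][j] + s2 * v2[i][j] for j in range(p)] for i in range(p)]
    cov_inv = inverse(cov)
    return sum(d[i] * cov_inv[i][j] * d[j] for i in range(p) for j in range(p))


def pb_p_value(x1, y1, x2, y2, n_boot=500, seed=0):
    """Parametric bootstrap p-value for H0: beta1 = beta2 (two-sample sketch)."""
    rng = random.Random(seed)
    fit1, fit2 = ols(x1, y1), ols(x2, y2)
    t0 = wald_statistic(fit1, fit2)
    exceed = 0
    for _ in range(n_boot):
        boot = []
        for x, fit in ((x1, fit1), (x2, fit2)):
            sd = fit[1] ** 0.5
            # The statistic depends on the responses only through the difference
            # of the coefficient estimates and the residual variances, so the
            # common coefficient vector under H0 can be set to zero here.
            y_star = [rng.gauss(0.0, sd) for _row in x]
            boot.append(ols(x, y_star))
        exceed += wald_statistic(boot[0], boot[1]) > t0
    return exceed / n_boot


# Hypothetical data: equal coefficients, unequal error variances.
data_rng = random.Random(1)
x1 = [[1.0, data_rng.gauss(0.0, 1.0)] for _ in range(20)]
x2 = [[1.0, data_rng.gauss(0.0, 1.0)] for _ in range(25)]
y1 = [1.0 + 2.0 * row[1] + data_rng.gauss(0.0, 0.5) for row in x1]
y2 = [1.0 + 2.0 * row[1] + data_rng.gauss(0.0, 1.5) for row in x2]
print(pb_p_value(x1, y1, x2, y2, n_boot=200))
```

Under normality, simulating responses in this way produces coefficient differences and variance estimates with the same distributions as those stated above, which is why it is used here as a stand-in; a $k$-sample version would replace the difference of estimates by $C\hat\beta$.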
Chapter 4
Simulation Studies
In this chapter, the performance of the proposed PB test is examined by comparing
the size and the power of the tests described in the previous chapter, namely
Conerly and Manfield’s modified Chow’s test (MC), the ADF test and the PB test. The
simulation results are presented in two studies. Simulation A compares the performance of
the three tests for 2-sample cases, while Simulation B compares the performance of the three tests
for k-sample cases.
4.1 Simulation A: Two sample cases
To illustrate the effectiveness of the proposed PB approach, simulation studies were
conducted to compare the three test statistics for 2-sample cases. The simulation model is designed
as follows:
$$Y_i = X_i\beta_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma_i^2I_{n_i}), \quad i = 1, 2 \qquad (4.1)$$
[…] $\beta_1$ and $\beta_2$. When $\delta = 0$, i.e. when $\beta_1 = \beta_2$, the null
hypothesis of equal coefficients holds. Hence, recording the p-values of the test statistics in this
case gives the empirical sizes of the tests. When $\delta \neq 0$, the power of the tests is obtained. The variances $\sigma_1^2$ and $\sigma_2^2$ are calculated as $2/(1 + \rho)$ and $2\rho/(1 + \rho)$ respectively. It is not difficult to see that the parameter $\rho$ is designed to adjust the heteroscedasticity. When $\rho = 1$,
we have $\sigma_1^2 = \sigma_2^2$, which is the homogeneity case. When $\rho \neq 1$, it becomes the heteroscedasticity
case. After the values for $X_i$, $\beta_i$ and $\sigma_i^2$
have been generated, we can compute the values for $Y_i$
according to the above formula. In addition, for the PB approach, 1000 iterations of $(Z_B, W_B)$
were generated. This entire process is repeated N = 10000 times.
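The variance parametrisation can be checked numerically. The minimal Python sketch below assumes the reading $\sigma_1^2 = 2/(1 + \rho)$ and $\sigma_2^2 = 2\rho/(1 + \rho)$, so that $\rho = \sigma_2^2/\sigma_1^2$ and the two variances always sum to 2; the function name `variances_from_rho` is hypothetical.

```python
import math


def variances_from_rho(rho):
    """Return (sigma1^2, sigma2^2) under the assumed parametrisation
    sigma1^2 = 2 / (1 + rho) and sigma2^2 = 2 * rho / (1 + rho)."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    return 2.0 / (1.0 + rho), 2.0 * rho / (1.0 + rho)


for rho in (0.1, 1.0, 10.0):
    s1_sq, s2_sq = variances_from_rho(rho)
    # Standard deviations rounded to two decimals for comparison with table labels.
    print(rho, round(math.sqrt(s1_sq), 2), round(math.sqrt(s2_sq), 2))
```

Under this reading, $\rho = 0.1$ and $\rho = 10$ give standard deviations of about (1.35, 0.43) and (0.43, 1.35), which appear to correspond to the row labels of the 2-sample result tables.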
H0 true    ρ = 1, δ = 0           ρ = 0.1, 10, δ = 0
H1 true    ρ = 1, δ = 0.5, 1.0    ρ = 0.1, 10, δ = 0.5, 1.0
Table 4.1 Parameter configurations for simulations
The empirical sizes (when $\delta = 0$) and powers (when $\delta \neq 0$) of the three tests are the proportions of simulation runs in which the null hypothesis is rejected, i.e. in which the p-value is less than the nominal
significance level $\alpha$. For simplicity, we set $\alpha = 0.05$ for all simulations.
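The computation of an empirical size or power from simulated p-values can be sketched as follows. This is a minimal Python illustration; the function name `rejection_rate` and the example p-values are hypothetical and are not taken from the simulation study.

```python
def rejection_rate(p_values, alpha=0.05):
    """Proportion of simulated p-values below the nominal level alpha."""
    if not p_values:
        raise ValueError("p_values must be non-empty")
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical p-values from five simulation runs (illustration only).
example = [0.01, 0.20, 0.04, 0.50, 0.73]
print(rejection_rate(example))
```

When the p-values are generated with $\delta = 0$ the result estimates the size of the test, and with $\delta \neq 0$ it estimates the power.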
The empirical sizes and powers of the three tests for testing the equivalence of
coefficients are presented in Tables 4.2, 4.3 and 4.4 below, with the number of covariates being
$p = 2, 5, 10$ respectively. The columns labeled "$\delta = 0$" present the empirical sizes of these tests, whereas the columns labeled "$\delta \neq 0$" show the power of the tests. To measure the overall performance of a test in terms of maintaining the nominal size $\alpha$, the average relative error (ARE) is defined as
$$\mathrm{ARE} = M^{-1}\sum_{j=1}^{M}\frac{|\hat\alpha_j - \alpha|}{\alpha} \times 100$$
where $\hat\alpha_j$, $j = 1, \ldots, M$, denote the empirical sizes of
the associated test. Conventionally, when ARE $\le 10$, the test performs very well; when
$10 <$ ARE $\le 20$, the test performs reasonably well; and when ARE $> 20$, the test does not perform well since its empirical sizes are too conservative or too liberal and therefore may be
unacceptable. The ARE values of the three tests are also presented at the bottom of the tables.
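The ARE criterion can be sketched in a few lines. The Python code below assumes that ARE is 100 times the mean of $|\hat\alpha_j - \alpha|/\alpha$ over the $M$ empirical sizes, and that the cut-offs are ARE $\le 10$ and ARE $\le 20$; the function names and the example sizes are hypothetical.

```python
def average_relative_error(empirical_sizes, alpha=0.05):
    """ARE = 100 * mean(|alpha_hat_j - alpha| / alpha) over the M empirical sizes."""
    if not empirical_sizes:
        raise ValueError("empirical_sizes must be non-empty")
    m = len(empirical_sizes)
    return 100.0 * sum(abs(a - alpha) for a in empirical_sizes) / (m * alpha)


def classify(are):
    """Map an ARE value to the verbal categories (assumed cut-offs 10 and 20)."""
    if are <= 10:
        return "very well"
    if are <= 20:
        return "reasonably well"
    return "not well"


# Hypothetical empirical sizes for one test (illustration only).
sizes = [0.049, 0.051, 0.047, 0.054]
are = average_relative_error(sizes)
print(round(are, 2), classify(are))
```

Because ARE averages relative deviations in both directions, it penalises conservative and liberal tests alike.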
First, we compare the modified Chow’s test, the ADF test and the PB test by
examining their empirical sizes, which are listed in the columns labeled "$\delta = 0$". For the bivariate homogeneous case, i.e. $\sigma_1 = \sigma_2$, the empirical sizes of the three tests are similar. As the dimension increases, the values for the ADF test show the largest deviation from
0.05 compared with the other two methods. Hence, we may conclude that the ADF test is the worst
at maintaining the empirical size. A similar observation can be made for the heteroscedastic cases.
When $\rho \neq 1$, the values in the second and third columns deviate more from 0.05 than those in the first column. Therefore, we can conclude that the modified Chow’s
test performs best in maintaining the empirical size for heteroscedastic cases. Although the ARE
of the PB test is larger than that of the modified Chow’s test, the PB test is still considered good as its ARE $\le 10$. Overall, the modified Chow’s test and the PB test perform better in maintaining the empirical size for the 2-sample case.
For the $\delta \neq 0$ cases, the power of the tests is listed in the tables below. The power of the tests increases as $\delta$ increases. For homogeneous variances, the three tests perform comparably well, with similar power. Under heteroscedasticity, the modified Chow’s test performs the worst, especially in the higher-dimension cases. The PB test also has larger power than the ADF test, which means that the PB test performs slightly
better than the ADF test for heteroscedastic cases.
Overall, for 2-sample cases, all three tests perform comparably well under homogeneity
for the bivariate case. For higher-dimension cases, the PB test is recommended as it maintains the
empirical size well and has the largest power among the three tests.
(σ1, σ2)      (n1, n2)   δ = 0                  δ = 0.5                δ = 1.0
                         MC     ADF    PB       MC     ADF    PB       MC     ADF    PB
(0.43, 1.35)  (25,25)    0.050  0.047  0.049    0.534  0.551  0.549    0.979  0.982  0.982
              (40,40)    0.046  0.045  0.046    0.773  0.776  0.777    1.000  1.000  1.000
              (50,30)    0.051  0.052  0.051    0.637  0.641  0.642    0.991  0.991  0.991
              (50,90)    0.048  0.046  0.047    0.946  0.947  0.947    1.000  1.000  1.000
(1.35, 0.43)  (25,25)    0.049  0.046  0.048    0.523  0.541  0.542    0.986  0.986  0.989
              (40,40)    0.053  0.054  0.053    0.779  0.779  0.784    0.995  0.995  0.995
              (50,30)    0.049  0.048  0.050    0.851  0.852  0.853    0.994  0.994  0.995
              (50,90)    0.051  0.048  0.051    0.896  0.898  0.898    1.000  1.000  1.000
Table 4.2 Empirical sizes and powers for 2-sample test (p=2)

(σ1, σ2)      (n1, n2)   δ = 0                  δ = 0.5                δ = 1.0
                         MC     ADF    PB       MC     ADF    PB       MC     ADF    PB
(0.43, 1.35)  (25,25)    0.052  0.046  0.053    0.718  0.763  0.768    0.991  0.991  0.991
              (40,40)    0.051  0.055  0.052    0.944  0.950  0.951    1.000  1.000  1.000
              (50,30)    0.047  0.046  0.047    0.848  0.866  0.866    1.000  1.000  1.000
              (50,90)    0.049  0.055  0.050    1.000  1.000  1.000    1.000  1.000  1.000
(1.35, 0.43)  (25,25)    0.050  0.047  0.048    0.704  0.758  0.761    1.000  1.000  1.000
              (40,40)    0.049  0.043  0.047    0.943  0.951  0.951    1.000  1.000  1.000
              (50,30)    0.053  0.057  0.055    0.971  0.978  0.978    1.000  1.000  1.000
              (50,90)    0.047  0.045  0.046    0.986  0.987  0.987    1.000  1.000  1.000
Table 4.3 Empirical sizes and powers for 2-sample test (p=5)

(σ1, σ2)      (n1, n2)   δ = 0                  δ = 0.5                δ = 1.0
                         MC     ADF    PB       MC     ADF    PB       MC     ADF    PB
(0.43, 1.35)  (25,25)    0.055  0.068  0.062    0.743  0.864  0.866    0.999  1.000  1.000
              (40,40)    0.047  0.055  0.053    0.985  0.990  0.990    1.000  1.000  1.000
              (50,30)    0.045  0.062  0.054    0.911  0.948  0.949    1.000  1.000  1.000
              (50,90)    0.049  0.046  0.048    1.000  1.000  1.000    1.000  1.000  1.000
(1.35, 0.43)  (25,25)    0.049  0.067  0.062    0.757  0.870  0.871    1.000  1.000  1.000
              (40,40)    0.049  0.054  0.052    0.985  0.993  0.994    1.000  1.000  1.000
              (50,30)    0.047  0.058  0.053    0.995  0.999  0.999    1.000  1.000  1.000
              (50,90)    0.040  0.044  0.045    1.000  1.000  1.000    1.000  1.000  1.000
Table 4.4 Empirical sizes and powers for 2-sample test (p=10)
4.2 Simulation B: Multi-sample cases
In this simulation, we compare the performance of the three tests for k-sample cases.
First, we consider the 3-sample case. The data generating procedures are similar to those of the 2-sample
case and the results are listed in the tables below. Under homogeneity, the PB test appears to
have the best performance in maintaining the empirical size for the bivariate case. When the variances
are not equal between the models, the AREs of the modified Chow’s test and the PB test are
smaller than the ARE of the ADF test. This indicates that the ADF test has the worst ability to
maintain the empirical size under heteroscedasticity for the bivariate case.
To compare the power of the three tests for the 3-sample case, we look at the values
presented in the columns labeled "$\delta \neq 0$" in Table 4.5. Under homogeneity, all the tests perform
comparably well as they have similar empirical power. However, under heteroscedasticity, the
power of the PB test is the largest among the three tests. Hence, in terms of power, the PB test gives
the best performance.
          (15,30,30)   0.050  0.056  0.052    0.195  0.355  0.368    0.740  0.909  0.915
          (30,15,15)   0.053  0.054  0.050    0.202  0.361  0.383    0.741  0.929  0.935
(1,1,4)   (15,15,15)   0.055  0.046  0.053    0.070  0.217  0.241    0.122  0.608  0.646
          (15,30,30)   0.047  0.044  0.049    0.081  0.331  0.351    0.197  0.891  0.897
          (30,15,15)   0.054  0.042  0.048    0.085  0.310  0.332    0.198  0.893  0.903
(1,2,1)   (15,15,15)   0.046  0.053  0.048    0.130  0.216  0.238    0.425  0.671  0.709
          (15,30,30)   0.049  0.059  0.054    0.177  0.348  0.367    0.760  0.924  0.929
          (30,15,15)   0.052  0.060  0.059    0.193  0.330  0.351    0.738  0.933  0.944
(1,4,1)   (15,15,15)   0.044  0.060  0.054    0.075  0.232  0.238    0.167  0.761  0.779
          (15,30,30)   0.040  0.030  0.032    0.073  0.328  0.347    0.204  0.911  0.918
          (30,15,15)   0.044  0.042  0.048    0.080  0.335  0.358    0.202  0.896  0.912
Table 4.5 Empirical sizes and powers for 3-sample test (p=2)
Results for the higher-dimension cases, i.e. p=5 and p=10, are presented in the
following tables. The ARE obtained for the ADF test is the largest
among the three tests. Thus, in terms of maintaining the empirical size, the ADF test is not
recommended. Even though the ARE of the modified Chow’s test is smaller than the ARE of
the PB test, the PB test is still acceptable as its ARE $\le 20$. From Tables 4.6 and 4.7, one can observe that the modified Chow’s test has the smallest power and that the power