Statistics in Geophysics: Inferential Statistics III
Steffen Unkel
Department of Statistics Ludwig-Maximilians-University Munich, Germany
Tests not requiring assumptions involving specific parametric distributions for the data or for the sampling distribution of the test statistics are called nonparametric.
Nonparametric methods are appropriate if
1. we know or suspect that the parametric assumption(s) required for a particular test are not met;
2. a test statistic that is suggested or dictated by the problem at hand is a complicated function of the data, and its sampling distribution is unknown and/or cannot be derived analytically.
Only a few nonparametric tests for location will be presented here.
One-sample Wilcoxon signed-rank test
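Let $X_1, \ldots, X_n$ be a random sample with continuous cdf $F_X(\cdot)$.
Suppose that it is desired to test that the 0.5 quantile, $x_{\mathrm{med}}$, of the population sampled from is a specific value, say $\delta_0$. Consider the test problems:
(a) $H_0: x_{\mathrm{med}} = \delta_0$ vs. $H_1: x_{\mathrm{med}} \neq \delta_0$
(b) $H_0: x_{\mathrm{med}} \geq \delta_0$ vs. $H_1: x_{\mathrm{med}} < \delta_0$
(c) $H_0: x_{\mathrm{med}} \leq \delta_0$ vs. $H_1: x_{\mathrm{med}} > \delta_0$
For $i = 1, \ldots, n$, let $D_i = X_i - \delta_0$ and define
$$Z_i = \begin{cases} 1 & \text{if } D_i > 0 \\ 0 & \text{if } D_i < 0. \end{cases}$$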
Test statistic:
$$W^+ = \sum_{i=1}^{n} R_i Z_i,$$
where $R_i$ is the rank of $|D_i|$.
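The statistic can be sketched in a few lines of plain Python. The function name `signed_rank_statistic` is hypothetical. The handling of zeros (dropped) and tied absolute differences (average ranks) is an assumption added here; under the continuous-cdf assumption of the slides, neither occurs with positive probability.

```python
def signed_rank_statistic(x, delta0=0.0):
    """Return W+ = sum of the ranks of |D_i| over all i with D_i > 0."""
    # D_i = X_i - delta0. Zero differences are dropped (assumption: the
    # slides define Z_i only for D_i > 0 and D_i < 0).
    d = [xi - delta0 for xi in x if xi != delta0]

    # Indices of d sorted by |D_i| in ascending order.
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))

    # Assign ranks 1..n; tied |D_i| share the average of their ranks
    # (assumption: midranks, a common convention not stated in the slides).
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Z_i = 1 exactly when D_i > 0, so only those ranks are summed.
    return sum(r for r, di in zip(ranks, d) if di > 0)


sample = [1.5, -0.5, 2.0, -3.0, 0.7]
w_plus = signed_rank_statistic(sample, delta0=0.0)
```

For this sample the absolute differences 0.5, 0.7, 1.5, 2.0, 3.0 receive ranks 1 to 5, and the positive differences (0.7, 1.5, 2.0) contribute ranks 2, 3 and 4.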
Rejection region:
(a) $W^+ > w^+_{1-\alpha/2}$ or $W^+ < w^+_{\alpha/2}$
(b) $W^+ < w^+_{\alpha}$
(c) $W^+ > w^+_{1-\alpha}$,
where $w^+_{\alpha}$ denotes the $\alpha$-quantile of the distribution of $W^+$.
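The quantiles of $W^+$ can be obtained from its exact null distribution. The sketch below assumes no zeros or ties, so that under the null hypothesis each rank $1, \ldots, n$ enters the sum independently with probability 1/2. It also assumes the convention that the $p$-quantile is the smallest $w$ with $P(W^+ \leq w) \geq p$, which the slides do not spell out. The function names are hypothetical.

```python
def signed_rank_null_pmf(n):
    """Exact null pmf of W+ for sample size n (no ties, no zeros assumed).

    Entry w of the returned list is P(W+ = w), for w = 0, ..., n(n+1)/2.
    """
    max_w = n * (n + 1) // 2
    # counts[w] = number of subsets of {1, ..., n} whose elements sum to w.
    counts = [0] * (max_w + 1)
    counts[0] = 1
    for k in range(1, n + 1):
        # Iterate downwards so that each rank k is used at most once.
        for w in range(max_w, k - 1, -1):
            counts[w] += counts[w - k]
    total = 2 ** n  # every sign pattern is equally likely under H0
    return [c / total for c in counts]


def quantile(pmf, p):
    """Smallest w with P(W+ <= w) >= p (assumed quantile convention)."""
    cumulative = 0.0
    for w, prob in enumerate(pmf):
        cumulative += prob
        if cumulative >= p:
            return w
    return len(pmf) - 1


pmf = signed_rank_null_pmf(10)
w_lower = quantile(pmf, 0.05)  # test (b) rejects if W+ < w_lower
w_upper = quantile(pmf, 0.95)  # test (c) rejects if W+ > w_upper
```

With this convention the one-sided tests are conservative: the probability of rejecting under the null hypothesis does not exceed $\alpha$.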
One-sample Wilcoxon signed-rank test
For sufficiently large samples: Approximation by
$$N\left(\frac{n(n+1)}{4}, \frac{n(n+1)(2n+1)}{24}\right).$$
Test statistic:
$$Z = \frac{W^+ - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \overset{a}{\sim} N(0, 1).$$
Rejection region:
(a) $Z > z_{1-\alpha/2}$ or $Z < z_{\alpha/2}$
(b) $Z < z_{\alpha}$
(c) $Z > z_{1-\alpha}$,
where $z_{\alpha}$ is the $\alpha$-quantile of the standard normal distribution.
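The large-sample version of the test can be sketched as follows, using `statistics.NormalDist` from the Python standard library for the normal quantiles. The function names and the example values ($n = 20$, $W^+ = 165$) are illustrative assumptions, not data from the slides.

```python
from math import sqrt
from statistics import NormalDist


def signed_rank_z(w_plus, n):
    """Standardize W+ with its null mean n(n+1)/4 and variance n(n+1)(2n+1)/24."""
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    return (w_plus - mean) / sqrt(variance)


def reject(z, alpha=0.05, alternative="two-sided"):
    """Apply rejection regions (a), (b), (c) based on the standard normal."""
    std_normal = NormalDist()
    if alternative == "two-sided":   # (a) Z > z_{1-alpha/2} or Z < z_{alpha/2}
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if alternative == "less":        # (b) Z < z_alpha
        return z < std_normal.inv_cdf(alpha)
    if alternative == "greater":     # (c) Z > z_{1-alpha}
        return z > std_normal.inv_cdf(1 - alpha)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


z = signed_rank_z(165, n=20)  # hypothetical observed W+ for n = 20
decision = reject(z, alpha=0.05, alternative="two-sided")
```

The two-sided rule uses the symmetry $z_{\alpha/2} = -z_{1-\alpha/2}$, so comparing $|Z|$ with $z_{1-\alpha/2}$ is equivalent to region (a).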
We assume that the sampling situation is such that we observe paired data $(X_1, Y_1), \ldots, (X_n, Y_n)$.
For $i = 1, \ldots, n$, the differences $D_i = X_i - Y_i$ arise from a continuous distribution, and the pairs $(X_i, Y_i)$ are chosen randomly and independently.
The null hypothesis is that the median difference, $\delta$, between pairs of observations is zero.
Consider the test problems:
(a) $H_0: \delta = 0$ vs. $H_1: \delta \neq 0$
(b) $H_0: \delta \geq 0$ vs. $H_1: \delta < 0$
(c) $H_0: \delta \leq 0$ vs. $H_1: \delta > 0$
Wilcoxon signed-rank test for paired data
Define
$$Z_i = \begin{cases} 1 & \text{if } D_i > 0 \\ 0 & \text{if } D_i < 0. \end{cases}$$
Test statistic:
$$W^+ = \sum_{i=1}^{n} R_i Z_i,$$
where $R_i$ is the rank of $|D_i|$.
Rejection region:
(a) $W^+ > w^+_{1-\alpha/2}$ or $W^+ < w^+_{\alpha/2}$
(b) $W^+ < w^+_{\alpha}$
(c) $W^+ > w^+_{1-\alpha}$,
where $w^+_{\alpha}$ denotes the $\alpha$-quantile of the distribution of $W^+$.
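For paired data the only new step is forming the differences $D_i = X_i - Y_i$. A minimal sketch, assuming no zero differences and no tied $|D_i|$ (as implied by the continuity assumption); the function name and the example measurements are made up for illustration.

```python
def paired_signed_rank(x, y):
    """Return W+ for paired samples x and y, using D_i = X_i - Y_i."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    d = [xi - yi for xi, yi in zip(x, y)]

    # This sketch deliberately refuses zeros and ties instead of choosing a
    # correction, since the slides assume a continuous distribution.
    if any(di == 0 for di in d) or len({abs(di) for di in d}) != len(d):
        raise ValueError("zero differences or tied |D_i| are not handled here")

    # After sorting by |D_i|, the position in the list (from 1) is the rank R_i.
    ranked = sorted(d, key=abs)
    return sum(rank for rank, di in enumerate(ranked, start=1) if di > 0)


before = [10, 12, 9, 15]  # hypothetical first measurements
after = [8, 13, 5, 12]    # hypothetical second measurements
w_plus_paired = paired_signed_rank(before, after)
```

Here the differences are 2, -1, 4, 3, so the ranks of $|D_i|$ are 2, 1, 4, 3 and the positive differences contribute 2 + 4 + 3.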
Given two samples of independent data, the aim is to test for a possible difference in location.
The null hypothesis is that the two data samples have been drawn from the same distribution.
Under $H_0$ there are $n + m$ observations making up a single distribution, where $n$ ($m$) denotes the number of observations in sample 1 (sample 2).
The test statistic is a function of the ranks of the data values within the $n + m$ observations that are pooled under $H_0$.
Wilcoxon rank-sum test
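Let $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ be two random samples from populations with continuous cdfs $F_X(\cdot)$ and $F_Y(\cdot)$, respectively.
Consider the test problems:
(a) $H_0: x_{\mathrm{med}} = y_{\mathrm{med}}$ vs. $H_1: x_{\mathrm{med}} \neq y_{\mathrm{med}}$
(b) $H_0: x_{\mathrm{med}} \geq y_{\mathrm{med}}$ vs. $H_1: x_{\mathrm{med}} < y_{\mathrm{med}}$
(c) $H_0: x_{\mathrm{med}} \leq y_{\mathrm{med}}$ vs. $H_1: x_{\mathrm{med}} > y_{\mathrm{med}}$.
Arrange the $n + m$ observations of the pooled sample $X_1, \ldots, X_n, Y_1, \ldots, Y_m$ in ascending order.
Define
$$V_i = \begin{cases} 1 & \text{if the } i\text{-th order statistic belongs to the } X \text{ sample} \\ 0 & \text{if the } i\text{-th order statistic belongs to the } Y \text{ sample.} \end{cases}$$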
Test statistic:
$$W_{n,m} = \sum_{i=1}^{n+m} i V_i = \sum_{i=1}^{n} R(X_i),$$
where $R(X_i)$ is the rank of $X_i$ in the pooled sample.
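Both forms of the statistic can be sketched directly. The version below computes $\sum_i R(X_i)$ by counting, which is quadratic in $n + m$ but keeps the definition visible. Average ranks for tied values are an assumption added here, since ties have probability zero under the continuity assumption. The function name and the data are hypothetical.

```python
def rank_sum_statistic(x, y):
    """Return W_{n,m}: the sum of the pooled-sample ranks of the X observations."""
    pooled = list(x) + list(y)
    w = 0.0
    for xi in x:
        smaller = sum(1 for v in pooled if v < xi)
        equal = sum(1 for v in pooled if v == xi)  # counts xi itself
        # Without ties, equal == 1 and the rank is smaller + 1.
        # With ties, this is the average of the ranks the tied values share.
        w += smaller + (equal + 1) / 2
    return w


x_sample = [1.1, 3.4, 5.0]
y_sample = [2.2, 4.1, 6.3, 7.7]
w_nm = rank_sum_statistic(x_sample, y_sample)

# Equivalent form sum_i i * V_i over the ordered pooled sample (no ties here).
ordered = sorted([(v, 1) for v in x_sample] + [(v, 0) for v in y_sample])
w_nm_alt = sum(i * v_i for i, (_, v_i) in enumerate(ordered, start=1))
```

In the pooled ordering the X observations occupy positions 1, 3 and 5, so both computations give the same value.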
Rejection region:
(a) $W_{n,m} > w_{1-\alpha/2}(n, m)$ or $W_{n,m} < w_{\alpha/2}(n, m)$
(b) $W_{n,m} < w_{\alpha}(n, m)$
(c) $W_{n,m} > w_{1-\alpha}(n, m)$,
where $w_{\alpha}(n, m)$ denotes the $\alpha$-quantile of the distribution of $W_{n,m}$.
For sufficiently large samples: Approximation by
$$N\left(\frac{n(n + m + 1)}{2}, \frac{nm(n + m + 1)}{12}\right).$$
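The normal approximation for the rank-sum statistic can be sketched in the same way as for the signed-rank test. The function names and the example values ($n = 10$, $m = 12$, $W_{n,m} = 150$) are assumptions for illustration. The two-sided p-value $2\,(1 - \Phi(|Z|))$ is a standard companion to rejection region (a), although the slides only state the region.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_z(w, n, m):
    """Standardize W_{n,m} with null mean n(n+m+1)/2 and variance nm(n+m+1)/12."""
    mean = n * (n + m + 1) / 2
    variance = n * m * (n + m + 1) / 12
    return (w - mean) / sqrt(variance)


def two_sided_p_value(z):
    """Approximate two-sided p-value from the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


z_nm = rank_sum_z(150, n=10, m=12)  # hypothetical observed rank sum
p_value = two_sided_p_value(z_nm)
reject_h0 = p_value < 0.05          # same decision as region (a) at alpha = 0.05
```

Note that $n$ must be the size of the sample whose ranks are summed (the X sample); swapping $n$ and $m$ changes the null mean.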