Accounting for Cluster Sampling in Constructing Enumerative Sequential Sampling Plans Authors: A.. Rather, some type of hierarchical design is usually used, such as cluster sampling, whe
Trang 1BioOne sees sustainable scholarly publishing as an inherently collaborative enterprise connecting authors, nonprofit publishers, academic institutions, research libraries, and research funders in the common goal of maximizing access to critical research
Accounting for Cluster Sampling in Constructing Enumerative Sequential Sampling Plans
Author(s): A J Hamilton and G Hepworth
Source: Journal of Economic Entomology, 97(3):1132-1136.
Published By: Entomological Society of America
DOI: http://dx.doi.org/10.1603/0022-0493(2004)097[1132:AFCSIC]2.0.CO;2
URL: http://www.bioone.org/doi/
full/10.1603/0022-0493%282004%29097%5B1132%3AAFCSIC%5D2.0.CO%3B2
BioOne ( www.bioone.org ) is a nonprofit, online aggregation of core research in the
biological, ecological, and environmental sciences BioOne provides a sustainable online platform for over 170 journals and books published by nonprofit societies, associations, museums, institutions, and presses.
Your use of this PDF, the BioOne Web site, and all posted and associated content indicates your acceptance of BioOne’s Terms of Use, available at www.bioone.org/page/terms_of_use Usage of BioOne content is strictly limited to personal, educational, and non-commercial use Commercial inquiries or rights and permissions requests should be directed to the
individual publisher as copyright holder.
Trang 2Accounting for Cluster Sampling in Constructing Enumerative
Sequential Sampling Plans
A J HAMILTON1
ANDG HEPWORTH2
J Econ Entomol 97(3): 1132Ð1136(2004)
ABSTRACT GreenÕs sequential sampling plan is widely used in applied entomology GreenÕs
equa-tion can be used to construct sampling stop charts, and a crop can then be surveyed using a simple random sampling (SRS) approach In practice, however, crops are rarely surveyed according to SRS Rather, some type of hierarchical design is usually used, such as cluster sampling, where sampling units form distinct groups This article explains how to make adjustments to sampling plans that intend to use cluster sampling, a commonly used hierarchical design, rather than SRS The methodologies are
illustrated using diamondback moth, Plutella xylostella (L.), a pest of Brassica crops, as an example.
KEY WORDS cluster sampling, design effect, enumerative sampling, Plutella xylostella, sequential
sampling
WHEN CONSIDERING THE DISTRIBUTIONof individuals in a
population, the relationship between the population
variance and mean typically obeys a power law
(Tay-lor 1961) According to Tay(Tay-lorÕs Power Law (TPL),
the variance-mean relation can be described by the
following equation:
where s2and x are the sample variance and sample
mean respectively, a is a scaling factor that is
depen-dent on sampling method and habitat, and the
expo-nent b is a measure of aggregation For simplicity of
explanation, we always refer to the sample variance
(s2) and the sample mean (x or m) rather than the
population variance (2) and the population mean
() If b is not signiÞcantly different from unity, then
the population is assumed to have a random
distribu-tion When b ⬎ 1 the population is aggregated, and
when b ⬍ 1 the population has a regular distribution.
For a particular population, the parameters a and b can
be estimated by regression of ln s2on ln x:
ln s2⫽ ln aˆ ⫹ bˆ ln x [2]
where aˆ and bˆ are the estimates of a and b respectively.
TaylorÕs (1961) aˆ and bˆ values are used in GreenÕs
(1970) equation to construct a sequential sampling
plan A sampling plan derived from GreenÕs (1970)
formula makes the assumption that the Þeld will be
sampled according to SRS Most integrated pest
man-agement (IPM) sampling programs, however, use
some form of multistage systematic sampling, because
it is more practicable to implement Therefore, the sampling design needs to be taken into account when constructing a sequential sampling plan Most pub-lished enumerative plans do not do this, and thus implicitly assume that the Þeld will be sampled ac-cording to SRS (e.g., Allsopp et al 1992, Smith and Hepworth 1992, Cho et al 1995)
In this article, we show how to account for cluster sampling (Fig 1), a type of multistage design, when determining minimum sample sizes for a sequential sampling plan In cluster sampling, elementsÑthe in-dividual units from which data are collectedÑare sam-pled in clusters, each representing a primary (sam-pling) unit (Cochran 1977, pp 233 and 274) Elements are also sometimes called subunits, small-units, or sec-ondary sampling units because they are divisions of the primary sampling unit (Cochran 1977, pp 233 and 238) Note that here we are concerned with the sit-uation where the number of elements per cluster is already determined, as is often the case in practice If this were not the case, then the methods outlined by Hutchison (1994, pp 215Ð217) and Binns et al (2000,
pp 183Ð191) could be used to determine the number
of elements per cluster
Diamondback moth, Plutella xylostella (L.), is the
species used to illustrate the methodology In addition
to investigating cluster sampling issues, we take the opportunity to consider the spatial distribution of this
species, as described by TaylorÕs (1961) b parameter,
and to present a sampling plan for use in Australian broccoli crops
Materials and Methods Data Collection All data were collected from
broc-coli, Brassica oleracea variety botrytis L., Þelds
accord-ing to the followaccord-ing procedure Two transects were
1 Department of Primary Industries (KnoxÞeld), Private Bag 15,
Ferntree Gully Delivery Centre, Victoria 3156, Australia (e-mail:
Andrew.Hamilton@dpi.vic.gov.au).
2 Department of Mathematics and Statistics, The University of
Mel-bourne, Victoria, 3010, Australia.
0022-0493/04/1132Ð1136$04.00/0 䉷 2004 Entomological Society of America
Trang 3marked in a Þeld according to a standard V-shaped
pattern that extended from one corner of the Þeld, to
the midpoint on the other side, and back to the
ad-jacent corner (Hoy et al 1983, Torres and Hoy 2002)
Along each transect, there were Þve equidistantly
spaced sampling points At each point, the total
num-ber of larvae (all instars) on four nearby plants
(ele-ments) was recorded Each of these groups of four
plants formed a cluster, and there were 10 clusters,
resulting in 40 elements in total
All crops were planted in beds that consisted of two
staggered rows of plants The four plants chosen for
sampling were from the bed nearest to the sampling
point From this bed, the nearest two plants in the row
closest to the sampling point were sampled as well as
the two closest plants in the next row Thus, at each
cluster, equal numbers of elements from both sides of
the bed were sampled
Surveys were conducted on properties in the
Cran-bourne and Werribee vegetable growing regions on
MelbourneÕs outer eastern and western fringes,
re-spectively In total, 23 crops were surveyed from the
Werribee region and seven from around Cranbourne
Each crop was surveyed at approximately weekly
in-tervals from 1 wk after transplant through to harvest
(usually ⬇7 to 8 wk) Growers maintained normal
spray practices (typically calendar spraying)
through-out the period of data collection
Variance Estimation and Regression for Taylor’s a
and b Parameters The simplest way to estimate the
variance among all elements in the population (2)
would have been to use the variance associated with
the 40 elements in the sample This unadjusted sample
variance (s2) is, however, a slightly biased estimate of
2, because the elements were drawn from contiguous
groups (i.e., clusters) rather than at random We used
the method of Cochran (1977, p 239) to obtain an
(almost) unbiased variance estimate, s u2 However, for
our data set this adjustment led to negligible changes
in the estimates of TaylorÕs a and b parameters (⬍1.2
and ⬍0.09%, respectively)
TaylorÕs a and b parameters were estimated by re-gressing ln s u2 against ln xˆ Because there was error associated with estimating both s u2 and x, BartlettÕs
regression method was used (Bartlett 1949) The re-gression was done using a BASIC program (Legg 1986), and the data were sorted according to the independent variable To make inferences about the spatial distribution of the species, we tested the null hypothesis that the slope was not signiÞcantly
differ-ent from one by using the t-test procedure of Bartlett
(1949) There was one singleton observation (i.e., an observation where larvae were only observed in one element), and this was excluded from all analyses,
because the values of s2and x are restricted for sin-gleton observations (s2⫽ x), which leads to
pseudo-randomness (Taylor 1984)
Sample Size Adjustment The regression
parame-ters obtained from a TPL plot based on the s u2 esti-mates could be used to develop a sequential sampling plan using GreenÕs (1970) method However, such a plan assumes that SRS (or at least SRegS) would be used, and this is clearly not going to be so for plans that include clusters of elements (i.e., like the sampling procedure described above that was used to collect the data for this study) A plan based on these TaylorÕs
a and b values would give an underestimate of the
minimum sample size Consequently, the stop-lines for the sequential plan would also be incorrect We
adjusted for this by using the design effect (deff) (Kish
1995, pp 162 and 257) The deff is the ratio of the
variance obtained from a complex sample to the vari-ance obtained from a SRS of the same number of units (or elements) For the cluster sampling used in the broccoli Þelds,
This formula for the deff is sometimes written with a factor, 1 ⫺ f, in the denominator, where f is the
sam-pling fraction (sample size/population size) How-ever, in this study, and those to which the results will
be applied, f is negligible, and so the factor (being
close to 1) can be omitted
Note that the deff formula as deÞned by Kish (1995,
pp 162 and 257) uses the actual between cluster vari-ance rather than MSc Although the MSc is not a readily interpretable quantity, unlike the between
cluster variance, it makes the computation of the deff
more simple, because it can readily be calculated using nested analysis of variance (ANOVA)
In this way, the deff can be calculated separately for
each observation However, it is likely to vary sub-stantially across observations, and in most cases we
want to calculate an overall deff, for use in future
sampling plans Thus, it is necessary to perform a nested ANOVA covering all observations to calculate
MSc and s u2 If the data are unbalanced (e.g., variable numbers of elements per cluster across sampling oc-casions), then a restricted maximum likelihood anal-ysis should be used in preference to ANOVA
(Patter-Fig 1 Diagrammatic representation of cluster sampling
protocol used in broccoli Þelds
Trang 4son and Thompson 1971, Hepworth and Hamilton
2001)
Sampling Plan Construction The minimum sample
size (n min) for GreenÕs (1970) plan was calculated as
follows:
n min ⫽ s u2/共Dx兲2
where D is the nominal desired level of precision,
expressed in terms of the standard error as a fraction
of the mean The sequential sampling stop-line (T n)
was calculated as follows:
T n⫽冉D2
aˆ冊1/共ˆb-2兲
n min 共ˆb⫺1兲/共ˆb⫺2兲
where aˆ and bˆ are the TPL regression parameters,
estimated as described above
A sequential sampling chart is usually constructed
by plotting T n against n min(Green 1970) However,
here we needed to multiply n min by the deff to correct
for the fact that sampling with clusters of four plants
will be used for monitoring the crop Thus, T nwas
plotted against an adjusted minimum sample size
(n minA ⫽ n min deff) Note that the calculation of T nis
based on n min not n minA If n minA was used in the
calculation of T n ,and also in the plot, the inßuence of
the deff would cancel itself out.
Results and Discussion
TPL described the variance-mean relation well (r2
⫽ 0.90) (Fig 2) The slope of the Þtted line, 1.39, was
signiÞcantly greater than unity (t ⫽ 3.42, df ⫽ 23, P ⬍
0.01), indicating that overall the “population” had a
contagious distribution This is consistent with
obser-vations of the spatial distribution of P xylostella on
other Brassica vegetable crops On caulißower, P
xy-lostellalarvae (data pooled across all instars)
demon-strated a contagious distribution, with slopes of 1.41
and 1.31 for TPL and IwaoÕs (1968) mean crowding on
mean density plots, respectively (Chen and Su 1986)
In a study of P xylostella larvae in a commercial
cab-bage crop, TPL gradients of 1.35 and 1.26were
re-ported for data sets collected by two different scouts (Trumble et al 1989) Sivapragasam et al (1986)
de-scribed a contagious distribution for P xylostella eggs,
larvae (all instars pooled) and pupae on cabbages using IwaoÕs (1968) plots Fitting of the negative bi-nomial distribution has also been used to demonstrate departures from the Poisson distribution, and the
ex-ponent k can be used as a measure of contagion.
Harcourt (1960) Þtted the negative binomial distri-bution to eggs, each instar and pupae on cabbage For
all of these life stages, k remained reasonably stable
with changes in density and suggested a contagious distribution However, when considering all instars together on caulißower, Chen and Su (1986) could not
detect a constant k across different densities The
contagious distribution of larvae observed in the cur-rent study is probably a function of the tendency of females to lay groups of eggs on the one plant Al-though some dispersal of early instars, driven by crowding, and of mature larvae looking for pupation sites, is likely to occur (Harcourt 1960), this does not seem to be sufÞcient to lead to a noncontagious larval distribution
The overall deff in this example was 1.31 The fact that the deff was ⬎1 meant that more samples would
need to be collected to satisfy the stop rule in the
deff-corrected plan as opposed to the uncorrected
plan (Figs 3 and 4) The disparity between the two plans increased as the nominal level of precision
in-creased (i.e., as D dein-creased) Precision levels of
around 0.2 or 0.3 would be considered practicable for pest sampling (Southwood 1978), whereas higher
pre-cision, e.g., D ⫽ 0.1, is usually desired for research
purposes (Manly 1989) Thus, the unadjusted pling plan would underestimate the number of sam-ples that would need to be collected to achieve the desired level of precision This is particularly so at mean densities of about 0.5 to 1 larvae per plant (Fig 2), and such densities are not uncommon in sprayed crops in practice (Fig 2) In general, the sample sizes
demanded by the D ⫽ 0.1 and 0.2 sampling plans (Fig.
2) would be considered practicable, but this would not
be so for the D ⫽ 0.1 plan Thus, in Australia and New
Fig 2 TaylorÕs power plot for diamondback moth in
broccoli ln s2⫽ 0.63 ⫹ 1.39 ln x, r2⫽ 0.90
Fig 3 Minimum sample sizes for three levels of
preci-sion Solid lines indicate that the sample size has been
ad-justed using the deff, whereas dashed lines are unadad-justed.
Trang 5Zealand, where P xylostella is typically the only major
lepidopteran pest, this plan could provide a useful
alternative to the binomial plans of Hamilton et al
(2004) and Mo et al (2004)
The deff has been used here to adjust the minimum
sample size for GreenÕs (1970) sequential plan, but it
can be applied to any type of sequential sampling plan
For example, it could also be used to adjust the
min-imum sample size for KunoÕs (1969) plan, which is the
other most commonly used enumerative sequential
plan The deff has also been used to adjust the variance
estimate used in a presence-absence sampling plan
and was found to have a particularly marked inßuence
when a small proportion of the sampling units was
infested (Hepworth and McFarlane 1992)
The deff was used here to adjust for a two-stage
design, but it could also be used to correct for
multi-stage designs Furthermore, it is appropriate for
hier-archical designs other than clustering, such as
strati-Þed sampling However, whereas the deff would
usually be expected to be ⬎1 for cluster sampling, it
would typically be ⬍1 for stratiÞed samples (Kish
1995, p 259) The deff is ⬎1 for most cluster samples
because the elements within the clusters are often
correlated In contrast, the very purpose of stratiÞed
sampling is usually to reduce the overall variance
as-sociated with the estimate of the mean Thus, the
variance estimate obtained from a stratiÞed sample
would be expected to be less than that from a SRS of
the same number of units or elements, which would
lead to a deff of ⬍1.
The deff could also be used to adjust for a design
similar to that used in this study, but with the number
of elements per cluster (d) being different from four
(the number used here) The numerator in equation
1 can be expressed as MSc ⫽ ds c⫹ MSc where s cis
the between cluster variance, which is easily
calcu-lated from the ANOVA Substituting a different
num-ber for d results in a deff different from 1.31 For
example, d ⫽ 2 results in deff ⫽ 1.10, and d ⫽ 8 results
in deff ⫽ 1.72 As expected, a smaller number of
ele-ments per cluster results in a variance more similar to
that of a SRS with the same total number of elements,
and a larger number of elements per cluster results in
a variance that is much greater
Acknowledgments
This work was funded by the AusVeg Levy and Horticul-ture Australia We are grateful to Nancy Endersby and others who scouted the crops and to several growers for allowing us
to sample properties Peter Ridland provided useful com-ments on a draft of the manuscript
References Cited
Allsopp, P G., T L Ladd, Jr., and M G Klein 1992 Sample
sizes and distributions of Japanese beetles (Coleoptera: Scarabaeidae): captured in lure traps J Econ Entomol 85: 1797Ð1801
Bartlett, M S 1949 Fitting a straight line when both
vari-ables are subject to error Biometrics 5: 207Ð212
Binns, M R., J P Nyrop, and W van der Werf 2000
Sam-pling and monitoring in crop protection: the theoretical basis for developing practical decision guides CABI Pub-lishing, New York
Chen, C., and W Su 1986 Spatial pattern and
transforma-tion of Þeld counts of Plutella xylostella (L.) larvae on
caulißower Plant Protect Bull (Taiwan, R.O.C.) 28: 323Ð 333
Cho, K., C S Eckel, J F Walgenbach, and G G Kennedy.
1995 Spatial distribution and sampling procedures for
Frankliniellaspp (Thysanoptera: Thripidae) in staked tomato J Econ Entomol 88: 1658Ð1665
Cochran, W G 1977 Sampling techniques Wiley, New
York
Green, R H 1970 On Þxed precision level sequential
sam-pling Res Popul Ecol 12: 249Ð251
Hamilton, A J., N Schellhorn, N E Endersby, P M Ridland, and S A Ward 2004 A dynamic binomial sequential
sampling plan for Plutella xylostella (Lepidoptera:
Plute-llidae) on broccoli and caulißower in Australia J Econ Entomol 97: 127Ð135
Harcourt, D G 1960 Distribution of the immature stages of
the diamondback moth, Plutella maculipennis (Curt.)
(Lepidoptera: Plutellidae), on cabbage Can Entomol 92: 517Ð521
Hepworth, G., and A J Hamilton 2001 Scan sampling and
waterfowl activity budget studies: design and analysis considerations Behavior 138: 1391Ð1405
Hepworth, G., and J R McFarlane 1992 Variance of the
estimated population density from a presence-absence threshold sample J Econ Entomol 85: 2240Ð2245
Hoy, C W., C Jennison, A M Shelton, and J T Andaloro.
1983 Variable-intensity sampling: a new technique for
decision making in cabbage pest management J Econ Entomol 76: 139Ð143
Hutchison, W D 1994 Sequential sampling to determine
population density, pp 207Ð244 In L P Pedigo and G D.
Buntin [eds.], Handbook of sampling methods for ar-thropods in agriculture CRC, Boca Raton, FL
Iwao, S 1968 A new regression method for analyzing the
aggregation pattern of animal populations Res Popul Ecol 10: 1Ð20
Kish, L K 1995 Survey sampling Wiley, New York Kuno, E 1969 A new method of sequential sampling to
obtain population estimates with a Þxed level of precision Res Popul Ecol 11: 127Ð136
Fig 4 Critical stop-lines for three levels of precision.
Solid and dashed lines as for Fig 2 Note the different tick
intervals for the upper and lower portions of the y-axis
Trang 6Legg, D E 1986 An interactive computer program for
cal-culating BartlettÕs regression method North Central
Computer Inst Software J 2: 1Ð23
Manly, B F 1989 A review of methods for the analysis of
stage-frequency data, pp 3Ð69 In L McDonald, B Manly,
J Lockwood, and J Logan [eds.], Estimation and analysis
of insect populations Springer, Berlin, Germany
Mo, J., G Baker, and M Keller 2004 Evaluation of
sequen-tial binomial sampling plans for decision-making in the
management of diamondback moth (Plutella xylostella)
(Plutellidae: Lepidoptera) In Proceedings of the 4th
In-ternational Workshop on the Management of
Diamond-back Moth and other Crucifer Pests, 15Ð26November
2001, Melbourne, Australia Department of Natural
Re-sources and Environment, Melbourne, Australia
Patterson, H D., and R Thompson 1971 Recovery of
inter-block information when inter-block sizes are unequal
Bi-ometrika 58: 545Ð554
Sivapragasam, A., Y Ito, and T Saito 1986 Distribution
patterns of immatures of the diamondback moth, Plutella
xylostella(L.) (Lepidoptera: Yponomeutidae) and its
lar-val parasitoid on cabbages Appl Entomol Zool 21: 546Ð 552
Smith, A M., and G Hepworth 1992 Sampling statistics
and a sampling plan for eggs of pea weevil (Coleoptera: Bruchidae) J Econ Entomol 85: 1791Ð1796
Southwood, T.R.E 1978 Ecological methods: with
particu-lar reference to the study of insect populations Chapman
& Hall, London, England
Taylor, L R 1961 Aggregation, variance and the mean.
Nature (Lond.) 189: 732Ð735
Taylor, L R 1984 Assessing and interpreting the spatial
distributions of insect populations Annu Rev Entomol 29: 321Ð357
Torres, A N., and C W Hoy 2002 Sampling scheme for
carrot weevil (Coleoptera: Curculionidae) in parsley J Econ Entomol 31: 1251Ð1258
Trumble, R H., M J Brewer, A M Shelton, and J P Nyrop.
1989 Transportability of Þxed-precision level sampling
plans Res Popul Ecol 31: 325Ð342
Received 23 June 2003; accepted 22 January 2004.