of Y given Z). However, it is also easy to see that the value $\hat{\phi}_s$ maximising the third and fourth terms is not the face value estimator of $\phi$. In fact, it is defined by the estimating equation
Recollect that our aim here is estimation of the marginal population expectation $\mu$ of Y. The maximum sample likelihood estimate of this quantity is then

$$\hat{\mu}_s = \iint y \, f_U(y \mid z; \hat{\beta}_s) \, f_U(z; \hat{\phi}_s) \, dy \, dz.$$

This can be calculated via numerical integration.
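Now suppose Y = Z. In this case $\Pr(I_t = 1 \mid Y_t = y_t) \propto y_t$, so $\Pr(I_t = 1) \propto E(Y_t) = 1/\theta$, and the logarithm of the sample likelihood becomes

$$\ln(L_s(\theta)) = \ln\left(\prod_{t \in s} \frac{\Pr(I_t = 1 \mid Y_t = y_t) \, f_U(y_t; \theta)}{\Pr(I_t = 1; \theta)}\right) \propto \ln \prod_{t \in s} \cdots$$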
Finally we consider cut-off sampling, where $\Pr(I_t = 1) = \Pr(Y_t > K) = e^{-\theta K}$. Here

$$E_U(\hat{\mu}_s) = E_U(E_s(\hat{\mu}_s)) = E_U(E(\bar{y}_s \mid y_t > K;\ t \in s) - K) = \mu.$$
2.4 PSEUDO-LIKELIHOOD
This approach is now widely used, forming as it does the basis for the methods implemented in a number of software packages for the analysis of complex survey data. The basic idea had its origin in Kish and Frankel (1974), with Binder (1983) and Godambe and Thompson (1986) making major contributions. SHS (section 3.4.4) provides an overview of the method.
Essentially, pseudo-likelihood is a descriptive inference approach to likelihood-based analytic inference. Let $f_U(y_U; \theta)$ denote a statistical model for the probability density of the matrix $y_U$ corresponding to the N population values of the survey variables of interest. Here $\theta$ is an unknown parameter and the aim is to estimate its value from the sample data. Now suppose that $y_U$ is observed. The MLE for $\theta$ would then be defined as the solution to an estimating equation of the form $sc_U(\theta) = 0$, where $sc_U(\theta)$ is the score function for $\theta$ defined by $y_U$. However, for any value of $\theta$, the value of $sc_U(\theta)$ is also a finite population parameter that can be estimated using standard methods. In particular, let $\hat{s}_U(\theta)$ be such an estimator of $sc_U(\theta)$. Then the maximum pseudo-likelihood estimator of $\theta$ is the solution to the estimating equation $\hat{s}_U(\theta) = 0$. Note that this estimator is not unique, since it depends on the method used to estimate $sc_U(\theta)$.
Here $\pi_t$ is the sample inclusion probability of unit t. Setting this estimator equal to zero and solving for $\theta$, and hence (by inversion) $\mu$, we obtain the Horvitz–Thompson maximum pseudo-likelihood estimator of $\mu$. This is

$$\hat{\mu}_{HT} = \Big(\sum_s \pi_t^{-1}\Big)^{-1} \sum_s y_t \pi_t^{-1},$$

which is the Hajek estimator of the population mean of Y. Under probability proportional to Z sampling this estimator reduces to
$$\hat{\mu}_{HT} = \Big(\sum_s z_t^{-1}\Big)^{-1} \sum_s y_t z_t^{-1},$$

while for the case of size-biased sampling (Y = Z) it reduces to the harmonic mean of the sample Y-values

$$\hat{\mu}_{HT} = n \Big(\sum_s y_t^{-1}\Big)^{-1}.$$
This last expression can be compared with the full information maximum likelihood estimator (the known population mean of Y) and the maximum sample likelihood estimator (half the sample mean of Y) for this case. As an aside we note that where cut-off sampling is used, so population units with Y greater than a known constant K are sampled with probability one with the remaining units having zero probability of sample inclusion, no design-unbiased estimator of $sc_U(\theta)$ can be defined and so no design-based pseudo-likelihood estimator exists.
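The reductions described above can be checked numerically. The following is a minimal sketch, not taken from the text: the function names `hajek_mean` and `harmonic_mean`, the sample values and the proportionality constant are all hypothetical, chosen only to show that inverse-probability weighting with probabilities proportional to y gives the harmonic mean, and to set it beside the "half the sample mean" estimator quoted for this case.

```python
import numpy as np

def hajek_mean(y, pi):
    """Weighted mean of y with weights 1/pi (Hajek-type estimator)."""
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(pi, dtype=float)
    return float(np.sum(w * y) / np.sum(w))

def harmonic_mean(y):
    """n divided by the sum of reciprocals of y."""
    y = np.asarray(y, dtype=float)
    return float(len(y) / np.sum(1.0 / y))

# Hypothetical sample values, for illustration only.
y_s = np.array([1.0, 2.0, 4.0])

# Size-biased sampling (Y = Z): inclusion probabilities proportional to y.
# The constant 0.1 is arbitrary and cancels in the ratio.
pi_s = 0.1 * y_s

mu_pseudo = hajek_mean(y_s, pi_s)         # pseudo-likelihood estimate
mu_sample_ml = float(np.mean(y_s)) / 2.0  # half the sample mean

print(mu_pseudo, harmonic_mean(y_s), mu_sample_ml)
```

The two estimators generally differ in any one sample, which is the comparison the paragraph above invites.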
Inference under pseudo-likelihood can be design based or model based. Thus, variance estimation is usually carried out using a combination of a Taylor series linearisation argument and an appropriate method (design based or model based) for estimating the variance of $\hat{s}_U(\theta)$ (see Binder, 1983; SHS, section 3.4.4). We write
In particular, Binder and Roberts explore the link between analytic (i.e. model-based) and design-based inference for a class of linear parameters corresponding to the expected values of population sums (or means) under an appropriate model for the population. From a design-based perspective, design-unbiased or design-consistent estimators of these population sums should then be good estimators of these expectations for large sample sizes, and so should have a role to play in analytic inference. Furthermore, since a solution to a population-level estimating equation can usually be approximated by such a sum, the class of pseudo-likelihood estimators can be represented in this way. Below we show how the total variation theory developed by Binder and Roberts applies to these estimators.
Following these authors, we assume that the sampling method is noninformative given Z. That is, conditional on the values $z_U$ of a known population covariate Z, the population distributions of the variables of interest Y and the sample inclusion indicator I are independent. An immediate consequence is that the joint distribution of Y and I given $z_U$ is the product of their two corresponding `marginal' (i.e. conditional on $z_U$) distributions.
To simplify notation, conditioning on the values in $i_U$ and $z_U$ is denoted by a subscript $\xi$, and conditioning on the values in $y_U$ and $z_U$ by a subscript p. The situation where conditioning is only with respect to the values in $z_U$ is denoted by a subscript $\xi p$ and we again do not distinguish between a random variable and its realisation. Under noninformative sampling, the $\xi p$-expectation of a function $g(y_U, i_U)$ of both $y_U$ and $i_U$ (its total expectation) is

$$E_{\xi p}[g(y_U, i_U)] = E_U[E_U(g(y_U, i_U) \mid y_U, z_U) \mid z_U] = E_\xi[E_p(g(y_U, i_U))]$$

since the random variable inside the square brackets on the right hand side only depends on $y_U$ and $z_U$, and so its expectation given $z_U$ and its expectation conditional on $i_U$ and $z_U$ are the same. The corresponding total variance for $g(y_U, i_U)$ is
$$E_p(\hat{V}_p(B_N)) = \sum_{t=1}^{N} \sum_{u=1}^{N} \Delta_{tu} U_t(B_N) U_u(B_N) + o(n^{-1})$$

and so is unbiased for the total variance of $\hat{B}$ to the same degree of approximation,
$$\mathrm{var}_\xi(\hat{B} - B_N) = \sum_{t=1}^{N} \sum_{u=1}^{N} \cdots$$
So far we have assumed noninformative sampling and a correctly specified model. What if either (or both) of these assumptions are wrong? It is often said that design-based inference remains valid in such a situation because it does not depend on model assumptions. Binder and Roberts justify this claim in Chapter 3, provided one accepts that the expected value $E_\xi(B_N)$ of the finite population parameter $B_N$ under the true model remains the target of inference under such mis-specification. It is interesting to speculate on practical conditions under which this would be the case.
2.6 BAYESIAN INFERENCE FOR SAMPLE SURVEYS
So far, the discussion in this chapter has been based on frequentist arguments. However, the Bayesian method has a strong history in survey sampling theory (Ericson, 1969; Ghosh and Meeden, 1997) and offers an integrated solution to both analytic and descriptive survey sampling problems, since no distinction is made between population quantities (e.g. population sums) and model parameters. In both cases, the inference problem is treated as a prediction problem. The Bayesian approach therefore has considerable theoretical appeal. Unfortunately, its practical application has been somewhat limited to date by the need to specify appropriate priors for unknown parameters, and by the lack of closed form expressions for estimators when one deviates substantially from normal distribution population models. Use of improper noninformative priors is a standard way of getting around the first problem, while modern, computationally intensive techniques like Markov chain Monte Carlo methods now allow the fitting of extremely sophisticated non-normal models to data. Consequently, it is to be expected that Bayesian methods will play an increasingly significant role in survey data analysis.
In Chapter 4 Little gives an insight into the power of the Bayesian approach when applied to sample survey data. Here we see again the need to model the joint population distribution of the values $y_U$ of the survey variables and the sample inclusion indicator $i_U$ when analysing complex survey data, with analytic inference about the parameters $(\theta, \omega)$ of the joint population distribution of these values then based on their joint posterior distribution. This posterior distribution is defined as the product of the joint prior distribution of these parameters given the values $z_U$ of the design variable Z times their likelihood, obtained by integrating the joint population density of $y_U$ and $i_U$ over the unobserved non-sample values in $y_U$. Descriptive inference about a characteristic Q of the finite population values of Y (e.g. their sum) is then based on the posterior density of Q. This is the expected value, relative to the posterior distribution for $\theta$ and $\omega$, of the conditional density of Q given the population values $z_U$ and $i_U$ and the sample values $y_s$.
Following Rubin (1976), Little defines the selection process to be ignorable when the marginal posterior distribution of the parameter $\theta$ characterising the population distribution of $y_U$ obtained from this joint posterior reduces to the usual posterior for $\theta$ obtained from the product of the marginal prior and marginal likelihood for this parameter (i.e. ignoring the outcome of the sample selection process). Clearly, a selection process corresponding to simple random sampling does not depend on $y_U$ or on any unknown parameters and is therefore ignorable. In contrast, stratified sampling is only ignorable when the population model for $y_U$ conditions on the stratum indicators. Similarly, a two-stage sampling procedure is ignorable provided the population model conditions on cluster information, which in this case corresponds to cluster indicators, and allows for within-cluster correlation.
2.7 APPLICATION OF THE LIKELIHOOD PRINCIPLE IN DESCRIPTIVE INFERENCE
Section 2.2 above outlined the maximum likelihood method for analytic inference from complex survey data. In Chapter 5, Royall discusses the related problem of applying the likelihood method to descriptive inference. In particular, he develops the form of the likelihood function that is appropriate when the aim is to use the likelihood principle (rather than frequentist arguments) to measure the evidence in the sample data for a particular value for a finite population characteristic.
To illustrate, suppose there is a single survey variable Y, the descriptive parameter of interest is its finite population total T, and a sample of size n is taken from this population and values of Y observed. Let $y_s$ denote the vector of these sample values. Using an argument based on the fact that the ratio of values taken by the likelihood function for one value of a parameter compared with another is the factor by which the prior probability ratio for these values of the parameter is changed by the observed sample data to yield the corresponding posterior probability ratio, Royall defines the value of the likelihood function for T at an arbitrary value q of this parameter as proportional to the value of the conditional density of $y_s$ given T = q. That is, using $L_U(q)$ to denote this likelihood function,

$$L_U(q) = f_U(y_s \mid T = q) = \frac{f_U(q \mid y_s) \, f_U(y_s)}{f_U(q)}.$$
Clearly, this definition is easily extended to give the likelihood for any well-defined function of the population values of Y. In general $L_U(q)$ will depend on the values of nuisance parameters associated with the various densities above. That is, $L_U(q) = L_U(q; \theta)$ where $\theta$ is unknown. Here Royall suggests calculation of a profile likelihood, defined by

$$L_U^{profile}(q) = \max_\theta L_U(q; \theta).$$
Under the likelihood-based approach described by Royall in Chapter 5, the concept of a confidence interval is irrelevant. Instead, one can define regions around the value where the likelihood function is maximised that correspond to alternative parameter values whose associated likelihood values are not too different from the maximum. Conversely, values outside this region are then viewed as rather unlikely candidates for being the actual parameter value of interest. For example, Royall suggests that a value for T whose likelihood ratio relative to the maximum likelihood value $\hat{T}_{MLE}$ is less than 1/8 should be considered as the value for which the strength of the evidence in favour of $\hat{T}_{MLE}$ being the correct value is moderately strong. A value whose likelihood ratio relative to the MLE is less than 1/32 is viewed as one where the strength of the evidence in favour of the MLE being the true value is strong.
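To make the ideas of a profile likelihood and of 1/8 and 1/32 likelihood regions concrete, here is a minimal sketch for a much simpler problem than the finite population total treated by Royall: the mean of a normal model, with the variance as the nuisance parameter that is maximised out. The data, the function names and the choice of model are assumptions made for illustration only; this is not the finite population likelihood described in the text.

```python
import numpy as np

def profile_likelihood_ratio(mu, y):
    """Profile likelihood of a normal mean, relative to its maximum.

    The nuisance variance is maximised out at each value of mu.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    s2_hat = np.mean((y - np.mean(y)) ** 2)  # variance MLE at the maximum
    s2_mu = np.mean((y - mu) ** 2)           # variance MLE with mean fixed at mu
    return float((s2_hat / s2_mu) ** (n / 2))

def likelihood_interval(y, k=8.0):
    """Values of mu whose likelihood ratio to the maximum is at least 1/k."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    centre = np.mean(y)
    s = np.sqrt(np.mean((y - centre) ** 2))
    half_width = s * np.sqrt(k ** (2.0 / n) - 1.0)
    return float(centre - half_width), float(centre + half_width)

y = np.array([4.1, 5.3, 3.8, 6.0, 5.1, 4.7])  # hypothetical data
lo8, hi8 = likelihood_interval(y, k=8.0)      # 1/8 likelihood interval
lo32, hi32 = likelihood_interval(y, k=32.0)   # 1/32 likelihood interval
print((lo8, hi8), (lo32, hi32))
```

The 1/32 interval always contains the 1/8 interval, matching the ordering of "moderately strong" and "strong" evidence above.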
Incorporation of sample design information (denoted by the known population matrix $z_U$) into this approach is conceptually straightforward. One just replaces the various densities defining the likelihood function $L_U$ by conditional densities, where the conditioning is with respect to $z_U$. Also, although not expressly considered by Royall, the extension of this approach to the case of informative sampling and/or informative nonresponse would require the nature of the informative sampling method and the nonresponse mechanism also to be taken account of explicitly in the modelling process. In general, the conditional density of T given $y_s$ and the marginal density of $y_s$ in the expression for $L_U(q)$ above would be replaced by the conditional density of T given the actual survey data, i.e. $y_{obs}$, $r_s$, $i_U$ and $z_U$, and the joint marginal density of these data.
David A. Binder and Georgia R. Roberts
3.1 CHOICE OF METHODS
One of the first questions an analyst asks when fitting a model to data that have been collected from a complex survey is whether and how to account for the survey design in the analysis. In fact, there are two questions that the analyst should address. The first is whether and how to use the sampling weights for the point estimates of the unknown parameters; the second is how to estimate the variance of the estimators required for hypothesis testing and for deriving confidence intervals. (We are assuming that the sample size is sufficiently large that the sampling distribution of the parameter estimators is approximately normal.)
There are a number of schools of thought on these questions. The pure model-based approach would demand that if the model being fitted is true, then one should use an optimal model-based estimator. Normally this would result in ignoring the sample design, unless the sample design is an inherent part of the model, such as for a stratified design where the model allows for different parameter values in different strata. The Bayesian approach discussed in Little (Chapter 4) is an example of this model-based perspective.
As an example of the model-based approach, suppose that under the model it is assumed that the sample observations, $y_1, \ldots, y_n$, are random variables which, given $x_1, \ldots, x_n$, satisfy

$$y_t = x_t'\beta + \varepsilon_t, \quad \text{for } t = 1, \ldots, n, \tag{3.1}$$
Analysis of Survey Data. Edited by R. L. Chambers and C. J. Skinner
Copyright © 2003 John Wiley & Sons, Ltd.
ISBN: 0-471-89987-9
where $\varepsilon_t$ has mean 0, variance $\sigma^2$, and is uncorrelated with $\varepsilon_{t'}$ for $t \ne t'$. Standard statistical theory would imply that the ordinary least squares estimator for $\beta$ is the best linear unbiased estimator. In particular, standard theory yields the estimator

$$\hat{\beta} = (X_s'X_s)^{-1} X_s' y_s,$$

where $X_s$ and $y_s$ are based on the sample observations.
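As a small numerical illustration of the ordinary least squares estimator under a model of the form (3.1), the sketch below solves the normal equations with NumPy. The function name `ols` and the data are hypothetical; the second call uses a design matrix consisting of a single column of ones, in which case the estimator is simply the unweighted sample mean.

```python
import numpy as np

def ols(X_s, y_s):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X_s = np.asarray(X_s, dtype=float)
    y_s = np.asarray(y_s, dtype=float)
    return np.linalg.solve(X_s.T @ X_s, X_s.T @ y_s)

# Hypothetical, noise-free sample so the fit can be checked by eye.
x = np.array([0.0, 1.0, 2.0, 3.0])
X_s = np.column_stack([np.ones_like(x), x])  # intercept and slope columns
y_s = 2.0 + 0.5 * x

beta_hat = ols(X_s, y_s)             # intercept and slope
mean_hat = ols(np.ones((4, 1)), y_s) # intercept-only model

print(beta_hat, mean_hat)
```

Note that nothing about how the n units were selected enters this calculation, which is the sense in which the pure model-based estimator ignores the sample design.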
Example 1 Estimating the mean
We consider the simplest case of model (3.1) where $x_t$ is a scalar equal to one for all t. In this case, $\beta$, a scalar, is the expected value of the random variables $y_t$ (t = 1, ..., n) and $\hat{\beta}$ is $\bar{y}_s$, the unweighted mean of the observed sample y-values. Here we ignore any sample design information used for obtaining the n units in our sample. Yet, from the design-based perspective, we know that the unweighted mean can be biased for estimating the finite population mean, $\bar{y}_U$. Is this contradictory? The answer lies both in what we are assuming to be the parameter of interest and in what we are assuming to be the randomisation mechanism.
In the model-based approach, where we are interested in making inferences about $\beta$, we assume that we have a conceptual infinite superpopulation of y-values, each with scalar mean, $\beta$, and variance $\sigma^2$. The observations, $y_1, \ldots, y_n$, are assumed to be independent realisations from this superpopulation. The model-based sample error is $\hat{\beta} - \beta$, which has mean 0 and variance $\sigma^2/n$. The sample design is assumed to be ignorable as discussed in Rubin (1976), Scott (1977a) and Sugden and Smith (1984).
In the design-based approach, on the other hand, where the parameter of interest is $\bar{y}_U$, the finite population mean, we assume that the n observations are a probability sample from the finite population, $y_1, \ldots, y_N$. There is no reference to a superpopulation. The randomisation mechanism is dictated by the chosen sampling design, which may include unequal probabilities of selection, clustering, stratification, and so on. We denote by $I_t$ the 0–1 random variable indicating whether or not the t-th unit is in the sample. In the design-based approach, all inferences are made with respect to the properties of the random variables, $I_t$ (t = 1, ..., N), since the quantities $y_t$ (t = 1, ..., N) are assumed to be fixed, rather than random, as would be assumed in the model-based setting. From the design-based perspective, the finite population mean is regarded as the descriptive parameter to be estimated. Considerations such as asymptotic design-unbiasedness would lead to including the sampling weights in the estimate of $\bar{y}_U$. Much of the traditional sampling theory adopts this approach, since it is the finite population mean (or total) that is of primary interest to the survey statistician.
At this point, it might appear that the design-based and model-based approaches are irreconcilable. We will show this not to be true. In Section 3.2 we introduce linear parameters and their estimators in the design-based and model-based contexts. We review the biases of these estimators under both the design-based and model-based randomisation mechanisms.
In Section 3.3 we introduce design-based and total variances and we compare the results under the design-based and model-based approaches. The total variance is based on considering the variation due to both the model and the survey design. This comparison leads to some general results on the similarities and differences of the two approaches. We show that the design-based approach often leads to variances that are close to the total variance, even though a model is not used in the design-based approach. These results are similar to those in Molina, Smith and Sugden (2001).
The basic results for linear parameters and linear estimators are extended to more complex cases in Section 3.4. The traditional model-based approach is explored in Section 3.5, where it is assumed that the realisation of the randomisation mechanism used to select the units in the sample may be ignored for making model-based inferences. We also introduce an estimating function framework that leads to a closer relationship between the pure model-based and the pure design-based approaches. In Section 3.6 we study the implications of taking the `wrong' approach and we discuss why the design-based approach may be more robust for model-based inferences, at least for large samples and small sampling fractions. We summarise our conclusions in Section 3.7.
3.2 DESIGN-BASED AND MODEL-BASED LINEAR ESTIMATORS
We consider the case where there is a superpopulation model which generates the finite population values, $y_1, \ldots, y_N$. We suppose that, given the realised finite population, a probability sample is selected from it using some, possibly complex, sampling scheme. This sampling scheme may contain stratification, clustering, unequal probabilities of selection, and so on. Our sample can thus be viewed as a second phase of sampling from the original superpopulation.
In our discussions about asymptotics for finite populations, we are assuming that we have a sequence of finite populations of sizes increasing to infinity and that for each finite population, we take a sample such that the sample size also increases to infinity. In the case of multi-stage (or multi-phase) sampling, it is the number of first-stage (or first-phase) units that we are assuming to be increasing to infinity. We do not discuss the regularity conditions for the asymptotic normality of the sampling distributions; instead we simply assume that for large samples, the normal distribution is a reasonable approximation. For more details on asymptotics from finite populations, see, for example, Sen (1988). In this and subsequent sections, we make frequent use of the o-notation to denote various types of convergence to zero. First of all, we use $a_n = b_n + o(n^{-q})$ to mean $n^q |a_n - b_n| \to 0$ almost surely as $n \to \infty$. Next, to distinguish among various types of convergence in probability under different randomisation mechanisms, we use the somewhat unconventional notation $a_n = b_n + o_p(n^{-q})$, $a_n = b_n + o_{\xi p}(n^{-q})$, and $a_n = b_n + o_\xi(n^{-q})$ for convergence in probability under the p-, $\xi p$- and $\xi$-randomisations, respectively. (We define these randomisation mechanisms in subsequent sections.)
We first focus on linear estimators. As we will see in Section 3.4, many of the results for more complex estimators are based on the asymptotic distributions of linear estimators. To avoid complexities of notation, we consider the simplest case of univariate parameters of interest.
which is the expectation of the sample mean, $\bar{y}_s = \sum I_t y_t / n$, when the sample design is ignorable. Normally, modellers are interested in $\beta$ only when the $\mu_t$ are all equal; however, to allow more generality for complex situations, we are not restricting ourselves to that simple case here. Defining $\bar{\mu} = \sum \mu_t / N$, we assume that $\beta$ has the property that
that is, that $\beta$ converges to $\bar{\mu}$. This important assumption is reasonable when the sample design is ignorable. We discuss the estimation of $\beta$ for the case of nonignorable sample designs in Section 3.6.
The design-based parameter of interest that is analogous to $\beta$ is the finite population mean,
which, when combined with (3.6), implies that
On the other hand, the usual design-based linear estimator of b has the form

$$\hat{b} = \bar{y}_d = \sum I_t d_t y_t, \tag{3.11}$$

where the $d_t$ are chosen so that $\hat{b}$ has good design-based properties. In particular, we assume that
$$E_p(\hat{\beta}) = \sum E_p(I_t c_t) y_t. \tag{3.15}$$

We see that $\hat{\beta}$ is not necessarily asymptotically design unbiased for b; the condition for this is that
We now show that this condition holds when the model is true.
Since, by the strong law of large numbers (under the model),
which means that condition (3.16) holds when the model is true.
We conclude, therefore, that when the data have been generated according to the assumed model, not only does $\hat{\beta}$ have good model-based properties for estimating $\beta$, but also $\hat{\beta}$ is approximately design unbiased for b.
Using similar arguments, it may be shown that $E_\xi(\hat{b}) - b = o(1)$, so that $\hat{b}$ is approximately model unbiased for the finite population quantity b. Additionally, we can see that the model expectations and the design expectations of both $\hat{\beta}$ and $\hat{b}$ all converge to the same quantity, $\bar{\mu}$.
$N_2$), then the usual design-based estimator for $b = \bar{y}_U$ is

$$\hat{b} = (N_1 \bar{y}_{1s} + N_2 \bar{y}_{2s})/N, \tag{3.20}$$

where $\bar{y}_{1s}$ and $\bar{y}_{2s}$ are the respective stratum sample means. Now,

$$E_p(\hat{\beta}) = (n_1 \bar{y}_{1U} + n_2 \bar{y}_{2U})/n, \tag{3.21}$$

where $\bar{y}_{1U}$ and $\bar{y}_{2U}$ are the respective stratum means. In general, for disproportionate sampling between strata, the design-based mean of $\hat{\beta}$ is not equal to
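The design expectations in (3.20) and (3.21) can be verified exactly for a tiny population by enumerating every possible stratified simple random sample. The sketch below does this; the two strata, their values and the sample sizes are hypothetical and chosen only so that sampling is clearly disproportionate.

```python
from itertools import combinations, product
import numpy as np

# Hypothetical two-stratum finite population.
stratum1 = np.array([1.0, 2.0, 3.0])           # N1 = 3
stratum2 = np.array([10.0, 20.0, 30.0, 40.0])  # N2 = 4
n1, n2 = 2, 1                                  # disproportionate allocation

N1, N2 = len(stratum1), len(stratum2)
N, n = N1 + N2, n1 + n2
pop_mean = (stratum1.sum() + stratum2.sum()) / N

# Every stratified sample is equally likely, so a plain average over the
# enumeration is the design-based expectation.
weighted, unweighted = [], []
for s1, s2 in product(combinations(stratum1, n1), combinations(stratum2, n2)):
    m1, m2 = np.mean(s1), np.mean(s2)
    weighted.append((N1 * m1 + N2 * m2) / N)    # form of (3.20)
    unweighted.append((n1 * m1 + n2 * m2) / n)  # unweighted sample mean

E_weighted = float(np.mean(weighted))
E_unweighted = float(np.mean(unweighted))
expected_unweighted = (n1 * stratum1.mean() + n2 * stratum2.mean()) / n  # (3.21)

print(pop_mean, E_weighted, E_unweighted)
```

In this illustration the weighted estimator reproduces the population mean on average, while the unweighted mean is pulled towards the over-sampled stratum.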
3.3 DESIGN-BASED AND TOTAL VARIANCES OF LINEAR ESTIMATORS
3.3.1 Design-based and total variance of $\hat{b}$
The observed units could be considered to be selected through two phases of sampling: the first phase yielding the finite population generated by the superpopulation model, and the second phase yielding the sample selected from the finite population. Analysts may then be interested in three types of randomisation mechanisms. The pure design-based approach conditions on the outcome of the first phase, so that the finite population values are considered fixed constants. The pure model-based approach conditions on the sample design, so that the $I_t$ are considered fixed; this approach will be discussed further in Section 3.5. An alternative to these approaches is to consider both phases of sampling to be random. This case has been examined in the literature; see, for example, Hartley and Sielken (1975) and Molina, Smith and Sugden (2001). We define the total variance to be the variance over the two phases of sampling.
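We now turn to examine the variance of $\hat{b}$. The pure design-based variance is given by

$$\mathrm{var}_p(\hat{b}) = \mathrm{var}_p\Big(\sum I_t d_t y_t\Big) = \sum \sum \Delta_{tt'} y_t y_{t'}, \tag{3.23}$$

where $\Delta_{tt'}$ denotes the design-based covariance of $I_t d_t$ and $I_{t'} d_{t'}$.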
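Under a wide set of models, it can be assumed that $E_p(\hat{b} - \bar{y}_U) = O_\xi(N^{-1/2})$, where $O_\xi$ refers to the probability limit under the model. This assumption is certainly true when the model is based on independent and identically distributed observations with finite variance, but it is also true under weaker conditions. Therefore, assuming the sampling fraction, n/N, to be o(1), and denoting the total variance of $\hat{b}$ by $\mathrm{var}_{\xi p}(\hat{b})$, we have

$$\begin{aligned}
\mathrm{var}_{\xi p}(\hat{b}) &= \mathrm{var}_\xi E_p(\hat{b}) + E_\xi \mathrm{var}_p(\hat{b}) \\
&= \mathrm{var}_\xi(\bar{y}_U) + E_\xi\Big[\sum \sum \Delta_{tt'} y_t y_{t'}\Big] + o(n^{-1}) \\
&= \bar{\sigma} + \sum \sum \Delta_{tt'} (\sigma_{tt'} + \mu_t \mu_{t'}) + o(n^{-1})
\end{aligned} \tag{3.24}$$

where

$$\sigma_{tt'} = \mathrm{cov}_\xi(y_t, y_{t'}) \tag{3.25}$$

and

$$\bar{\sigma} = \sum \sum \sigma_{tt'} / N^2 = \mathrm{var}_\xi(\bar{y}_U). \tag{3.26}$$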
If we assume that $\bar{\sigma}$ is $O(N^{-1})$ then

$$\mathrm{var}_{\xi p}(\hat{b}) = \sum \sum \Delta_{tt'} (\sigma_{tt'} + \mu_t \mu_{t'}) + o(n^{-1}). \tag{3.27}$$

Note that assuming that $\bar{\sigma}$ is $O(N^{-1})$ is weaker than the commonly used assumption that the random variables associated with the finite population units are independent and identically distributed realisations with finite variance from the superpopulation.
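When $\bar{\sigma}$ is $O(N^{-1})$, we see that $\mathrm{var}_{\xi p}(\hat{b})$ and $E_\xi \mathrm{var}_p(\hat{b})$ are both asymptotically equal to $\sum \sum \Delta_{tt'} (\sigma_{tt'} + \mu_t \mu_{t'})$. Thus, a design-based estimator of $\mathrm{var}_p(\hat{b})$, given by $v_p(\hat{b})$, would be suitable for estimating $\mathrm{var}_{\xi p}(\hat{b})$, provided $v_p(\hat{b})$ is asymptotically model unbiased for $E_\xi \mathrm{var}_p(\hat{b})$. This is the case when $v_p(\hat{b})$ converges to $\mathrm{var}_p(\hat{b}) = \sum \sum \Delta_{tt'} y_t y_{t'}$. For a more complete discussion of design-based variance estimation, see Wolter (1985).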
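The double-sum form of the design-based variance can be checked directly in the simplest design. The sketch below assumes simple random sampling without replacement with $d_t = 1/n$, so that the estimator is the sample mean; the population values are hypothetical. It builds the matrix of design covariances of $I_t d_t$ and $I_{t'} d_{t'}$, evaluates the quadratic form, and compares it with a brute-force enumeration of all samples and with the familiar textbook expression $(1 - n/N) S^2 / n$.

```python
from itertools import combinations
import numpy as np

y = np.array([3.0, 7.0, 8.0, 12.0, 15.0])  # hypothetical finite population
N, n = len(y), 2

# First- and second-order inclusion probabilities under SRS without replacement.
pi = n / N
pi_joint = n * (n - 1) / (N * (N - 1))

# Covariance matrix of the inclusion indicators I_t.
cov_I = np.full((N, N), pi_joint - pi ** 2)
np.fill_diagonal(cov_I, pi * (1 - pi))

# With d_t = 1/n for every unit, Delta is cov(I_t, I_t') / n^2.
delta = cov_I / n ** 2
var_formula = float(y @ delta @ y)  # double sum of Delta_tt' y_t y_t'

# Brute force: variance of the sample mean over all equally likely samples.
means = [np.mean(s) for s in combinations(y, n)]
var_enumerated = float(np.var(means))

# Textbook expression for the same quantity.
var_textbook = float((1 - n / N) * np.var(y, ddof=1) / n)

print(var_formula, var_enumerated, var_textbook)
```

All three numbers agree for this design; for more complex designs only the covariance matrix of the indicators and the weights $d_t$ would change.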
3.3.2 Design-based mean squared error of $\hat{\beta}$ and its model expectation
Design-based survey statisticians do not normally use $\hat{\beta}$ to estimate b, since $\hat{\beta}$ is often design-inconsistent for b. However, design-based survey statisticians should not discard $\hat{\beta}$ without first considering its other properties such as its mean squared error. Rather than examining the mean squared error in general, we continue with the situation introduced in Example 2.
Example 3
In Example 2, we had

$$\hat{\beta} = \bar{y}_s = (n_1 \bar{y}_{1s} + n_2 \bar{y}_{2s})/n, \tag{3.28}$$

the unweighted sample mean. Our sample design was a stratified random sample from two strata with $n_1$ units selected from the first stratum (stratum size $N_1$) and $n_2$ units selected from the second stratum (stratum size $N_2$). Since $E_p(\hat{\beta}) = (n_1 \bar{y}_{1U} + n_2 \bar{y}_{2U})/n$, the design-based bias of $\hat{\beta}$ in estimating b is

$$E_p(\hat{\beta}) - b = a(\bar{y}_{1U} - \bar{y}_{2U}), \tag{3.29}$$

where
The design-based variance of $\hat{\beta}$, ignoring, for simplicity, the finite population correction factors (by assuming the stratum sample sizes to be small relative to the stratum sizes), is

$$\mathrm{var}_p(\hat{\beta}) = (n_1 v_{1U} + n_2 v_{2U})/n^2, \tag{3.31}$$

where $v_{1U}$ and $v_{2U}$ are the respective stratum population variances. The design-based mean squared error of $\hat{\beta}$ would therefore be smaller than the design-based variance of $\hat{b}$ when
which is approximately equal to $\sigma^2/n$ when $N_1$ and $N_2$ are large compared with n.
On the other hand, the model expectation of the design-based variance is given by
Example 3 illustrates that when the model holds, $\hat{\beta}$ may be better than $\hat{b}$ as an estimator of b, from the point of view of having the smaller design-based mean squared error. However, as pointed out by Hansen, Madow and Tepping (1983), model-based methods can lead to misleading results even when the data seem to support the validity of the model.
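The trade-off in Example 3 can be explored numerically. The sketch below is an illustration under stated assumptions rather than a reproduction of the chapter's formulas: the bias coefficient is taken as $a = n_1/n - N_1/N$, which follows from subtracting the population mean from the design expectation of the unweighted mean; the variance of the weighted estimator uses the standard stratified expression without finite population corrections; and all numerical inputs and function names are hypothetical.

```python
def unweighted_mean_mse(n1, n2, N1, N2, ybar1, ybar2, v1, v2):
    """Design-based MSE of the unweighted sample mean (no fpc)."""
    n, N = n1 + n2, N1 + N2
    a = n1 / n - N1 / N                     # assumed form of the bias coefficient
    bias = a * (ybar1 - ybar2)              # form of (3.29)
    variance = (n1 * v1 + n2 * v2) / n ** 2 # form of (3.31)
    return bias ** 2 + variance

def weighted_mean_variance(n1, n2, N1, N2, v1, v2):
    """Design-based variance of the stratum-weighted mean (no fpc)."""
    N = N1 + N2
    return (N1 / N) ** 2 * v1 / n1 + (N2 / N) ** 2 * v2 / n2

# Hypothetical design: equal stratum sizes, unequal sample sizes.
design = dict(n1=10, n2=40, N1=1000, N2=1000)

# Case A: stratum means equal, as expected when a common-mean model holds.
mse_equal = unweighted_mean_mse(ybar1=5.0, ybar2=5.0, v1=4.0, v2=4.0, **design)

# Case B: stratum means differ, so the unweighted mean is design biased.
mse_unequal = unweighted_mean_mse(ybar1=5.0, ybar2=7.0, v1=4.0, v2=4.0, **design)

var_weighted = weighted_mean_variance(v1=4.0, v2=4.0, **design)

print(mse_equal, mse_unequal, var_weighted)
```

With equal stratum means the unweighted mean has the smaller mean squared error; once the means differ, the squared bias quickly dominates, in line with the caution attributed to Hansen, Madow and Tepping (1983).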
3.4 MORE COMPLEX ESTIMATORS
3.4.1 Taylor linearisation of non-linear statistics
In Sections 3.2 and 3.3 we restricted our discussion to the case of linear estimators of the parameter of interest. However, many methods of data analysis deal with more complex statistics. In this section we show that the properties of many of these more complex quantities are asymptotically equivalent to the properties of linear estimators, so that the discussions in Sections 3.2 and 3.3 can be extended to the more complex situations. We take a Taylor linearisation approach here.
3.4.2 Ratio estimation
To introduce the basic notions, we start with the example of the estimation of a ratio. The quantities $y_t$ and $x_t$ are assumed to have model expectations $\mu_Y$ and $\mu_X$, respectively. We first consider the properties of the model-based estimator, $\bar{y}_s/\bar{x}_s$, of the ratio parameter, $\theta = \mu_Y/\mu_X$. Assuming the validity of a first-order Taylor expansion, such as when the y-values and x-values have finite model variances, we have
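$$\frac{\bar{y}_s}{\bar{x}_s} - \theta = \frac{1}{n} \sum I_t u_t(\mu_X, \mu_Y) + o_\xi(n^{-1/2}), \tag{3.37}$$

so that the model-based properties of $\bar{y}_s/\bar{x}_s$ are asymptotically equivalent to those of the linear estimators discussed in Sections 3.2 and 3.3. The design-based expectation of $\bar{y}_s/\bar{x}_s$ is

$$E_p\Big(\frac{\bar{y}_s}{\bar{x}_s}\Big) = \theta + \frac{1}{n} \sum \pi_t u_t(\mu_X, \mu_Y) + o_\xi(n^{-1/2}), \tag{3.38}$$

where $\pi_t$ is the probability that the t-th unit is in the sample. Since $E_\xi[u_t(\mu_X, \mu_Y)] = 0$ for t = 1, ..., n, then $E_p(\bar{y}_s/\bar{x}_s)$, in turn, has model expectation

$$E_{\xi p}(\bar{y}_s/\bar{x}_s) = \theta + o(n^{-1/2}). \tag{3.39}$$

Therefore, the $\xi p$-expectation of $\bar{y}_s/\bar{x}_s$ converges to the parameter of interest, $\theta$, when the model holds.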
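Similarly, $\bar{y}_d/\bar{x}_d = \sum I_t d_t y_t / \sum I_t d_t x_t$, a design-based estimator of the ratio of the finite population means, may be linearised as

$$\frac{\bar{y}_d}{\bar{x}_d} - \frac{\bar{y}_U}{\bar{x}_U} = \sum I_t d_t u_t(\bar{x}_U, \bar{y}_U) + o_p(n^{-1/2}), \tag{3.40}$$

where $u_t(\bar{x}_U, \bar{y}_U)$ is defined analogously to $u_t(\mu_X, \mu_Y)$ in (3.36) for t = 1, ..., N. Since $\sum u_t(\bar{x}_U, \bar{y}_U) = 0$, the design-based expectation and variance of $\bar{y}_d/\bar{x}_d$ are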
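A small numerical sketch may help fix the idea of linearising a ratio. Since the definition of the linearised variable in (3.36) is not reproduced here, the code assumes the standard first-order form $u_t = (y_t - R x_t)/\bar{x}_U$ with $R = \bar{y}_U/\bar{x}_U$; the population values, the chosen sample and the equal weights $d_t = 1/n$ are all hypothetical.

```python
import numpy as np

# Hypothetical finite population.
x_U = np.array([2.0, 4.0, 5.0, 9.0])
y_U = np.array([1.0, 3.0, 2.0, 6.0])

xbar_U, ybar_U = x_U.mean(), y_U.mean()
R_U = ybar_U / xbar_U  # ratio of finite population means

# Assumed standard linearisation variable for a ratio.
u = (y_U - R_U * x_U) / xbar_U

# One hypothetical sample of two units, with equal weights d_t = 1/n.
sample = np.array([0, 3])
d = 1.0 / len(sample)

ratio_hat = np.sum(d * y_U[sample]) / np.sum(d * x_U[sample])
exact_error = ratio_hat - R_U        # left-hand side of the linearisation
linear_term = np.sum(d * u[sample])  # leading linear term

print(exact_error, linear_term)
```

For a ratio the exact error equals the linear term multiplied by $\bar{x}_U/\bar{x}_d$, so the two agree closely whenever the estimated x-mean is close to the population x-mean.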
3.4.3 Non-linear statistics – explicitly defined statistics
The same linearisation method can be used for a wide class of non-linear statistics. For example, suppose that we have a simple random sample of size n and that our estimator can be expressed as a differentiable function, g, of sample means, $\bar{y}_s = (\bar{y}_{1s}, \ldots, \bar{y}_{ms})'$. Letting $\mu_j$ be the model expectation of $y_{tj}$, where $y_{tj}$ is the value of the j-th variable for the t-th unit, we have

$$g(\bar{y}_s) - g(\mu) = \sum_{j=1}^{m} \frac{\partial g}{\partial \mu_j} (\bar{y}_{js} - \mu_j) + o_\xi(n^{-1/2}), \tag{3.44}$$

where $\mu = (\mu_1, \ldots, \mu_m)'$. We define pseudo-variables, $u_t(\mu)$ for t = 1, ..., N, as

$$u_t(\mu) = \sum_{j=1}^{m} \frac{\partial g}{\partial \mu_j} (y_{tj} - \mu_j), \tag{3.45}$$

so that

$$g(\bar{y}_s) - g(\mu) = \sum I_t u_t(\mu)/n + o_\xi(n^{-1/2}). \tag{3.46}$$

This implies that inferences about $g(\bar{y}_s) - g(\mu)$ are asymptotically equivalent to inferences on $\bar{u}_s(\mu) = \sum I_t u_t(\mu)/n$. For example,

$$E_p[g(\bar{y}_s)] = g(\mu) + \sum \pi_t u_t(\mu)/n + o_\xi(n^{-1/2}) \tag{3.47}$$

and

$$E_{\xi p}[g(\bar{y}_s)] = g(\mu) + o(n^{-1/2}) \tag{3.48}$$

since $E_\xi[u_t(\mu)] = 0$.
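The pseudo-variable construction in (3.45) is mechanical once the gradient of g is available. The following sketch computes the pseudo-variables for a user-supplied gradient and checks the first-order approximation for the ratio $g(m_1, m_2) = m_1/m_2$. The data, the values used for $\mu$ and the helper names are hypothetical; in practice $\mu$ is unknown and is treated as known here purely for illustration.

```python
import numpy as np

def pseudo_variables(Y, mu, grad):
    """u_t = sum_j (dg/dmu_j) * (y_tj - mu_j), one value per unit t."""
    Y = np.asarray(Y, dtype=float)
    return (Y - np.asarray(mu, dtype=float)) @ np.asarray(grad, dtype=float)

def g(m):
    """Example non-linear function of two means: their ratio."""
    return m[0] / m[1]

def grad_g(m):
    """Gradient of the ratio m1 / m2."""
    return np.array([1.0 / m[1], -m[0] / m[1] ** 2])

# Hypothetical sample: rows are units, columns are the two variables.
Y = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [2.0, 5.0],
              [6.0, 9.0]])
mu = np.array([3.2, 5.1])  # hypothetical model expectations, assumed known

u = pseudo_variables(Y, mu, grad_g(mu))
ybar = Y.mean(axis=0)

exact = g(ybar)                   # g evaluated at the sample means
linear_approx = g(mu) + u.mean()  # first-order approximation

print(exact, linear_approx)
```

For the ratio, the pseudo-variable simplifies to $(y_{t1} - \theta y_{t2})/\mu_2$ with $\theta = \mu_1/\mu_2$, which is the usual linearised variable for ratio estimation.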
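The design-based analogues of $\bar{y}_s$ and $\mu$ are $\bar{y}_d = (\bar{y}_{1d}, \ldots, \bar{y}_{md})'$ and $\bar{y}_U = (\bar{y}_{1U}, \ldots, \bar{y}_{mU})'$ respectively, where we denote the finite population means by $\bar{y}_{jU}$, for j = 1, ..., m, and we let $\bar{y}_{jd} = \sum I_t d_t y_{tj}$ be the design-based estimator of $\bar{y}_{jU}$. We have

$$g(\bar{y}_d) - g(\bar{y}_U) = \sum I_t d_t u_t(\bar{y}_U) + o_p(n^{-1/2}). \tag{3.49}$$

Taking the design-based mean, we have

$$\begin{aligned}
E_p[g(\bar{y}_d)] &= g(\bar{y}_U) + \sum u_t(\bar{y}_U)/N + o(n^{-1/2}) \\
&= g(\bar{y}_U) + \sum_{j=1}^{m} \sum_{t=1}^{N} \frac{\partial g}{\partial \bar{y}_{jU}} (y_{tj} - \bar{y}_{jU})/N + o(n^{-1/2})
\end{aligned}$$

and its model expectation is

$$E_{\xi p}[g(\bar{y}_d)] = E_\xi[g(\bar{y}_U)] + o(n^{-1/2}).$$

For the design-based variance, we have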