Approaches for checking if the model is consistent with the data


3.1. Bayesian residual analysis

In linear models, analysis of residuals is a popular and intuitive tool for diagnosing problems with the model fit. Plots of residuals against predicted values can identify failures of the statistical model. There is, of course, a natural Bayesian analogue to traditional residual analysis (see, for example, Chaloner and Brant, 1988; Albert and Chib, 1995). Suppose y_i denotes the observation for individual i, and suppose further that E(y_i | ω) = μ_i, where ω denotes the vector of parameters in the model. The residual ε_i = y_i − μ_i is a function of the model parameters through μ_i. Thus the residuals have a posterior distribution, and one can imagine a posterior distribution of residual plots or diagnostics. If a Bayesian model has been fit to the data, then y_i is considered outlying if the posterior distribution of the residual ε_i is located far from zero (Chaloner and Brant, 1988).

Examining residual plots based on posterior means of the residuals, or on randomly chosen posterior draws, may help identify patterns that call the model assumptions into question. Residuals for discrete data can be difficult to interpret because the residual can take only a few possible values for a given value of μ_i. In this case it can be helpful to bin the residuals according to the value of μ_i (or another variable of interest) as in Gelman et al. (2000).
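
As an illustration, the following sketch computes posterior residuals and binned residuals for a binary-outcome model. The data, the stand-in posterior draws, and names such as `mu_draws` are all hypothetical, standing in for output from a real Bayesian fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-outcome setting: n observations, S posterior draws of the
# success probabilities mu_i (in practice these come from an MCMC fit).
n, S = 200, 1000
x = rng.normal(size=n)
theta_true = 1.2
y = rng.binomial(1, 1 / (1 + np.exp(-theta_true * x)))

# Stand-in posterior draws of the slope; replace with real MCMC output.
theta_draws = rng.normal(theta_true, 0.15, size=S)
mu_draws = 1 / (1 + np.exp(-np.outer(theta_draws, x)))  # S x n matrix of mu_i

# Posterior distribution of each residual eps_i = y_i - mu_i.
resid_draws = y[None, :] - mu_draws
resid_mean = resid_draws.mean(axis=0)

# Binned residuals: sort by posterior-mean mu_i and average residuals
# within bins, so the discrete residuals become interpretable.
mu_mean = mu_draws.mean(axis=0)
order = np.argsort(mu_mean)
for idxs in np.array_split(order, 10):
    print(f"mean mu = {mu_mean[idxs].mean():.3f}  "
          f"mean residual = {resid_mean[idxs].mean():+.3f}")
```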

3.2. Cross-validatory predictive checks

Cross-validation is the technique of partitioning observed data into two parts, a training or estimation set and a test set. The model is fit to the training set and applied to the test set. This can be done with a single observation serving as the test set to detect outlying or unusual observations or groups of observations. Alternatively, for an overall measure of fit we might split the data into K subsets and then cycle through, deleting one subset at a time and using the remaining K − 1 subsets to generate a predictive distribution for the test set. The predictive distributions for the observations in the test set are evaluated relative to the observed values. This evaluation can be done by comparing the predictive mean (or median) to the observed values or by calculating the predictive density at the observed value. Poor performance in the test set may be evidence of a poorly fitting or poorly identified model. Stone (1974), Gelfand et al. (1992), and Marshall and Spiegelhalter (2003) discuss Bayesian approaches to cross-validation.
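
A minimal sketch of this K-fold scheme, assuming a toy conjugate normal model with known variance so the predictive distribution for each test fold is available in closed form; all names and numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(5.0, 2.0, size=100)   # observed data (toy example)
K, sigma = 5, 2.0                    # assume known sdev for simplicity

# Conjugate normal model: theta ~ N(m0, s0^2), y_i | theta ~ N(theta, sigma^2).
m0, s0 = 0.0, 10.0

folds = np.array_split(rng.permutation(len(y)), K)
log_pred = []
for test in folds:
    train = np.setdiff1d(np.arange(len(y)), test)
    yt = y[train]
    # Posterior for theta given the training folds (standard conjugate update).
    post_var = 1 / (1 / s0**2 + len(yt) / sigma**2)
    post_mean = post_var * (m0 / s0**2 + yt.sum() / sigma**2)
    # Predictive density for each held-out point: N(post_mean, post_var + sigma^2).
    pred_var = post_var + sigma**2
    log_pred.extend(-0.5 * np.log(2 * np.pi * pred_var)
                    - (y[test] - post_mean)**2 / (2 * pred_var))

print("mean held-out log predictive density:", np.mean(log_pred))
# Unusually low values for individual observations flag potential outliers.
```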

3.3. Prior predictive checks

Let y^rep denote replicate data that we might observe if the experiment that generated y were replicated, and let ω denote all the parameters of the model. Box (1980) suggested checking Bayesian models using the marginal or prior predictive distribution

p(y^rep) = ∫ p(y^rep | ω) p(ω) dω

as a reference distribution for the observed data y. In practice, a diagnostic measure D(y) is defined and the observed value D(y) is compared to the reference distribution of D(y^rep), with any significant difference between them indicating a model failure.
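
The simulation route to this check is direct: draw ω from the prior, draw y^rep from the likelihood, and compare D(y^rep) with D(y). A minimal sketch for a toy normal model with a proper prior follows; the model, data, and diagnostic are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
y_obs = rng.standard_t(df=3, size=50)   # observed data (toy example)

# Model to check: y_i | theta ~ N(theta, 1), theta ~ N(0, 1). A proper prior
# is required -- the prior predictive is undefined under improper priors.
def D(y):
    return np.max(np.abs(y))            # diagnostic: largest |y_i|

R = 5000
d_rep = np.empty(R)
for r in range(R):
    theta = rng.normal(0.0, 1.0)                       # omega ~ p(omega)
    y_rep = rng.normal(theta, 1.0, size=len(y_obs))    # y_rep ~ p(y_rep | omega)
    d_rep[r] = D(y_rep)

p_value = np.mean(d_rep >= D(y_obs))
print(f"D(y) = {D(y_obs):.2f}, prior predictive p-value = {p_value:.3f}")
```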

This method is natural for Bayesians because it compares observables to their predictive distribution under the model. A significant feature of the prior predictive approach is the important role played by the prior distribution in defining the reference distribution. Lack of fit may be due to misspecification of the prior distribution or of the likelihood. The dependence on the prior distribution also means that the prior predictive distribution is undefined under improper prior distributions (and can be quite sensitive to the prior distribution if vague prior distributions are used). As the influence of Bayesian methods spreads to many application fields, it appears that vague or improper priors are often used as a "noninformative" starting point in many data analyses. Thus the requirement of a proper prior distribution can be quite restrictive.

3.4. Posterior predictive checks

A slight variation on the prior predictive check defines y^rep as replicate data that we might observe if the experiment that generated y were repeated with the same underlying parameter. (For the prior predictive checks, the replicate data are based on a different parameter value drawn independently from the prior distribution.) Essentially, y^rep is a data set that is exchangeable with the observed data if the model is correct. This leads to the use of the posterior predictive distribution

p(y^rep | y) = ∫ p(y^rep | ω) p(ω | y) dω

as a reference distribution for the observed data y. Posterior predictive checks are concerned with the prior distribution only if the prior is sufficiently misspecified to yield a poor fit to the observed data. Also, posterior predictive checks can be applied even if the prior distribution is improper, as long as the posterior distribution is proper. Posterior predictive checks are the primary focus of the remainder of the chapter.

They are discussed in more detail in the next section and then illustrated in two applications.
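
The mechanics mirror the prior predictive check, except that ω is drawn from the posterior rather than the prior. A minimal sketch, again assuming a toy normal model, here with a flat (improper) prior whose posterior is nonetheless proper:

```python
import numpy as np

rng = np.random.default_rng(3)
y_obs = rng.standard_t(df=3, size=50)   # observed data (toy example)
n = len(y_obs)

# Model to check: y_i | theta ~ N(theta, 1) with a flat prior on theta,
# so the posterior is N(ybar, 1/n) -- proper even though the prior is not.
ybar = y_obs.mean()

def D(y):
    return np.max(np.abs(y - y.mean()))   # diagnostic sensitive to outliers

R = 5000
d_rep = np.empty(R)
for r in range(R):
    theta = rng.normal(ybar, 1 / np.sqrt(n))   # omega ~ p(omega | y)
    y_rep = rng.normal(theta, 1.0, size=n)     # y_rep ~ p(y_rep | omega)
    d_rep[r] = D(y_rep)

p_value = np.mean(d_rep >= D(y_obs))
print(f"D(y) = {D(y_obs):.2f}, posterior predictive p-value = {p_value:.3f}")
```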

3.5. Partial posterior predictive checks

Bayarri and Berger (2000) develop model checking strategies that can accommodate improper prior distributions but avoid the conservatism of posterior predictive checks (this conservatism is discussed in the next section). They propose a compromise approach, known as partial posterior predictive model checks, that develops a reference distribution conditional on some of the data (but not all). Let T(y) denote a test statistic with observed value t_obs. The partial posterior predictive checks use as a reference distribution the predictive distribution

p_ppost(y^rep | y) = ∫ p(y^rep | ω) p_ppost(ω | y, t_obs) dω,

where p_ppost(ω | y, t_obs) is a partial posterior distribution conditional on the observed value of the test statistic. The partial posterior distribution is computed as p_ppost(ω | y, t_obs) ∝ p(y | t_obs, ω) p(ω). Model checks based on the partial posterior predictive distribution (and other alternatives described by Bayarri and Berger, 2000) have good small-sample and large-sample properties, as demonstrated by Bayarri and Berger (2000) and Robins et al. (2000). However, as Johnson (2004) comments, the p-values associated with these methods can seldom be defined and calculated in realistically complex models. Moreover, the reference distribution changes for each test statistic that is conditioned on.
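
Computing the partial posterior is the hard part in general; in simple cases where the sampling density of T is available in closed form, p_ppost(ω | y, t_obs) ∝ p(y | ω) p(ω) / p(t_obs | ω) can be sampled directly. The sketch below does this for a toy normal model with T(y) = max y_i and a flat prior, using a random-walk Metropolis sampler; it illustrates the mechanics only and is not Bayarri and Berger's general algorithm.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
y = rng.normal(0.0, 1.0, size=30)
y[0] = 4.5                        # planted outlier
n, t_obs = len(y), y.max()        # test statistic T(y) = max y_i

# Partial posterior with a flat prior: proportional to
# p(y | theta) / p(t_obs | theta). For iid N(theta, 1) data the density
# of the maximum at t is n * phi(t - theta) * Phi(t - theta)^(n-1).
def log_partial_post(theta):
    log_lik = norm.logpdf(y, theta).sum()
    log_f_T = (np.log(n) + norm.logpdf(t_obs, theta)
               + (n - 1) * norm.logcdf(t_obs - theta))
    return log_lik - log_f_T

# Random-walk Metropolis on theta.
S, theta, draws = 5000, y.mean(), []
lp = log_partial_post(theta)
for s in range(S):
    prop = theta + 0.3 * rng.standard_normal()
    lp_prop = log_partial_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    draws.append(theta)

# Reference distribution: draw y_rep | theta and compare T(y_rep) to t_obs.
t_rep = np.array([rng.normal(th, 1.0, size=n).max() for th in draws[1000:]])
print("partial posterior predictive p-value:", np.mean(t_rep >= t_obs))
```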

3.6. Repeated data generation and analysis

Dey et al. (1998) suggested a variation on the prior predictive checks that considers the entire posterior distribution as the "test statistic" (at least conceptually) rather than just a particular function. Here, as in the prior predictive method, many replicates of the data are generated using the model (i.e., using the prior distribution and the likelihood). For each replicate data set, rather than just computing a particular test statistic to compare with the observed test statistic, the posterior distribution of the parameters is determined. Each such posterior distribution is known as a replicate posterior distribution. The key to model checking is then to compare the posterior distribution obtained by conditioning on the observed data with the replicate posterior distributions.

As described in Dey et al. (1998), let y^(r) denote the rth replicate data set and say, for convenience, that y^(0) denotes the observed data. With this approach it is possible to consider discrepancies that are functions of just the data (in which case the method is identical to the prior predictive check), functions of the parameters, or, most generally, functions of both, say T(y, ω). The posterior distribution p(T | y^(0)) is compared with the set p(T | y^(r)), r = 1, 2, ..., R. Comparing posterior distributions is a complicated problem; one possibility is to compare samples from each distribution p(T | y^(r)) with a single sample drawn from the posterior distribution based on the observed data. Dey et al. (1998) propose one approach for carrying out the comparison: they summarize each posterior sample with a vector of quantiles and then define a test statistic on this vector.
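
A minimal sketch of this scheme for a conjugate normal model, where each replicate posterior is available in closed form and is summarized by a quantile vector. The Euclidean-distance comparison at the end is one simple illustrative choice, not the specific test statistic of Dey et al. (1998).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
n, sigma = 40, 1.0
m0, s0 = 0.0, 2.0                              # proper prior: theta ~ N(m0, s0^2)
y_obs = 3.0 + rng.standard_t(df=2, size=n)     # observed data (toy example)

# Conjugate posterior quantile-vector summary for any data set y.
def post_quantiles(y, probs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    var = 1.0 / (1.0 / s0**2 + len(y) / sigma**2)
    mean = var * (m0 / s0**2 + y.sum() / sigma**2)
    return norm.ppf(probs, loc=mean, scale=np.sqrt(var))

# Replicate posteriors: draw theta from the prior, data from the likelihood,
# then summarize each replicate posterior by its quantile vector.
R = 1000
q_rep = np.array([
    post_quantiles(rng.normal(rng.normal(m0, s0), sigma, size=n))
    for _ in range(R)
])

# Compare the observed-data posterior summary with the replicate cloud.
q_obs = post_quantiles(y_obs)
center = q_rep.mean(axis=0)
d_obs = np.linalg.norm(q_obs - center)
d_rep = np.linalg.norm(q_rep - center, axis=1)
print("fraction of replicate posteriors farther out:", np.mean(d_rep >= d_obs))
```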

Like the prior predictive approach, this approach cannot be applied when the prior is improper and can be quite sensitive to the use of vague prior distributions. The computational burden of this method is substantial: a complete Bayesian analysis, likely involving a complex simulation, is required for each replication. In addition, a large amount of information must be stored for each replication.
