
DOI: 10.1051/gse:2002030

Original article

The covariance between relatives conditional on genetic markers

Yuefu LIUᵃ∗, Gerald B. JANSENᵃ, Ching Y. LINᵃ,ᵇ

ᵃ Department of Animal and Poultry Science, University of Guelph, Guelph, ON N1G 2W1, Canada

ᵇ Dairy and Swine Research and Development Centre, Agriculture and Agri-Food, Canada

(Received 13 August 2001; accepted 3 June 2002)

Abstract – The development of molecular genotyping techniques makes it possible to analyze quantitative traits on the basis of individual loci. With marker information, the classical theory of estimating the genetic covariance between relatives can be reformulated to improve the accuracy of estimation. In this study, an algorithm was derived for computing the conditional covariance between relatives given genetic markers. Procedures for calculating the conditional relationship coefficients for additive, dominance, additive by additive, additive by dominance, dominance by additive and dominance by dominance effects were developed. The relationship coefficients were computed based on conditional QTL allelic transmission probabilities, which were inferred from the marker allelic transmission probabilities. An example data set with pedigree and linked markers was used to demonstrate the methods developed. Although this study dealt with two QTLs coupled with linked markers, the same principle can be readily extended to the situation of multiple QTL. The treatment of missing marker information and unknown linkage phase between markers for calculating the covariance between relatives was discussed.

covariance between relatives / molecular marker / QTL / transmission probability / relationship matrix

1 INTRODUCTION

Quantifying the resemblance between relatives is a fundamental issue in quantitative genetics. It is needed for estimating genetic parameters, predicting breeding values, planning mating schemes, QTL mapping and marker assisted genetic evaluation. The study of the correlation between relatives can be traced back to the beginning of the last century [29, 36]. Kempthorne [22] summarized the work on this topic up to Malecot’s study [27]. Fisher [12] first studied the two-locus epistatic deviations and their effects on the covariance between relatives such as parents and descendants, full sibs, uncles and cousins. Cockerham [6, 7] partitioned the two-locus epistatic variance into additive by additive, additive by dominance and dominance by dominance. Kempthorne [21, 22] applied the analysis of factorial experiments to partition the genetic variance and studied the covariance between relatives in random mating populations [21, 23], inbred populations [24] and a simple autotetraploid population [25]. Plum [31] formulated a recursive method for calculating the relationship and inbreeding coefficients. Cockerham [8] and Weir et al. [37] analyzed the influence of linkage on the covariance between relatives. The theory and computational algorithms for the correlation between relatives were well established in the early development of quantitative genetics.

∗ Correspondence and reprints. E-mail: yuefuliu@uoguelph.ca

The resemblance between relatives is attributed to gene transmission from the parents to the descendants so that the relatives share identical genes by descent with certain probabilities. Since the gene transmission between generations is not observable, the transmission probability of an allele is generally taken to be 0.5. Actually, the transmission of an allele from a parent to offspring follows an all-or-none pattern. With information from molecular markers, it becomes possible to track the transmission of a linked gene more precisely than by using pedigree data alone.

There have been several studies on the conditional covariance between relatives. Fernando and Grossman [11] developed a method for calculating the gametic covariance conditional on a single linked marker, assuming completely informative markers. Van Arendonk et al. [3] designed a computing procedure for the gametic relationship matrix given a single linked marker, which is valid when the parental origin of the offspring’s alleles is known. Goddard [16] derived the conditional gametic covariance due to allelic effects in terms of genetic effects without using the concept of identity probabilities, where parental origins of marker alleles and linkage phases among markers are assumed to be known. However, the parental origin of the offspring’s alleles is often unknown in real data analysis. Wang et al. [35] extended Fernando and Grossman’s [11] method to accommodate situations where the parental origin of marker alleles cannot be determined unequivocally. However, the method used to account for this biological uncertainty has been developed only for a single marker linked to a QTL. In QTL mapping for human populations, Fulker and Cardon [13, 14] used a regression approach to approximate the identity by descent (IBD) of QTL from the IBD of flanking markers. Their development is based on the method of Haseman and Elston [18], which considers the expected IBD of a locus as a linear function of the IBD of another linked locus. Kruglyak and Lander [26] developed a hidden Markov model to estimate the IBD states of a putative QTL using the probability distribution of the marker IBDs. This approach is more accurate than Fulker and Cardon’s approximation [13, 14], but is more complicated to compute. Xu and Gessler [38] made a compromise between the two methods and proposed an approximate hidden Markov model to improve the computing speed at the expense of estimation accuracy. Almasy and Blangero [2] improved Fulker and Cardon’s method [13, 14] in regard to the sib-pair approach of QTL mapping and developed a general framework of multipoint identity by descent. Pong-Wong et al. [32] combined the method of Haseman and Elston [18] for estimating identity by descent between sibs, often used in human genetics, and the method of Wang et al. [35] for general pedigree to derive a simple method for calculating the gametic identity-by-descent matrix of QTLs. Meuwissen and Goddard [28] developed a method of predicting gametic identity probability from marker haplotypes by a simplified coalescence process, assuming that the number of generations since the base population and effective population size are known. These studies on conditional identity measures of relatives have generally focused on the identity by descent due to allelic effects. The theory of conditional covariance due to non-additive effects has been little studied. Aside from the covariance due to allelic effects, the quantification of the conditional covariance components due to additive and non-additive effects is also frequently required to refine the statistical model for marker assisted analysis of quantitative traits.

This study aimed to develop a general theory for constructing the conditional covariance between relatives in the presence of additive, dominance and epistatic effects and to update the classical theory when both pedigree and marker data are available. The development relaxed the assumptions of previous studies and applied both single and flanking marker inferences with known or unknown parental origins of offspring’s haplotypes.

[...] $i$ at the flanking locus $N^{l}_{i}$. The superscript $l$ will be dropped for simplicity whenever a single QTL is considered. These symbols are random variables. For example, when an individual $i$ has the genotype $A_1A_2$ at marker locus $m$, then $M_i^{m1} = A_1$ and $M_i^{m2} = A_2$. The symbol “≡” stands for the identity between alleles and the symbol “⇐” for the allelic transmission from a parent to a descendant.

[Figure: diagram of the marker (M, N) and QTL (Q) alleles carried at loci m and n by parents s, d, s′, d′ and their offspring i and j.]

Figure 1. The marker and QTL genotypes for individuals i and j, and their respective parents s, d and s′, d′.

2.2 Genetic covariance components

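If there are q loci controlling a quantitative trait, the classical formula for computing the covariance between genotypic values (g) of individuals i and j [21, 22] is:

$$\mathrm{Cov}(g_i, g_j) = \sum_{a=0}^{q}\sum_{d=0}^{q} r_{ij}^{a}\, u_{ij}^{d}\, \sigma^2_{A^aD^d}, \qquad 1 \le a + d \le q \tag{1}$$

under the assumption of no inbreeding and linkage equilibrium among loci. When there is only one locus (q = 1), formula (1) reduces to $\mathrm{Cov}(g_i, g_j) = r_{ij}\sigma^2_A + u_{ij}\sigma^2_D$. Traditionally, the coefficients $r_{ij}$ and $u_{ij}$ are assumed to be identical for the q loci because the allelic transmission at each individual locus cannot be traced. Considering only two loci, say m and n, the genetic covariance due to these two QTL can be written as:

$$\mathrm{Cov}(g_i, g_j) = r_{ij}\sigma^2_{A_m} + r_{ij}\sigma^2_{A_n} + u_{ij}\sigma^2_{D_m} + u_{ij}\sigma^2_{D_n} + r_{ij}^2\sigma^2_{A_mA_n} + r_{ij}u_{ij}\sigma^2_{A_mD_n} + u_{ij}r_{ij}\sigma^2_{D_mA_n} + u_{ij}^2\sigma^2_{D_mD_n} \tag{2}$$

where $\sigma^2_{A_m}$ and $\sigma^2_{A_n}$ are additive variances and $\sigma^2_{D_m}$ and $\sigma^2_{D_n}$ are dominance variances at the two loci. The epistatic variances for additive by additive, additive by dominance, dominance by additive and dominance by dominance between loci m and n are $\sigma^2_{A_mA_n}$, $\sigma^2_{A_mD_n}$, $\sigma^2_{D_mA_n}$ and $\sigma^2_{D_mD_n}$, respectively.

Information on the markers linked to QTL affecting a trait can be used to refine the covariance among relatives. Conditional on the marker information [...]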

Therefore, formula (2) needs to be rewritten as:

[...]

where $r^m_{ij}$, $r^n_{ij}$, $u^m_{ij}$ and $u^n_{ij}$ are the additive and dominance relationship coefficients between individuals i and j at loci m and n, and $r^m_{ij}$ [...] probability of QTL allelic identities between individuals i and j:
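As a numerical illustration of how locus-specific coefficients change the two-locus covariance, the following minimal sketch evaluates the eight variance components of formula (2) with separate coefficients for loci m and n. The function name, the dictionary keys and all numerical values are hypothetical. Taking each epistatic coefficient as the product of the two per-locus coefficients is an assumption of this sketch, not a statement of the authors' conditional formula; with equal coefficients at both loci it reduces to the classical pedigree-based expression.

```python
def two_locus_covariance(r_m, u_m, r_n, u_n, var):
    """Genetic covariance between two relatives due to QTL m and n.

    r_* and u_* are additive and dominance relationship coefficients at each
    locus; var maps component names to variances. Epistatic coefficients are
    taken as products of per-locus coefficients (an assumption of this sketch).
    """
    return (
        r_m * var["A_m"] + r_n * var["A_n"]
        + u_m * var["D_m"] + u_n * var["D_n"]
        + r_m * r_n * var["A_mA_n"]
        + r_m * u_n * var["A_mD_n"]
        + u_m * r_n * var["D_mA_n"]
        + u_m * u_n * var["D_mD_n"]
    )


# Hypothetical variance components, all set to 1 for readability.
components = ("A_m", "A_n", "D_m", "D_n", "A_mA_n", "A_mD_n", "D_mA_n", "D_mD_n")
var = {name: 1.0 for name in components}

# Pedigree-only full sibs: r = 0.5 and u = 0.25 at both loci.
pedigree_cov = two_locus_covariance(0.5, 0.25, 0.5, 0.25, var)

# Hypothetical marker-informed coefficients: the sibs share more than expected
# at locus m and less than expected at locus n.
marker_cov = two_locus_covariance(0.9, 0.8, 0.1, 0.0, var)

print(pedigree_cov, marker_cov)
```

The two calls differ only in the coefficients, which is the point of conditioning on markers: the variance components stay the same while the weights attached to them become specific to each pair of relatives and each locus.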

2.3 Conditional probability of QTL allelic identity by descent

For every pair of individuals i and j in a population, there are four possible QTL allelic identities: $(Q_i^1 \equiv Q_j^1)$, $(Q_i^1 \equiv Q_j^2)$, $(Q_i^2 \equiv Q_j^1)$ and $(Q_i^2 \equiv Q_j^2)$. The probabilities of these identities can be inferred conditional on the marker information. Let matrix $P_{ij}$ contain the probabilities of the four QTL allelic identities between individuals i and j:

$$P_{ij} = \begin{pmatrix} \Pr(Q_i^1 \equiv Q_j^1 \mid M) & \Pr(Q_i^1 \equiv Q_j^2 \mid M) \\ \Pr(Q_i^2 \equiv Q_j^1 \mid M) & \Pr(Q_i^2 \equiv Q_j^2 \mid M) \end{pmatrix}$$

The additive and dominance relationship coefficients between individuals i and j [...]

where the t’s are all (2×1) column vectors. Similarly, QTL allelic transmission probabilities from parents s′ and d′ to descendant j can be defined in matrix $T_j$.

The QTL allelic identity probabilities between individuals i and j, i.e. $P_{ij}$, can be calculated as:
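The recursive tabulation of identity probabilities can be sketched as follows. This is not a reproduction of formulas (6) and (7). It assumes that the transmission matrix of an offspring is arranged as a 2×4 matrix whose columns are the four t vectors (two alleles of the sire, then two alleles of the dam), and that the identity probabilities of an offspring with an individual that is not its descendant are obtained by multiplying this matrix by the stacked identity matrices of its parents with that individual. All function names and numerical values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def identity_with_non_descendant(T_i, P_sj, P_dj):
    """P_ij for offspring i of parents (s, d) and an individual j that is not
    a descendant of i: T_i (2x4) times P_sj stacked on top of P_dj (4x2)."""
    return matmul(T_i, P_sj + P_dj)


# Unrelated, non-inbred founder parents s and d.
I2 = [[1.0, 0.0], [0.0, 1.0]]
Z2 = [[0.0, 0.0], [0.0, 0.0]]
P_ss, P_dd, P_sd, P_ds = I2, I2, Z2, Z2

# Rows: offspring alleles 1 (paternal) and 2 (maternal).
# Columns: sire allele 1, sire allele 2, dam allele 1, dam allele 2.
T_i = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]  # fully informative markers
T_j = [[0.9, 0.1, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]  # hypothetical values

# Tabulate i against its parents first, then reuse those results for full sib j.
P_is = identity_with_non_descendant(T_i, P_ss, P_ds)
P_id = identity_with_non_descendant(T_i, P_sd, P_dd)
P_ji = identity_with_non_descendant(T_j, transpose(P_is), transpose(P_id))

print(P_ji)
```

Processing individuals from oldest to youngest means that every matrix needed on the right-hand side has already been tabulated, which is why only one transmission matrix is required per step in this sketch.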

Formula (7) corresponds to Falconer’s [10] “basic rule” for calculating coancestry whereas formula (6) relates to the “supplementary rule”. Computationally, formula (6) is more efficient than formula (7). Both (6) and (7) indicate that the QTL allelic identity probabilities in a population can be tabulated recursively from ancestors to descendants.

The same principle applies to the derivation of QTL allelic identity probabilities of individual i with itself. Letting j = i, s′ = s and d′ = d in formula (7), and replacing the marginal probabilities with conditional probabilities in $T_j$ of formula (7) because the allelic transmission from parent to the first allele of offspring i is not independent of that to the second allele, the QTL allelic identity probabilities of individual i with itself ($P_{ii}$) can be derived from formula (7) and take the following form:

[...]

where $P_{sd}$ and t’s are as defined above and 1 = (1 1). Matrix $P_{ii}$ is always symmetric. When the parental origins of the two QTL alleles are known (e.g. $Q_i^1$ is from the father and $Q_i^2$ from the mother), formula (8) simplifies to

[...] the QTL identity probabilities of an individual i with itself when parental origins of offspring’s alleles are known. This explains why formula (8) of Van Arendonk et al. [3] works in the same way as the method of Wang et al. [35] when parental origins are known.
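To show how a tabulated identity matrix feeds the relationship coefficients, the sketch below converts a 2×2 matrix of allelic identity probabilities into an additive and a dominance coefficient. It uses the standard definitions (additive coefficient equal to twice the coancestry; dominance coefficient equal to the probability that both alleles are shared, treating the two identities as independent in the absence of inbreeding). These definitions and the function names are assumptions of this sketch and may differ in form from the expressions used by the authors.

```python
def additive_coefficient(P):
    """r_ij: twice the coancestry, i.e. half the sum of the four
    allelic identity probabilities in P."""
    return 0.5 * sum(sum(row) for row in P)


def dominance_coefficient(P):
    """u_ij: probability that both alleles of i are identical to both alleles
    of j, treating the two identities as independent (no inbreeding)."""
    return P[0][0] * P[1][1] + P[0][1] * P[1][0]


# Rows and columns are ordered as (paternal allele, maternal allele).
P_pedigree = [[0.5, 0.0], [0.0, 0.5]]  # full sibs, pedigree information only
P_marker = [[0.9, 0.0], [0.0, 1.0]]    # hypothetical marker-informed values

for P in (P_pedigree, P_marker):
    print(additive_coefficient(P), dominance_coefficient(P))
```

With pedigree information only, the full-sib values are the familiar 0.5 and 0.25; with informative markers the same pair can have coefficients well above or below these expectations at a given locus.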

2.4 QTL allelic transmission probabilities

The parental origin of QTL alleles is usually unknown because the QTL allelic transmission is not directly observable. Therefore, the eight transmission probabilities of QTL alleles from parents s and d to descendant i ($T_i$) have to be assessed based on marker alleles transmitted from parents s and d to the offspring i and genetic distances between QTL and markers. When two flanking markers are available, the transmission probability from QTL allele $k_p$ ($k_p$ = 1, 2) of parent p (p = s, d) to allele $k_i$ ($k_i$ = 1, 2) of descendant i can [...]

[...] is the conditional probability given in the 5th column of Table I when $k_p = 1$ and in the 6th column when $k_p = 2$.

Matrix $T_i$ can now be expressed in terms of marker allelic transmission probabilities, $S_i$, and recombination rates between QTL and markers and between flanking markers:

[...] case of Wang et al. [35]. Formula (10) is identical to formula (5) of Wang et al. [35] if their B matrix is transposed.
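The dependence of QTL transmission on flanking marker transmission can be illustrated with a short sketch. It assumes a parent with known haplotypes M1-Q1-N1 and M2-Q2-N2, positive recombination rates, and no crossover interference, so that the recombination rate between the two markers follows from the two marker-QTL rates. The function names and the rates used are hypothetical, and the values need not be arranged in the same way as the entries of Table I.

```python
def marker_recombination(theta_mq, theta_qn):
    """Recombination rate between flanking markers M and N, assuming no
    interference between the M-Q and Q-N intervals."""
    return theta_mq + theta_qn - 2.0 * theta_mq * theta_qn


def prob_q1_given_marker_gamete(gamete, theta_mq, theta_qn):
    """Pr(gamete carries Q1 | its marker alleles) for a parent with
    haplotypes M1-Q1-N1 and M2-Q2-N2.

    gamete is a pair such as (1, 2), meaning marker alleles M1 and N2.
    """
    theta_mn = marker_recombination(theta_mq, theta_qn)
    m, n = gamete
    # Q1 stays with M1 (or N1) unless the corresponding interval recombines.
    p_m = 1.0 - theta_mq if m == 1 else theta_mq
    p_n = 1.0 - theta_qn if n == 1 else theta_qn
    # Relative frequency of the marker gamete: parental or recombinant type.
    p_markers = 1.0 - theta_mn if m == n else theta_mn
    return p_m * p_n / p_markers


theta_mq, theta_qn = 0.1, 0.1  # hypothetical recombination rates
for gamete in ((1, 1), (1, 2), (2, 1), (2, 2)):
    print(gamete, prob_q1_given_marker_gamete(gamete, theta_mq, theta_qn))
```

A non-recombinant marker gamete carries the QTL allele of the same parental haplotype with high probability, whereas a recombinant marker gamete is much less informative; with equal marker-QTL distances it carries either QTL allele with probability one half.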


2.5 Marker haplotype transmission probability

Although marker genotypes can be observed through genotyping techniques, the parental origin of a descendant’s haplotype is often uncertain. For example, if a descendant and its parents all have genotype $A_1A_2$ at a single marker, there is no way to ascertain which parent the descendant’s haplotypes come from. Furthermore, when a parent is homozygous, it is impossible to determine which parental gamete a descendant’s haplotype comes from. In this development, we trace all possible paths from parental gametes to a descendant’s marker haplotype. Because the inference is always conditional on marker information, the notation for conditioning on marker information (|M) will be dropped hereafter for ease of presentation.

The assessment of the marker haplotype transmission involves three steps. First, the transmission probability of each path from parental gametes to a descendant’s haplotype needs to be quantified. For this, we need to infer which parent a descendant’s haplotype comes from (parental origin), and which parental gamete type the descendant’s haplotype originates from given the parental origin (gametic frequency). The probability of each transmission path is the product of the probability of the parental origin and the gametic frequency given parental origin, following the Law of Compound Probability [5]. There are four mutually exclusive paths for each descendant’s haplotype in a single marker case and eight in a flanking marker case. Second, we need to determine the probability of each descendant’s haplotype given the transmission path from a parental gamete to the descendant’s haplotype. This can be done by comparing the descendant’s haplotype with the parental gametic type. Third, our purpose is to determine the probability of each transmission path from parental gametes to a descendant’s haplotype given that the descendant’s haplotype is observed. This requires calculating the reverse probability of each path given the observed haplotype of the descendant using Bayes’ theorem [5].
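The three steps can be sketched for the single marker case as follows. The function name, the path labels and the example genotypes are hypothetical. The parental origin probabilities are passed in as an argument because their inference from the family genotypes is a separate step that this sketch does not attempt.

```python
def path_posteriors(observed, sire, dam, origin_prior=(0.5, 0.5)):
    """Posterior probability of each of the four transmission paths for one
    observed marker allele of a descendant (single marker case).

    sire and dam are pairs of marker alleles; origin_prior holds the
    probabilities that the observed allele is paternal or maternal.
    """
    paths = {}
    for parent, alleles, origin in (("s", sire, origin_prior[0]),
                                    ("d", dam, origin_prior[1])):
        for k, allele in enumerate(alleles, start=1):
            # Step 1: parental origin times gametic frequency (0.5 each).
            prior = origin * 0.5
            # Step 2: probability of the observed allele given the path.
            likelihood = 1.0 if allele == observed else 0.0
            paths[(parent, k)] = prior * likelihood
    total = sum(paths.values())
    if total == 0.0:
        raise ValueError("observed allele is incompatible with both parents")
    # Step 3: reverse (posterior) probability of each path.
    return {path: p / total for path, p in paths.items()}


# All three animals are A1A2: the A1 allele may come from either parent.
both_het = path_posteriors("A1", ("A1", "A2"), ("A1", "A2"))

# Homozygous dam, maternal origin known: the two dam gametes stay ambiguous.
hom_dam = path_posteriors("A2", ("A1", "A2"), ("A2", "A2"), origin_prior=(0.0, 1.0))

print(both_het)
print(hom_dam)
```

The two calls correspond to the two sources of uncertainty described in this section: an unknown parental origin splits the probability between sire and dam, and a homozygous parent splits it between that parent's two gametes.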

Consider the single marker case first. A marker haplotype $M_i^{k_i}$ [...]

There are two possible parental origins for $M_i^{k_i}$. It may be paternal, i.e. [...] sum to one as expected. In the single marker case, the frequencies of parental gametes given parental origins are all 0.5.

For each realization of $M_i^{k_i}$ [...]

In the case of flanking markers, there are eight mutually exclusive marker transmission paths for each haplotype $M_i^{k_i}$ [...] M and N, respectively. The paternal and maternal gametic frequencies given parental origins are (1 − θ)/2, θ/2, θ/2, and (1 − θ)/2. In a similar way, the probabilities of parental origins of $M_i^{k_i}$ [...] not be inferred, it is assumed that both Pr($M_i^{k_i}$ [...]
