Quantitative genetics; QTL; biotechnology in plant breeding; methods for identifying quantitative trait loci; quantitative genetic maps
Institute of Agriculture Sciences – Biotechnology Division
Trait Mapping (Quantitative Trait Loci)
IAS – Biotech Division
Robert J. Wright, Texas Tech University
Genes explaining variation in simple or
complex traits can easily be mapped to
chromosomes or linkage groups with
minimal a priori information.
What is a QTL?
• A quantitative trait locus (QTL) is the
location of a gene or genes that affect
a trait. A QTL is depicted as a
confidence interval on a genetic map.
– Examples of quantitative traits
• plant height
• grain yield
– These traits are typically affected by more
than one gene, and also by the
environment
– Thus, QTL mapping is not as simple as
mapping a single gene that influences a
qualitative trait (such as flower color)
Why map QTL?
• To provide knowledge towards a
fundamental understanding of heredity
and the gene(s) that control individual
traits
• To study individual gene(s), gene
actions and interactions
• To enable positional cloning of the gene
• To improve estimations of breeding
value and selection response through
marker-assisted selection (Predictive
Breeding)
DNA markers that are near a disease
resistance gene tend to be inherited
together with it (genetically linked).
Resistant Allele Susceptible Allele
During gamete formation the segregation
of the alleles of one allelic pair is
independent of the segregation of the
alleles of another allelic pair (Mendel’s
law of independent assortment).
Localize Target Gene to a Delimited Region
[Figure: linked loci with alleles D/d, C/c, and R/s]
• Identify genomic regions (QTLs) that
contribute to phenotypic variation of a
trait.
• Delineate the QTL location within a
confidence interval.
• Estimate the effects of the QTL,
putative gene action, and contribution
to the phenotypic variance.
Factors critical to successful QTL mapping
1. Good genetic map
2. Good phenotyping
3. Rigorous statistical/genetic analysis
4. Validation of QTL
Items 1 to 3 are absolutely critical for genetic analysis;
items 1 to 4 are all critical for implementation in breeding.
I. Good Genetic Mapping
– Critically look at your map
– Identify/remove poor marker loci
(disequilibrium and missing data)
– Identify alleles mapped as independent loci
– Conduct graphical assessment of map
quality
– Investigate alignment with published maps
Constructing a Genetic Map
• DNA Markers
– Restriction Fragment Length Polymorphism (RFLP)
– Amplified Fragment Length Polymorphism (AFLP)
– Simple Sequence Repeats (SSR)
– Single Nucleotide Polymorphism (SNP)
– Insertion/Deletion Mutations (INDEL)
• To be an informative genetic marker, the
DNA marker must meet two criteria:
– The marker must differentiate between the parents
(Polymorphic).
– The marker must be precisely transmitted to the
progeny (Mendelian Segregation).
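The second criterion can be checked with a chi-square goodness-of-fit test on the marker's genotype counts. The sketch below assumes an F2 population and a codominant marker with an expected 1:2:1 ratio; the function name and the counts are hypothetical, chosen only for illustration.

```python
def segregation_chi_square(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for marker genotype counts."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    chi2 = 0.0
    for obs, ratio in zip(observed, expected_ratio):
        expected = total * ratio / ratio_sum
        chi2 += (obs - expected) ** 2 / expected
    return chi2

# Critical value of chi-square with 2 degrees of freedom at alpha = 0.05.
CHI2_CRIT_DF2 = 5.991

# Hypothetical F2 counts for codominant markers, ordered (AA, Aa, aa).
example_markers = {
    "well_behaved": [24, 52, 24],
    "distorted": [45, 40, 15],
}

for name, counts in example_markers.items():
    chi2 = segregation_chi_square(counts, [1, 2, 1])
    verdict = "fits 1:2:1" if chi2 < CHI2_CRIT_DF2 else "segregation distortion"
    print(name, round(chi2, 2), verdict)
```

A marker that fails this test is a candidate for closer inspection or removal before map construction; the threshold used here is a common convention, not a value taken from the slides.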
Doubled haploid line (DH)
Created to minimize the confounding effect of heterogeneity of the parental genotypes (i.e., a
pure line is not genetically homogeneous).
Recombinant inbred line (RIL)
With polymorphic molecular markers and linkage maps as tools, mapping
QTL is simply a matter of growing and evaluating large populations of plants,
and of applying the appropriate statistical tools.
Stringent Computational Analysis
• Data Analysis
– Marker loci density
• 10-20 cM for genome wide linkage analysis
• 50-200 kb for genome wide association studies
– Pre-selection of marker type
Critically look at your map!
Scoring Marker Loci
Genotypes are scored as: [scoring key shown in a figure not preserved in the text]
Scoring Marker Loci: AFLP
Dominant Markers (P1 and P2)
Activities to improve map quality
1. Identify/remove poor marker loci
(disequilibrium and missing data)
Demo
2. Identify alleles mapped as independent loci
3. Conduct a graphical assessment of map
quality
4. Investigate alignment with published maps
Conduct a graphical assessment of map quality
[Figure: two-locus genotype classes, e.g. AABb and AaBB, with expected counts]
Numbers under each genotype indicate the expected
number of individuals (N = 400) at 40 cM and, in
parentheses, at 10 cM linkage between marker loci.
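Expected counts of this kind can be computed from the recombination fraction between the two loci. The sketch below is an illustration only: it assumes an F2 population from an AB/ab F1 (coupling phase) and converts map distance to recombination fraction with the Haldane mapping function. The slides do not say which mapping function or cross design was used for the figure, so the numbers need not match it.

```python
import math
from itertools import product

def haldane_r(cm):
    """Recombination fraction from map distance in cM (Haldane mapping function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * cm / 100.0))

def expected_f2_counts(cm, n=400):
    """Expected counts of two-locus F2 genotypes from an AB/ab F1 (assumed coupling phase)."""
    r = haldane_r(cm)
    # Parental gametes AB and ab; recombinant gametes Ab and aB.
    gametes = {"AB": (1 - r) / 2, "ab": (1 - r) / 2, "Ab": r / 2, "aB": r / 2}
    counts = {}
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        # Sort alleles within each locus so that "aA" and "Aa" are one genotype.
        locus_a = "".join(sorted(g1[0] + g2[0]))
        locus_b = "".join(sorted(g1[1] + g2[1]))
        genotype = locus_a + locus_b
        counts[genotype] = counts.get(genotype, 0.0) + n * p1 * p2
    return counts

for distance in (40, 10):
    counts = expected_f2_counts(distance)
    print(distance, "cM:", {g: round(c, 1) for g, c in sorted(counts.items())})
```

Tighter linkage concentrates individuals in the parental-type classes, which is why a graphical check of observed against expected class sizes can reveal misplaced markers.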
Investigate alignment with published maps
Figure 2. Colinearity of the G. herbaceum × G. arboreum linkage map with the Nguyen
et al. (2004) tetraploid map (N).
Discussion
II. Good Phenotyping
– Heritability
– Population design
– Replication vs re-sampling
– Reproducibility (measurement, equipment)
– How to handle false phenotypes
Phenotype = Genotype + Environment + Epistasis
Phenotype: the observable properties of an organism, produced by the interaction between the organism’s genetic potential (its genotype) and the environment in which it finds itself.
Qualitative Traits (Simple Inheritance)
Qualitative traits have a few possible phenotypes that
fall into discrete classes (Discrete Traits);
phenotypes are determined by one or a few genes
with minimal environmental influence.
Mendel studied seven characters (traits) in the garden pea
during his breeding experiments.
Bimodal Distribution
Mapping Traits as QTLs
[Figure: linkage maps with mapped intervals for Empire B2b6 Chromosome 20, Empire B2b6 LGD04, Empire B2b6 LGD02, Empire B3 Chromosome 20, Empire B2b6 Chromosome 14, and S295 Chromosome 14; interval lengths shown: 34.4 cM, 7.9 cM, 11.4 cM, 14.8 cM]
Wright et al. (1998) Genetics 149:1987-1996.
QTLs are the Genetic Contribution
37%
• Heritability: a measure of the degree to which the phenotype is
controlled by genetic factors and thus amenable to
genetic improvement in breeding
• Two main types of heritability:
– broad sense heritability - The proportion of
phenotypic variability that is due to all types of
genetic causes.
H² = VG/VP × 100
– narrow sense heritability - The proportion of
phenotypic variability that is due to heritable genetic
factors (e.g., may be passed from parents to
progeny).
h² = VA/VP × 100
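The two ratios can be worked through with a small numeric example. The variance components below are hypothetical, and the sketch assumes VG = VA + VD (epistatic variance ignored) and VP = VG + VE; neither assumption is stated on the slide.

```python
def broad_sense_heritability(v_g, v_p):
    """H^2 as a percentage: total genetic variance over phenotypic variance."""
    return v_g / v_p * 100.0

def narrow_sense_heritability(v_a, v_p):
    """h^2 as a percentage: additive genetic variance over phenotypic variance."""
    return v_a / v_p * 100.0

# Hypothetical variance components (illustrative values only).
v_a = 30.0   # additive genetic variance
v_d = 10.0   # dominance genetic variance
v_e = 60.0   # environmental variance

v_g = v_a + v_d   # assumption: no epistatic variance
v_p = v_g + v_e   # assumption: no genotype-by-environment term

print("H2 (%):", broad_sense_heritability(v_g, v_p))
print("h2 (%):", narrow_sense_heritability(v_a, v_p))
```

Because VA is only part of VG, the narrow sense value can never exceed the broad sense value under these assumptions.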
Why is good phenotyping important?
Response to selection and heritability
Predicted Gain from Selection
increase after one cycle of selection
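One standard way to express the predicted gain is the breeder's equation, R = h² × S, where S is the selection differential (mean of the selected parents minus the population mean). The slide's own formula is not preserved in the text, so the sketch below is a generic illustration with hypothetical numbers, with h² given as a proportion rather than a percentage.

```python
def predicted_gain(h2, selection_differential):
    """Breeder's equation R = h^2 * S, with h2 as a proportion between 0 and 1."""
    return h2 * selection_differential

# Hypothetical example: population mean 100, selected parents average 120.
population_mean = 100.0
selected_mean = 120.0
s = selected_mean - population_mean

for h2 in (0.3, 0.6):
    gain = predicted_gain(h2, s)
    print("h2 =", h2, "-> predicted mean after one cycle:", population_mean + gain)
```

The same selection differential yields twice the gain when heritability doubles, which is one reason careful phenotyping pays off.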
(Marker density does not solve this problem)
The bias in estimated genetic variance (σ²G) occurs mainly due to sampling of small
populations; QTLs with small effects tend not to be detected, but when detected the
estimated genetic effects appear much larger than they really are. This
phenomenon is known as “The Beavis Effect”.
α = 0.0001, power = 90%, F2 population
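The inflation of detected effects can be seen in a small simulation. The sketch below is a simplified stand-in, not the analysis behind the slide: it assumes a single marker sitting exactly on the QTL, two genotype classes, unit residual variance, and a fixed t-statistic threshold of 3.0. All parameter values are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def detected_effect_estimates(true_effect, n_plants, n_reps, t_threshold, seed=1):
    """Simulate repeated small mapping experiments at one marker.

    Returns the estimated effects from only those replicates in which
    the marker passed the significance threshold.
    """
    rng = random.Random(seed)
    detected = []
    for _ in range(n_reps):
        groups = ([], [])
        for _ in range(n_plants):
            genotype = rng.randint(0, 1)
            groups[genotype].append(rng.gauss(0.0, 1.0) + true_effect * genotype)
        if min(len(groups[0]), len(groups[1])) < 2:
            continue
        diff = mean(groups[1]) - mean(groups[0])
        se = (sample_variance(groups[0]) / len(groups[0])
              + sample_variance(groups[1]) / len(groups[1])) ** 0.5
        if abs(diff / se) > t_threshold:
            detected.append(diff)
    return detected

TRUE_EFFECT = 0.3  # in units of the residual standard deviation
detected = detected_effect_estimates(TRUE_EFFECT, n_plants=100, n_reps=2000, t_threshold=3.0)
print("replicates with a detected QTL:", len(detected))
print("true effect:", TRUE_EFFECT)
print("mean estimated effect when detected:", round(mean(detected), 3))
```

Only the replicates that happen to overestimate the effect cross the threshold, so the average of the reported estimates sits well above the true value. Adding markers does not change this; a larger population does.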
Doubled haploid line (DH)
Recombinant inbred line (RIL)
Individual plant phenotypes lead to lower heritability estimates,
and thus to more false positives and more QTLs that escape detection.
Simplifying Phenotypic Complexity
Yield Components – Grain Sorghum
GRAIN YIELD (Mass/Area)
– Seed Number/Area
• Harvested Heads/Area (Plant Population, Tillers)
• Seed/Head
– Seed Size (g/1000 Seed)
• Volume
• Density
+ 0.015 (panicle/ha)
+ 96.79 (g/10³ seed)
0.57 0.12 0.12 0.81
Simplifying Phenotypic Complexity
Yield Components – Cotton
COTTON LINT YIELD (Mass/Area)
– Plant Population
– Bolls/Plant
• Number of Fruiting Sites
• Retention of Fruit
– Lint/Boll
• Seed/Boll
• Lint/Seed
Multiple Regression Analysis of Cotton Lint Yield
[Table; layout not recoverable. Labels: Source, Component, Parameter; Lint Yield, Intercept, Bolls plant⁻¹, Plants acre⁻¹. Values: 0.00, 16.77, 38.42; 6.43, 1.51; 0.374, 0.376]
[Figure labels: Var A, Var B, Check 98032]
Replication vs Resampling
Example: QTL mapping of enzyme activity
– Biological Replication: beneficial for estimating heritability and for QTL mapping
– Resampling (Technical Replication): beneficial
for estimates of reproducibility
Discussion
III. Rigorous statistical/genetic analysis
– Critically assess each QTL region
– Identify an appropriate significance criterion
– Provide confidence intervals
– Fixing QTL effects
– Dominant – Additive effects
– If there are multiple generations of testing or replicated
testing: do QTL regions align? Are
additive and dominant effects similar?
– Does the full model (including all QTL regions)
agree with heritability values for each trait?
– Check alignment with prior research
– Don’t reinvent the annotation
Critically look at your QTLs!
• Regress phenotypes on codes
• Significance = marker linked to QTL
• Regression slope = estimate of QTL effect
• Interval mapping
– Maximum Likelihood Method
• Numeric codes are given to marker genotypes
(e.g., AA = 0, aa = 1)
• Association of the phenotypes with the codes
• Significance = target interval contains a QTL
• Split plants into groups
according to their genotype
at a marker
• Do an ANOVA (or t-test)
• Repeat for each marker
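The regression form of single-marker analysis can be sketched in a few lines. The code below regresses phenotype on numeric marker codes (AA = 0, aa = 1) by ordinary least squares and repeats the test for each marker. The marker names and data are hypothetical, and the t statistic is left for comparison against whatever threshold the analyst chooses.

```python
def marker_regression(codes, phenotypes):
    """Least-squares regression of phenotype on numeric marker codes.

    Returns (slope, intercept, t_statistic); the slope estimates the QTL effect.
    """
    n = len(codes)
    mean_x = sum(codes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in codes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(codes, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(codes, phenotypes))
    se_slope = (rss / (n - 2) / sxx) ** 0.5
    return slope, intercept, slope / se_slope

# Hypothetical data: eight plants, two markers coded AA = 0, aa = 1.
phenotypes = [10.1, 9.8, 10.4, 9.9, 12.2, 11.7, 12.0, 12.3]
markers = {
    "marker_linked": [0, 0, 0, 0, 1, 1, 1, 1],
    "marker_unlinked": [0, 1, 0, 1, 0, 1, 0, 1],
}

# Repeat the test for each marker.
for name, codes in markers.items():
    slope, intercept, t_stat = marker_regression(codes, phenotypes)
    print(name, "slope =", round(slope, 3), "t =", round(t_stat, 2))
```

With two genotype classes this regression is equivalent to the two-group t-test described in the bullets; with three classes an ANOVA on genotype groups would be the analogous test.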
– Suffers in low density scans
– Only considers one QTL at a time
Interval mapping
Lander and Botstein 1989
• Imagine that there is a single QTL, at position z.
• Let qi = genotype of mouse i at the QTL, and assume
yi | qi ~ Normal(μ_qi, σ).
• We won’t know qi, but we can calculate (by an HMM)
p_ig = Pr(qi = g | marker data).
• yi, given the marker data, follows a mixture of normal
distributions with known mixing proportions (the p_ig).
• Use an EM algorithm to get MLEs of θ = (μAA, μAB, μBB, σ).
• Measure the evidence for a QTL via the LOD score, which is
the log10 likelihood ratio comparing the hypothesis of a single
QTL at position z to the hypothesis of no QTL anywhere.
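When the genotype at the tested position is known for every individual (for example, at a fully genotyped marker), the mixing proportions are all 0 or 1 and no EM step is needed: the maximum likelihood means are the genotype group means, and the LOD score reduces to (n/2) · log10(RSS0/RSS1). The sketch below covers only this special case, not the full interval mapping procedure between markers. The data and function name are hypothetical.

```python
import math

def lod_known_genotypes(genotypes, phenotypes):
    """LOD score for a normal model with one mean per genotype class,
    against a null model with a single common mean.

    Assumes genotypes are fully known, so no EM algorithm is required.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    rss_qtl = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        rss_qtl += sum((y - group_mean) ** 2 for y in ys)
    # log10 likelihood ratio of the two fitted normal models.
    return (n / 2.0) * math.log10(rss_null / rss_qtl)

# Hypothetical F2-style data with three genotype classes.
genotypes = ["AA", "AA", "AB", "AB", "BB", "BB"]
phenotypes = [10.0, 10.4, 11.1, 10.9, 12.0, 12.2]
print("LOD:", round(lod_known_genotypes(genotypes, phenotypes), 2))
```

Between markers the genotype is uncertain, and that is where the HMM-derived probabilities and the EM algorithm from the bullets above come in.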
Qeffects
(crossover) event occurring between them in meiosis
the likelihood that they are not linked
A LOD score of three or more is generally taken to indicate that two gene loci are close to each other on the chromosome (A LOD score of three means the odds are a thousand to one in favor of genetic linkage).
LOD thresholds
• To account for the genome-wide search,
compare the observed LOD scores to
the distribution of the maximum LOD
score, genome-wide, that would be
obtained if there were no QTL
anywhere.
• The 95th percentile of this distribution
is used as a significance threshold.
• Such a threshold may be estimated via
permutations (Churchill and Doerge).
• Repeat many times
• LOD threshold = 95th percentile of M*
• P-value = Pr(M* ≥ M)
(M is the observed genome-wide maximum LOD score; M* is the maximum obtained from a permuted data set.)
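The permutation procedure can be sketched as follows: shuffle the phenotypes relative to the genotypes, record the maximum LOD across all markers (M*), repeat, and take the 95th percentile. This is a minimal illustration with simulated, hypothetical data and a simple LOD computed at fully genotyped markers; a real analysis would reuse the LOD from its own mapping method.

```python
import math
import random

def lod_score(genotypes, phenotypes):
    """LOD at a fully genotyped marker: (n/2) * log10(RSS_null / RSS_qtl)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    rss_qtl = 0.0
    for ys in groups.values():
        m = sum(ys) / len(ys)
        rss_qtl += sum((y - m) ** 2 for y in ys)
    return (n / 2.0) * math.log10(rss_null / rss_qtl)

def max_lod(markers, phenotypes):
    """Genome-wide maximum LOD across all markers."""
    return max(lod_score(g, phenotypes) for g in markers.values())

def permutation_test(markers, phenotypes, n_perm=500, seed=0):
    """Return (observed M, 95th-percentile threshold of M*, permutation p-value)."""
    rng = random.Random(seed)
    observed = max_lod(markers, phenotypes)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        null_max.append(max_lod(markers, shuffled))
    null_max.sort()
    threshold = null_max[math.ceil(0.95 * n_perm) - 1]
    p_value = sum(m >= observed for m in null_max) / n_perm
    return observed, threshold, p_value

# Hypothetical data: 60 individuals, three markers, a QTL at marker "m2".
data_rng = random.Random(42)
n_individuals = 60
markers = {name: [data_rng.randint(0, 1) for _ in range(n_individuals)]
           for name in ("m1", "m2", "m3")}
phenotypes = [data_rng.gauss(0.0, 1.0) + 2.0 * g for g in markers["m2"]]

observed, threshold, p_value = permutation_test(markers, phenotypes)
print("observed max LOD:", round(observed, 2))
print("95% permutation threshold:", round(threshold, 2))
print("permutation p-value:", p_value)
```

Shuffling phenotypes keeps the marker correlation structure intact, so the threshold adapts to the marker density and population at hand rather than relying on a fixed LOD cutoff.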
The bar along the linkage group indicates the 90%
(1-LOD) likelihood interval for the QTL, and whiskers indicate the 99% (2-LOD) likelihood interval.
This interval is expected to contain a gene or genes that explain the phenotypic variation of the trait.
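A LOD-drop interval of this kind can be read off a LOD profile by finding the contiguous region around the peak where the LOD stays within 1 (or 2) units of the maximum. The sketch below works on a hypothetical profile sampled on a coarse grid and reports grid positions only; real software typically interpolates between positions.

```python
def lod_support_interval(positions, lods, drop=1.0):
    """Return (left, peak, right) positions of the LOD-drop support interval.

    The interval is the contiguous run of grid points around the peak
    whose LOD is at least (peak LOD - drop).
    """
    peak_index = max(range(len(lods)), key=lambda i: lods[i])
    cutoff = lods[peak_index] - drop
    left = peak_index
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak_index
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak_index], positions[right]

# Hypothetical LOD profile along one linkage group (positions in cM).
positions = [0, 5, 10, 15, 20, 25, 30]
lods = [0.5, 2.2, 3.2, 4.1, 3.4, 2.3, 0.7]

print("1-LOD interval:", lod_support_interval(positions, lods, drop=1.0))
print("2-LOD interval:", lod_support_interval(positions, lods, drop=2.0))
```

The 2-LOD interval always contains the 1-LOD interval, matching the bar-and-whisker display described above.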
Mapping QTLs
• The critical factor in successfully
mapping quantitative traits is a detailed
understanding of the traits and the
precision with which individuals can be
phenotyped.
[Figure annotations: LOD = 3.6, PVE = 14.7; LOD = 10.2, PVE = 56.5]
[Table columns: Population; Xcm Race; Probable gene identity; Nearest DNA; Gene action]
Additive effects: the effects of the homozygous genotypes.
It’s calculated as: (Homozygous for P1 – Homozygous for P2)/2
A positive effect reflects greater growth of the P1 homozygote
A negative effect reflects greater growth of the P2 homozygote
Dominance effects: the deviation of the heterozygote from the midparent.
This deviation will be detected in an F2 population
It’s calculated as: Heterozygous – [(P1+P2)/2]
A positive effect reflects growth of the heterozygote that exceeds
the midparent
A negative effect reflects growth that is less than the midparent
Percentage of variation explained (PVE) is the percentage of phenotypic
variation that is explained by the QTL
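The two formulas can be checked with a short worked example. The genotype class means below are hypothetical; the functions simply encode the calculations given in the text.

```python
def additive_effect(mean_p1_homozygote, mean_p2_homozygote):
    """(Homozygous for P1 - Homozygous for P2) / 2."""
    return (mean_p1_homozygote - mean_p2_homozygote) / 2.0

def dominance_effect(mean_heterozygote, mean_p1_homozygote, mean_p2_homozygote):
    """Heterozygous - midparent, where midparent = (P1 + P2) / 2."""
    midparent = (mean_p1_homozygote + mean_p2_homozygote) / 2.0
    return mean_heterozygote - midparent

# Hypothetical genotype class means at a QTL in an F2 population.
p1_homozygote = 120.0
p2_homozygote = 100.0
heterozygote = 115.0

a = additive_effect(p1_homozygote, p2_homozygote)
d = dominance_effect(heterozygote, p1_homozygote, p2_homozygote)
print("additive effect:", a)    # positive: the P1 homozygote grows more
print("dominance effect:", d)   # positive: the heterozygote exceeds the midparent
```

The ratio d/a is often used as a rough indicator of gene action, though how it is classified depends on conventions not given in these slides.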
QTLs Mapped in Multiple Generations
Figure 3. Chromosomal location of QTL conferring resistance to Thielaviopsis
basicola. Bars along the linkage groups indicate 90% (1-LOD) likelihood intervals for
the QTLs, and whiskers indicate 99% (2-LOD) likelihood intervals. Marker names and
genetic distances (in centimorgans) are shown at the right and left of each linkage
group.
Table 1. Biometrical parameters of individual QTLs conferring resistance to Black Root Rot
a Markers flanking the QTL likelihood peak
b Position of the QTL likelihood peak (centimorgans from top)
c Biometrical parameters were calculated using dominance and recessiveness to refer to
the behavior of the G. herbaceum alleles
Check QTL alignment with prior research
Figure 1. A CMap display of the comparison between an individual map and a consensus map. The left map is chromosome 16 from the n2 population, including one QTL (MIC) shown on the left side of the map. The right map is consensus homoeologous group 4. The syntenic regions with Arabidopsis duplicates and QTLs in this chromosome were aligned on the right side of the map. The numbers after D or Ds represent the different Arabidopsis duplicates, and the numbers after the dot represent the different syntenic regions on the cotton chromosome. The experimental treatment name (Jiang et al. 1998a) and particular measurement for fiber length (Chee et al. 2005b) were presented in parentheses after each QTL name. Detailed explanations of these descriptions can be found in the references cited.
Discussion
IV. QTL Validation and Efficacy
– Selection of genotypes
– Testing
– Cross validation (split the data set: map in the
first part and validate in the second)