
Association study of ABCA1 polymorphisms in Singapore populations 3


3 Literature Review II: Genetic Analysis of Complex Traits

3.1 Key Definitions

3.1.1 Complex Trait

A complex trait refers to any phenotype that does not segregate in a classic, one-locus Mendelian fashion (Lander and Schork, 1994). In other words, there is poor one-to-one correspondence between genotype and phenotype. The same genotype can lead to different phenotypes due to the effects of chance, environment or interactions with other genes. Conversely, mutations in different genes can give rise to identical phenotypes (genetic or locus heterogeneity), such as when the genes are involved in a common biochemical pathway. Some traits require the simultaneous presence of mutations in multiple genes (polygenic inheritance), each with a relatively small effect. Some individuals who inherit a predisposing allele may not manifest the trait (incomplete penetrance), whereas others who inherit no predisposing allele may nonetheless acquire the condition due to environmental or random causes (phenocopy).

3.1.2 Single Nucleotide Polymorphisms (SNPs)

SNPs are single base substitutions that account for about 90% of the common genetic variation in the human genome, with insertion/deletion (indel) and length polymorphisms providing the rest. Operationally, a SNP is defined as having a minor allele frequency of at least 1%. SNPs occur about once in every 300-1000 bp (Cargill et al., 1999; Halushka et al., 1999; Stephens et al., 2001; Carlson et al., 2003).
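The operational 1% threshold can be checked directly from genotype counts at a biallelic site. The sketch below is illustrative only: the function names and the example counts are hypothetical and are not taken from the cited studies.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return the minor allele frequency at a biallelic site.

    n_aa, n_ab and n_bb are the numbers of individuals with genotypes
    AA, AB and BB respectively (hypothetical input format).
    """
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq_a, 1.0 - freq_a)


def is_snp(n_aa, n_ab, n_bb, threshold=0.01):
    """Apply the operational definition: minor allele frequency >= 1%."""
    return minor_allele_frequency(n_aa, n_ab, n_bb) >= threshold


# Hypothetical sample of 500 individuals.
print(round(minor_allele_frequency(405, 90, 5), 3), is_snp(405, 90, 5))
```

A variant seen in only 2 of 2000 chromosomes, for example `is_snp(998, 2, 0)`, would fall below the threshold and be treated as a rare variant rather than a SNP under this definition.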

SNPs possess distinct advantages over repeat polymorphisms such as short tandem repeats (also known as microsatellites) in the genetic analysis of complex traits. They are abundant, with more than five million SNPs with minor allele frequencies greater than 10% expected to exist, and are distributed throughout the genome in coding and noncoding regions. Thus one can use SNPs for fine-mapping in positional cloning efforts or as candidates to test directly as the causal mutations for a trait. Various experimental as well as in silico strategies are available for the discovery of novel SNPs, although resequencing remains the gold standard. SNPs are also easy to genotype and are amenable to high-throughput genotyping technologies (Schork et al., 2000; Kirk et al., 2002). SNPs are generally more stable than microsatellites, which can undergo slippage during replication. This greater stability allows a more reliable assessment of linkage disequilibrium (LD) relationships, locus association and co-segregation.

3.2 Strategies for Dissecting Genetic Basis of Complex Traits

Genetic dissection of complex traits may be carried out using linkage analysis, allele sharing methods, association studies and experimental models.

3.2.1 Linkage Analysis

Linkage studies use pedigrees to examine whether genetic markers tend to co-segregate with a disease or other phenotype of interest. During meiosis, alleles of two loci that are located on different chromosomes, or are widely separated on the same chromosome, segregate independently of each other. Markers closest to the causative locus are therefore most strongly correlated with the phenotype, whereas distant markers do not segregate with the trait because recombination during meiosis breaks down the association. Linkage analysis is also sometimes referred to as positional cloning.
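The strength of co-segregation is conventionally summarized by a LOD score, a statistic the text above does not name. It compares the likelihood of the observed recombinants at a recombination fraction θ with the likelihood under independent segregation (θ = 0.5). The sketch below assumes phase-known, fully informative meioses, which is a teaching simplification; the function name and the counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score for a recombination fraction theta (0 < theta <= 0.5).

    Assumes phase-known, fully informative meioses (a simplification).
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    # log10 likelihood if marker and trait locus are linked at distance theta
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1.0 - theta))
    # log10 likelihood under independent segregation (theta = 0.5)
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical data: 1 recombinant among 10 informative meioses.
for theta in (0.05, 0.1, 0.2, 0.5):
    print(theta, round(lod_score(1, 10, theta), 2))
```

The score is highest near θ = 0.1, the observed recombinant fraction, and falls to zero at θ = 0.5, which mirrors the statement that distant markers carry no information about the trait locus.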

A complete dissection of the genetic basis of a disease entails several steps: linkage studies, in which kindreds with multiple affected family members are genotyped using ~400 microsatellite markers spaced 10 cM apart throughout the genome; narrowing of the susceptibility locus using SNPs, which are more abundant than microsatellites; and finally sequencing and identification of the causative mutation in the candidate gene.


Linkage studies have been extremely effective for locating genes involved in rare, simple monogenic Mendelian traits, which are typically characterized by genes with large effects (also known as high displacement), strong genotype-phenotype correlation and high heritability, and the approach is robust to allelic and locus heterogeneity (Risch, 2000). For complex diseases, linkage analysis of susceptibility genes is less powerful because it is performed without a known mode of inheritance or an estimated allele frequency (both derived from segregation analysis), and because many unaffected individuals also carry the susceptibility alleles (Bogardus et al., 2002). Linkage studies are generally effective for loci with large genotypic risk ratios (GRR) of at least four, but not for loci with a GRR of two or less; even then, positional cloning may prove daunting because the candidate region is large (Risch and Merikangas, 1996). The few successes are largely confined to loci with low allele frequency and Mendelian-like inheritance, e.g. the BRCA-1 and BRCA-2 genes in breast cancer, and the β-amyloid precursor protein, presenilin-1 and presenilin-2 genes in Alzheimer’s disease (Risch, 2000).

3.2.2 Allele Sharing Methods

Allele sharing methods, like classical linkage analysis, rely on pedigree information, but they differ in being non-parametric: they make no assumptions about the mode of inheritance of the disease, the population frequency of the disease gene, and so on (Lander and Schork, 1994; Ewens and Spielman, 2001). The affected sib pair method is the simplest allele sharing approach. Consider a locus A and a parent who is heterozygous at this locus, A1A2. A parent of two affected sibs could have passed either the same allele (A1 or A2) to both sibs, or A1 to one sib and A2 to the other. If locus A is linked to the disease, affected sibs will receive the same parental allele more often than the 50% expected by chance. Allele sharing methods are more robust than linkage analysis because affected relatives always show excess sharing of alleles, even in the presence of incomplete penetrance, genetic heterogeneity and high-frequency disease alleles (in which the expected Mendelian inheritance is confounded by multiple copies of the disease-causing allele segregating in the pedigree; Lander and Schork, 1994). The trade-off is that they can be less powerful than linkage analysis when the correct linkage model is specified.
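The affected sib pair logic can be turned into a simple test: among heterozygous parents of affected sib pairs, count how often both sibs received the same allele and ask whether that proportion exceeds 50%. The sketch below uses a one-sided exact binomial test as an illustrative choice; the function name and the counts are hypothetical, and real analyses must also handle uninformative or ambiguous transmissions.

```python
from math import comb


def sib_pair_sharing_test(n_shared, n_informative):
    """One-sided exact binomial test of excess allele sharing.

    n_informative: heterozygous parents whose transmissions to both
                   affected sibs could be scored (hypothetical input).
    n_shared:      how many of them passed the same allele to both sibs.
    Returns (observed sharing proportion, p-value under 50% sharing).
    """
    if n_informative <= 0 or not 0 <= n_shared <= n_informative:
        raise ValueError("need 0 <= n_shared <= n_informative, n_informative > 0")
    # P(X >= n_shared) when X ~ Binomial(n_informative, 0.5)
    tail = sum(comb(n_informative, k)
               for k in range(n_shared, n_informative + 1))
    p_value = tail / 2 ** n_informative
    return n_shared / n_informative, p_value


# Hypothetical data: the same allele was shared in 70 of 100 transmissions.
proportion, p_value = sib_pair_sharing_test(70, 100)
print(proportion, p_value)
```

No mode of inheritance or disease allele frequency enters the calculation, which is what makes the method non-parametric.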

3.2.3 Genetic Association Studies

Genetic association methods detect differences in the frequencies of genetic markers between affected (case) and unaffected (control) individuals. Unlike linkage studies, which involve pedigrees, association is tested at the population level, and the statistical analysis is generally straightforward. The presence of an association implies that the marker is itself directly functional or is in close LD with the causative allele; hence association studies are sometimes referred to as LD mapping. LD occurs when an allele at one locus is found on the same haplotype as a specific allele at another locus more often than expected by chance. However, there are other reasons for statistical association, such as chance effects due to multiple testing and confounding, for example by cryptic population structure or another risk factor. Association methods are expected to be more powerful than linkage studies for the detection of common disease alleles that confer modest risk (Risch and Merikangas, 1996). This reflects the fact that, for modest-risk alleles, the patterns of allele sharing among affected family members are less striking than those between affected unrelated individuals. Another practical advantage of association studies is that it is easier to enroll large numbers of affected unrelated individuals than large numbers of pedigrees, each with multiple affected family members, especially for late-onset diseases. However, the region shared among unrelated affected individuals is narrower, which implies that whole-genome association studies require much higher marker densities than linkage analysis, on the order of hundreds of thousands of markers (Kruglyak and Nickerson, 2001).
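LD between two biallelic loci is usually quantified by the coefficient D and its normalized forms D' and r². The sketch below computes all three from a haplotype frequency and the two allele frequencies; the function name and the example frequencies are hypothetical, and the haplotype frequencies are assumed to be known rather than estimated from unphased genotypes.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at the first locus
          and allele B at the second locus
    p_a:  frequency of allele A; p_b: frequency of allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    # D is the excess of the AB haplotype over its expectation
    # under independence of the two loci.
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r_squared


# Hypothetical frequencies: allele B occurs only on haplotypes carrying A.
print(ld_measures(p_ab=0.2, p_a=0.5, p_b=0.2))
```

In this example D' is at its maximum while r² is only about 0.25, which illustrates why a marker in complete LD by one measure may still be a weak surrogate for a causative allele of different frequency.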


3.2.4 Insights from Model Systems

With animal models, there are two ways of assigning genes to a physiological process. The phenotype-driven approach involves linkage mapping of naturally occurring or induced mutations, followed by screening of positional candidates. Conversely, in the genotype-driven approach, the effect of a known gene on physiology is investigated by engineering transgenic or knockout models. The use of inbred strains limits the number of positional candidate genes to those that differ between the two strains. Rare alleles with large displacement can become rapidly fixed by many generations of positive selection. After initial mapping, the physiological effect of individual polygenic factors may be studied further by constructing transgenic or knockout animals. Once a locus has been mapped successfully in a model organism, syntenic conservation between animals and humans can be exploited to identify equivalent candidate regions in the human genome. For example, the recent identification of the Ipr1 gene in a murine model of tuberculosis suggests that the human homolog SP110 could be a strong candidate gene in determining host susceptibility to tuberculosis (Pan et al., 2005).

3.2.5 Other Approaches for Studying Disease Phenotypes

Gene expression profiling using microarrays (Weiss and Terwilliger, 2000) and RNA interference (RNAi) technology enable rapid screening and testing, respectively, of candidate genes involved in a disease or biological pathway. In the former, the assumption is that the susceptibility allele causes differential gene expression between tissues from affected and unaffected individuals. In RNAi, the mRNAs of candidate genes are specifically targeted for degradation by engineering homologous double-stranded RNA molecules.


3.3 Allelic Spectrum of Complex Diseases

Much of the enthusiasm for using association studies to map complex traits subscribes, explicitly or implicitly, to the common disease-common variant (CDCV) hypothesis. The CDCV hypothesis posits that the genetic variation underlying susceptibility to complex diseases arose spontaneously within the founding population of modern humans and gradually disseminated globally. The disease susceptibility alleles have persisted to reach moderate frequencies, presumably because they were originally neutral and not under selective pressure. Only recently did a trigger arise, e.g. a change in environment or an interaction with other genes, that led to the manifestation of the disease. In the alternative common disease-rare variant (CDRV) model, multiple rare alleles predominate, each contributing a small fraction to the population disease risk (Pritchard and Cox, 2002).

Existing data support both models. An example often quoted by proponents of the CDCV model is the ApoE gene, in which a single common allele, the ε4 allele with a frequency of 5-41%, increases the risk of heart disease and Alzheimer’s disease (Pritchard and Cox, 2002). Another commonly cited example is the highly prevalent Pro12Ala polymorphism (frequency >75%) in the PPAR-γ gene, which is associated with a 25% reduction in Type 2 diabetes risk (Stumvoll and Haring, 2002). On the other hand, multiple rare variants in the CARD15/NOD2 gene contribute to susceptibility in about 20% of patients with Crohn’s disease, a common and chronic inflammatory bowel disorder (Hugot et al., 2001; Ogura et al., 2001). In yet another example, both common and rare variants are believed to contribute to phenotypic variation in HDL levels in the general population (Cohen et al., 2004; Frikke-Schmidt et al., 2004). The hypothesis of interest to the investigator determines the methodological considerations. For the CDRV hypothesis, a comprehensive analysis by resequencing is preferable to genotyping of known variants.

3.4 Case-Control Association Studies

The oldest, simplest and most common form of association study design involves drawing two random samples, cases and controls, from the population and comparing the distributions of risk factors between them. Another association study design, the cohort study, assembles a group of individuals who are followed over time to determine the frequency with which the disease develops. The cohort study is prospective, whereas the case-control study is retrospective in nature. The reasons for favoring a case-control study over a full cohort design are almost always practical. The clearest advantage arises when the disease is rare and the exposure of interest is common. Its retrospective nature saves time, and subjects are also easy to enroll. The cohort study is more expensive and labour-intensive but, on the other hand, has more credibility and offers an opportunity to collect more reliable exposure information. Case-control studies can suffer from (i) selection bias due to inappropriate sampling of cases and controls, creating non-comparability between them; (ii) information bias caused by measurement errors because the disease status is known to the researcher and subject when measurements are taken; and (iii) incidence or prevalence bias, which is the failure to define the nature of the disease variable (Clayton, 2001). Proper matching and the use of genetic markers, which are invariant throughout life, will reduce selection and information biases.
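The text does not name an effect measure, but the odds ratio is the usual choice for case-control data because it can be estimated without knowing the disease frequency in the population. The sketch below computes it with an approximate 95% confidence interval using the standard log (Woolf) method; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with an approximate confidence interval (log method).

    z = 1.96 gives an approximate 95% interval. All four cells must be
    positive; sparse tables need a different method.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical study: 60 of 100 cases and 40 of 100 controls carry the factor.
print(odds_ratio_ci(60, 40, 40, 60))
```

An interval that excludes 1 suggests an association, but it does not address the selection and information biases listed above, which must be handled in the study design.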

3.4.1 Case-Control Genetic Data Analysis

The analysis of genetic case-control data can range from exquisitely straightforward to complex. Traditionally, allele and genotype frequencies at each genetic marker are compared between cases and controls using conventional 2×2 or 2×3 contingency table analyses, respectively. Deviation from Hardy-Weinberg equilibrium (HWE) in the case sample, after ruling out genotyping error, may also be taken as initial evidence of association (Clayton, 2001; Botstein and Risch, 2003). Logistic and loglinear regression methods have the added advantage of being able to adjust for confounding variables that could not be controlled in the experimental design (Clayton, 2001). Use of multi-locus information such as haplotypes can offer greater power than individual SNPs in detecting associations by increasing information and accommodating potential locus heterogeneity (multiple genes contributing to the overall risk) and allelic heterogeneity (multiple alleles contributing to the overall risk), as well as weak disequilibria among markers and functional variant sites (Judson et al., 2000; Akey et al., 2001; Fallin et al., 2001; Morris and Kaplan, 2002; Hoh and Ott, 2003). For instance, in an association study of SNPs spanning a 1.5 Mb region around the Alzheimer disease locus, APOE, Martin et al. (2000) demonstrated that haplotype-based analysis detected statistically stronger association than single-SNP analysis alone and, furthermore, improved fine localization of the susceptibility allele ApoE-ε4, even though the latter was not examined directly in the analysis.
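The contingency table comparisons and the HWE check described above can be sketched with the Pearson chi-square statistic. The code below is a minimal illustration using only the standard library; the function names and the genotype counts are hypothetical, all margins are assumed to be non-zero, and p-values are given only for 1 and 2 degrees of freedom, where closed forms exist.

```python
import math


def chi_square_stat(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_p(stat, df):
    """Upper-tail p-value; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 is supported in this sketch")


def allele_counts(n_aa, n_ab, n_bb):
    """Convert genotype counts (AA, AB, BB) to allele counts (A, B)."""
    return [2 * n_aa + n_ab, 2 * n_bb + n_ab]


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 df) for deviation from HWE."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    return sum((o - e) ** 2 / e
               for o, e in zip((n_aa, n_ab, n_bb), expected))


# Hypothetical genotype counts (AA, AB, BB) at one marker.
cases, controls = (30, 50, 20), (50, 40, 10)

genotype_stat = chi_square_stat([cases, controls])          # 2x3 table, 2 df
allele_stat = chi_square_stat([allele_counts(*cases),
                               allele_counts(*controls)])   # 2x2 table, 1 df
print(genotype_stat, chi_square_p(genotype_stat, 2))
print(allele_stat, chi_square_p(allele_stat, 1))
print(hwe_chi_square(*cases))
```

The allele-based test treats the two alleles of an individual as independent observations, which itself presumes HWE, so the genotype-based test is the safer default when HWE is in doubt.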

3.5 Prioritizing Polymorphisms for an Association Study

Risch and Merikangas (1996) originally proposed studying coding or promoter variants that are most likely to affect the function of a protein or its regulation. In the alternative approach suggested by Collins et al. (1997), however, sequence variants across the entire genome, including noncoding ones, can serve as genetic markers to detect association by virtue of LD. The effect of coding SNPs can be predicted using simple criteria, such as scoring the severity of the amino acid substitution with BLOSUM62 or Grantham values, or alternatively through comparative sequence analysis using alignment with evolutionarily related paralogues or orthologues (Stephens et al., 2001; Leabman et al., 2003; Shu et al., 2003). If a promoter SNP resides in a highly conserved region, it is likely to have an impact. In addition, the effects of promoter SNPs can be verified experimentally using reporter gene assays, but this can be time-consuming and requires technical expertise. Although their effects are more difficult to predict and thus less well characterized, synonymous SNPs that do not alter the protein sequence, and SNPs in non-coding regions such as the 3’ untranslated regions or introns, may potentially affect the stability, splicing or localization of the mRNA (Pagani and Baralle, 2004). Guidelines for selecting SNPs for a candidate gene association study have been recently published by Tabor et al. (2002).
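Before any severity score such as BLOSUM62 or Grantham can be applied, a coding SNP must first be classified by whether it changes the encoded amino acid. The sketch below does only that first step, using the standard genetic code on the coding (sense) strand; the function name and the returned labels are hypothetical, and severity scoring is deliberately left out.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_coding_snp(ref_codon, position, alt_base):
    """Classify a single-base change within one codon.

    position is 0, 1 or 2 within the codon. Returns a tuple of
    (reference amino acid, alternate amino acid, label), where '*'
    denotes a stop codon.
    """
    ref_codon, alt_base = ref_codon.upper(), alt_base.upper()
    if ref_codon not in CODON_TABLE or alt_base not in BASES:
        raise ValueError("expected a DNA codon and a DNA base")
    if position not in (0, 1, 2):
        raise ValueError("position must be 0, 1 or 2")
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        label = "synonymous"
    elif alt_aa == "*":
        label = "nonsense"
    else:
        label = "nonsynonymous"
    return ref_aa, alt_aa, label


print(classify_coding_snp("GAG", 1, "T"))  # Glu codon, middle base changed
print(classify_coding_snp("CTT", 2, "C"))  # Leu codon, third base changed
```

Only the changes labelled nonsynonymous would then be passed to a substitution matrix or to a cross-species alignment for a severity estimate.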

3.5.1 Candidate Gene Approach

In the candidate gene approach, polymorphisms in genes that are known a priori to be part of the physiological process underlying the trait are examined. For most complex traits, numerous candidates are available. Family and twin studies can help to gauge the heritability (the proportion of phenotypic variation attributed to genes), mode of inheritance and penetrance, as well as the number of genes involved. Linkage studies, including those derived from animal models, as well as an understanding of the biological mechanisms underlying the complex trait, provide valuable clues to candidate regions of the genome to investigate. One practical advantage of the hypothesis-driven candidate gene approach over the whole genome approach is its lower genotyping requirement. The candidate gene approach tends to focus on polymorphisms that are likely to be functional, such as those in promoter and coding regions (Tabor et al., 2002). These SNPs can be selected from databases or by performing resequencing-based SNP discovery in a small number of random individuals of the same ethnicity as the study population. The latter approach is preferred because efficient selection of a minimal set of SNPs for an association study requires a prior assessment of allele frequencies and LD relationships (Carlson et al., 2003).


3.5.2 Whole Genome Association Studies

The genome scan approach involves genotyping a dense map of SNPs arrayed across both coding and noncoding regions in cases and controls (Collins et al., 1997). The strategy hypothesizes that the susceptibility allele descended from a single founder in the distant past, so that all of his/her descendants carry a signature array of alleles (i.e. a haplotype) surrounding the causative allele. Thus the causative allele need not be observed directly because the adjacent SNPs serve as surrogates through LD. The search for the causal variants can then be limited to regions showing association.

It has been suggested that the number of SNPs required for a whole genome study is on the order of 10^5-10^6 (Kruglyak and Nickerson, 2001). The existence of haplotype blocks means a potential reduction in map density, since only representative SNPs need to be examined to capture the entire haplotype diversity with little loss of statistical power (Daly et al., 2001; Johnson et al., 2001; Reich et al., 2001; Patil et al., 2001; Gabriel et al., 2002). In contrast to the candidate gene approach, the whole genome approach is unbiased and can potentially discover novel players that are involved in the trait/disease process.
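One way to pick representative SNPs is a greedy procedure on pairwise r² values: repeatedly choose the SNP that captures the most still-uncaptured SNPs above a threshold. The sketch below is one simple variant and is not claimed to be the method of any cited study; the function name, the rs-style identifiers, the r² values and the 0.8 threshold are all assumptions made for illustration.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedily choose tag SNPs from pairwise r^2 values.

    snps: list of SNP identifiers
    r2:   dict mapping frozenset({snp_i, snp_j}) to r^2; pairs that are
          missing are treated as r^2 = 0
    Returns a dict mapping each chosen tag SNP to the SNPs it captures.
    """
    untagged = set(snps)
    tags = {}
    while untagged:
        def captured(candidate):
            # SNPs still untagged that this candidate would capture,
            # always including the candidate itself.
            return {other for other in untagged
                    if other == candidate
                    or r2.get(frozenset((candidate, other)), 0.0) >= threshold}

        # sorted() makes tie-breaking deterministic (first name wins).
        best = max(sorted(untagged), key=lambda s: len(captured(s)))
        tags[best] = sorted(captured(best))
        untagged -= set(tags[best])
    return tags


# Hypothetical block of five SNPs with made-up r^2 values.
snps = ["rs1", "rs2", "rs3", "rs4", "rs5"]
r2 = {
    frozenset(("rs1", "rs2")): 0.95,
    frozenset(("rs1", "rs3")): 0.90,
    frozenset(("rs2", "rs3")): 0.85,
    frozenset(("rs4", "rs5")): 0.30,
}
print(greedy_tag_snps(snps, r2))
```

Here three of the five SNPs would be genotyped, since the first three are mutually well correlated while the last two are not captured by anything else at this threshold.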

3.6 Issues Surrounding Association Studies

3.6.1 Poor Replication

Association studies are, in theory, a powerful approach to dissecting the genetic basis of complex traits (Risch and Merikangas, 1996). In practice, however, the vast majority of initial, strongly optimistic reports of association, often published in prestigious journals, fail to be reproduced unequivocally (Ioannidis et al., 2001; Lohmueller et al., 2003). Only 20-30% of claims of statistically significant genetic associations are believed to be true (Lohmueller et al., 2003). Several reasons for the inconsistent

Date posted: 16/09/2015, 17:14
