I want to thank Miss Wong Li Peng for her diligent and excellent work. I would also like to thank all the lab members for their kind help; they made my four years in this “family” fun and exciting. In addition, I would like to specially thank my friends Mr Wang, Baoshuang; Mr Ren, Jianwei; Mr Wang, Zihua; Mr Zhang, Dongwei; Mr Gwee, Pai-Chung; and Dr Lee, Alvin T.C., for their constant support and friendship.
I acknowledge the National University of Singapore for honoring me with a studentship and financial assistance in the form of a scholarship.
Table of Contents
Acknowledgements
Table of Contents
Summary
Publications and Awards arising during PhD tenure
List of Tables
List of Figures
PART I INTRODUCTION
Chapter 1: General Introduction
1.1 SNP profiling in candidate genes and genomic regions – the practical pharmacogenetics approach
1.1.1 Introduction of SNP profiling and several concepts related to it
1.1.2 Association studies
1.1.3 Detection of signature of natural selection
1.2 A drug-response related genomic region around chromosome 7q21.1: the importance of MDR1, MDR3 and the CYP3A clusters
1.2.1 Drug response related genes: the ATP binding cassette (ABC) transporter family and the P450 cytochrome enzyme super-family
1.2.2 ABCB1 (MDR1) gene and its related functional and genetic studies
1.2.3 ABCB4 (MDR3) gene and its related functional and genetic studies
1.2.4 An important CYP3A cluster: the CYP3A4, CYP3A5, CYP3A7 and CYP3A43 genes
1.3 Objectives and Significance
1.4 References
PART II GENOTYPING TECHNOLOGY
Chapter 2: Simultaneous Genotyping of Multiple Single-Nucleotide Polymorphisms in Candidate Gene by Single-Tube Multiplex Minisequencing
2.1 Introduction
2.2 Material and methods
2.3 Examples and Discussion
2.4 References
PART III GENETIC CHARACTERIZATION OF ABCB1/4 AND CYP3A
Chapter 3: Distinct Haplotype Profiles and Strong Linkage Disequilibrium at the MDR1 Multidrug Transporter Gene Locus in Three Ethnic Asian Populations
3.1 Introduction
3.2 Materials and Methods
3.3 Results
3.3.1 SNPs in the promoter and 5′ UTR of the MDR1 gene
3.3.2 SNPs in the coding region of the MDR1 gene
3.3.3 Haplotype profile and linkage disequilibrium of SNPs in the MDR1 gene
3.4 Discussions
3.5 References
Chapter 4: Genomic Evidence for Recent Positive Selection at the Human MDR1 Gene Locus
4.1 Introduction
4.2 Materials and Methods
4.3 Results
4.3.1 MDR1 SNP allele frequencies differ among populations
4.3.2 MDR1 haplotype diversity differs among populations
4.3.3 Highly variable LD between SNP loci
4.3.4 SNPs 10 and 11 of the MDR1 gene are positively selected
4.4 Discussion
4.4.1 Varied haplotype diversity and long-range LD in the MDR1 gene
4.4.2 Evidence of recent positive selection at the MDR1 gene locus
4.4.3 Implication of recent positive selection with respect to functional disease association studies
4.5 Reference
Chapter 5: An Extended Genetic Study of a Drug-response Related Region: Differential Selections Detected in Both MDR1 and MDR3 genes
5.1 Introduction
5.2 Materials and methods
5.3 Results
5.3.1 Genotyping results and allele frequencies
5.3.2 Linkage Disequilibrium profiles
5.3.3 Haplotype frequency profiles
5.3.4 Detection of positive selection
5.4 Discussion
5.5 Reference
Chapter 6: The CYP3A Gene Shows Strong Evidences of Positive Selection in Caucasians
6.1 Introduction
6.2 Materials and methods
6.3 Results
6.3.1 Allele frequency and the Fst, Pexcess tests
6.3.2 Haplotype and Linkage Disequilibrium profiles
6.3.3 LRH test of positive selection
6.4 Discussions
6.5 References
PART IV ASSOCIATION STUDY
Chapter 7: MDR1, the Blood–brain Barrier Transporter, Is Associated with Parkinson’s Disease in Ethnic Chinese
7.1 Introduction
7.2 Methods
7.3 Results
7.3.1 Association of MDR1 SNPs and their haplotypes with Parkinson’s disease
7.3.2 Sex differences in risk determination
7.3.3 Role of SNPs/haplotypes in the MDR1 gene in later onset of Parkinson’s disease
7.4 Discussions
7.5 Conclusions
7.6 References
SUMMARY AND CONCLUSIONS
Summary
The ABCB1/MDR1 multidrug transporter is the prototype of drug transporters and one of the major determinants of drug/xenobiotic response. The MDR1 gene, together with several other important drug-response genes, namely the ABCB4/MDR3 gene and the CYP3A gene cluster, maps to a 12 Mb region around chromosome 7q21.1. A large number of studies have reported associations between MDR1/CYP3A genetic polymorphisms and a diversity of functional traits, including gene expression, pharmacokinetic properties and susceptibility to various diseases. Functional polymorphisms were also identified at the CYP3A4 and
influence an individual’s response to medication, it is necessary to understand this important drug-response locus, localizing the causative variants and clarifying how its genetic variants affect function.
This thesis describes a series of studies aimed at addressing the above questions, using the MDR1 gene and several nearby drug-response genes as models. Specifically, comprehensive SNP profiling was carried out in and around these genes in 5 major world populations: Chinese, Malay, Indian, Caucasian and African American. The relationships between individual markers were described in terms of linkage disequilibrium (LD) profiles, and haplotype frequencies were estimated using Expectation Maximization (EM) approaches and compared amongst the different populations. We detected substantial, but highly variable and complex, LD across this 12 Mb region of chromosome 7q21.1. Haplotype frequencies vary amongst populations, with the African population being the most different from the non-African populations. We further investigated the impact of natural selection at these gene loci through several tests, including a modified Long Range Haplotype (LRH) test and Fst/Pexcess based tests. The MDR1 and MDR3 genes demonstrated significant evidence of positive selection for several variants residing on a common extended haplotype in the 4 non-African groups. Tests of positive selection, including the LRH
CYP3A gene cluster in Caucasians. We further examined the association of SNPs and SNP haplotypes within the MDR1 gene with Parkinson’s disease. Several MDR1 polymorphisms were found to significantly affect susceptibility to Parkinson’s disease.
The studies described in this thesis are amongst the first efforts to clarify the genetic profiles of this important drug-response region at chromosome 7q21.1. We present the genetic relationships of several functionally associated drug-response genes at this chromosomal locus. Our studies provide a basis for the systematic comparison of future single-locus studies in different populations, and should also facilitate inference of the genomic location of causative variants. The evidence of natural selection demonstrated in our studies is among the first to be reported for genes important for drug response. This evidence strongly supports the notion that genes controlling drug/xenobiotic responses were under substantial selection pressure during recent human migrations. Additionally, the approaches for detecting signatures of natural selection and functional association, as applied and evaluated in our studies, could contribute to the identification of other functional variants in the genome.
Publications and Awards arising during PhD tenure:
Peer-Reviewed Publications:
1. Kun Tang, Soo-Mun Ngoi, Pai-Chung Gwee, John MZ Chua, Edmund JD Lee, Samuel S Chong and Caroline G Lee*. Distinct Haplotype Profiles and Strong Linkage Disequilibrium in the MDR1 multidrug transporter gene locus in three Asian populations. Pharmacogenetics 12(6):437-450 (2002). In focus comments on our article: Kim RB. MDR1 single nucleotide polymorphisms: multiplicity of haplotypes and functional consequences. Pharmacogenetics 12(6):425 (2002). (2003 Impact Factor: 5.851)
2. Pai-Chung Gwee, Kun Tang, John MZ Chua, Edmund JD Lee, Samuel S Chong, Caroline G Lee*. Simultaneous Genotyping of Seven Single Nucleotide Polymorphisms (SNPs) of the MDR1 gene by Single Tube Multiplex Minisequencing. Clinical Chemistry 49(4):672-676 (2003). (2003 Impact Factor: 5.538)
3. Caroline GL Lee*, Kun Tang, Yin Bun Cheung, Li Peng Wong, Chris Tan, Hui Shen, Yi Zhao, R Pavanni, Meng-Cheong Wong, Samuel S Chong and Eng King Tan. MDR1, the blood-brain barrier transporter, is associated with Parkinson’s Disease in Ethnic Chinese. Journal of Medical Genetics 41:e60 (2004). (2003 Impact Factor: 6.368)
4. Kun Tang, Li Peng Wong, Edmund JD Lee, Samuel S Chong, Caroline G.L. Lee*. Genomic Evidence for Positive Selection at the MDR1 Gene Locus. Human Molecular Genetics 13(8):783-797 (2004). (2003 Impact Factor: 8.597)
5. Eng-King Tan, Marek Drozdzik, Monika Bialecka, Krystyna Honczarenko, Gabriela Klodowska-Duda, YY Teo, Kun Tang, Li-Peng Wong, Samuel S Chong, Chris Tan, Kenneth Yew, Yi Zhao, Caroline GL Lee. Analysis of MDR1 Haplotypes in Parkinson’s Disease in a White Population. Neuroscience Letters 372:240-244 (2004).
6. Eng-King Tan, Daniel Kam-Yin Chan, Ping-Wing Ng, Jean Woo, YY Teo, Kun Tang, Li-Peng Wong, Samuel S Chong, Chris Tan, Hui Shen, Yi Zhao, Caroline GL Lee. MDR1 Haplotype (e21/2677T and e26/3435T) Modulates Risk of Parkinson’s Disease. Archives of Neurology (in press). (2003 Impact Factor: 4.684)
7. Pai Chung Gwee, Kun Tang, Pui Hoon Sew, Edmund J.D. Lee, Samuel S Chong, and Caroline G.L. Lee*. Strong Linkage Disequilibrium at the Nucleotide Analogue Transporter ABCC5 Gene Locus. Pharmacogenetics (in press). (2003 Impact Factor: 5.851)
Awards:
1. 2003 American Association for Cancer Research (AACR) Pfizer Scholar-in-Training Award to present at the AACR Special Meeting “SNPs, Haplotypes, and Cancer: Applications in Molecular Epidemiology”, September 13-17 (2003), at the Sonesta Beach Resort Key Biscayne, in Key Biscayne, Florida (did not go because of visa problems). Details of presentation: Kun Tang, Li Peng Wong, Edmund JD Lee, Samuel S Chong, and Caroline GL Lee. e21/2677T and e26/3435T alleles in the MDR1 gene showed evidence of recent positive selection.
2. 2004 AACR-ITO EN Ltd Scholar-in-Training Award to present at the
# 2923. Details of presentation: Kun Tang, Li Peng Wong, Edmund JD Lee, Samuel S Chong, and Caroline GL Lee. Recent positive selection of SNPs e21/2677 and e26/3435 in the MDR1 gene.
reports in dbSNP
3.3 Pairwise allele frequency comparisons of SNPs exon1 -129T > C, exon12 1236C > T, exon21 2677G > T/A and exon26 3435C > T between the different ethnic groups in this study and between previously published populations and this study
Chapter 4
4.1 Allele frequency comparisons of the different SNPs in the different populations
4.2 Primers, PCR and Minisequencing Conditions for genotyping of 5 SNPs at the MDR1 gene locus
4.3 P-values computed by ranking relative EHH of the observed SNP of interest with that of all of the simulated data points under specified models of population history at specified recombination rate
Chapter 7
7.1 Characteristics of the study populations
7.2 Association of SNPs / haplotypes of SNPs with Parkinson’s disease
7.3 Effect of gender in the association of SNPs / Haplotypes in the MDR1 gene with Parkinson’s disease
7.4 Effect of Age-of-Onset in the association of SNPs / Haplotypes in the MDR1 gene with Parkinson’s disease
List of Figures
Chapter 1
1.1 Schematic distribution of drug-response related genes at locus 7q21.1
Chapter 2
2.1 Multiplex PCR and genotyping results for the seven MDR1 SNP
Chapter 3
3.1 Drawn-to-scale map of the MDR1 gene, mRNA and putative protein secondary structure
3.2 Pairwise linkage profiles of the four SNPs present in our population
3.3 Haplotype frequencies of the three high-frequency SNP loci present in our population
3.4 Linkage disequilibrium profile of the three high frequency MDR1 SNPs in the three ethnic groups
Chapter 4
4.1 Distribution of 12 SNPs across the MDR1 gene
4.2 Haplotype profiles of the 10 MDR1 SNPs in the five populations
4.3 Pairwise linkage disequilibrium profiles for single SNPs
4.4 HBDs for five selected loci in the five populations
4.5 EHH and relative EHH tests
Chapter 5
5.1 The physical map of MDR1, MDR3 genomic structure, and the distribution of tested SNPs
5.2 Pairwise linkage disequilibrium profiles for single SNPs
5.3 Haplotype profiles based on 19 SNPs selected from MDR1 and MDR3 (see methods) in the five populations
5.4 Haplotype branching diagrams (HBD) for five selection indicative loci in the MDR1/MDR3 region in all the five populations
Chapter 6
6.1 The physical map of the CYP3A gene cluster, the distribution of the tested SNPs and the Fst / Pexcess profiles
6.2 Haplotype profiles based on 19 SNPs (see methods)
6.3 Pairwise linkage disequilibrium profiles for single SNPs. |D’| and r2 were calculated for each pair of the 24 SNPs in the five populations and represented in color gradients
6.4 Examples of Haplotype branching diagrams (HBD) for several selection indicative loci in the CYP3A region in the four non-African groups
Chapter 7
7.1 Schematic diagram showing relative positions of the SNP sites in the promoter and exons of the MDR1 gene
PART I INTRODUCTION
Chapter 1: General Introduction
“If it were not for the great variability among individuals, medicine might as well be a science and not an art”, said Sir William Osler in 1892 (1). His view of medicine as an art dominated the following 100 years. It reflects the fact that doctors lack a precise basis for judgment when prescribing medicine to individual patients, even though individuals differ greatly in their responses to medicine. Great variation in drug response also exists among different ethnic populations: it is not rare that medicines or dosages tested in one population do not directly apply to others. Fortunately, the practice of medicine is seeing a great change at the dawn of the new century, as a result of the recent surge in genetic studies. With the completion of the Human Genome Project and the rapid development of genetic assays and technologies, geneticists now see the possibility of identifying the inherited differences between individuals that cause diverse drug responses. The study of how genetic differences influence patients’ responses to drugs is defined as Pharmacogenetics (1). It is hoped that, once Pharmacogenetics is sufficiently understood, we will be able to predict an individual’s response to medication accurately by analyzing his or her genetic profile, personalize medication to obtain optimal treatment for each individual, and target the specific genes that cause adverse responses or diseases.
This thesis aims to shed some light on the genetics of drug-response genes. I present several genetic studies carried out on the Multidrug Resistance 1 (MDR1, ABCB1) gene and a few closely located drug-response-related genes, namely the ABCB4 (MDR3) gene and the CYP3A gene cluster, based mainly on Single Nucleotide Polymorphism (SNP) profiling. The introduction is given in three sections. The first section briefly reviews the importance and the problems of SNP profiling approaches in current Pharmacogenetics studies. The second introduces the two major drug-response-related gene super-families, ABC and CYP, and their specific members examined in this thesis: the MDR1 and MDR3 genes and the CYP3A cluster. Previous genetic and pharmacogenetic studies on these candidate genes are also reviewed in that section. Finally, the objectives and significance of this study are given in the last section.
1.1 SNP profiling for candidate genes (regions) – the practical Pharmacogenetics approach
1.1.1 Introduction to SNP profiling and several concepts related to it
Modern medical research has made substantial progress during the last few decades, owing to the accelerated understanding of the basic mechanisms underlying normal life processes and diseases at both the cellular and molecular levels. Every year, a significant number of novel, highly efficient medications and treatments reach the market. As the choice of therapies grows rapidly, doctors and researchers nonetheless face a dilemma: which medication is most appropriate for which individual, given the great variation in treatment outcomes amongst patients? Because the nature of this variation is poorly understood, a large proportion of patients suffer undesirable responses to drug therapy, generally categorized as adverse drug reactions and lack of therapeutic response. Adverse drug reactions have been listed among the five leading causes of death in Western countries (2). The lack of therapeutic response, on the other hand, has even greater consequences according to some estimates (3).
It has long been believed that the majority of phenotypic diversity in humans results from heritable inter-individual differences in genetic composition, and this notion has received strong experimental support from twin studies. The study of how genetic differences predict and contribute to phenotypic variation in drug response, namely Pharmacogenetics, therefore promises a way towards individualized, safer and more efficient drug treatment (4).
The term “Pharmacogenetics” was coined 40 years ago (5). However, the field has attracted great interest and undergone rapid progress only recently. The first few pioneering studies in Pharmacogenetics were all based on the simple model of Mendelian inheritance, i.e. a single gene controlling the altered phenotype (6). However, the Mendelian model applies only to rare genetic variations that have dramatic effects (6). It is now generally accepted that the majority of drug-response phenotypes are controlled by multiple genes, resulting in a continuous distribution of variation (1, 7, 8). A related concept in genetic epidemiology is the “Common Disease-Common Variant” (CD-CV) hypothesis, which states that common diseases in humans are caused mainly by the interaction of common variants in multiple genes (9). On the other hand, Pritchard et al. proposed the “Common Disease-Rare Allele” (CD-RA) hypothesis, which emphasizes the alternative possibility that common diseases are attributable mainly to heterogeneous rare variants arising during population expansion (10). Whether or not a given disease or phenotypic difference is controlled by multiple common variants, the identification of genetic variants that control drug response remains the central problem of modern Pharmacogenetics.
The natural history of genetic polymorphisms in the human genome provides the most important record for the search for functional variants important for drug response. On the one hand, genetic variations causing phenotypic changes constitute a special subset of all genetic polymorphisms. On the other hand, neutral loci near the causal variants are strongly shaped by the genetic forces acting on those variants, because of their physical association (defined as Linkage Disequilibrium, explained in later sections), and thereby provide an informative footprint of historical genetic events. There are many different forms of polymorphism, including Single Nucleotide Polymorphisms (SNPs), micro-satellites, short-tandem-repeat polymorphisms (STRPs), sequence insertions and deletions, etc. The polymorphic marker commonly used in early studies was the micro-satellite repeat. Micro-satellites possess certain desirable properties, including a relatively even distribution across the genome and multiple alleles per locus. However, their relatively low density limited their application in fine-mapping studies (11). Another important form of genetic polymorphism, the SNP, drew much attention only after the completion of the Human Genome Project, but has since aroused great enthusiasm in the community. It is now commonly hailed as a promising route towards a comprehensive understanding of Pharmacogenetics (12, 13). SNPs are the most abundant form of genetic polymorphism. In several genome-wide characterization studies, informative high-frequency SNPs (minor-allele frequency higher than 10%) were estimated to occur on average once every kilobase (14, 15). Such a high density is ideal for fine mapping of the genome. The great prevalence of SNPs among all forms of polymorphism also suggests that they may be responsible for the majority of phenotypic variance (13).
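As a minimal illustration of the frequency threshold mentioned above, the following Python sketch computes the minor-allele frequency of a single biallelic SNP from unphased genotypes and checks it against the 10% cut-off. The function name and the sample genotypes are hypothetical and serve only to illustrate the calculation; they are not data from this thesis.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor-allele frequency of one biallelic SNP.

    `genotypes` is a list of unphased diploid genotypes such as "CT" or "TT".
    """
    allele_counts = Counter(allele for g in genotypes for allele in g)
    if len(allele_counts) > 2:
        raise ValueError("expected a biallelic SNP")
    if len(allele_counts) < 2:
        return 0.0  # monomorphic in this sample
    return min(allele_counts.values()) / sum(allele_counts.values())


# Hypothetical sample of 10 individuals (illustrative data only).
sample = ["CC", "CT", "CT", "TT", "CC", "CC", "CT", "CC", "CC", "CT"]
maf = minor_allele_frequency(sample)   # 6 T alleles out of 20 chromosomes
is_high_frequency = maf > 0.10         # threshold quoted in the text
print(maf, is_high_frequency)
```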
SNPs are also of special importance to the study of Linkage Disequilibrium (LD), the non-random association between genetic variants, which plays a pivotal role in many genetic studies, including Pharmacogenetics. LD arises through mutation and decays through recombination. Under the “Infinite Site” model (16), a newly arisen variant is associated with all the existing variants on the carrier chromosome through physical linkage. This linkage is maintained through the generations until it is broken down by recombination, resulting in incomplete association or complete Linkage Equilibrium (LE), with the association weakening as the distance between loci increases. The pattern of LD over short distances is also greatly affected by other molecular and demographic factors, such as recurrent mutation, random genetic drift, population structure and natural selection (17). LD can be viewed as the extent to which one polymorphism predicts the status of another. This property is of great importance to Pharmacogenetic studies, as the effects of functional variants can be detected by examining surrogate markers in strong LD with the causative ones. Patterns of LD on a fine scale can also provide insights into the demographic processes of human history (10). As considerable LD extends on average around 30~60 kb in the human genome (15), a scale almost beyond the highest resolution of any other genetic marker, only SNPs are suitable for LD mapping at both large and fine scales.
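To make the notion of LD concrete, the sketch below computes the standard pairwise measures D, D′ and r² for two biallelic loci from a haplotype frequency and the two allele frequencies, together with the textbook deterministic decay of D under recombination, D_t = D_0 (1 − c)^t. The function names and the example frequencies are assumptions chosen for illustration; the analyses in later chapters report |D′| and r² but may compute them differently.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


def decayed_d(d0, c, generations):
    """Expected D after `generations` of random mating with recombination fraction c."""
    return d0 * (1 - c) ** generations


# Illustrative frequencies only.
print(ld_measures(0.4, 0.5, 0.5))   # partial association
print(ld_measures(0.2, 0.5, 0.2))   # |D'| reaches 1 while r^2 stays well below 1
print(decayed_d(0.15, 0.01, 100))   # D shrinks as recombination accumulates
```

The second example shows why both measures are usually reported: |D′| equals 1 whenever at least one of the four possible haplotypes is absent, whereas r² approaches 1 only when the two loci also have matching allele frequencies, which is the property that matters when one SNP is used as a surrogate for another.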
Another concept, which looks at the distribution of polymorphisms from a different angle and provides more insight into genetic profiles, is the haplotype. A haplotype is the combination of alleles at multiple polymorphic loci on one chromosome (14), and is thus commonly treated as an abstraction of the chromosome. In the absence of recombination, genetic forces on functional polymorphisms affect the chromosome as a whole. Such effects can be seen clearly in the non-recombining Y chromosome and in mitochondrial DNA. For recombining chromosomes, haplotype blocks, which define genetic regions of strong LD with negligible or limited crossover, can also be treated as integral genetic units. These haplotype blocks thus serve as good predictors of functional variants that reside within regions of strong LD. Moreover, since a haplotype is defined by the alleles of multiple binary polymorphisms, e.g. SNPs, it has better statistical power than single SNPs (18). The early use of haplotypes was limited to analyses of sex chromosomes, where recombination is very low or absent, or of autosomes with family data (18), for which the haplotype phase, i.e. the actual haplotype sequence on each of the pair of chromosomes, can be inferred exactly from genotype data. For autosomal genotyping in a random population sample, however, one faces the problem of phase uncertainty: the number of possible ways to allocate the two alleles of each marker SNP to a pair of haplotypes increases exponentially with the number of heterozygous loci (19).
Several strategies have been proposed to solve this problem. One category, experimental “haplotyping”, physically separates the chromosome pair by means of molecular technologies, including cloning, allele-specific polymerase chain reaction and single-molecule dilution (20-22). Although very precise, these experimental approaches are costly in labor and expense, and major technological improvements are needed before they can support high-throughput studies (18). Alternatively, statisticians have proposed computational solutions that trade a certain level of precision for efficiency. Of these, the Expectation Maximization (EM) based algorithm for haplotype estimation (19, 23, 24) is probably the most commonly used in recent studies. This algorithm estimates the maximum likelihood haplotype frequency distribution under the assumption of random mating within the studied population. In tests using either experimental or simulated data, EM produced satisfactory estimates that differed little overall from the actual data under most common conditions, although it was less accurate for low-frequency haplotypes (25-27). Another commonly used approach is the “PHASE” algorithm, originally proposed by Stephens et al. (28). PHASE is a Bayesian statistical method that incorporates the prior expectation that unresolved haplotypes will be similar to phase-certain haplotypes (18). The performance of PHASE is similar to that of the EM algorithm, or in certain cases slightly better (18, 27, 28). A third algorithm that was once popular in genetic studies is the subtraction method described by Clark (29), although EM and PHASE now generally outperform it (18).
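The phase-uncertainty problem and the EM idea can be sketched in a few lines of Python. The code below is a simplified illustration written for this section, not the implementation used in the cited software: it assumes biallelic SNPs coded as 0, 1 or 2 copies of one allele, random mating, no missing data and a small number of loci, and all function names and the sample genotypes are hypothetical. The helper `compatible_pairs` enumerates the haplotype pairs consistent with a genotype (2^(k−1) distinct pairs for k ≥ 1 heterozygous loci), and `em_haplotype_frequencies` alternates between weighting those pairs by the current frequency estimates and re-estimating the frequencies from the weighted counts.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with one unphased genotype.

    `genotype` holds, per SNP, the number of copies (0, 1 or 2) of allele 1.
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites are fixed: 0 -> 0, 2 -> 1
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for site, b in zip(het, bits):
            h1[site], h2[site] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, iterations=50):
    """Estimate haplotype frequencies by EM, assuming random mating."""
    n_loci = len(genotypes[0])
    haplotypes = list(product((0, 1), repeat=n_loci))
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform starting point
    pair_lists = [compatible_pairs(g) for g in genotypes]
    for _ in range(iterations):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: probability of each phase given the current frequencies.
            weights = [freq[a] * freq[b] * (1 if a == b else 2) for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: new frequencies are the expected haplotype counts per chromosome.
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
    return freq


# Hypothetical two-SNP sample: three 00/00 homozygotes, one 11/11 homozygote
# and one double heterozygote whose phase is unknown (illustrative data only).
sample = [(0, 0)] * 3 + [(2, 2)] + [(1, 1)]
freqs = em_haplotype_frequencies(sample)
for haplotype, frequency in sorted(freqs.items()):
    print(haplotype, round(frequency, 3))
```

In this toy sample the unambiguous homozygotes make the 0-0 and 1-1 haplotypes common, so the EM iterations assign the double heterozygote almost entirely to that pair. Because every haplotype over the loci is enumerated, the sketch is practical only for a handful of SNPs.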
1.1.2 Association studies
As mentioned in the previous section, the central problem of Pharmacogenetics is to find the functional loci that determine variability in drug response. The association study is the most common strategy for mapping candidate functional variants.
The principle of a genetic association study is to detect statistical association between genetic variations/mutations and variation in the traits of interest, such as differences in individuals’ predisposition to certain diseases or in their response to certain drugs. Statistical dependence between a genetic marker locus and a trait suggests that the locus may have a functional effect on the trait, although other factors, such as population structure, could also contribute to it. Association studies therefore offer the possibility of detecting causal variants through genetic analysis. In practice, the association study of phenotypic changes and their corresponding genetic variants relies strongly on an understanding of Linkage Disequilibrium. On the one hand, linkage between genetic polymorphisms means that an observed association between a genetic marker and a phenotypic variation does not necessarily imply a direct functional effect of this marker, as it could be an indirect reflection of the effect of a causative variant in LD with the observed marker. On the other hand, LD may serve as a shortcut in association studies: in theory, a small subset of polymorphisms (tagging SNPs), chosen to represent a high percentage of all the polymorphisms in LD with them, can detect most associations between genetic variants and function with little loss of test power and a great increase in efficiency (1, 17, 30).
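One simple form of such a test is an allelic case-control comparison. The sketch below applies a Pearson chi-square test with one degree of freedom to a 2×2 table of allele counts and reports the odds ratio; it is offered as a generic illustration, not as the specific test used in Chapter 7. The function name and the counts are hypothetical, and the test by itself does not correct for confounders such as population structure.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square test (1 d.f.) on a 2x2 table of allele counts.

    Each argument is (count of allele 1, count of allele 2).
    Returns (chi-square statistic, p-value, odds ratio for allele 1).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of the chi-square distribution with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi2, p_value, odds_ratio


# Hypothetical allele counts in 100 cases and 100 controls (illustrative only).
chi2, p_value, odds_ratio = allelic_chi_square((120, 80), (90, 110))
print(round(chi2, 3), round(p_value, 4), round(odds_ratio, 3))
```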
Despite the theoretical advantage of using a limited marker set to map genome-wide LD, the feasibility of this idea depends heavily on the behavior of Linkage Disequilibrium, such as how far useful LD extends and how variable LD is among different genomic regions (17). Long and regular LD patterns could significantly reduce the number of markers necessary for LD mapping, whereas short LD requires more markers to be genotyped, although it might reduce the effort of experimentally identifying causative variants because the candidate regions are shorter. The behavior of LD in the human genome has generated intensive debate in the last few years. Simple computational models assuming constant population size and uniform recombination rates predicted LD to be a relatively well-defined function of physical distance, with useful strong LD extending no more than several thousand base pairs (31). However, large experimental surveys of high-density SNP profiles in the human genome revealed a very different scenario, in which LD is far more variable and generally longer than expected under simple evolutionary models (14, 15). Later experiments all confirmed this large variability of LD among different genes/regions (17, 32-35). Several genetic factors have been proposed to account for it. Reich et al. ruled out stochastic variation as the only cause of LD variability, proposing that the discrete LD patterns could result from genetic events such as severe bottlenecks and selection (35). However, simulations assuming such genetic events could not adequately explain the high nucleotide diversity observed in modern humans (10, 17, 36). On the other hand, Patil et al. derived a high-resolution SNP profile of the entire chromosome 21 and found that SNPs follow a block-wise distribution (37). Several other studies confirmed this finding and attributed the block-wise LD to recombination hotspots (32, 37-40). Uneven recombination activity has been confirmed experimentally (39), and was thought by many researchers to account for the majority of LD variation (35, 40). However, others questioned uneven recombination rates as the major cause of discrete LD, since simulations showed that block-like LD patterns can also arise under a uniform recombination rate, given severe bottlenecks or strong selection (39, 41). Furthermore, extensively long LD was observed when haplotype-specific |D’| was measured (42), showing LD seemingly “jumping” across LD blocks and contradicting the hotspot hypothesis. In view of this complicated LD profile, new measures have been proposed to describe LD and to represent the association information underlying the tested regions, such as tagging SNPs, LD blocks and haplotype blocks (32, 37, 40, 41, 43, 44). These new approaches may help handle large data sets in genome-wide studies.
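The flavor of the simple constant-size, uniform-recombination expectation can be conveyed with a classical approximation due to Sved (1971), E[r²] ≈ 1 / (1 + 4·Ne·c), where Ne is the effective population size and c the recombination fraction between the two loci. The sketch below is only an illustration under assumed parameter values (Ne = 10,000 and a uniform rate of 1 cM per Mb, both chosen here for convenience); it is not necessarily the model used in reference (31), and the function names are hypothetical.

```python
def recombination_fraction(distance_bp, cm_per_mb=1.0):
    """Approximate recombination fraction for a short physical distance.

    Assumes a uniform rate; 1 cM per Mb corresponds to 1e-8 per base pair.
    """
    return distance_bp * cm_per_mb * 1e-8


def expected_r2(effective_size, c):
    """Sved's approximation to the expected r^2 at drift-recombination equilibrium."""
    return 1.0 / (1.0 + 4.0 * effective_size * c)


# Assumed values for illustration: Ne = 10,000 and 1 cM/Mb.
for distance in (1_000, 10_000, 100_000):
    c = recombination_fraction(distance)
    print(distance, round(expected_r2(10_000, c), 3))
```

Under these assumptions the expected r² falls to about 0.2 at 10 kb, which is in line with the short LD range predicted by simple models and contrasts with the longer and more variable LD observed experimentally.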
The first commonly used genetic marker in association studies was the micro-satellite (45-54). However, studies using such markers covered only limited genomic regions because of the low marker density. SNPs provide the opportunity for fine, genome-wide association tests.
Association studies based on SNPs can be classified into two major categories: the genome-wide association study and the candidate gene/region directed association study. The genome-wide approach is based on designing a subset of SNP markers that covers most of the LD patterns across the whole human genome. The search for marker loci correlated with certain phenotypes, e.g. diseases, can then be conducted across the entire genome. Such genome-wide approaches make no assumptions about the functions of individual gene regions, and therefore make it possible to detect new drug-related genes and to investigate multiple-gene interactions (30). This strategy has proven promising in the identification of candidate causative loci for Alzheimer’s disease (55). However, there are also several problems. The genome-wide approach requires a complete SNP subset throughout the entire genome, with sufficient SNP density and largely predictable patterns of LD. While genome-wide SNP density is increasing rapidly as a result of major efforts by several international SNP databases (dbSNP, TSC, Celera), the LD profile has been shown to be anything but simple, as discussed above. Furthermore, genome-wide SNP profiling is costly and time-consuming, and the large sample sizes it requires and the statistical handling of big datasets pose practical difficulties (56). These problems have led to a preference for the alternative approach, candidate gene/region directed SNP profiling and association study, which is also the basic strategy used in this thesis. This approach chooses candidate genes/regions related to drug response according to experimental or taxonomical evidence. Within the candidate genes/regions, appropriate SNP markers are then selected for LD and haplotype analyses, as well as association analyses.
The candidate gene/region directed association study has been widely applied in epidemiology and pharmacogenetics studies. The majority of such studies report associations of single SNPs with certain phenotype changes. However, most early SNP-based association studies did not provide analyses of LD and haplotypes, and are thus limited in identifying the causal variants as well as in assessing the validity of associations with the candidate loci. As the importance of LD and haplotype profiling becomes more apparent, an increasing number of candidate gene studies have carried out detailed LD and haplotype analyses (33, 34, 42, 57, 58). Besides tests based on single SNPs, novel association tests using multiple SNPs and haplotypes have also been proposed. The major debates on this type of approach concern how to construct tests that properly utilize phase-uncertain data, and how to increase the statistical power of the tests (59-62). As these tests utilize genotype data from population samples of unrelated individuals, they can be easily applied in general association studies (59).
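For concreteness, the simplest form of a single-SNP association test mentioned above can be sketched as an allelic chi-square test on a 2x2 table of allele counts in cases and controls. This is an illustrative sketch only: the function name and the example counts are hypothetical, and it is not the multi-SNP or haplotype-based test proposed in the cited work (59-62).

```python
from math import erfc, sqrt


def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 d.f.) on allele counts.

    case_a, case_b : counts of allele A and allele B among case chromosomes
    ctrl_a, ctrl_b : counts of allele A and allele B among control chromosomes
    Returns (chi2, p_value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    denom = row_case * row_ctrl * col_a * col_b
    if denom == 0:
        # A zero margin means the table carries no information about association.
        return 0.0, 1.0
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    # For 1 degree of freedom, the upper tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, with hypothetical counts of 60 versus 40 copies of the tested allele in cases and 40 versus 60 in controls, the statistic is 8.0 with a p-value just under 0.005. A test of this kind says nothing about whether the tested SNP is causal or merely in LD with the causal variant, which is the limitation discussed above.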
1.1.3 Detection of signatures of natural selection
Although the association study provides a promising way to detect functional genetic variants of medical significance, it sometimes requires prohibitively large sample sizes to achieve enough statistical power for moderate or multi-gene effects. Recently, a new idea has drawn much attention: important genetic variants determining drug-response variability may be inferred by detecting signatures of natural selection (63). Since Darwin, it has been known that genetic variants conferring better fitness on their carriers have a higher probability of being passed on to subsequent generations, and therefore increase in allele frequency in the population. This process is called positive selection. On the other hand, deleterious mutations are continuously screened out from the population by environmental pressures as a result of purifying selection, or negative selection. It is hypothesized that during harsh environmental changes, the genes that strongly interact with the altering environmental factors will be subject to more intensive selection pressures, will be differentially selected for their functional polymorphisms, and will leave specific genetic patterns in local genome regions (63). The ability to detect such signatures of natural selection thus provides an attractive way of finding functional variants. Current data strongly suggest an African origin of modern global human populations (64-67). The dispersal of early humans from the African continent to the rest of the world (Eurasia and America) challenged the Out-of-Africa groups with quite different environments, particularly varied climates, pathogens and sources of food (63). Given this migration model, it is very likely that a number of variants were locally selected in different populations to adapt to the drastic environmental changes. These functional variants, which once improved the fitness of ancestral humans in different environments, should still contribute greatly to today’s inter-individual and inter-population differences in drug responses and susceptibility to diseases (63). Hence, the study of selection events that occurred in recent human history, especially during the period of the Out-of-Africa migration and the colonization of the new continents by ancient humans, is very important to pharmacogenetics studies.
The intrinsic way of detecting the signatures of positive selection is to detect allele frequency changes, which are the major consequences of a selection event. However, variability in allele frequency does not come only from selection. Under selective neutrality, allele frequencies do fluctuate, sometimes greatly, due to the stochastic process known as genetic drift (68). Moreover, it is now evident that human population history was far from a single, constant, equilibrium process; rather, it involved many complicated demographic processes such as bottlenecks, population sub-structure, population size changes, etc. (63, 69-71). All these demographic factors influence the genetic profiles, including the allelic spectrum, in different ways. Some of these effects can resemble those caused by selection (63) and complicate the correct identification of evidence of natural selection.
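The neutral fluctuation of allele frequencies described above can be illustrated with a minimal Wright-Fisher simulation. This sketch assumes an idealized, constant-size, randomly mating diploid population with no selection or mutation; the function name, parameters and seed handling are hypothetical choices for illustration, not a model used in the cited work.

```python
import random


def wright_fisher(p0, n_diploid, generations, seed=0):
    """Simulate neutral genetic drift of one allele.

    p0          : starting allele frequency (0..1)
    n_diploid   : number of diploid individuals (2 * n_diploid gene copies)
    generations : number of generations to simulate
    Returns the list of allele frequencies, including the starting value.
    """
    rng = random.Random(seed)
    copies = 2 * n_diploid
    p = p0
    freqs = [p]
    for _ in range(generations):
        # Each gene copy in the next generation is drawn independently
        # from the current allele frequency (binomial sampling).
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        freqs.append(p)
    return freqs
```

Running the function with a small n_diploid (a crude stand-in for a bottleneck) typically produces much larger swings in frequency than a large n_diploid, without any selection being involved.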
The power of a test of selection relies largely on the design of the statistic. Statistics should be chosen to capture the genetic signatures that best define the occurrence of natural selection, and yet be robust against demographic effects under neutrality. Bamshad and Wooding recently described a variety of selection signatures (63), including an excess of rare polymorphisms, an excess of high-frequency derived variants, large allele frequency differences among populations, and extensively strong LD or high homozygosity for common variants.
Detection of positive selection has long been a central issue of population genetics, and a number of useful positive selection tests are available. Traditional tests are usually classified into two major categories (63). The first category is based on the comparison of the variability and divergence of different types of genetic variations. The commonly used tests in this category include the Kreitman’s test (75), etc. These tests depend less on assumptions about demographic processes, and are thus more robust against variability caused by demographic factors. However, they are not very powerful, especially for testing candidate genes/regions, as most of them utilize only the information on variation types. The tests of the second category, on the other hand, tend to recognize the specific patterns in the allelic spectrum and/or variability levels caused by selective sweeps (63). Many of them are based on comparisons of estimators of the population mutation rate θ (63). These include Tajima’s D test (76), Fu and Li’s D* and F* tests (77), Fay and Wu’s H test (78), and so on. These tests provide relatively higher power, given their use of the allele frequency spectrum. However, they rely strongly on assumptions about demographic history, and significant results may indicate not positive selection but demographic effects (63). Recently, Kim et al. proposed a composite-likelihood approach, which incorporates the allelic configuration and recombination into the framework of a likelihood test (79). This test demonstrated high power both in simulation evaluation and in real data testing in Drosophila (79, 80).
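To make the idea of comparing estimators of θ concrete, the sketch below computes Tajima’s D from a set of aligned haplotypes, contrasting the estimate based on mean pairwise differences with the estimate based on the number of segregating sites. The input format (equal-length strings of allele codes) and the function name are assumptions made for this illustration; the constants follow the standard published formula for the statistic.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length haplotype strings.

    Returns None when the statistic is undefined (fewer than 4 sequences
    or no segregating sites).
    """
    n = len(haplotypes)
    if n < 4:
        return None
    n_sites = len(haplotypes[0])

    # S: number of segregating (polymorphic) sites.
    seg = sum(1 for j in range(n_sites) if len({h[j] for h in haplotypes}) > 1)
    if seg == 0:
        return None

    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    total = sum(sum(x != y for x, y in zip(h1, h2))
                for h1, h2 in combinations(haplotypes, 2))
    pi = total / n_pairs

    # Constants of the variance term.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # D contrasts the two estimators of theta: pi and S / a1.
    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))
```

A sample dominated by rare variants gives a negative D, whereas a sample dominated by intermediate-frequency variants gives a positive D. As noted above, either sign can also result from demographic history rather than selection.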
Tests based on new strategies have also been developed, and some of them have attracted much attention. Sabeti et al. recently proposed the Long-Range-Haplotype (LRH) approach, focusing on the allele-specific, over-extended linkage disequilibrium caused by recent positive selection (81). Its underlying principle is straightforward: when a new mutation arises, it is completely linked to the other variants already present on the same ancestral chromosome (i.e. it occurs on a single ancestral haplotype). Under neutrality, the allele frequency of the mutation will fluctuate due to genetic drift, either disappearing after a short period, as most other newborn variants do, or slowly accumulating. Assuming this neutral variant by chance rises to an intermediate frequency, the time taken is expected to be so long that the LD flanking this locus should have been substantially broken down by recombination. However, a new variant that is under positive selection will be enriched in the population in a much shorter time, such that its surrounding LD and homozygosity show relatively little decay. The LRH test implements this idea by using the relative Extended Haplotype Homozygosity (ReEHH) to describe the strength of LD around the tested variant relative to that around its corresponding alternative allele(s) (81). An unusually high ReEHH combined with a high allele frequency is therefore indicative of recent positive selection. The formal statistical test is then carried out by comparing the actual values against simulation-derived distributions. An advantage of the LRH test is that the LD of the tested variant is compared against that of its alternative alleles, which makes this approach less sensitive to local variation in recombination rate (81). The LRH test, considered to be powerful, has been successfully applied in detecting evidence of recent positive selection in the glucose-6-phosphate dehydrogenase gene (G6PD), the CD40 ligand gene (TNFSF5) (81) and the lactase gene (82). However, this test also depends heavily on assumptions about human history, and therefore the effects of demographic factors such as population size and structure should be carefully justified.
Candidate gene/region directed approaches, although providing detailed characterization of suspected selection patterns, are sensitive to undefined neutral variations imposed by the complexities of actual human history. From the view of the whole genome, however, we expect natural selection to affect only local genetic patterns, while the effects of demographic forces should be seen genome-wide (63). Given this, regions under local adaptive selection should show greater differences in genetic variability among populations than neutral regions. Akey et al. did a whole-genome scan based on this idea, by examining
of inter-population difference at individual polymorphic sites. Genes or regions
natural selection. In another study, Bersaglieri et al. applied both the LRH test and
gene in the north European-derived populations. In addition, they described a novel
test that uses Pexcess instead of Fst to measure the allele frequency changes due to
In that study, where all three selection tests were utilized, consistently significant evidence of selection was reported for the lactase region (82). The growing wealth of evidence of selection, as well as novel approaches, is improving our understanding of selection, human history and the effects of demographic factors. Nonetheless, the current approaches are still constrained by our insufficient knowledge of real human history. Approaches such as LRH, although providing considerably more power to detect departures from simple neutral models, are highly dependent on demographic assumptions. One way to address these problems is to calibrate the test statistic against genome-wide distributions. The ultimate goal will be to gain a comprehensive understanding of actual human history, and thereby to develop appropriate simulation models to evaluate individual signatures of selection.
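The extended haplotype homozygosity idea behind the LRH test described earlier in this section can be sketched as follows. EHH is taken here as the probability that two randomly chosen chromosomes carrying the same core allele are identical over the whole interval from the core to a given marker, and the relative value is the simple ratio between a core allele and one alternative allele. This is a simplified illustration under those assumptions (phased haplotypes given as strings, a biallelic core, hypothetical function names); it is not the exact implementation of reference (81) and it omits the simulation-based significance test.

```python
from collections import Counter


def ehh(haplotypes, core_index, core_allele, end_index):
    """Extended haplotype homozygosity of one core allele.

    haplotypes  : list of equal-length phased haplotype strings
    core_index  : position of the core SNP
    core_allele : allele at the core SNP whose carriers are examined
    end_index   : position out to which identity is required
    Returns None if fewer than two chromosomes carry the core allele.
    """
    lo, hi = sorted((core_index, end_index))
    carriers = [h[lo:hi + 1] for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        return None
    # Count pairs of carrier chromosomes that are identical over the interval.
    same_pairs = sum(c * (c - 1) // 2 for c in Counter(carriers).values())
    return same_pairs / (n * (n - 1) // 2)


def relative_ehh(haplotypes, core_index, allele, other_allele, end_index):
    """Ratio of EHH on the tested allele to EHH on the alternative allele."""
    tested = ehh(haplotypes, core_index, allele, end_index)
    other = ehh(haplotypes, core_index, other_allele, end_index)
    if tested is None or other is None or other == 0:
        return None
    return tested / other
```

A core allele whose carriers still share one long haplotype gives an EHH close to 1 far from the core, and a high ratio against the alternative allele is the pattern the LRH test looks for.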
1.2 A drug-response related genomic region around chromosome 7q21.1: the importance of MDR1, MDR3 and the CYP3A clusters
1.2.1 Drug response related genes The ATP binding cassette (ABC) transporter super-family and the P450 cytochrome enzyme super-family
The candidate gene/region-directed pharmacogenetics studies are designed to provide detailed characterization of regional genetic patterns, from which inferences can be made about the nature of specific gene/function interactions. The detailed knowledge obtained in candidate regions can in turn provide guidelines for genome-wide pharmacogenetics studies. To conduct candidate gene directed studies, specific target genes/regions showing relevance to drug response must first be selected. Many genes contribute to drug responses: genes that control the absorption, distribution, metabolism and elimination of drugs/chemicals/toxins determine one’s pharmacokinetic properties, whereas genes that are targets of individual drugs and those residing in drug-response signaling pathways affect the pharmacodynamic properties (3). Currently, major attention is paid to pharmacokinetic determinants, as they affect drug responses in a more direct and general way. These pharmacokinetic genes mainly comprise two functional groups: the drug metabolizing enzymes and the drug transporters.
Drug metabolizing enzymes basically carry out two types of xenobiotic metabolism: Phase I and Phase II. Reactions involving chemical structure transformations of compounds, such as reduction and hydrolysis, are classified as Phase I, whereas conjugation reactions of a compound via glucuronidation, sulfation, acetylation, methylation and so on are classified as Phase II. Among the Phase I enzymes, the cytochrome P450 (CYP) family, characterized by its heme-containing oxygenases, constitutes a major system for metabolizing xenobiotics and chemical compounds.
So far, ~90 CYP genes have been identified in the human genome (http://drnelson.utmem.edu/CytochromeP450.html). Among these, genes from families 1-3 are mainly responsible for the metabolism of exogenous compounds, including drugs and xenobiotics, while members of other families (CYP 4-51) participate mainly in the metabolism of endogenous metabolites such as fatty acids. It is estimated that the CYP 1-3 families metabolize up to 70-80% of all the drugs in use today (84, 85). The CYP1-3 families have long been of particular interest to pharmacogenetics studies, not only for their unique importance in drug metabolism, but also because of their evidently high polymorphism in both genetic and phenotypic aspects. It was found that, compared to families CYP 5-51, which metabolize endogenous substrates and have an overall percentage of pseudo-genes of about 13%, the CYP1-3 families have a high ratio of 16 pseudo-genes out of the 38 identified genes (42%) (86). Nucleotide polymorphisms are also evidently more common in CYP1-3 genes, most of which harbor functional polymorphisms, whereas genetic polymorphisms were detected in only 6 out of 20 genes in the CYP4-51 families (86). The much lower genetic diversity of the CYP4-51 genes, which metabolize endogenous substrates, can be explained by the essential and highly conserved nature of endogenous metabolic pathways. In contrast, CYP1-3 genes would be exposed to much more rapidly changing environmental agents, and hence have more genetic polymorphisms to cope with the relatively short-term and moderate pressures exerted on the genes. These genetic variants have long been found to have a functional impact on drug responses. Back in 1977, Mahgoub et al. reported the polymorphic hydroxylation of a drug in humans (87). This drug-response variation was later attributed to genetic polymorphisms in the CYP2D6 gene (88), which is presently the best characterized CYP enzyme. It is now clear that more than 70 CYP2D6 alleles account for more than 200-fold variability in the metabolism of >100 drugs (89). It is also estimated that such a highly polymorphic pattern and strong impact on drug response are not unique to well studied genes like CYP2D6, but common in many other CYP1-3 genes (Home page of the Human Cytochrome P450 (CYP),
thus serve as good candidates for identifying important genetic variants in drug-response genes.
Besides the drug metabolizing enzymes, drug transporters also play crucial roles in determining drug responses, and have attracted increasing attention in the past few years. Most of the drug transporters identified so far belong to the ATP binding cassette (ABC) transporter super-family, the largest transmembrane (TM) protein family in the genome. The name of this gene superfamily comes from these transporters’ characteristic ATP binding domain(s). A typical ABC transporter protein has two homologous halves, each of which contains one transmembrane (TM) domain with 6-12 membrane-spanning α-helices and one ATP binding domain, also called the Nucleotide Binding Domain (NBD) (90). Some ABC proteins, however, are half-transporters with only one such half, and must form homo- or hetero-dimers to function as transporters. Transporters from this superfamily form crucial translocation systems in organisms from bacteria to humans. With the consumption of ATP energy, they translocate a wide variety of chemical compounds, including peptides, metabolic products, lipids and steroids, drugs and xenobiotics, across all kinds of extra- or intra-cellular membrane systems. In humans, several members of the ABC family are particularly important to drug response. These genes were first identified in cancer and tumor cell lines that developed the multi-drug resistance (MDR) phenotype, i.e. the ability to prevent a variety of chemotherapy drugs from entering the cell. The MDR phenotype strongly correlated with the over-expression of several ABC transporters, including ABCB1 (MDR1), ABCC1 (MRP1) and ABCG2 (MXR) (91). Functional assays revealed that these multi-drug resistance transporters confer the MDR phenotype by pumping out a wide variety of structurally unrelated drugs, cytotoxins and related chemicals from the inner cell membrane. The expression of a well studied drug transporter, the MDR1 gene, is widely distributed throughout the body, and is especially high at the interfaces of major organs such as the liver, intestine, blood-brain barrier (BBB) and kidney. Studies show that other important drug transporters have similar distributions along the major organ interfaces. Such expression
distributions, combined with their drug transport functions, strongly suggest that, together, they constitute a powerful protective system for essential organs against exogenous toxins, drugs and xenobiotics. Genetic studies focusing on nucleotide polymorphisms in several members of the MDR ABC genes, and on their association with functional differences and diseases, have increased very rapidly in the past few years. A recent genome-wide survey demonstrated that a large number of SNPs are localized near the ABC genes, at variable densities (92). All these studies highlight the need for pharmacogenetic characterization of the ABC genes of drug-response importance.
Figure 1.1. Schematic distribution of drug-response related genes at locus 7q21.1
The ABCB1 (MDR1) gene is the first cloned human ABC gene and the best characterized drug transporter (91). This gene maps to chromosome 7q21.1. The ABCB1 gene has been intensively examined for its drug-transport function and its substrate spectrum. Recently, there have been vigorous debates about the functional associations of several SNPs in ABCB1 with many drug-transport related phenotypic changes and several diseases. More detailed reviews of functional and genetic studies on ABCB1 are presented in later sections. However, before the studies presented in this thesis, there were few systematic genetic profiling studies of this gene, although such studies are critical in directing the identification of the actual causative variants. Furthermore, there was a lack of thorough comparisons of the genetic profiles among global populations at this gene locus, which made it difficult to compare the effects of MDR1 polymorphisms in different populations (93, 94). Besides the ABCB1 gene, it is striking that several other genes important to drug response also lie around the chromosome 7q21.1 region, within a relatively short distance of 12Mb from ABCB1. These genes include ABCB4, a transporter gene highly homologous to ABCB1, and a cluster of CYP3A genes including the important drug metabolizers CYP3A4 and CYP3A5 and several other major CYP3A members (Figure 1.1). Their functional importance and related genetic studies will be reviewed in more detail in later sections. However, limited efforts have been made on the comprehensive genetic profiling of these genes as well. The whole region, with two ABC transporters and several CYP3A drug metabolizers, can thereby serve as an ideal target for a candidate gene/region directed pharmacogenetic study. Furthermore, it is known that ABCB1 and CYP3A4, the two major drug-response factors, share a highly overlapping substrate spectrum, and both are controlled by some common drug-response transcription regulators (3, 95). It is therefore interesting to examine whether these two genes interact at the genetic level. The study of such interactions might ultimately benefit genome-wide and multi-factorial pharmacogenetic studies.
1.2.2 ABCB1 (MDR1) gene and its related functional and genetic studies
ABCB1 (MDR1) is the best studied and one of the most important ABC drug transporters. It is the first cloned ABC transporter (96). This gene was first identified via the observation that its protein product, the Pgp protein, is highly expressed on the plasma membrane of tumor cell lines which developed cross-resistance to various anti-cancer drugs (i.e. the MDR phenomenon) (97, 98). The ABCB1 protein, also called P-glycoprotein (Pgp), is a glycosylated protein of 1280 amino acids. It is a typical ABC transporter with two homologous TM-NBD halves, in which each TM domain has 6 membrane-spanning helices. Pgp is a major drug/xenobiotic transporter with a very wide spectrum of structurally unrelated substrates, including anti-cancer, anti-arrhythmic, anti-depressant, anti-psychotic, anti-tuberculosis, cardiovascular, immunosuppressive and anti-viral drugs, including HIV-1 protease inhibitors (99, 100), as well as steroids, opioids, small peptides and other xenobiotics (101). Its expression has been detected throughout the body, and is especially high at the interfaces of major organs, such as the epithelial cells of the intestinal tract, proximal renal tubular cells, the canalicular membrane of hepatocytes, pancreatic ductuli, the blood-brain barrier (BBB), testes, placenta, adrenal cortex, and blood cells. The molecular function and expression distribution of Pgp strongly suggest that ABCB1 plays a pivotal role in protecting the human body, especially the important organs, against the entry of xenobiotics, drugs and exogenous toxins. Knockout mouse experiments supported such a physiological function of ABCB1. Schinkel first found that mice deficient in mdr1a, the homologous counterpart of MDR1, accumulated 100 times more of the anthelmintic ivermectin, an MDR1 substrate, in the brain than wild-type mice after oral administration (102). In humans, studies have shown that Pgp limits the bioavailability of various drugs by secreting them back into the gastrointestinal tract, after either oral or intravenous administration (103). Several inhibitors of Pgp, on the other hand, increased the bioavailability of oral drugs (104, 105).
Recently, much attention has been focused on genetic polymorphisms at the MDR1 locus, since any functional genetic variants in this gene may act as important modulators of drug responses. Prior to the analyses of natural polymorphisms in ABCB1, several acquired mutations in cell lines and a number of artificially generated amino acid substitutions were functionally characterized for their influence on Pgp function. Several substitutions showed significant modulation of Pgp substrate specificity and transport efficacy (106). Hoffmeyer et al. conducted the first systematic screening of natural polymorphisms in ABCB1 in a Caucasian population, and identified 15 SNPs in the 5’ UTR, promoter, exons and intron-exon boundaries (107). Further studies identified additional polymorphisms, and characterized their allele frequencies in several populations (108-110). As ABCB1 gene expression was found to vary markedly between individuals, early efforts focused on detecting possible effects of promoter polymorphisms on ABCB1 expression. However, up to now there have been no conclusive findings correlating promoter genotypes with ABCB1 functional changes. Instead, several exonic SNPs have attracted much interest recently. Of the 15 SNPs first identified in ABCB1, Hoffmeyer et al. found that a synonymous SNP, exon26 3435C>T, is correlated with significantly lower Pgp expression and weaker in vivo drug efflux function (107). However, this finding did not lead to an easy conclusion about the genotype/phenotype relationship at this SNP. Rather, it initiated intensive debates, as subsequent association studies on this SNP showed marked inconsistency. Several studies reported the association of exon26/3435T with lower ABCB1 expression (107, 111, 112) and increased drug accumulation (107, 111) for a variety of drugs and tissues, suggesting that the T allele is associated with a decrease in MDR1 expression or function. Fellay et al. (113) noted that HIV-1 patients carrying the T allele had lower MDR1 expression in blood cells and a better response to anti-HIV-1 drugs, which is consistent with the former findings; however, these patients simultaneously had a lower plasma concentration of the drug nelfinavir after oral administration, implying better Pgp function. Furthermore, Kim et al. (110) and Sakaeda et al. (114) found that the T allele of SNP 3435 correlated with lower plasma/serum levels of fexofenadine and digoxin, respectively, favoring a stronger efflux efficacy of the T allele. There were also a number of studies reporting no association between this SNP and Pgp expression/function (115, 116).
Although examinations were mainly focused on exon26/3435C>T, this SNP does not change any amino acid of Pgp, and is therefore not very likely to directly cause functional changes. Kim et al. first noticed the co-segregation of the exon26/3435 T variant with the T alleles of two other exonic SNPs: the synonymous 1236C>T of exon 12, and the non-synonymous 2677G>T/A of exon 21, which results in an Ala893Ser/Thr amino acid change (110). Relatively few studies have been conducted on these two additional SNPs, although some inconclusive associations of these two SNPs with Pgp function were reported (110, 117-119). In summary, abundant evidence supports the existence of functional polymorphisms in ABCB1, but current reports suggest a rather complicated nature of the underlying genotype/phenotype interactions.
Studies have also been done to evaluate the association of ABCB1 polymorphisms, particularly exon26/3435 C>T, with responses to therapeutic treatments as well as with predisposition to diseases involving xenobiotic exposure. The exon26/3435 C>T variant was found to correlate significantly with altered responses to HIV antiretroviral therapy (113), and with altered risks of developing Parkinson’s disease (120, 121), epilepsy (122), renal epithelial tumors (123), gastrointestinal diseases (124) and so on, although a clear mechanism for these correlations is as yet unknown.
To directly address the question of whether these polymorphisms are responsible for the phenotypic changes, several important SNPs were also assayed in vitro. Kim et al. first demonstrated, using a retroviral vector system, that NIH-3T3 cells expressing the Pgp variant Ser893 (exon 21/2677 T) had significantly lower digoxin accumulation than those expressing wild-type Ala893 (exon 21/2677 G) (110). However, Kimchi-Sarfaty et al. failed to detect any changes in either Pgp expression or drug transport after expressing ABCB1 mutants carrying each of several SNPs (A61G, T307C, G1199A, G2677T and G2995A) in a vaccinia virus-based transient expression system in HeLa cells (125). Similarly, a recent study testing different combinations of alleles of exon21/2677 and exon26/3435 in LLCPK1 cells also recorded no significant differences in the transport of various common Pgp substrates (126). Several factors may account for the overall negative results: the tested variants may change the efflux of some but not other drugs, or other intrinsic transporters expressed in the host cells may have reduced the observed effects on Pgp. However, the seemingly strong linkage disequilibrium seen in the ABCB1 gene (110) highlights another possibility: that some other polymorphisms, not yet identified but in strong LD with exon26/3435 or exon21/2677, may be the real causal variants.
These single-SNP based association and characterization studies have implied a complicated, if not totally contradictory, genotype/phenotype relationship in the ABCB1 gene. Further single-SNP based studies would be as insufficient as blind trials, as such studies do not explicitly address the relationships among polymorphisms, and thus provide no clues to the position of the causal variant(s) (3, 127). Recent studies suggested that genetic variants in introns or even “non-functional” intergenic regions can also strongly affect enzyme functions via newly discovered mechanisms, such as alternative splicing of mRNA (128). Therefore, the strong LD blocks surrounding the associated SNPs should be defined, and systematic genetic analyses should be conducted to narrow down the search. Furthermore, it is in principle improper to compare single-SNP associations between different populations without a prior systematic characterization of the inter-population differences in the local genetic background. The currently observed discrepancies might well be caused by underlying differences in the genetic profiles of different populations. However, these two important problems have not been fully considered or systematically investigated in previous studies. A comprehensive characterization of genetic profiles, in terms of LD and haplotypes, across different major populations is therefore necessary.
1.2.3 ABCB4 (MDR3) gene and its related functional and genetic studies
Another ABCB transporter gene, ABCB4 (MDR3), lies closely downstream of ABCB1, in the same orientation. This gene has very high homology to ABCB1 (129, 130), suggesting that the two ABCB genes arose from a gene duplication event. Currently, the physiological function of ABCB4 is not totally clear. A better characterized function of ABCB4 is the translocation of phosphatidylcholine (PC) from liver to bile (131-133). Mice with a knockout of the mdr2 gene, the homologous counterpart of ABCB4, completely failed to translocate PC molecules in the liver (131). Conversely, an epithelial cell line stably transfected with ABCB4 gained PC transport function (133). On the other hand, as a highly homologous genetic sibling of ABCB1, ABCB4 has been explored for a possible role in drug response. Early studies that introduced ABCB4 expression into cell lines by
Smith et al. showed that ABCB4 does interact with and transport a subset of MDR1 substrates (135). The overall effects of ABCB4 on drug response are not yet fully understood. Another interesting aspect of ABCB4 is its seemingly highly polymorphic nature. At least 4 alternatively spliced transcripts have been reported for ABCB4, all resulting in in-frame protein products (129, 130). This gene also has a number of different transcription start sites (136). This evidence suggests that genetic polymorphisms in this gene may have a significant impact on its physiological functions. Evidence from functional and clinical studies also supports functional effects of ABCB4 polymorphisms, as several mutations in ABCB4 were found to be responsible for PC translocation defects related to liver disease (137-141), and significant associations were reported between ABCB4 expression level and several drug-resistance phenotypes during the treatment of certain diseases, particularly some leukemias (142-145).
1.2.4 An important CYP3A cluster: the CYP3A4, CYP3A5 and CYP3A7 genes
As discussed in the previous sections, the CYP1-3 genes are important determinants of drug response. Among these, the CYP3A subfamily is one of the most important CYP subfamilies involved in drug metabolism. It has been estimated that the CYP3A enzymes constitute up to 30% of total P450 isoforms in the liver (146); they are also responsible for the metabolic elimination of ~50% of today’s commonly used drugs (147). Notably, the major members of this subfamily all cluster within a 300 kb region at chromosome 7q21.1, about 12 Mb upstream of the ABCB1 gene (Figure 1.1), spanning consecutively the CYP3A43, CYP3A4, CYP3A5P2 (pseudogene), CYP3A7, CYP3A5P1 (pseudogene) and CYP3A5 genes.
CYP3A4 is the best characterized CYP3A isoform. It has a wide spectrum of substrates, including nifedipine, midazolam, cyclosporine, erythromycin, alprazolam and triazolam (148-152). CYP3A4 is expressed mainly in liver and intestinal epithelial cells. Notably, both the substrate spectrum and the expression distribution of CYP3A4 overlap with those of ABCB1 (150). The two genes are also regulated by common transcription factors such as PXR, and are co-induced by the same drugs (153). In an association study, Goto et al. further noted that the ABCB1 SNP exon26/3435(C/T) was significantly associated with the expression level of CYP3A4 in the intestine (154). It thus seems that CYP3A4 and ABCB1 have a strong interplay at the functional level.
The other two functional CYP3A genes, namely CYP3A5 and CYP3A7, share high sequence homology with CYP3A4 (up to 80-90%) in both coding regions and introns. This cluster of CYPs clearly arose via gene duplications. CYP3A5 has a substrate spectrum similar to that of CYP3A4. However, CYP3A5 is not normally expressed in every individual, and the frequency of individuals with significant CYP3A5 expression differs greatly across populations. Preliminary studies detected CYP3A5 expression in 10% to 97% of human livers among different populations (155, 156). CYP3A7, on the other hand, was found to be specific to the human fetus, constituting up to 50% of the CYP content of the fetal liver (157).
Of these CYP3A genes, previous studies initially focused on CYP3A4, as this major drug metabolizer shows high inter-individual variation in its expression, up to 20-fold or more (157-162). Furthermore, population differences in CYP3A4-mediated drug disposition have been reported (95). Several SNPs in CYP3A4 were identified, and one in particular, CYP3A4*1B (A>G), attracted much interest. This 5’UTR SNP shows markedly different allele frequencies among ethnic groups, as assessed in Caucasians, Africans and Chinese (163). Subsequent association studies also demonstrated that this SNP was significantly associated with predisposition to several diseases (163, 164). However, other studies failed to detect an association of CYP3A4*1B (A>G) with CYP3A4 metabolic activity (165, 166). The other polymorphisms identified in