Analysis of Genes and Genomes, part 5


… devised as a method of RNA amplification and quantitation after its conversion to DNA. RT–PCR can be used for cloning, cDNA library construction and probe synthesis. The technique consists of two parts (Figure 4.12): the synthesis of DNA from RNA by reverse transcription (RT) and the subsequent amplification of a specific DNA molecule by polymerase chain reaction (PCR). The RT reaction uses an RNA template (typically either total RNA or polyA+ RNA), a primer (random or oligo-dT primers), dNTPs, buffer and a reverse transcriptase enzyme (which we will discuss more in Chapter 5) to generate a single-stranded DNA molecule complementary to the RNA (cDNA). The cDNA then serves as a template in the PCR reaction. During the first cycle of PCR, the single DNA strand is made double stranded through the binding of another, complementary, primer and the action of Taq DNA polymerase.

Like other methods of mRNA analysis, such as northern blots and nuclease protection assays, RT–PCR can be used to quantify the amount of mRNA that was contained in the original sample. This type of analysis is particularly important for monitoring changes in gene expression. However, because PCR amplification is exponential, small sample-to-sample concentration and loading differences are amplified as well. Even large differences in target concentration (100-fold or more) may produce the same intensity of band after 25 or 30 PCR cycles. Therefore, RT–PCR requires careful optimization when used for quantitative mRNA analysis. Quantitation usually takes one of two forms, relative or absolute.

• Relative quantitation compares transcript abundance across multiple samples, using a co-amplified internal control for sample normalization. Results are expressed as ratios of the gene-specific signal to the internal control signal. This yields a corrected relative value for the gene-specific product in each sample. These values may be compared between samples for an estimate of the relative expression of target RNA in the samples.

• Absolute quantitation, using competitive RT–PCR, measures the absolute amount (e.g. 5.3 × 10^5 copies) of a specific mRNA sequence within a sample. Dilutions of a synthetic RNA (containing identical primer binding sites, but slightly shorter than the target RNA) are added to the sample and are co-amplified with the target. The PCR product from the endogenous transcript is then compared with the concentration curve created by the synthetic competitor RNA.
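The normalization step of relative quantitation can be sketched in a few lines of Python. This is only an illustration: the sample names and band intensities are invented, and the function name `relative_expression` is an assumption of this sketch, not something defined in the text.

```python
def relative_expression(target_signal, control_signal):
    """Return the gene-specific signal divided by the internal control signal."""
    if control_signal <= 0:
        raise ValueError("internal control signal must be positive")
    return target_signal / control_signal

# Hypothetical band intensities (arbitrary units) for two samples.
samples = {
    "untreated": {"target": 1200.0, "control": 4000.0},
    "treated": {"target": 3000.0, "control": 5000.0},
}

# Corrected relative value for the gene-specific product in each sample.
normalized = {
    name: relative_expression(s["target"], s["control"])
    for name, s in samples.items()
}

# Relative expression of the target in the treated versus untreated sample.
fold_change = normalized["treated"] / normalized["untreated"]
print(normalized, fold_change)
```

Dividing by the internal control first means that a sample loaded with more total RNA does not appear to express more of the target simply because of the loading difference.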

4.9 Real-time PCR

Quantitative real-time RT–PCR combines the best attributes of both relative and competitive RT–PCR in that it is accurate, precise, high throughput and relatively easy to perform. Real-time PCR automates the otherwise laborious process of relative RT–PCR by quantitating reaction products for each sample in every cycle. Real-time PCR systems rely upon the detection and quantitation of a fluorescent reporter, whose signal increases in direct proportion to the amount of PCR product in a reaction. In the simplest form, the reporter is the double-strand DNA-specific dye SYBR Green (Wittwer et al., 1997). SYBR Green binds double-stranded DNA, probably in the minor groove, and, upon excitation, emits light. Thus, if the dye is included in a PCR reaction, the fluorescence increases as PCR product accumulates. The advantages of SYBR Green are that it is inexpensive, easy to use, and sensitive. The disadvantage is that SYBR Green will bind to any double-stranded DNA in the reaction, including primer dimers and other non-specific reaction products, which can result in an over-estimation of the target concentration. For single PCR product reactions with well designed primers, SYBR Green can work extremely well, with spurious non-specific background only showing up in very late cycles.

The alternative method for quantifying PCR products is TaqMan, which relies on fluorescence resonance energy transfer (FRET) of hybridization probes for quantitation (Figure 4.13). TaqMan probes are oligonucleotides that contain a fluorescent reporter dye, typically attached to the 5′ base, and a quenching dye, typically attached to the 3′ base. The probe is designed to hybridize to an internal region of a PCR product. When irradiated, the excited reporter dye transfers energy to the nearby quenching dye molecule rather than fluorescing, resulting in a non-fluorescent substrate. During PCR, when the polymerase replicates a template on which a probe is bound, the 5′–3′ exonuclease activity of the polymerase cleaves the probe (Holland et al., 1991). This separates the fluorescent and quenching dyes and FRET no longer occurs. Fluorescence increases in each PCR cycle, proportional to the rate of probe cleavage, and is measured in a modified thermocycler. Real-time PCR is a powerful quantitative tool, but the cost of reagents and equipment is much higher than that of standard PCR reactions.

Figure 4.13. TaqMan real-time PCR quantification. Three oligonucleotides are used during the PCR process: two of these (primers 1 and 2) dictate the beginning of DNA replication on each DNA strand, and the third (the probe) binds to one strand in between. The probe contains two modified bases: a fluorescent reporter (R) at its 5′-end and a fluorescence quencher (Q) at its 3′-end. As DNA replication proceeds, the extended product from primer 1 displaces the 5′-end of the probe and the exonuclease activity of the polymerase cleaves the fluorescent reporter from the probe. The separation of the reporter from the quencher allows it to fluoresce. The amount of fluorescence is proportional to the amount of PCR product being made and is measured during each PCR cycle.
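The value of measuring product in every cycle follows from the exponential nature of PCR. The sketch below uses an idealized model in which product multiplies by a constant factor each cycle and never reaches a plateau. The `efficiency` parameter and both function names are assumptions made for this illustration and are not taken from the text.

```python
import math

def copies_after(n0, cycles, efficiency=1.0):
    """Copies present after `cycles` rounds of idealized PCR (no plateau).

    efficiency=1.0 means perfect doubling in every cycle.
    """
    return n0 * (1.0 + efficiency) ** cycles

def cycle_offset(fold_difference, efficiency=1.0):
    """Extra cycles a less abundant template needs to reach the same amount."""
    return math.log(fold_difference) / math.log(1.0 + efficiency)

# 1000 starting copies, 10 cycles of perfect doubling.
print(copies_after(1000, 10))

# Cycles separating two samples that differ 100-fold in starting template.
print(round(cycle_offset(100), 2))
```

Under perfect doubling, a 100-fold difference in starting template corresponds to fewer than seven cycles of separation. That difference is easy to see in cycle-by-cycle measurements but can be lost if the product is examined only after 25 or 30 cycles.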

4.10 Applications of PCR

• fingerprinting/population analysis

• genome analysis

• quantitative PCR of RNA or DNA

We will touch on some of these topics in later chapters but, again, interested readers are directed toward more dedicated literature (McPherson and Møller, 2000; Innis, Gelfand and Sninsky, 1999).


5 Cloning a gene

Key concepts

DNA libraries are pools of recombinant DNA molecules.

Genomic libraries contain fragments of all DNA sequences present in the genome.

cDNA libraries […] developmental stage specific. Their formation is dependent on an RNA-dependent DNA polymerase enzyme, reverse transcriptase.

PCR-based libraries negate the requirement for cloned DNA fragments and can undergo subtraction to isolate genes that are differentially expressed.

Genomes contain an enormous amount of DNA (Table 5.1). Consequently, each gene contained within a genome represents only a tiny fraction of the genome size itself. All traditional DNA cloning strategies are composed of four parts: the generation of foreign DNA fragments, the insertion of foreign DNA into a vector, the transformation of the recombinant DNA molecule into a host cell in which it can replicate and a method of selecting or screening clones to identify those that contain the particular recombinant we are interested in. In this chapter we will address some of the particular problems and issues with the first two of these steps in the formation of DNA libraries. A DNA library is simply a collection of DNA fragments.

There are several different types of library that we will consider here. DNA fragment libraries are designated as being either a genomic DNA library or a cDNA library. Most traditional methods of library construction involve the physical cloning of various DNA fragments into a suitable vector. However, as we will see later, DNA fragments that are not cloned (e.g. those derived …

Analysis of Genes and Genomes. Richard J. Reece

© 2004 John Wiley & Sons, Ltd. ISBNs: 0-470-84379-9 (HB); 0-470-84380-2 (PB)


cDNA libraries are constructed by the conversion of mRNA from a particular tissue sample into DNA fragments that can be cloned into an appropriate vector. cDNA libraries thus contain only the coding sequence of genes expressed in a tissue sample together with small regions of the 5′ and 3′ untranslated portions of the gene. Consequently, cDNA libraries isolated from different tissues of the same organism may be radically different in their composition. The genes expressed in one tissue type or developmental stage may well be different from those expressed in another tissue type or developmental stage. Additionally, the composition of a cDNA library reflects the relative abundance of mRNA in the original tissue sample. Highly expressed genes will be represented in the library multiple times, whereas genes expressed at a low level will be represented in the library less frequently.
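The effect of mRNA abundance on library composition can be illustrated with a simple sampling model. The sketch assumes that every clone is an independent draw from the mRNA pool, which is a simplification. The library size and the two abundance values are hypothetical, and the function names are inventions of this example.

```python
def expected_clones(mrna_fraction, library_size):
    """Expected number of clones for a transcript that makes up
    `mrna_fraction` of the mRNA pool."""
    return mrna_fraction * library_size

def probability_represented(mrna_fraction, library_size):
    """Probability that at least one clone of the transcript is present,
    assuming each clone is an independent draw from the mRNA pool."""
    return 1.0 - (1.0 - mrna_fraction) ** library_size

LIBRARY_SIZE = 100_000  # hypothetical number of independent clones

for name, fraction in [("abundant", 1e-2), ("rare", 1e-5)]:
    print(
        name,
        expected_clones(fraction, LIBRARY_SIZE),
        round(probability_represented(fraction, LIBRARY_SIZE), 3),
    )
```

In this model the abundant transcript is expected about a thousand times, while the rare one is expected only about once and has a substantial chance of being absent from the library altogether.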

5.1 Genomic Libraries

The smallest unit of DNA within a genome is the chromosome. Even in the simplest organisms, however, chromosomes contain an enormous quantity of DNA. For example, the E. coli chromosome contains some 4.6 Mbp (4 600 000 bp) of DNA (Table 5.1). This amount of DNA is far too large to be cloned into any of the vectors currently available (Chapter 3). Therefore it is necessary, and indeed desirable, to fragment the DNA before it is cloned into an appropriate vector. A ‘divide and conquer’ strategy comes into play here, whereby relatively small fragments of the genome can be assigned a specific function whereas the whole genome is somewhat impenetrable. The method of fragmentation plays an important role in the quality of the final library. Ideally, the genomic DNA should be broken up into random and overlapping fragments prior to cloning. Such cleavage would ensure that the library contains representative copies of all DNA fragments present within the genome, and that fragment bias is not encountered by the cleavage of DNA at specific sites only. There are two basic mechanisms for cleaving DNA that are used in the construction of genomic libraries.

(a) Mechanical shearing. Purified genomic DNA is either passed several times through a narrow-gauge syringe needle or subjected to sonication to break up the DNA into suitable size fragments that can be cloned. Typically, an average DNA fragment size of about 20 kbp is desirable for cloning into λ-based vectors. Mechanical methods such as these have the advantage that DNA fragmentation is random, but suffer from the fact that large quantities of DNA are required, and that the average DNA fragment size may be quite variable.

(b) Restriction enzyme digestion. Restriction enzymes, such as EcoRI, often recognize 6 bp DNA sequences and cleave the DNA within the recognition sequence. On average, a 6 bp DNA sequence will occur approximately every 4000 bp within DNA. Complete digestion of genomic DNA with EcoRI will generate DNA fragments that are generally too small to be useful in genomic library construction. Other restriction enzymes, e.g. NotI, recognize and cleave 8 bp recognition sequences. Such sequences will occur much less commonly within DNA (approximately once every 65 kbp). However, restriction enzyme cleavage to produce DNA fragments suffers as a consequence of the recognition sites themselves. If, by chance, a gene that we would like to clone contains multiple recognition sites for a particular restriction enzyme, then the fragments generated after enzyme digestion may be too small to clone, and consequently the gene may not be represented within a library. To overcome this problem genomic DNA libraries are usually constructed by digesting the genomic DNA with restriction enzymes in such a way that the digestion does not go to completion (Figure 5.1). Partial restriction digests will ensure that not all DNA recognition sequences are cut and, consequently, that the library produced should contain copies of genes that may possess multiple restriction enzyme recognition sequences. In practice, restriction digestion is normally performed using a restriction enzyme, or often two, that recognize and cleave very commonly occurring sequences. For example, as shown in Figure 5.2, high-molecular-weight genomic DNA is partially cleaved with a mixture of the restriction enzymes HaeIII and AluI. Each of these restriction enzymes recognizes a 4 bp DNA sequence. Their recognition sequences should therefore occur, on average, approximately every 256 bp within genomic DNA. The partial digestion, however, limits the number of restriction enzyme sites that are actually cut and leads to the formation of genomic DNA fragments of a suitable size for cloning. DNA fragments produced in this manner have blunt ends since both HaeIII and AluI cut DNA in a blunt-ended fashion:

    HaeIII:  5'-GG CC-3'
             3'-CC GG-5'

    AluI:    5'-AG CT-3'
             3'-TC GA-5'

Figure 5.1. The complete and partial digestion of a DNA fragment using a restriction enzyme. (a) Complete digestion ensures that all restriction enzyme recognition sites (RE) are cut. (b) Partial digestion results in the cleavage of a random subset of the recognition sites. Partial digestion will generate a variety of products as indicated.
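The spacings quoted above for 4, 6 and 8 bp recognition sequences follow from a simple calculation, sketched here in Python. The model assumes a random DNA sequence in which all four bases are equally frequent. Real genomes deviate from this, so the figures are only averages. The function name is an assumption of this sketch.

```python
def expected_site_spacing(site_length, alphabet_size=4):
    """Average distance in bp between occurrences of one specific sequence of
    `site_length` bases, in random DNA with equal base frequencies."""
    return alphabet_size ** site_length

# Recognition sequences of the enzymes mentioned in the text.
enzymes = [
    ("AluI", "AGCT"),
    ("HaeIII", "GGCC"),
    ("EcoRI", "GAATTC"),
    ("NotI", "GCGGCCGC"),
]

for enzyme, site in enzymes:
    print(enzyme, site, expected_site_spacing(len(site)), "bp")
```

The results, 4^4 = 256 bp, 4^6 = 4096 bp and 4^8 = 65 536 bp, correspond to the approximate values of 256 bp, 4000 bp and 65 kbp given in the text.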

• Linkers or adaptors. As shown in Figure 5.2, the blunt-ended DNA fragments can be ligated to a series of oligonucleotides that either contain the recognition sequence for a restriction enzyme (linkers) or possess one blunt end for ligation to the genomic DNA and an overhanging sticky end for cloning into particular restriction sites (adaptors). In the case shown here, the DNA fragments are first protected from restriction enzyme cleavage by treatment with a specific DNA methylase (Maniatis et al., 1978). Treatment of the DNA fragments with the EcoRI methylase, in the presence of S-adenosylmethionine, will result in the methylation of the internal-most A residue within the EcoRI recognition sequence (5′-GAATTC-3′). DNA modified in this fashion is unable to be cleaved by the restriction enzyme (see Figure 2.1). The oligonucleotide linkers are then added to the methylated DNA in large excess in the presence of high concentrations of DNA ligase. Subsequent treatment with the EcoRI restriction enzyme will result in DNA cleavage only within the linker molecules, which are the only ones that contain non-methylated EcoRI restriction enzyme recognition sequences. The resulting DNA fragments can then be cloned into the EcoRI restriction site of a suitable vector.

• Restriction enzymes that generate sticky ends. The genomic DNA may be initially digested with a restriction enzyme that cuts frequently and generates sticky ends. For example, digestion of genomic DNA with the restriction enzyme Sau3AI (recognition sequence 5′-GATC-3′) generates DNA fragments that are compatible with the sticky end produced by BamHI (recognition sequence 5′-GGATCC-3′) cleavage of a vector. The ease of this second approach makes its use far more prevalent.

Figure 5.2. The construction of a genomic DNA library. See the text for details. Labels in the figure: high-molecular-weight DNA (>100 kbp); partial restriction digest and size fractionation (20 kbp); EcoRI methylase; mix and ligate.
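The compatibility of Sau3AI and BamHI ends can be checked with a short sketch. It relies on the cut positions of the two enzymes (Sau3AI cuts immediately before the G of GATC, BamHI between the two Gs of GGATCC), which the text does not state explicitly. The helper function is a hypothetical illustration that handles only palindromic sites leaving 5′ overhangs.

```python
def five_prime_overhang(site, cut_index):
    """Single-stranded 5' overhang left when both strands of a palindromic
    site are cut `cut_index` bases from their own 5' ends."""
    return site[cut_index:len(site) - cut_index]

# Sau3AI cuts before the G of GATC (index 0).
sau3ai_overhang = five_prime_overhang("GATC", 0)

# BamHI cuts after the first G of GGATCC (index 1).
bamhi_overhang = five_prime_overhang("GGATCC", 1)

print(sau3ai_overhang, bamhi_overhang, sau3ai_overhang == bamhi_overhang)
```

Both enzymes leave the same four-base overhang, so genomic fragments cut with Sau3AI can anneal to, and be ligated into, a vector opened with BamHI.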

Once the DNA fragments are produced, they are cloned into a suitable vector. Often this will be a λ-based vector but, as we have seen in Chapter 3, a variety of vectors are available for cloning large DNA fragments. The recombinant vector and insert combinations are then grown in E. coli such that a single bacterial colony or viral plaque arises from the ligation of a single genomic DNA fragment into the vector. E. coli cells infected with either a λ phage or transformed with a plasmid DNA are unable to support the replication of additional DNA molecules of the same type. Consequently, each λ plaque or bacterial colony contains multiple copies of the same recombinant DNA molecule. A library of these molecules is produced by pooling colonies or plaques such that sufficient are present to ensure that each genomic DNA fragment is represented at least once within the library. The main advantage of cloning large DNA fragments is that fewer individual clones must be pooled together to form a representative library. A pertinent question to ask here is how many individual colonies or plaques must be pooled to ensure that a library is truly representative of the genomic DNA from which it was made. The answer to this depends upon both the size of the genome from which the library is made and upon the average size of the cloned DNA fragments within the library. For example, if a library of the E. coli genome (4.6 Mbp) were constructed containing 5 kbp fragments, then the ratio of the genome size to the average individual cloned fragment size (f) would give the lowest possible number of clones that the library must contain:

$$f = \frac{\text{genome size}}{\text{fragment size}} = \frac{4\,600\,000\ \text{bp}}{5000\ \text{bp}} = 920$$

Therefore, an E. coli genomic library of this size would require at least 920 independent clones. Using the same calculation, a human genomic library containing similar sized inserts would require at least 580 000 independent recombinants to construct a representative library. If the fragment size is increased to 20 kbp, as is common for λ vectors, then the human library must contain at least 145 000 independent recombinant clones to be representative.

The ratio of genome size to fragment size is, however, an under-estimate of the complexity required for the construction of a library. Libraries must contain a much larger number of recombinant clones than this since some sequences are invariably under-represented, either by chance sampling error or as a consequence of the DNA sequence itself: perhaps the cloned DNA is relatively toxic to the host cell in which the recombinant vector is replicated, or it contains sequences that are difficult to clone, e.g. highly repetitive DNA. In the mid-1970s, Clarke and Carbon derived a formula relating the probability (P) of including any particular DNA sequence in a library to the number (N) of independent recombinants (Clarke and Carbon, 1976):

$$N = \frac{\ln(1 - P)}{\ln\left(1 - \frac{1}{f}\right)}$$

where f is the ratio of the genome size to the fragment size described above, and ln is the natural log. Therefore, to achieve a 99 per cent probability (P = 0.99) of including any particular sequence of random human genomic DNA in a library of 20 kbp fragments, N = 6.7 × 10^5. In practice, most human genomic libraries will contain over one million independent recombinant clones.
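The library-size estimates above can be reproduced with a short Python sketch. The human genome size of 2.9 × 10^9 bp is not stated in this section. It is inferred here from the figure of 145 000 clones of 20 kbp and should be treated as an assumption. The function names are likewise inventions of this example.

```python
import math

def minimum_clones(genome_size_bp, fragment_size_bp):
    """Ratio f of genome size to average insert size."""
    return genome_size_bp / fragment_size_bp

def clones_for_probability(p, f):
    """Number of independent clones N needed for probability p of including
    any given sequence: N = ln(1 - p) / ln(1 - 1/f)."""
    return math.log(1.0 - p) / math.log(1.0 - 1.0 / f)

E_COLI_BP = 4_600_000
HUMAN_BP = 2_900_000_000  # assumed; inferred from 145 000 clones x 20 kbp

f_ecoli = minimum_clones(E_COLI_BP, 5_000)
f_human = minimum_clones(HUMAN_BP, 20_000)
n_human = clones_for_probability(0.99, f_human)

print(f_ecoli, f_human, round(n_human))
```

With P = 0.99 the required number of clones is roughly 4.6 times f, which is why a representative library needs several-fold more clones than the simple genome-to-fragment ratio suggests.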

The pooling together of either recombinant plaques or bacterial colonies generates a primary library. The recombinant clones are simply washed off the growth plates and combined into a suitable test-tube. The library should contain a representative copy of each DNA molecule from which it was produced. Of course, it is possible that some DNA molecules cannot be incorporated within the library. Certain DNA sequences may be toxic to E. coli. The foreign DNA may

• be fortuitously expressed in E. coli and the protein or protein fragment may be harmful to bacterial growth,

• act as a binding site for E. coli proteins and sequester them in such a way that they are unable to perform their natural function or

• be highly repetitive and eliminated from the bacterial cell through recombination.

The primary library is usually of a low titre and is often quite unstable. To increase both its stability and its titre, the library is often subjected to an amplification step. That is, the collection of phages or bacterial colonies is plated out once more, and the resulting progeny collected to form an amplified library. The amplified library usually has a much larger volume than the primary library, and consequently may be screened many, or even hundreds, of times. Pooled collections of λ phages can be stored almost indefinitely. Bacterial cells harbouring plasmids are more difficult to store and there is often a high degree of recombinant clone loss upon resurrection of frozen bacterial cells. Amplification of the library is essential if the library is to be screened multiple times. However, it is possible that the amplification process will result in the composition of the amplified library not truly reflecting the primary one. As we have already discussed, certain DNA sequences may be relatively toxic to E. coli cells; as a consequence bacteria harbouring such clones will grow more slowly than other bacteria harbouring DNA sequences that do not affect bacterial growth. Such problematic DNA sequences may be present in the primary library, but will be lost, or under-represented, after the growth phase required to produce the amplified library.

5.2 cDNA Libraries

Not only are the genomes of higher-eukaryotic organisms big, but also only a small fraction of the DNA contained within them codes for genes. The Human Genome Sequencing Project (Chapter 9) has estimated that genes constitute only about 1.5 per cent of the DNA contained within the genome. The knowledge of the entire genome sequence is important to understand the potential of a cell, i.e. the proteins that it could potentially produce, but perhaps more important is knowledge of the protein content that individual cells actually produce. All cells within an individual organism are derived from the same genome sequence, but the way in which the genome is transcribed and translated is unique to individual cell types, and to the individual developmental stages of each cell. Although many of the genes expressed by different cell types will be the same, e.g. the genes encoding the enzymes of the TCA cycle, some will also be different, e.g. some of the genes expressed within a skin cell will be different to those of a muscle cell. These differentially expressed genes, and the proteins that they produce, define each individual cell type. Thus, the mRNA that is contained within a cell gives us a snapshot of the genes being expressed within that cell at any particular time. mRNA actually represents only a small fraction of the total RNA contained within a cell (Table 5.2).

Most eukaryotic protein coding genes are transcribed by RNA polymerase II and the resulting mRNA is usually subjected to a number of post-transcriptional modifications, including the addition of a 7-methylguanosine cap at the 5′-end, and the addition of 100–200 adenine residues (a poly(A) tail) at the 3′-end of the transcript (see Figure 1.27). Additionally, the mRNA undergoes splicing to remove the introns so that the translation of a single contiguous message can occur.

Table 5.2. The distribution of RNA molecules within cells. In eukaryotes, RNA polymerase II is responsible for the production of approximately 60 per cent of newly synthesized transcripts. Due to its instability, however, mRNA accumulates at a level of 10 per cent or less (Brandhorst and McConkey, 1974).

The problem with mRNA is, of course, that it cannot be maintained in stable vectors and is difficult to manipulate. Consequently, a DNA copy (called complementary DNA, or cDNA) of the mRNA is required before a library can be constructed. The conversion of RNA to DNA is dependent upon the action of reverse transcriptase, an enzyme found in retroviruses that is responsible for the conversion of their RNA genome into a DNA copy prior to integration into host cells (Figure 5.3). David Baltimore and Howard Temin independently discovered the enzyme in 1970 (Temin and Mizutani, 1970; Baltimore, 1970). Reverse transcriptase is an RNA-dependent DNA polymerase that, like all other DNA polymerases, catalyses the addition of new nucleotides to a growing chain in a 5′ to 3′ direction. Reverse transcriptases generally have two types of enzymatic activity.

• DNA polymerase activity. In the retroviral life cycle, reverse transcriptase produces a DNA copy from RNA only but, as used in the laboratory, it will transcribe both single-stranded RNA and single-stranded DNA templates with essentially the same efficiency. In both cases, an RNA or DNA primer is required to initiate synthesis.

• RNaseH activity. RNaseH is a ribonuclease that degrades the RNA from RNA–DNA hybrids, such as those formed during reverse transcription of an RNA template. RNaseH functions as both an endonuclease and exonuclease to hydrolyse its target molecules.

Figure 5.3. Reverse transcriptase. The X-ray crystal structure at 1.8 Å resolution of a catalytically active fragment of reverse transcriptase from Moloney murine leukemia virus (MMLV-RT) (Georgiadis et al., 1995). The enzyme is an RNA-dependent DNA polymerase that is used in the conversion of mRNA into cDNA.

All retroviruses encode their own reverse transcriptase (RT), but the commercially available enzymes used in cDNA library construction are derived either from Moloney murine leukemia virus (MMLV-RT) or from avian myeloblastosis virus (AMV-RT), after purification of the enzyme from virally infected cells or following expression in and purification from E. coli. Both enzymes have the same fundamental activities, but differ in a number of characteristics, including temperature and pH optima. MMLV-RT is a single polypeptide of 71 kDa in size, while AMV-RT is composed of two polypeptide chains 64 kDa and 96 kDa in size. Most importantly, MMLV-RT has a very weak RNaseH activity compared to AMV-RT, which gives it an obvious advantage when being used to synthesize DNA from long RNA molecules.

The process of producing a double-stranded cDNA copy of an mRNA molecule is shown in Figure 5.4. The presence of a polyA tail is unique to mRNA, and provides a mechanism of distinguishing and isolating mRNA from the more abundant rRNA and tRNA molecules. mRNA can be physically isolated from its more abundant relatives by passing total RNA over a column to which polymers of deoxythymidine (oligo-dT) are bound. RNA molecules that do not contain multiple adenine residues will be unable to adhere to such a column and will flow straight through the column. mRNA molecules, on the other hand, will bind through complementary base pairing to the column and will be eluted only when the concentration of salt flowing through the column is lowered.

The cloning of cDNA is initiated by mixing short (12–18 base) oligonucleotides of dT with purified mRNA such that the oligonucleotide will anneal to the polyA tail of the RNA molecule. Reverse transcriptase is then added and uses the oligo-dT as a primer to synthesize a single strand of cDNA in the presence of the four deoxynucleotide triphosphates (dNTPs). The resulting molecules will be double-stranded hybrids of one cDNA and one mRNA molecule. cDNA strands made using a simple oligo-dT primer will have heterogeneous ends: the primer can pair at numerous positions throughout the polyA tail and consequently will yield cDNA fragments of different lengths which may have been derived from the same mRNA molecule. To overcome this problem, anchored oligo-dT primers are often employed. In addition to the 12–18 base dT sequence, anchored primers are constructed such that the extreme 3′-end contains either a G, A, or C residue (Liang and Pardee, 1992). Such primers (5′-T(12–18)V-3′, where V = G, A, or C) will only efficiently initiate DNA replication if they are paired at the extreme 5′-end of the polyA tail, when the G, A, or C residue can base pair with the nucleotide immediately preceding the polyA sequence.

Figure 5.4. The construction of a cDNA library. See the text for details.

The production of the second DNA strand, like all DNA replication, requires a primer to initiate DNA synthesis. However, beyond the polyA tail, mRNA molecules produced from different genes will be different. Therefore, a mechanism is required to initiate DNA synthesis at sequences corresponding to the 5′-end of the mRNA. Early cDNA cloning strategies involved the formation of a hair-pin in the newly synthesized cDNA strand, which would serve as a self-priming structure for the formation of the second strand. The hair-pin would be subsequently removed from the double-stranded cDNA by treatment with S1 nuclease (Efstratiadis et al., 1976). However, such methods invariably resulted in the loss of sequences at the 5′-end of genes, and so the second DNA strand is usually synthesized following either nick translation or homopolymer tailing.

• Nick translation. RNaseH is used to partially digest the RNA component of the RNA–DNA hybrids (Gubler and Hoffman, 1983). The remaining RNA is then used as a primer for fresh DNA synthesis using DNA polymerase I in the presence of the four dNTPs, and finally DNA ligase is used to seal any remaining nicks in the DNA backbone. The resulting double-stranded cDNA molecule can subsequently be cloned into a suitable vector.

• Homopolymer tailing. The RNA–DNA hybrids formed after the first cDNA strand synthesis are treated with the enzyme terminal transferase in the presence of a single deoxynucleotide triphosphate. Terminal deoxynucleotidyl transferase (TdT) is a template-independent polymerase that catalyses the addition of deoxynucleotides to the 3′-ends of DNA molecules (Chang and Bollum, 1986). TdT activity was initially identified by the analysis of immunoglobulin (VDJ) recombination, in which extra nucleotides were found to be inserted into the joined segments that were not present in either segment before joining (Alt and Baltimore, 1982). TdT is found at high concentration in the thymus and bone marrow where such recombination events occur, but is commercially available as a recombinant protein over-produced in and purified from E. coli. DNA (and RNA) molecules incubated with TdT in the presence of dCTP will have multiple C residues added to their 3′-ends (Figure 5.4). Prior to the synthesis of the second DNA strand, the RNA of the RNA–DNA hybrids must be removed to provide a single-stranded template for new DNA synthesis. This can be achieved easily by treating the hybrids with alkali. RNA is hydrolysed into ribonucleotides around pH 11, while DNA is resistant to hydrolysis up to about pH 13 (Watson and Yamazaki, 1973). Increasing the pH to about 12 therefore results in the hydrolysis of the RNA, but not the DNA. Full-length cDNA strands are separated from the ribonucleotides on the basis of their size using sucrose gradient centrifugation. The resulting cDNA strands will have multiple C residues at their 3′-ends and multiple T residues at their 5′-ends (Figure 5.4). Second-strand cDNA synthesis is then initiated using an oligo-dG primer that will bind, through complementary base pairing, to the newly formed polyC sequence. Reverse transcriptase, performing the role of a DNA-dependent DNA polymerase, in the presence of the four dNTPs will produce the second cDNA strand.

Homopolymer tailing has an additional advantage in that both the 5′- and 3′-ends of the original mRNA are tagged with specific and known sequences in the resulting double-stranded cDNA. This can be immensely helpful when cloning cDNA fragments in a specific orientation is required, e.g. during the expression of the cDNA. mRNA molecules are directional. The 5′-end represents the beginning of the gene sequence, and the 3′ polyA tail occurs at the end of the gene sequence. Therefore, if we want to express the cDNA in, for instance, bacterial cells, it is important to ensure that only the sense strand of the cDNA is transcribed. If the antisense strand is cloned downstream of a bacterial promoter, then the resulting transcript (if produced at all) will not encode the intended protein (Figure 5.5).

Figure 5.5. cDNA that is to be expressed must be cloned in a defined orientation so that the promoter element to which it is attached will initiate the transcription of the sense strand of the DNA, rather than the antisense strand.

5.3 Directional cDNA Cloning

The synthesis of cDNA using modified oligonucleotides to initiate each strand of DNA synthesis allows the insertion of unique restriction enzyme recognition sites at either end of the cDNA so that cloning of the cDNA fragments can only occur in one direction (Figure 5.6). In the example shown here, the oligo-dT primer also contains additional sequences at the 5′-end that encode a XhoI restriction enzyme recognition site (5′-CTCGAG-3′). As we discussed for PCR in Chapter 4, the primer initiates DNA synthesis and is itself incorporated into the extended product. Thus a XhoI restriction enzyme recognition site will be incorporated into the 3′-end of the cDNA. Similarly, the primer used to initiate the second cDNA strand contains, in addition to the oligo-dG sequence, an EcoRI restriction enzyme recognition site (5′-GAATTC-3′) at its 5′-end. Consequently, the cDNA produced will contain an EcoRI site at its 5′-end and a XhoI site at its 3′-end. The placement of these sites means that the cDNA can be cloned directionally. A plasmid bearing a suitable promoter followed by, in order, an EcoRI and a XhoI restriction enzyme recognition site will accept the cut cDNA fragments in one orientation only. Thus the promoter will drive the expression of the gene encoded by the cDNA, and not transcription of the opposite strand in the reverse orientation.

Figure 5.6. Directional cDNA cloning. Modified primers initiate DNA synthesis and result in the insertion of restriction enzyme recognition sequences at the 5′- and 3′-ends of the cDNA. (Diagram labels: double-stranded cDNA; second-strand primer 5′-GGGGAATTCGGGGG-3′; cut with EcoRI and XhoI; clone into EcoRI–XhoI-cut vector downstream of the promoter.)
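The orientation constraint can be sketched in a few lines of Python. This is a simplified model, not a cloning simulator: the cDNA body and flanking bases are invented, the function names are hypothetical, and a flipped insert is modelled simply by reverse-complementing the top strand. Only the two recognition sequences and the second-strand primer sequence come from the text and figure.

```python
ECORI = "GAATTC"  # EcoRI site, cut as G^AATTC, leaving a 5'-AATT overhang
XHOI = "CTCGAG"   # XhoI site, cut as C^TCGAG, leaving a 5'-TCGA overhang
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def end_enzymes(top_strand: str) -> tuple[str, str]:
    """Name the enzyme that defines the (left, right) end after a double digest.

    Assumes exactly one EcoRI site and one XhoI site, one near each end.
    """
    e, x = top_strand.find(ECORI), top_strand.find(XHOI)
    if e < 0 or x < 0:
        raise ValueError("both an EcoRI and a XhoI site are required")
    return ("EcoRI", "XhoI") if e < x else ("XhoI", "EcoRI")


# Vector cut with EcoRI and XhoI; the promoter-proximal end is listed first.
VECTOR_ENDS = ("EcoRI", "XhoI")


def fits_vector(top_strand: str) -> bool:
    """True when the insert's overhangs match the vector's, left to right."""
    return end_enzymes(top_strand) == VECTOR_ENDS


# Sense strand: oligo-dG primer with EcoRI site, invented body, polyA, XhoI site.
cdna = "GGGGAATTCGGGGG" + "ATGGCCAAATAA" + "A" * 15 + XHOI + "GGGG"

print(fits_vector(cdna))                      # sense orientation
print(fits_vector(reverse_complement(cdna)))  # flipped orientation
```

Because the AATT and TCGA overhangs are not complementary to each other, the flipped fragment has no compatible end to join, so only the sense orientation is accepted.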

An obvious problem of cutting cDNA with restriction enzymes is that the cDNA itself may contain restriction enzyme recognition sites. Strategies to overcome this problem similar to those we have already encountered during the construction of genomic libraries can also be employed here. Additionally, the inclusion of methylated forms of various deoxynucleotides during the synthesis of the cDNA will protect the newly synthesized DNA strands from cleavage by certain restriction enzymes. For example, cDNA produced in the presence of methylated dCTP will be resistant to cleavage by XhoI by virtue of the presence of the methylated C residues (Endow and Roberts, 1977). Alternatively, rare-cutting restriction enzyme recognition sites, e.g. the recognition sequence for NotI (5′-GCGGCCGC-3′), may be added to the ends of the cDNA fragments to reduce the likelihood of enzyme cleavage within the cDNA itself.
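A rough calculation shows why an 8-bp site such as NotI is less likely to occur inside a cDNA than a 6-bp site such as XhoI. The sketch below assumes a random sequence with equal base frequencies and treats each position as independent; both are simplifying assumptions, and real cDNAs will deviate from them. The function names are hypothetical.

```python
def expected_spacing(site_length: int) -> int:
    """Average distance (bp) between occurrences of a site in random DNA."""
    return 4 ** site_length


def prob_site_free(cdna_length: int, site_length: int) -> float:
    """Approximate probability that a cDNA contains no copy of the site.

    Treats every possible start position as an independent trial.
    """
    positions = max(cdna_length - site_length + 1, 0)
    return (1 - 1 / 4 ** site_length) ** positions


print(expected_spacing(6))  # 6-bp site, e.g. XhoI or EcoRI
print(expected_spacing(8))  # 8-bp site, e.g. NotI

# Chance that a 2 kb cDNA escapes internal cleavage by each class of enzyme.
print(round(prob_site_free(2000, 6), 2))
print(round(prob_site_free(2000, 8), 2))
```

Under these assumptions a 6-bp site is expected about once every 4096 bp and an 8-bp site about once every 65 536 bp, so a 2 kb cDNA has roughly a 0.61 chance of lacking a given 6-bp site and roughly a 0.97 chance of lacking a given 8-bp site.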

The initiation of cDNA synthesis using oligo-dT primers has been immensely successful in the construction of a variety of cDNA libraries. The approach does, however, have limitations. An oligo-dT primer is suitable only for reverse transcription of mRNA molecules with poly(A) tails. Prokaryotic RNA and some eukaryotic mRNAs do not have poly(A) tails (Adesnik et al., 1972). Therefore prokaryotic cDNA libraries cannot be produced by this method, and in eukaryotic libraries some sequences may never be present. Additionally, initial priming at the 3′-end of transcripts will tend to result in the formation of libraries that are enriched with DNA fragments representing the 3′-ends of genes; long transcripts may therefore not be fully represented within the library. These problems can be addressed by using random primers to initiate the first strand of cDNA synthesis. The random primers are usually six to nine nucleotides in length and are synthesized to be a mixture of all possible bases at each position (5′-NNNNNN-3′). Random primers will hybridize at random positions along the mRNA and will serve as starting points for DNA synthesis. cDNA cloned by this method, following the synthesis of the second strand, is unlikely to be full length, but will generate DNA fragments that are more representative of the starting mRNA. Methods have been devised to clone full-length cDNAs starting from a fragment that may have been isolated from a random-primed library (Frohman, Duch and Martin, 1988).
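To give a sense of what "a mixture of all possible bases at each position" means in practice, the short Python sketch below enumerates every sequence in a random hexamer pool and counts the pool sizes for the six- to nine-nucleotide range mentioned above. The function name is hypothetical.

```python
from itertools import product


def random_primer_pool(length: int = 6) -> list[str]:
    """List every primer in a 5'-N...N-3' mixture of the given length."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


hexamers = random_primer_pool(6)
print(len(hexamers))           # number of distinct hexamers
print(hexamers[0], hexamers[-1])

# Pool sizes for primers of six to nine nucleotides.
for n in range(6, 10):
    print(n, 4 ** n)
```

A hexamer pool therefore contains 4096 different sequences, which is why some member of the pool can anneal at almost any position along an mRNA.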

5.4 PCR Based Libraries

The construction of high-quality cDNA libraries is both time consuming and technically difficult. The stability and permanency of a library in which the cDNA fragments have been physically cloned into a vector, coupled with the ability to screen it multiple times, makes these libraries popular choices for isolating cDNA clones. In many cases, however, the need to construct a cloned cDNA library can be bypassed by the analysis of PCR products formed from mRNA. This type of approach is only possible if screening of the DNA fragments (Chapter 6) is performed using nucleic acid hybridization and is not applicable when functional analysis of the encoded protein is required. Nevertheless, PCR-based libraries are easy and rapid to both construct and screen. PCR-based libraries are constructed using a combination of reverse transcriptase and PCR (RT–PCR) (Mocharla, Mocharla and Hodes, 1990). RT–PCR is both sensitive and versatile. The technique can be used to determine the presence or absence of a transcript, to estimate expression levels and to clone cDNA products without the necessity of constructing and screening a cDNA library.

A generalized overall scheme for the production of an RT–PCR library from a mixed population of unknown mRNA molecules is shown in Figure 5.7. Most RT–PCR protocols employ reverse transcriptase to produce the first cDNA strand (Murakawa et al., 1988). The production of a single strand of cDNA is sufficient before proceeding to the PCR stage, where second-strand cDNA synthesis and subsequent PCR amplification are performed using a thermostable DNA polymerase, e.g. Taq DNA polymerase (see Chapter 4). In addition to their DNA-dependent DNA polymerase activity, some thermostable DNA polymerases (e.g. Thermus thermophilus (Tth) DNA polymerase) possess a reverse transcriptase activity in the presence of manganese ions. This has led to the development of protocols for single-enzyme reverse transcription and PCR amplification (Myers and Gelfand, 1991). Systems have also been developed in which the reverse transcriptase reaction and PCR are performed in the same buffer. This eliminates secondary additions to the reaction mix and so decreases both hands-on time and the likelihood of introducing contaminants into the reaction (Wang, Cao and Johnson, 1992). Such systems are ideal for the amplification of mRNA molecules whose sequence is already known using highly specific primers, but the construction of an amplified representative library requires additional steps to ensure that each mRNA molecule within the population is represented within the library. Several methods have been devised to amplify all potential mRNA species within a sample. The method outlined in Figure 5.7 utilizes many of the same elements as we have already seen in cDNA library construction. The first cDNA strand is synthesized using reverse transcriptase from an oligo-dT primer to which additional, unique sequences have been added at the 5′-end. The mRNA strand of the RNA–DNA hybrid is removed by treatment with RNaseH, prior to the addition of multiple C residues at the 3′-end of the DNA molecule using terminal transferase. The second cDNA strand is synthesized using an oligo-dG primer that, again, has unique sequences at its 5′-end. The thermostable DNA polymerase that will be used for the subsequent PCR reaction may also be used to perform the synthesis of the second cDNA strand. The unique sequences at the 5′- and 3′-ends of the resulting double-stranded cDNA are then used as primer binding sites for a PCR reaction using primers containing these sequences. The resulting PCR products will contain a huge number of copies of each cDNA molecule produced in the RT reaction.
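The tagging steps just described can be traced with a small Python sketch. The adapter sequences, the tail length of 12 and the polyA length of 18 are all assumptions chosen for illustration (terminal transferase adds a variable number of residues in practice), and the function names are hypothetical. The point is only that every cDNA, whatever its internal sequence, ends up flanked by the same two primer-binding sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical unique sequences, invented for this sketch.
ADAPTER_T = "GACTAGTCTGCAGTC"  # 5' addition on the oligo-dT primer
ADAPTER_G = "CATGGATCCGTAAGC"  # 5' addition on the oligo-dG primer
TAIL = 12                      # assumed number of C residues added


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def tagged_cdna(mrna_body: str, poly_a: int = 18) -> str:
    """Return the sense strand of the tagged double-stranded cDNA.

    mrna_body is the mRNA sequence (in DNA letters) upstream of the polyA tail.
    """
    # First strand: adapter-bearing oligo-dT primer extended along the mRNA.
    first = ADAPTER_T + "T" * poly_a + reverse_complement(mrna_body)
    # Terminal transferase adds C residues to the 3'-end of the first strand.
    first += "C" * TAIL
    # Second strand: adapter-bearing oligo-dG primer extended along the first strand.
    return ADAPTER_G + reverse_complement(first)


def amplifiable(sense: str) -> bool:
    """True when both PCR primers (the two adapters) match the strand ends."""
    return sense.startswith(ADAPTER_G) and reverse_complement(sense).startswith(ADAPTER_T)


for body in ("ATGGCCAAATAA", "ATGCGTACGTTAGCTAG"):  # two invented mRNAs
    print(amplifiable(tagged_cdna(body)))
```

Both invented cDNAs pass the check, so a single primer pair based on the two adapters would, in this idealized model, amplify the whole population rather than one known transcript.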

5.5 Subtraction Libraries

As we have discussed earlier, many of the mRNA molecules produced by different cells will be the same. For example, almost all cells need to produce the enzymes required for glucose metabolism, and many of the intracellular protein components of all cells are identical. Therefore, we might want to concentrate just on the differences between cell types to identify genes that are distinctive to a cell type, developmental stage or particular environmental stress. The advantage of PCR-based cDNA libraries is that they are amenable to the removal (subtraction) of sequences that are common between two separate libraries. This gives an enrichment of the unique sequences and allows these to be studied more readily. A mechanism by which this type of subtraction can occur is shown in Figure 5.8.

Figure 5.8 (step labels): Boil and anneal; Extract biotin using avidin.
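At the level of library contents, subtraction behaves like a set difference. The Python sketch below is only a conceptual analogy, not a model of the hybridization chemistry; the transcript names and the variable names are invented for illustration.

```python
# Invented transcript names; each set stands for the cDNAs in one library.
library_a = {"glycolytic_enzyme", "ribosomal_protein", "stress_induced_gene"}
library_b = {"glycolytic_enzyme", "ribosomal_protein", "housekeeping_kinase"}

# Sequences present in both libraries are removed from library A.
shared = library_a & library_b
enriched = library_a - library_b

print(sorted(shared))
print(sorted(enriched))
```

What remains after subtraction is the set of sequences found only in the first library, which is the enrichment for unique sequences described above.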

Two PCR-based cDNA libraries are constructed from different mRNA samples. One of the libraries (the driver) is produced using an oligonucleotide that has a biotin moiety chemically added to it. Biotin is a cofactor required for enzymes that are involved in ATP-dependent carboxylation reactions, e.g. acetyl-CoA carboxylase and pyruvate carboxylase (Figure 5.9). Biotin deficiencies in animals are rare, but can be observed following excessive consumption of raw eggs (Baugh, Malone and Butterworth, 1968). The binding of an egg-white protein, called avidin, to biotin prevents its intestinal absorption (Figure 5.9). The complex formed between biotin and avidin is extraordinarily stable (dissociation constant, Kd ≈ 1 × 10⁻¹⁵ M) and avidin can effectively sequester

Figure 5.9. The structure of biotin and the avidin–biotin complex. (a) The molecular structure of biotin. (b) The binding of the biotin molecule to avidin. Shown here is an avidin monomer with a biotin molecule (blue) bound (Pugliese et al., 1993). The functional avidin molecule is a homotetramer.
