
Biochemistry, 4th Edition


Covalent Modification of Histones

Chromatin is also remodeled through the action of enzymes that covalently modify side chains on histones within the core octamer. These modifications either diminish DNA:histone associations through disruption of electrostatic interactions or introduce substitutions that can recruit binding of new protein participants through protein–protein interactions.

Initial events in transcriptional activation include acetyl-CoA–dependent acetylation of lysine residues in histone tails by histone acetyltransferases (HATs) (Figure 29.30). The histone transacetylases responsible are essential components of several megadalton-size complexes known to be required for transcription co-activation (co-activation in the sense that they are required along with RNA polymerase II and other components of the transcriptional apparatus). Examples of such complexes include TFIID (some of whose TAFIIs have HAT activity), the SAGA complex (which also contains TAFIIs), and the ADA complex. N-Acetylation suppresses the positive charge in histone tails, diminishing their interaction with the negatively charged DNA.

Phosphorylation of Ser residues and methylation of Lys residues in histone tails also contribute to transcription regulation (Figure 29.30). Attachment of small proteins to histone C-terminal lysine residues through ubiquitination and sumoylation (see Chapter 31) are two additional forms of covalent modification found in nucleosomes. Collectively, these modifications create binding sites for proteins that modulate chromatin structure, such as the chromatin-remodeling complexes with bromodomains that interact specifically with acetylated lysine residues and chromodomains that bind to methylated lysine residues. A “histone code” has emerged.

Covalent Modification of Histones Forms the Basis of the Histone Code

A code based on histone-tail covalent modifications determines gene expression through selective recruitment of proteins. Proteins that cause chromatin compaction (heterochromatin formation) lead to repression; proteins giving easier access to DNA through relaxation of histone:DNA interactions favor the possibility of gene expression.

FIGURE 29.30 A schematic diagram of the nucleosome illustrating the various covalent modifications on the N-terminal tails of histones. acK = acetylated lysine residue; meK = methylated lysine residue; meR = methylated arginine residue; PS = phosphorylated serine residue. The numbers indicate the positions of the amino acids in the amino acid sequences. Note the prevalence of modifiable sites, particularly acetylatable lysines, on the N-terminal tails of histones H2B, H3, and H4.


The prominent forms of histone covalent modification are lysine acetylation, lysine methylation, serine phosphorylation, lysine ubiquitination, and lysine sumoylation. The lysine residue at position 9 (K9) in the histone H3 amino acid sequence is methylated in heterochromatin, the compacted, repressed state of chromatin. In contrast, lysine 4 (K4) of histone H3 typically is methylated in chromatin where gene expression is active. Different proteins are recruited to these two methylated forms of histone H3. Methylated K9 recruits heterochromatin protein 1 (HP1), which binds via its chromodomain. On the other hand, methylated K4 binds CHD1, a chromatin remodeling protein with two chromodomains. (CHD1 is an acronym for chromodomain, helicase, DNA-binding.) Ubiquitination of Lys120 in the C-terminal tail of H2B favors methylation (and thus transcription activation), whereas ubiquitination of Lys119 in the C-terminal tail of H2A favors repression. Sumoylation of Lys residues tends to repress transcription; apparently, sumoylation antagonizes acetylation.
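The mark-to-outcome relationships in the preceding paragraph can be collected into a small lookup table. The sketch below is only an illustration: the names `HISTONE_MARKS` and `interpret_mark` are hypothetical, and a reader of `None` simply means the text names no specific binding protein for that mark.

```python
# Illustrative lookup table for the histone marks named in the text.
# Keys are (histone, residue, modification); "reader" is the recruited
# protein when the text names one, otherwise None.
HISTONE_MARKS = {
    ("H3", "K9", "methylation"): {"reader": "HP1", "outcome": "repression"},
    ("H3", "K4", "methylation"): {"reader": "CHD1", "outcome": "activation"},
    ("H2B", "K120", "ubiquitination"): {"reader": None, "outcome": "activation"},
    ("H2A", "K119", "ubiquitination"): {"reader": None, "outcome": "repression"},
}


def interpret_mark(histone, residue, modification):
    """Return the table entry for a mark, or None if the mark is not listed."""
    return HISTONE_MARKS.get((histone, residue, modification))


if __name__ == "__main__":
    print(interpret_mark("H3", "K9", "methylation"))
    print(interpret_mark("H2A", "K119", "ubiquitination"))
```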

Methylation and Phosphorylation Act as a Binary Switch in the Histone Code

As cells enter mitosis, the chromatin becomes condensed and histone H3 is not only methylated at K9 but also phosphorylated at the adjacent serine residue, S10. S10 phosphorylation triggers the dissociation of HP1 from the heterochromatin. Thus, phosphorylation of the residue neighboring K9 trumps HP1 binding. Similarly, phosphorylation of the threonine residue (T3) neighboring K4 in the histone H3 tail evicts CHD1 from its site on the methylated K4. Apparently, lysine methylation is the “on” position for the binary switch that recruits specific proteins to histone tails, and phosphorylation at a neighboring residue turns the switch to the “off” position by ejecting the bound proteins. There are at least 16 instances of serines or threonines immediately flanking lysine residues in the four histones that constitute the histone core octamer of the nucleosome. The methylation-phosphorylation binary switch may be a general phenomenon in the regulation of chromatin dynamics.
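The switch logic just described reduces to a simple Boolean rule: a reader protein stays bound only while the lysine is methylated and the neighboring residue is not phosphorylated. A minimal sketch, assuming the two site/neighbor/reader triples given in the text; the names `SWITCHES` and `bound_reader` are hypothetical.

```python
# Methyl-lysine site -> (neighboring phosphorylatable residue, reader protein),
# taken from the two examples in the text.
SWITCHES = {
    "H3K9": ("S10", "HP1"),
    "H3K4": ("T3", "CHD1"),
}


def bound_reader(site, lys_methylated, neighbor_phosphorylated):
    """Return the reader bound at `site`, or None if the switch is off."""
    _neighbor, reader = SWITCHES[site]
    if lys_methylated and not neighbor_phosphorylated:
        return reader  # "on": methylation recruits the reader
    return None  # "off": no methyl mark, or phosphorylation ejects the reader


if __name__ == "__main__":
    print(bound_reader("H3K9", True, False))  # interphase heterochromatin
    print(bound_reader("H3K9", True, True))   # mitosis: S10 is phosphorylated
```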

Chromatin Deacetylation Leads to Transcription Repression

Deacetylation of histones is a biologically relevant matter, and enzyme complexes that carry out such reactions have been characterized. Known as histone deacetylase complexes, or HDACs, they catalyze the removal of acetyl groups from lysine residues along the histone tails, restoring the chromatin to a repressed state. Beyond these effects on transcription, histone modifications determine whether significant cellular events involving DNA allocation through mitosis and meiosis may occur.

Nucleosome Alteration and Interaction of RNA Polymerase II with the Promoter Are Essential Features in Eukaryotic Gene Activation

Gene activation (the initiation of transcription) can thus be viewed as a process requiring two principal steps: (1) alterations in nucleosomes (and thus, chromatin) that relieve the general repressed state imposed by chromatin structure, followed by (2) the interaction of RNA polymerase II and the GTFs with the promoter. Transcription activators (proteins that bind to enhancers and response elements) initiate the process by recruiting chromatin-altering proteins (the chromatin-remodeling complexes and histone-modifying enzymes described previously). Once these alterations have occurred, promoter DNA is accessible to TBP:TFIID, the other GTFs, and RNA polymerase II. Transcription activation, however, requires communication between RNA polymerase II and the transcription activator for transcription to take place. Mediator (or Srb/Med) fulfills this function. Mediator interacts with both the transcription activator and the CTD of RNA polymerase II. This Mediator bridge provides an essential interface for communication between enhancers and promoters, triggering RNA polymerase II to begin transcription. A general model for transcription initiation is shown in Figure 29.31. Once transcription begins, Mediator is replaced by another complex called Elongator. Elongator has HAT subunits whose activity remodels downstream nucleosomes as RNA polymerase II progresses along the chromatin-associated DNA.

The interactions described thus far emphasize regulation of gene expression at the level of RNA polymerase II recruitment to promoters. However, whole-genome analyses show that, for many genes, RNA polymerase II is already situated at promoters and appears to be paused there, awaiting signals that will activate the elongation phase of transcription. Thus, the expression of many genes may be regulated at the level of transcription elongation.

Beyond these considerations, various mechanisms regulate gene expression through events that take place subsequent to transcription. Post-transcriptional gene regulation mediated by microRNAs, such as RNAi (see Chapter 12) and gene silencing (see Chapter 10), as well as alternative RNA splicing and nucleotide changes introduced through RNA editing (as described in this chapter), are mechanisms targeting transcripts. Post-translational modifications of proteins also play a major role in the regulation of gene expression, as assessed at the level of biological activity (see Chapter 30).

A SINE of the Times

An interesting twist on transcription regulation comes from the discovery that certain noncoding RNAs (ncRNAs) act as transcription factors through direct binding to RNA polymerase II. For example, ncRNA B2 in mouse and Alu RNA in humans are RNAs encoded within short interspersed elements (SINEs). SINEs are abundant within animal DNA and were once considered “junk” DNA because they lack protein-coding properties. Alu RNA or ncRNA B2 blocks promoter-bound RNA polymerase II by interfering with transcription initiation.

Specific DNA Sequences?

Proteins that recognize nucleic acids do so by the basic rule of macromolecular recognition. That is, such proteins present a three-dimensional shape or contour that is structurally and chemically complementary to the surface of a DNA sequence. When the two molecules come into contact, the numerous atomic interactions that underlie recognition and binding can take place. Nucleotide sequence–specific recognition by the protein involves a set of atomic contacts with the bases and the sugar–phosphate backbone. Hydrogen bonding is critical for recognition, with amino acid side chains providing most of the critical contacts with DNA. Protein contacts with the bases of DNA usually occur within the major groove (but not always). Protein contacts with the DNA backbone involve both H bonds and salt bridges with electronegative oxygen atoms of the phosphodiester linkages. Structural studies on regulatory proteins that bind to specific DNA sequences have revealed that roughly 80% of such proteins can be assigned to one of three principal classes based on their possession of one of three kinds of small, distinctive structural motifs: the helix-turn-helix (or HTH), the zinc finger (or Zn-finger), and the leucine zipper-basic region (or bZIP). The latter two motifs are found only in DNA-binding proteins from eukaryotic organisms.

FIGURE 29.31 A model for the transcriptional regulation of eukaryotic genes. The DNA is a green ribbon wrapped around disclike nucleosomes. A specific transcription factor (TF, pink) is bound to a regulatory element (either an enhancer or silencer). RNA polymerase II and its associated GTFs (blue) are bound at the promoter. The N-terminal tails of histones are shown as wavy lines (blue) emanating from the nucleosome discs. A specific transcription factor that is a transcription activator stimulates transcription through interaction with a co-activator whose HAT activity renders the DNA more accessible and through interactions with the Mediator complex associated with RNA polymerase II. A specific transcription factor that is a repressor interacts with a co-repressor that has HDAC activity that deacetylates histones, restructuring the nucleosomes into a repressed state. (From Figure 1 in Kornberg, R. D., 1999. Eukaryotic transcription control. Trends in Biochemical Sciences 24:M46–M49.)

In addition to their DNA-binding domains, these proteins commonly possess other structural domains that function in protein:protein recognitions essential to oligomerization (for example, dimer formation), DNA looping, transcriptional activation, and signal reception (for example, effector binding).

α-Helices Fit Snugly into the Major Groove of B-DNA

A recurring structural feature in DNA-binding proteins is the presence of α-helical segments that fit directly into the major groove of B-form DNA. The diameter of an α-helix (including its side chains) is about 1.2 nm. The dimensions of the major groove in B-DNA are 1.2 nm wide by 0.6 to 0.8 nm deep. Thus, one side of an α-helix can fit snugly into the major groove. Although examples of β-sheet DNA recognition elements in proteins are known, the α-helix and B-form DNA are the predominant structures involved in protein:DNA interactions. Significantly, proteins can recognize specific sites in “normal” B-DNA; the DNA need not assume any unusual, alternative conformation (such as Z-DNA).

Proteins with the Helix-Turn-Helix Motif Use One Helix to Recognize DNA

The HTH motif is a protein structural domain consisting of two successive α-helices separated by a sharp β-turn (Figure 29.32). Within this domain, the α-helix situated more toward the C-terminal end of the protein, the so-called helix 3, is the DNA recognition helix; it fits nicely into the major groove, with several of its side chains touching DNA base pairs. Helix 2, the helix at the beginning of the HTH motif, creates a stable structural domain through hydrophobic interactions with helix 3 that locks helix 3 into its DNA interface. Proteins with HTH motifs bind to DNA as dimers. In the dimer, the two helix 3 cylinders are antiparallel to each other, such that their N→C orientations match the inverted relationship of nucleotide sequence in the dyad-symmetric DNA-binding site. An example is Antp. Antp is a member of a family of eukaryotic proteins involved in the regulation of early embryonic development that have in common an amino acid sequence element known as the homeobox⁶ domain. The homeobox is a DNA motif that encodes a related 60–amino acid sequence (the homeobox domain) found among proteins of virtually every eukaryote, from yeast to man. Embedded within the homeobox domain is an HTH motif. Homeobox domain proteins act as sequence-specific transcription factors. Typically, the homeobox portion comprises only 10% or so of the protein’s mass, with the remainder of the protein serving in protein:protein interactions essential to transcription regulation. Other DNA-binding proteins with HTH motifs are lac repressor, trp repressor, and the C-terminal domain of CAP.

FIGURE 29.32 An HTH motif protein: Antp monomer bound to DNA. Helix 3 (yellow) is locked into the major groove of the DNA by helix 2 (magenta) (pdb id = 9ANT).

⁶Homeo derives from homeotic genes, a set of genes originally discovered in the fruit fly Drosophila melanogaster through their involvement in the specification of body parts during development.

HUMAN BIOCHEMISTRY

Storage of Long-Term Memory Depends on Gene Expression Activated by CREB-Type Transcription Factors

Learning can be defined as the process whereby new information is acquired and memory as the process by which this information is retained. Short-term memory (which lasts minutes or hours) requires only the covalent modification of preexisting proteins, but long-term memory (which lasts days, weeks, or a lifetime) depends on gene expression, protein synthesis, and the establishment of new neuronal connections.

The macromolecular synthesis underlying long-term memory storage requires cAMP-response element-binding (CREB) protein–related transcription factors and the activation of cAMP-dependent gene expression. Serotonin (5-hydroxytryptamine, or 5-HT, a hormone implicated in learning and memory) acting on neurons promotes cAMP synthesis, which in turn stimulates protein kinase A to phosphorylate CREB protein–related transcription factors that activate transcription of cAMP-inducible genes. These genes are characterized by the presence of CRE (cAMP response element) consensus sequences containing the 8-bp TGACGTCA palindrome. CREB transcription factors are bZIP-type proteins (see later discussion). These exciting findings opened a new arena in molecular biology, the molecular biology of cognition. Eric Kandel was awarded the 2000 Nobel Prize in Physiology or Medicine for, among other things, his discovery of the role of CREB-type transcription factors in long-term memory storage.

Cognition is the act or process of knowing; the acquisition of knowledge.

How Does the Recognition Helix Recognize Its Specific DNA-Binding Site? The edges of base pairs in dsDNA present a pattern of hydrogen-bond donor and acceptor groups within the major and minor grooves, but only the pattern displayed on the major-groove side is distinctive for each of the four base pairs A:T, T:A, C:G, and G:C. (You can get an idea of this by inspecting the structures of the base pairs in Figure 11.6.) Thus, the base-pair edges in the major groove act as a recognition matrix identifiable through H bonding with a specific protein, so it is not necessary to melt the base pairs to read the base sequence. Although formation of such H bonds is very important in DNA:protein recognition, other interactions also play a significant role. For example, the C-5-methyl groups unique to thymine residues are nonpolar “knobs” projecting into the major groove.
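The claim that only the major-groove pattern distinguishes all four base pairs can be checked with a small table. The patterns below are not given in this section; they are the standard textbook description of the groove edges of Watson-Crick pairs, encoded here as an assumption (A = H-bond acceptor, D = H-bond donor, H = nonpolar hydrogen, M = methyl group). The dictionary and function names are hypothetical.

```python
# Functional groups presented across each groove, read from the first-named
# base to its partner. A = acceptor, D = donor, H = nonpolar H, M = methyl.
MAJOR_GROOVE = {"A:T": "ADAM", "T:A": "MADA", "G:C": "AADH", "C:G": "HDAA"}
MINOR_GROOVE = {"A:T": "AHA", "T:A": "AHA", "G:C": "ADA", "C:G": "ADA"}


def distinct_patterns(groove):
    """Count how many different patterns the four base pairs present."""
    return len(set(groove.values()))


if __name__ == "__main__":
    # Four distinct patterns: every base pair is uniquely identifiable.
    print("major groove:", distinct_patterns(MAJOR_GROOVE))
    # Only two: A:T cannot be told from T:A, nor G:C from C:G.
    print("minor groove:", distinct_patterns(MINOR_GROOVE))
```

Note that the "M" entry in the A:T and T:A patterns is the thymine C-5 methyl "knob" mentioned above.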

Proteins Also Recognize DNA via “Indirect Readout.” Indirect readout is the term for the ability of a protein to indirectly recognize a particular nucleotide sequence by recognizing local conformational variations resulting from the effects that base sequence has on DNA structure. Superficially, the B-form structure of DNA appears to be a uniform cylinder. Nevertheless, the conformation of DNA over a short distance along its circumference varies subtly according to local base sequence. That is, base sequences generate unique contours that proteins can recognize. Because these contours arise from the base sequence, the DNA-binding protein “indirectly reads out” the base sequence through interactions with the DNA backbone. In the E. coli Trp repressor:trp operator DNA complex, the Trp repressor engages in 30 specific hydrogen bonds to the DNA: 28 involve phosphate groups in the backbone; only 2 are to bases. Thus, some sequence-specific DNA-binding proteins are able to recognize an overall DNA conformation caused by the specific DNA sequence.

Some Proteins Bind to DNA via Zn-Finger Motifs

There are many classes of Zn-finger motifs. The prototype Zn-finger is a structural feature formed by a pair of Cys residues separated by 2 residues, then a run of 12 amino acids, and finally a pair of His residues separated by 3 residues (Cys-x2-Cys-x12-His-x3-His). This motif may be repeated as many as 13 times over the primary structure of a Zn-finger protein. Each repeat coordinates a zinc ion via its 2 Cys and 2 His residues (Figure 29.33). The 12 or so residues separating the Cys and His coordination sites are looped out and form a distinct DNA interaction module, the so-called Zn-finger. When Zn-finger proteins associate with DNA, each Zn-finger binds in the major groove and interacts with about five nucleotides, adjacent fingers interacting with contiguous stretches of DNA. Many DNA-binding proteins with this motif have been identified. In all cases, the finger motif is repeated at least two times, with at least a 7– to 8–amino acid linker between Cys/Cys and His/His sites. Proteins with this general pattern are assigned to the C2H2 class of Zn-finger proteins to distinguish them from proteins bearing another kind of Zn-finger, the Cx type, which includes the C4 and C6 Zn-finger proteins. The Cx proteins have a variable number of Cys residues available for Zn chelation. For example, the vertebrate steroid receptors have two sets of Cys residues, one with four conserved cysteines (C4) and the other with five (C5).

FIGURE 29.33 The Zn-finger motif of the C2H2 type showing (a) the coordination of Cys and His residues to Zn and (b) the secondary structure. (c) Structure of a classic C2H2 zinc finger protein (zif268) with three zinc fingers bound to DNA (pdb id = 1ZAA).
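The prototype spacing Cys-x2-Cys-x12-His-x3-His translates directly into a regular expression over one-letter amino acid codes. The sketch below assumes exactly that prototype spacing and nothing more; real C2H2 fingers vary in spacing, so this is not a general-purpose detector. The function name and the test sequence are hypothetical, and the sequence is synthetic rather than taken from any real protein.

```python
import re

# Cys, 2 residues, Cys, 12 residues, His, 3 residues, His.
C2H2_PROTOTYPE = re.compile(r"C.{2}C.{12}H.{3}H")


def find_prototype_fingers(protein):
    """Return (start_index, matched_segment) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in C2H2_PROTOTYPE.finditer(protein)]


if __name__ == "__main__":
    # Synthetic example: one finger flanked by two glycines on each side.
    finger = "C" + "AS" + "C" + "KRFSQSSDLTRG" + "H" + "IRT" + "H"
    sequence = "GG" + finger + "GG"
    for start, segment in find_prototype_fingers(sequence):
        print(start, segment)
```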

Some DNA-Binding Proteins Use a Basic Region-Leucine Zipper (bZIP) Motif

bZIP is a structural motif characterizing the third major class of sequence-specific, DNA-binding proteins. This motif was first recognized by Steve McKnight in C/EBP, a heat-stable, DNA-binding protein isolated from rat liver nuclei that binds to both CCAAT promoter elements and certain enhancer core elements.⁷ The DNA-binding domain of C/EBP was localized to the C-terminal region of the protein. This region shows a notable absence of Pro residues, suggesting it might be arrayed in an α-helix. Within this region are two clusters of basic residues: A and B. Further along is a 28-residue sequence. When this latter region is displayed end-to-end down the axis of a hypothetical α-helix, beginning at Leu315, an amphipathic cylinder is generated, similar to the one shown in Figure 6.22. One side of this amphipathic helix consists principally of hydrophobic residues (particularly leucines), whereas the other side has an array of negatively and positively charged side chains (Asp, Glu, Arg, and Lys), as well as many uncharged polar side chains (glutamines, threonines, and serines).

The Zipper Motif of bZIP Proteins Operates Through Intersubunit Interaction of Leucine Side Chains

The leucine zipper motif arises from the periodic repetition of leucine residues within this helical region. The periodicity causes the Leu side chains to protrude from the same side of the helical cylinder, where they can enter into hydrophobic interactions with a similar set of Leu side chains extending from a matching helix in a second polypeptide. These hydrophobic interactions establish a stable noncovalent linkage, fostering dimerization of the two polypeptides (as shown in Figure 29.34). The leucine zipper is not a DNA-binding domain. Instead, it functions in protein dimerization. Leucine zippers have been found in other mammalian transcriptional regulatory proteins, including Myc, Fos, and Jun.
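Why periodically spaced leucines end up on one face of the helix can be seen with a helical-wheel calculation. The sketch below rests on two assumptions not stated in this section: an ideal α-helix has 3.6 residues per turn (100° of rotation per residue), and the zipper leucines recur at every seventh position, so a 28-residue segment holds four of them. The function name is hypothetical.

```python
DEGREES_PER_RESIDUE = 100  # 360 degrees / 3.6 residues per turn (ideal alpha-helix)


def wheel_angle(position):
    """Angular position (0-359 degrees) of a residue viewed down the helix axis."""
    return (position * DEGREES_PER_RESIDUE) % 360


if __name__ == "__main__":
    # Leucines at every seventh residue of a 28-residue segment.
    leucine_positions = [0, 7, 14, 21]
    angles = [wheel_angle(p) for p in leucine_positions]
    # Each step of 7 residues is 700 degrees, i.e. just 20 degrees short of
    # two full turns, so all four leucines fall within a 60-degree arc.
    print(angles)
```

In a real coiled coil the two helices supercoil around each other, which keeps the leucine stripe aligned along the dimer interface; the ideal-helix arithmetic above only shows that the heptad spacing places the side chains on nearly the same face.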

The Basic Region of bZIP Proteins Provides the DNA-Binding Motif

The actual DNA contact surface of bZIP proteins is contributed by a 16-residue segment that ends exactly 7 residues before the first Leu residue of the Leu zipper. This DNA contact region is rich in basic residues and hence is referred to as the basic region. Two bZIP polypeptides join via a Leu zipper to form a Y-shaped molecule in which the stem of the Y corresponds to a coiled pair of α-helices held by the leucine zipper. The arms of the Y are the respective basic regions of each polypeptide; they act as a linked set of DNA contact surfaces (Figure 29.34). The dimer interacts with a DNA target site by situating the fork of the Y at the center of the dyad-symmetric DNA sequence. The two arms of the Y can then track along the major groove of the DNA in opposite directions, reading the specific recognition sequence (Figure 29.35). An interesting aspect of bZIP proteins is that the two polypeptides need not be identical (Figure 29.35). Heterodimers can form, provided both polypeptides possess a leucine zipper region. An important consequence of heterodimer formation is that the DNA target site need not be a palindromic sequence. The respective basic regions of the two different bZIP polypeptides (for example, Fos and Jun) can track along the major groove reading two different base sequences. Heterodimer formation expands enormously the DNA recognition and regulatory possibilities of this set of proteins.

Chelation is from the Greek word chele, meaning “claw”; it refers to the binding of a metal ion to two or more nonmetallic atoms in the same molecule.

FIGURE 29.34 Model for a dimeric bZIP protein. Two bZIP polypeptides dimerize to form a Y-shaped molecule. The stem of the Y is the Leu zipper, and it holds the two polypeptides together. Each arm of the Y is the basic region from one polypeptide. Each arm is composed of two α-helical segments: BR-A and BR-B (basic regions A and B).

⁷The acronym C/EBP designates this protein as a “CCAAT and enhancer-binding protein.”
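Whether a target site is palindromic (dyad-symmetric) can be tested by comparing a sequence with its reverse complement. The sketch below applies that test to the two sites quoted in this chapter, the CRE sequence TGACGTCA and the AP-1 sequence TGACTCA; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def is_palindromic(dna):
    """True if the site reads the same 5' to 3' on both strands."""
    return dna == reverse_complement(dna)


if __name__ == "__main__":
    for site in ("TGACGTCA", "TGACTCA"):
        print(site, reverse_complement(site), is_palindromic(site))
```

The 8-bp CRE site is a perfect palindrome, whereas the 7-bp AP-1 site is symmetric only outside its central base pair, so its two half-sites are read around an odd central position.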

and Delivered to the Ribosomes for Translation?

Transcription and translation are concomitant processes in prokaryotes, but in eukaryotes, the two processes are spatially separated (see Chapter 10). Transcription occurs on DNA in the nucleus, and translation occurs on ribosomes in the cytoplasm. Consequently, transcripts must be transported from the nucleus to the cytosol to be translated. On the way, these transcripts undergo processing: alterations that convert the newly synthesized RNAs, or primary transcripts, into mature messenger RNAs. Also, unlike prokaryotes, in which many mRNAs encode more than one polypeptide (that is, they are polycistronic), eukaryotic mRNAs encode only one polypeptide (that is, they are exclusively monocistronic).

Eukaryotic Genes Are Split Genes

Most genes in higher eukaryotes are split into coding regions, called exons,⁸ and noncoding regions, called introns (Figure 29.36; see also Figure 10.20). Introns are the intervening nucleotide sequences that are removed from the primary transcript when it is processed into a mature RNA. Gene expression in eukaryotes entails not only transcription but also the processing of primary transcripts to yield the mature RNA molecules we classify as mRNAs, tRNAs, rRNAs, and so forth.

The Organization of Exons and Introns in Split Genes Is Both Diverse and Conserved

Split genes occur in an incredible variety of interruptions and sizes. The yeast actin gene is a simple example, having only a single 309-bp intron that separates the nucleotides encoding the first 3 amino acids from those encoding the remaining 350 or so amino acids in the protein. The chicken ovalbumin gene is composed of 8 exons and 7 introns. The two vitellogenin genes of the African clawed toad Xenopus laevis are both spread over more than 21 kbp of DNA; their primary transcripts consist of just 6 kb of message that is punctuated by 33 introns. The chicken pro α-2 collagen gene has a length of about 40 kbp; the coding regions constitute only 5 kb distributed over 51 exons within the primary transcript. The exons are quite small, ranging from 45 to 249 bases in size.

FIGURE 29.35 Model for the heterodimeric bZIP transcription factor c-Fos:c-Jun bound to a DNA oligomer containing the AP-1 consensus target sequence TGACTCA (pdb id = 1FOS).

FIGURE 29.36 The organization of split eukaryotic genes. [Diagram: a gene with promoter/enhancer sequences and four exons separated by introns on the DNA coding strand; the mRNA transcript, with a 5′-untranslated region and a 3′-untranslated region of variable length (since transcription termination is imprecise); and, after processing (capping, methylation, poly(A) addition, splicing), the mature mRNA with a 7-mG cap, exons 1 to 4, and a poly(A) tail of 100–200 residues.]

⁸Although the term exon is commonly used to refer to the protein-coding regions of an interrupted or split gene, a more precise definition would specify exons as sequences that are represented in mature RNA molecules. This definition encompasses not only protein-coding genes but also the genes for various RNAs (such as tRNAs or rRNAs) from which intervening sequences must be excised in order to generate the mature gene product.

Clearly, the mechanism by which introns are removed and multiple exons are spliced together to generate a continuous, translatable mRNA must be both precise and complex. If one base too many or too few is excised during splicing, the coding sequence in the mRNA will be disrupted. The mammalian DHFR (dihydrofolate reductase) gene is split into 6 exons spread over more than 31 kbp of DNA. The 6 exons are spliced together to give a 6-kb mRNA (Figure 29.37). Note that, in three different mammalian species, the size and position of the exons are essentially the same but that the lengths of the corresponding introns vary considerably. Indeed, the lengths of introns in vertebrate genes range from a minimum of about 60 bases to more than 10,000 bp. Many introns have nonsense codons in all three reading frames and thus are untranslatable. Introns are found in the genes of mitochondria and chloroplasts as well as in nuclear genes. Although introns have been observed in archaea and even bacteriophage T4, none are known in the genomes of bacteria.
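The need for single-base precision in splicing, noted at the start of the paragraph above, can be illustrated with a toy transcript. In the sketch below the sequences are invented for illustration (the intron merely begins with GU and ends with AG), and the function names are hypothetical; removing one extra base shifts every downstream codon.

```python
def splice(pre_mrna, intron_start, intron_end):
    """Remove pre_mrna[intron_start:intron_end] and join the flanking exons."""
    return pre_mrna[:intron_start] + pre_mrna[intron_end:]


def codons(mrna):
    """Split an mRNA into complete triplets, discarding any leftover bases."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


if __name__ == "__main__":
    exon1, intron, exon2 = "AUGGCU", "GUAAGUCAG", "GAAUUCUAA"  # toy sequences
    pre_mrna = exon1 + intron + exon2
    start = len(exon1)
    end = start + len(intron)

    precise = splice(pre_mrna, start, end)      # exact intron boundaries
    sloppy = splice(pre_mrna, start, end + 1)   # one base too many removed

    print(codons(precise))  # reading frame of exon 2 preserved
    print(codons(sloppy))   # every codon after the junction is changed
```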

Post-Transcriptional Processing of Messenger RNA Precursors Involves Capping, Methylation, Polyadenylylation, and Splicing

Capping and Methylation of Eukaryotic mRNAs. The protein-coding genes of eukaryotes are transcribed by RNA polymerase II to form primary transcripts or pre-mRNAs that serve as precursors to mRNA. As a population, these RNA molecules are very large and their nucleotide sequences are very heterogeneous because they represent the transcripts of many different genes, hence the designation heterogeneous nuclear RNA, or hnRNA. Shortly after transcription of hnRNA is initiated, the 5′-end of the growing transcript is capped by addition of a guanylyl residue. This reaction is catalyzed by the nuclear enzyme guanylyl transferase using GTP as substrate (Figure 29.38). The cap structure is methylated at the 7-position of the G residue. Additional methylations may occur at the 2′-O positions of the two nucleosides following the 7-methyl-G cap and at the 6-amino group of a first base adenine (Figure 29.39).

3′-Polyadenylylation of Eukaryotic mRNAs. Transcription by RNA polymerase II typically continues past the 3′-end of the mature messenger RNA. Primary transcripts show heterogeneity in sequence at their 3′-ends, indicating that the precise point where termination occurs is nonspecific. However, termination does not normally occur until RNA polymerase II has transcribed past a consensus AAUAAA sequence known as the polyadenylylation signal.

FIGURE 29.37 The organization of the mammalian DHFR gene in three representative species (Chinese hamster, mouse, and human). Note that the exons are much shorter than the introns. Note also that the exon pattern is more highly conserved than the intron pattern.

Most eukaryotic mRNAs have 100 to 200 adenine residues attached at their 3′-end, the poly(A) tail. [Histone mRNAs are the only common mRNAs that lack poly(A) tails.] These A residues are not encoded in the DNA but are added post-transcriptionally by the enzyme poly(A) polymerase, using ATP as a substrate. The consensus AAUAAA is not itself the poly(A) addition site; instead it defines the position where poly(A) addition occurs (Figure 29.40). The consensus AAUAAA is found 10 to 35 nucleotides upstream from where the nascent primary transcript is cleaved by an endonuclease to generate a new 3′-OH end. This end is where the poly(A) tail is added. The processing events of mRNA capping, poly(A) addition, and splicing of the primary transcript create the mature mRNA. Interestingly, both the guanylyl transferase that adds the 5′-cap structure and the enzymes that process the 3′-end of the transcript and add the poly(A) tail are anchored to RNA polymerase II via interactions with its RPB1 CTD.
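The relationship between the AAUAAA signal and the cleavage site can be expressed as a small search. The sketch below locates the first AAUAAA in a transcript and reports the window 10 to 35 nucleotides downstream in which cleavage would be expected. Measuring the distance from the 3′ end of the signal is an assumption made here, since the text does not say which end is the reference point; the function name and the test transcript are hypothetical.

```python
POLY_A_SIGNAL = "AAUAAA"


def expected_cleavage_window(transcript, min_gap=10, max_gap=35):
    """Return (first, last) 0-based indices bounding the expected cleavage
    region, measured from the end of the first AAUAAA, or None if the signal
    is absent. The window is clipped to the transcript length."""
    i = transcript.find(POLY_A_SIGNAL)
    if i < 0:
        return None
    signal_end = i + len(POLY_A_SIGNAL)
    first = signal_end + min_gap
    last = min(signal_end + max_gap, len(transcript))
    return (first, last)


if __name__ == "__main__":
    # Synthetic transcript: 20 nt, the signal, then 50 nt of downstream RNA.
    rna = "G" * 20 + POLY_A_SIGNAL + "C" * 50
    print(expected_cleavage_window(rna))
```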

FIGURE 29.38 The capping of eukaryotic pre-mRNAs. Guanylyl transferase catalyzes the addition of a guanylyl residue (Gp) derived from GTP to the 5′-end of the growing transcript, which has a 5′-triphosphate group already there. In the process, pyrophosphate (pp) is liberated from GTP and the terminal phosphate (p) is removed from the transcript: Gppp + pppApNpNpNp… ⟶ GpppApNpNpNp… + pp + p (A is often the initial nucleotide in the primary transcript).

FIGURE 29.39 Methylation of several specific sites located at the 5′-end of eukaryotic pre-mRNAs is an essential step in mRNA maturation. A cap bearing only a single -CH3 on the guanyl is termed cap 0. This methylation occurs in all eukaryotic mRNAs. If a methyl is also added to the 2′-O position of the first nucleotide after the cap, a cap 1 structure is generated. This is the predominant cap form in all multicellular eukaryotes. Some species add a third -CH3 to the 2′-O position of the second nucleotide after the cap, giving a cap 2 structure. Also, if the first base after the cap is an adenine, it may be methylated on its 6-NH2. In addition, approximately 0.1% of the adenine bases throughout the mRNA of higher eukaryotes carry methylation on their 6-NH2 groups.

Nuclear Pre-mRNA Splicing

Within the nucleus, hnRNA forms ribonucleoprotein particles (RNPs) through association with a characteristic set of nuclear proteins. These proteins interact with the nascent RNA chain as it is synthesized, maintaining the hnRNA in an untangled, accessible conformation. The substrate for splicing, that is, intron excision and exon ligation, is the capped primary transcript emerging from the RNA polymerase II transcriptional apparatus, in the form of an RNP complex. Splicing occurs exclusively in the nucleus. The mature mRNA that results is then exported to the cytoplasm to be translated. Splicing requires precise cleavage at the 5′- and 3′-ends of introns and the accurate joining of the two ends. Consensus sequences define the exon/intron junctions in eukaryotic mRNA precursors, as indicated from an analysis of the splice sites in vertebrate genes (Figure 29.41). Note that the sequences GU and AG are found at the 5′- and 3′-ends, respectively, of introns in pre-mRNAs from higher eukaryotes. In addition to the splice junctions, a conserved sequence within the intron, the branch site, is also essential to pre-mRNA splicing. The site lies 18 to 40 nucleotides upstream from the 3′-splice site and is represented in higher eukaryotes by the consensus sequence YNYRAY, where Y is any pyrimidine, R is any purine, and N is any nucleotide.
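The sequence rules just described (GU at the 5′-end of the intron, AG at its 3′-end, and a YNYRAY branch site 18 to 40 nucleotides upstream from the 3′-splice site) translate directly into a pattern search. The following is a minimal sketch under stated assumptions: the function name `candidate_branch_sites` is hypothetical, the distance is counted from the branch A to the boundary just after the terminal AG, and a match indicates only a candidate site, since real branch-site selection also depends on the splicing machinery.

```python
import re

# Expansion of the consensus YNYRAY from the text:
# Y = pyrimidine (C or U), R = purine (A or G), N = any nucleotide.
# The lookahead allows overlapping candidates to be found.
BRANCH_SITE = re.compile(r"(?=([CU][ACGU][CU][AG]A[CU]))")


def candidate_branch_sites(intron: str, min_dist: int = 18, max_dist: int = 40):
    """List (position_of_branch_A, motif) pairs for a single intron.

    The intron must start with GU and end with AG. Positions are 0-based.
    Distance is measured from the branch A to the 3'-splice site, taken
    here as the boundary just after the terminal AG (an assumption).
    """
    intron = intron.upper().replace("T", "U")
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron does not follow the GU...AG rule")
    hits = []
    for match in BRANCH_SITE.finditer(intron):
        branch_a = match.start() + 4  # the A is the fifth base of YNYRAY
        distance = len(intron) - branch_a
        if min_dist <= distance <= max_dist:
            hits.append((branch_a, match.group(1)))
    return hits
```

Calling the function on a toy intron such as `"GU" + "G" * 10 + "CUCAAC" + "G" * 20 + "AG"` returns a single candidate whose branch A lies 24 nucleotides from the 3′-splice site.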

The Splicing Reaction Proceeds via Formation of a Lariat Intermediate

The mechanism for splicing nuclear mRNA precursors is shown in Figure 29.42. A covalently closed loop of RNA, the lariat, is formed by attachment of the 5′-phosphate group of the intron’s invariant 5′-G to the 2′-OH at the invariant branch site A to form a 2′-5′ phosphodiester bond. Note that lariat formation creates an unusual branched nucleic acid. The lariat structure is excised when the 3′-OH of the consensus G at the 3′-end of the 5′ exon (Exon 1, Figure 29.42) covalently joins with the 5′-phosphate at the 5′-end of the 3′ exon (Exon 2). The reactions that occur are transesterification reactions, in which an OH group reacts with a phosphoester bond, displacing an -OH to form a new phosphoester link. Because the reactions lead to no net change in the number of phosphodiester linkages, no energy input is required.
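The statement that splicing leaves the number of phosphodiester linkages unchanged can be checked by simple bookkeeping. The sketch below is an illustrative accounting model only, with hypothetical names (`Species`, `splice`, `total_bonds`); it assumes each RNA species can be described by its nucleotide count and by whether it carries a 2′-5′ branch, and it models no chemistry beyond counting bonds.

```python
from dataclasses import dataclass


@dataclass
class Species:
    """An RNA species described only by its size and branch count."""
    name: str
    nucleotides: int
    branches: int = 0  # number of 2'-5' phosphodiester bonds

    @property
    def phosphodiester_bonds(self) -> int:
        # A linear chain of n nucleotides has n - 1 backbone (3'-5') bonds;
        # each lariat branch contributes one additional 2'-5' bond.
        return self.nucleotides - 1 + self.branches


def splice(exon1: int, intron: int, exon2: int):
    """Return the species present before, between, and after the two steps."""
    start = [Species("pre-mRNA", exon1 + intron + exon2)]
    # Step 1: the 2'-OH of the branch-site A attacks the 5'-splice site,
    # releasing exon 1 and forming the lariat intermediate.
    step1 = [Species("exon 1", exon1),
             Species("lariat intermediate", intron + exon2, branches=1)]
    # Step 2: the 3'-OH of exon 1 attacks the 3'-splice site,
    # joining the exons and releasing the intron as a lariat.
    step2 = [Species("spliced mRNA", exon1 + exon2),
             Species("excised lariat", intron, branches=1)]
    return start, step1, step2


def total_bonds(mixture) -> int:
    """Sum the phosphodiester bonds over all species in a mixture."""
    return sum(s.phosphodiester_bonds for s in mixture)
```

For exons of 120 and 80 nucleotides flanking a 900-nucleotide intron, `total_bonds` gives the same value at every stage, because each transesterification breaks one phosphodiester bond and forms one.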

FIGURE 29.40 Poly(A) addition to the 3′-ends of transcripts occurs 10 to 35 nucleotides downstream from a consensus AAUAAA sequence, defined as the polyadenylylation signal. CPSF (cleavage and polyadenylylation specificity factor) binds to this signal sequence and mediates looping of the 3′-end of the transcript through interactions with a G/U-rich sequence even further downstream. Cleavage factors (CFs) then bind and bring about the endonucleolytic cleavage of the transcript to create a new 3′-end 10 to 35 nucleotides downstream from the polyadenylylation signal. Poly(A) polymerase (PAP) then successively adds 200 to 250 adenylyl residues to the new 3′-end. (RNA polymerase II is also a significant part of the polyadenylylation complex at the 3′-end of the transcript, but for simplicity in illustration, its presence is not shown in the lower part of the figure.)

FIGURE 29.41 Consensus sequences at the splice sites in vertebrate genes (Py = pyrimidine; the colon marks the exon/intron boundary).

5′-splice site consensus (exon : intron): AG : GUAAGU

3′-splice site consensus (intron : exon): Py Py Py Py Py Py Py Py – CAG : G – –

Date posted: 06/07/2014, 14:20
