
Alternative splicing: global insights

Martina Hallegger*, Miriam Llorian* and Christopher W J Smith

Department of Biochemistry, University of Cambridge, UK

Following the original reports of pre-mRNA splicing in 1977, it was quickly realized that splicing together of different combinations of splice sites (alternative splicing) allows individual genes to generate more than one mRNA isoform. The full extent of alternative splicing only began to be revealed once large-scale genome and transcriptome sequencing projects began, rapidly revealing that alternative splicing is the rule rather than the exception. Recent technical innovations have facilitated the investigation of alternative splicing at a global scale. Splice-sensitive microarray platforms and deep sequencing allow quantitative profiling of very large numbers of alternative splicing events, whereas global analysis of the targets of RNA binding proteins reveals the regulatory networks involved in post-transcriptional gene control. Combined with sophisticated computational analysis, these new approaches are beginning to reveal the so-called ‘RNA code’ that underlies tissue and developmentally regulated alternative splicing, and that can be disrupted by disease-causing mutations.

Keywords

alternative splicing; microarray; RNA-Seq

Correspondence

C W J Smith, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
Fax: +44 1223 766002
Tel: +44 1223 333655
E-mail: cwjs1@cam.ac.uk

*These authors contributed equally to this work

(Received 26 August 2009, accepted 22 October 2009)

doi:10.1111/j.1742-4658.2009.07521.x

Abbreviations

CLIP, UV cross-linking and immunoprecipitation; CELF, CUGBP and ETR3 like family (of RNA binding proteins); CUGBP, CUG binding protein; miRNA, micro-RNA; RNP, ribonucleoprotein; MBNL, muscleblind like; PTB, polypyrimidine tract binding protein; SELEX, selective evolution of ligands by exponential enrichment; SR protein, serine-arginine rich protein.

Introduction

Alternative splicing allows individual genes to produce two or more variant mRNAs, which in many cases encode functionally distinct proteins. With the progressive generation of ever larger sequence datasets, the proportion of multi-exon human genes that are known to be alternatively spliced has expanded to 92–94%, of which 85% have a minor isoform frequency of at least 15% [1,2]. Despite some debate about the extent to which all of this alternative splicing is functionally important [3], there is no disputing that alternative splicing is a major contributor to the diverse repertoire of transcriptomes and proteomes. Its importance is underscored by the fact that misregulated alternative splicing can lead to human disease [4,5]. As part of the overarching effort to understand how the information encrypted within genomes is used to generate fully functional organisms, it is therefore necessary to decipher the ‘RNA codes’ underlying regulated patterns of alternative splicing.

Traditionally, research on alternative splicing regulation focused on the study of minigene models in vitro or in vivo. The picture that emerged is that regulation of alternative splicing occurs via the action of numerous RNA binding proteins expressed at variable levels between tissues. These activators and repressors often mediate their effects by binding to enhancer and silencer elements within or surrounding alternatively spliced exons (reviewed in [6]). Although much progress has been made using model systems, a drawback is that even when a model alternative splicing event has been thoroughly characterized it is not immediately obvious which of its features are generally shared by coregulated alternative splicing events as part of a common regulatory programme, and which features are oddities of the particular model system. Over recent years new high-throughput methodologies have allowed the analysis of thousands of alternative splicing events in parallel. These tools (principally splice-sensitive microarrays, but also medium-throughput automated RT-PCR, and increasingly deep sequencing) allow large-scale quantitative profiling of splice variants. This is important in allowing the generation of large datasets of coregulated splicing events, a prerequisite for defining RNA codes. Biomedically, these approaches can facilitate the identification of splicing signatures that are associated with pathologies [7]. At the same time, improved methods for defining the full cellular complement of RNAs to which a particular protein binds, for example CLIP (UV cross-linking and immunoprecipitation [8]) and its ‘next generation’ derivative HITS-CLIP [9] or CLIP-Seq [10], as well as a global analysis of alternative splicing changes produced as a result of splicing factor knockdown or knockout, provide additional ‘factor-centric’ datasets that can contribute to defining the codes.

Several recent reviews have covered different aspects of these global analyses [11–15]. The aim of this minireview is to highlight some of the recently published information that contributes towards breaking the RNA code by the application of high-throughput methodology, mainly focusing upon work in mammalian systems. We start by providing a brief review of the enabling technologies, and move on to discuss the insights they have allowed and possible future developments.

Analogue and digital transcriptome profiling

Early microarrays typically contained probes consisting of full-length cDNAs or oligonucleotide probes located towards the 3′ end of transcripts, and were unable to distinguish alternatively spliced isoforms. However, a number of current array designs, in different ‘flavours’ depending on the location of the probes, can distinguish between splice variants (Fig 1A, Table 1): (a) tiling arrays, with overlapping probes across a known genomic sequence (a chromosome or an entire genome) [16]; (b) exon-body arrays, in which probes are located within exons (for example, the Affymetrix human ExonArray includes 1.4 million probe sets corresponding to all known human exons, ranging from the well annotated to more speculative computational predictions [17–20]); (c) splice-junction arrays, which contain probes crossing spliced junctions [21]; or (d) exon-junction arrays, which contain probes within exons as well as across exon junctions. Among the exon-junction arrays that have been used successfully are human and mouse arrays interrogating 3100 and 3700 cassette exons, respectively [22,23]. A similar design has been used to interrogate 8315 alternative splicing events in Drosophila [24–26]. Finally, a ‘whole transcript’ microarray monitoring 203 672 exons and 178 351 exon junctions has allowed the identification of more than 24 000 human alternative splicing events [27]. Such arrays have been applied successfully to study changes in alternative splicing under different conditions ranging from tissue-specific changes [17,27,28], cancer-associated splicing [19,29], signal-activated splicing [26,30] and developmentally regulated splicing [20,31], as well as to define functional targets by splicing factor depletion [18,25,32–34] and alternative splicing events linked to nonsense-mediated decay [35]. Although splice-sensitive microarrays have been applied with great success (see Table 1), they have some limitations, including cross-hybridization problems, limited dynamic range, as well as a low signal-to-noise ratio due to background. In particular, many of the normal rules for optimal probe design have to be relaxed or ignored in the case of exon-junction probes. Finally, arrays are not an ideal platform for discovering new alternative splicing events, including, for example, inclusion of pseudo-exons (see accompanying review by Dhir and Buratti [36]), and they are limited to organisms with sequenced genomes.

Sequence-based methods, including small tags, such as expressed sequence tags, cap analysis of gene expression [37], serial analysis of gene expression [38], as well as full-length cDNAs [39,40], have been used to obtain digital counts of transcript abundance, but they have suffered from bias introduced in the sample preparation, inability to detect lowly expressed genes and low statistical power. The development of high-throughput DNA sequencing technologies [10,41,42] circumvents many of these previous barriers [1,43–48]. RNA-Seq has the capacity to generate millions of short sequence reads (25–30 or 200–400 nucleotides depending on the sequencing technology) of cDNAs derived from polyA-enriched mRNA [45]. Reads are then mapped on to unique locations on the genome and annotated transcriptome (for splice-junction reads), providing a digital count of expressed sequences (exons). Differences in read densities across genes in different conditions allow for quantification of gene expression [2,43]. Comparison with microarray or RT-PCR data shows that read counts give an accurate estimate of relative gene expression levels across a very broad dynamic range [1,2].
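The idea of comparing read densities between conditions can be illustrated with a minimal sketch. It normalizes a raw read count by transcript length and by sequencing depth (reads per kilobase per million mapped reads); this particular normalization is a widely used convention that the text above does not name, so treat it as an assumption. The function name and all numbers are hypothetical.

```python
def reads_per_kb_per_million(read_count, length_nt, total_mapped_reads):
    """Length- and depth-normalized read density for one gene (illustrative)."""
    if length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = length_nt / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Hypothetical gene with 2000 nt of exonic sequence, profiled in two conditions
# that were sequenced to different depths.
density_a = reads_per_kb_per_million(500, 2_000, 10_000_000)
density_b = reads_per_kb_per_million(1_500, 2_000, 20_000_000)
fold_change = density_b / density_a
```

Although the raw count triples between the two hypothetical libraries, the normalized density rises only 1.5-fold, because the second library is twice as deep.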


Fig 1 High-throughput methods for global analyses of alternative splicing. (A) Schematic representation of different splice-sensitive microarrays (adapted from [27]). Exon arrays, typically Affymetrix Exon Arrays, contain oligonucleotide probe sets for every known and predicted exon. Junction arrays, typically used in [21], contain probes spanning exon junctions across annotated genes. Exon-junction arrays typically contain both exon-body and exon-junction probes. The coverage of these arrays varies from a few thousand cassette exons [22,23] to all annotated alternatively spliced genes in Drosophila [24–26] or every single annotated exon and exon junction in ~18 000 human genes [27]. The bottom panel shows an example of differential exon usage for a typical cassette exon by means of the differential hybridization signals. (B) RNA-Seq. The genomic structure for a typical cassette exon is depicted in the middle of the panel, where constitutive exons are shown in purple and the alternative cassette exon in blue. Sequence reads obtained from the high-throughput method are represented in colour-coded rectangles (see inset) and are mapped within the genomic sequence. The counting of reads corresponding to inclusion (upper) and skipping (bottom) allows for the estimation of ‘inclusion ratios’ for the different alternatively spliced isoforms.

Because many sequence reads span exon–exon junctions, RNA-Seq can identify novel splicing events. The discovery of new alternative splicing events and mRNA isoforms is an area where the new sequencing technologies will have an immediate impact. However, a greater challenge is to harness RNA-Seq for digital quantitative profiling of alternative splicing (Fig 1B). In principle, changes in alternative splicing between two conditions can be quantitated by comparing the number of reads mapping to reciprocal events (e.g. exon inclusion versus skipping) [2], or by normalizing the number of reads mapping to a particular splice junction or exon by the number of reads across the gene. In practice, large-amplitude changes in alternative splicing events within genes that are themselves highly expressed are readily detected (e.g. the ‘switch-like’ events reported in [2]). Only in one-third of 105 000 annotated alternative splicing events were reciprocal reads detected by Wang et al. [2], allowing quantification of tissue-specific differential splicing using a minimum threshold of 10% change in inclusion ratio between tissues. However, more subtle changes in alternative splicing within genes for which few reads are available will evade detection [49]. Recent estimates suggest that 200 million reads would be required to quantitate accurately the splicing levels in 80% of genes [15]. In the future, the progressively decreasing cost and increasing read lengths and volume of high-throughput sequencing can only advance the ability of RNA-Seq to profile alternative splicing quantitatively. Methods to ‘focus’ sequence reads on to splice junctions, such as RNA-mediated annealing, selection, extension and ligation [50] or preselection by customized capture arrays [51], might enable more cost-effective quantitative profiling of a large number of alternative splicing events. In the meantime, some of the splice-sensitive microarray platforms will remain competitive.
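The reciprocal-read comparison described above can be sketched in a few lines. This is a simplified illustration, not the procedure of any cited study: it takes counts of reads supporting exon inclusion and exon skipping, computes an inclusion ratio, and flags a cassette exon as differentially spliced when the ratio differs by at least 10% between two tissues. The function names and read counts are hypothetical, and the sketch ignores refinements such as correcting for the number of junctions that support each isoform.

```python
def inclusion_ratio(inclusion_reads, skipping_reads):
    """Fraction of reciprocal reads that support exon inclusion.

    Returns None when no reciprocal reads were observed, because the
    ratio cannot be estimated in that case.
    """
    total = inclusion_reads + skipping_reads
    if total == 0:
        return None
    return inclusion_reads / total


def is_differential(ratio_a, ratio_b, min_change=0.10):
    """True when both ratios exist and differ by at least min_change."""
    if ratio_a is None or ratio_b is None:
        return False
    return abs(ratio_a - ratio_b) >= min_change


# Hypothetical read counts for one cassette exon in two tissues.
brain = inclusion_ratio(inclusion_reads=45, skipping_reads=5)
liver = inclusion_ratio(inclusion_reads=10, skipping_reads=40)
switch_like = is_differential(brain, liver)
```

The `None` return makes the coverage problem explicit: an event without reciprocal reads in either tissue is reported as not quantifiable rather than as unchanged.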

Surveying splicing regulator targets

Cataloguing the targets of RNA binding proteins that are known splicing regulators provides a complementary entry point for unravelling RNA codes. ‘Functional targets’ can be classified as the set of alternative splicing events that are affected by perturbing the levels of a splicing regulator, by knockdown, knockout or overexpression. These targets can be identified by global transcriptome profiling tools, such as splice-sensitive microarrays [18,25,32–34], medium-throughput RT-PCR [52], RNA-Seq or even quantitative proteomics [53]. However, apparent functional targets can include indirect secondary targets.

A complementary approach is to identify direct RNA ‘binding targets’. Selective evolution of ligands by exponential enrichment (SELEX) is an initial fully in vitro approach that defines the optimal binding site, typically short variably degenerate motifs, for an RNA binding protein by iterative selection from an initially fully degenerate sequence pool [54]. A variant approach, genomic SELEX, uses RNA transcribed from genomic DNA as the starting pool for selection [55]. SELEX is a useful, although not obligatory, precursor to methods that catalogue the actual RNA species (mRNA or pre-mRNA) bound by a splicing regulatory protein. Direct immunoprecipitation without prior cross-linking (RNP immunoprecipitation) followed by hybridization to arrays can be a useful approach [25]. However, a more powerful approach for identifying binding targets is CLIP (Fig 2), which was originally developed to identify targets of the neuron-specific NOVA proteins [8,56]. RNA is first cross-linked in vivo to bound protein by UV irradiation, fragmented to ~100 nucleotide tags, isolated by immunoprecipitation, reverse transcribed and then sequenced. A key feature of CLIP is that UV induces ‘zero-length’ cross-links only between RNA and directly bound proteins, thereby allowing enrichment of specifically bound sequences by immunoprecipitation under stringent conditions. The original CLIP procedure has now been modified, with direct high-throughput sequencing of reverse transcribed tags [9,10]. The so-called HITS-CLIP [9] or CLIP-Seq [10] protocols allow saturated coverage of binding targets, giving a truly global view of the RNP landscape of individual proteins, and suggesting possible novel functions. This ‘next generation’ CLIP approach has already been applied to the splicing regulators NOVA [57], FOX2 [58], SFRS1 (better known as SF2/ASF) [59,60], as well as the miRNA-associated protein, argonaute [61]. The comprehensive view afforded by this approach reveals additional, nonsplicing-related, roles for these RNA binding proteins. For example, a surprising new function for NOVA2 in alternative poly(A)-site choice was discovered. Neuronal cells in general tend to process at promoter-distal poly(A)-sites and the NOVA2 targets follow this trend. Proliferating cells produce shorter 3′ UTRs and therefore reduce the potential of miRNA regulation [62]. By the same token, neuronal transcripts with long UTRs are potentially more prone to regulatory inputs from both miRNAs and 3′ UTR binding proteins.

Table 1 Summary of splice-sensitive microarray analyses.

Validation rate (events tested); Reference
203 672 exons / 178 351 exon junctions; 48 tissues and cell lines; Human; 74% (23 events tested); [27]
110 367 exons / 93 382 exon junctions; Time course of heart development; Mouse; Not mentioned; [31]
Affymetrix Exon Array, probe sets for 1 million exons; Colon, bladder, prostate cancer tissues
Exon Array and array featuring exon-body and exon-junction probe sets; UPF3 in HeLa
8315 mRNAs / 9868 alt junction probes; Knockdown of SR and hnRNP proteins in S2 cells
Knockdown of hnRNP proteins in S2 cells
Alternative splicing changes upon insulin or Wingless stimulation

In practice, methods to define functional and binding targets are complementary. A comprehensive global analysis of the Drosophila homologues of the mammalian hnRNPA/B proteins, hrp36, hrp38, hrp40 and hrp48, involved analysis by a splice-sensitive array of alterations in alternative splicing upon knockdown, determination of SELEX motifs in vitro and direct immunoprecipitation without prior cross-linking followed by hybridization to a whole genome tiling array [25]. This provided many insights into the functional redundancy and specialization of this family, and provided hints about their probable mechanism of action. Perhaps most surprisingly, in view of popular models about antagonism between the two families of proteins, very few alternative splicing events were found to be regulated by both hnRNP and SR proteins [24,25].
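One way to picture how functional and binding targets complement each other is as a comparison of two sets of splicing events. The sketch below is only an illustration under that assumption: the event identifiers are invented, and the category labels are a convenience for this example rather than terminology from the studies cited.

```python
def classify_targets(functional_targets, binding_targets):
    """Split events by whether they respond to perturbation, are bound, or both."""
    functional = set(functional_targets)
    bound = set(binding_targets)
    return {
        # Responds to knockdown AND is bound: candidate direct target.
        "direct": functional & bound,
        # Responds to knockdown but no binding detected: possibly indirect.
        "possibly_indirect": functional - bound,
        # Bound but splicing unchanged under the tested condition.
        "bound_only": bound - functional,
    }


# Hypothetical event identifiers.
groups = classify_targets(
    functional_targets=["exonA", "exonB", "exonC", "exonD"],
    binding_targets=["exonB", "exonD", "exonE"],
)
```

Events in `possibly_indirect` are the ones for which the caveat about secondary targets applies; events in `bound_only` may reflect redundancy with other factors or a non-splicing role of the bound protein.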

Tissue and individual variations in alternative splicing

Over the last year, several reports have focussed on the global analysis of transcript isoform differences between human tissues [1,2,16,27,28,47,63,64], mouse tissues [31,63], normal and cancer tissues [64], in response to specific signalling pathways in Drosophila [26], or developmental transitions in human brain [28], mouse heart [31] and mouse stem cells [63]. The combination of these approaches has revealed extensive transcript complexity.

Fig 2 HITS-CLIP. Intact tissue or tissue culture cells are UV irradiated to induce covalent cross-links between RNA and RNA binding proteins. Cells are lysed under very stringent conditions and treated with DNase and partially digested with RNases. The RNA–RNP complex is pulled down by immunoprecipitation. The RNA is radioactively 5′ labelled and ligated to a 5′ RNA linker. The sample is run on SDS/PAGE with neutral pH and blotted. Only RNA cross-linked to protein will be transferred on to the membrane. A small fragment of membrane is isolated at a position that corresponds to the protein plus RNA between 50 and 100 nucleotides. After proteinase K digestion, the RNA is recovered from the membrane and ligated on its 3′ end to an RNA adapter with complementarity to the RT primer. The following PCR step with primer complementary to ligated linkers also allows the addition of appropriate HITS-specific primer sequences (adapted from [76]).

Sequencing approaches show that many transcripts extend beyond the previously annotated 5′ and 3′ gene boundaries [1,2,63]. Moreover, there has been a substantial increase in the number of known alternative splicing events, with the capacity of discovering new splice junctions, ranging from 1400 in one study [63] to between 4294 and 11 099 in another [1]. The majority of detected alternative splicing events, including those newly discovered, show clear tissue specificity, demonstrating the importance of alternative splicing in tissue-specific programmes of gene expression. In one study alone, involving 400 million 32 base reads from 15 human tissues and cell lines, 22 000 tissue-specific alternative transcript events were identified [2]. A group of alternative splicing events that shows extreme changes between tissues, so-called ‘switch-like’ events, is associated with the regulation of highly tissue-specific functions by switching between distinct full-length isoforms [2]. Perhaps unsurprisingly, some of these switch-like alternative splicing events within highly expressed genes (e.g. TPM1) have been used for many years as model systems of regulated alternative splicing.

Interestingly, although in many cases alternative splicing regulates functionally coherent groups of genes, there is no significant overlap between those genes that are differentially transcribed and those that are differentially spliced within the same tissue or within specific cell programmes [27,30,31,42,65]. For example, upon T cell activation, genes related to the immunological response are affected at the level of transcription, whereas cell cycle genes are differentially spliced [30]. These findings build upon the original observations of Pan et al. [23] suggesting that overall programmes of tissue-specific gene expression involve independent subprogrammes operating on different subsets of genes at the levels of transcription and splicing [66]. On the other hand, in response to certain signalling pathways in Drosophila melanogaster cells, a 40% overlap was found between genes that undergo both transcriptional and splicing changes, suggesting that transcriptional and post-transcriptional co-ordination could be important to deploy quick responses upon certain stimuli [26].
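Whether the overlap between differentially transcribed and differentially spliced genes is ‘significant’ is a statistical question. One common way to ask it, shown here as a sketch and not as the test used in the cited studies, is a one-sided hypergeometric test: given the sizes of the two gene sets, how likely is an overlap at least as large as the one observed if the sets were independent? The function name and the example numbers are hypothetical.

```python
from math import comb


def overlap_p_value(n_genes, n_transcribed, n_spliced, n_both):
    """P(overlap >= n_both) if the spliced set were drawn at random.

    n_genes        total genes assayed
    n_transcribed  genes with a transcriptional change
    n_spliced      genes with a splicing change
    n_both         genes observed in both sets
    """
    max_overlap = min(n_transcribed, n_spliced)
    all_draws = comb(n_genes, n_spliced)
    tail = sum(
        comb(n_transcribed, k) * comb(n_genes - n_transcribed, n_spliced - k)
        for k in range(n_both, max_overlap + 1)
    )
    return tail / all_draws


# Hypothetical experiment: 10 000 genes assayed, 500 change in transcription,
# 400 change in splicing, and 22 change in both (20 would be expected by chance).
p = overlap_p_value(10_000, 500, 400, 22)
```

With these invented numbers the observed overlap is close to the chance expectation, so the test would not support co-ordinated regulation; a much larger overlap would give a small p-value.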

Sequencing and array studies have also provided fascinating glimpses at the degree to which alternative splicing varies between individuals. RNA-Seq of seven cerebellar cortex samples [2] and exon tiling array analysis of 57 lymphoblastoid cell lines [16] both showed a significant association between genomic variations (single nucleotide polymorphisms) and alternative splicing patterns. Happily (for those working on mechanisms of tissue-specific splicing), both studies indicated that although alternative splicing variation between individuals is common, it is secondary to tissue-specific alternative splicing.

Motifs and maps

RNA-Seq and microarray analysis on tissues have generated a genome-scale catalogue of isoform expression profiles [2,17,27,31]. These data provide a resource to identify the RNA sequences involved in the regulation of tissue-specific alternative splicing by motif enrichment analysis. In some cases, the motifs associated with tissue-specific alternative splicing hint at the involvement of ‘usual suspects’: well-known splicing regulators with defined binding sequences.

By microarray profiling 48 human tissues and systematically screening for 4-mer to 7-mer RNA ‘words’ associated with 24 426 alternatively spliced exons, Castle et al. [27] identified 143 motifs enriched near tissue-specific exons. Interestingly, the two most frequent motifs, UCUCU and UGCAUG, coincide with binding consensus sequences for PTB/nPTB and FOX splicing factors, and show a distinct pattern of genomic localization. Similar observations were made based on RNA-Seq reads from 15 human tissues and cell lines [2]. UCUCU motifs were enriched within a 200 nucleotide region upstream of cassette exons that are upregulated in brain and striated muscle. The extent to which these exons are spliced correlates inversely with PTB expression levels [2,27], consistent with PTB’s well-known role as a splicing repressor [67].
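The kind of word screen described above can be sketched as follows. This is a deliberately minimal illustration, not the statistical procedure of the cited studies: it counts every overlapping 4-mer to 7-mer in sequences flanking regulated exons (foreground) and in control sequences (background), then reports words whose frequency ratio exceeds a cut-off. The function names, the pseudocount, the cut-off and the toy sequences are all assumptions made for the example.

```python
from collections import Counter


def count_words(sequences, k_min=4, k_max=7):
    """Count overlapping k-mers (k_min..k_max) and the windows scanned per k."""
    words, windows = Counter(), Counter()
    for seq in sequences:
        rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
        for k in range(k_min, k_max + 1):
            for i in range(len(rna) - k + 1):
                words[rna[i:i + k]] += 1
                windows[k] += 1
    return words, windows


def enriched_words(foreground, background, min_ratio=2.0, pseudocount=1.0):
    """Words whose foreground frequency is >= min_ratio times the background."""
    fg_words, fg_windows = count_words(foreground)
    bg_words, bg_windows = count_words(background)
    results = {}
    for word, fg_hits in fg_words.items():
        k = len(word)
        fg_rate = (fg_hits + pseudocount) / (fg_windows[k] + pseudocount)
        bg_rate = (bg_words[word] + pseudocount) / (bg_windows[k] + pseudocount)
        ratio = fg_rate / bg_rate
        if ratio >= min_ratio:
            results[word] = ratio
    return results


# Toy intronic sequences: the foreground carries a UCUCU word, the background does not.
hits = enriched_words(
    foreground=["AAUCUCUAA", "GGUCUCUGG"],
    background=["AAGGAAGGA", "GGAAGGAAG"],
)
```

A real screen would replace the simple ratio with a significance test and correct for the many words examined; the pseudocount here only prevents division by zero for words absent from the background.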

The Castle et al. [27] junction array was also used to analyse alternative splicing during development of the mouse heart, resulting in the identification of 63 developmentally regulated alternative splicing events, falling into three temporal groups. More than half of these events were regulated similarly during development of the chicken heart [31]. Enriched motifs included binding sites for the CUGBP, MBNL, FOX, STAR and PTB families of splicing factors. Forty-four of these alternative splicing events were further investigated in hearts from transgenic animals that overexpressed CUGBP1 or were depleted of MBNL1. Of the 24 exons with altered inclusion levels, 13 were regulated by CUGBP1, five by MBNL1 and six antagonistically by both [31]. The switch in relative activities of CUGBP and MBNL proteins during development appears to explain a large subset of splicing transitions detected during postnatal heart development.

Observation of enriched motifs in the cases above allowed inferences to be drawn about the probable cognate binding proteins, e.g. FOX, PTB, MBNL and CELF proteins. However, there are more than 300 RNA binding proteins encoded in mammalian genomes [68], which have the potential to act as splicing regulators, but for most little or nothing is known about their binding specificity. Traditional SELEX to determine their binding specificity would be laborious. However, a new array-based procedure may provide the capability to rapidly derive the optimal binding motifs for many of these proteins [69], which would assist in future attempts to link factors with enriched motifs.

NOVA and FOX maps

In the case of two families of mammalian proteins, the FOX and NOVA proteins, a variety of techniques, culminating in HITS-CLIP analysis, have converged on very similar RNA maps, in which the precise location of binding sites for the cognate proteins is predictive of their action as either activators or repressors of alternatively spliced exons.

The NOVA proteins are neuron-specific RNA binding proteins that are targets of a neuronal autoimmune response associated with cancer. Analysis of these proteins in the Darnell laboratory has led the way in the global analysis of RNA binding protein function [70]. SELEX analysis indicated that the optimal binding site for NOVA consisted of clusters of three YCAY motifs [71], and importantly a cluster of such motifs matched a cis element crucial for NOVA-regulated alternative splicing of an exon in the GABAA gene. Analysis of alterations in alternative splicing in the neocortex of wild-type and Nova2−/− mice using an Affymetrix prototype junction array with ~40 000 probe sets allowed the identification of ~50 alternative splicing events that were NOVA regulated [72]. The genes affected by NOVA-dependent alternative splicing were highly enriched for proteins involved in synaptic function, emphasizing the fact that alternative splicing targets functionally coherent groups of genes. The CLIP method was originally developed to analyse in vivo NOVA binding RNAs by conventional cloning and sequencing of purified RNA tags. Of the moderate number of sequence tags identified, only ~20% contained clusters of YCAY motifs, but in these cases the tags were often associated with NOVA-regulated alternative splicing events [56]. On the basis of the accumulated group of validated NOVA targets, a bioinformatic exercise was carried out to identify clusters of YCAY motifs within 200 nucleotides of alternative exons or their flanking constitutive exons, and moreover to predict whether these clusters would act as enhancers or silencers [73]. The resulting NOVA RNA map contained various intronic and exonic silencers, as well as intronic enhancers. NOVA clusters within the downstream intron were invariably enhancers, whereas within the exon and most positions in the upstream intron they were silencers. Most recently, the NOVA RNA map has been refined by high-throughput sequencing (using the Roche 454 platform) of NOVA2 CLIP tags from mouse neocortex, with confirmation of splicing outcomes by splice-junction array comparison of wild-type and Nova2−/− mice [57]. As expected, the comprehensive HITS-CLIP approach rediscovered many of the previously known NOVA targets, as well as many new ones. The refined NOVA map showed that NOVA binding clusters within 500 nucleotides of the alternative 5′ splice site or constitutive 3′ splice site acted as enhancers, whereas NOVA binding within 500 nucleotides of the constitutive 5′ splice site or surrounding the NOVA-regulated exon was inhibitory.

The FOX1 and -2 proteins are alternative splicing regulators that have a single RNA binding domain with an unusual degree of specificity for the cognate UGCAUG binding site [74]. In a number of recent global transcriptome profiling studies, FOX binding motifs were found to be associated with exons regulated in striated muscle and neurons [2,27,75], consistent with the expression patterns of FOX1 and -2. Analysis of breast and ovarian cancer using an RT-PCR panel of alternative splicing events indicated that one-third of cases of increased exon skipping in cancer were associated with downstream FOX sites. Moreover, FOX2 expression is lower in breast cancer and its own alternative splicing is altered in ovarian cancer [64]. Closer analysis of the various FOX datasets showed an interesting position-dependent effect, reminiscent of the NOVA map [57,73]. When located downstream of alternatively spliced exons, FOX binding sites act as enhancers, whereas on the upstream side they act as repressors (Fig 3). The FOX ‘RNA map’ was also converged upon by two additional approaches that used FOX binding sites and mRNA targets as the starting point. The long and nondegenerate nature of the FOX binding site allowed Zhang et al. [75] to conduct a computational search for positionally conserved UGCAUG motifs within 200 nucleotides of internal exons across 28 vertebrate genomes. Comparing the bioinformatics with data collected from the Castle et al. [27] custom exon-junction array for alternative splicing in 47 different tissues and cell lines, they identified the position dependency of FOX binding sites. Finally, CLIP-Seq analysis was carried out for FOX2 binding sites in human embryonic stem cells [58]. Of 5.3 million 36 nucleotide reads, 4.4 million mapped to unique genomic locations, leading to the identification of > 3500 clusters representing genuine FOX2 binding events. Surprisingly, although the UGCAUG motif was highly enriched, an exact match was only found in 22% of clusters, and even the core GCAUG pentamer was present in only 33%, indicating that FOX2 can bind to other sites, perhaps in co-operation with other proteins. FOX2 sites were highly enriched around alternatively spliced exons and a similar position-dependent FOX2 activity map was deduced. Interestingly, it appears that FOX2 is a key player in a splicing regulatory network in human embryonic stem cells. The alternative splicing events regulated by FOX2 were highly enriched for splicing regulatory proteins, including numerous hnRNP and SR proteins and an autoregulatory splicing event in the FOX2 gene itself [58]. In contrast, different sets of FOX2 targets were identified in neural progenitors, with the major functional enrichment being for cytoskeletal proteins, consistent with other reports [27,58,64,75].
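The position-dependent logic of these RNA maps can be illustrated with a small sketch. It reduces the maps to their simplest common rule (sites in the downstream intron are associated with inclusion, sites in the upstream intron with skipping) and scans 200 nucleotides on either side of an exon for a motif. This is a simplification made for illustration: the real NOVA map also involves clusters of sites, exonic positions and flanking constitutive exons. The function names, the `min_sites` parameter and the toy sequences are hypothetical.

```python
import re

# Motifs as regular expressions over RNA; Y (pyrimidine) is written as [CU].
# The lookahead form lets overlapping occurrences be found.
YCAY = re.compile(r"(?=([CU]CA[CU]))")
UGCAUG = re.compile(r"(?=(UGCAUG))")


def motif_positions(seq, pattern):
    """Start positions (0-based) of every, possibly overlapping, motif match."""
    return [m.start() for m in pattern.finditer(seq)]


def predict_effect(upstream_intron, downstream_intron, pattern,
                   window=200, min_sites=1):
    """Crude prediction of a regulator's effect on a cassette exon."""
    near_up = upstream_intron[-window:]     # intron sequence ending at the exon
    near_down = downstream_intron[:window]  # intron sequence starting after it
    has_up = len(motif_positions(near_up, pattern)) >= min_sites
    has_down = len(motif_positions(near_down, pattern)) >= min_sites
    if has_down and not has_up:
        return "enhanced"
    if has_up and not has_down:
        return "repressed"
    return "ambiguous" if has_up else "no site"


# Toy example: a single UGCAUG in the downstream intron only.
effect = predict_effect("AAAAAAAA", "GGUGCAUGGG", UGCAUG)
```

Passing `YCAY` with `min_sites=3` gives a rough stand-in for the requirement of a cluster of YCAY motifs, although it does not check how closely the sites are spaced.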

Towards a predictive splicing map

Global alternative splicing profiling points towards the association of some sequence motifs and their cognate binding proteins with some tissue-specific splicing programmes, whereas the NOVA and FOX splicing maps indicate the position-dependent activity of some splicing regulators. But even the activity of FOX and NOVA when bound at particular locations is dependent upon the binding and activity of other factors. There is still some way to go before a full tissue-specific splicing code, with the ability to predict the consequences of mutations, is deciphered.

A recent study highlighted one of the important future directions. The Frey and Blencowe groups have developed a machine-learning approach in which the tissue-specific splicing profiles of 3707 mouse cassette exons, gathered using a custom junction-array platform [22], have been combined with over a thousand separate ‘RNA features’ in order to generate a ‘splicing code’ that predicts changes in exon inclusion between tissues. The features include known protein binding sequences (including FOX, NOVA and PTB/nPTB), motifs with predicted silencer or enhancer activity, secondary structures, conservation, exon and intron size, and whether exon inclusion or skipping introduces a premature termination codon. Using this approach, distinct combinations of features are found to be predictive of five different tissue categories of alternative splicing: central nervous system, muscle, embryo, ‘digestive organs’ (including liver, kidney, gut) and tissue independent (B. Frey, personal communication). This pioneering study is based upon a moderate number of cassette exons and 27 tissue-specific datasets, but it provides a clear direction for future endeavours.

Further refinement of the splicing code will be readily achieved by a combination of additional tissue datasets and analysis of transcriptomes of defined cell types (most tissues contain a variety of differentiated cell types), together with larger numbers and different categories of alternative splicing events. The ability to sequence the transcriptomes of single cells [63] will also be enormously helpful as improved methods for sequencing-based quantitative profiling of alternative splicing are developed. Of course, defining the logic of the code will pose many questions about the underlying mechanisms. For example, why do FOX and NOVA proteins inhibit from an upstream position, but activate from downstream of an alternative exon? As the details of the splicing codes are revealed, there will be scope for a great deal of further mechanistic dissection at the molecular level. However, in contrast to earlier work on alternative splicing mechanisms, experimentalists will know in advance that they are revealing the mechanisms of generally applicable programmes.
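As a rough illustration of what encoding ‘RNA features’ for a cassette exon might look like, the following Python sketch computes a handful of the feature types listed above (exon and intron size, FOX motif counts on each side of the exon, and a crude reading-frame check). It is not the method used in the study described: the `CassetteExon` container, the 300-nt window and the use of exon length modulo 3 as a stand-in for premature termination codon introduction are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

FOX_MOTIF = "UGCAUG"  # FOX binding motif named in the text


@dataclass
class CassetteExon:
    """Hypothetical container: one cassette exon and its flanking introns (RNA alphabet)."""
    upstream_intron: str
    exon: str
    downstream_intron: str


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, allowing overlapping matches."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


def rna_features(event: CassetteExon, window: int = 300) -> dict:
    """Build a small feature dictionary for one cassette exon.

    Only the intronic sequence within `window` nucleotides of the exon is
    scanned for motifs; the window size is an arbitrary illustrative choice.
    """
    proximal_upstream = event.upstream_intron[-window:]
    proximal_downstream = event.downstream_intron[:window]
    return {
        "exon_length": len(event.exon),
        "upstream_intron_length": len(event.upstream_intron),
        "downstream_intron_length": len(event.downstream_intron),
        "fox_motifs_upstream": count_motif(proximal_upstream, FOX_MOTIF),
        "fox_motifs_downstream": count_motif(proximal_downstream, FOX_MOTIF),
        # Crude proxy (assumption): skipping an exon whose length is not a
        # multiple of 3 shifts the reading frame, which often, but not always,
        # leads to a premature termination codon.
        "skipping_shifts_frame": len(event.exon) % 3 != 0,
    }
```

A real feature set of the kind described would contain over a thousand such entries per exon, which a learning algorithm would then relate to the measured tissue-specific inclusion levels.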

Fig. 3. Position-dependent activity of FOX proteins. (A) Enrichment of UGCAUG motifs in the downstream intron is associated with increased exon inclusion in heart, skeletal muscle, brain and cerebellar cortex. Higher motif frequency in the upstream intron is associated with reduced inclusion in skeletal muscle. Adapted from [2]. (B) Enrichment of FOX binding sites on the upstream side of alternatively spliced exons, indicated by the blue line, is associated with FOX-dependent exon skipping, whereas enrichment on the downstream side (red line) is associated with FOX-dependent inclusion. Adapted from [2,58,64,75].
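The position-dependent trend summarized in this figure can be written down as a toy rule. The Python sketch below is only an illustration under stated assumptions: the function names, the 0-based half-open coordinate convention and the simple majority comparison are hypothetical, and, as noted in the text, the real outcome also depends on the binding and activity of other factors.

```python
from typing import Iterable, Tuple


def count_flanking_sites(
    site_positions: Iterable[int], exon_start: int, exon_end: int
) -> Tuple[int, int]:
    """Count FOX sites upstream and downstream of an exon.

    Coordinates are assumed to be 0-based, half-open [exon_start, exon_end)
    and given in transcript orientation. Sites inside the exon are ignored.
    """
    upstream = 0
    downstream = 0
    for pos in site_positions:
        if pos < exon_start:
            upstream += 1
        elif pos >= exon_end:
            downstream += 1
    return upstream, downstream


def expected_fox_effect(
    site_positions: Iterable[int], exon_start: int, exon_end: int
) -> str:
    """Return the direction of FOX-dependent regulation suggested by the map.

    Downstream enrichment is associated with inclusion, upstream enrichment
    with skipping; a tie (or no flanking sites) is reported as undetermined.
    """
    upstream, downstream = count_flanking_sites(site_positions, exon_start, exon_end)
    if downstream > upstream:
        return "inclusion"
    if upstream > downstream:
        return "skipping"
    return "undetermined"
```

For example, under these assumptions an exon spanning positions 100 to 200 with FOX sites only at 250 and 300 would be labelled `"inclusion"`, whereas a single site at position 10 would give `"skipping"`.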


Acknowledgements

We thank Brendan Frey for comments on the manuscript and for communicating unpublished data. Work in the CWJS laboratory is funded by the Wellcome Trust (programme grant 077877) and by EC grant EURASNET-LSHG-CT-2005-518238.

References

1 Pan Q, Shai O, Lee LJ, Frey BJ & Blencowe BJ (2008) Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet 40, 1413–1415.

2 Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP & Burge CB (2008) Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476.

3 Melamud E & Moult J (2009) Stochastic noise in splicing machinery. Nucleic Acids Res 37, 4873–4886.

4 Raponi M & Baralle D (2009) Alternative splicing: good and bad effects of translationally silent substitutions. FEBS J 277, doi:10.1111/j.1742-4658.2009.07519.x.

5 Faustino NA & Cooper TA (2003) Pre-mRNA splicing and human disease. Genes Dev 17, 419–437.

6 Matlin AJ, Clark F & Smith CW (2005) Understanding alternative splicing: towards a cellular code. Nat Rev Mol Cell Biol 6, 386–398.

7 Soreq L, Gilboa-Geffen A, Berrih-Aknin S, Lacoste P, Darvasi A, Soreq E, Bergman H & Soreq H (2008) Identifying alternative hyper-splicing signatures in MG-thymoma by exon arrays. PLoS ONE 3, e2392.

8 Ule J, Jensen K, Mele A & Darnell RB (2005) CLIP: a method for identifying protein-RNA interaction sites in living cells. Methods 37, 376–386.

9 Jensen KB & Darnell RB (2008) CLIP: crosslinking and immunoprecipitation of in vivo RNA targets of RNA-binding proteins. Methods Mol Biol 488, 85–98.

10 Wang Z, Gerstein M & Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10, 57–63.

11 Ben-Dov C, Hartmann B, Lundgren J & Valcarcel J (2008) Genome-wide analysis of alternative pre-mRNA splicing. J Biol Chem 283, 1229–1233.

12 Hartmann B & Valcarcel J (2009) Decrypting the genome’s alternative messages. Curr Opin Cell Biol 21, 377–386.

13 Moore MJ & Silver PA (2008) Global analysis of mRNA splicing. RNA 14, 197–203.

14 Wang Z & Burge CB (2008) Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. RNA 14, 802–813.

15 Blencowe BJ, Ahmad S & Lee LJ (2009) Current-generation high-throughput sequencing: deepening insights into mammalian transcriptomes. Genes Dev 23, 1379–1386.

16 Kwan T, Benovoy D, Dias C, Gurd S, Provencher C, Beaulieu P, Hudson TJ, Sladek R & Majewski J (2008) Genome-wide analysis of transcript isoform variation in humans. Nat Genet 40, 225–231.

17 Clark TA, Schweitzer AC, Chen TX, Staples MK, Lu G, Wang H, Williams A & Blume JE (2007) Discovery of tissue-specific exons using comprehensive human exon microarrays. Genome Biol 8, R64.

18 Oberdoerffer S, Moita LF, Neems D, Freitas RP, Hacohen N & Rao A (2008) Regulation of CD45 alternative splicing by heterogeneous ribonucleoprotein, hnRNPLL. Science 321, 686–691.

19 Gardina PJ, Clark TA, Shimada B, Staples MK, Yang Q, Veitch J, Schweitzer A, Awad T, Sugnet C, Dee S et al. (2006) Alternative splicing and differential gene expression in colon cancer detected by a whole genome exon array. BMC Genomics 7, 325.

20 Yamamoto ML, Clark TA, Gee SL, Kang JA, Schweitzer AC, Wickrema A & Conboy JG (2009) Alternative pre-mRNA splicing switches modulate gene expression in late erythropoiesis. Blood 113, 3363–3370.

21 Johnson JM, Castle J, Garrett-Engele P, Kan Z, Loerch PM, Armour CD, Santos R, Schadt EE, Stoughton R & Shoemaker DD (2003) Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 302, 2141–2144.

22 Fagnani M, Barash Y, Ip JY, Misquitta C, Pan Q, Saltzman AL, Shai O, Lee L, Rozenhek A, Mohammad N et al. (2007) Functional coordination of alternative splicing in the mammalian central nervous system. Genome Biol 8, R108.

23 Pan Q, Shai O, Misquitta C, Zhang W, Saltzman AL, Mohammad N, Babak T, Siu H, Hughes TR, Morris QD et al. (2004) Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Mol Cell 16, 929–941.

24 Blanchette M, Green RE, Brenner SE & Rio DC (2005) Global analysis of positive and negative pre-mRNA splicing regulators in Drosophila. Genes Dev 19, 1306–1314.

25 Blanchette M, Green RE, MacArthur S, Brooks AN, Brenner SE, Eisen MB & Rio DC (2009) Genome-wide analysis of alternative pre-mRNA splicing and RNA-binding specificities of the Drosophila hnRNP A/B family members. Mol Cell 33, 438–449.

26 Hartmann B, Castelo R, Blanchette M, Boue S, Rio DC & Valcarcel J (2009) Global analysis of alternative splicing regulation by insulin and wingless signaling in Drosophila cells. Genome Biol 10, R11.

27 Castle JC, Zhang C, Shah JK, Kulkarni AV, Kalsotra A, Cooper TA & Johnson JM (2008) Expression of 24,426 human alternative splicing events and predicted cis regulation in 48 tissues and cell lines. Nat Genet 40, 1416–1425.

28 Johnson MB, Kawasawa YI, Mason CE, Krsnik Z, Coppola G, Bogdanovic D, Geschwind DH, Mane SM, State MW & Sestan N (2009) Functional and evolutionary insights into human brain development through global transcriptome analysis. Neuron 62, 494–509.

29 Thorsen K, Sorensen KD, Brems-Eskildsen AS, Modin C, Gaustadnes M, Hein AM, Kruhoffer M, Laurberg S, Borre M, Wang K et al. (2008) Alternative splicing in colon, bladder, and prostate cancer identified by exon array analysis. Mol Cell Proteomics 7, 1214–1224.

30 Ip JY, Tong A, Pan Q, Topp JD, Blencowe BJ & Lynch KW (2007) Global analysis of alternative splicing during T-cell activation. RNA 13, 563–572.

31 Kalsotra A, Xiao X, Ward AJ, Castle JC, Johnson JM, Burge CB & Cooper TA (2008) A postnatal switch of CELF and MBNL proteins reprograms alternative splicing in the developing heart. Proc Natl Acad Sci USA 105, 20333–20338.

32 Hung LH, Heiner M, Hui J, Schreiner S, Benes V & Bindereif A (2008) Diverse roles of hnRNP L in mammalian mRNA processing: a combined microarray and RNAi analysis. RNA 14, 284–296.

33 Chawla G, Lin CH, Han A, Shiue L, Ares M Jr & Black DL (2009) Sam68 regulates a set of alternatively spliced exons during neurogenesis. Mol Cell Biol 29, 201–213.

34 Xing Y, Stoilov P, Kapur K, Han A, Jiang H, Shen S, Black DL & Wong WH (2008) MADS: a new and improved method for analysis of differential alternative splicing by exon-tiling microarrays. RNA 14, 1470–1479.

35 Saltzman AL, Kim YK, Pan Q, Fagnani MM, Maquat LE & Blencowe BJ (2008) Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol Cell Biol 28, 4320–4330.

36 Dhir A & Buratti E (2009) Alternative splicing: role of pseudoexons in human disease and potential therapeutic strategies. FEBS J 277, doi:10.1111/j.1742-4658.2009.07520.x.

37 Shiraki T, Kondo S, Katayama S, Waki K, Kasukawa T, Kawaji H, Kodzius R, Watahiki A, Nakamura M, Arakawa T et al. (2003) Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc Natl Acad Sci USA 100, 15776–15781.

38 Velculescu VE, Zhang L, Vogelstein B & Kinzler KW (1995) Serial analysis of gene expression. Science 270, 484–487.

39 Iida K, Fukami-Kobayashi K, Toyoda A, Sakaki Y, Kobayashi M, Seki M & Shinozaki K (2009) Analysis of multiple occurrences of alternative splicing events in Arabidopsis thaliana using novel sequenced full-length cDNAs. DNA Res 15, 155–164.

40 Kim YC, Wu Q, Chen J, Xuan Z, Jung YC, Zhang MQ, Rowley JD & Wang SM (2009) The transcriptome of human CD34+ hematopoietic stem-progenitor cells. Proc Natl Acad Sci USA 106, 8278–8283.

41 Ansorge WJ (2009) Next-generation DNA sequencing techniques. N Biotechnol 25, 195–203.

42 Calarco JA, Saltzman AL, Ip JY & Blencowe BJ (2007) Technologies for the global discovery and analysis of alternative splicing. Adv Exp Med Biol 623, 64–84.

43 Mortazavi A, Williams BA, McCue K, Schaeffer L & Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Meth 5, 621–628.

44 Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M & Snyder M (2008) The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349.

45 Wilhelm BT, Marguerat S, Watt S, Schubert F, Wood V, Goodhead I, Penkett CJ, Rogers J & Bahler J (2008) Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453, 1239–1243.

46 Lister R, O’Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH & Ecker JR (2008) Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell 133, 523–536.

47 Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D et al. (2008) A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321, 956–960.

48 Cloonan N, Forrest AR, Kolle G, Gardiner BB, Faulkner GJ, Brown MK, Taylor DF, Steptoe AL, Wani S, Bethel G et al. (2008) Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nat Meth 5, 613–619.

49 Li H, Lovci MT, Kwon YS, Rosenfeld MG, Fu XD & Yeo GW (2008) Determination of tag density required for digital transcriptome analysis: application to an androgen-sensitive prostate cancer model. Proc Natl Acad Sci USA 105, 20179–20184.

50 Yeakley JM, Fan JB, Doucet D, Luo L, Wickham E, Ye Z, Chee MS & Fu XD (2002) Profiling alternative splicing on fiber-optic arrays. Nat Biotechnol 20, 353–358.

51 Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, Middle CM, Rodesch MJ, Albert TJ, Hannon GJ et al. (2007) Genome-wide in situ exon capture for selective resequencing. Nat Genet 39, 1522–1527.

52 Venables JP, Koh CS, Froehlich U, Lapointe E, Couture S, Inkel L, Bramard A, Paquet ER, Watier V, Durand M et al. (2008) Multiple and specific mRNA processing targets for the major human hnRNP proteins. Mol Cell Biol 28, 6033–6043.

