Structural and functional analysis of ataxin-2 and ataxin-3

Mario Albrecht1,*, Michael Golatta2,*, Ullrich Wüllner3 and Thomas Lengauer1

1Max-Planck-Institute for Informatics, Saarbrücken, Germany; 2Institute for Medical Biometry, Informatics, and Epidemiology, University of Bonn, Germany; 3Department of Neurology, University of Bonn, Germany

Spinocerebellar ataxia types 2 (SCA2) and 3 (SCA3) are autosomal-dominantly inherited, neurodegenerative diseases caused by CAG repeat expansions in the coding regions of the genes encoding ataxin-2 and ataxin-3, respectively. To provide a rationale for further functional experiments, we explored the protein architectures of ataxin-2 and ataxin-3. Using structure-based multiple sequence alignments of homologous proteins, we investigated domains, sequence motifs, and interaction partners. Our analyses focused on presumably functional amino acids and the construction of tertiary structure models of the RNA-binding Lsm domain of ataxin-2 and the deubiquitinating Josephin domain of ataxin-3. We also speculate about distant evolutionary relationships of ubiquitin-binding UIM, GAT, UBA and CUE domains and helical ANTH and UBX domain extensions.

Keywords: spinocerebellar ataxia; Machado–Joseph disease; polyglutamine disorder; ubiquitin; valosin-containing protein.

Spinocerebellar ataxia types 2 (SCA2) and 3 (SCA3) are autosomal-dominantly inherited, neurodegenerative disorders [1,2]. SCA3 has also been known as Machado–Joseph disease (MJD), and SCA2 and SCA3 belong to a heterogeneous group of trinucleotide repeat disorders. This group includes Huntington disease (HD), dentatorubral-pallidoluysian atrophy (DRPLA), and other spinocerebellar ataxia types such as SCA1, SCA7 and SCA17 [3–7]. The age of onset of SCA2 and SCA3 is in the third to fourth decade [8]. The disorders share common phenotypic features such as the degeneration of specific vulnerable neuron populations and the presence of intracellular aggregations of the mutant proteins in affected neurons. In contrast, the expression of the disease-associated genes occurs in a great variety of tissues and is not restricted to neuronal cells.

The SCA2 and SCA3/MJD genes have been mapped to chromosomes 12q24.1 and 14q32.1 [1,2]. The common underlying genetic basis of SCA2 and SCA3 is the expansion of a CAG repeat region beyond a certain threshold. These CAG repeats encode a polyglutamine (polyQ) tract in the respective proteins ataxin-2 and ataxin-3. The polyQ stretch in ataxin-2 lies near the N-terminus at the 5′-end of the coding region of exon 1 [9], but the polyQ region of ataxin-3 is contained in exon 10 close to the C-terminus [10]. While ataxin-2 is located predominantly in the Golgi apparatus [11], ataxin-3 is found in both the nucleus and the cytoplasm of cells [12].

To provide a rationale for further experiments, we characterized the protein architectures of ataxin-2 and ataxin-3 and investigated domains, sequence motifs, and interaction partners. To explore the functional implications, we assembled a multiple sequence alignment for the Lsm domain of ataxin-2 homologues including the yeast homologue Pbp1. We also constructed a 3D structural model for the RNA-binding Lsm domain of ataxin-2. Similarly, we used a structure-based multiple sequence alignment of the Josephin domain of ataxin-3 homologues to derive a 3D model of this domain and to analyse specific residues involved in deubiquitination.

Correspondence to M. Albrecht, Max-Planck-Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany. E-mail: mario.albrecht@mpi-sb.mpg.de

Abbreviations: A2BP1, ataxin-2 binding protein 1; DRPLA, dentatorubral-pallidoluysian atrophy; DUB, deubiquitinating enzymes; HD, Huntington disease; MJD, Machado–Joseph disease; NLS, nuclear localization signal; OTU, otubains; PABP, poly(A)-binding protein; RMSD, root mean square deviation; SCA, spinocerebellar ataxia; snRNPs, small nuclear ribonucleoproteins; UBP, ubiquitin-specific protease; UCH, ubiquitin C-terminal hydrolase; UIM, ubiquitin-interacting motif; VCP, valosin-containing protein.

*Note: M. Albrecht and M. Golatta contributed equally to this work.

(Received 6 April 2004, accepted 7 June 2004)

Materials and methods

Protein sequences were retrieved from the NCBI [13], Ensembl [14], and SWISS-PROT/TrEMBL (SPTrEMBL) [15] databases and protein domain architectures from the Pfam [16] and SCOP [17] databases. Sequence accession numbers are given in the respective figure legends and Tables S1 and S2. Species names are abbreviated by first letters (Table S3). Protein structures were obtained from the PDB database [18]. The secondary structure assignments of PDB structures were taken from the DSSP database [19]. A single capital letter appended to the actual PDB identifier denotes the chosen structure chain. We used the PSI-BLAST suite of programs [20] to search for homologues (E-value cut-off 0.005) and the web servers PSIPRED [21], SAM-T99 [22], and SSpro2 [23] to predict the secondary structure of proteins and to form a consensus prediction by majority voting [24]. To predict intrinsically unstructured and disordered regions in proteins, we explored the consensus of the results returned by the DisEMBL [25], DISOPRED [26], GlobPlot [27], NORSp [28] and PONDR [10] online servers. The nuclear localization signals in ataxin-3 homologues were discovered with the help of the prediction server PSORT II [29].
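The consensus prediction by majority voting mentioned above can be illustrated with a minimal sketch. It assumes that each server returns a three-state string (H = helix, E = strand, C = coil) of equal length, and that positions without a strict majority fall back to coil; the input strings and the tie rule are hypothetical and are not taken from the actual server output or from reference [24].

```python
from collections import Counter


def consensus_secondary_structure(predictions, tie_symbol="C"):
    """Column-wise majority vote over equal-length secondary structure strings."""
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        best_state, best_count = Counter(column).most_common(1)[0]
        # A state wins only with a strict majority of the votes;
        # otherwise the assumed fallback symbol is used.
        if best_count * 2 > len(column):
            consensus.append(best_state)
        else:
            consensus.append(tie_symbol)
    return "".join(consensus)


# Hypothetical three-state predictions for an 11-residue stretch.
psipred = "CHHHHCCEEEC"
sam_t99 = "CHHHHHCEEEC"
sspro2 = "CCHHHCCEECC"
consensus = consensus_secondary_structure([psipred, sam_t99, sspro2])
print(consensus)
```

With three voters, a state is accepted as soon as two servers agree, so a single outlying prediction cannot change the consensus at that position.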

Multiple sequence alignments were assembled by means of T-COFFEE [30] and improved manually by minor adjustments based on structure prediction results and pairwise structure superpositions computed by the program CE [31]. The root mean square deviations (RMSDs) were taken from the CE superpositions. We investigated the results of all state-of-the-art fold recognition methods available via the online meta-server BioInfo.PL [32], which contacts a dozen other prediction servers (the names of which are listed on the web site http://Bioinfo.PL/Meta/). The associated 3D-Jury system allows for the comparison and evaluation of the predicted 3D models in a consensus view [33]. To model the protein structure of ataxin-2 and ataxin-3, we submitted the constructed sequence–structure alignments to the 3D modelling server WHAT IF [34]. The sequence alignments depicted in the figures were prepared in the SEAVIEW editor [35] and illustrated by the web service ESPript [36]. The protein structure images were drawn in the Accelrys Discovery Studio ViewerLight. The online version of this manuscript contains supplementary material, and our web site will provide additional pictures.
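For readers unfamiliar with the RMSD values quoted from the CE superpositions, the following sketch evaluates the standard formula, RMSD = sqrt((1/N) Σ |a_i − b_i|²), on two coordinate sets that are assumed to be already superposed and paired residue by residue. It does not reproduce the CE algorithm itself, which also determines the pairing and the optimal superposition; the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, already superposed 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the two paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha positions (in Å) of two aligned residues per structure.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(structure_a, structure_b))
```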

Results and discussion

Protein architecture of ataxin-2

Ataxin-2 has 1312 residues (including 22 glutamines of the polyQ stretch) and a molecular mass of approximately 140 kDa. Ataxin-2 is a highly basic protein except for one acidic region (amino acid 254–475) containing 46 acidic amino acids (Fig. 1). This region covers roughly exons 2–7 and is predicted to consist of two globular domains named Lsm (Like Sm, amino acid 254–345) [37] and LsmAD (Lsm-associated domain, amino acid 353–475). The LsmAD domain of ataxin-2 contains both a clathrin-mediated trans-Golgi signal (YDS, amino acid 414–416) and an endoplasmic reticulum (ER) exit signal (ERD, amino acid 426–428) [11,38]. It is composed mainly of α-helices according to the results from secondary structure prediction servers.

The rest of ataxin-2 outside of the Lsm and LsmAD domains is only weakly conserved in eukaryotic ataxin-2 homologues and is predicted to be intrinsically unstructured according to the consensus result from the DisEMBL, DISOPRED, GlobPlot, NORSp and PONDR online servers. These nonglobular, flexible N- and C-terminal tails (amino acid 1–253 and 476–1312) contain the polyQ region (amino acid 166–187), several highly conserved short sequence motifs as possible protein interaction sites, and conspicuous (R)RG peptides at the C-terminus of the LsmAD domain. One of the sequence motifs constitutes a putative PABP [poly(A)-binding protein] interacting motif PAM2 (amino acid 908–925) [39], and (R)RG peptides are well known to bind RNA in other proteins [40]. The N- and C-terminal tails of ataxin-2 also have a high content of proline (179 prolines out of 1090 amino acids, 16.4%). This property and the low complexity of unstructured sequence regions may lead to several significant, but probably false-positive, hits during a PSI-BLAST search for homologues of ataxin-2. For instance, despite the use of the standard low complexity filter, our PSI-BLAST search with human ataxin-2 homologues found several questionable hits outside globular domains to homologues of the polyglutamine DRPLA gene product atrophin. Starting the PSI-BLAST search with an Arabidopsis thaliana ataxin-2 homologue (SPTrEMBL: Q94AM9), human atrophin is retrieved in the third iteration with an E-value of 5 × 10⁻¹¹. Conversely, using the rat atrophin homologue (SPTrEMBL: Q62901) as the start sequence, human ataxin-2 was detected in the second iteration with an E-value of 8 × 10⁻⁴.
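The numbers in this paragraph can be checked with a few lines of arithmetic. The sketch below recomputes the combined tail length and the proline percentage from the residue ranges given in the text, and shows that both questionable atrophin hits lie below the E-value cut-off of 0.005 stated in Materials and methods, which is why they count as formally significant. The function names and the "less than or equal" comparison are illustrative assumptions, not part of PSI-BLAST.

```python
E_VALUE_CUTOFF = 0.005  # cut-off stated in Materials and methods


def passes_cutoff(e_value, cutoff=E_VALUE_CUTOFF):
    """True if a hit would be reported as significant (assumed <= comparison)."""
    return e_value <= cutoff


# Tail residues: amino acid 1-253 plus amino acid 476-1312 (inclusive ranges).
tail_length = 253 + (1312 - 476 + 1)
proline_percent = round(100 * 179 / tail_length, 1)

# E-values quoted in the text for the two reciprocal searches.
hits = {
    "human atrophin (A. thaliana ataxin-2 homologue as query)": 5e-11,
    "human ataxin-2 (rat atrophin homologue as query)": 8e-4,
}
significant = {name: passes_cutoff(e) for name, e in hits.items()}

print(tail_length, proline_percent, significant)
```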

Fig. 1. Protein architectures of human ataxin-2, its yeast homologue Pbp1, and the P. falciparum homologue PF13_0048 of the decapping enzyme DCP2 (DCP2_Pf).

RNA binding of ataxin-2

The Lsm domain of ataxin-2 is typical of RNA-binding Sm and Sm-like proteins, which often form cyclic 6-, 7- or even 14-oligomers [41–43]. Generally, Lsm domain proteins are involved in a variety of essential RNA processing events including RNA modification, pre-mRNA splicing, and mRNA decapping and degradation. Some of them are also important components of spliceosomal small nuclear ribonucleoproteins (snRNPs).

The LsmAD domain is contained in the Pfam database with the name Ataxin-2_N and also occurs in other, as yet uncharacterized, Plasmodium falciparum/yoelii yoelii gene products PF13_0048/PY07327 without an Lsm domain (Fig. 1). Both Plasmodium gene products have an additional N-terminal DCP2 domain (also termed box A), which is always followed by a NUDIX domain [44] in all known DCP2 homologues. This NUDIX domain constitutes the catalytic subunit of the mRNA decapping holoenzyme DCP1–DCP2 [45,46].

The physiological function of ataxin-2 and closely related eukaryotic homologues in RNA processing is as yet quite unexplored [47–50]. Interestingly, ataxin-2 has been observed to interact with A2BP1 (ataxin-2 binding protein 1) [38], whose RNA-binding Caenorhabditis elegans homologue, fox-1, regulates tissue-specific alternative splicing [51]. Disruption of the human A2BP1 gene may cause epilepsy or mental retardation [52]. In addition, ataxin-2 shows significant homology to the yeast protein Pbp1 (Pab1/PABP-binding protein 1), which also contains the Lsm and LsmAD domains; regions outside of these two globular domains are predicted to be mainly unstructured in Pbp1 as in ataxin-2.

Although the C-terminal tail of Pbp1 does not contain a PAM2 motif [39], this yeast protein regulates polyadenylation after pre-mRNA splicing and interacts with the C-terminal part of the yeast homologue PAB1 of the human PABP [53]. A2BP1 and PABP are also evolutionarily related and possess RNA recognition motifs [38]. These observations strongly suggest that ataxin-2 is involved in similar mRNA processing tasks.

Structural modelling of ataxin-2

First, we compiled a list of ataxin-2 homologues including the yeast homologue Pbp1 and several Lsm domains of snRNPs and other Sm and Sm-like proteins from various species. Then, we assembled a structure-based multiple sequence alignment of the Lsm domains, crystallographically determined structures of which reveal a close structural homology between archaeal and eukaryotic proteins (Fig. 2) [42,43,54–65]. This suggests that the function and the RNA-binding mode of the Lsm domain have been preserved during evolution.

The RNA-binding Lsm domain is characterized by a conserved sequence motif consisting of two short segments known as Sm1 and Sm2, which are separated by a variable linker [66,67]. The very strong conservation of certain glycine residues is especially striking and also demonstrates the evolutionary relationship of ataxin-2 to Lsm domain proteins. The amide groups of the glycines are known to stabilize the protein structure when forming hydrogen bonds to adjacent β-strands [55]. The secondary structure predictions of ataxin-2 and its yeast homologue Pbp1 are also in good agreement with the known structure of the Lsm domain as an open β-barrel, consisting of an N-terminal α-helix followed by a strongly bent five-stranded antiparallel β-sheet with a 3₁₀-helical turn in some cases before the fifth β-strand.

The top two alignment rows in Fig. 2 show human ataxin-2 aligned with the Pyrococcus abyssi Sm1 protein (PDB identifier 1m8v, chain A), the crystal structure of which consists of a heptameric ring with a central cavity like other Lsm domain oligomers [65]. This Sm1 protein provides the only Lsm domain structure that is bound to RNA both inside and outside of the doughnut-shaped ring, at an internal and an external binding site. Therefore, we used this alignment of ataxin-2 to Sm1 to model the 3D structure of the Lsm domain of ataxin-2 in complex with RNA and Lsm domains of ataxin-2 protomers (Fig. 3).

Functional analysis of the Lsm domain

We applied the same colour scheme to functionally relevant residues shown in the multiple sequence alignment and the 3D model of ataxin-2 (Figs 2 and 3). Based on the crystal structure of Sm1 from P. abyssi bound to uridine heptamers (U7), we marked several amino acids in Sm1 that are involved in RNA binding [65] and are mostly physico-chemically conserved in ataxin-2 (Sm1/ataxin-2 residue numbers). The residues forming the internal U7 binding site are H37/K299, N39/L302 and R63/K330, while ionic interactions between K22/K284, R63/K330 and D65/S332 stabilize the RNA-binding area. The residues involved in the external U7 binding site are R4/R266, H10/T272 and Y34/Y296, stabilized by a hydrogen bond between H10/T272 and Y34/Y296. It is interesting to note that Sm1 from P. abyssi and from Archaeoglobus fulgidus (PDB identifier 1i4k, chain A) share identical RNA-binding residues except for H10, which is replaced by an asparagine [59,65].

Furthermore, we investigated whether ataxin-2 may also form oligomers through the Lsm domain. To this end, we used the detailed crystal structure analyses of the very similar snRNP heterodimers D1–D2 and D3–B [55]. Because of analogous intermolecular interactions in both dimers, we focused on the complex of D3 with B. This complex is stabilized mainly by the pairing of the fifth β-strand (β5) from D3 with the fourth β-strand (β4) from B (D3/ataxin-2–B/ataxin-2): R69/V335–R73/K330, L71/V337–L71/L328, and L73/F339–L69/S326. In addition, two hydrophobic clusters formed by residues of D3 and B contribute to the stability of the dimer. The first cluster includes F70/V336 and I72/Q338 (both in β5 strand) of D3 and F27/Y289 (β2 strand), L67/M324, V70/I327 and L72/L328 (all in β4 strand) of B. The second cluster consists of P6/M267, L10/L271 (both in α-helix), V18/C279 (β1 strand), L32/F293 (β2 strand), I33/K294 (loop after β2 strand), I68/F334, L71/V337 and L73/F339 (all in β5 strand) of D3 and I41/L304, C43/A306 (both in β3), L69/S326 and L71/L328 (both in β4) of B. Stacking interactions between guanidinium groups of arginines R69/V335 of D3 and R25/G287 and R49/T312 of B as well as an ionic interaction between E21/Q282 of D3 and R65/S322 of B stabilize the dimer further. However, the latter salt bridge is not observed in the D1–D2 complex despite identical amino acids. Altogether, the degree of conservation of amino acids relevant for heterodimerization is only moderate, but still suggests that ataxin-2 may form Lsm domain oligomers.
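Residue pairs such as H37/K299 are read off the alignment column by column: a template residue and the ataxin-2 residue in the same column form one pair. A minimal sketch of this bookkeeping is given below. The two aligned strings, the start numbers, and the function names are hypothetical and serve only to show how gaps shift the numbering; they are not taken from Fig. 2.

```python
def map_residues(template_aln, target_aln, template_start, target_start):
    """Map template residue numbers to (template residue, target number, target residue)."""
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned sequences must have the same length")
    mapping = {}
    t_num, q_num = template_start, target_start
    for t_res, q_res in zip(template_aln, target_aln):
        t_gap = t_res == "-"
        q_gap = q_res == "-"
        if not t_gap and not q_gap:
            mapping[t_num] = (t_res, q_num, q_res)
        # Residue counters advance only where a residue is actually present.
        if not t_gap:
            t_num += 1
        if not q_gap:
            q_num += 1
    return mapping


def pair_label(mapping, template_number):
    """Format a pair in the 'template/target' style used in the text, e.g. R3/K103."""
    t_res, q_num, q_res = mapping[template_number]
    return f"{t_res}{template_number}/{q_res}{q_num}"


# Hypothetical five-column alignment; template starts at 1, target at 100.
mapping = map_residues("MK-RH", "MRTK-", template_start=1, target_start=100)
print(pair_label(mapping, 3))
```

Template residues aligned to a gap have no counterpart and are simply absent from the mapping, which is one way to flag positions where a functional residue of the template cannot be transferred to the model.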

Protein architecture of ataxin-3

The longest splice variant of ataxin-3 possesses 376 amino acids (including 22 glutamines of the polyQ stretch, amino acid 296–317) and an approximate molecular weight of 42 kDa. Ataxin-3 consists of a globular deubiquitinating N-terminal Josephin domain (amino acid 1–170) [68,69] and a flexible C-terminal tail containing two ubiquitin-interacting motifs (UIMs) [70] (also termed LALAL motifs and PUBs [71], amino acid 223–240 and 243–260) and the polyQ region (amino acid 296–317) (Fig. 4) [72]. A slightly shorter alternative splice variant of ataxin-3 with 373 amino acids has a third UIM (amino acid 334–351) at the C-terminus. An as yet uncharacterized ataxin-3 paralogue on the X chromosome (sequence identity 70%) is expressed in testis (ataxin-3t) [10]. The Josephin domain is also found without a C-terminal tail in other, as yet uncharacterized, proteins named josephins (Fig. 5) [73].

A highly conserved, putative nuclear localization signal (NLS) is found upstream of the polyQ stretch (RKRR, amino acid 282–285), which may be bipartite in the Caenorhabditis elegans homologue of ataxin-3, consisting of 17 residues (RRDRQKFLERFEKKKEE, amino acid 296–312). This NLS follows a potential casein kinase II (CK-II) phosphorylation site (TSEE, amino acid 277–280), which may determine the rate of the observed ataxin-3 transport into the nucleus [74]. Ataxin-3 may also contain a nuclear export signal (NES) following the Josephin domain (ADQLLQMIRV, amino acid 174–183) based on our comparison with a published sequence profile of nuclear export signals [75]. Furthermore, ataxin-3 contains several conserved sequence motifs similar to NR- and CoRNR-boxes L-x-x-L-L/[IL]-x-x-[IV]-I of transcriptional coactivators and corepressors, respectively [73]. Indeed, ataxin-3 interacts with histones and the histone acetyltransferases CBP, p300, and PCAF, which work as transcriptional coactivators. In particular, dependent on these cofactors, ataxin-3 represses histone acetylation and transcription [76], and altered protein acetylation has already been implicated in polyglutamine disease processes [77]. Generally, the (de-)ubiquitination of histones has been linked to transcriptional regulation [78], which may also explain the observed interactions of ataxin-3.
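Short linear motifs of this kind can be located with simple pattern matching. The sketch below translates the NR-box and CoRNR-box patterns quoted above into regular expressions and adds a casein kinase II pattern, [ST]-x-x-[DE], which is an assumption based on the commonly used PROSITE-style consensus rather than on this paper. The test sequence is a hypothetical toy peptide, not the ataxin-3 sequence, and the function name is illustrative.

```python
import re

MOTIFS = {
    "NR-box": "L..LL",           # L-x-x-L-L, as written in the text
    "CoRNR-box": "[IL]..[IV]I",  # [IL]-x-x-[IV]-I, as written in the text
    "CK-II site": "[ST]..[DE]",  # assumed consensus, not stated in the text
}


def scan_motifs(sequence, motifs=MOTIFS):
    """Return (motif name, 1-based start position, matched peptide), sorted by position."""
    hits = []
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping occurrences are all reported.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy peptide containing one instance of each motif.
toy_sequence = "GATSEEAALKELLGGIRDVIG"
for hit in scan_motifs(toy_sequence):
    print(hit)
```

Patterns this short match by chance in many proteins, so hits are only candidates; the conservation across homologues discussed in the text is what makes individual sites plausible.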

Ataxin-3 is evolutionarily conserved in eukaryotes including P. falciparum and plants, but not yeast. The P. falciparum homologue PFL1295w of ataxin-3 (ataxin-3_Pf), whose gene expression is upregulated similarly to the P. falciparum josephin homologue PF11_0125 in gametocytes [79–81], constitutes an exception because it has only the second UIM conserved (amino acid 250–267) and has an additional ubiquitin-like UBX domain [82–85] at the C-terminus (amino acid 271–381) instead of the polyQ-containing region [69]. Like human ataxin-3, this ataxin-3 homologue PFL1295w also has a potential casein kinase II phosphorylation site (TSDE, amino acid 278–281) close to basic amino acids, which can be indicative of an NLS (KKIH, amino acid 293–296) near the N-terminus of the UBX domain. In contrast, the prediction server PSORT II returns another region inside the UBX domain as a possible NLS (PRRK, amino acid 339–342). It is unclear which NLS motif may be functionally more relevant because both NLS motifs correspond to amino acids at solvent-exposed N-termini of the second and fourth β-strand in the crystal structure of the UBX domain of the cofactor p47 (PDB identifier 1s3s) [86]. Similar to the P. falciparum homologue, the Cryptosporidium parvum homologue of ataxin-3 also possesses only one UIM motif (amino acid 266–283) and a C-terminal UBX domain (amino acid 288–397) instead of a polyQ region.

Fig. 3. 3D model of the Lsm domain of ataxin-2 using three adjacent protomers of the Sm1 protein from P. abyssi as template (PDB identifier 1m8v, chains A, B and G). The model illustrates predicted internal (blue) and external (green) binding sites of ataxin-2 to RNA (grey). α-Helices are shown in red, β-strands are shown in cyan. Only functionally relevant residues of the central ataxin-2 protomer are annotated as follows: dark blue boxes point to residues forming the internal site, and light blue boxes mark amino acids stabilizing the RNA binding area; dark green boxes highlight residues involved in the external site, and light green ones indicate stabilizing hydrogen bonds.

Ubiquitin binding of ataxin-3

Ubiquitination fulfills many cellular functions in cytoplasmic trafficking, guiding specific proteins through the endocytic pathways, and targeting proteins to the proteasome [84,87–93]. Above all, the ubiquitin–proteasomal pathway is involved in processing mutant or damaged proteins that cause neurodegenerative diseases. The small ubiquitin protein can be covalently linked to other proteins as a single molecule or polyubiquitin chain.

Recently, the two UIMs between the Josephin domain and the polyQ stretch of ataxin-3 have been shown to be capable of binding tetraubiquitin and polyubiquitinated proteins [68,94–97]. In our previous study, we used the C-terminal ANTH domain extension, which consists of an antiparallel three-helix bundle, to model the structure of the UIMs in the C-terminal tail of ataxin-3 [73]. In fact, novel structure determinations have shown that UIM peptides are α-helices and can form helix bundles in the crystal structure [98]. In contrast, the NMR solution structures of UIM peptides reveal that they are single amphipathic α-helices connected by unstructured linkers [99,100]. The latter observation is in agreement with the observed flexibility of the C-terminal tail of ataxin-3 [72].

Furthermore, the ANTH domain itself is evolutionarily, structurally, and functionally related to a VHS domain [101]. Lately, the structure of the GAT (GGAs and Tom1) domain directly following the VHS domain of Tom1 and GGAs (Golgi-associated, γ-adaptin ear-containing, Arf-binding proteins) was determined crystallographically [102–105]. The GAT domain contains a three-helix bundle, which we found to superimpose very well with the helical bundle of the C-terminal ANTH domain extension (RMSD 3.1 Å, PDB identifiers 1o3x and 1hx8, A chains).

Interestingly, the GAT domains of GGAs and Tom1 have been reported to interact with ubiquitin [106–108]. The corresponding ubiquitin binding site was located to the third α-helix of the GAT three-helix bundle, and hydrophobic amino acids like leucines are important for the interaction (Fig. 5). The same residue type also plays an essential role in binding ubiquitin to the UIM α-helix [98–100] and the third α-helix of the helical bundle in the homologous CUE and UBA domains [109]. However, the sequence similarity is quite low, and thus it is difficult to deduce an evolutionary relationship, although the ubiquitin binding sequence of GGAs and Tom1 resembles a noncanonical UIM whose otherwise strictly conserved serine residue is replaced by an asparagine except in the case of human GGA3 (Fig. 5).

Further interaction partners of ataxin-3

It has been shown that ataxin-3 interacts with the ubiquitin-like (UBL) domain of the homologous ubiquitin- and proteasome-binding factors hHR23A and hHR23B, whose yeast orthologue is Rad23 [96,110–112]. The latter factors are also involved in the nucleotide excision repair pathway by targeting the ubiquitinated nucleotide excision repair factor XPC/Rad4 to the proteasome [113]. Their UBL domain binds to a UIM helix of the 26S proteasome subunit S5a, and this interaction disrupts the interdomain contacts between the N-terminal ubiquitin-mimicking UBL domain and the two C-terminal ubiquitin-binding UBA domains, thereby inducing the change from a closed to an open protein conformation [109,111,114,115]. Rad23 and the yeast orthologue Rpn10 of S5a serve as alternative ubiquitin receptors for the proteasome [116], and the UBA domains of Rad23 inhibit proteasome-catalysed proteolysis by sequestering Lys48-linked polyubiquitin chains [117,118]. In particular, the NMR solution structures of the UBL domain of hHR23A/B bound to a UIM peptide of S5a [99,119] could be used to model the complex of hHR23A/B and ataxin-3. Similarly, the complex of a UIM of ataxin-3 with ubiquitin could be modelled based on the NMR solution structure of the UIM of the Vps27 protein bound to ubiquitin [100].

Fig. 4. Protein architectures of human ataxin-3, its P. falciparum homologue PFL1295w (ataxin-3_Pf), and human josephin 1.

The C-terminal region of ataxin-3 including the polyQ region interacts with the N-terminal cofactor/substrate-binding adaptor domain of the valosin-containing protein VCP/p97/Cdc48/VAT/ter94 [96,120–123]. VCP is an important multifunctional AAA+ ATPase with two C-terminal ATPase domains after the adaptor domain, which provide the energy for major conformational changes [124]. VCP forms hexamers and works as a molecular chaperone involved in a variety of intracellular functions including cell cycle progression, membrane fusion, vesicle-mediated transport, transcription activation, apoptosis prevention, and ubiquitin-proteasome degradation, modulating polyglutamine-induced neurodegeneration [96,120–123,125–127]. VCP binds the ubiquitin E3 ligase and the chain assembly factor UFD2a/E4B, which is a U box homologue of yeast Ufd2 [128], and interacts with and regulates the degradation of the proteasome-associated ataxin-3, forming a trimeric complex of ataxin-3, VCP, and UFD2a [96,127,129–131]. Interestingly, Ufd2 binds the UBL domain of Rad23 and competes with Rad23 for binding to the Rpn1 proteasome subunit, while the N-terminal UBL domain of the ubiquitin C-terminal hydrolase Ubp6 interacts with Rpn1 without competition with Rad23 [116,132].

Furthermore, VCP also binds the C-terminal UBX domain of the membrane fusion adaptor p47/SHP1/EYC/Ubx3 [85,86,133], which consists of three domains UBA-SEP-UBX [134]. The crystallographically determined complex of the N-terminal adaptor domain of VCP with this UBX domain (PDB identifier 1s3s) indicates the interacting residues [86] and could be used to model the putative complex of VCP with the C-terminal UBX domain of the ataxin-3 homologue from P. falciparum (ataxin-3_Pf). Like the UBX domain of p47, ataxin-3_Pf contains the conserved loop that is essential for an interaction with VCP because it inserts into a hydrophobic pocket of VCP [86]. The UBX domain structure of p47 is extended at its N-terminus by a disordered peptide structure and an additional α-helix of as yet unknown functional relevance [86]. The length of this α-helix is similar to a UIM α-helix (Fig. 5), and such a UIM also precedes the UBX domain of ataxin-3_Pf. Therefore, this α-helix of p47 might be related to the second UIM in ataxin-3 homologues (recall that the first UIM is missing in ataxin-3_Pf). In addition, the arrangement of one UIM helix followed by a C-terminal UBX domain is also found in the cofactor Ubx2 with domain architecture UBA-UAS-UIM-UBX [133]. The UIM of Ubx2 binds ubiquitin chains, and the UBX domain interacts with VCP. Thus the same interactions can be expected for ataxin-3_Pf.

Fig. 5. Multiple sequence alignment of UIM peptides, divided into groups by horizontal lines from top to bottom: UIM sequences of the Pfam seed alignment including first, second, and third UIMs of ataxin-3 homologues, UIM-like peptides from GGAs and Tom1, and related AP180 sequences. The latter are derived from the 3D structure superposition of the GAT domain of human GGA1 with the AP180 extensions from Rattus norvegicus and D. melanogaster (PDB identifiers 1hf8 and 1hx8, respectively). The second group of UIMs in ataxin-3 homologues also includes the similar N-terminal α-helix of the UBX domain extension of p47 (PDB identifier 1s3s). For each group, amino acids in alignment columns with a majority of identical residues are printed on a black background, and similar amino acids are highlighted in grey.
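The shading rule stated in the Fig. 5 legend (columns with a majority of identical residues) can be sketched as follows. The sketch assumes that "majority" means more than half of all rows in a group, gaps included; the exact threshold applied by the alignment rendering service is not given in the text, and the four toy peptides are hypothetical rather than taken from Fig. 5.

```python
from collections import Counter


def majority_columns(alignment, gap="-"):
    """Return (column index, residue) for columns where one residue is in the strict majority."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for index, column in enumerate(zip(*alignment)):
        residues = [r for r in column if r != gap]
        if not residues:
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Strict majority relative to the number of rows (assumed rule).
        if count * 2 > len(column):
            result.append((index, residue))
    return result


# Hypothetical group of four aligned peptides.
toy_group = ["EELLA", "EDLQA", "QELLS", "E-LKA"]
print(majority_columns(toy_group))
```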

The C-terminal, presumably VCP-binding, UBX domain of ataxin-3_Pf appears to correspond to the VCP-binding C-terminal part of human ataxin-3, which follows the second UIM and includes the polyQ region [120,123,131]. In addition, the polyQ tract of ataxin-3 has been shown to be indispensable for the interaction with VCP, and its length correlates with the strength of the interaction. These observations raise the question of how human ataxin-3 binds VCP in contrast to its P. falciparum homologue. This is particularly interesting because VCP may suppress polyQ-induced neurodegeneration, and mutations in VCP have been observed to cause cytoplasmic vacuoles followed by cell death because of a dysfunctional second ATPase domain and inclusion body formation [120–123,127,135,136]. We also observed that all VCP sequence variations associated with Paget disease of bone and frontotemporal dementia (IBMPFD) [135] are not located in the binding interface of a UBX domain with the N-terminal adaptor domain of VCP, but are involved in interactions between protein regions (for details see the online supplement). Therefore, motions of the adaptor domain, which are essential for proper VCP function [124,127], may be impaired by IBMPFD-associated mutations.

According to a recent yeast-2-hybrid screen [137], a josephin homologue from Drosophila melanogaster (CG3781) on the X chromosome interacts with the heat shock protein HSP60b (CG2830), which is involved in spermatogenesis [138,139], suppresses ubiquitination [140] and associates with 38 further proteins including a ubiquitin E3 ligase, but no other deubiquitinating enzyme except josephin (CG8184). Interestingly, HSP40 and HSP70 chaperones have already been observed to associate with VCP, and they also colocalize with intranuclear ataxin-3 aggregates and may play an important role in the disease process and the impairment of the ubiquitin-proteasome system [121,141–149].

Structural modelling of the Josephin domain

Recently, it has been observed that the Josephin domain contains highly conserved amino acids reminiscent of the catalytic residues of a deubiquitinating cysteine protease [69], and first experimental results support this functional hypothesis [68]: decrease of polyubiquitination of ¹²⁵I-labelled lysozyme by removal of ubiquitin, cleavage of the ubiquitin protease substrate ubiquitin-AMC, and binding of the specific ubiquitin protease inhibitor ubiquitin-aldehyde (Ubal). Mutating the catalytic cysteine in ataxin-3 inhibits these functions [68].

Previously, we modelled the 3D structure of ataxin-3 based on the ANTH domain [150] of the adaptin AP180 as structural template [73]. However, this prediction has to be revised with regard to the N-terminal Josephin domain because of the identified cysteine protease signature [69]. In contrast to our previous prediction [73], which relied on the secondary structure prediction from a single server, we now formed the consensus result of the three state-of-the-art secondary structure prediction servers PSIPRED, SAM-T99, and SSpro2. All three online servers basically returned the same secondary structure for human ataxin-3 and josephin 1, resulting in a much more reliable secondary structure prediction of β-strands besides α-helices. We propose that the increased accuracy of this prediction is due, at least in part, to a substantial growth of protein sequence and structure databases. The predicted β-strands in the Josephin domain corroborate a cysteine protease fold of deubiquitinating enzymes (DUBs) and do not support the ANTH domain structure consisting solely of α-helices. In hindsight, the fold recognition methods applied in the past to predict the structure of ataxin-3 may have been misguided by the pronounced prediction of α-helices only.

DUBs process ubiquitin proteolytically at the C-terminus and can be divided into at least two evolutionarily related families of cysteine proteases, UBPs (ubiquitin-specific proteases) and UCHs (ubiquitin C-terminal hydrolases) [151,152]. However, new ubiquitin-specific families such as otubains (OTU) and JAMMs with low sequence similarity to known DUBs are still being discovered [151]. A consensus of fold recognition servers now selects both available UCH domain structures of human UCH-L3 [153] and yeast YUH1 [154], which superimpose with a low RMSD of 2.0 Å (PDB identifiers 1uch and 1cmxA, respectively), as best modelling templates with a moderate confidence score for human josephin 1, but still with only a weak score for ataxin-3. The pairwise sequence–structure alignments returned by the structure prediction servers for 3D modelling differ mainly in the central part of the Josephin domain (amino acids 47–117 in ataxin-3) aligned to DUBs. This finding underpins the distant relationship of the Josephin domain to known DUBs. The central part does not contain catalytic residues and is thus less conserved, containing insertions of variable length and structure in other cysteine proteases [155].
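For readers unfamiliar with the RMSD value quoted above, the following sketch shows how a root-mean-square deviation is computed for two sets of paired atom coordinates. It assumes that the two structures have already been superimposed and that equivalent atoms (for example Cα atoms) have been paired by a structural alignment; neither step is shown here. The function name and the toy coordinates are hypothetical and are not taken from the 1uch or 1cmx entries.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates, given in angstroms and already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: two atom pairs, each displaced by 2 angstroms along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
toy_rmsd = rmsd(toy_a, toy_b)
```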

Based on a multiple sequence alignment of Josephin domain homologues (Fig. 6), we used the crystallographically determined structure of YUH1 bound to the ubiquitin-like inhibitor Ubal (PDB identifier 1cmx, chains A and B, respectively) to model the tertiary structure of the Josephin domain of ataxin-3 in complex with Ubal (Fig. 7). Thus, the structure of ataxin-3 is predicted to be distinct from the finger–palm–thumb architecture of UBPs such as USP7/HAUSP [156]. Because of the low degree of conservation in the central part, we believe that ataxin-3 and josephin 1 adopt slightly different structures in this part, which are not very similar to YUH1. In addition, we observed that the Josephin domain also resembles the OTU domain because both have a highly conserved histidine three residues downstream of the catalytic cysteine. Interestingly, like ataxin-3, the deubiquitinating OTU domain protein VCIP135 interacts with the N-terminal adaptor domain of VCP through the C-terminal tail including a UBL domain and dissociates p47 from the complex with VCP during ATP hydrolysis of VCP [157,158]. This observation also indicates a close functional relationship of the homologous ubiquitin-like UBL and UBX domains.
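The cysteine/histidine spacing mentioned above (a histidine three residues downstream of the catalytic cysteine, as in C14 and H17 of ataxin-3) can be checked on any sequence with a short scan. The sketch below is illustrative only: the function name is hypothetical, and the 20-residue string is a placeholder built so that a cysteine sits at position 14 and a histidine at position 17. It is not the real ataxin-3 sequence.

```python
def cys_his_motif_positions(sequence, offset=3):
    """Return 1-based positions of cysteines that are followed by a
    histidine exactly `offset` residues downstream."""
    hits = []
    for index, residue in enumerate(sequence):
        downstream = index + offset
        if (
            residue == "C"
            and downstream < len(sequence)
            and sequence[downstream] == "H"
        ):
            hits.append(index + 1)
    return hits


# Placeholder fragment (NOT the real protein): C at 14, H at 17, C at 18.
toy_fragment = "M" + "A" * 7 + "Q" + "A" * 3 + "LCAAHCAA"
toy_hits = cys_his_motif_positions(toy_fragment)
```

A match of this kind is only a weak indicator on its own; in the text it is interpreted together with the alignment and the structural model.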

Functional analysis of the Josephin domain

The active site of UCHs is divided into two parts as follows (YUH1/ataxin-3 residue numbers) [153,154]: The N-terminal part consists of a glutamine (Q84/Q9) upstream of a cysteine (C90/C14), both of which form an oxyanion hole to accommodate the negative charge on the substrate carbonyl oxygen during catalysis. The C-terminal part contains a histidine (H166/H119), which is thought to be deprotonated, and an asparagine or aspartate (D181/N134), both of which activate the side chain of the cysteine to unleash a nucleophilic attack on the carbonyl carbon atom of the scissile peptide bond. The cysteine, histidine, and asparagine/aspartate constitute the catalytic triad characteristic of cysteine proteases such as papain.

Fig. 6. Structure-based multiple sequence alignment of the Josephin domains of ataxin-3 homologues with the crystallographically determined UCH domains of human UCH-L3 and yeast YUH1. The known DSSP secondary structure assignments of UCH-L3 and YUH1 are shown at the top of the alignment (curled lines for α-helix, arrows for β-strands). The corresponding consensus secondary structure predictions for human ataxin-3 and josephin 1 are also depicted. Alignment columns with identical residues are highlighted in purple-coloured boxes, those with more than 50% physico-chemically similar amino acids in yellow boxes (bold-printed letters). Text labels (including UCH-L3/YUH1 and ataxin-3/josephin 1 residue numbers) point to catalytic residues (four grey-shaded boxes) and to other highly conserved amino acids in the Josephin domain. The PDB/SPTrEMBL identifiers of UCH-L3 and YUH1 are 1uch/P15374 and 1cmxA/P35127, respectively. NCBI or Ensembl accession numbers for Josephin domain homologues are given in Table S3.

While all four discussed catalytic residues are strictly conserved in the Josephin domain (Fig. 6), a functionally relevant disordered loop (E144–N164/V79–Q100) positioned over the catalytic cleft is aligned in the less conserved central part. This loop maintains an inaccessible active site, but becomes ordered upon binding of Ubal [154]. Therefore, it may control substrate specificity together with further strongly conserved amino acids such as N88/L13, which forms hydrogen bonds with main chain groups of the loop, and Y167/W120 next to the catalytic histidine [154]. Unfortunately, the structure of the central part and the function of the loop remain unclear for the Josephin domain because of insufficient sequence similarity to UCHs. The Josephin domain is also missing the N-terminal extensions of UCHs, which are involved in substrate recognition [154]. In addition, the functional relevance of a second strictly conserved histidine H17, two highly conserved asparagines N20 and N21, and another identical glutamine Q24 downstream of the catalytic cysteine C14 cannot be derived from the structural model of the Josephin domain either (Figs 6 and 7). However, considering their distance from the active site and their location inside the protein, they may be important solely for the stability of the domain fold. This may also hold true for the strictly conserved S135 and P140 after the catalytically active N134. In contrast, it is easy to interpret an alternative splice variant of ataxin-3 [10], which lacks the residues from E10 to Q64, including the catalytic cysteine, and thus cannot possess proteolytic activity.
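The reasoning about the splice variant can be restated as a simple range check: a catalytic residue is lost whenever its position falls inside the deleted segment. The sketch below uses the ataxin-3 residue numbers given in the text (Q9, C14, H119, N134) and the deletion E10 to Q64; the function and variable names are hypothetical.

```python
# Catalytic residues of the Josephin domain of ataxin-3 as numbered in the text.
CATALYTIC_RESIDUES = {"Q9": 9, "C14": 14, "H119": 119, "N134": 134}


def residues_lost(residues, first_deleted, last_deleted):
    """Return the names of residues whose positions lie within the
    deleted segment (both boundaries inclusive)."""
    return [
        name
        for name, position in residues.items()
        if first_deleted <= position <= last_deleted
    ]


# Splice variant lacking residues E10 to Q64.
lost_in_variant = residues_lost(CATALYTIC_RESIDUES, 10, 64)
```

Only C14 falls inside the deleted range, which is consistent with the statement that the variant lacks the catalytic cysteine while Q9, H119, and N134 are retained.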
