exemplified by a novel neuronal protein, CASK-interactive protein1
Annamária Balázs1,*, Veronika Csizmok2,*, László Buday1,2, Marianna Rakács2, Robert Kiss3,
Mónika Bokor4, Roopesh Udupa2, Kálmán Tompa4 and Peter Tompa2
1 Department of Medical Chemistry, Semmelweis University Medical School, Budapest, Hungary
2 Biological Research Center, Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
3 Laboratory of Structural Chemistry and Biology, Institute of Chemistry, Eötvös Loránd University, Budapest, Hungary
4 Research Institute for Solid State Physics and Optics, Hungarian Academy of Sciences, Budapest, Hungary
Keywords
anchor; docking; post-synaptic density;
scaffold; unstructured
Correspondence
P Tompa, Institute of Enzymology,
Biological Research Center, Hungarian
Academy of Sciences, Karolina ut 29, 1113
Budapest, Hungary
Fax: +36 1 466 5465
Tel: +36 1 279 3143
E-mail: tompa@enzim.hu
*These authors contributed equally to this
work
(Received 26 February 2009, revised 15
April 2009, accepted 12 May 2009)
doi:10.1111/j.1742-4658.2009.07090.x
CASK-interactive protein1 is a newly recognized post-synaptic density protein in mammalian neurons. Although its N-terminal region contains several well-known functional domains, its entire C-terminal proline-rich region of 800 amino acids lacks detectable sequence homology to any previously characterized protein. We used multiple techniques for the structural characterization of this region and its three fragments. By bioinformatics predictions, CD spectroscopy, wide-line and ¹H-NMR spectroscopy, limited proteolysis and gel filtration chromatography, we provided evidence that the entire proline-rich region of CASK-interactive protein1 is intrinsically disordered. We also showed that the proline-rich region is biochemically functional, as it interacts with the adaptor protein Abl-interactor-2.
To extend the finding of a high level of disorder in this scaffold protein, we collected 74 scaffold proteins (also including proteins denoted as anchor and docking), and predicted their disorder by three different algorithms.
We found that a very high fraction (53.6% on average) of the residues fall into locally disordered regions and that their ordered domains are connected by linker regions which are mostly disordered (64.5% on average). Because of this high frequency of disorder, the usual design of scaffold proteins of short globular domains (86 amino acids on average) connected by longer linker regions (140 amino acids on average) and the noted binding functions of these regions in both CASK-interactive protein1 and the other proteins studied, we suggest that structurally disordered regions prevail and play key recognition roles in scaffold proteins.
Structured digital abstract
• MINT-7034649: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with L1CAM (uniprotkb:P32004) by two hybrid (MI:0018)
• MINT-7034677: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with NCK1 (uniprotkb:P16333) by two hybrid (MI:0018)
• MINT-7034706: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with Stathmin-3 (uniprotkb:Q9NZ72) by two hybrid (MI:0018)
Abbreviations
Abi2, Abl-interactor-2; Caskin1, CASK-interactive protein1; CBP, CREB-binding protein; GFP, green fluorescent protein; GST, glutathione transferase; IDP, intrinsically disordered protein; IUP, intrinsically unstructured protein; MAPK, mitogen-activated protein kinase; PRD, proline-rich region; PSD, post-synaptic density; SAM, sterile α motif; Ste5, Sterile 5.
Recently, novel scaffold proteins have been discovered in the brain, particularly in neuronal cells, and are referred to as the CASK-interactive protein (Caskin) family [1]. Caskin1 and its isoform Caskin2, which are present in the post-synaptic density (PSD), are multi-domain proteins possessing six ankyrin repeats, two sterile α motifs (SAM domains) and a single SH3 domain in the N-terminal part (cf. Fig. 1). In contrast, there are no recognizable domains in the C-terminal part, which is dominated by a long proline-rich region [1], designated as the proline-rich domain (PRD) in this work. Caskin1 can bind the Cask adaptor protein [1], Abl-interactor-2 (Abi2), and another nine proteins shown in this work, and is presumably involved in the signal pathway related to the Abl tyrosine kinases (A. Balazs, V. Csizmok, P. Tompa, R. Udupa & L. Buday, unpublished results). The molecular mechanism of the function of Caskins is not known at
Fig. 1. The diagram at the bottom shows a schematic representation of the domain structure of Caskin1. The N-terminal half contains six ankyrin repeats, one SH3 domain and the two SAM domains, whereas the C-terminal half contains no recognizable domain, and has been designated as a proline-rich region/domain (PRD). The proline-rich region was cut into three parts (PRD1, PRD2 and PRD3), cloned and characterized individually in this work. Above the scheme is the prediction by the IUPred algorithm, which shows that the entire PRD region is probably intrinsically disordered (the score is above 0.5).
• MINT-7034579: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with ABI2 (uniprotkb:Q9NYB9) by two hybrid (MI:0018)
• MINT-7034720: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with Synaptotagmin (uniprotkb:P21579) by two hybrid (MI:0018)
• MINT-7034691: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with Neurexin-2 (uniprotkb:Q9P2S2) by two hybrid (MI:0018)
• MINT-7034617: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with CASK (uniprotkb:P07498) by two hybrid (MI:0018)
• MINT-7034748: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with SIAH1 (uniprotkb:Q8IUQ4) by two hybrid (MI:0018)
• MINT-7034663: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with Myosin-Ib (uniprotkb:O43795) by two hybrid (MI:0018)
• MINT-7034734: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with Septin-4 (uniprotkb:O43236) by two hybrid (MI:0018)
• MINT-7034634: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with EPHA2 (uniprotkb:P29317) by two hybrid (MI:0018)
• MINT-7034765, MINT-7034783: Caskin1 (uniprotkb:Q8VHK2) physically interacts (MI:0915) with ABI2 (uniprotkb:Q9NYB9) by pull down (MI:0096)
present. As a result of their large size, the capacity to bind multiple partners and the lack of catalytic domains, they probably fall into the class of scaffold proteins, which bind components of a signal transduction pathway simultaneously and ensure the specificity and efficiency of signal propagation [1]. As the long regions of these proteins often lack any sequence similarity to other proteins and appear to lack folded structural domains, we anticipated that structural disorder may be a general feature of scaffold proteins. In a recent review, structural disorder in several scaffold proteins and in other proteins with multiple binding partners (without adherence to the accepted definition of scaffolds) has been suggested and analysed [2].
As a result of the rapid advance of knowledge on intrinsically disordered/unstructured proteins (IDPs/IUPs), the concept of protein disorder has gained general recognition recently [3–6]. Physical evidence exists for the disorder of about 500 proteins [7], and bioinformatics predictions suggest that disorder is prevalent in the proteome of eukaryotes, with more than 10% of their proteins being fully disordered [8–10]. Disorder is most often implicated in signalling and regulatory functions, and its functional benefits often manifest themselves in protein–protein recognition [5,11]. One advantage often referred to is that their extended structure enables IDPs to have a large interaction capacity with small protein size [12], which might be directly related to the involvement of disorder in scaffold proteins. In fact, there is an elevated level of disorder in hub proteins, i.e. proteins involved in multiple interactions [13–16], and disorder increases with the number of proteins in multiprotein complexes [17]. The functional role of structural disorder has been noted in a few scaffold proteins, such as Sterile 5 (Ste5) [18], BRCA1 [19], CREB-binding protein (CBP) [4] and Mypt1 [20], and it has been suggested that flexibility provided by disorder is instrumental in overcoming steric hindrance in the assembly of large multiprotein complexes [21].
Motivated by the apparent relationship between protein disorder and scaffold function, in this article, we provide evidence that the proline-rich region of the newly recognized Caskin1 is intrinsically disordered. To extend this finding, we also collected 74 scaffold proteins (also including proteins denoted as anchor and docking) and examined them by three different disorder predictor algorithms, i.e. IUPred [22,23], VSL2 [24,25], and FoldIndex [26]. We found that, in these proteins, the frequency of disorder is very high (49.7%, 63.36% and 47.82% predicted by IUPred, VSL2 and FoldIndex, respectively), which is similar to that in the most disordered functional class, RNA chaperones [27]. The implications of these findings with respect to the function of Caskin1, and of scaffold proteins in general, are discussed.
Results
Structural characterization of Caskin1 fragments
As described in the introductory paragraphs, the N-terminal half of Caskin1 contains a number of well-known domains involved in protein–protein interaction, such as the ankyrin repeats, SH3 and SAM domains (Fig. 1). The three-dimensional structures of these domains have been well characterized [28–30]. However, the C-terminal part of Caskin1 does not contain any recognizable domain, but possesses several proline-rich stretches. Because proline is incompatible with repetitive secondary structural elements [31] and is known to be enriched in IDPs [5], we assumed that the C-terminus of Caskin1 might be intrinsically disordered. This expectation was first supported by bioinformatics predictions by the IUPred algorithm (Fig. 1). High IUPred scores indicate that the entire proline-rich region of Caskin1 (amino acids 603–1430) is disordered.
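The criterion used in Fig. 1 (a residue is counted as disordered when its score is above 0.5) can be illustrated with a minimal sketch. The function names and the toy score list below are hypothetical and are not IUPred output; only the 0.5 cut-off is taken from the figure legend.

```python
from typing import List, Sequence, Tuple


def disordered_fraction(scores: Sequence[float], threshold: float = 0.5) -> float:
    """Return the fraction of residues whose score is above the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s > threshold) / len(scores)


def disordered_segments(
    scores: Sequence[float], threshold: float = 0.5, first_residue: int = 1
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) residue ranges scoring above the threshold."""
    segments: List[Tuple[int, int]] = []
    start = None  # index where the current disordered run began, if any
    for index, score in enumerate(scores):
        if score > threshold and start is None:
            start = index
        elif score <= threshold and start is not None:
            segments.append((start + first_residue, index - 1 + first_residue))
            start = None
    if start is not None:  # close a run that reaches the end of the sequence
        segments.append((start + first_residue, len(scores) - 1 + first_residue))
    return segments


# Hypothetical per-residue scores, for illustration only.
toy_scores = [0.2, 0.3, 0.6, 0.7, 0.9, 0.4, 0.8, 0.8]
print(disordered_fraction(toy_scores))
print(disordered_segments(toy_scores))
```

Passing `first_residue=603` would report segments in the numbering of the proline-rich region rather than from residue 1.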
To confirm this prediction, a variety of experimental approaches were also applied, as it has been suggested earlier [5] that, as a result of the limitations of most techniques, a multitude of approaches need to be applied for the conclusive demonstration of disorder. The full-length proline-rich region of Caskin1 with a histidine tag on its C-terminus (PRD-His) was cloned and expressed in bacteria. However, the expression of this construct was rather difficult because of the high proteolytic sensitivity of the protein, characteristic of IDPs. Therefore, only CD, gel filtration and limited proteolysis experiments could be performed, which do not require large amounts of protein. For detailed studies, the full-length proline-rich region was cut into three parts, selected for splitting at sites of high local disorder in the IUPred prediction (PRD1-His, Lys603–Lys804; PRD2-His, Val805–Ala1199; PRD3-His, Glu1200–Glu1430), cloned into PQE2 and pET20b vectors with a C-terminal His tag and expressed in Escherichia coli.
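As a quick consistency check on the fragment boundaries quoted above, the following sketch verifies that the three fragments tile residues 603–1430 without gaps or overlaps and reports their lengths. The helper names are hypothetical; the residue numbers are those given in the text, and the lengths exclude the His tag.

```python
from typing import Dict, Tuple

# Fragment boundaries (inclusive residue numbers) as stated in the text.
fragments: Dict[str, Tuple[int, int]] = {
    "PRD1-His": (603, 804),
    "PRD2-His": (805, 1199),
    "PRD3-His": (1200, 1430),
}
region: Tuple[int, int] = (603, 1430)


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range."""
    return end - start + 1


def tiles_region(parts: Dict[str, Tuple[int, int]], whole: Tuple[int, int]) -> bool:
    """True if the parts cover the whole region contiguously, with no overlap."""
    ordered = sorted(parts.values())
    if ordered[0][0] != whole[0] or ordered[-1][1] != whole[1]:
        return False
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(ordered, ordered[1:]))


lengths = {name: span_length(*bounds) for name, bounds in fragments.items()}
print(lengths)
print(sum(lengths.values()), span_length(*region), tiles_region(fragments, region))
```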
One important feature of IDPs is their heat stability. Therefore, purification of the full-length proline-rich region and its fragments from the bacterial extracts was started by boiling the proteins at 100 °C for 5 min and loading the supernatants onto a Ni–agarose affinity column. The heat stability of the fragments and of full-length PRD-His during purification provides the first line of experimental evidence for disorder.
The CD spectrum of PRD-His shows a minimum at 202 nm (Fig. 2A), which is characteristic of a protein in a largely disordered conformation. The CD spectra of the separate PRDs also show characteristic minima around 200 nm (Fig. 3A), which underscores the unstructured nature of these regions. In the case of PRD2-His, and a little less in the case of PRD3-His, a small shoulder at around 220 nm appears, which indicates some secondary structural elements in this region of the protein. In addition, the sum of the spectra of the three fragments almost completely reproduces the spectrum of full-length PRD-His (Fig. 3B), which confirms the overall random structure of the proline-rich region, i.e. the lack of discernible long-range interactions in this region of Caskin1.
Another characteristic feature of IDPs is their extreme sensitivity to proteolysis [5]. At typical protease concentrations at which globular proteins are hardly affected, these proteins are degraded rapidly and completely. In accordance with this, PRD-His shows a greater sensitivity to proteolysis with a protease of wide substrate specificity, subtilisin, than does the globular control protein BSA (Fig. 2B); this provides an indication of its disordered conformation.
Gel filtration data also verify the disordered nature of the proline-rich region, as the apparent molecular mass (m_app) of PRD-His (334.5 kDa) is 3.9 times the real value (85.9 kDa) (Fig. 2C). The three fragments also show a high apparent molecular mass: 4.5 (PRD1-His, 95.5 kDa), 2.2 (PRD2-His, 91.9 kDa) and 5.4 (PRD3-His, 125.4 kDa) times the real molecular mass (21, 41.7 and 23.2 kDa, respectively) (Fig. 3D). Because the column was calibrated with globular proteins, these ratios suggest a largely unfolded conformational state, as values of m_app/m = 4–5 are typical of fully disordered proteins [20].
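The quoted ratios follow directly from the apparent and real molecular masses. A minimal sketch of the arithmetic, using the mass values from the text (the variable and function names are illustrative only):

```python
from typing import Dict, Tuple

# (apparent mass in kDa, real mass in kDa), as quoted in the text.
masses_kda: Dict[str, Tuple[float, float]] = {
    "PRD-His": (334.5, 85.9),
    "PRD1-His": (95.5, 21.0),
    "PRD2-His": (91.9, 41.7),
    "PRD3-His": (125.4, 23.2),
}


def apparent_to_real_ratio(m_app: float, m_real: float) -> float:
    """Ratio of the apparent (gel filtration) to the real molecular mass."""
    if m_real <= 0:
        raise ValueError("real molecular mass must be positive")
    return m_app / m_real


ratios = {
    name: round(apparent_to_real_ratio(m_app, m_real), 1)
    for name, (m_app, m_real) in masses_kda.items()
}
for name, ratio in ratios.items():
    print(f"{name}: m_app/m = {ratio}")
```

On these numbers, PRD2-His has the lowest ratio of the four constructs, which matches the text's observation that this fragment is the least extended.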
We have demonstrated previously that the high hydration potential of IDPs can be detected by wide-line ¹H-NMR measurements [32,33]. This technique is suitable for the measurement of the amount of bound water after freezing out bulk water. We compared the temperature dependence of the mobile water fractions of the three fragments PRD1-His, PRD2-His and PRD3-His (Fig. 3C). The amount of water in the hydrate layer far exceeds that of BSA and approaches that of ERD10, an IDP characterized previously [33], which provides further evidence for the open and largely solvent-exposed nature of these fragments. It is of note that the mobile water fraction of PRD2-His shows some deviation from that of the other two fragments, i.e. the level of hydration of this fragment is lower than that of the other two, which indicates some local preference for ordering within this region.
Fig. 2. Structural characterization of the proline-rich region of Caskin1 (PRD-His). (A) CD spectrum of PRD-His; the large minimum at 202 nm is typical of IDPs. (B) Limited proteolysis experiment with a broad substrate specificity enzyme, subtilisin, at 1 : 2000 enzyme to substrate ratio. Aliquots were withdrawn at times 0 s, 10 s, 30 s and 1 min, and run on SDS-PAGE. Caskin1 is much more sensitive to the enzyme than is the control globular protein BSA. (C) Gel filtration chromatography of control globular proteins ( , see Materials and methods) and PRD-His of Caskin1 (h). PRD is an extended, random coil-like protein, with an m_app value 3.9 times that of its real m value.
The one-dimensional ¹H-NMR spectra of the PRD-His fragments (Fig. 4) also underscore a largely disordered conformational state. Chemical shifts show a poor dispersion, i.e. amide proton signals are clustered within a half-p.p.m. range centred at 8 p.p.m., whereas the methyl group protons are clustered at around 1 p.p.m. Such a limited dispersion and signal overlap in ¹H chemical shifts are typical of IDPs [34].
The proline-rich regions of Caskin1 interact with Abi2
To demonstrate that the proline-rich regions characterized above are biochemically functional, we studied the interaction of Caskin1 fragments with Abi2, which is an adaptor protein identified originally by its interaction with Abl tyrosine kinase [35]. Caskin1 was cut into five regions and expressed as glutathione transferase (GST) fusion proteins. These protein regions represent the ankyrin repeats and the SH3 domain together (ANK/SH3-GST), the two SAM domains (SAM-GST) and the three proline-rich regions (PRD1–3-GST) of the C-terminal PRD. The full-length PRD of Caskin1 was also expressed (PRD-GST). Green fluorescent protein (GFP)-tagged Abi2 was expressed in COS7 cells, extracts of which were used for the GST pull-down assay. As shown in Fig. 5, the first and second proline-rich regions of Caskin1 (PRD1-GST and PRD2-GST)
Fig. 3. Structural characterization of fragments of PRD. (A) Far-UV CD spectra of PRD1-His (blue), PRD2-His (green) and PRD3-His (red). All spectra show a characteristic minimum at around 200 nm, which underscores the unstructured nature of the proline-rich region. (B) Comparison of the far-UV CD spectrum of the full-length PRD-His (full line) and the sum of the spectra of PRD1-His, PRD2-His and PRD3-His (broken line). The sum of the spectra of the three fragments reproduces the spectrum of PRD-His, which confirms the overall random structure of the full-length proline-rich region and the lack of appreciable long-range structural organization within this region of the protein. (C) The temperature dependence of the mobile water fraction of PRD1-His (blue), PRD2-His (green) and PRD3-His (red), compared with that of the globular control BSA (cyan) and the disordered control ERD10 (black). The large amount of water in the hydrate layer of PRDs suggests their open, solvent-exposed conformations. (D) Gel filtration chromatography of the fragments PRD1-His, PRD2-His and PRD3-His shows that all three fragments have an extended conformation with m_app values 4.5, 2.2 and 5.4 times higher than the real m values, respectively.
were able to interact with the GFP-Abi2 protein, whereas ANK/SH3-GST, SAM-GST and PRD3-GST did not show an association. It is worth noting that the second proline-rich region showed significantly increased interaction compared with the first, suggesting that PRD2-GST contains the major binding site for Abi2 (Fig. 5). This is supported by the finding that the full-length PRD-GST has binding characteristics similar to those of PRD2-GST. These in vitro data suggest that the proline-rich fragments of C-terminal Caskin1 are functional and may interact with SH3 domain-containing proteins, such as Abi2. [We have also found an in vivo association and colocalization of Abi2 with Caskin1 (A. Balazs, V. Csizmok, P. Tompa, R. Udupa & L. Buday, unpublished results).]
Caskin1 is a scaffold protein
Although the exact function of Caskin1 is uncertain, several observations suggest that it probably belongs to the family of scaffold proteins. Scaffold proteins are signalling proteins that typically have multiple binding domains for simultaneous interaction with a variety of partners. They have no catalytic activity, but tether several signalling proteins to organize them into pathways, thus providing directionality and specificity in signalling. For example, the Shank proteins serve as important scaffold molecules modulating signalling pathways at the post-synaptic sites of brain excitatory synapses [36]. Ste5 serves in the yeast mating pathway, ensuring that components of the mitogen-activated protein kinase (MAPK) cascade, also involved in osmoresponse and filamentation pathways, act specifically [18]. In our case, Caskin1 has been found in a yeast two-hybrid screen to bind about 10 other partners besides Abi2 (Table 1), and several points suggest that it is a bona fide scaffold protein: (a) Caskin1 has a modular structure with several of its domains and non-domain regions involved in protein–protein interactions; (b) none of its domains shows catalytic function; (c) it has 11 different partners all involved in signal transduction; (d) it is preferentially located in the PSD, known to harbour many proteins of signalling and scaffold function (e.g. PSD95, Shank, Homer, etc. [37]); (e) it has long uncharacterized regions which lack sequence similarity to other proteins, and have been shown here to be intrinsically disordered. The appearance and functional role of structural disorder have been explicitly noted in other scaffold proteins, such as Ste5 [18], BRCA1 [19], CBP [4] and Mypt1 [20]. Thus, we decided to study this feature in detail to gain further insight into the possible importance of disorder in Caskin1 function and the class of scaffolds in general.
The collection of scaffold proteins for bioinformatics study, however, is hampered by the lack of consensus on the definition of these proteins. In this article, we focus on three classes of complex-forming proteins of related function, also including anchor and docking proteins. The prototype for anchor proteins is the
Fig. 4. (A,B) One-dimensional ¹H-NMR spectrum of PRD3-His shows a narrow p.p.m. range and limited dispersion, typical for an unfolded polypeptide (B is the enlarged part of A between 6.3 and 8.8 p.p.m.).
Fig. 5. Proline-rich regions of Caskin1 interact with Abi2. Lysates of COS7 cells expressing GFP-Abi2 were subjected to affinity purification with the following Caskin1 GST fusion proteins (20 µg·point⁻¹) immobilized on glutathione–agarose beads: the ankyrin repeats and the SH3 domain (ANK/SH3-GST), the SAM domains (SAM-GST) and the three proline-rich regions (PRD1–3-GST). The full-length PRD-GST was also used. Bound proteins were eluted by SDS sample buffer, subjected to 7.5% SDS-PAGE, transferred to nitrocellulose and immunoblotted with monoclonal anti-GFP IgG. Lysates of COS7 cells immunoblotted with anti-GFP IgG are also shown (bottom panel).
A-kinase anchoring protein, which localizes protein kinase A to different subcellular compartments [38]. Docking proteins, in general, have an N-terminal membrane targeting element, typically a Pleckstrin homology domain, a myristoylation site or a short transmembrane domain. After direct or indirect interactions with a tyrosine kinase, the docking protein becomes tyrosine phosphorylated on multiple sites that can interact with signalling proteins containing SH2 domains. Insulin receptor substrate 1, for example, contains an N-terminal Pleckstrin homology domain and a phosphotyrosine-binding domain, and nearly 20 potential tyrosine phosphorylation sites at the C-terminus [39]. As suggested above, scaffold proteins are able to interact with many different proteins at the same time, but they are typically not subject to phosphorylation that creates novel binding sites. The lack of consensus on these definitions is also indicated by the sole study addressing the structural disorder in scaffold proteins [2], in which several proteins clearly not of scaffold function (e.g. p53, a transcription factor, and the voltage-activated potassium channel, a binding partner of the scaffold protein PSD95 [40]) were included. Our study encompasses proteins involved in the formation of multiprotein complexes, which have modular organization. We collected 74 such proteins by literature search and analysed their disorder by three different algorithms.
Prediction of disorder in scaffold proteins
The structural disorder of the 74 scaffold, docking and anchor proteins was predicted by three different algorithms, i.e. IUPred, VSL2 and FoldIndex (Table S1, see Supporting information). We found that the ratio of residues in local disorder was very high (49.7%, 63.36% and 47.82% predicted by IUPred, VSL2 and FoldIndex, respectively) in these proteins, which is comparable with the ratio found in the most disordered protein families, i.e. proteins involved in transcription or signal transduction [41] and in RNA chaperones [27]. This high level of disorder suggests functional importance in scaffolds.
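The 53.6% average quoted in the abstract is consistent with the unweighted mean of the three per-predictor values given above. A minimal sketch of that arithmetic follows; treating the average as a simple mean over predictors is an assumption here, and the names are illustrative.

```python
from statistics import fmean
from typing import Dict

# Percentage of residues predicted as disordered, per predictor (from the text).
overall_disorder: Dict[str, float] = {
    "IUPred": 49.7,
    "VSL2": 63.36,
    "FoldIndex": 47.82,
}


def mean_percentage(per_predictor: Dict[str, float]) -> float:
    """Unweighted mean of the per-predictor percentages."""
    if not per_predictor:
        raise ValueError("at least one predictor value is required")
    return fmean(per_predictor.values())


print(f"{mean_percentage(overall_disorder):.1f}%")
```

The same helper can be applied to any other set of per-predictor percentages reported in the paper.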
Further, we asked whether disorder can be ascribed to regions intervening between the noted functional domains in these proteins. To this end, their sequences were analysed to localize their structured PFAM domains. Some, described in detail in the literature, are shown in Fig. 6. The analysis of regions connecting the domains gave a very high disorder ratio: 61.13%, 77.53% and 54.84% predicted by IUPred, VSL2 and FoldIndex, respectively; in certain proteins, such as GRB2-associated proteins, it exceeded 90% (Table S1). To demonstrate that these intervening regions are not merely there to connect ordered functional domains, we characterized their length distribution in the examined proteins (Fig. 7). Although globular domains tend to be short and show a rather normal distribution, with an average length of 86 amino acids, the distribution of linker regions is wide, with an average length of 140 amino acids, and a maximal length as long as 1579 amino acids (in BRCA1).
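The binning scheme described in the legend of Fig. 7 (50-amino-acid bins that include their upper limit, with everything above 800 amino acids grouped together) can be sketched as follows. The function names and the example lengths are hypothetical; the example list simply reuses lengths mentioned in the text and is not the Table S1 data set.

```python
import math
from collections import Counter
from typing import Iterable


def bin_label(length: int, width: int = 50, cap: int = 800) -> str:
    """Label of the bin containing `length`; each bin includes its upper limit."""
    if length < 1:
        raise ValueError("length must be a positive number of residues")
    if length > cap:
        return f">{cap}"  # overflow bin for very long linkers
    upper = math.ceil(length / width) * width
    return f"{upper - width + 1}-{upper}"


def length_histogram(lengths: Iterable[int], width: int = 50, cap: int = 800) -> Counter:
    """Count how many segments fall into each length bin."""
    return Counter(bin_label(n, width, cap) for n in lengths)


# Illustrative lengths only (average domain, average linker, bin edges, longest linker).
example_lengths = [86, 140, 50, 51, 1579]
print(length_histogram(example_lengths))
```

Using the ceiling of `length / width` is what places a length of exactly 50 in the first bin rather than the second, in line with "always including the upper limit".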
Discussion
Our knowledge of the structure of scaffold proteins is largely limited to those regions for which three-dimensional structure has been established. However, if we consider that the binding of numerous proteins in tight proximity is rather difficult in the case of a rigid, globular structure, it is reasonable to assume that these proteins contain long, disordered regions. Nevertheless, the occurrence of disorder and its functional consequences in scaffold proteins have never been examined systematically. The present study provides evidence for the extensive disorder of Caskin1 and also for the class of scaffold proteins in general. Overall, the level of disorder exceeds that of the functional class so far considered to be the most disordered: RNA chaperones [27].
It is known that, in proteins associated with signal transduction, transcription and RNA chaperone activities, the ratio of amino acids in locally disordered regions is very high, on the order of 50–60%. These high levels are thought to result from the functional advantages provided by disorder, which enables functions that cannot be carried out by globular proteins. One advantage of the extended, disordered conformation is an
Table 1. Results from the two-hybrid screen using a fragment of Caskin1 (amino acids 280–963) as bait and a human fetal cDNA library. The numbers in parentheses represent the number of identical clones obtained.
1 (12)    Abl-interactor-2 (Abi2)    Adaptor protein
molecule
8 (1)    Stathmin-like 3 protein    Stathmin family protein
9 (1)    Synaptotagmin    Mediator of Ca2+-regulated vesicle fusion
enhanced interaction capacity of the protein [12], which is also manifested in the elevated level of disorder of hub proteins [13–16] and the increase in disorder with complex size [17]. As disordered regions are often directly involved in protein–protein interactions [42], these points help us to interpret the possible role of disorder in Caskin1 and in other scaffold proteins. To obtain a balanced view of structural disorder, it should also be taken into consideration that it may pose a danger to the cell, such as the occurrence of oncogenic fusion proteins in cancer and amyloid aggregates in neurodegenerative diseases [4]. It is probably a result of these adverse effects that the cellular level of IDPs is tightly regulated by several mechanisms [43].
Caskin1 is present in the PSD of neuronal cells. Within its N-terminal half, it contains some well-characterized domains, which are involved in the interaction with Cask [1], but the C-terminal, proline-rich region has never been examined. According to our structural studies, this entire region is intrinsically disordered, and proline-rich regions are known to interact with SH3, WW and other domains of cognate proteins [31,44]. Indeed, PRD of Caskin1 contains several consensus SH3 binding sites, and we postulate that it is involved in multiple interactions with other PSD proteins. In this study, we have shown that the proline-rich regions interact with Abi2, an SH3 domain-containing protein (we have also found the in vivo association of Abi2 with Caskin1; A. Balazs, V. Csizmok, P. Tompa, R. Udupa & L. Buday, unpublished results). In this sense, PRD of Caskin1 might function in a manner similar to the long, central, disordered region of BRCA1, which harbours binding motifs for multiple partners in DNA repair [19]. A further point on the function of PRD of Caskin1 is that all of our studies point to a local tendency of ordering in the middle PRD2 segment (amino acids 805–1199). The level of hydration of this fragment is lower than that of the other two and the results of CD analysis also show some deviation from a fully disordered, random coil-like state. By gel filtration chromatography, this region also shows a less extended conformation than the rest of PRD. As a local tendency for ordering is a sign of sites poised for interactions [45,46], it
Fig. 6. Schematic representation of the domain structure of selected scaffold proteins. The scheme shows the domain architecture, established by PFAM, of 20 selected scaffold proteins representing 20 families described in detail in the literature. Long grey lines connecting the domains are regions with no recognizable similarity to known proteins.
Fig. 7. Length distribution of domains and linker regions in scaffold proteins. The numbers of occurrences of domains (light grey) and linker regions (dark grey) with their indicated lengths in the 74 scaffold proteins (Table S1) are given. The occurrence was calculated for 50 amino acid length bins, always including the upper limit. At the end of the scale, linkers above 800 amino acids in length are grouped (their maximum length extends to 1579 amino acids).
is conceivable that this middle segment of PRD in Caskin1 is a primary site of interaction with multiple partners in PSD, especially as PRD2 is the major binding site for Abi2. All of these inferences on the function of Caskin1 are perfectly in line with the organization of PSD. PSD is a dynamic multiprotein complex attached to the post-synaptic membrane, composed of several hundred proteins, including receptors and channels, cell adhesion proteins, cytoskeletal proteins, G-proteins and their modulators, and signalling molecules including kinases and phosphatases [47]. A variety of scaffold proteins, such as members of the MAGUK, Shank and Homer families, serve to organize PSD. As a result of its modular character and ability to form multiprotein interactions, we suggest that Caskin1 is a novel scaffold protein in PSD.
Previous scattered observations with other scaffold proteins [4,18,19], our novel data on Caskin1 and the noted functional advantages of disorder related to molecular recognition [12,42,48] point towards the general role of disorder in scaffold proteins. This inference was underscored by the prediction of disorder for a collection of 74 scaffold proteins: on average, 53.6% of their amino acids were in locally disordered regions. Disorder, however, is not evenly distributed in the sequences, as shown by the consideration of only the regions connecting PFAM domains. The predicted average disorder for these regions is 64.5%, which suggests that scaffold proteins are constructed as beads on a string from globular domains connected by occasionally very long linker regions. Because these linkers cover 65.8% of the total length of scaffold proteins on average, and their average length far exceeds that of the globular domains, there is no doubt that disorder in these proteins fulfils very important functions, probably commensurable in importance with that of ordered domains.
Actual data on some scaffold proteins provide evidence that these regions are much more than mere passive linkers of functional globular domains. For example, BRCA1 contains an approximately 1500-amino-acid-long central region between the N-terminal RING domain and C-terminal BRCT domain [19]. Although it lacks stable structural elements or recognizable domains, this region is implicated in binding not only DNA, but numerous proteins involved in DNA damage response and repair [49,50]. Another scaffold protein, Mypt1, also contains a long disordered segment in its N-terminal region, and this segment is involved in binding to the type 1 protein phosphatase [51]. CBP has also been amply characterized in this respect. This protein contains seven globular domains and intervening disordered regions. At least two regions of specific partner-binding function, the nuclear receptor interaction domain and the nuclear receptor co-activator-binding domain, reside in the disordered regions of the protein [4]. In the case of Ste5, the scaffold protein that binds several kinases of the MAPK pathway, binding of Fus3 has been shown to fall into a locally disordered region [18].
These data on scaffold proteins suggest that their long
disordered regions present binding sites for their
partners. As a result of their extended conformation,
they have a large potential binding capacity, being able
to anchor multiple partners next to each other.
Interaction sites in disordered regions, termed preformed
structural elements [45], linear motifs [48], primary
contact sites [52] or molecular recognition features [46],
usually constitute only a few residues, and thus enable a
very economical and high-capacity binding of partners.
Furthermore, these regions are often the sites
of post-translational modifications [48,53], and may
themselves affect the activity of the bound partner [18],
which suggests a rather elaborate and complex
binding/organizing role in the function of scaffold
proteins. We hope that this suggestion provides novel
insight into the function of scaffold proteins, and will
instigate the design of novel experimental approaches
aimed at resolving the structure and function of these
important proteins.
Materials and methods
DNA constructs
The full-length rat Caskin1 cDNA was kindly provided by
Thomas Südhof (University of Texas Southwestern Medical
Center, Dallas, TX, USA), and the full-length Abi2 cDNA
was donated by Ann Marie Pendergast (Duke University
Medical Center, Durham, NC, USA). Caskin1 cDNA was
amplified by a high-fidelity DNA polymerase and subcloned
into the pcDNA 3.1/V5-His TOPO vector (Invitrogen, San
Diego, CA, USA). The full-length Abi2 was amplified by PCR
and subcloned into the BamHI site of the pEGFP-C1 vector
(BD Biosciences Clontech, San Jose, CA, USA). cDNAs
corresponding to the ankyrin repeats and SH3
domains (SAM-GST, amino acids 347–610), proline-rich
region 1 (PRD1-GST, amino acids 603–804), proline-rich
region 2 (PRD2-GST, amino acids 804–1199), proline-rich
region 3 (PRD3-GST, amino acids 1200–1430) and the
full-length PRD of Caskin1 (PRD-GST, amino acids
603–1430) were amplified by PCR and subcloned into the
EcoRI/SalI sites of the pGEX-4T1 vector (Amersham
Biosciences, Fairfield, CT, USA) as GST fusion proteins.
For pull-down experiments, GST fusion proteins were
purified by binding to glutathione–agarose (Sigma, St Louis,
MO, USA) without elution. Protein purification was
monitored on Coomassie blue-stained SDS–PAGE gels: the
majority of the GST proteins gave single bands.
The full-length proline-rich region of Caskin1 (PRD-His)
and its fragments (PRD1-His, PRD2-His, PRD3-His) were
also subcloned into the BamHI/XhoI sites of the expression
vector pQE2 (Qiagen, Venlo, the Netherlands) with a
C-terminal His tag. PRD2, because of poor expression of
the protein, was further subcloned into the NdeI/XhoI sites
of the expression vector pET20b (Novagen, San Diego,
CA, USA) with a C-terminal His tag. In all cases, the
constructs were verified by DNA sequencing (MWG-Biotech,
Ebersberg, Germany).
Protein purification
For structural characterization, the full-length PRD of
Caskin1 and its fragments were expressed in the E. coli strain
BL21 Star. The expression of the proteins was induced by
0.5 mM isopropyl thio-β-D-galactoside at 30 °C for 3 h. The
proteins were purified to homogeneity from cellular extracts
by heat treatment of the supernatants (5 min at 100 °C),
followed by nickel nitrilotriacetic acid affinity chromatography
(Qiagen). For further purification, the dialysed proteins were
loaded onto an SP-Sepharose ion exchange column
(Amersham) in a buffer of 20 mM Tris, 1 mM EDTA, pH 7.5,
and then eluted by a linear salt gradient (50–500 mM NaCl).
Fractions with the highest level of protein were pooled,
dialysed into 20 mM Tris, 150 mM NaCl, 1 mM EDTA, pH 7.5
and stored frozen at −20 °C in aliquots. The purity of the
constructs was demonstrated by SDS-PAGE (Fig. S1, see
Supporting information).
CD measurements
CD spectra were recorded at a protein concentration of
0.1 mg·mL⁻¹ in 10 mM Na2HPO4, 150 mM NaCl, pH 7.5 in
a cuvette (path length, 1 mm) on a Jasco J-720
spectropolarimeter (Jasco, Oklahoma City, OK, USA) in a continuous
mode with a bandwidth of 1 nm, response time of 8 s and
scan speed of 20 nm·min⁻¹. All spectra shown were
obtained by subtracting the buffer spectrum and averaging
10 separate scans.
Gel filtration chromatography
The unfolded nature of PRD and its fragments was also
characterized by gel filtration chromatography. The
proteins (200 µL) were run on an Amersham Biosciences
Superdex 200 (1 × 30 cm) column at 0.5 mL·min⁻¹ in a
buffer of 50 mM Na2HPO4, 150 mM NaCl, pH 7.0 on an
Amersham Biosciences FPLC system. The proteins were
detected at 280 nm. The column was calibrated using the
following globular proteins (m in parentheses): ribonuclease
A (13.7 kDa), chymotrypsinogen A (25.0 kDa), ovalbumin
(43.0 kDa), BSA (67 kDa) and alcohol dehydrogenase
(146.8 kDa). The m values of the proteins were determined
from the calibration curve constructed by plotting the log m
values of the calibration proteins against the elution volume.
The hydrodynamic dimension was characterized by the ratio of
the apparent value, mapp, determined by gel filtration
chromatography, to the absolute value of m, calculated from
the amino acid sequence of the protein.
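The calibration procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis performed in this work: the molecular masses of the standards are those listed in the text, but all elution volumes and the example sample are hypothetical placeholders, and a simple least-squares line of log m against elution volume is assumed.

```python
import math

# Calibration standards: (name, molecular mass in kDa from the text,
# elution volume in mL). The elution volumes are HYPOTHETICAL values
# chosen for illustration; they are not measured data.
CALIBRATION = [
    ("ribonuclease A", 13.7, 19.09),
    ("chymotrypsinogen A", 25.0, 17.35),
    ("ovalbumin", 43.0, 15.78),
    ("BSA", 67.0, 14.49),
    ("alcohol dehydrogenase", 146.8, 12.22),
]


def fit_calibration(standards):
    """Least-squares fit of log10(m) = intercept + slope * elution volume."""
    xs = [volume for _, _, volume in standards]
    ys = [math.log10(mass) for _, mass, _ in standards]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_mass(elution_volume, slope, intercept):
    """Apparent molecular mass (kDa) read off the calibration line."""
    return 10 ** (intercept + slope * elution_volume)


def hydrodynamic_ratio(elution_volume, sequence_mass, slope, intercept):
    """Ratio of the apparent mass to the mass calculated from the sequence."""
    return apparent_mass(elution_volume, slope, intercept) / sequence_mass


slope, intercept = fit_calibration(CALIBRATION)

# Hypothetical sample: 30 kDa by sequence, eluting at 14.0 mL.
ratio = hydrodynamic_ratio(14.0, 30.0, slope, intercept)
print(f"mapp/m = {ratio:.2f}")
```

A ratio well above 1 indicates that the protein elutes earlier than a globular protein of the same mass, as expected for an extended, unfolded chain.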
NMR spectroscopy
1H-NMR spectra of PRD1, PRD2 and PRD3 were recorded at
500 MHz on a Bruker DRX instrument (Bruker, Billerica, MA,
USA); 16 000 complex data points were acquired in the direct
dimension at 300 K using a spectral width of 12 p.p.m. Data
were zero-filled and processed with a shifted quadratic
sine-bell plus exponential window function. For water
suppression, the 3-9-19 pulse sequence with gradients was
used [54].
Wide-line NMR spectrometry
The mobile proton (water) fraction was measured directly
by two 1H-NMR methods: by measuring the free induction
decay signal or recording Carr–Purcell–Meiboom–Gill echo
trains. The determination of the mobile water fraction is
based on the comparison of the signal intensity or echo
amplitude extrapolated to t = 0 with the corresponding
values measured at a temperature at which the whole sample
is in the liquid state. Details of the applied method have
been described elsewhere [32,33,55].
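The comparison of extrapolated amplitudes can be illustrated with a minimal sketch. It is a simplification made for illustration only and does not reproduce the method of the cited references: a single-exponential echo decay is assumed so that the extrapolation to t = 0 reduces to a straight-line fit of ln(amplitude) against time, any temperature correction of the amplitudes is ignored, and all numerical values are synthetic.

```python
import math


def extrapolate_to_zero(times, amplitudes):
    """Fit ln(A) = ln(A0) - t/T2 by least squares and return A0.

    Assumes a single-exponential decay (an illustrative simplification).
    """
    ys = [math.log(a) for a in amplitudes]
    n = len(times)
    mean_t, mean_y = sum(times) / n, sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sty / stt
    return math.exp(mean_y - slope * mean_t)


def mobile_fraction(a0_sample, a0_liquid):
    """Mobile fraction: zero-time amplitude relative to the all-liquid value."""
    return a0_sample / a0_liquid


# Synthetic echo trains (arbitrary amplitude units, times in seconds).
echo_times = [0.5e-3 * k for k in range(1, 9)]
frozen_echoes = [25.0 * math.exp(-t / 2.0e-3) for t in echo_times]
liquid_echoes = [100.0 * math.exp(-t / 50.0e-3) for t in echo_times]

fraction = mobile_fraction(
    extrapolate_to_zero(echo_times, frozen_echoes),
    extrapolate_to_zero(echo_times, liquid_echoes),
)
print(f"mobile water fraction: {fraction:.2f}")
```

In practice, the decay model used for the extrapolation should match the one applied to the measured data.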
The effect of freezing on protein solutions was controlled
by the comparison of NMR parameters obtained before and
after a freeze–thaw cycle at temperatures above 0 °C.
We found that the freeze–thaw cycle caused no observable
changes for the studied samples as far as the measured
NMR parameters were concerned. The temperature was
controlled by an open-cycle Oxford cryostat with a stability
of ± 0.1 °C; the uncertainty of the temperature scale was
± 1 °C. 1H-NMR measurements and data acquisition were
accomplished using a Bruker SXP 4-100 NMR pulse
spectrometer at ω0/2π = 82.55 MHz with a stability of better
than ± 10⁻⁶. The data points in the figures are based on
spectra recorded by averaging signals to reach a
signal-to-noise ratio of 50. The number of averaged NMR signals
was varied to achieve the desired signal quality for each
sample and for unfrozen water quantities. The sensitivity of
the NMR spectrometer to sample change was controlled
by measuring the length of the π/2 pulse to obtain reliable
M0 values [55]. The extrapolation to zero time was performed by fitting a stretched exponential