Abundance of intrinsic disorder in SV-IV, a multifunctional androgen-dependent protein secreted from rat seminal vesicle


Silvia Vilasi and Raffaele Ragone

Dipartimento di Biochimica e Biofisica, Naples, Italy

The view that a protein must fold into the correct shape, as encoded in the amino acid sequence, before it can function has been deeply rooted in protein science, even before the three-dimensional structure of a protein was first solved. However, for some proteins, especially those involved in signalling and regulation [1], the unstructured state has been suggested to be essential for basic cellular functions and recognized as a separate functional and structural category [2,3]. These are proteins or domains that, in their native state, are either completely disordered or contain large disordered regions, and therefore do not fit the standard sequence–structure–function paradigm, because intrinsic disorder, whether local or extended to the entire protein length, is crucially important for their function. Dunker and Obradovic [4] categorized functional intrinsically disordered regions into molten globule-like and random coil-like structural forms, and Uversky [5] suggested the existence of an additional pre-molten globule form, whose peculiarity is the presence of unstable secondary structure. Reflecting a still imperfect categorization, these systems are currently classified as ‘intrinsically disordered proteins’ (IDPs), but the use of other synonymous expressions, such as ‘intrinsically unstructured proteins’, is widespread in the literature [6]. More than 100 such proteins are known, including Tau, prions, Bcl-2, p53, 4E-BP1 and eIF1A [5,7].

Keywords

bioinformatics; disorder prediction; intrinsically disordered proteins; seminal vesicle protein no. 4; structure–function relationship

Correspondence

R. Ragone, Dipartimento di Biochimica e Biofisica, Seconda Università di Napoli, via S. Maria di Costantinopoli 16, 80138 Naples, Italy
Fax: +39 081 294136
Tel: +39 081 294042
E-mail: raffrag@tiscali.it; raffaele.ragone@unina2.it

(Received 30 October 2007, revised 5 December 2007, accepted 13 December 2007)

doi:10.1111/j.1742-4658.2007.06242.x

The potent immunomodulatory, anti-inflammatory and procoagulant properties of protein no. 4 secreted from the rat seminal vesicle epithelium (SV-IV) have previously been found to be modulated by a supramolecular monomer–trimer equilibrium. More structural details that integrate experimental data into a predictive framework have recently been reported. Unfortunately, homology modelling and fold-recognition strategies were not successful in creating a theoretical model of the structural organization of SV-IV. It was inferred that the global structure of SV-IV is not similar to that of any protein of known three-dimensional structure. Reversing the classical approach to the sequence–structure–function paradigm, in this paper we report novel information obtained by comparing the physicochemical parameters of SV-IV with two datasets composed of intrinsically unfolded and ideally globular proteins. In addition, we analyse the SV-IV sequence with several publicly available disorder-oriented predictors. Overall, disorder predictions and a re-examination of existing experimental data strongly suggest that SV-IV needs large plasticity to efficiently interact with the different targets that characterize its multifaceted biological function, and should therefore be better classified as an intrinsically disordered protein.

Abbreviations

HCA, hydrophobic cluster analysis; IDPs, intrinsically disordered proteins; PDB, protein data bank; SV-IV, rat seminal vesicle protein no. 4; SVM, support vector machine.


Among the proteins studied in our laboratory, SV-IV (seminal vesicle protein no. 4, so identified according to its electrophoretic mobility in SDS-PAGE; precursor SWISS-PROT ID, SVP2_RAT) is a basic (pI = 8.9), thermostable protein of 90 residues (Mr = 9758) secreted from the rat seminal vesicle epithelium under strict androgen transcriptional control, which has been found to possess potent non-species-specific immunomodulatory, anti-inflammatory and procoagulant properties [8]. It has been purified to homogeneity and characterized extensively [8–10]. It is encoded by a gene that has been isolated, sequenced and expressed in Escherichia coli [11–14]. On the basis of its biological and biochemical characteristics, SV-IV appears to be a molecule of obvious pharmacological interest. SV-IV-immunorelated proteins have been discovered in several rat tissues, as well as in human seminal fluid and seminal vesicle secretion [13,14]. The segment 3–41 of SV-IV has been found to have a high amino acid sequence similarity with the C-terminal segment 34–66 of uteroglobin, a secreted protein from rabbit displaying phospholipase A2 inhibitory activity in vitro and anti-inflammatory effects in vivo [15,16]. Others have also been able to prepare potent anti-inflammatory peptides from the region of highest similarity between uteroglobin and lipocortin I, a protein that has been suggested to mediate the anti-inflammatory effects of glucocorticoids [17]. It is therefore highly desirable to obtain as complete structural information as possible.

From a structural standpoint, early circular dichroism and fluorescence polarization data indicated scarce structural organization [18]. This agreed with a predictor of local flexibility [19], although other predictive algorithms have, by contrast, suggested either the presence [18] or lack [20] of an appreciable amount of secondary structure. Recently, it has been found that, in the range of physiological concentrations (2–48 µM [20,21]), the peculiar biological properties of SV-IV are probably modulated by a supramolecular equilibrium in which a trimeric form competes with monomeric protein for binding to a large variety of SV-IV targets [20]. Eventually, Caporale et al. [22] found agreement between the amounts of predicted and experimental helical structure present in the monomeric form (20 and 24%, respectively), and attempted to create a theoretical model of the structural organization of SV-IV. However, on noting that homology modelling and fold-recognition strategies were not able to provide detailed structural information, they concluded that ‘SV-IV assumes a global structure that is not similar to any protein of known three-dimensional structure’ [22]. Indeed, such an occurrence suggests that SV-IV could violate the standard sequence–structure–function paradigm, but the authors did not investigate this possibility.

We have verified that, in terms of disorder- and order-promoting amino acid subsets [23,24], the composition of SV-IV does not strictly conform to trends previously found to occur in IDPs, except for a very high content of serine (24%). Furthermore, a search of the DisProt database [25] did not return any hits for SV-IV, indicating that no DisProt sequence resembles this protein. However, novel information obtained with publicly available disorder-oriented predictors emphasizes that the functional state of SV-IV lacks significant structural organization. This evidence is sufficient to confidently state that SV-IV can be classified amongst IDPs. Incidentally, the present work also confirms that homology modelling and fold-recognition strategies are best suited to obtain information on the architecture of ordered proteins, but the study of IDPs as if they were ordered can prove to be highly frustrating. Thus, when dealing with proteins of uncertain three-dimensional structure, it would be more correct and less time-consuming to look for disorder before attempting modelling procedures.

Results

Survey of existing structural information

In addition to fluorescence polarization and both far- and near-UV circular dichroism data from our laboratory [18,20,22], experimental evidence that regular structure is scarce in SV-IV comes from SDS-PAGE, which is routinely used to assess the Mr values of proteins. Because of their unusual amino acid composition, IDPs bind less SDS than usual and their apparent Mr value is often 1.2–1.8 times higher than the real value calculated from sequence data or measured by mass spectrometry [7]. Indeed, the mobility of SV-IV in SDS-PAGE is compatible with an Mr value of about 15 000–18 000 [9], which can be compared with an Mr value of 9758 calculated from the sequence. Size-exclusion chromatography also indicates that the hydrodynamic radius of SV-IV resembles that of an IDP [7], because purified SV-IV elutes well behind chymotrypsinogen (Mr = 25 600) and slightly ahead of RNase A (Mr = 13 600) [9]. Finally, digestion of SV-IV with trypsin suggests that all but Lys80 of the potential proteolytic sites represented by nine lysine and seven arginine residues are able to efficiently interact with the catalytic site of the enzyme [22], as expected for an IDP-like polypeptide [7]. This piece of information has prompted us to perform predictive analyses aimed at clarifying whether or not the SV-IV sequence is compatible with the classical sequence–structure–function paradigm.
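The SDS-PAGE argument above rests on a simple ratio between the apparent Mr and the Mr calculated from the sequence. As a minimal sketch (the function name is ours and purely illustrative; the numbers are those quoted in the text), the ratio for SV-IV can be checked as follows:

```python
def apparent_mr_ratio(apparent_mr: float, sequence_mr: float) -> float:
    """Ratio of the SDS-PAGE apparent Mr to the Mr calculated from the sequence."""
    if sequence_mr <= 0:
        raise ValueError("sequence_mr must be positive")
    return apparent_mr / sequence_mr


SEQUENCE_MR = 9758                 # Mr calculated from the SV-IV sequence (from the text)
APPARENT_RANGE = (15_000, 18_000)  # apparent Mr range from SDS-PAGE [9]

ratios = [apparent_mr_ratio(m, SEQUENCE_MR) for m in APPARENT_RANGE]
print([round(r, 2) for r in ratios])  # → [1.54, 1.84]
```

The ratios come out at roughly 1.5 and 1.8, that is, towards the upper end of the 1.2–1.8 range quoted for IDPs [7].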

Analysis of physicochemical parameters

It has recently emerged that protein disorder tends to be related to general chemical properties, rather than to the abundance or scarcity of specific amino acids [26]. Indeed, like early analyses of protein disorder that were based on the reasoning that protein folding is governed by a balance between hydrophobic forces (attractive) and electrostatic forces between similarly charged residues (repulsive) [23], disorder-oriented predictors largely use physicochemical parameters, such as hydrophobicity [24,27–33], the absolute value of the net charge [24,27–29,33], Cα B-factors [24,27–29,32,34] and number of contacts [35–38]. Accordingly, we obtained preliminary information on the structural preference of SV-IV by comparing values per residue of these parameters with those of two protein databases composed of ideally globular [35] and natively unfolded [39] proteins, respectively. Visual inspection of two-dimensional plots obtained by considering all possible combinations of two parameters suggests that SV-IV has a strong preference to conform to the general structural features expected for IDPs, because in no case do SV-IV data points fall in regions populated by ordered proteins (Fig. 1).

General prediction analysis

Owing to increased interest in the structure–function relationships of IDPs, disorder-related literature is growing, as witnessed by several recent reviews [40–43]. To obtain reliable predictions, two general options are presently available: (a) the combined use of ab initio algorithms, such as a recent scheme based on well-known predictors [23]; or (b) recent programs with improved performance on some benchmarks, such as those based on expected packing density [36–38] or support vector machine (SVM) methods [44–46] (see Materials and methods for further details). However, as the SV-IV sequence comprises amino acid subsets different from those previously found to occur in IDPs [23,24] and does not resemble any known sequence included in the DisProt database [25], it may be valuable to proceed with caution and investigate both options.

The first procedure comprises a preliminary search for low-complexity regions through the seg algorithm [47], followed by a thorough analysis benefiting from the combined use of several ab initio methods, such as pondr (VSL1 and VL-XT) [24,27–29], hydrophobic cluster analysis (hca) [30], prelink [31], globplot [32], disembl [34], ronn [48], iupred [49], disopred2 [50] and norsp [51]. When applied to SV-IV, seg resulted in a long non-globular region spanning the entire sequence, except for a few amino acids at the N- and C-termini (amino acids 1–4 and 84–90, respectively). Other structural peculiarities, such as disulfide-forming cysteine residues, zinc fingers and leucine zippers [52], are absent from the SV-IV sequence. On the functional side, SV-IV is predicted to be a metal binding protein [53], but the expected probability of correct classification is about 60%, which is lower than the actual classification accuracy based on the analysis of 9932 positive and 45 999 negative samples of proteins [54]. The vast majority of the other methods also converged to indicate an abundance of intrinsic disorder in SV-IV, except for a few amino acids in the C-terminal region. In particular, hydrophobic clusters, which are typical of secondary structure elements, were almost totally absent from the hca plot, and prelink predicted the whole sequence as disordered. By contrast, some regular structure was predicted by X-ray-based algorithms, such as various disembl routines and disopred2 (segments 31–39, 49–59 and 77–90), and discrepancies also affected globplot analyses, depending on the particular order–disorder propensity set chosen to obtain predictions, but in no case were potential globular domains predicted. When subjected to norsp, the SV-IV protein did not appear to conform to the criteria fixed for identifying non-regular secondary structure (NORS) regions, although about 70% of residues were predicted to be in loopy regions. We suspect that no NORS region can be predicted in SV-IV because the recommended length of the sequence window used to calculate the structural content (70 amino acids) is close to the protein length (90 amino acids). Finally, a vanishingly small probability of coiled-coil regions was also predicted by the multicoil [55] and coils [56] algorithms (not shown). The above results are summarized in Fig. 2.
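The remark about the norsp window length can be made concrete with a small sketch. It assumes only the two figures given in the text (a 70-residue window and less than 12% regular secondary structure within it), ignores the solvent-exposure criterion, and uses a hypothetical secondary structure string that is not the actual SV-IV prediction; the function names are ours.

```python
def count_windows(protein_length: int, window: int) -> int:
    """Number of distinct windows of a given length that fit in the protein."""
    return max(protein_length - window + 1, 0)


def nors_windows(ss: str, window: int = 70, max_regular: float = 0.12) -> list:
    """Return 0-based start positions of windows with regular structure below max_regular.

    ss is a per-residue secondary structure string: 'H' (helix) and 'E' (strand)
    count as regular structure, any other character as loop.
    """
    starts = []
    for start in range(len(ss) - window + 1):
        segment = ss[start:start + window]
        regular = sum(1 for state in segment if state in "HE")
        if regular / window < max_regular:
            starts.append(start)
    return starts


# Hypothetical 90-residue prediction with 70% loop and three short regular segments.
toy_ss = "L" * 20 + "H" * 9 + "L" * 10 + "H" * 9 + "L" * 20 + "E" * 9 + "L" * 13

print(count_windows(90, 70))   # only 21 windows fit in a 90-residue protein
print(nors_windows(toy_ss))    # no window falls below the 12% cutoff
```

With so few windows, each spanning most of the chain, a protein that is 70% loop overall can still fail the criterion whenever its regular segments are spread along the sequence.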

Another set of predictions was performed using algorithms that have been reported to predict protein disorder more accurately than other methods, namely the foldunfold predictor [36–38] and the SVM-based poodle suite [44–46]. According to foldunfold, SV-IV is probably fully disordered, because the average value of the disorder parameter over its sequence is less than the disorder threshold. Moreover, the average value of the disorder parameter over regions 1–34, 36–57 and 59–80 is less than the disorder threshold and the regions are longer than the reliable frame (11 residues), which means that these regions are predicted as fully disordered (Fig. 3A). Similarly, poodle predictions suggest that: (a) the entire SV-IV sequence corresponds to a long disorder region (poodle-l); (b) a few residues (amino acids 39–40 and 85–90) do not belong to short disorder regions (poodle-s); and (c) disorder characterizes the whole protein because of the high disorder propensity of all residues (poodle-w) (Fig. 3B).
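The foldunfold criterion described above (profile below a threshold over a stretch longer than the reliable frame) can be sketched as a run-finding routine. This is not the foldunfold implementation: the threshold value and the profile below are synthetic, and the profile was deliberately constructed so that the runs coincide with the regions quoted in the text.

```python
def disordered_regions(profile, threshold, frame=11):
    """Return 1-based inclusive (start, end) runs where profile < threshold.

    Only runs strictly longer than `frame` residues are reported.
    """
    regions = []
    start = None
    for i, value in enumerate(profile):
        if value < threshold:
            if start is None:
                start = i          # a new below-threshold run begins
        elif start is not None:
            if i - start > frame:  # keep the run only if it exceeds the frame
                regions.append((start + 1, i))
            start = None
    if start is not None and len(profile) - start > frame:
        regions.append((start + 1, len(profile)))
    return regions


# Synthetic 90-residue profile (hypothetical values, in expected contacts per residue).
toy_profile = [19.0] * 34 + [21.0] + [19.0] * 22 + [21.0] + [19.0] * 22 + [21.0] * 10
HYPOTHETICAL_THRESHOLD = 20.0

print(disordered_regions(toy_profile, HYPOTHETICAL_THRESHOLD))
# → [(1, 34), (36, 57), (59, 80)]
```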

Other predictions

To complete our analysis, we verified whether or not SV-IV possesses biased amino acid composition and can be maximally separated from globular proteins. Both features have been found to occur in IDPs. On the first point, Weathers et al. [26,57] have recently examined the contribution of various vectors to recognizing proteins that contain disordered regions through an SVM trained on naturally occurring disordered and ordered proteins. They found that high recognition accuracy can be obtained by an SVM that incorporates only amino acid composition, and very good recognition accuracy was retained using reduced sets of amino acids based on chemical similarity. Overall, this suggests that composition alone and general physicochemical properties, rather than specific amino acids, are sufficient to accurately recognize disorder. We applied the SVM method to compare the SV-IV sequence with the primary structures of 80 ideally folded and 90 natively unfolded proteins. Fig. 4A shows the mean values of the disorder score for all of these proteins. Although the regions covered by the two protein datasets overlap to some extent, the SV-IV datum clearly belongs to the region populated by natively unfolded proteins. With regard to the second point, other authors [35] have devised an optimal set of artificial parameters for the 20 amino acid residues by a Monte Carlo algorithm, by which they have obtained maximal separation between sets of natively unfolded and ideally globular proteins. Following the same rationale as above, we compared the mean value of the artificial parameter for SV-IV and the two sets of proteins. Even in this case, the SV-IV datum unequivocally falls amongst natively unfolded proteins, whose data points are well separated from those of globular proteins (Fig. 4B). Finally, Fig. 4C summarizes the results obtained by other algorithms, such as dispro [58], some additional methods not included in the pondr package developed by Dunker et al. [59,60], and drippred [61]. All of these algorithms agreed in predicting that 100% of the amino acids in the SV-IV sequence are disordered, except drippred, which resulted in 32% of residues scoring as regular structure.

Fig. 1. Two-dimensional plots. The SV-IV datum (red symbol) is compared with the two sets of 90 natively unfolded and 80 ideally globular proteins (black and grey symbols, respectively) using the mean values of physicochemical parameters computed from the sequence. (A) Number of contacts versus hydrophobicity. (B) Number of contacts versus net charge. (C) Number of contacts versus Cα B-factors. (D) Net charge versus hydrophobicity. (E) Net charge versus Cα B-factors. (F) Hydrophobicity versus Cα B-factors.

Fig. 2. Analysis of the SV-IV sequence using well-known predictors. The original graphic output of each method and the corresponding interpretation are shown. In HCA, the protein sequence is shown on a duplicated α-helical net with hydrophobic clusters identified by solid contours and amino acid numbers indicated on the top; special symbols denote proline, glycine, threonine and serine.

Fig. 3. Analysis of the SV-IV sequence using improved performance programs. Graphic output of FOLDUNFOLD [36–38] (A) and POODLE [44–46] (B) predictors, plotted against residue position. Panel annotations: amino acids 39–40 and 85–90 have borderline disorder (probability very close to 0.5), the remaining regions being predicted as disordered (POODLE-S); the whole protein is predicted as disordered (POODLE-L, POODLE-W and FOLDUNFOLD).
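The composition-only representation used in the SVM analysis described in this section can be illustrated with a short sketch of the feature vectors involved. The reduced alphabet below is a hypothetical grouping by chemical similarity chosen for illustration; it is not necessarily the one used in [26,57], and the toy sequence is not SV-IV.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical reduced alphabet (illustrative grouping by chemical similarity).
REDUCED_GROUPS = {
    "hydrophobic": "AVLIMC",
    "aromatic": "FWY",
    "polar": "STNQ",
    "positive": "KRH",
    "negative": "DE",
    "special": "GP",
}


def composition(seq: str) -> dict:
    """Fractional composition over the 20 standard amino acids."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / total for a in AMINO_ACIDS}


def reduced_composition(seq: str) -> dict:
    """Fractional composition over the reduced alphabet defined above."""
    comp = composition(seq)
    return {name: sum(comp[a] for a in members)
            for name, members in REDUCED_GROUPS.items()}


toy_sequence = "SSKKDEAG"  # toy input for illustration only
print(composition(toy_sequence)["S"])        # 0.25
print(reduced_composition(toy_sequence))
```

Either dictionary, read in a fixed key order, gives the kind of fixed-length vector that a composition-based classifier takes as input.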

Discussion

The structural information re-examined here indicates that intrinsic disorder is abundant in SV-IV. Thus, it was to be expected that homology modelling and fold-recognition strategies would be unable to create a theoretical model of the structural organization of SV-IV [22]. Indeed, we have used several disorder predictors to obtain novel evidence that the odd behaviour of SV-IV is not compatible with the classical sequence–structure–function paradigm. Our predictions suggest that: (a) the entire SV-IV sequence does not encode any region with globular organization; (b) a few isolated segments (mostly the C-terminal region) may possess some regular structure; (c) the prediction of regular structure almost exclusively comes from methods based on Protein Data Bank (PDB) missing coordinates (disembl routines, disopred2 and drippred) and secondary structure-derived propensities (globplot with Deleage–Roux and Russell–Linding parameters); and (d) the mean physicochemical properties of SV-IV are typical of IDPs, as suggested by methods based on visual inspection. This could provide a clue for the clarification of the still obscure aspects of the SV-IV structure–function relationships.

Lack of consensus affecting disorder prediction in some regions of SV-IV may result from the different sensitivity displayed by disorder predictors towards the various functional properties that are encoded in separate segments of the protein sequence. Indeed, integrity of the primary structure was found to be necessary for immunomodulation, whereas all of the procoagulant and anti-inflammatory properties were located in the fragment 1–70, which is devoid of any immunomodulatory activity, but possesses the same procoagulant and anti-inflammatory activity as the native protein. Moreover, the fragment 8–16 was the shortest N-terminal-derived peptide that possessed equivalent or slightly higher anti-inflammatory activity than the native protein, but did not possess any immunomodulatory or procoagulant activity. Finally, CNBr cleavage of SV-IV at the single Met70 residue generated the biologically inactive 71–90 peptide [16], suggesting that the immunomodulatory properties of SV-IV are strictly governed by the cooperation between this and the 1–70 region.

Fig. 4. Additional predictions of disorder. Comparison of the SV-IV sequence with the primary structures of 90 natively unfolded and 80 ideally globular proteins (same symbols as in Fig. 1) using the SVM method [26,57] (A) and an optimal set of artificial parameters [35] (B), both plotted against the number of residues in the protein. (C) Results obtained by other algorithms (panel annotations: DISpro, 1–90; VL2, 1–90).

Concerning the organization of SV-IV, the results reported here are in substantial agreement with previous secondary structure predictions, at least with regard to the 1–70 region. In fact, the self-association process that underlies the overall functional behaviour of the protein induces conformational changes mainly in this region, which has been suggested to be without secondary structure in the monomer, but to contain some α-helix in the trimer [22]. However, minor discrepancies amongst disorder predictions, as well as between disorder and secondary structure predictions, suggest that several peptide segments within the protein sequence might display chameleon structural behaviour. In this regard, previous experiments in buffer solution [18] have shown that a structural rearrangement of SV-IV takes place after treatment with 0.2–6.0 mM SDS. As this interval includes the critical micellar concentration of the surfactant (2.6 mM) [62,63], it may be inferred that SV-IV interacts with the membrane-like environment of SDS micelles, either through direct formation of a protein–surfactant complex or by an indirect process in which the micelle is formed first and the protein is then inserted into it. This process is totally different from the non-specific massive cooperative binding of SDS to proteins at submicellar concentrations, and mimics the situation that SV-IV experiences in most cell-based biological assays, where its multifaceted biological function involves efficient binding to the plasma membrane of its target cells (macrophages, T lymphocytes and polymorphonuclear cells) at specific sites (Kd ≅ 10⁻⁷–10⁻⁸) [16], and can be obtained only through large plasticity of the structure.

Materials and methods

Protein databases

The database of disordered proteins was created using a list of natively unfolded proteins [39] and the SWISS-PROT protein sequence data bank [64]. The ideal database of globular proteins is available at the address http://phys.protres.ru/resources/folded_80.html [35,37], as selected by inspecting the four general classes in the SCOP database (1.63 release) [65].

Physicochemical parameters

The mean protein hydrophobicity was calculated using the Kyte–Doolittle scale [66], rescaled to a range of 0–1 [33]. The expected average number of contacts per residue in the globular state was calculated according to [35]. The mean net charge was defined as the absolute value of the difference between the numbers of positively and negatively charged residues at pH 7.0, divided by the total residue number, according to [39]. The average structural B-factor (isotropic temperature factor) scale (2.0 SD) was obtained from [32], where only the B-factors for the Cα atoms were considered to minimize influence by crystal packing and other structural artefacts.
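Two of the parameters defined above lend themselves to a short worked sketch. It assumes that the rescaling of the Kyte–Doolittle scale to 0–1 is a linear min–max transformation (from −4.5 to 4.5), and that only Lys and Arg count as positive and only Asp and Glu as negative at pH 7.0, with His taken as neutral; both are our assumptions, and the function names are illustrative.

```python
# Kyte–Doolittle hydropathy values [66].
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(seq: str) -> float:
    """Mean Kyte–Doolittle hydropathy, linearly rescaled so that -4.5 -> 0 and 4.5 -> 1."""
    scaled = [(KYTE_DOOLITTLE[residue] + 4.5) / 9.0 for residue in seq]
    return sum(scaled) / len(scaled)


def mean_net_charge(seq: str) -> float:
    """|number of (K + R) - number of (D + E)| divided by sequence length (His assumed neutral)."""
    positive = sum(seq.count(residue) for residue in "KR")
    negative = sum(seq.count(residue) for residue in "DE")
    return abs(positive - negative) / len(seq)


# Toy sequences for illustration only (not SV-IV).
print(mean_hydrophobicity("IIII"))  # 1.0, the most hydrophobic residue
print(mean_net_charge("KKKD"))      # 0.5
```

Each protein then contributes a single point per pair of parameters to two-dimensional plots such as those in Fig. 1.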

Predictors of disorder

Below, we list all predictors used in this study, pointing out their salient features. A detailed description of each predictor is outside the scope of this paper, and the reader interested in more details is invited to refer to the relevant article(s).

The seg algorithm (http://mendel.imp.ac.at/METHODS/seg.server.html), based on the rationale that compact globular structures exhibit quasi-random statistical properties, is designed to detect regions of biased amino acid composition using mathematically defined properties [47]. The stringency of the search for low-complexity segments is determined by three user-defined parameters [trigger window, W; trigger complexity, K(1); extension complexity, K(2)], using the seg parameter sets 45, 3.4, 3.75 and 25, 3.0, 3.3 for long and short non-globular domains, respectively.

Predictors of natural disordered regions (PONDRs) included in the pondr collection (http://www.pondr.com) are typically feed-forward neural networks trained on non-redundant sets of ordered and disordered sequences that help to ensure modest predictor biases and to enable the predictors to generalize to new sequences [27–29]. PONDRs come in several versions depending on the sequence attributes taken over windows of 9–21 amino acids. These attributes, such as the fractional composition of particular amino acids, hydropathy or sequence complexity, are averaged over these windows, and the values are used to train the neural network during predictor construction. The same values are used as inputs to make predictions. The regional order neural network (ronn) software, originally developed to identify protease cleavage sites, is a method based on sequence alignment available at http://www.strubi.ox.ac.uk/RONN [48]. The iupred server at http://iupred.enzim.hu estimates favourable pairwise contacts in protein sequences and assigns order/disorder status based on the assumption that intrinsically unstructured/disordered proteins and domains (IUPs) have special sequences that do not fold because of their inability to form sufficient stabilizing inter-residue interactions [49]. The disembl software available at http://dis.embl.de is based on artificial neural networks trained to assign disorder by using three different definitions of disorder: residues within loops/coils, residues within loops with a high degree of mobility as determined from X-ray temperature factors (B-factors), and residues with PDB missing coordinates as defined by Remark465 entries in PDB [34]. The disopred2 disorder prediction server at http://bioinf.cs.ucl.ac.uk/disopred restricts the definition of disorder to those residues that appear in the sequence records but with coordinates missing from the electron density map, and an SVM was trained to specifically recognize these [50]. globplot (http://globplot.embl.de) is a web service based on the tendency of residues to be in an ordered or disordered state, and uses different propensity sets based on amino acid hydrophobicities (Kyte–Doolittle and Hopp–Woods), B-factors, PDB missing coordinates and secondary structure-derived propensities (Deleage–Roux and Russell–Linding) [32]. norsp is an on-line predictor of NORS regions that is not trained on any dataset and predicts segments in which the content in regular secondary structure is below 12% over at least 70 consecutive residues, and at least 10 consecutive residues are predicted to be exposed. It can be accessed at http://cubic.bioc.columbia.edu/services/NORSp [51].

The identification of hydrophobic clusters was performed by hca available at http://bioserv.rpbs.jussieu.fr, which allows globular regions to be easily distinguished from non-globular ones and, in globular regions, the identification of secondary structures [30]. prelink (http://genomics.eu.org/spip/PreLink) is an hca-derived method that calculates the amino acid distributions in structured and unstructured regions, the probability that a given sequence fragment is part of either a structured or an unstructured region, and the distance of each amino acid to the nearest hydrophobic cluster. Using these three values along a protein sequence, unstructured regions can be predicted with very simple rules [31]. The multicoil program (http://groups.csail.mit.edu/cb/multicoil/cgi-bin/multicoil.cgi) predicts the location of coiled-coil regions in amino acid sequences and classifies the predictions as dimeric or trimeric [55]. coils (http://ch.embnet.org/software/COILS_form.html) is a program that compares a sequence with a database of known parallel two-stranded coiled-coils and derives a similarity score. By comparing this score with the distribution of scores in globular and coiled-coil proteins, the program then calculates the probability that the sequence will adopt a coiled-coil conformation [56].
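The first stage of a seg-like search, in which windows of length W are flagged when their compositional complexity does not exceed the trigger value K(1), can be sketched as follows. This is a simplified illustration and not the seg program: it assumes that complexity is measured as the Shannon entropy of the window composition in bits, and it omits the extension stage governed by K(2) and any subsequent refinement. The function names and test sequences are ours.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Compositional complexity of a window, in bits per residue."""
    n = len(window)
    counts = Counter(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def trigger_windows(seq: str, window: int = 45, trigger: float = 3.4) -> list:
    """0-based start positions of windows whose entropy is <= trigger."""
    return [
        start
        for start in range(len(seq) - window + 1)
        if shannon_entropy(seq[start:start + window]) <= trigger
    ]


# Toy sequences for illustration only (not SV-IV).
low_complexity = "S" * 50                       # a single residue type
high_complexity = "ACDEFGHIKLMNPQRSTVWY" * 3    # all 20 residue types, evenly used

print(trigger_windows(low_complexity))    # every window is flagged
print(trigger_windows(high_complexity))   # no window is flagged
```

Under this reading, the parameter set 45, 3.4, 3.75 corresponds to `window=45` and `trigger=3.4`, with 3.75 reserved for the extension stage that is not modelled here.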

Predictions with improved performance were carried out by the foldunfold web server available at http://skuld.protres.ru/~mlobanov/ogu/ogu.cgi, based on the observation that disorder is connected to a weak expected packing density, as evaluated by the observed number of contacts within 8 Å for each amino acid residue in the globular state [35–38], and the SVM-based poodle (prediction of order and disorder by machine learning, http://mbs.cbrc.jp/poodle) system. The poodle suite predicts protein disorder from amino acid sequences and provides three types of predictions: poodle-l and poodle-s predict long disorder regions (mainly longer than 40 consecutive amino acids) and short disorder regions, respectively; poodle-w is for binary prediction of whole protein disorder [44–46].

Another SVM method for recognizing IDPs was applied according to the procedure described in [26,57], using the mySVM implementation of SVM theory by Rüping [67]. The set of artificial parameters for 20 amino acid residues calculated by the Monte Carlo algorithm to maximally separate natively unfolded and ideally globular proteins was obtained from [35]. Additional predictions were performed by: dispro software (http://www.igb.uci.edu/servers/psss.html), which relies on machine learning methods and leverages evolutionary information as well as predicted secondary structure and relative solvent accessibility [58]; the VL2 and VL3 predictors available at http://www.ist.temple.edu/disprot/predictor.php, which rely on partitioning protein disorder into flavours based on competition amongst increasing numbers of predictors [59] and on an ensemble of feed-forward neural networks based on the same attributes as VL2 [60], respectively; and the drippred server (http://www.sbc.su.se/~maccallr/disorder), developed for sequence profile visualization and contact map prediction, which predicts structural disorder by looking for sequence patterns that are not typically found in the PDB [61].

Acknowledgements

This paper is dedicated to the memory of the unforgettable Harold C. Helgeson (a.k.a. Hal), founder of the Laboratory of Theoretical Geochemistry and Biogeochemistry at U. C. Berkeley (a.k.a. Prediction Central), who is probably sailing off the coast near Margaritaville. The authors are grateful to V. N. Uversky for his help in creating the list of natively unfolded proteins.

References

1 Dunker AK, Brown CJ, Lawson JD, Iakoucheva LM & Obradovic Z (2002) Intrinsic disorder and protein function. Biochemistry 41, 6573–6582.

2 Wright PE & Dyson HJ (1999) Intrinsically unstructured proteins: re-assessing the protein structure–function paradigm. J Mol Biol 293, 321–331.

3 Dyson HJ & Wright PE (2005) Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol 6, 197–208.

4 Dunker AK & Obradovic Z (2001) The protein trinity – linking function and disorder. Nat Biotechnol 19, 805–806.

5 Uversky VN (2002) Natively unfolded proteins: a point where biology waits for physics. Protein Sci 11, 739–756.

6 Radivojac P, Iakoucheva LM, Oldfield CJ, Obradovic Z, Uversky VN & Dunker AK (2007) Intrinsic disorder and functional proteomics. Biophys J 92, 1439–1456.

7 Tompa P (2002) Intrinsically unstructured proteins. Trends Biochem Sci 27, 527–533.

8 Metafora S, Esposito C, Caputo I, Lepretti M, Cassese D, Dicitore A, Ferranti P & Stiuso P (2007) Seminal vesicle protein IV and its derived active peptides: a possible physiological role in seminal clotting. Semin Thromb Hemost 33, 53–59.

9 Ostrowski MC, Kistler MK & Kistler WS (1979) Purification and cell-free synthesis of a major protein from rat seminal vesicle secretion. A potential marker for androgen action. J Biol Chem 254, 383–390.

10 Pan Y-CE & Li SSL (1982) Structure of secretory protein IV from rat seminal vesicles. Int J Pept Protein Res 20, 177–187.

11 Harris SE, Mansson P-E, Tully DB & Burkhart B (1983) Seminal vesicle secretion IV gene: allelic difference due to a series of 20-base-pair direct tandem repeats within an intron. Proc Natl Acad Sci USA 80, 6460–6464.

12 Kandala C, Kistler MK, Lawther RP & Kistler WS (1983) Characterization of a genomic clone for rat seminal vesicle secretory protein IV. Nucleic Acids Res 11, 3169–3186.

13 McDonald C, Williams L, McTurck P, Fuller F, McIntosh E & Higgins S (1983) Isolation and characterisation of genes for androgen-responsive secretory proteins of rat seminal vesicles. Nucleic Acids Res 11, 917–930.

14 D’Ambrosio E, Del Grosso N, Ravagnan G, Peluso G & Metafora S (1993) Cloning and expression of the rat genomic DNA sequence coding for the secreted form of the protein SV-IV. Bull Mol Biol Med 18, 215–223.

15 Metafora S, Facchiano F, Facchiano A, Esposito C, Peluso G & Porta R (1987) Homology between rabbit uteroglobin and the rat seminal vesicle sperm binding protein: prediction of structural features of glutamine substrates for transglutaminase. J Protein Chem 6, 353–359.

16 Ialenti A, Santagada V, Caliendo G, Severino B, Fiorino F, Maffia P, Ianaro A, Morelli F, Di Micco B, Cartenì M et al. (2001) Synthesis of novel anti-inflammatory peptides derived from the amino-acid sequence of the bioactive protein SV-IV. Eur J Biochem 268, 3399–3406.

17 Miele L, Cordella-Miele E, Facchiano A & Mukherjee AB (1988) Novel anti-inflammatory peptides from the region of highest similarity between uteroglobin and lipocortin I. Nature 335, 726–730.

18 Stiuso P, Ragone R, De Santis A, Metafora S, Peluso G, Ravagnan G & Colonna G (1989) Structural properties of rat seminal vesicle protein IV: effect of sodium dodecylsulfate. In Biochemical Aspects on the Immunopathology of Reproduction (Spera G, Mukherjee AB, Ravagnan G & Metafora S, eds), pp. 105–111. Acta Medica, Rome.

19 Ragone R, Facchiano F, Facchiano A, Facchiano AM & Colonna G (1989) Flexibility plot of proteins. Protein Eng 2, 497–504.

20 Stiuso P, Metafora S, Facchiano AM, Colonna G & Ragone R (1999) The self association of protein SV-IV and its possible functional implications. Eur J Biochem 266, 1029–1035.

21 Tufano MA, Porta R, Farzati B, Di Pierro P, Rossano F, Catalanotti P, Baroni A & Metafora S (1996) Rat seminal vesicle protein SV-IV and its transglutaminase-synthesized polyaminated derivative Spd2-SV-IV induce cytokine release from human resting lymphocytes and monocytes in vitro. Cell Immunol 168, 148–157.

22 Caporale C, Caruso C, Colonna G, Facchiano A, Ferranti P, Mamone G, Picariello G, Colonna F, Metafora S & Stiuso P (2004) Structural properties of the protein SV-IV. Eur J Biochem 271, 263–271.

23 Ferron F, Longhi S, Canard B & Karlin D (2006) A practical overview of protein disorder prediction methods. Proteins 65, 1–14.

24 Romero P, Obradovic Z, Li X, Garner EC, Brown CJ & Dunker AK (2001) Sequence complexity of disordered protein. Proteins 42, 38–48.

25 Sickmeier M, Hamilton JA, LeGall T, Vacic V, Cortese MS, Tantos A, Szabo B, Tompa P, Chen J, Uversky VN et al. (2007) DisProt: the database of disordered proteins. Nucleic Acids Res 35, D786–793.

26 Weathers EA, Paulaitis ME, Woolf TB & Hoh JH (2004) Reduced amino acid alphabet is sufficient to accurately recognize intrinsically disordered protein. FEBS Lett 576, 348–352.

27 Romero P, Obradovic Z & Dunker AK (1997) Sequence data analysis for long disordered regions prediction in the calcineurin family. Genome Inform 8, 110–124.

28 Li X, Romero P, Rani M, Dunker AK & Obradovic Z (1999) Predicting protein disorder for N-, C-, and internal regions. Genome Inform 10, 30–40.

29 Obradovic Z, Peng K, Vucetic S, Radivojac P & Dunker AK (2005) Exploiting heterogeneous sequence properties improves prediction of protein disorder. Proteins 61 (Suppl 7), 176–182.

30 Gaboriaud C, Bissery V, Benchetrit T & Mornon JP (1987) Hydrophobic cluster analysis: an efficient new way to compare and analyse amino acid sequences. FEBS Lett 224, 149–155.

31 Coeytaux K & Poupon A (2005) Prediction of unfolded segments in a protein sequence based on amino acid composition. Bioinformatics 21, 1891–1900.

32 Linding R, Russell RB, Neduva V & Gibson TJ (2003) GlobPlot: exploring protein sequences for globularity and disorder. Nucleic Acids Res 31, 3701–3708.
