What determines the degree of compactness of a calcium-binding protein?


Liliane Mouawad1, Adriana Isvoran2, Eric Quiniou1 and Constantin T. Craescu1

2 Department of Chemistry, West University of Timisoara, Romania

Calcium transport and/or regulation are important events for the normal morphology and metabolism of the cell and play significant roles in the mechanisms of many disease processes [1]. The proteins that interact with the calcium ions involved in these events are called calcium-binding proteins (CaBPs). They form two main subfamilies: the EF-hand CaBPs and the non-EF-hand CaBPs. EF-hand CaBPs, whose prototype is calmodulin [2], are characterized by the presence of structural motifs called 'EF-hands'. Non-EF-hand CaBPs do not use this structural motif to bind calcium; they may be found in the cytoplasm (such as C2 domain proteins) [3], in the extracellular medium [4] or associated with the membrane (such as annexins) [5].

For the EF-hand CaBPs, each EF-hand motif contains two helices connected by the calcium-binding loop, a highly conserved region that binds the metal ion. Many CaBPs exhibit two domains, each containing two EF-hand motifs; the N-terminal (helices A, B, C and D) and C-terminal (helices E, F, G and H) domains are connected by a linker region (Fig. 1).

Keywords

calcium-binding proteins; centrin; EF-hand; hydrophobicity; predicted form

Correspondence

Curie-Recherche, Centre Universitaire Paris-Sud, Bâtiment 112, 91405 Orsay Cedex, France

Fax: +33 1 69 07 53 27

Tel: +33 1 69 86 71 51

E-mail: liliane.mouawad@curie.u-psud.fr

(Received 8 September 2008, revised 8 December 2008, accepted 10 December 2008)

doi:10.1111/j.1742-4658.2008.06851.x

The EF-hand calcium-binding proteins may exist in either an extended or a compact conformation. This conformation is sometimes correlated with the function of the calcium-binding protein. For those proteins whose structure and function are known, calcium sensors are usually extended and calcium buffers compact; hence, there is interest in predicting the form of the protein starting from its sequence. In the present study, we used two different procedures: one that already exists in the literature, the sosuidumbbell algorithm, mainly based on the charges of the two EF-hand domains, and a novel procedure based on linker average hydrophilicity. The linker consists of the residues that connect the domains. The two procedures were tested on 17 known-structure calcium-binding proteins and then applied to 59 unknown-structure centrins. The sosuidumbbell algorithm yielded the correct conformations for only 15 of the known-structure proteins and predicted that all centrins should be in a closed form. The linker average hydrophilicity procedure discriminated well between all the extended and non-extended forms of the known-structure calcium-binding proteins, and its predictions concerning centrins reflected their phylogenetic classification well. The linker average hydrophilicity criterion is a simple and powerful means to discriminate between extended and non-extended forms of calcium-binding proteins. What is remarkable is that only a few residues that constitute the linker (between 2 and 20 in our tested sample of proteins) are responsible for the form of the calcium-binding protein, showing that this form is mainly governed by short-range interactions.

Abbreviations

CaBP, calcium-binding protein; LAH, linker average hydrophilicity; PDB, Protein Data Bank.


EF-hand CaBPs are divided into two broad classes [6]: those that bind calcium to regulate its concentration (calcium-buffering and calcium-transporting proteins) and those that bind calcium to decode its signal (calcium-sensor proteins). The two functional classes also have different structural features: calcium-buffering and calcium-transporting proteins, such as parvalbumin [7] or the Nereis diversicolor sarcoplasmic calcium-binding protein [8], usually have a compact tertiary structure and are not conformationally sensitive to calcium binding, whereas calcium-sensor proteins, such as calmodulin [2] and troponin C [9], have extended tertiary structures and show important conformational changes upon calcium binding. In the extended form, the linker between the two domains may be structured in a straight helix, whereas, in the non-extended form, the linker is unstructured, leading to either a floppy conformation or a very compact one (Fig. 2) [10]. It is important to understand the physical reasons for these differences. This would provide tools to predict the form of the CaBPs from their sequences, and therefore indicate their biological function.

Recently, a protein classification tool, sosuidumbbell [11], was developed to predict the degree of compactness of proteins starting from their amino acid sequences. This tool is based on studies undertaken on all the monomers of the Protein Data Bank (PDB) [12], and not just CaBPs, indicating that the electrostatic repulsion between the domains is a dominant factor in the stabilization of the extended structures, in addition to the amphiphilic character of the central flexible region. By contrast, globular proteins are predicted to be stabilized by a hydrophobic core built by residues from the two domains. Using the sosuidumbbell algorithm, we have analyzed 17 CaBPs with known 3D structures (Table 1). Fifteen of them were predicted in the correct form but, unfortunately, two structures were incorrectly predicted. Indeed, human calmodulin-like protein (1GGZ) [13] and human centrin 2 (2GGM) [14], which are extended proteins, were predicted to be compact. These exceptions represent a non-negligible percentage (12%) and they emphasize the need for a more detailed analysis of the sequence–structure relationship in the case of CaBPs.

In the present study, we have developed a novel procedure based on the linker average hydrophilicity (LAH), which we applied to our sample of 17 known-structure CaBPs and to centrins of unknown structure. Centrins, a subfamily of CaBPs, are essential components of microtubule-organizing centers in organisms ranging from algae and yeast to humans [15,16]. They are EF-hand calcium-binding proteins with a sequence similarity to calmodulin but distinct calcium-binding properties [15]. They were shown to be involved in centrosome duplication [17] and the contraction of centrin-based fiber systems [18] and to play a functional role in nuclear export pathways [19]. The Ca2+ dependence of the centrin interactions with their targets suggests that centrins play a regulatory role by activating or changing the conformation of various target proteins. Analyses of amino acid sequences of centrins from different organisms reveal at least four phylogenetic families and several phylogenetic subfamilies [20,21]. The centrins that we consider in the present study are listed in Table 2: (a) the Chlamydomonas reinhardtii-like family (CrCen-like), which contains centrins from the subfamilies of green algae and vertebrate isoforms Cen1 and Cen2; (b) the higher plants Arabidopsis-like family (AtCen-like); (c) the yeast Saccharomyces cerevisiae-like family (Cdc31-like), which contains mainly two subfamilies, fungal centrins and the vertebrate isoform Cen3; and (d) the Paramecium tetraurelia infraciliary lattice family (PtICL1-like), organized in ten subfamilies that contain 35 identified isoforms [22]. The 3D structure of the entire protein in complex with its target polypeptide is known for only two centrins: the human centrin HsCen2 (2GGM) [14] and the Saccharomyces cerevisiae centrin ScCdc31 (2DOQ) [23].

Fig. 1. Schematic representation of the EF-hand protein. Each EF-hand motif consists of two helices linked by a calcium loop (black dots represent calcium ions). Two motifs constitute one EF-hand domain. The N- and C-domains are bound by a linker (bold line). Labels in the figure: Loop I, Loop II, Loop III, Loop IV, Linker.

Fig. 2. View of the 3D structures of two CaBPs: (A) the extended form of calmodulin (PDB code: 1CLL) and (B) the non-extended form of guanylate cyclase activating protein 2 (PDB code: 1JBA). The helices are in cyan, the β-sheets are in yellow and the linker is in red. The linker in 1CLL is structured, whereas it is a loop in the non-extended structure.

The functional diversity of centrins should depend on their sequence and their Ca2+ binding properties. However, we may ask whether the global conformation or the conformational preference of individual centrin molecules also plays a role in target recognition and in the plasticity of heteromolecular complexes. This idea is supported by the recent observation that yeast ScCdc31 bound to a ScSfi1 fragment shows a bent conformation [23], whereas human HsCen2 in complex with an XPC peptide is completely extended [14]. In the present study, we present a new and simple theoretical procedure for the global shape prediction of EF-hand proteins that allows us to analyze the possible shape diversity of the centrins presented in Table 2.

Results and Discussion

Utilization of the SOSUIDUMBBELL algorithm

We first applied the sosuidumbbell algorithm (http://bp.nuap.nagoya-u.ac.jp/sosui/sosuidumbbell/dumbbell_submit.html) to all the CaBPs with known 3D structures (Table 1). In this algorithm, a structure is predicted to be extended if it obeys four criteria: (a) the absolute value of the net charge of the entire protein is higher than 20 (|Qprot| > 20); (b) the absolute net charge density (|Qprot|/N, where N is the total number of residues) is higher than 0.14 (dQ > 0.14); (c) there is a charge balance between the two domains (|QNQC| > 100); and (d) there is a high amphiphilicity at the center of the linker region and a high hydropathy at its termini [11]. Based on these four criteria, the results yielded 15 well-predicted structures and two incorrectly predicted ones. The latter are human calmodulin-like protein (1GGZ) and human centrin 2 (2GGM), the structures of which are extended but predicted as non-extended. Therefore, the question remained as to which of the four criteria described above is responsible for this misprediction.

To address this question, we first verified the first two criteria. For this purpose, we calculated the absolute net charge and the charge density of the entire protein for all the investigated CaBPs (Table 3), with known and unknown structures (Tables 1 and 2). First, we followed exactly the procedure described by Uchikoga et al. [11], namely that histidine residues were considered as positively charged (although at the pH values corresponding to the great majority of the experiments, they are deprotonated) and the calcium ions that might bind to the protein were omitted. The results (Fig. 3A,B and Table 3) show that, as indicated above, only five known-structure proteins are predicted to be extended instead of the seven expected (1GGZ and 2GGM are mispredicted) and all centrins with unknown structures are predicted in a non-extended form. In a second step, the histidines were considered neutral (CaBPs usually contain very few His residues) and the Ca2+ ions were added, but the results were even worse (data not shown) because the net charge was diminished and therefore the structures were predicted to be even more compact. The first two criteria appear to be responsible for the misprediction of the form of 1GGZ and 2GGM. Moreover, concerning centrins with unknown structures, some experimental results (C. T. Craescu & S. Miron, unpublished data) in addition to the phylogenetic classification indicate that at least the CrCen family proteins should be in an extended form, which is not the case in the prediction based on the first two criteria.

Table 1. Features of the known-structure CaBPs used in the present study, showing the name of the protein, its code in the PDB, its code in SwissProt, its experimental structure and the structure predicted by the SOSUIDUMBBELL algorithm (http://bp.nuap.nagoya-u.ac.jp/sosui/sosuidumbbell/dumbbell_submit.html). CIB, calcium- and integrin-binding protein; SCBP, sarcoplasmic calcium-binding protein. Column headings: Protein; PDB code; SwissProt code; Experimental structure; Structure predicted by the SOSUIDUMBBELL algorithm.

Table 2. Phylogenetic classification of centrins. All centrins considered in the present study (with known and unknown structures) are classified by families and subfamilies. The PDB codes of the known structures of fragments (*) or the entire protein are given.
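The first two criteria can be illustrated with a minimal sketch. It assumes the simple residue-type charge assignment described in the text (Lys, Arg and His counted as +1, Asp and Glu as -1, no calcium ions); the function names and the optional `histidine_charged` switch are hypothetical and are not taken from the sosuidumbbell implementation.

```python
# Sketch of criteria (a) and (b) as described in the text.
# Assumption: charges depend only on residue type (no pKa model, no chain termini).
POSITIVE = set("KRH")  # histidine counted as positive, following the procedure in the text
NEGATIVE = set("DE")


def net_charge(sequence: str, histidine_charged: bool = True) -> int:
    """Return the net charge of a one-letter amino acid sequence."""
    charge = 0
    for residue in sequence.upper():
        if residue in NEGATIVE:
            charge -= 1
        elif residue in POSITIVE and (residue != "H" or histidine_charged):
            charge += 1
    return charge


def passes_charge_criteria(sequence: str, q_min: float = 20.0, dq_min: float = 0.14) -> bool:
    """Check criterion (a) |Q_prot| > 20 and criterion (b) |Q_prot| / N > 0.14."""
    q_abs = abs(net_charge(sequence))
    return q_abs > q_min and q_abs / len(sequence) > dq_min
```

With an artificial sequence such as `"D" * 25 + "A" * 100`, the net charge is -25 and the density is 0.2, so both thresholds are exceeded; real CaBP sequences would have to be supplied by the reader.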

The last two criteria in the sosuidumbbell algorithm are strongly dependent on the definition of the domains and the inter-domain linker. The delimitation of this linker is not always obvious: in the extended structures, it forms a helix in the continuity of helices D and E, whereas, in some compact conformations, it is a very short unstructured region (Fig. 2). In the sosuidumbbell algorithm, the linker considered may be too long and, consequently, the domains too short, as for calmodulin, where helices D and E, which belong to the N- and C-domains, respectively, are considered as parts of the linker [11]. In the present study, to determine the linker, we first identified the calcium-binding loops (Fig. 1), then we counted ten residues after loop II (corresponding to helix D) and ten residues before loop III (corresponding to helix E), and the remaining residues in between were considered as the inter-domain linker. Ten residues were considered for helices D and E because the experimental structural data show that a helix belonging to an EF-hand motif contains ten residues on average. Consequently, in the proteins investigated in the present study, the linker was between two and 20 residues long (Table 3), corresponding to 0.96% and 10.26%, respectively, of the protein sequence length.
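The delimitation rule can be written down in a few lines. The sketch below assumes 1-based residue numbering and that the last residue of loop II and the first residue of loop III are already known; the function name and the example boundaries (106 and 132) are hypothetical values chosen only for illustration.

```python
def linker_positions(loop2_end: int, loop3_start: int, helix_length: int = 10) -> range:
    """Residue numbers of the inter-domain linker.

    Ten residues after loop II (helix D) and ten residues before loop III
    (helix E) are skipped; whatever remains in between is the linker.
    """
    first = loop2_end + helix_length + 1
    last = loop3_start - helix_length - 1
    return range(first, last + 1)  # empty when helices D and E meet or overlap


# Hypothetical loop boundaries: loop II ends at residue 106, loop III starts at 132.
linker = linker_positions(106, 132)
print(list(linker), len(linker))
```

Returning a `range` keeps the residue numbers explicit, so the same object can be used both to slice a sequence and to report the linker length as a fraction of the protein length.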

Based on this definition of the linker, the charges of the N- and C-domains were calculated without considering the calcium ions. In Fig. 3C, we report the absolute value of the product of these charges, |QNQC|, which represents the charge balance between the domains. With the exception of troponins, all the investigated proteins are characterized by products |QNQC| lower than 100, and therefore are predicted to be non-extended.
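As a rough illustration of the third criterion, the product of the domain charges can be computed once the linker is fixed. This is a sketch under stated assumptions: the N-domain is taken as everything before the linker, the C-domain as everything after it, histidine is counted as positive, and the function names are hypothetical.

```python
def charge(segment: str) -> int:
    """Net charge with K, R, H = +1 and D, E = -1 (calcium ions ignored)."""
    return sum((aa in "KRH") - (aa in "DE") for aa in segment.upper())


def domain_charge_product(sequence: str, linker_first: int, linker_last: int) -> int:
    """|Q_N * Q_C| for the domains on either side of the linker (1-based, inclusive bounds)."""
    n_domain = sequence[: linker_first - 1]   # residues 1 .. linker_first - 1
    c_domain = sequence[linker_last:]         # residues linker_last + 1 .. end
    return abs(charge(n_domain) * charge(c_domain))
```

For an artificial sequence `"D" * 12 + "A" * 3 + "E" * 9` with the linker at residues 13 to 15, the domain charges are -12 and -9 and the product is 108, just above the limit of 100 quoted in the text.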

From these results, it is clear that, for CaBPs, the charges of the entire protein or of the separated domains are not responsible for the extended or compact form of the protein. This is particularly clear in the case of human centrin 2 (HsCen2). In this protein, the first 25 amino acids, corresponding to a disordered region, are highly charged [24,25], with the net charge of this peptide being equal to +6 (it contains seven basic residues and one acidic residue). The X-ray structure of this protein was obtained in the presence [14] and in the absence [25] of these residues (PDB codes 2GGM and 2OBH, respectively). In both cases, HsCen2 adopts an extended conformation, showing that the charge balance of the domains does not play an important role for this protein. Nevertheless, in both cases, the sosuidumbbell algorithm predicts a non-extended form, which is not correct. Moreover, the structure of all the extended forms of the CaBPs considered in the present study was determined experimentally in the presence of calcium ions. Because these ions significantly reduce the charges of the domains and therefore their electrostatic repulsions, calcium binding should favor the compact structure of CaBPs if charge were the determining factor, which is not the case.

The fourth criterion of the sosuidumbbell tool refers to the hydrophobicity of the central linker region, which is calculated using the Kyte & Doolittle scale [26]. Uchikoga et al. [11] described the linker region of an extended protein as having a strongly negative hydrophobicity in its center (i.e. being significantly hydrophilic), whereas its edges (helices D and E) are hydrophobic. In the present study, the same calculations were applied to all known-structure proteins, and it was observed that, in some cases, non-extended proteins (e.g. recoverin; 1REC) present the same hydropathy profile around the linker as extended proteins, such as calmodulin or troponin C (1OSA and 4TNC; Fig. 3D). Therefore, none of the criteria retained in the sosuidumbbell algorithm are completely reliable to predict the form of the CaBPs. This motivated our search for other criteria.

Table 3 (column headings): QN; QC; Qlink; Qprot; |QNQC|; dQ = |Qprot|/N; linker length n.

Utilization of other criteria

Contact area

We analyzed the contact area between the domains of known-structure non-extended CaBPs. As expected, most of the residues at the interface were found to be hydrophobic. In most compact structures, a tryptophan (or less frequently a phenylalanine) located in one domain was buried in a hydrophobic cavity in the other domain, which would stabilize the compact structure. Unfortunately, this observation cannot be used as a predictive tool starting from the sequence because the aromatic residue is not located in a specific part of it. Indeed, the sequence of the linker and its close vicinity (three more residues on each side of the linker) does not always contain tryptophan or phenylalanine residues for compact forms (see 1REC, 1JBA, 1BJF and 2SCP in Table 3).

The presence of helix breakers

Prolines and, to a lesser extent, glycines are well-known helix breakers. We investigated the presence of such residues in the linker or its vicinity (i.e. plus three residues on each side of the linker). The results presented in Table 3 show that, as expected, the presence of a Pro yields a non-extended form by breaking the central helix that constitutes the linker, but the reverse is not true because not all the compact CaBPs contain a Pro in the linker. Therefore, this criterion cannot constitute a predictive rule. Moreover, concerning glycines, it was observed that, in both troponin C proteins (4TNC and 1TN4), which are extended, there is one Gly in the linker, as in bovine recoverin (1REC), guanylate cyclase activating protein 2 (1JBA) and bovine neurocalcin δ (1BJF), which present very compact structures.

Fig. 3. (A) The absolute net charge of the protein (|Qprot|) versus protein number for the known extended structures (filled circles), the known non-extended structures (open diamonds) and the unknown structures of centrins (filled triangles). It can be seen that two extended structures are mispredicted (1GGZ and 2GGM) and that all the unknown-structure centrins are predicted to be non-extended. (B) The absolute net charge density (dQ) with a horizontal line limit at 0.14. (C) The absolute value of the product of the two domain charges (|QNQC|) in the absence of calcium ions with a horizontal line limit at 100. In this case, only troponin C molecules are predicted to be extended. (D) The hydrophobicity profile of the linker region and its surroundings (helix D, linker, helix E) using the Kyte & Doolittle scale for two extended structures (dotted line, 1OSA and dashed line, 4TNC) and for a non-extended one (solid line, 1REC), plotted against the relative residue number. For convenience of comparison, the three sequences were renumbered and centered on the linker. The zero point corresponds to residue number 92 in 4TNC, 81 in 1OSA and 98 in 1REC, which represents the center of the linker in each case.
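The search for helix breakers in the linker and its vicinity amounts to a simple scan. The following sketch assumes 1-based residue numbering and a flank of three residues on each side, as stated in the text; the function name is hypothetical.

```python
from typing import List, Tuple


def helix_breakers(sequence: str, linker_first: int, linker_last: int,
                   flank: int = 3) -> List[Tuple[int, str]]:
    """Return (residue number, residue) for every Pro or Gly found in the
    linker extended by `flank` residues on each side."""
    start = max(1, linker_first - flank)
    end = min(len(sequence), linker_last + flank)
    return [(pos, sequence[pos - 1])
            for pos in range(start, end + 1)
            if sequence[pos - 1] in "PG"]
```

As the text points out, the output is descriptive only: a Pro in this window is associated with non-extended forms, but an empty result does not imply an extended form.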

Net electric charge of the linker

It might be assumed that the net electric charge of the linker plays a role if there is repulsion between this linker and the adjacent domains. Thus, this property was investigated (Table 3) but did not yield a good discriminating criterion because, in HsCen2 (2GGM), which is extended, the linker is neutral, as in bovine neurocalcin δ (1BJF) or amphioxus sarcoplasmic calcium-binding protein (2SAS), which are non-extended structures.

Hydrophilicity of the linker

The criterion that yielded the best results was based on the hydrophilicity of the linker. It was obtained by the procedure detailed below. First, the hydrophilicity (hi) of each residue i of the protein was calculated using the Hopp & Woods scale [27] with a nine-residue sliding window. In this scale, positive values correspond to hydrophilic positions.

Second, the linker was determined as described above: if the last residue of calcium-binding loop II is denoted J and the first residue of calcium-binding loop III is denoted K, the linker consists of all residues in the open interval ]J + 10, K − 10[.

Finally, the LAH was calculated:

$$\mathrm{LAH} = \frac{1}{n} \sum_{i \in\, ]J+10,\; K-10[} h_i$$

where n is the number of residues in the linker and hi is the hydrophilicity at position i of the linker.

This procedure was applied to all proteins in Tables 1 and 2. The results are presented in Fig. 4. Remarkably, the LAH values discriminated well between the extended and non-extended forms of the known structures of the CaBPs, with two distinct sets of points, where LAH was greater than 1.6 for the extended forms and < 1.2 for the others. Therefore, the midpoint value of 1.4 was taken as the threshold above which a two-domain EF-hand protein is extended. Moreover, one of the reviewers of the present study suggested the case of calcineurin B-like protein 2 from Arabidopsis (SwissProt code: Q8LAS7, PDB code: 1UHN), which we had omitted from our sample. The protein consists of 226 residues and the linker of five residues (residues 117–121). The calculated LAH value is 0.2978, predicting a compact structure in good agreement with the 3D structure of the protein. Considering centrins with unknown structures, it can be seen that the LAH values reflect the phylogenetic classification well, although this classification is based on the entire sequence, whereas LAH is based on only a few residues in the linker region.
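The LAH procedure can be sketched as follows. This is an illustrative reimplementation under several assumptions, not the authors' program: the Hopp & Woods values are those commonly tabulated for this scale and should be checked against ref. [27]; the nine-residue window is simply truncated near the chain ends; residue numbering is 1-based; and all function names are hypothetical.

```python
# Hopp & Woods hydrophilicity values as commonly tabulated (assumption: verify against ref. [27]).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3,
    "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def smoothed_hydrophilicity(sequence, window=9):
    """Sliding-window mean of the per-residue values.
    Assumption: windows are truncated at the chain ends."""
    raw = [HOPP_WOODS[aa] for aa in sequence.upper()]
    half = window // 2
    profile = []
    for i in range(len(raw)):
        chunk = raw[max(0, i - half): i + half + 1]
        profile.append(sum(chunk) / len(chunk))
    return profile


def linker_average_hydrophilicity(sequence, loop2_end, loop3_start, window=9):
    """LAH: mean smoothed hydrophilicity over residues J + 11 .. K - 11 (1-based)."""
    profile = smoothed_hydrophilicity(sequence, window)
    linker = profile[loop2_end + 10: loop3_start - 11]  # 0-based slice of the open interval
    if not linker:
        raise ValueError("empty linker: loops II and III are too close")
    return sum(linker) / len(linker)


def predicted_form(lah, threshold=1.4):
    """Apply the 1.4 threshold quoted in the text."""
    return "extended" if lah > threshold else "non-extended"
```

Passing `window=1` reproduces the raw-hydrophilicity variant (a one-residue window). Because the edge handling and scale values are assumptions, results from this sketch should not be expected to match the published LAH values exactly.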

To determine whether the discrimination potency of the linker average hydrophilicity is fortuitous or not, LAH values were plotted against the radius of gyration of the known structures in Fig. 5. A clear correlation is demonstrated between these two features, with a correlation coefficient equal to 0.82 and a Student coefficient of 36.98 (for 16 degrees of freedom that correspond to 17 points), indicating that the probability of this correlation being random is < 0.001. The LAH algorithm is available at: http://u759.curie.u-psud.fr/modelisation/LAH
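For readers who wish to reproduce the structural side of this comparison, the radius of gyration can be computed directly from atomic coordinates. The sketch below assumes an unweighted definition (every point counts equally, for example one Cα atom per residue); the text does not state which atoms or weights were used, so this choice and the function name are assumptions.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points,
    returned in the same length unit as the input coordinates."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    mean_sq = sum(
        sum((p[k] - center[k]) ** 2 for k in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)
```

An extended, dumbbell-shaped protein spreads its atoms far from the center and therefore gives a larger value than a compact protein of similar size, which is why this quantity serves as a measure of compactness here.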

The predictive potency of the present method depends on the determination of the linker limits, which must be defined objectively. To find such a definition, several delimitations were tested, including the one used in the sosuidumbbell tool. We have observed that considering long linkers, which overlap adjacent helices, does not allow us to discriminate between the different forms of CaBPs because the results were polluted by the nature of the extra residues, whereas the shortest possible linkers provided the most reliable way to discriminate between the extended and compact forms. However, it must be noted that the influence of four neighboring residues at both ends of the linker is taken indirectly into account because of the nine-residue window used in the calculations of hydrophilicity. Raw hydrophilicity data (equivalent to a one-residue window) were also tested to check the importance of this influence. The results were qualitatively similar to those obtained with the nine-residue window with respect to the prediction of the form of the protein, but the correlation between LAH and the radius of gyration was less evident. Moreover, this discrimination was possible when LAH was calculated with the Hopp & Woods scale for hydropathy. Three other scales were tested (Kyte & Doolittle [26], Miyazawa & Jernigan [28] and Janin [29]) but did not provide satisfactory results. This is mainly due to the scores attributed to the Asn, Gln and Trp residues, which are considered to be much more hydrophilic in these scales than in the Hopp & Woods scale.

Fig. 4. The LAH for the investigated proteins, plotted against protein number. The horizontal line separates the predicted extended structures (LAH > 1.4) from the predicted non-extended ones. The symbols distinguish the known extended structures (filled circles), the known non-extended structures (open diamonds) and the unknown structures of centrins (filled triangles). For the unknown-structure centrins, we indicate the phylogenetic subfamilies (labels in the figure: Cen2, AtCen, ICL11, ICL1e, ICL5, ICL10, Cdc31, ICL8, ICL9).

Applying the LAH method to centrins showed that the CrCen-like proteins are predicted to be extended, which is in good agreement with the known structure of one member of this family, HsCen2 [14,25]. The Cdc31-like family is predicted to be in the non-extended form, which is also in good agreement with the known structure of ScCdc31 [23]. There is no experimental information about the other centrins, but we predict that members of the AtCen family are in an extended form, similar to the CrCen family, and that the PtICL family is divided into two sets: the extended proteins (ICL1a, ICL3a and ICL11 subfamilies) and the non-extended ones (ICL1e, ICL3b, ICL5, ICL7, ICL8, ICL9 and ICL10 subfamilies).

Conclusions

The results obtained in the present study indicate that the extended and compact forms of EF-hand proteins do not necessarily depend on the electric charge of the domains, but they are mainly determined by the hydrophilicity (as determined by the Hopp & Woods scale) of the residues that link the two domains. The definition of the linker is very important and should not include residues from the adjacent helices. What is remarkable is that, once the linker is defined objectively, the nature of its residues appears to determine the form of the CaBP, whatever the length of this linker; it can be as long as 20 residues, as in calcineurin B homologous protein 1, 2CT9 (representing approximately 10% of the protein length; Table 3), or as short as two residues, as in P. tetraurelia infraciliary lattice centrin 9, PtICL9 (< 1% of the protein length). However, the length of the linker in the set of proteins considered in the present study is approximately five residues on average, which is rather short. This indicates that the form of CaBPs is likely governed by short-distance interactions.

Experimental procedures

Seventeen CaBPs with known structures, two of them centrins, in addition to 59 centrins with unknown structures, were considered in the present study.

Choice of the proteins

CaBPs with known structures were taken from the PDB [12]. Only proteins containing four EF-hand motifs were considered. The chosen structures had to meet several criteria.

First, the proteins had to be in their unbound state (i.e. not in complex with their target peptides because peptide binding may cause conformational changes of the entire protein). There were, however, two exceptions: human centrin 2 (2GGM) and yeast centrin (2DOQ), in which the peptide interacts with only one domain (the C-domain) and therefore does not modify the relative position of the two domains. In addition, these two structures were the only ones available in the PDB for this family of proteins. Second, the EF-hand proteins that had an extended structure resolved by NMR were discarded because they did not provide enough information concerning the relative positions of their domains.

Fig. 5. The radius of gyration of the known-structure CaBPs versus their LAH. The straight line shows the linear fit of the points. The correlation coefficient is 0.82.
