Marialuisa Pellegrini-Calace1,*, Simonetta Soro1,* and Anna Tramontano1,2
1 Department of Biochemical Sciences ‘A Rossi Fanelli’, University ‘La Sapienza’, Rome, Italy
2 Istituto Pasteur, Fondazione ‘Cenci-Bolognetti’, University ‘La Sapienza’, Rome, Italy
Keywords
Critical Assessment of Techniques for
Protein Structure Prediction (CASP);
function prediction; protein function;
structural genomics
Correspondence
A Tramontano, Department of Biochemical
Sciences ‘A Rossi Fanelli’, University ‘La
Sapienza’, P.le Aldo Moro, 5-00185 Rome,
Italy
Fax: +39 06 4440062
Tel: +39 06 49910556
E-mail: anna.tramontano@uniroma1.it
*These authors contributed equally to this
work.
(Received 21 February 2006, revised
11 April 2006, accepted 5 May 2006)
doi:10.1111/j.1742-4658.2006.05309.x
The ability to predict the function of a protein, given its sequence and/or 3D structure, is an essential requirement for exploiting the wealth of data made available by genomics and structural genomics projects and is therefore raising increasing interest in the computational biology community. To foster developments in the area as well as to establish the state of the art of present methods, a function prediction category was tentatively introduced in the 6th edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP) worldwide experiment. The assessment of the performance of the methods was made difficult by at least two factors: (a) the experimentally determined function of the targets was not available at the time of assessment; (b) the experiment is run blindly, preventing verification of whether the convergence of different predictions towards the same functional annotation was due to the similarity of the methods or to a genuine signal detectable by different methodologies. In this work, we collected information about the methods used by the various predictors and revisited the results of the experiment by verifying how often and in which cases a convergent prediction was obtained by methods based on different rationale. We propose a method for classifying the type and redundancy of the methods. We also analyzed the cases in which a function for the target protein has become available. Our results show that predictions derived from a consensus of different methods can reach an accuracy as high as 80%. It follows that some of the predictions submitted to CASP6, once reanalyzed taking into account the type of converging methods, can provide very useful information to researchers interested in the function of the target proteins.
Abbreviations
BIND, binding feature number indicating the nature of putative interacting partners; BS, binding site residue identifier; CASP, Critical Assessment of Techniques for Protein Structure Prediction; GOC, GO cellular component number; GOF, GO molecular function number; GOP, GO biological process number; PT, post-translational modification number; RESIDUE-ROLE, free comment on residues with a putative peculiar role.
Modern biology strongly exploits the rapid progress in generation of experimental data and development of computational methods. The identification and characterization of proteins on a genome-wide scale is accomplished by proteomics projects, and computational biology plays a pivotal role in the determination of their structure–function relationships and in the prediction of their biological functions [1]. A number of computational and experimental methods have been developed in the last few years to try to address the function prediction issue using available information, from sequence homology to orthology, gene context and structural features [2,3]. Despite the complexity of the relationship between protein fold and protein function and the existence of proteins with multiple different functions [4,5], global structure similarity can often help in assigning a function to proteins [6,7]. This ability is important for the many ongoing structural genomics projects that will provide us with more and more structures of proteins with very low sequence identity with proteins of known structure.
The task of predicting the function of a protein is exceptionally interesting but very challenging. The existence of paralogous relationships implies that a common evolutionary origin does not guarantee common function [8,9]. Moreover, the discovery of moonlighting proteins, able to perform different functions in different conditions or environments, makes the problem even more complex [10,11].
The Critical Assessment of Techniques for Protein Structure Prediction (CASP) [12] community recognized the relevance of this issue and tried to foster novel developments by setting up a function prediction category in addition to the well-known structure prediction ones. The question addressed was whether and in which cases computational methods are able to provide useful information about the molecular or biological function of an unknown protein, with the aim of providing researchers with potentially useful information [12].
This category is intrinsically different from the other CASP categories because, at the end of the experiment, the function of the target protein is likely still to be unknown. However, the analysis of the submitted predictions led the assessors to conclude that [13]:
(a) groups predicting the 3D structure of a protein only rarely used this information to predict its function as well, and vice versa;
(b) in a substantial fraction of cases, the same prediction was submitted by different groups and therefore a ‘prediction consensus’ could be derived for some targets.
CASP is run blindly. This implies that the assessor should not know the identity of the predicting groups and therefore cannot take into account the method used for deriving a prediction. Indeed, it was suggested and accepted that, in subsequent editions of the experiment, a general description of the method used should be made available to the assessor.
After the experiment was concluded, we revisited the data collected and reassessed the results, taking into account the methods used. We took advantage of the knowledge of the identity of the predicting groups as well as of functional annotations that have become available in the meantime. Our results show that a basic knowledge of the methods is important for understanding the level at which predictions can be trusted and for assessing the reliability of the predicted functions. They also show that predictions derived by a consensus among groups using different, non-redundant, methods can reach an accuracy of 80%.
Results
The protein set at the beginning of CASP6 contained 87 targets, 23 of which were discarded during the experiment because of practical issues, such as early or late release of the 3D protein structure. Therefore, the set that was considered contained 64 protein targets, 29 of which had no functional annotation in any database at the time of the experiment. A function prediction for at least one of the 64 targets was submitted by 23 of the total 172 predictors.
Within the CASP6 experiment, seven classes of function predictions were considered: GO molecular function numbers (GOFs), GO biological process numbers (GOPs), GO cellular component numbers (GOCs), binding feature numbers indicating the nature of putative interacting partners (BINDs), binding site residue identifiers (BSs), free comments on residues with a putative peculiar role (RESIDUE-ROLEs) and post-translational modification numbers (PTs). In the present study, the BS and RESIDUE-ROLE subsets were not analyzed because of the high variability in the type and format of the submitted predictions.
We considered 1590 total function predictions submitted by 18 groups for 64 protein targets, as GOFs (568), GOPs (445), GOCs (363), BINDs (150), and PTs (64).
Method classification
The first step was to recover the information about the method used by each predictor, by inspecting the abstracts submitted to CASP, performing literature searches and, in some cases, directly contacting the predictors.
Each method was assigned to one or more of five categories (F1 to F5), here called features, on the basis of the type of information used by predictors. The use of sequence information corresponded to the F1 category. F2 indicated the use of structural features. Methods using the GO database for any reason other than deriving GO numbers for submission were assigned to F3. Literature-based methods and manual methods were indicated as F4 and F5, respectively (Table 1). Therefore, each method could be classified by a five-bit binary code indicating the presence (1) or absence (0) of each of the five features. For instance, the 10000 code corresponds to completely automated (F5 = 0) methods based on sequence information (F1 = 1) which do not take advantage of structural (F2 = 0), GO (F3 = 0) and literature (F4 = 0) information. The distributions of single features and of binary combinations are shown in Fig. 1A and Fig. 1B, respectively.
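The five-feature classification maps naturally onto a short string of bits. The following Python sketch is only an illustration of that encoding, not the in-house scripts used for the analysis; the names `FEATURES`, `binary_code` and `decode` are hypothetical, and the bit order (F1 first, F5 last) is the one implied by the example code given in the text.

```python
# Feature order follows the text: F1 sequence, F2 structure, F3 GO database,
# F4 literature, F5 manual intervention. All names here are illustrative.
FEATURES = ("F1", "F2", "F3", "F4", "F5")


def binary_code(used):
    """Return the five-character code for a set of feature labels (1 = used)."""
    used = set(used)
    unknown = used - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown feature labels: {sorted(unknown)}")
    return "".join("1" if feature in used else "0" for feature in FEATURES)


def decode(code):
    """Return the set of feature labels encoded by a five-character code."""
    if len(code) != len(FEATURES) or set(code) - {"0", "1"}:
        raise ValueError(f"not a five-bit code: {code!r}")
    return {feature for feature, bit in zip(FEATURES, code) if bit == "1"}


print(binary_code({"F1"}))              # sequence only, fully automated
print(binary_code({"F1", "F4", "F5"}))  # sequence, literature and manual work
print(sorted(decode("11100")))          # sequence, structure and GO database
```

Keeping the code as a string rather than an integer preserves the leading feature positions and makes the identifiers directly comparable with those quoted in the figures and tables.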
Although the possible theoretical combinations of features are 25 (5²), the F1 feature was used by all the 18 predictor groups, reducing the possible binary combinations to 20. The observed binary codes were only 8, showing a lower than expected variability in the combination of used information. More than 50% of the possible combinations were not found, and the majority of predictors used canonical feature combinations. In fact, all groups used sequence information, nine took advantage of literature information, and six used structure information, but only four predictors exploited the GO database and only one of them combined it with structural features. Moreover, predictors using GO never took literature information into account and vice versa. It is worth highlighting here that 11 groups used an approach developed in-house for features F1, F2 or F3.
For each of the function prediction classes, we computed a consensus value and a redundancy value within the consensus [F(red)] (Table 2 and Fig. 2). A consensus is defined as the number of identical predictions submitted by at least two predictors. The redundancy value F(red) indicates the method variability in terms of feature combinations within a consensus and is calculated as follows:

$$F(\mathrm{red}) = \frac{N(\mathrm{red})}{N(\mathrm{tot})}$$

where N(red) and N(tot) are the number of methods with the same binary identifier, i.e. the number of redundant methods, and the total number of methods generating the consensus, respectively. Lower F(red) values reflect a higher variability in the type of methods generating the consensus.
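The consensus and redundancy definitions can be sketched in a few lines of Python. This is a hedged illustration rather than the original implementation: the function names and the record layout are hypothetical, and the text does not spell out exactly how N(red) is counted. The sketch assumes that N(red) is the number of methods whose binary identifier repeats one already present in the consensus, so that F(red) is 0 when every contributing method has a different identifier.

```python
from collections import defaultdict


def find_consensus(predictions):
    """Map (target, annotation) to the method codes of the groups that agree.

    `predictions` holds (target, annotation, group, method_code) tuples.
    Only annotations submitted by at least two distinct groups are kept.
    """
    by_annotation = defaultdict(dict)
    for target, annotation, group, method_code in predictions:
        by_annotation[(target, annotation)][group] = method_code
    return {
        key: sorted(codes.values())
        for key, codes in by_annotation.items()
        if len(codes) >= 2
    }


def f_red(method_codes):
    """F(red) = N(red) / N(tot) for the methods behind one consensus.

    ASSUMPTION: N(red) counts every method whose binary identifier repeats
    one already present, i.e. N(tot) minus the number of distinct identifiers.
    """
    n_tot = len(method_codes)
    if n_tot == 0:
        raise ValueError("a consensus needs at least one method")
    n_red = n_tot - len(set(method_codes))
    return n_red / n_tot


# The T0243 method codes are quoted in the text; every other record is made up.
predictions = [
    ("T0243", "DNA binding", "group_a", "10100"),
    ("T0243", "DNA binding", "group_b", "10011"),
    ("TX", "hypothetical term", "group_a", "10000"),
    ("TX", "hypothetical term", "group_b", "10000"),
    ("TX", "hypothetical term", "group_c", "10100"),
    ("TX", "another term", "group_c", "10100"),
]
for key, codes in find_consensus(predictions).items():
    print(key, codes, round(f_red(codes), 2))
```

Under a different reading of N(red), for example counting every method that shares its identifier with at least one other, only the body of `f_red` would change.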
For the GOF, GOP and GOC classes, we also computed a consensus value among the GO parents of the submitted predictions for up to three levels of the GO ontology (P1, P2 and P3), to verify whether less specific consensus could be achieved by different methods. The results are shown in Table 2.
A consensus was never found for prediction categories other than GOF, GOP and BIND, although the number of GOF and GOP consensus predictions was significantly lower than the number of submitted predictions per target (70 out of 450 and 19 out of 457 for GOF and GOP, respectively). The fraction of targets for which a consensus could be found is high for the BIND class, accounting for about one third of the total submitted predictions (31 out of 150).
Table 1. List of features used to classify methods exploited for the predictions.
F1  Sequence information: BLAST, PFAM, INTERPRO, CHOP, CHIEFC, PROSITE, SMART, PRINTS, PHYRE, others
F2  Structure information: 3D-JURY, HMAP, PROFUNC, COLUMBA, PHUNCTIONER
F3  Use of the GO database for any reason other than deriving GO numbers for submission
F4  Literature information
F5  Manual intervention
[Figure 1. Panel A: bar chart over F1 (seq), F2 (str), F3 (GO), F4 (lit), F5 (man), Own and Unknown. Panel B: distribution over the binary identifiers 10011, 10100, 10001, 10000, 11100, 10101 and 11001.]
Fig. 1. Distribution of method features. (A) Number of methods including F1 (sequence information), F2 (structure information), F3 (use of GO database for any other reason than deriving GO numbers for submission), F4 (literature information), F5 (manual intervention), number of methods developed in-house (own) and number of methods for which no description is available (unknown). (B) Distribution of binary method class identifiers among the 18 methods submitting predictions (in-house developed methods and methods for which no information is available are not included).
Table 2. Number of GO identifiers predicted by at least two groups. GOF, number of GO function predictions; GOP, number of GO process predictions; GOC, number of GO cellular component predictions; BIND, number of binding predictions; PT, number of post-translational modification predictions; NA, not applicable. The number of predictions corresponding to the ‘unknown’ annotation is shown in parentheses.
About half of the consensus predictions were generated by two predictions only, except for the GOP class, where about 40% of the consensus predictions were obtained by three independent methods. It should be noted that some of the consensus predictions also included annotations such as ‘unknown molecular function’ or ‘unknown biological process’, highlighted in parentheses in Table 2. The exclusion of ‘unknown biological process’ predictions left only two of the 19 GOP consensus predictions that were generated by three submissions corresponding to three different methods.
Figure 2B shows a histogram of the fraction of redundancy for the three functional classes. Interestingly, redundancy values between 0 and 0.2 were often observed, corresponding to a variability of at least 80% in the combinations of features generating the consensus.
When annotations were grouped according to parent levels of the respective GO terms (one level, P1; two levels, P2; three levels, P3), neither the number of consensus predictions nor the fraction of redundancy changed significantly (supplementary material, Fig. S1). This is most likely due to the somewhat limited depth of the GO graph, so that the existence of a common node between two predictions does not necessarily provide additional information.
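Grouping by parent level amounts to asking whether two predicted terms share a node once each is expanded upward by a fixed number of steps. The Python sketch below illustrates that idea under stated assumptions: the parent map is a toy structure with made-up labels, not the real GO graph, and the helper names `ancestors_within` and `shared_nodes` are hypothetical. A term may have several parents, so the expansion works on sets.

```python
def ancestors_within(term, parents, levels):
    """Return `term` plus every node reachable through at most `levels` parent links."""
    seen = {term}
    frontier = {term}
    for _ in range(levels):
        frontier = {p for t in frontier for p in parents.get(t, ())} - seen
        seen |= frontier
    return seen


def shared_nodes(term_a, term_b, parents, levels):
    """Nodes common to both terms when each is expanded by up to `levels` parents."""
    return ancestors_within(term_a, parents, levels) & ancestors_within(
        term_b, parents, levels
    )


# Toy parent map with made-up labels; it is NOT the real GO graph.
TOY_PARENTS = {
    "dna_binding": ["nucleic_acid_binding"],
    "rna_binding": ["nucleic_acid_binding"],
    "nucleic_acid_binding": ["binding"],
    "sugar_binding": ["binding"],
}

for levels in (0, 1, 2):  # 0 = exact match only, then P1 and P2
    common = shared_nodes("dna_binding", "sugar_binding", TOY_PARENTS, levels)
    print(levels, sorted(common))
```

In a shallow graph the shared node quickly becomes a very general term, which is consistent with the observation that parent-level grouping adds little information.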
Target function annotation versus time
At the beginning of the CASP6 experiment, 42, 32 and 9 targets had a molecular function, a biological process and a cellular component annotation, respectively. In 23 cases, information about interaction partners was available, whereas no annotation about post-translational modifications was present in the databases. One year later (October 2005), the available annotations decreased by 5% to 10%, showing that the knowledge of the 3D structure of a protein allows its function annotation to be improved, even if this can just imply removing a previous annotation (supplementary material, Fig. S2). In fact, for 11 targets at least one molecular function annotation was either modified or deleted between the end of 2004 and the end of 2005 (supplementary material, Table S1). In the same period, the number of non-annotated targets decreased by 8, 6, 2 and 1 for GOF, GOP, GOC and BIND, respectively. These data confirm that the process of assigning a function to proteins is still a very difficult task for both experimental and computational biologists and suggest that there is still a long way to go to fully exploit genome-scale data.
Interestingly, only in about half of the cases did predictions agree with function assignments that were subsequently removed, suggesting that taking into account different functional predictions for a given protein might be helpful in avoiding errors in database annotation.
Predictions versus target function annotation
We can reliably assess the correctness of a prediction only for those cases where a subsequently released annotation is available for targets that had no annotation at the time of CASP6 (supplementary material, Fig. S2). This subset was made of only 11 targets and included 24, 7, 2 and 1 GOF, GOP, GOC and BIND annotations, respectively. Because of the sparseness of the data, the analysis was limited to GOF functional assignments only and included the eight targets listed in Table 3, five of which belong to the comparative modeling (CM) and three to the fold recognition (FR) classes. Sixty-eight groups submitted 85 predictions, 32 of which converged into 12 consensus predictions. These predictions were compared with the current target function annotation (October 2005) present in the UniProt [13], Entrez Gene [14] and InterPro [15] databases (Tables 3 and 4).
[Figure 2. Panel A: x axis N(c-preds), y axis from 0 to 1. Panel B: x axis F(red) in bins 0.0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8 and 0.8-1.0, y axis from 0 to 1. Series in both panels: GOFs, GOPs, BINDs.]
Fig. 2. (A) Fraction of consensus of prediction [F(preds)] as a function of the number of contributing predictions [N(c-preds)]. (B) Fraction of consensus of prediction [F(c-preds)] as a function of the redundancy of the contributing methods [F(red)], i.e. of the number of methods exploiting the same source of information for the prediction (see text for a detailed definition).
Among the predictions submitted (85 by 68 groups), about 20% (18) were correct, i.e. overlapped with the current annotation of the corresponding targets. Interestingly, 17 of them were consensus predictions. Moreover, although a significantly higher number of consensus predictions was observed for CM than for FR targets (11 against 1, respectively), the only FR target (T0243) consensus matched the corresponding function annotation.
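The proportions behind these counts are easy to recompute. The short Python sketch below only restates the arithmetic for the numbers quoted in the text; the variable names and the helper `share` are hypothetical, and no figure is introduced that does not appear above.

```python
# Counts quoted in the text for GOF predictions on the newly annotated targets.
submitted = 85             # predictions submitted
in_consensus = 32          # predictions that converged into a consensus
n_consensus = 12           # distinct consensus predictions (11 CM + 1 FR)
correct = 18               # predictions matching the October 2005 annotation
correct_in_consensus = 17  # correct predictions that belong to a consensus


def share(part, whole):
    """Return part / whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


print("correct among submitted:", share(correct, submitted), "%")
print("submitted that fall in a consensus:", share(in_consensus, submitted), "%")
print("correct that belong to a consensus:", share(correct_in_consensus, correct), "%")
```

The first ratio corresponds to the "about 20%" given in the text; the last one shows how strongly the correct predictions are concentrated among the consensus predictions.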
Even if the size of the dataset is too small to derive general conclusions, we believe that the success rate for these predictions supports both the usefulness of the experiment and the validity of our method for deriving the consensus prediction. Moreover, it suggests that an ‘easy’ structure prediction does not necessarily correspond to an ‘easy’ function prediction. The definition of ‘difficulty’ of a target has been the subject of many debates in the structure prediction field. Defining the difficulty of a function prediction is equally complex, although clearly needed as well. Here we took the view that a target not annotated in any database at the time of prediction is difficult to predict. Given the time constraints imposed by the CASP experiment, most of the methods used in the experiment are automatic ones and it is unlikely that any of the predictions include a large amount of human intervention, unlike the case for curated databases.
We feel it is inappropriate to try to derive a ranking of the various methods on the basis of such a limited dataset, but if, as we expect, participation in future experiments increases, it will be possible to derive conclusions about the quality of different methods. More importantly, the number of consensus predictions will also increase, and this will allow a substantial number of correct functional predictions to be produced.
The CASP6 assessor highlighted five cases where a consensus could be derived by comparing the different submitted predictions, although the design of the experiment did not allow the redundancy of the methods to be taken into account at the time. Three of these targets (T0226, T0243 and T0263) were annotated between the end of CASP6 and the time that this analysis was performed. For T0243 and T0263, the newly deposited annotations match the consensus prediction. T0243 was predicted and proved to bind DNA, and T0263 was predicted to have oxidoreductase activity and is indeed annotated as 3-isopropylmalate dehydrogenase. Both consensus predictions were achieved by predictions submitted by two different methods (type 10100 and 10011 for T0243 and type 11011 and 11100 for T0263). For T0226, there were three consensus predictions: isomerase, transferase and sugar binding; the current annotation suggests that the protein has a structural role, which may or may not be compatible with sugar binding.
Table 3. List of targets annotated after October 2004 and the corresponding submitted predictions. Target, CASP6 target identifier; Class, target classification (CM, comparative modeling; FR, fold recognition; NF, new fold); Ann DB, annotation database (EG, Entrez Gene; IP, InterPro; UP, UniProt); GOF, GO molecular function identifier, according to GO database definition; Pred(Sub), number of predictions submitted; NP, number of predictors; N(Cons), number of consensus predictions found; Pred(Cons), number of predictions generating the consensus. Bold, annotations correctly predicted by at least one group.
Table 4. GO function predictions by method class (October 2005 annotated targets only). Bin, class binary identifier; Sub Pred, number of predictions submitted by methods belonging to the class; Sub GOFN, number of GOF submitted by methods belonging to the class; Exact Pred, number of predictions (out of a total of 14) corresponding to annotations found in UniProt, Entrez Gene and InterPro databases; Exact GOFN, number of predicted GOF numbers (out of a total of 12) corresponding to annotations found in UniProt, Entrez Gene and InterPro databases. Numbers in parentheses indicate the predictions that can be clustered in terms of common GO parents (up to three levels).
Columns: Bin; Sub Pred (ConsP1-P3); Sub GOFN (ConsP1-P3); Exact Pred (ConsP1-P3); Exact GOFN (ConsP1-P3)
In the light of these findings, we can confidently conclude that consensus predictions, normalized on the basis of redundancy of the methods, could be useful to researchers in narrowing the number of biological assays needed for protein function assignments and in speeding up the difficult and challenging functional annotation process. As we anticipate that this will prove to be useful to our research colleagues, the consensus predictions for targets with no current function annotation are reported in Table 5.
Conclusions
The prediction of protein function is one of the major challenges of protein bioinformatics. The growing number of completely sequenced genomes has allowed the development of a number of new approaches complementary to the use of sequence analysis, which can be combined to elucidate complete functional networks and biochemical pathways.
The CASP community set up a new function prediction category aimed at understanding whether and in which cases computational methods are able to provide useful information about the molecular or biological function of an unknown protein and to provide useful information to researchers working on the target proteins.
Here we revisited the results of this CASP6 category with two aims: (a) to verify how relevant knowledge of the methods used by predictors is for assessing the results; (b) to see to what extent information made available after the end of the experiment could contribute to our understanding of which are the best strategies for providing useful information to researchers.
Our results show that consensus predictions generated by diverse methods, i.e. methods exploiting different sources of information, are more reliable than predictions obtained by a single method and can be used as indicators of the reliability of the prediction. A general knowledge of the methods used is therefore important for understanding the level at which predictions can be trusted and should be made available to the CASP assessor in the next round of the experiment.
The conclusions are based on a small number of cases, as we can only use the few cases for which annotation is available now and not at the time of the CASP experiment in order to properly assess the correctness of the predictions. However, it is interesting to note that, if we trust the submitted annotations, and therefore include predictions of already annotated targets in our analysis, a molecular function and a biological process were correctly predicted for about half of the protein targets, and more than 80% of the exact predictions were within a consensus (data not shown). On the other hand, more than 30% of the 64 CASP protein targets still have no molecular function annotation in any database and more than half of them have no biological process annotation. Clearly, therefore, the assignment of functional data to proteins is still a very difficult task not only from a computational point of view, but also experimentally. We hope that the present analysis, and especially the observation of the high reliability of consensus predictions, will encourage predictors to participate in the next CASP functional prediction experiments as well as convince researchers to take into account the results in designing their experiments. It is our opinion that the CASP function prediction experiment can provide a significant contribution and promote important and useful development in the area of protein function prediction.
Table 5. Consensus of predictions for non-annotated targets (as in October 2005). Target, target identifier; GOF, predicted GO molecular function identifier; Function, molecular function description; NP, number of predictors; N(comb), number of different method binary identifiers, i.e. of nonredundant methods, that contribute to the consensus.
Target GOF Function NP N(comb)
T0222 50825 Ice binding (antifreeze activity) 4 1
T0232 4364 Glutathione transferase activity 9 4
T0237 3793 Defense (immunity protein activity) 2 1
30528 Transcription regulator activity 4 3
3700 Transcription factor activity 3 2
3677 DNA binding (functional hypothesis: transcription factor)
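One possible way of using Table 5 is to order the consensus predictions by how many different method types support them. The Python sketch below is a hypothetical prioritization, not a procedure described in the paper: the `Consensus` class and the sort key `priority` are illustrative names, only rows that carry both NP and N(comb) are included, and the target identifier is set to `None` where it cannot be recovered from the table as it appears here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Consensus:
    target: Optional[str]  # None where the identifier is missing from the table
    gof: int               # predicted GO molecular function identifier
    function: str
    n_predictors: int      # NP
    n_comb: int            # N(comb), number of different binary identifiers


# Rows of Table 5 that carry both NP and N(comb).
ROWS = [
    Consensus("T0222", 50825, "Ice binding (antifreeze activity)", 4, 1),
    Consensus("T0232", 4364, "Glutathione transferase activity", 9, 4),
    Consensus("T0237", 3793, "Defense (immunity protein activity)", 2, 1),
    Consensus(None, 30528, "Transcription regulator activity", 4, 3),
    Consensus(None, 3700, "Transcription factor activity", 3, 2),
]


def priority(row):
    """Hypothetical sort key: more distinct method types first, then more predictors."""
    return (-row.n_comb, -row.n_predictors)


for row in sorted(ROWS, key=priority):
    print(row.target or "?", row.gof, row.function, row.n_predictors, row.n_comb)
```

Sorting on N(comb) before NP reflects the argument that agreement among different kinds of methods carries more weight than agreement among many similar ones; other orderings are equally defensible.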
Experimental procedures
Submitted predictions are available at the CASP web site (http://www.predictioncenter.org).
All analyses were performed using in-house built scripts in the Perl programming language.
References
1 Wolfson HJ, Shatsky M, Schneidman-Duhovny D, Dror O, Shulman-Peleg A, Ma B & Nussinov R (2005) From structure to function: methods and applications. Curr Protein Pept Sci 6, 171–183.
2 Jones S & Thornton JM (2004) Searching for functional sites in protein structures. Curr Opin Chem Biol 8, 3–7.
3 Gabaldon T & Huynen MA (2004) Prediction of protein function and pathways in the genome era. Cell Mol Life Sci 61, 930–944.
4 Todd AE, Orengo CA & Thornton JM (1999) Evolution of protein function from a structural perspective. Curr Opin Chem Biol 3, 548–556.
5 Thornton JM, Todd AE, Milburn D, Borkakoti N & Orengo CA (2000) From structure to function: approaches and limitations. Nat Struct Biol 7, 991–994.
6 Dietmann S & Holm L (2001) Identification of homology in protein structure classification. Nat Struct Biol 8, 953–957.
7 Orengo CA, Jones DT & Thornton JM (1994) Protein superfamilies and domain superfolds. Nature 372, 631–634.
8 Devos D & Valencia A (2000) Practical limits of function prediction. Proteins: Structure, Function, Bioinformatics 41, 98–107.
9 Rost B (2002) Enzyme function less conserved than anticipated. J Mol Biol 318, 595–608.
10 Jeffery CJ (2003) Moonlighting proteins: old proteins learning new tricks. Trends Genet 19, 415–417.
11 Jeffery CJ (2003) Multifunctional proteins: examples of gene sharing. Ann Med 35, 28–35.
12 Moult J, Fidelis K, Rost B, Hubbard T & Tramontano A (2005) Proteins: Structure, Function, Bioinformatics Supplement 7, 3–7.
13 Bairoch A, Apweiler R, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, et al. (2005) The Universal Protein Resource (UniProt). Nucleic Acids Res 33, D154–D159.
14 Maglott D, Ostell J, Pruitt KD & Tatusova T (2005) Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 33, D54–D58.
15 Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bradley P, Bork P, Bucher P, Cerutti L, et al. (2005) InterPro, progress and status in 2005. Nucleic Acids Res 33, D201–D205.
Supplementary material
The following supplementary material is available online:
Fig. S1. Percentage of consensus predictions as a function of the redundancy [F(red)] of the contributing methods. (A) GOF functional class; (B) GOP functional class; (C) GOP functional class, ‘unknown biological process’ prediction excluded; (D) GOC functional class.
Fig. S2. (A) Number of annotated targets versus time: the dotted blue line indicates the number of annotated targets for which the annotation did not change between December 2004 and October 2005. (B) Number of non-annotated targets versus time.
Table S1. Submitted predictions for targets for which there was at least one GOF annotation in October 2004, subsequently removed.
This material is available as part of the online article from http://www.blackwell-synergy.com