Ten years of predictions … and counting

REVIEW ARTICLE

Domenico Cozzetto1, Adele Di Matteo1 and Anna Tramontano2

1 Department of Biochemical Sciences, University of Rome ‘La Sapienza’, Italy

2 Istituto Pasteur – Fondazione Cenci Bolognetti, University of Rome ‘La Sapienza’, Italy

In 2004, the Critical Assessment of Techniques for Protein Structure Prediction (CASP) celebrates its tenth anniversary. The initiative, notwithstanding its relatively long tradition, remains lively and challenging. It is organized by John Moult (Center for Advanced Research in Biotechnology, Rockville, MD, USA), Krzysztof Fidelis (Lawrence Livermore National Laboratory, Livermore, CA, USA), Tim Hubbard (Sanger Institute, Hinxton, UK), Burkhard Rost (Columbia University, New York, NY, USA) and Anna Tramontano (University of Rome, Italy), with the invaluable help of Andriy Kryshtafovych (Lawrence Livermore National Laboratory) and Volker Eyrich (Columbia University).

The goals of the experiments are: to evaluate the accuracy of current methods for protein structure prediction; to identify bottlenecks; and to indicate the directions where efforts can best be focused. The scheme is simple: the organizers collect sequences of ‘targets’, i.e. of proteins whose structures are likely to be solved within a few weeks. These sequences are made available to the community of computational biologists, who attempt to predict their three-dimensional structures as well as other relevant biological properties, e.g. domain boundaries, long-range inter-residue contacts, disordered regions and, when not previously known, function. Once the experimental structures of the targets are available, they are compared with the collected predictions using a large variety of numerical measures, and the data generated are stored in a database at the Livermore Laboratory Prediction Center. Experts in the field of protein structure prediction are asked to critically evaluate the results and highlight progress and bottlenecks in the field.

In 2004, the community selected as assessors Alfonso Valencia (Centro Nacional de Biotecnologia, Madrid, Spain), Roland Dunbrack (Fox Chase Cancer Center, Philadelphia, PA, USA) and B. K. Lee (National Institutes of Health, Bethesda, MD, USA). The process, lasting from spring to winter of each even-numbered year, is concluded by a meeting where the community convenes to discuss the results. This year, for the first time, the meeting was held in Europe, in Gaeta, on 4–8 December.
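As an illustration of what a numerical comparison between a prediction and an experimental structure can look like, the following Python sketch computes two simplified quantities: a root-mean-square deviation (RMSD) over equivalent C-alpha atoms, and a GDT-style score (the average percentage of residues lying within 1, 2, 4 and 8 Å of their experimental positions). This report does not specify which measures the Prediction Center applies, so the choice of measures, the function names and the toy coordinates are all assumptions made for the example. The sketch also assumes that the model has already been superimposed on the experimental structure and that residues correspond one to one; real GDT-type scores search over many superpositions, which is not attempted here.

```python
import math


def ca_distances(model, reference):
    """Distances between equivalent C-alpha atoms (coordinates in angstroms).

    Assumes both lists are in the same residue order and already superimposed.
    """
    if len(model) != len(reference):
        raise ValueError("model and reference must have the same number of residues")
    if not model:
        raise ValueError("at least one residue is required")
    return [math.dist(m, r) for m, r in zip(model, reference)]


def rmsd(model, reference):
    """Root-mean-square deviation over equivalent C-alpha atoms."""
    d = ca_distances(model, reference)
    return math.sqrt(sum(x * x for x in d) / len(d))


def gdt_like(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-style score on a 0-100 scale.

    For each cutoff, take the fraction of residues within that distance of
    their experimental position, then average the fractions. No superposition
    search is performed (hypothetical simplification).
    """
    d = ca_distances(model, reference)
    fractions = [sum(1 for x in d if x <= c) / len(d) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Toy four-residue example; coordinates are invented for illustration.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    model = [(0.0, 0.0, 0.0), (3.8, 0.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
    print(f"RMSD: {rmsd(model, reference):.2f} A")
    print(f"GDT-like score: {gdt_like(model, reference):.1f}")
```

In the toy example a single badly placed residue dominates the RMSD, while the GDT-style score still credits the three residues that are placed reasonably well. This difference in sensitivity is one reason why several measures are typically examined side by side.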

Keywords

automatic prediction servers; CASP; model evaluation; protein structure prediction

Correspondence

A. Tramontano, University of Rome ‘La Sapienza’, Department of Biochemical Sciences, 5 Piazzale Aldo Moro, Rome 00185, Italy

E-mail: anna.tramontano@uniroma1.it

(Received 15 December 2004, accepted 24 December 2004)

doi:10.1111/j.1742-4658.2005.04549.x

The CASP experiment has been run every other year since 1994. Its objective is to subject the available structure prediction methods to a blind test. This is a short report of the highlights of its last edition.

‘Men who wish to know about the world must learn about it in its particular details’ (Heraclitus of Ephesus, 535–475 BC)

During its ten-year history, CASP has been instrumental in convincing both the computational and experimental communities that the prediction of the structure of proteins non-evolutionarily related to proteins of known structure is not completely out of reach. Indeed, fold recognition methods (i.e. methods that try to identify which of the known topologies is the most likely for an unknown protein), effective techniques for predicting secondary structure and, more recently, methods able to assemble fragments of proteins of known structure to construct the structure of proteins whose architecture is completely novel have all been fostered and popularized by CASP, a fairly major contribution to molecular biology in the postgenomic era.

Indeed, the contribution of CASP has not only been to evaluate the quality of the approaches and promote cross-fertilization between them, but also to validate which of the tools are sufficiently mature and reliable to become part of the standard suite of methods that experimental biologists can use regularly. This year, the participants selected by the assessors to describe their strategies and results included ‘the usual suspects’, namely David Baker (University of Washington, USA), Jeff Skolnick (University of Buffalo, NY, USA), David Jones (UCL, UK), Kevin Karplus (UC Santa Cruz, USA), Krzysztof Ginalski, Janusz Bujnicki and Andrzej Kolinski (Warsaw University, Poland), but also new participants such as Mayuko Takeda-Shitaka (Kitasato University, Japan), Kentaro Tomii (National Institute of Advanced Industrial Science and Technology, Tokyo, Japan), Yaoqi Zhou (University of Buffalo) and Ming Li (University of Waterloo, Canada). The results and a description of the methods can be found at the CASP website (http://predictioncenter.llnl.gov/casp6). It is fair to say here that, thanks to the insights and efforts of these groups, as well as to the hard work of many others, the problem of predicting the overall topology of many proteins is clearly within reach, and this is certainly good news for many experimentalists.

Figure 1 shows one example, in which the experimentally determined structure of the first domain of target 272, a hypothetical protein from Thermus thermophilus with no detectable sequence similarity to any protein of known structure, is compared with the model produced by the group of David Baker.

There is a clear correlation between the quality of a model and its range of application. Even in the most difficult cases, these models are usually sufficient to understand the general properties of the molecule, to identify solvent-exposed regions and flexible parts and, in some cases, to reveal unexpected evolutionary relationships useful for function assignment. However, for other applications such as drug design or the prediction of substrate specificity, the level of detail required is much higher. Furthermore, methods usually produce alternative models and, in the most difficult cases, distinguishing which one is closer to the real structure represents a serious bottleneck. There is a consensus in the field that this latter task is easier when a model very close to the native structure is present in the ensemble of models. It is not surprising, therefore, that discussions at the meeting focused on how to push the field towards devoting more effort to the refinement of the models, to the extent that the community is discussing ways to set a required minimum quality of a model below which it would not be considered at all in the assessment. This threshold could be, for example, the quality of the best model obtained by automatic prediction servers, some of which have obtained results comparable to those of the best research groups.

The appearance of the above-mentioned fragment-based methods for predicting the structure of proteins with a new fold had the undesirable effect of discouraging ab initio methods for protein structure prediction, which could not compete with the quality achieved by the heuristic methods. However, at least one example of successful ab initio prediction, by the group of Harold Scheraga (Cornell University, Ithaca, NY, USA), was reported at this meeting for target T0215, a 48-residue protein. This, together with the possibility of using ab initio energy-based methods for more accurate refinement of the modelled structures, should, we hope, revive the interest of the ‘folders’ in the initiative. We are convinced that models obtained by combining energy-based and knowledge-based methods will finally set the foundations for a solution to the protein folding problem.

Acknowledgements

The Sixth Edition of the CASP meeting was sponsored by the National Institutes of Health (NLM and NIGMS), Istituto Pasteur – Fondazione Cenci Bolognetti, the European Molecular Biology Organization (EMBO), the BioSapiens Network of Excellence funded by the European Commission FP6 Programme (contract number LHSG-CT-203-503265), Lawrence Livermore National Laboratory, Italtech Solutions and IBM.

Fig. 1. Comparison between the experimental structure (left) and a model (right) of the CASP target T272 domain 1. The structure was solved by A. Ebihara, M. Yao, S. Yokoyama and S. Kuramitsu (RIKEN Genomic Sciences Center, Yokohama, Japan) (PDB code: 1WJ9), and the model was submitted by D. Baker (University of Washington, USA).
