
Squash Inhibitors: From Structural Motifs to Macrocyclic Knottins

Laurent Chiche1,*, Annie Heitz1, Jean-Christophe Gelly1, Jérôme Gracy1, Pham T.T. Chau2, Phan T. Ha2, Jean-François Hernandez3 and Dung Le-Nguyen4

1 Centre de Biochimie Structurale, CNRS UMR5048, INSERM UMR554, Université Montpellier I, Faculté de Pharmacie, 15, Avenue Flahault, 34093 Montpellier, France; 2 Center for Biotechnology, Vietnam National University, 90, Nguyen Trai Street, Hanoi, Vietnam; 3 Laboratoire des Aminoacides, Peptides et Protéines, CNRS UMR5810, Universités Montpellier I & II, Faculté de Pharmacie, 15, avenue Flahault, 34093 Montpellier, France; 4 INSERM U376, CHU Arnaud-de-Villeneuve, 371, rue du doyen Giraud, 34295 Montpellier, France

Abstract: In this article, we will first introduce the squash inhibitors, a well-established family of highly potent canonical serine proteinase inhibitors isolated from Cucurbitaceae. The squash inhibitors were among the first discovered proteins with the typical knottin fold shared by numerous peptides extracted from plants, animals and fungi. Knottins contain three knotted disulfide bridges, two of them arranged as a Cystine-Stabilized Beta-sheet motif. In contrast to cyclotides, for which no natural linear homolog is known, most squash inhibitors are linear. However, Momordica cochinchinensis Trypsin Inhibitor-I and -II (MCoTI-I and -II), 34-residue squash inhibitors isolated from seeds of a common Cucurbitaceae from Vietnam, were recently shown to be macrocyclic. In these circular squash inhibitors, a short peptide linker connects residues that correspond to the N- and C-termini in homologous linear squash inhibitors. In this review we present the isolation, characterization, chemical synthesis, and activity of these macrocyclic knottins. The solution structure of MCoTI-II will be compared with topologically similar cyclotides, homologous linear squash inhibitors and other knottins, and potential applications of such scaffolds will be briefly discussed.

Keywords: Macrocyclic proteins, knottins, inhibitor cystine knots, structural motifs, squash inhibitors, disulfide bridges, drug design, serine proteinases.

INTRODUCTION

Nature has many secrets that remain to be discovered, and the discovery of macrocyclic proteins has revealed one such new area that is set to become an important field in structural biology. Several examples of macrocyclic proteins have appeared in the last few years, strongly suggesting that many more are still to come in the near future.

The cyclotide family reviewed in the preceding articles in this issue [1-3] is particularly remarkable in that it comprises a large number of proteins, all of them cyclic. Although the stability afforded by the circular feature is clear, it is unclear whether linear counterparts exist in nature and will be discovered in the future. In this field there is still much to be explained.

In this article we focus on the squash inhibitors, a protein family that was discovered more than twenty years ago and that, until recently and in contrast to cyclotides, comprised only linear compounds. The recent unexpected discovery of circular squash inhibitors from Momordica cochinchinensis, MCoTI-I and -II [4], supports the idea that macrocyclization of proteins may not be uncommon in nature and provides interesting perspectives for structural proteomics, since end-to-end cyclization, like many other post-translational modifications, cannot be simply deduced from genomic sequences.

We will first present the main historical and structural highlights of the squash inhibitors. This will be extended to the intriguing structural class of proteins known as knottins that share a similar scaffold. Our recent efforts to organize and standardize knottin data for improved analyses and comparisons will be briefly discussed.

Then, starting from this background, we will describe the macrocyclic squash inhibitors from seeds of Momordica cochinchinensis, including their discovery, isolation, sequence and structure. They will be compared to cyclotides, to linear homologs and to structurally similar linear knottins. Chemical synthesis and possible applications of this scaffold will be discussed.

*Address correspondence to this author at the Centre de Biochimie Structurale, CNRS UMR5048, INSERM UMR554, Université Montpellier I, Faculté de Pharmacie, 15, Avenue Flahault, 34093 Montpellier, France; Tel: (33)[0]-4670-43432; Fax: (33)[0]-4675-29623; E-mail: chiche@cbs.cnrs.fr

THE FAMILY OF SQUASH INHIBITORS OF SERINE PROTEINASES

The so-called 'canonical' inhibitors of serine proteinases interact with their target enzyme in a substrate-like manner via a binding loop with a characteristic conformation [5, 6]. Among these, the squash inhibitors of serine proteinases are small (27-34 residues) disulfide-rich proteins discovered in the late 1970s in seeds of winter squash [7]. So far, all known homologs originate from the Cucurbitaceae plant family [8]. Squash inhibitors were for some time the smallest known natural serine protease inhibitors until the recent discovery of SFTI-1, a 14-residue circular peptide, which is discussed in detail by Korsinczky et al. [9] in this issue. Association constants with various serine proteinases may be as high as 10^12 M^-1, making these inhibitors among the most potent ones [8]. As with other plant protease inhibitors, squash inhibitors are presumed to participate in defense mechanisms by conferring resistance to pests, pathogens or insects [10].

342 Current Protein and Peptide Science, 2004, Vol 5, No 5 Chiche et al.

Squash inhibitors contain six cysteines involved in three disulfide bridges with I-IV, II-V, III-VI connectivity. The first three-dimensional (3D) structure determinations revealed a very specific knotted scaffold achieved when one disulfide bridge (III-VI) crosses the macrocycle formed by the two other disulfides and the interconnecting peptide backbone [11-13]. This remarkably stable knotted topology was previously observed in only one compound, the carboxypeptidase A inhibitor from potato, PCI [14]. Since these pioneering observations, nearly one hundred proteins have been explicitly shown through structural studies to share this specific knotted scaffold.

THE KNOTTIN FOLD AND THE CSB MOTIF

Small disulfide-rich proteins sharing the disulfide connectivity and topology of the squash inhibitors are now known as knottins [15] or Inhibitor Cystine Knots [16]. However, despite similar overall topologies, it soon became apparent that the I-IV disulfide bridge is not structurally conserved between different knottin families and that only two disulfides (II-V and III-VI) are highly conserved [17]. This observation was later supported by folding experiments on the squash inhibitor EETI-II in which it was shown that two disulfides are necessary and sufficient to stabilize most of the native structure [18-20].

The synthesis and biophysical study of the truncated EETI-II peptide Min-23, comprising only cysteines II, III, V and VI, confirmed that the two-disulfide motif we called the Cystine Stabilized Beta-sheet (CSB) motif is an autonomous folding unit and is the elementary structural motif in knottins [22]. Interestingly, although this motif has been shown to display high stability (Tm ~ 100°C), it has never been observed alone in nature, without supplementary disulfide bridges. The CSB motif is found not only in knottins, but also in numerous other small disulfide-rich folds (e.g. EGF-like motifs or scorpion toxins), and is actually the most widespread disulfide motif (data not shown). It may thus be hypothesized either that an ancestral CSB-based protein once existed but was lost during evolution, or that the different CSB-based folds appeared independently as a result of convergent evolution. Relationships between the CSB motif and knottins are summarized in Fig. 1.

As new knottins are discovered, it is becoming more and more apparent that nature has used this stable scaffold in very different contexts to achieve various biological roles. At present more than 12 protein families, with virtually no sequence identity, share the knottin fold. Due to its small size, its well-defined structure, and its high stability, this scaffold is thought to be an appealing structural template in drug design developments. Therefore, to facilitate analyses and comparisons, we have recently proposed a simple knottin nomenclature based on loop lengths between cysteines, and a unique knottin numbering based on cysteine connectivity and structural conservation of the CSB motif [24]. These are illustrated in Fig. 2 and used throughout the rest of the paper.

Moreover, a KNOTTIN database gathering information on knottins has been set up [24] and can be freely accessed on the Internet (http://knottin.cbs.cnrs.fr or http://knottin.com). Database searches by keyword, sequence, nomenclature, or geometrical pattern can be carried out and various displays are proposed. Renumbered sequences and structures, as well as structurally fitted PDB files, are available. All these tools greatly facilitate knottin analyses, and particularly sequence and structure comparisons, as shown in the succeeding sections. A sequence alignment between representative knottin sequences is shown in Fig. 3A.

Fig. (1). From elementary CSB motif to macrocyclic knottins. The figure indicates structural relationships between knottins and non-knottin CSB-based proteins. The images of the CSB motif, the linear and circular squash inhibitors, and the cyclotides were prepared with the MOLMOL [21] and POV-Ray (http://www.povray.org/) programs using coordinates of Min-23, EETI-II, MCoTI-II and kalata B1. The (a)b.c(d)e[f] nomenclature is explained in Fig. (2). No cyclic knottins have yet been discovered for which c=0. Note that relationships in this figure do not necessarily imply evolutionary relationships (see discussion).


Fig. (2). Knottin numbering and nomenclature. An automatically drawn two-dimensional (2D) Collier de Perles representation [23, 24] of MCoTI-II is shown. The line between residues 43 and 58 is a result of the 2D representation and does not indicate a chain break. The cysteines involved in the knot are displayed with a black or grey background. Roman and Arabic numbers indicate the order in the sequence and the new unique numbering of cysteines involved in the knot, respectively. Cysteine IV (grey background) does not have a fixed number. Letters a-f refer to successive loops between cysteines of the knot, and to the number of amino acids therein. The latter values are used to establish the nomenclature {e.g. MCoTI-II: (6)5.3(1)5[8]}. Numbers between round brackets refer to peptide segments involved in the disulfide macrocycle, whereas the number between square brackets refers to the C-to-N linker in cyclic squash inhibitors or cyclotides.
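The loop-length nomenclature described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and arguments are hypothetical, and the test sequence is a placeholder in which every loop residue is an alanine (only the SGSDGGV prefix and the final glycine are taken from the text), chosen so that the loop lengths match those quoted for MCoTI-II.

```python
def knottin_nomenclature(sequence, knot_cys, cyclic=False):
    """Return the (a)b.c(d)e[f] string for a knottin.

    sequence : one-letter amino acid string
    knot_cys : 0-based indices of the six knot cysteines (I..VI), in sequence order
    cyclic   : if True, append the C-to-N linker length as [f]
    """
    if len(knot_cys) != 6 or any(sequence[i] != "C" for i in knot_cys):
        raise ValueError("expected the positions of six cysteines")
    # Loops a-e: number of residues strictly between consecutive knot cysteines.
    a, b, c, d, e = (knot_cys[k + 1] - knot_cys[k] - 1 for k in range(5))
    name = f"({a}){b}.{c}({d}){e}"
    if cyclic:
        # Loop f: residues after Cys VI plus residues before Cys I.
        linker = (len(sequence) - 1 - knot_cys[5]) + knot_cys[0]
        name += f"[{linker}]"
    return name


# Placeholder 34-residue sequence (NOT the real MCoTI-II sequence):
# loop residues are alanines, loop lengths follow the caption.
toy = ("SGSDGGV" + "C" + "A" * 6 + "C" + "A" * 5 + "C" + "A" * 3
       + "C" + "A" + "C" + "A" * 5 + "C" + "G")
knot = [i for i, aa in enumerate(toy) if aa == "C"]
print(knottin_nomenclature(toy, knot, cyclic=True))
```

For a linear knottin the same call with cyclic=False simply omits the bracketed linker term.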

Fig. (3). A. Sequence alignment between knottins. One representative sequence is shown for most knottin families except conotoxins (GVIA and gm9a) and squash inhibitors (CPTI-II, EETI-II and MCoTI-I, -II and -III). The two-disulfide peptide Min-23 corresponding to the CSB motif is shown at the bottom. Disulfide bridges of cysteines I-IV and of the CSB motif are shown on top as thin and thick lines, respectively. Additional disulfide bridges are shown as thin boxes and lines. Numbering is according to [24] and Fig. (2). The X letter in the PCI sequence stands for the ambiguous Glu/Gln residue. The "<" in the MCoTI-III sequence stands for an N-terminal pyroglutamic acid. B. Proteolytic fragments identified during MCoTI-II characterization in comparison with the EETI-II sequence.


MACROCYCLIC SQUASH INHIBITORS FROM MOMORDICA COCHINCHINENSIS

Isolation and Characterization

MCoTI-I and -II were isolated from dormant seeds of the squash Momordica cochinchinensis (MCo), a common Cucurbitaceae in Vietnam [25]. These trypsin inhibitors (TIs), once extracted from homogenized seeds, were purified using a series of chromatographic steps including gel filtration and ion-exchange chromatography, TIs being detected by testing the collected fractions for trypsin inhibitory activity (TIA) [25]. Different TIs were separated at this stage and were further analyzed and purified using reverse-phase HPLC. Finally, six species were isolated on a mono-S column and characterized [4]. Results are summarized in Fig. 4.

The sequence of the most abundant TI (i.e. MCoTI-II) was determined first. Amino acid analysis of the compound showed that MCoTI-II was composed of 34 residues. To allow sequencing, half-cystines were reduced and alkylated. Mass spectrometry analysis of the resulting species showed that MCoTI-II contained three disulfide bonds, as is the case with all known squash inhibitors. However, all attempts to sequence the alkylated peptide remained unsuccessful, indicating a blocked N-terminus. Proteolytic digestion was thus performed with the endo-Lys-C protease, since amino acid analysis revealed the presence of three lysine residues in the sequence. Proteolysis yielded two fragments (Fig. 3B): a small one, which could be directly sequenced, and a large one, which was sub-digested using chymotrypsin (data not shown). Sequence alignment of the fragments with known squash inhibitors indicated that the twenty N-terminal residues of the large fragment significantly matched the C-terminal part of the TIs consensus, while its C-terminal sequence was homologous to the N-terminal portion of squash TIs. This surprising result was strongly indicative of a macrocyclic structure. The two fragments comprised 33 residues together, thus lacking one residue. According to amino acid analysis and protease specificity, this residue had to be a lysine. The calculated molecular weight of a linear peptide composed of these three portions is 18 mass units above the measured value, i.e. the mass of one water molecule, again suggesting the macrocyclic nature of this TI. This characteristic, as well as the sequence, was fully confirmed by digestion of the reduced/alkylated MCoTI-II using endo-Asp-N, as shown in Fig. 3B.

As there was no first or last residue, numbering was based on alignment with linear squash TIs. The last residue was considered to be the glycine residue corresponding to the conserved C-terminal glycine in linear squash inhibitors. It is likely that the sequence shown in Fig. 3A is contained as such in the linear precursor. Indeed, the first residues in the sequence of MCoTI-II (SGSDGGV) are clearly similar to the corresponding pro-sequence of the towel gourd trypsin inhibitor TGT-II (SGRHGGI) [26].
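The 18-unit mass argument used above can be checked with simple arithmetic: a linear peptide carries one extra water (a free N-terminal H and C-terminal OH) compared with its head-to-tail cyclized form. The sketch below is an illustration only; the function name is hypothetical, the residue masses are standard average values quoted to four decimals (to be verified against a reference table before any quantitative use), and the example fragment is the SGSDGGV stretch quoted in the text, not a complete inhibitor sequence.

```python
WATER = 18.0153      # average mass of H2O (Da)
HYDROGEN = 1.00794   # average mass of H (Da)

# Standard average residue masses (Da); assumed values, check before real use.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def peptide_mass(sequence, cyclic=False, disulfides=0):
    """Average mass of a peptide, optionally head-to-tail cyclic and oxidized."""
    mass = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence)
    if not cyclic:
        mass += WATER                    # free termini retain one H2O
    mass -= 2 * HYDROGEN * disulfides    # each disulfide bond removes two H
    return mass


fragment = "SGSDGGV"  # stretch quoted in the text, used only as an example
difference = peptide_mass(fragment) - peptide_mass(fragment, cyclic=True)
print(f"linear minus cyclic: {difference:.2f} Da")
```

The difference is independent of the sequence, which is why a single intact-mass measurement was enough to point to a macrocycle once the residue composition was known.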

Other species were identified using a similar approach. Firstly, the species isolated from peak D with a mass identical to that of MCoTI-II, and that isolated from peak F with a mass 18 units below that of MCoTI-II, were shown to be isomeric forms of MCoTI-II. They were found to arise from rearrangement of an Asp-Gly peptide bond located in the C-to-N linker, when compared with linear homologs. The former contained a β-Asp-Gly bond. The latter species corresponded to the succinimide cyclic intermediate (aspartimide) formed during conversion of the α-Asp-Gly peptide bond (MCoTI-II) into the β-Asp-Gly bond. The unusual stability of the succinimide moiety might arise from the overall constrained structure of the macrocyclic peptide. The species contained in peak B, MCoTI-I, was also shown to be macrocyclic. As shown in Fig. 3, its sequence differed from that of MCoTI-II at two positions close to the reactive site. As for MCoTI-II, a species with a mass identical to that of MCoTI-I was isolated from peak A. Although it has not been fully characterized, it is likely that this species was also derived from an Asp-Gly bond rearrangement.

Finally, a third inhibitor, MCoTI-III, was identified from peak F. Although the reduced/alkylated species could not be directly sequenced, this appeared not to be due to macrocyclization but to the presence of an N-terminal pyroglutamate residue. After removal of pyroglutamate using a pyroglutamyl aminopeptidase, MCoTI-III was shown to be a regular linear member of the squash inhibitor family (Fig. 3A).

At this time MCoTI-I and -II are the only known cyclic squash inhibitors, and the reason why remains to be determined. The cyclization of MCoTIs might depend on the presence of a specific but unknown transpeptidase in Momordica cochinchinensis seeds that would be absent in other sources of squash inhibitors. Alternatively, minor macrocyclic TIs in other Cucurbitaceae might not have been detected due to sequencing difficulties. In fact, in contrast to cyclotides, circularization was not expected in squash inhibitors and could have been missed before the discovery of MCoTIs, raising the possibility that Cucurbitaceae produce both linear and cyclic inhibitors, with only the former being directly sequenceable. Indeed, just after we determined the sequence of MCoTI-II, another group reported the partial sequence of a TI isolated from Momordica cochinchinensis [27]. The major mass measured on this sample was close to 3480, corresponding to that of MCoTI-I, probably the major inhibitor in the sample. However, the reported sequence was that of a contaminant corresponding to a cleaved form of MCoTI-II. Clearly, even if the macrocyclic TI was the major species in the sample, only a linear sequenceable species was reported.

Fig. (4). Isolation of MCoTI-I, -II, and -III. Ion-exchange chromatography on a mono-S column using a NaCl gradient was performed on compounds with TIA eluted from the gel filtration column [4, 25]. Several peaks containing TIA, indicated A to F, were collected. Average masses of the major compounds contained in these peaks have been measured by electrospray mass spectrometry.

Chemical Synthesis of MCoTI-I

The potential interest in cyclic MCoTIs as scaffolds for drug design (see below) prompted us to perform chemical synthesis. Chemical synthesis and folding of circular peptides containing multiple disulfide bonds is a complicated process. Two main strategies have been reported, depending on whether disulfide bridge formation or backbone cyclization is performed first: (i) After classical chain assembly and work-up, the linear precursor peptide is first oxidized, then cyclization is achieved via conventional coupling procedures [28]. This approach, in which folding favors cyclization by bringing the N- and C-termini into close proximity, is not fully compatible with Lys- and Asp-containing peptides, as these residues may undergo undesirable couplings during the cyclization step. (ii) Cyclization is achieved prior to oxidation, either directly on the resin or via a C-terminal thioester [29]. Whether in vivo cyclization of MCoTIs occurs after or before disulfide bond formation remains unknown, although cyclization after disulfide bond formation, which would bring the N- and C-termini of the linear precursor into close proximity, appears more likely. Nevertheless, we have synthesized cyclic MCoTI-I using the second approach and then using a new, simpler protocol [30].

Our first synthesis of MCoTI-I was based on the second approach, and used the thioester ligation procedure described by Tam & Lu [29]. This approach consists of several steps that can be summarized as follows: (i) Peptide elongation was initiated with Boc-Ile79-CO-S-CH2CH2CO-MBHA resin. Position 79 was selected in order to favor the intramolecular transthioesterification (via the thiol of Cys80) with the thioester, leading to the head-to-tail cyclization. Cysteine side-chains were protected by methylbenzyl (Meb) groups at positions 40, 80, 60, 100 (the cysteines of the CSB motif), and acetamidomethyl (Acm) groups at positions 20 and 78 (Figs. 2 and 3). (ii) After cleavage with HF and classical work-up, the linear deprotected peptide (except Cys(Acm)) was submitted to cyclization. (iii) Oxidation of the cysteines of the CSB motif (40, 80, 60, 100) was performed in the presence of DMSO, followed by a rapid HPLC purification. (iv) The last disulfide bridge (cysteines 20 and 78) was formed (I2/MeOH), and MCoTI-I (major peak) was finally obtained by HPLC purification (Mass: 3479.29, expected: 3478.51). The trypsin inhibitory activity of the synthetic product was comparable to that of native MCoTI-I, as assessed by a qualitative agar-agar dish assay (with edestin as substrate) based on the method described by Leluk & Pham [31].
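The orthogonal cysteine protection used in steps (i) to (iv) can be written down as a small data structure. The sketch below is purely illustrative: the names are hypothetical, and the only chemistry it encodes is what the text states (Meb on the four CSB cysteines, Acm on cysteines 20 and 78, Meb removed first). It does not model the fact that the correct pairing among the four free CSB cysteines is decided by folding rather than by the protecting groups.

```python
# Disulfide bridges in knottin numbering (Roman label -> cysteine positions).
BRIDGES = {"I-IV": (20, 78), "II-V": (40, 80), "III-VI": (60, 100)}

# Side-chain protection as described for the first synthesis.
PROTECTION = {20: "Acm", 78: "Acm", 40: "Meb", 60: "Meb", 80: "Meb", 100: "Meb"}

# Meb is removed during HF cleavage; Acm survives until the iodine step.
DEPROTECTION_ORDER = ("Meb", "Acm")


def oxidation_plan(bridges=BRIDGES, protection=PROTECTION, order=DEPROTECTION_ORDER):
    """Group the bridges by the step at which both partner thiols become free."""
    plan = []
    for group in order:
        formed = []
        for name, (i, j) in bridges.items():
            if protection[i] != protection[j]:
                raise ValueError(f"bridge {name} has mismatched protecting groups")
            if protection[i] == group:
                formed.append(name)
        plan.append((group, formed))
    return plan


for group, formed in oxidation_plan():
    print(group, "->", ", ".join(formed))
```

The plan lists the two CSB bridges under the first (Meb) step and the I-IV bridge under the second (Acm) step, mirroring steps (iii) and (iv) above.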

In our second preparation of MCoTI-I, Boc-Ala63-CO-S-CH2CH2CO-MBHA resin was used as starting material. All six Cys residues were introduced as Meb derivatives. Elongation of the peptide was achieved according to the method described earlier [32]. After HF cleavage, the crude peptide was dissolved in water containing 10% CH3CN, and the solution was stirred for 24 h at pH 8. The major peak obtained by HPLC purification coeluted with the sample of MCoTI-I obtained from the first synthesis (mass: 3479.57).

In this method, cyclization and oxidation of all six cysteines were performed in a single step. Despite its simplicity, this new approach afforded a slightly better yield than the tedious step-by-step approach of Tam & Lu [29]. The good yield in native protein is likely a result of the strong tendency of MCoTI-I to form native disulfide bridges, as previously observed for EETI-II [32]. Much simpler and more cost-effective, this approach opens new routes to the chemical synthesis of cyclic squash inhibitors. It would be interesting to see if it can be applied to other macrocyclic knottins as well.

Three-dimensional Structure of MCoTI-II, and Comparison with the Cyclotide Kalata B1

Sufficient quantities of MCoTI-II were gathered from the natural source for structural studies, and the solution structure was solved by NMR simultaneously by our group and by David Craik's group [33, 34]. Not surprisingly, the structure of MCoTI-II is very close to that of linear homologs. As shown in Fig. 1, all typical structural elements of the squash inhibitors are present in MCoTI-II: the triple-stranded β-sheet and the two disulfide bridges that define the CSB motif, as well as the short 3₁₀ helix and the two β-turns. The root mean square (rms) deviation for superimposition of the backbone atoms of residues 40, 60-61, 79-81 and 99-100 (knottin numbering, see Fig. 2) of the core CSB motif of MCoTI-II [33] onto the reference X-ray knottin structure (CPTI-II, PDB ID 2btc, chain I) is as low as 0.35 Å (see the KNOTTIN database). Extending the superimposition to the whole segment 40-100 (i.e. from the second to the last cysteine) or to the 20-100 segment (i.e. from the first to the last cysteine, including the somewhat flexible inhibitory loop) leads to rms deviations of 0.76 Å and 0.85 Å, respectively. These values are sufficiently low to claim that the C-to-N cyclization in MCoTI-II has no significant impact on the protein structure.
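For readers who want to reproduce this kind of comparison, the rms deviation over the core residues can be sketched as follows. Several assumptions apply: the two structures are taken to be already superimposed (for example, structurally fitted files such as those the KNOTTIN database is said to provide), so no optimal fitting step is performed here; the choice of N, CA and C as "backbone atoms" is an assumption; and the coordinates in the example are synthetic placeholders, not real structural data.

```python
import math

CSB_CORE = (40, 60, 61, 79, 80, 81, 99, 100)  # knottin numbering, from the text
BACKBONE = ("N", "CA", "C")                   # assumed backbone atom set


def backbone_rmsd(model, reference, residues=CSB_CORE, atoms=BACKBONE):
    """RMS deviation between two pre-superimposed structures.

    model, reference : dicts mapping (residue number, atom name) -> (x, y, z)
    """
    squared = []
    for res in residues:
        for atom in atoms:
            p, q = model[(res, atom)], reference[(res, atom)]
            squared.append(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))
    return math.sqrt(sum(squared) / len(squared))


# Synthetic placeholder coordinates: the "model" is the reference shifted by 0.3 Å.
reference = {(r, a): (float(r), float(k), 0.0)
             for r in CSB_CORE for k, a in enumerate(BACKBONE)}
model = {key: (x, y, z + 0.3) for key, (x, y, z) in reference.items()}
print(f"{backbone_rmsd(model, reference):.2f} Å")
```

Passing a longer residue list (for instance every residue from 40 to 100 present in both structures) gives the extended comparisons mentioned above; the actual superimposition would require a fitting routine that is deliberately left out of this sketch.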

Actually, the C-to-N linker appeared as the most flexible part of the molecule, rather than as a stabilizing element [33, 34], and macrocyclization is not likely to be an essential element of MCoTI-II stability. However, no linearization has yet been reported for MCoTI-II for comparison, and the details of the impact of macrocyclization on the structure and stability of MCoTI-II remain to be determined. Nevertheless, the isolation of the linear homologous trypsin inhibitor MCoTI-III (77% identity with MCoTI-II), also from seeds of Momordica cochinchinensis, clearly demonstrates that cyclization is unnecessary for trypsin inhibition. Thus the cyclic feature of MCoTI-II is a determinant of neither the stability nor the activity of the molecule. It is worth noting here that pseudo-cyclizations do occur in most squash TIs via salt-bridging between the side-chain of an N-terminal arginine and the C-terminal carboxylate [11, 35-38]. Nevertheless, the biological role of cyclization in MCoTI-I and -II remains unclear. The melting temperature of EETI-II was shown to be approximately 140°C, revealing the extremely high stability of linear squash TIs [22]. Even the truncated two-disulfide Min-23 peptide (Figs. 1 and 3) displays a high stability (Tm ~100°C) [22]. Therefore, the most likely significant impact of cyclization would be on resistance to exoproteases through removal of the protein N- and C-termini, which might be relevant for the biological role of MCoTIs.

Interestingly, a quite different scenario has been reported for the topologically very similar kalata B1 cyclotide. It has been shown that linearization of this peptide induces only limited disruption of structural features but a total loss of hemolytic activity [39], suggesting that, in this case, cyclization affords a slight but necessary stabilization. Simple comparison of the loop [f] sequences in MCoTI-II and kalata B1 supports these observations. The 8-residue linker in MCoTI-II contains four glycines and no prolines, while the 7-residue linker in kalata B1 contains one proline and only one glycine. Since glycines and prolines are the most and least flexible of all residues, respectively, this simple sequence analysis is consistent with the linker in MCoTI-II being more flexible. Although these differences appear very subtle, they might contain part of the explanation why linear squash inhibitors are common, whereas natural linear analogs of cyclotides are unknown.
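The glycine/proline count used in this comparison is easy to reproduce. In the minimal sketch below, the helper name is hypothetical and the MCoTI-II linker string is an assumption: it is reconstructed from the text as the C-terminal glycine that follows Cys VI plus the SGSDGGV stretch that precedes Cys I, a reconstruction that happens to reproduce the eight residues and four glycines quoted above. The kalata B1 linker is not given in the text and is therefore not included.

```python
def gly_pro_content(linker):
    """Count glycines and prolines in a linker given as a one-letter string."""
    return {"length": len(linker), "Gly": linker.count("G"), "Pro": linker.count("P")}


# Assumed loop [f] of MCoTI-II: last residue (Gly) + first residues (SGSDGGV).
mcoti2_linker = "G" + "SGSDGGV"
print(gly_pro_content(mcoti2_linker))
```

Applying the same function to any other linker sequence gives a quick, if crude, indication of its likely flexibility.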

Although cyclic squash inhibitors and cyclotides share a common topology, several differences can be observed apart from the C-to-N linker. In contrast to cyclotides, the biological activity of the squash inhibitors is most certainly inhibition of serine proteases. Accordingly, the canonical inhibitory binding loop, i.e. loop (a), is well defined and conserved. Although possible inhibitory activity for kalata B1 has been examined once, no canonical loop can be recognized in cyclotides, and loop (a) is shorter (3 residues), playing a structural rather than functional role through hydrogen bonding of the Glu22 side-chain with backbone amides [40]. Conversely, loop b is highly hydrophilic in squash inhibitors, with a structural role due to hydrogen bonding of the side chains of Asp43 and Asp59, but rather hydrophobic in cyclotides, with a potential biological role. These observations support previous analyses suggesting that different sequences and different stabilizing interactions can give rise to highly similar 3D structures [41].

WHY DO ALL KNOTTIN FAMILIES NOT HAVE CYCLIC MEMBERS?

Knottins define a very intriguing structural class of small disulfide-rich proteins. Their scaffold is very small, yet remarkably stable thanks to three knotted disulfides. The topological organization places the N- and C-termini in rather close proximity, since they lie at the same end of two adjacent strands of an anti-parallel β-sheet. This proximity has allowed circularization by natural head-to-tail ligation in two knottin families, the cyclotides and the squash inhibitors. Interestingly, it has long been observed that a surprisingly high fraction of proteins have N- and C-termini close to each other [42]. This feature has been used in many cases to perform non-native circularization [43, 44] or circular permutation of protein sequences [45, 46]. Starting from a representative ensemble of 2169 protein structures [47], we have performed a brief analysis of distances between N- and C-termini. The resulting distribution, shown in Fig. 5, indicates that a significant number of proteins have N- and C-termini within 15-20 Å, a distance that can be easily bridged by linkers of just a few residues.

Fig. (5). Histogram of distances between N- and C-termini in a representative ensemble of protein structures.

Nevertheless, the discovery of cyclic MCoTIs has revealed the first family where nature has used this strategy of circularization for a few homologs, connecting proximal N- and C-termini through a short peptide linker, whereas most members remain linear. It is striking, however, that similar circularization does not occur in other knottin families, which contain only linear members. Comparison of squash inhibitors and cyclotides with linear knottin families provides potential clues on circularization, based on the different location of Cys IV between families. The two known families that include circular members, the cyclotides and the squash inhibitors, have a cysteine in position 78 (knottin numbering, Figs. 2 and 3), i.e. near Cys V, whereas it is in position 61, i.e. adjacent to Cys III, in most other families (Fig. 3). Since Cys I, which is close to the N-terminus, is disulfide linked with Cys IV, the displacement of Cys IV from one end to the other end of loop c places the N-terminus in quite different locations, and significantly modifies the distance between termini. Thus, the distance between the amide of residue 19 and the carbonyl of residue 101 in the linear (CPTI-II, PDB ID: 2btc, chain I) and cyclic (MCoTI-II, PDB IDs: 1ha9, 1ib9) squash inhibitors is about 9 Å. This distance is about twice as large (18 Å) in omega-agatoxin IVB (Fig. 3, PDB ID: 1agg), and this difference is roughly conserved over other squash inhibitors and spider toxins. Thus it may be postulated that circularization is easier for knottins with nomenclatures such that c>(d) {e.g. 1ha9: (6)5.3(1)5[8]} than when the reverse is true {e.g. 1agg: (7)6.0(4)10}. Although chemical circularization of a conotoxin has been reported [48], the detailed impact on the conotoxin structure is not yet available.
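The c>(d) criterion postulated above can be applied mechanically to nomenclature strings. The following sketch is illustrative only; the regular expression and function names are hypothetical, and the criterion itself is the authors' postulate rather than an established rule.

```python
import re

# (a)b.c(d)e with an optional [f] term for cyclic knottins.
NOMENCLATURE = re.compile(r"\((\d+)\)(\d+)\.(\d+)\((\d+)\)(\d+)(?:\[(\d+)\])?")


def parse_nomenclature(name):
    """Split an (a)b.c(d)e[f] string into loop lengths; f is None if absent."""
    match = NOMENCLATURE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a knottin nomenclature string: {name!r}")
    return {loop: (int(value) if value is not None else None)
            for loop, value in zip("abcdef", match.groups())}


def c_exceeds_d(name):
    """True when loop c is longer than loop d, the case postulated to ease cyclization."""
    loops = parse_nomenclature(name)
    return loops["c"] > loops["d"]


for pdb_id, name in (("1ha9", "(6)5.3(1)5[8]"), ("1agg", "(7)6.0(4)10")):
    print(pdb_id, name, c_exceeds_d(name))
```

With the two examples quoted in the text, the check is satisfied for 1ha9 (c=3, d=1) and not for 1agg (c=0, d=4).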

Nevertheless, the observation of macrocyclic squash inhibitors opens new perspectives in the application of the knottin scaffold in drug design, and these are briefly discussed below.

POTENTIAL APPLICATIONS OF (CYCLIC) SQUASH INHIBITORS AND ANALOGS IN DRUG-DESIGN

Although the role of the macrocyclization remains unclear, an obvious advantage is to confer resistance to exopeptidases. Together with knotted disulfide bridges, these constraints render the macrocyclic peptides highly stable. As discussed by Craik et al. in this issue [1], several cyclotides were shown to be resistant to proteases and to boiling treatment, and kalata B1, the active component of extracts used in traditional medicine, appears to be orally active. Similarly, MCoTI-II was shown to be resistant to thermolysin at 50°C and to heat treatment of the seeds [4], and thus also represents a very interesting scaffold in drug-design approaches.

Squash inhibitors are small but very potent serine proteinase inhibitors, and it has been shown that mutation at or near the P1 site allows generation of potent inhibitors of serine proteinases of medical interest, e.g. neutrophil elastase, which is involved in several diseases (emphysema, cystic fibrosis or rheumatoid arthritis) [15, 49, 50]. This could be extended to other serine proteinases of pharmacological interest, such as coagulation factors and other proteases of the clotting cascade, or matriptase, which is involved in cancer. Using the cyclic nature of MCoTIs in similar approaches might improve the bioavailability of the new molecules.

Although more speculative, a still much greater potential can be expected from using the MCoTI scaffold as a stable and protease-resistant structural template onto which new biological activities could be transferred. Several strategies in this direction have already been reported using the linear squash inhibitor EETI-II or the elementary CSB motif. These studies suggest that the homologous cyclic MCoTI-I or -II peptides could be easily modified to engineer small, stable molecules with new, selected activities. These approaches are summarized in Fig. 6.

A pioneering study transferred the C-terminal sequence of PCI onto EETI-II, resulting in a double-headed inhibitor of trypsin and carboxypeptidase [15, 17]. More recently, the primary trypsin binding loop (a) of EETI-II was replaced by either a 13- or a 17-residue epitope from the Sendai virus L-protein or the human bone Gla-protein, respectively, and the chimeric peptides were displayed on the Escherichia coli outer membrane as fused proteins [51]. In another study, the same binding loop was replaced by a sequence derived from the third domain of the turkey ovomucoid inhibitor and optimized to inhibit porcine pancreatic elastase [52]. Finally, this same loop was also subjected to randomization in an mRNA display approach [53].

Fig. (6). Summary of drug design reports based on the EETI-II squash inhibitor scaffold.


The second β-turn of EETI-II, i.e. loop e, has also been the subject of several studies. First, selection of trypsin binders from a phage-displayed library with four randomized positions in loop e showed that only a few sequences are compatible with this loop [54].

Furthermore, circular permutation of EETI-II, in which the termini are linked by a (Gly)3 tripeptide and loop e is cleaved, yielded a correctly folded compound with a native-like structure (unpublished results). This result shows that loop e is not essential for folding, although its transfer into the homologous CMTI-III was shown to improve synthesis yield [55]. Lastly, grafting a new sequence, taken from the SH3 RT loop of an HIV-1 Nef-binding kinase, onto the same loop in Min-23 resulted in a correctly folded chimera with a conserved CSB motif [56]. All these studies clearly demonstrate that loops (a) and e of linear squash inhibitors can accommodate large sequence modifications. Moreover, comparison of linear and circular squash inhibitors strongly suggests that the head-to-tail linker, i.e. loop [f], is not essential for correct folding of the circular compounds and can also be varied significantly. This was indeed verified by the correct folding of the circular permutant of EETI-II with a (Gly)3 loop [f] (unpublished results). Overall, these studies clearly demonstrate that the scaffold of the squash inhibitors (and probably of many other knottins) is a very promising template for drug design, and this is further reinforced by the recent discovery of cyclic squash inhibitors. There is no doubt that new molecules developed using one of the above approaches would benefit from the increased stability and protease resistance afforded by cyclization. In this context, it is worth noting that the CSB motif itself has N- and C-termini in extremely close proximity, as exemplified by the Min-23 peptide shown in (Fig. 1), but no cyclization of this kind of compound has yet been tested.

DIVERGENT OR CONVERGENT EVOLUTION?

An interesting question remains open to discussion, i.e. the putative evolutionary relationship between different knottin families and, more specifically, between cyclic knottins, i.e. squash inhibitors and cyclotides. The first part of the question has been tentatively addressed in a recent paper [57]. Based on gene organization and 3D structure, the authors suggest the existence of two different ancestors for knottins from plants and from animals. The structural criterion used was that animal knottins have a c=0 loop, whereas plant knottins have a c≠0 loop. However, although this is mostly verified, it can be seen from (Fig. 1) that several knottins contradict this proposal: on the one hand, conotoxin gm9a has a c=3 loop, whereas on the other hand, gurmarin from Gymnema sylvestre, α-amylase inhibitor from Amaranthus hypochondriacus, and PAFP-S from Phytolacca americana display a c=0 loop. Nevertheless, although many knottin families display sequences with no detectable relationship, it is likely from sequence comparisons (data not shown) that at least some knottin families are evolutionarily related, for example toxins from cone snails and toxins from spiders. Alternatively, it might be considered that the two-disulfide CSB motif is simply a stable structural arrangement often found in small proteins, just as helix bundles or β-sheet Greek-key motifs are observed in many unrelated globular proteins. All known CSB-based proteins, however, display at least one additional disulfide. But there may not be an infinite number of ways to add a supplementary disulfide, and creating a knottin may be one of the most stabilizing ways. In other words, it is tempting to speculate that many knottins have actually evolved to the same fold as a result of convergent evolution. Considering the absence of any significant sequence homology, the possibility that cyclotides from the Rubiaceae and Violaceae plant families and circular squash inhibitors from the Cucurbitaceae plant family have evolved by convergent evolution cannot be ruled out. Examples of evolutionary convergence of shape are well known at the morphological level, e.g. between the succulent euphorbias of Africa and the cacti of the Americas. Nevertheless, the intriguing structural proximity between the two families suggests evolutionary relationships [34]. In line with this, it can be noted that the Violaceae and Cucurbitaceae families are particularly close in taxonomy (common taxonomy in SwissProt [58] for Violaceae and Cucurbitaceae: Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids; eurosids I), but a definite answer regarding the existence of a common ancestor should probably await the discovery of many more protein and genetic sequences in these or related families.

ABBREVIATIONS

2D = Two-dimensional
3D = Three-dimensional
Acm = Acetamidomethyl
CSB = Cystine-Stabilized Beta-sheet
CPTI = Cucurbita pepo Trypsin Inhibitor
DMSO = Dimethylsulfoxide
EETI-II = Ecballium elaterium Trypsin Inhibitor-II
HPLC = High performance liquid chromatography
MBHA = Methylbenzhydrylamine
MCoTI = Momordica cochinchinensis Trypsin Inhibitor
Meb = Methylbenzyl
rms = Root mean square
NMR = Nuclear magnetic resonance
PCI = Potato Carboxypeptidase Inhibitor
PDB = Protein Data Bank
TCEP = Tris(2-carboxyethyl)phosphine
TI = Trypsin Inhibitor
TIA = Trypsin inhibitory activity

REFERENCES

[1] Craik, D. J., Daly, N. L., Plan, M. R., Trabi, M., and Mulvenna, J. (2004) Curr. Prot. & Pept. Sci., 5, 297-315.

[2] Göransson, U., Svangård, E., Claeson, P., and Bohlin, L. (2004) Curr. Prot. & Pept. Sci., 5, 317-329.


[3] Gustafson, K. R., McKee, T. C., and Bokesch, H. R. (2004) Curr. Prot. & Pept. Sci., 5, 331-340.

[4] Hernandez, J. F., Gagnon, J., Chiche, L., Nguyen, T. M., Andrieu, J. P., Heitz, A., Trinh Hong, T., Pham, T. T., and Le Nguyen, D. (2000) Biochemistry, 39, 5722-30.

[5] Bode, W., and Huber, R. (1992) Eur. J. Biochem., 204, 433-51.

[6] Laskowski, M., Jr., and Kato, I. (1980) Annu. Rev. Biochem., 49, 593-626.

[7] Polanowski, A., Wilusz, T., Nienartowicz, B., Cieslar, E., Slominska, A., and Nowak, K. (1980) Acta Biochim. Pol., 27, 371-82.

[8] Otlewski, J., and Krowarsch, D. (1996) Acta Biochim. Pol., 43, 431-44.

[9] Korsinczky, M. L., Schirra, H. J., and Craik, D. J. (2004) Curr. Prot. & Pept. Sci., 5, 351-364.

[10] Konarev, A. V., Anisimova, I. N., Gavrilova, V. A., Vachrusheva, T. E., Konechnaya, G. Y., Lewis, M., and Shewry, P. R. (2002) Phytochemistry, 59, 279-91.

[11] Bode, W., Greyling, H. J., Huber, R., Otlewski, J., and Wilusz, T. (1989) FEBS Lett., 242, 285-92.

[12] Heitz, A., Chiche, L., Le-Nguyen, D., and Castro, B. (1989) Biochemistry, 28, 2392-8.

[13] Chiche, L., Gaboriaud, C., Heitz, A., Mornon, J. P., Castro, B., and Kollman, P. A. (1989) Proteins, 6, 405-17.

[14] Rees, D. C., and Lipscomb, W. N. (1982) J. Mol. Biol., 160, 475-98.

[15] Le Nguyen, D., Heitz, A., Chiche, L., Castro, B., Boigegrain, R. A., Favel, A., and Coletti-Previero, M. A. (1990) Biochimie, 72, 431-5.

[16] Pallaghy, P. K., Nielsen, K. J., Craik, D. J., and Norton, R. S. (1994) Protein Sci., 3, 1833-9.

[17] Chiche, L., Heitz, A., Padilla, A., Le-Nguyen, D., and Castro, B. (1993) Protein Eng., 6, 675-82.

[18] Heitz, A., Chiche, L., Le-Nguyen, D., and Castro, B. (1995) Eur. J. Biochem., 233, 837-46.

[19] Heitz, A., Le-Nguyen, D., Castro, B., and Chiche, L. (1997) Lett. Pept. Sci., 4, 245-9.

[20] Le-Nguyen, D., Heitz, A., Chiche, L., el Hajji, M., and Castro, B. (1993) Protein Sci., 2, 165-74.

[21] Koradi, R., Billeter, M., and Wuthrich, K. (1996) J. Mol. Graph., 14, 51-5, 29-32.

[22] Heitz, A., Le-Nguyen, D., and Chiche, L. (1999) Biochemistry, 38, 10615-25.

[23] Lefranc, M. P., Giudicelli, V., Ginestoux, C., Bodmer, J., Muller, W., Bontrop, R., Lemaitre, M., Malik, A., Barbie, V., and Chaume, D. (1999) Nucleic Acids Res., 27, 209-12.

[24] Gelly, J.-C., Gracy, J., Kaas, Q., Le Nguyen, D., Heitz, A., and Chiche, L. (2004) Nucleic Acids Res., 32, D156-D159.

[25] Pham, T. C., and Nguyen, T. M. (1996) VNU Journal of Science, Nat. Sci. (Vietnamese, English summary), 33-41.

[26] Ling, M. H., Qi, H. Y., and Chi, C. W. (1993) J. Biol. Chem., 268, 810-4.

[27] Huang, B., Ng, T. B., Fong, W. P., Wan, C. C., and Yeung, H. W. (1999) Int. J. Biochem. Cell Biol., 31, 707-15.

[28] Daly, N. L., Love, S., Alewood, P. F., and Craik, D. J. (1999)

[29] Tam, J., and Lu, Y.-A. (1997) Tetrahedron Lett., 38, 5599-602.

[30] Le Nguyen, D., Barry, L. G., Tam, J. P., Heitz, A., Chiche, L., Hernandez, J. F., and Pham, T. C. (2002) in Peptides 2002 (Benedetti, E., and Pedone, C., Eds.) pp. 182-183, Edizioni Ziino, Napoli, Italy.

[31] Leluk, J., and Pham, T. T. C. (1985) in XXIst Meeting of Polish Biochemical Society, p. 139, Krakow, Poland.

[32] Le-Nguyen, D., Nalis, D., and Castro, B. (1989) Int. J. Pept. Protein Res., 34, 492-7.

[33] Heitz, A., Hernandez, J. F., Gagnon, J., Hong, T. T., Pham, T. T., Nguyen, T. M., Le-Nguyen, D., and Chiche, L. (2001)

[34] Felizmenio-Quimio, M. E., Daly, N. L., and Craik, D. J. (2001) J. Biol. Chem., 276, 22875-82.

[35] Helland, R., Berglund, G. I., Otlewski, J., Apostoluk, W., Andersen, O. A., Willassen, N. P., and Smalas, A. O. (1999) Acta Crystallogr. D Biol. Crystallogr., 55, 139-48.

[36] Huang, Q., Liu, S., and Tang, Y. (1993) J. Mol. Biol., 229, 1022-36.

[37] Zhu, Y., Huang, Q., Qian, M., Jia, Y., and Tang, Y. (1999) J. Protein Chem., 18, 505-9.

[38] Thaimattam, R., Tykarska, E., Bierzynski, A., Sheldrick, G. M., Jaskolski, M., Zhu, Y., Huang, Q., Qian, M., Jia, Y., Tang, Y., Liu, S., Bode, W., Greyling, H. J., Huber, R., Otlewski, J., Wilusz, T., Helland, R., Berglund, G. I., Apostoluk, W., Andersen, O. A., Willassen, N. P., and Smalas, A. O. (2002) Acta Crystallogr. D Biol. Crystallogr., 58, 1448-61.

[39] Barry, D. G., Daly, N. L., Clark, R. J., Sando, L., and Craik, D. J. (2003) Biochemistry, 42, 6688-95.

[40] Rosengren, K. J., Daly, N. L., Plan, M. R., Waine, C., and Craik, D. J. (2003) J. Biol. Chem., 278, 8606-16.

[41] Laurents, D. V., Subbiah, S., and Levitt, M. (1994) Protein Sci., 3, 1938-44.

[42] Thornton, J. M., and Sibanda, B. L. (1983) J. Mol. Biol., 167, 443-60.

[43] Iwai, H., and Pluckthun, A. (1999) FEBS Lett., 459, 166-72.

[44] Goldenberg, D. P., and Creighton, T. E. (1983) J. Mol. Biol., 165, 407-13.

[45] Graf, R., and Schachman, H. K. (1996) Proc. Natl. Acad. Sci. USA, 93, 11591-6.

[46] Hennecke, J., Sebbel, P., and Glockshuber, R. (1999) J. Mol. Biol., 286, 1197-215.

[47] Jones, D. T. (1999) J. Mol. Biol., 292, 195-202.

[48] Craik, D., Daly, N. L., and Nielsen, K. J. (2000) PCT International Patent Application WO 0015654.

[49] Rolka, K., Kupryszewski, G., Ragnarsson, U., Otlewski, J., Wilusz, T., and Polanowski, A. (1989) Biol. Chem. Hoppe Seyler, 370, 499-502.

[50] Rozycki, J., Kupryszewski, G., Rolka, K., Ragnarsson, U., Zbyryt, T., Krokoszynska, I., Wilusz, T., Otlewski, J., and Polanowski, A. (1994) Biol. Chem. Hoppe Seyler, 375, 289-91.

[51] Christmann, A., Walter, K., Wentzel, A., Kratzner, R., and Kolmar, H. (1999) Protein Eng., 12, 797-806.

[52] Ay, J., Hilpert, K., Krauss, N., Schneider-Mergener, J., and Hohne, W. (2003) Acta Crystallogr. D Biol. Crystallogr., 59, 247-54.

[53] Baggio, R., Burgstaller, P., Hale, S. P., Putney, A. R., Lane, M., Lipovsek, D., Wright, M. C., Roberts, R. W., Liu, R., Szostak, J. W., and Wagner, R. W. (2002) J. Mol. Recognit., 15, 126-34.

[54] Wentzel, A., Christmann, A., Kratzner, R., and Kolmar, H. (1999) J. Biol. Chem., 274, 21037-43.

[55] Rolka, K., Kupryszewski, G., Ragnarsson, U., Otlewski, J., Krokoszynska, I., and Wilusz, T. (1991) in Peptides 1990 (Giralt, E., and Andreu, D., Eds.) pp. 768-771, ESCOM Science Publishers, Leiden, Netherlands.

[56] Heitz, A., Le-Nguyen, D., Dumas, C., and Chiche, L. (2000) in Peptides 2000 (Martinez, J., and Fehrentz, J. A., Eds.) pp. 415-416, Editions EDK, Paris, France.

[57] Zhu, S., Darbon, H., Dyason, K., Verdonck, F., and Tytgat, J. (2003) FASEB J., 17, 1765-7.

[58] Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M. C., Estreicher, A., Gasteiger, E., Martin, M. J., Michoud, K., O'Donovan, C., Phan, I., Pilbout, S., and Schneider, M. (2003) Nucleic Acids Res., 31, 365-70.

Received: February 15, 2004 Accepted: May 26, 2004
