the basidiomycete Macrolepiota procera
Jerica Sabotič1, Tatjana Popovič1, Vida Puizdar2 and Jože Brzin1
1 Department of Biotechnology, Jožef Stefan Institute, Jamova 39, Ljubljana, Slovenia
2 Department of Biochemistry and Molecular and Structural Biology, Jožef Stefan Institute, Jamova 39, Ljubljana, Slovenia
Introduction
Papain-like cysteine proteases are widespread in organisms ranging from bacteria to humans. They play important roles in many facets of physiology, and dysregulation of proteolytic activity can lead to a variety of pathologies, including cancer, rheumatoid arthritis, osteoarthritis and neurological disorders [1,2]. The most important regulators of protease activity are specific protease inhibitors. In addition to their considerable potential in diverse medical applications, protease inhibitors have been studied as tools for analysing proteolytic mechanisms and protein–protein interactions, and as biocidal agents against various organisms. There are several groups of inhibitors, mainly from animal and plant origins, that specifically inhibit papain-like cysteine proteases [3,4]. The first cysteine protease inhibitor isolated from higher fungi was clitocypin from the basidiomycete Clitocybe nebularis [5]. Clitocypin is a 16.8-kDa protein lacking cysteine and methionine resi-
Keywords
basidiomycetes; clitocypin; cysteine
protease; mycocypin; protease inhibitor
Correspondence
J. Sabotič, Department of Biotechnology,
Jožef Stefan Institute, Jamova 39, 1000
Ljubljana, Slovenia
Tel: +386 1 477 3754
Fax: +386 1 477 3594
E-mail: Jerica.Sabotic@ijs.si
Note
The nucleotide sequences reported in this
paper have been deposited in the
DDBJ/EMBL/GenBank databases under
accession numbers FJ495239, FJ495240,
FJ495241, FJ495242, FJ495243, FJ495244,
FJ495245, FJ495246, FJ495247, FJ495248,
FJ495249, FJ495250, FJ548751 and
FJ548752
(Received 25 March 2009, revised 22 May
2009, accepted 9 June 2009)
doi:10.1111/j.1742-4658.2009.07138.x
A new family of cysteine protease inhibitors from the basidiomycete Macrolepiota procera has been identified and the family members have been termed macrocypins. These macrocypins are encoded by a family of genes that is divided into five groups with more than 90% within-group sequence identity and 75–86% between-group sequence identity. Several differences in the promoter and noncoding sequences suggest regulation of macrocypin expression at different levels. High yields of three different recombinant macrocypins were produced by bacterial expression. The sequence diversity was shown to affect the inhibitory activity of macrocypins, the heterologously expressed macrocypins belonging to different groups showing differences in their inhibitory profiles. Macrocypins are effective inhibitors of papain and cysteine cathepsin endopeptidases, and also inhibit cathepsins B and H, which exhibit both exopeptidase and endopeptidase activities. The cysteine protease legumain is inhibited by macrocypins with the exception of one representative that exhibits, instead, a weak inhibition of the serine protease trypsin. Macrocypins exhibit similar basic biochemical characteristics, stability against high temperature and extremes of pH, and inhibitory profiles similar to those of clitocypin from Clitocybe nebularis, the sole representative of the I48 protease inhibitor family in the MEROPS database. This suggests that they belong to the same MEROPS family of cysteine protease inhibitors, the mycocypins, and substantiates the establishment of the I48 protease inhibitor family.
Abbreviations
AMC, 7-amido-4-methylcoumarin; Clt, clitocypin; Mcp1, 2, 3, 4, 5, macrocypin 1, 2, 3, 4, 5; rMcp1, 3, 4, recombinant macrocypin 1, 3, 4; Z, benzyloxycarbonyl.
dues, which, on account of its unique characteristics, was assigned to a new family of cysteine protease inhibitors, I48 in the MEROPS database inhibitor classification, also named mycocypins. Its profile of inhibition differs from those of the other then-known families of cysteine protease inhibitors. Clitocypin inhibits papain, cathepsins L and K, legumain and bromelain, but is inactive against cathepsin H, trypsin and pepsin. Clitocypin is encoded by a small family of genes that show sequence heterogeneity, which does not affect its inhibitory activity. In addition to a defensive role, a regulatory role in mushroom endogenous proteolytic systems was proposed, based on the specific inhibition of several putative fungal cysteine proteases [5–7].
The proteolytic potential of higher fungi is considerable in terms of the number and diversity of proteases they contain [8], and basidiomycetes are a rich source of novel proteolytic enzymes and their inhibitors [5,8]. As a result of a study carried out to investigate the extent and function of the I48 inhibitor family, which to date includes only one characterized member, we report a novel family of cysteine protease inhibitors from basidiocarps, or fruiting bodies, of the basidiomycete Macrolepiota procera, and have characterized some of its natural and recombinant members.
Results
Isolation of cysteine protease inhibitors from M. procera
Cysteine protease inhibitors were purified from basidiocarps of M. procera by a method similar to that used to purify clitocypin from basidiocarps of C. nebularis, which included hydrophobic interaction, ion-exchange and papain affinity chromatographies [5,9]. The inhibitory proteins were separated on SDS–PAGE under denaturing, nonreducing conditions into two bands corresponding to apparent molecular masses of 21 and 17 kDa (Fig. 1). The 43-residue N-terminal sequence for the former protein (Fig. 2) was 42% identical and 58% similar to the N-terminal sequence of clitocypin. The new 21-kDa cysteine protease inhibitor has been named macrocypin (Macrolepiota procera cysteine protease inhibitor, Mcp). The N-terminal sequence for the lower-molecular-mass protein could not be determined, possibly because of a blocked N terminus. The presence of a protein of similar size to that of clitocypin and purified by the same procedure, and with similar biochemical properties, indicated the presence of a clitocypin-like cysteine protease inhibitor in M. procera [5,6]. This was confirmed at the genetic level by genomic DNA dot-blot analysis and by amplification of partial clitocypin-like gene sequences. The partial sequences of the clitocypin-like genes from M. procera are more than 90% identical to those of clitocypin at the nucleotide level (see Supplementary Data in Doc. S1 and Doc. S2 and Figs S1 and S2).
Cloning and analysis of macrocypin cDNA and gene sequences
The N-terminal sequence (H2N-GLEDGLYTIRHLVE…) and the sequence of an internal peptide fragment obtained by digestion with cyanogen bromide (H2N-YIP-RKVFK) were used to design degenerate primers (Table S1). Degenerate primers and M. procera cDNA synthesized from total RNA as template were used to obtain a specific macrocypin sequence. This was then used to design specific nested primers for use in the genome walking and 3′ RACE methods.
Fig. 1. SDS–PAGE analysis of the natural cysteine protease inhibitors clitocypin (nClt) and macrocypin (nMcp). Clitocypin (from Clitocybe nebularis) and macrocypin (from Macrolepiota procera) purified from basidiocarps were analyzed under nonreducing, denaturing conditions and stained with Coomassie Blue. Lane M, protein molecular mass markers.
Two fragments were cloned from the two genomic libraries obtained by the Universal Genome Walker kit (Clontech, Heidelberg, Germany), using mcp gene-specific antisense primers: Pr4 (1000 bp) and Pr3 (204 bp). Each corresponds to the mcp 5′ UTR and promoter regions. Sequences spanning the 5′ coding region of the mcp gene were not identical (Fig. S3), suggesting that more than one gene encoding macrocypin is present in the M. procera genome, each of which has its corresponding promoter. Both promoter sequences have a typical TATA box (TATAAAA) present at position −85, and a putative transcription initiation site (CTAGTCC) at position −55, indicating transcriptional activity of macrocypin genes. The location of the TATA box at a position 30 nucleotides upstream from the putative transcription initiation site is in accordance with the locations found in several other fungal genes, which are usually within 30–60 nucleotides of the transcription start site [10].
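The spacing argument above can be illustrated with a minimal sketch. The reference point used for numbering positions is not stated in this section, so the sketch assumes that positions are counted as negative offsets from the 3′ end of the supplied upstream sequence (last base = −1). The function name `upstream_position` and the sequence `toy_upstream` are hypothetical; the sequence is synthetic filler built only to place the two reported motifs at −85 and −55, and is not the mcp promoter.

```python
from typing import Optional


def upstream_position(upstream: str, motif: str) -> Optional[int]:
    """Return the position of the first base of `motif` as a negative
    offset from the 3' end of `upstream` (last base = -1), or None."""
    idx = upstream.find(motif)
    return None if idx == -1 else idx - len(upstream)


# Synthetic 100-nt upstream region (NOT the real mcp promoter): filler "G"
# bases with the two motifs reported in the text placed at -85 and -55.
toy_upstream = "G" * 15 + "TATAAAA" + "G" * 23 + "CTAGTCC" + "G" * 48

tata = upstream_position(toy_upstream, "TATAAAA")  # -85
tis = upstream_position(toy_upstream, "CTAGTCC")   # -55
spacing = tis - tata                               # 30 nucleotides

# The text cites 30-60 nucleotides as the usual TATA-to-start distance
# in fungal genes [10].
in_typical_window = 30 <= spacing <= 60
print(tata, tis, spacing, in_typical_window)
```

With real promoter sequences, the same function could be applied to each cloned fragment once the numbering origin is known.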
To obtain the coding and 3′ UTR regions of the mcp mRNA, we used the 3′ RACE method with mcp-specific sense primers that differentiate between the two sequences obtained by genome walking. Partial sequences corresponding to promoter Pr4 showed two different lengths of the 3′ UTR (66 nucleotides and 87 nucleotides). Sequences corresponding to promoter Pr3 showed even more variation in 3′ UTR length (64, 69, or 88 nucleotides), indicating the possibility of macrocypin translation being regulated via the 3′ UTR [11]. There is no typical polyadenylation signal present in the 3′ UTR. The genomic sequence corresponding to the 3′ UTR was amplified from the two genomic libraries created using the Universal Genome Walker kit (Clontech), using mcp-specific sense primers. Two fragments, spanning 175 and 151 bp downstream of the stop codon, were obtained that were identical to the overlapping sequence of the 3′ UTR region amplified from M. procera cDNA, corresponding to the promoter fragment Pr4.
The full-length gene and cDNA sequences of macrocypin were obtained, using primers annealing to the 5′ UTR and 3′ UTR regions and genomic DNA or first-strand cDNA synthesized from the total RNA of M. procera as templates. Several different gene and cDNA sequences were amplified, all of which, however, shared the same gene structure. The mcp genes were found to be composed of four exons and three short introns with exon–intron boundaries matching the consensus splice sites predicted for eukaryotic genes [10]. Based on a sequence identity matrix of the deduced amino acid sequences, we divided the obtained sequences into five groups, naming them macrocypin 1–5. Sequence identity at the deduced amino acid sequence level between the five macrocypin groups was 75–86%, while sequences within groups exhibited more than 90% sequence identity.
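The grouping step described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the section does not say how the identity matrix was turned into groups, so single-linkage grouping with a 90% cut-off is assumed here. The function names and the three 20-residue `toy` sequences are hypothetical and are not macrocypin sequences.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.
    Columns with a gap ('-') in either sequence are ignored."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def group_by_identity(seqs: dict, threshold: float = 90.0) -> list:
    """Single-linkage grouping (an assumption): two sequences share a
    group if they are linked by pairs with identity above `threshold`."""
    parent = {name: name for name in seqs}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Toy aligned sequences, for illustration only.
toy = {
    "s1": "ACDEFGHIKLMNPQRSTVWY",
    "s2": "ACDEFGHIKLMNPQRSTVWF",  # 1 difference from s1 -> 95%
    "s3": "ACDEFGAAKLMNAQRSAVWY",  # 4 differences from s1 -> 80%
}
print(group_by_identity(toy))  # [['s1', 's2'], ['s3']]
```

In this toy case the between-group identities (75% and 80%) fall below the cut-off and the within-group identity (95%) exceeds it, mirroring the pattern reported for the five macrocypin groups.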
Fig. 2. Diversity in the cysteine protease inhibitor macrocypin family. Amino acid sequences deduced from macrocypin (Mcp) genes (prefaced by g) and from cDNA sequences (prefaced by c) belonging to different groups (macrocypins 1–5) are aligned with the N-terminal sequence determined for the natural macrocypin (nMcp) isolated from basidiocarps of Macrolepiota procera. Identical residues in at least 9 of 11 sequences (80%) are highlighted in dark gray and similar residues are highlighted in light gray. Sequences corresponding to clones used for the heterologous expression of macrocypins are indicated in bold. Residues subjected to positive evolution, as determined using the Datamonkey rapid detection of positive selection web server [20], are marked with an asterisk.
The length of the macrocypin 1 (Mcp1) deduced amino acid sequence was 169 residues, with a molecular mass of 19 193 Da, while the length of the representatives of the other four groups was 167 residues, with molecular masses between 18 770 and 19 031 Da. Single cysteines were present in Mcp1, macrocypin 4 (Mcp4) and macrocypin 5 (Mcp5) (C106), none in macrocypin 2 (Mcp2) and none or one (C75) in macrocypin 3 (Mcp3). All the sequences showed high contents of proline, of around 9.2% (the average overall protein content is 4.7%), of tryptophan, of 3.6% (average 1.1%) and of tyrosine, of 6.0–6.6% (average 2.9%), and low leucine contents of around 3.6% (average 9.7%) [12]. With the exception of Mcp3, macrocypin sequences also showed a high glycine content of around 9.8% (average 6.9%).
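Composition figures of this kind can be reproduced from any deduced sequence with a few lines of code. The sketch below is illustrative only: `composition_percent` and `fold_vs_average` are hypothetical helper names, and the input is the 14-residue N-terminal fragment quoted earlier in the text, used here simply as a short test string rather than as a full macrocypin sequence. The fold values use only the percentages reported in the paragraph above.

```python
from collections import Counter


def composition_percent(seq: str) -> dict:
    """Percentage of each residue (one-letter code) in `seq`."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def fold_vs_average(observed_pct: float, average_pct: float) -> float:
    """Ratio of an observed residue content to the average protein content."""
    return observed_pct / average_pct


# Short fragment from the text, used only to exercise the function.
fragment = "GLEDGLYTIRHLVE"
comp = composition_percent(fragment)
print(round(comp["L"], 1))  # 3 of 14 residues are leucine

# Enrichment or depletion relative to the average contents quoted above.
print(round(fold_vs_average(9.2, 4.7), 2))  # proline
print(round(fold_vs_average(3.6, 1.1), 2))  # tryptophan
print(round(fold_vs_average(3.6, 9.7), 2))  # leucine
```

On the reported values, proline is roughly twofold and tryptophan roughly threefold enriched, while leucine is present at well under half the average content.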
Variability in the sequences of macrocypin genes
The diversity observed for the macrocypin coding sequences was greater than that for clitocypins [7]. The coding sequence, composed of four exons, was 507 bp for Mcp1 and 501 bp for the other macrocypins. The deduced amino acid sequence for Mcp1 was thus two amino acids longer, because of an insertion near the N terminus. The lengths of the four exons were 147 or 153, 212, 70 and 75 bp, and of the three introns were 56, 49 or 51, and 50 or 54 bp. The sequence diversity was equally distributed in introns and in exons (Fig. S3). The second intron showed the greatest diversity in sequence and was 2 bp shorter in Mcp3 and Mcp5. The third intron was the most highly conserved of the three, with the exception of Mcp3, where it was 4 bp shorter. Diversity in the coding sequence was distributed throughout the sequence (Fig. 2) and this was caused by one, two or three nucleotide substitutions that resulted in 47 variable codons at the level of the deduced amino acid sequences. Some of these codons subjected to positive selection (codons 18, 85, 89, 105, 107 and 116) were identified at the P < 0.05 level (Fig. 2). Codons under negative selection (31 codons identified) appeared to be evenly distributed along the whole mcp genes without particular clustering. The few positively selected sites in the mcp genes may provide insights into the physiological role of macrocypins, as these sites are more likely to be involved in interactions with other proteins.
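The exon lengths and coding-sequence lengths quoted above can be cross-checked arithmetically. The sketch below rests on two assumptions that the text does not state explicitly: that the 153-bp first exon belongs to Mcp1 (the group with the N-terminal insertion), and that the reported length of the last exon includes the 3-bp stop codon. The constant and function names are hypothetical.

```python
# Exon lengths in bp as reported in the text. Assigning the 153-bp first
# exon to Mcp1 is an assumption based on its N-terminal insertion.
EXONS_MCP1 = (153, 212, 70, 75)
EXONS_OTHER = (147, 212, 70, 75)
STOP_CODON_BP = 3


def coding_bp(exons, stop_included=True):
    """Length of the protein-coding sequence, excluding the stop codon
    when the exon lengths are assumed to include it."""
    return sum(exons) - (STOP_CODON_BP if stop_included else 0)


def residues(exons):
    """Number of encoded amino acid residues."""
    return coding_bp(exons) // 3


print(sum(EXONS_MCP1), coding_bp(EXONS_MCP1), residues(EXONS_MCP1))
print(sum(EXONS_OTHER), coding_bp(EXONS_OTHER), residues(EXONS_OTHER))
```

Under these assumptions the exons sum to 510 and 504 bp, which after removing the stop codon gives 507 and 501 bp and 169 and 167 residues, in agreement with the values reported for Mcp1 and for the other macrocypins.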
Similarity searches for the macrocypin sequences against NCBI databases using blastn [13] revealed no significant similarities (cut-off e < 0.1). tblastn searches, by contrast, found significant sequence similarity to the clitocypin cysteine protease inhibitor annotated in the Laccaria bicolor S238N-H82 genome [14]. There was 39–44% sequence similarity and 26–29% sequence identity for different macrocypin sequences, the highest being with that of Mcp5. No significant similarities were found for the macrocypin sequences in the completed and unfinished archaeal or bacterial genomes or in other eukaryotic genomes, which is probably the result of low overall sequence similarity between mycocypins.
Fig. 3. Alignment of mycocypin deduced amino acid sequences. Deduced amino acid sequences of macrocypins belonging to each of the five groups are aligned with three representative clitocypin deduced amino acid sequences (GenBank accession numbers: gClt-Kras, AAZ78483.1; cClt-Kras, AAZ78481.1; and cClt-Vrh, AAZ78482.1). Sequences are prefaced by g or c to indicate whether they are deduced from genomic or cDNA sequences. Identical residues in at least seven of the eight sequences (90%) are highlighted in dark grey and similar residues are highlighted in light grey.
Alignment of macrocypin deduced amino acid sequences with clitocypin sequences showed 17–21% sequence identity (Fig. 3). The N-terminal halves of the sequences showed more similarity. The higher molecular masses relative to the clitocypins are caused by a few insertions and deletions of two to seven amino acids, distributed along the sequence. Another important difference between the macrocypin and clitocypin sequences was the presence of a cysteine residue in most macrocypins, and of several histidine and methionine residues in all the macrocypin sequences, all of which are absent in clitocypin. High proline and glycine contents were, by contrast, common to both macrocypins and clitocypin.
Heterologous expression of macrocypins
The cDNA clones that were available for Mcp1, Mcp3 and Mcp4 (marked in bold in Fig. 2) were cloned into expression vectors of the pET System (Novagen, Madison, WI, USA) for heterologous expression in Escherichia coli (Fig. S4). Macrocypin 1a (Mcp1a) and macrocypin 3a (Mcp3a) cDNA clones were each cloned into two expression vectors (pET3a and pET11a) to assess their expression in the bacterial expression system. Heterologous expression of recombinant Mcp1 (rMcp1) was higher with the pET11a expression vector construct, while that of recombinant Mcp3 (rMcp3) was higher using pET3a. The amounts of both recombinant proteins were highest 6 h after induction (data not shown). The macrocypin 4a (Mcp4a) cDNA clone was introduced into the pET14b expression vector and expressed in two strains of E. coli. The expression of recombinant Mcp4 (rMcp4) was highest when using the pET14b::Mcp4a construct in combination with the E. coli BL21(DE3) pLysS strain grown for 8 h after induction. Recombinant macrocypins rMcp1 and rMcp3 were expressed mainly as insoluble inclusion bodies, and rMcp4 was expressed as an equal distribution of protein between the soluble form and inclusion bodies. All three rMcps were purified from the inclusion bodies, which were almost completely solubilized in 3 M urea. One-step purification using size-exclusion chromatography yielded purified recombinant macrocypins rMcp1 and rMcp4, while rMcp3 still contained some impurities, which were taken into consideration when calculating the concentration.
Characterization of macrocypins
Recombinant macrocypins rMcp1, rMcp3 and rMcp4 were resolved on SDS–PAGE under reducing conditions as single 19 kDa bands (Fig. 4A). Under nonreducing conditions, however, they all showed an additional band at 38 kDa, corresponding to a dimer, probably formed between the single cysteines present in each protein. The calculated isoelectric point of 5.1 for Mcp4 was confirmed by IEF and was similar to the isoelectric point of the natural macrocypin isolated from basidiocarps of M. procera. For Mcp1 and Mcp3, the calculated isoelectric points were 4.8, which was also confirmed by IEF (Fig. 4B).
N-terminal sequences were confirmed for all three recombinant macrocypins. That for rMcp1 [NH2-(M)GFEDG] revealed N-terminal cleavage of methionine in approximately one-third of the molecules. N-terminal sequences of rMcp3 and rMcp4 (NH2-ALEDG) showed complete cleavage of N-terminal methionine. N-terminal cleavage of methionine in E. coli is strongly influenced by the amino acid residues at the P1′, P2′ and P3′ positions (the P1 position being the first methionine) [15]. In the case of macrocypins, the NH2-MALE sequence favours N-terminal methionine cleavage in E. coli, and the NH2-MGFE sequence of rMcp1 favours only partial cleavage.
The far-UV CD spectra of rMcp1 and rMcp4 confirmed the expectation from the sequences that the conformations of the inhibitors are very similar (Fig. S5A). The marked tryptophan bands, seen also in clitocypin [9] and ascribed to interaction of buried tryptophan residues, prevent analysis of secondary structure, but underline the similarity in tertiary structure, at least in this region, as well as of secondary structure.
Fig. 4. Comparison of natural and recombinant macrocypins by SDS–PAGE (A) and IEF (B). Purified natural (nMcp) and recombinant macrocypins (rMcp) were subjected to SDS–PAGE analysis under reducing denaturing conditions and to IEF. Lane M, protein molecular mass markers; lane S, standard protein IEF markers; lane 1, natural macrocypin (nMcp); lane 2, recombinant macrocypin 4 (rMcp4); lane 3, recombinant macrocypin 1 (rMcp1); lane 4, recombinant macrocypin 3 (rMcp3).
Clitocypin has been proven to be a very stable protein [5,9]. The temperature and pH stability of recombinant macrocypins were determined by following their inhibitory activity (measured after return to native conditions) after heating and after incubation at extremes of pH. The macrocypins rMcp1 and rMcp3 retained their inhibitory activity after heating at 75 °C, or even at 100 °C, for 15 min, whereas rMcp4 partially lost its inhibitory activity after heating at 75 °C for 15 min and completely lost its inhibitory activity after heating at 100 °C. Similarly, rMcp1 and rMcp3 retained their inhibitory activities after incubation in acidic (pH 2) or alkaline (pH 11) conditions. While rMcp4 lost its inhibitory activity after incubation at pH 2, incubation at pH 11 had no influence.
Stability was also examined by following conformation directly by CD. rMcp1 and rMcp4 were unfolded on thermal denaturation, with closely similar transitions, with temperature midpoints of 78 °C (Fig. S5B). The similarity of this value to that for clitocypin shows that the differences in sequence between these three inhibitors are not critical for the stability of the proteins. Combined, the above results show that the inhibitors unfold reversibly. Measurements of CD spectra at pH 2.2 and pH 11 showed only a slight decrease in ellipticity after 24 h of exposure.
Specificity of macrocypins
The inhibitory specificities of the natural macrocypin isolated from basidiocarps and the three recombinant macrocypins were determined against several cysteine proteases. Association (ka) and dissociation (kd) rate constants and equilibrium constants (Ki) were determined by continuous assays for papain, cathepsins L and V, and legumain (Table 1), for which typical biphasic progress curves were obtained. For inhibition of cathepsins S, K, B and H and trypsin by macrocypins, only equilibrium constants were determined (Table 2). Macrocypins 1 and 3 were found to be effective inhibitors of papain, cathepsin L and cathepsin V, with a mean Ki value of 0.5 nM. Macrocypin 4 was also an effective inhibitor of papain but exhibited weaker inhibition of cathepsin L and V than the other two recombinant macrocypins. The sample of natural macrocypin showed weaker inhibition for these
Table 1. Inhibition of cysteine proteases by natural and recombinant macrocypins. Kinetic and equilibrium constants for the inhibition of papain, cathepsins L and V, and legumain were determined under pseudo-first-order conditions in continuous kinetic assays at 25 °C and calculated by nonlinear regression analysis according to Morrison [32]. Standard deviation is given where appropriate; ND, not determined; nMcp, natural macrocypin; rMcp1, recombinant macrocypin 1; rMcp3, recombinant macrocypin 3; rMcp4, recombinant macrocypin 4.
Enzyme         10⁻⁶ × ka (M⁻¹·s⁻¹)    10⁴ × kd (s⁻¹)    Ki (nM)
nMcp
Cathepsin L    0.33 ± 0.11            1.25 ± 0.26       3.81 ± 1.66
Cathepsin V    0.08 ± 0.01            9.85 ± 1.54       12.6 ± 3.8
rMcp1
Cathepsin L    5.52 ± 0.51            35.1 ± 3.2        0.64 ± 0.22
Cathepsin V    1.48 ± 0.01            10.3 ± 0.7        0.69 ± 0.06
rMcp3
Cathepsin L    3.58 ± 0.45            11.1 ± 0.5        0.31 ± 0.06
Cathepsin V    1.88 ± 0.09            8.43 ± 0.76       0.45 ± 0.01
Legumain       0.063 ± 0.020          5.77 ± 1.27       9.17 ± 1.09
rMcp4
Cathepsin L    1.62 ± 0.55            45.1 ± 5.1        2.76 ± 0.92
Cathepsin V    2.29 ± 0.66            33.3 ± 6.3        1.44 ± 0.11
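For a simple one-step reversible interaction, the equilibrium constant is related to the rate constants by Ki = kd/ka. The values in Table 1 were obtained by nonlinear regression according to Morrison [32], so this relation is offered only as an approximate cross-check, not as the method used. The sketch below also assumes that the column headers are read as 10⁻⁶ × ka (M⁻¹·s⁻¹) and 10⁴ × kd (s⁻¹); the function name `ki_nM` is hypothetical.

```python
def ki_nM(ka_scaled: float, kd_scaled: float) -> float:
    """Ki in nM from Table 1 entries, assuming Ki = kd / ka.

    ka_scaled: tabulated 10^-6 x ka value (ka in M^-1 s^-1)
    kd_scaled: tabulated 10^4 x kd value (kd in s^-1)
    """
    ka = ka_scaled * 1e6   # M^-1 s^-1
    kd = kd_scaled * 1e-4  # s^-1
    return kd / ka * 1e9   # convert M to nM


# rMcp1 with cathepsin L: tabulated Ki is 0.64 nM
print(round(ki_nM(5.52, 35.1), 2))
# rMcp3 with legumain: tabulated Ki is 9.17 nM
print(round(ki_nM(0.063, 5.77), 2))
```

For these two rows the ratio agrees with the tabulated Ki within rounding, which supports the reading of the column scaling assumed here.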
Table 2. Inhibition of various proteases by natural and recombinant mycocypins. Equilibrium constants for the inhibition of different proteases were determined in continuous or stopped kinetic assays and analyzed according to Morrison [32] or Henderson [33], respectively. Kinetic data for the interaction of clitocypin with papain, cathepsins L, K and H, and legumain were reported previously [6]. Standard deviation is given where appropriate. nMcp, natural macrocypin; rMcp1, recombinant macrocypin 1; rMcp3, recombinant macrocypin 3; rMcp4, recombinant macrocypin 4; nClt, natural clitocypin; rClt, recombinant clitocypin; n.i., no inhibition.
Enzyme         Ki (nM)
enzymes than the recombinant macrocypins. Inhibition of cathepsins S and K by recombinant macrocypins was somewhat weaker, with Ki values in the range of 5–25 nM, and 10 times higher for inhibition of cathepsin K by rMcp1. Cathepsin B, which has both endopeptidase and exopeptidase activities, was inhibited by macrocypins with Ki values in the micromolar range, with the exception of rMcp3 (Ki > 1 µM). By contrast, cathepsin H was inhibited by natural macrocypin and rMcp1, with Ki values in the micromolar range, while rMcp3 and rMcp4 inhibited with 10-fold lower values. Like clitocypin, recombinant macrocypins inhibited the cysteine protease legumain, a member of the C13 family, with an average Ki value of 6 nM. The natural macrocypin exhibited only very weak inhibition of legumain (Ki in the micromolar range), while rMcp4 showed no inhibition at all. The serine protease trypsin was not inhibited by natural macrocypin, rMcp1 or rMcp3, while rMcp4 exhibited weak inhibition with a Ki value in the micromolar range. The aspartic protease pepsin was not inhibited by natural or recombinant macrocypins.
The ability of natural and recombinant clitocypins [6] to inhibit cathepsins V, S and B was also determined for comparison (Table 2). Clitocypin is an effective inhibitor of cathepsins V and S. The weak inhibition of cathepsin B reported for the natural clitocypin, with a Ki value of 0.48 µM [5], was not confirmed with either recombinant or natural clitocypin, for which Ki values were > 1 µM.
Discussion
A novel cysteine protease inhibitor, macrocypin, has been isolated from basidiocarps of the parasol mushroom (M. procera), in addition to a putative clitocypin-like cysteine protease inhibitor. Based on similarities in genetic and biochemical characteristics, macrocypin was assigned to the mycocypin family of fungal cysteine protease inhibitors, family I48 in the MEROPS database, together with clitocypin from clouded agaric (C. nebularis) [6,7].
Based on the partial protein sequence, several macrocypin-coding genes and corresponding cDNAs were amplified and sequenced, together with the promoter and 5′ UTR and 3′ UTR sequences (Fig. S3). The diversity observed was even greater than that observed for clitocypin-coding genes [7], the macrocypin sequences being divided into five groups. The variability was distributed throughout the coding sequence and comprised one to five consecutive amino acid substitutions (Fig. 2). The variability was not limited to the coding sequence, as two different promoter sequences were cloned. A few 3′ UTR sequences were found that showed no sequence diversity, but differed in their length. Different promoter and 5′ UTR and 3′ UTR sequences, and their lengths, together with some differences found in intron sequences and their lengths, which could influence transcription, splicing, mRNA transport, stability and translation, all suggest complex regulation of macrocypin expression at different levels.
Macrocypin and clitocypin deduced amino acid sequences show similarities, despite a low overall sequence identity of 17–21% (Fig. 3). They both have high contents of proline and tyrosine and low contents of leucine. A major difference between clitocypin and macrocypin sequences lies in the presence of sulfur-containing amino acids in macrocypins, but not in clitocypins. In spite of very low overall sequence similarity, several amino acid residues are conserved in all macrocypin and clitocypin sequences, mainly in the N-terminal half (Fig. 3). These are probably important for the inhibitory activity and/or structure, and many of them are proline residues.
A similarity search using macrocypin deduced amino acid sequences against the translated nucleotide database at the NCBI server showed significant similarity to the putative L. bicolor clitocypin-like cysteine protease inhibitor. Alignment of the deduced sequences of the latter with those of macrocypin from M. procera and with clitocypin from C. nebularis revealed conserved amino acid residues (Fig. S6), the majority of which were already present in the alignment of macrocypin and clitocypin deduced amino acid sequences, confirming their functional and/or structural importance.
Macrocypins and clitocypins exhibit similar basic biochemical characteristics. They have similar molecular masses of 19 and 16.8 kDa and similar isoelectric points of around 4.8. They both exhibit stability against high temperature and extremes of pH. The CD spectrum in the far-UV region for macrocypin, showing the characteristic peak around 232 nm that indicates an unusually strong contribution of tryptophan residues (Fig. S5A), indicates further structural similarity to clitocypin [6,9].
The high degree of diversity in macrocypin gene sequences indicates a mixture of inhibitors in the natural macrocypin sample isolated from basidiocarps of M. procera. Therefore, to characterize the inhibitors further, recombinant macrocypins were prepared. Three different macrocypin cDNA clones were used that belong to different macrocypin groups or isoforms. Heterologous expression in the bacterial expression system proved successful for all three, inhibitory macrocypins (rMcp1, rMcp3, rMcp4) being obtained by one-step purification from inclusion bodies with high yields of 50–100 mg·L⁻¹.
The inhibitory profiles of macrocypins and clitocypin for several cysteine proteases are similar, but not identical. Of the cysteine proteases tested, macrocypins inhibit papain and cathepsins L and V most strongly. Compared with clitocypin [6], macrocypins are stronger inhibitors of papain, as a result of higher rate constants of association, but weaker inhibitors of cathepsin L, mainly because of increased rate constants of dissociation. Cathepsins S and K are inhibited by macrocypins with Ki values in the nanomolar range, while clitocypin inhibits cathepsin K with Ki values in the picomolar range. The most notable difference between macrocypin and clitocypin inhibition profiles for cysteine proteases is the weak inhibition by macrocypins of cathepsins B and H, the papain-like proteases with endopeptidase and exopeptidase activities. As with clitocypin, legumain, a member of the C13 family, is inhibited by Mcp1 and Mcp3 with Ki values in the nanomolar range. The inhibition profile of rMcp4, with Ki > 1 µM for legumain inhibition and Ki values in the nanomolar range for papain-like proteases, contrasts with those of other macrocypins that inhibit both papain-like proteases and legumain with Ki values in the same range. This strongly suggests different binding sites for the inhibition of the two families of cysteine proteases. Similarly, two independent and nonoverlapping binding sites have been reported for family C1 and C13 inhibition by legumain-inhibiting cystatins C, E/M and F [16].
The ability of macrocypins to inhibit proteases of other catalytic classes was tested. Neither macrocypins nor clitocypin [5] inhibit the aspartic protease pepsin. The serine protease trypsin is weakly inhibited by rMcp4, which is the most significant difference in the inhibitory profiles of macrocypins and clitocypin.
The physiological function of the macrocypin cysteine protease inhibitor family is proposed to be defence against pathogen infection and/or predation by insects or other pests, analogously to the phytocystatins that are involved in plant defence by inhibiting exogenous cysteine proteases during herbivory or infection [17]. The sequence diversity includes amino acid sites of positive selection. The variations in inhibitory profile between different members of the macrocypin family reveal different specificities and strengths of inhibition of cysteine proteases of different evolutionary families, and even a serine protease. These findings together suggest an adaptation process and the selection of appropriate inhibitor isoforms providing effective defence. In addition, a regulatory role in intracellular proteolysis may also be considered for mycocypins, because cysteine protease activity is present in basidiomycetes and its inhibition by clitocypin was shown in a few selected basidiomycete species belonging to different orders [8].
In conclusion, we have characterized a novel family of fungal cysteine protease inhibitors, the macrocypins, from M. procera, at the genetic and biochemical levels and analyzed their inhibition profiles. Similarity to clitocypin from C. nebularis is evident at all these levels, suggesting that they belong to the same family of cysteine protease inhibitors, the mycocypins, thus substantiating the establishment of the I48 family of protease inhibitors in the MEROPS classification that previously comprised only one member. In addition to the high conformational stability of mycocypins, their other common characteristic is high genetic diversity, with sequence variability influencing the inhibitory activity in macrocypins, but not in clitocypins. Mycocypins could find use in medical research, as their unique inhibitory profiles could answer the challenge of finding highly selective inhibitors against proteases important in certain stages of diseases, without affecting nontarget proteases. Additionally, certain representatives of the macrocypin family, showing inhibition of different classes of proteases, would have applications in plant protection. Double-inhibitory activity against two catalytic classes of proteases in one stable molecule could provide more effective protection of plants against insect pests.
Experimental procedures
Isolation of cysteine protease inhibitors from M. procera
Basidiocarps of the basidiomycete M. procera were
collected from their natural habitat and frozen at −20 °C.
Thawed basidiocarps were homogenized in an equal volume
of 0.1 M Tris–HCl buffer, pH 7.5, containing 0.5 M NaCl,
and centrifuged at 8000 g for 30 min. Ammonium sulfate
was added to the supernatant to a final concentration of
1.3 M before application to a column of Phenyl Sepharose
(GE Healthcare, Uppsala, Sweden). After thorough washing
with 0.1 M Tris–HCl buffer, pH 7.5, containing 1.3 M
ammonium sulfate, the bound proteins were eluted by a
1.3–0 M gradient of ammonium sulfate in the same buffer.
Inhibitory fractions, measured against papain, were
pooled, concentrated by ultrafiltration (Amicon UM-10;
Millipore, Vienna, Austria) and dialyzed against 0.02 M
Tris–HCl, pH 7.5. The sample was then applied to a column
of DEAE–Sephacel (Pharmacia-LKB, Uppsala, Sweden)
equilibrated with 0.02 M Tris–HCl, pH 7.5. Bound proteins
were eluted with a gradient of 0–0.4 M NaCl in the same
buffer. Inhibitory fractions were pooled and subjected to
an affinity column of carboxymethylpapain–Sepharose,
prepared as described previously [5], and equilibrated with
0.02 M Tris–HCl, pH 7.5, containing 0.3 M NaCl. Bound
inhibitory fractions were eluted with 0.02 M NaOH,
neutralized with dilute HCl, and pooled and concentrated by
ultrafiltration (Amicon UM-3).
N-terminal sequence analysis
Automated amino acid sequencing of purified natural and
recombinant macrocypins was performed as described
previously [6]. An internal peptide sequence was determined
after cleavage with cyanogen bromide in 80% formic acid
at room temperature in the dark for 36 h. The resulting
peptide fragments were separated using reverse-phase
HPLC, as described previously [6].
Isolation of genomic DNA and total RNA
Basidiocarps of the basidiomycete M. procera, harvested
from their natural habitat, were frozen in liquid nitrogen,
homogenized and stored at −80 °C until use.
High-molecular-weight genomic DNA was isolated from frozen
powdered tissue as described previously [18] and total
RNA was extracted using an RNeasy Kit (Qiagen,
Vienna, Austria) according to the manufacturer’s protocol
for isolation of total RNA from plant tissues and
filamentous fungi.
Cloning of the genomic and cDNA sequences
encoding macrocypins
First-strand cDNA was synthesized from the total RNA by
RT-PCR using a GeneAmp RNA PCR Core Kit (Applied
Biosystems, Foster City, CA, USA) with an anchored
oligo(dT)-adapter primer (dT(17)3′RACE) (Table S1). Forward
and reverse degenerate primers (forward 1N-mar-Clit,
nested forward 2N-mar-Clit and reverse C-mar-Clit) were
designed based on the N-terminal amino acid sequence and
an internal peptide fragment sequence. First-strand cDNA
was used for PCR to amplify the partial macrocypin cDNA
sequence, which was then used to design specific primers.
The forward specific primer (Mp-CliHom-N-uni) was used to
amplify the 3′ end of the cDNA sequence, using the 3′
RACE method, together with the 3′RACE adapter primer.
The resulting PCR product was used in a secondary PCR
with two different nested forward primers
(Mp-CliHom-N-A12 and Mp-CliHom-N-A3), together with the 3′RACE
adapter primer.
To amplify the complete macrocypin gene (mcp) with its
upstream and downstream regions, Genome Walker
libraries were constructed using the Genome Walker
Universal kit (BD Biosciences Clontech) according to the
manufacturer’s instructions. High-molecular-weight genomic
DNA (2.5 µg) was digested separately with two restriction
enzymes (PvuII and StuI) at 37 °C overnight and, after
purification by ethanol precipitation, Genome Walker
Adaptors were ligated to the digested DNA at 16 °C
overnight. The resulting Genome Walker libraries were used as
templates in genome walking PCR amplifications, using
nested forward specific primers (Mp-CliHom-ter1A and 1)
for downstream amplification and nested reverse specific
primers (Mp-CliHom-pro 1 and 2) for upstream
amplification, paired with Adaptor Primer 1 and Nested Adaptor
Primer 2 provided by the manufacturer. Advantage 2
Polymerase Mix (Clontech) was used for amplification under
the conditions suggested by the manufacturer.
Complete mcp gene and cDNA sequences were obtained
using pairs of primers annealing to the 5′ UTR
(Mcp-N-A12-1 and Mcp-N-A3-1 with nested primers Mcp-N-uni-1
or Mp-CliHom-N-uni) and the 3′ UTR (Mcp-C-uni-1 with
nested primers Mp-CliHom-C-uni, Mp-CliHom-C-A12 or
Mp-CliHom-C-A3) regions in two-step PCR amplification,
using nested primers in the secondary PCR with
recombinant Taq polymerase (MBI Fermentas, Vilnius, Lithuania).
All PCR products were cloned into the pGEM-T Easy
Vector System (Promega, Vienna, Austria) for sequencing
by the Automated DNA Sequencing Service at MWG
Biotech (Ebersberg, Germany).
Sequence analyses
Sequence analysis and multiple sequence alignments were
performed in the BioEdit Sequence Alignment Editor
(http://www.mbio.ncsu.edu/bioedit/bioedit.html). Promoter
analysis was performed using the Transcription Element
Search System (TESS, [19]; http://www.cbil.upenn.edu/cgi-bin/tess/tess).
Similarity searches were performed using the blastn and
tblastn algorithms [13] against different databases at the
NCBI (http://www.ncbi.nlm.nih.gov/BLAST).
Evolutionary analysis
The Datamonkey web interface [20] (http://www.datamonkey.org)
was used to examine selective pressure acting upon
individual sites of codon alignments. Three methods were
implemented to test for purifying or diversifying selection:
single likelihood ancestor counting (SLAC), fixed effects
likelihood (FEL) and random effects likelihood (REL) [21].
A P-value of 0.05 was used for inference of positive and
negative selection for individual codons. A hierarchical and
information theoretic model selection procedure was applied
to choose a model of nucleotide substitution. HKY85 was
selected as the optimal time-reversible nucleotide
substitution model using the implementation in the HyPhy
package [22].
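The per-codon decision rule described for the selection analysis can be illustrated with a short sketch. It assumes that a method reports, for every codon, a nonsynonymous rate (dN), a synonymous rate (dS) and a P-value. The field names, the classify_site helper and the example numbers are hypothetical and do not reproduce Datamonkey output or data from this study.

```python
from dataclasses import dataclass

ALPHA = 0.05  # significance threshold stated in the text


@dataclass
class SiteResult:
    """Per-codon summary; the field names are illustrative assumptions."""
    codon: int
    dn: float       # estimated nonsynonymous substitution rate
    ds: float       # estimated synonymous substitution rate
    p_value: float  # P-value of the test for dN differing from dS


def classify_site(site: SiteResult, alpha: float = ALPHA) -> str:
    """Label a codon as 'positive', 'negative' or 'neutral'."""
    if site.p_value >= alpha:
        return "neutral"   # no significant departure from dN == dS
    if site.dn > site.ds:
        return "positive"  # diversifying selection
    if site.dn < site.ds:
        return "negative"  # purifying selection
    return "neutral"


# Made-up values for illustration only (not results from the study).
example_sites = [
    SiteResult(codon=12, dn=2.4, ds=0.3, p_value=0.01),
    SiteResult(codon=47, dn=0.1, ds=1.8, p_value=0.03),
    SiteResult(codon=90, dn=1.1, ds=0.9, p_value=0.40),
]
labels = {s.codon: classify_site(s) for s in example_sites}
```

In practice, a site would typically be reported only if the methods being compared agree on its label; that consensus step is a common convention and is an assumption here, as the text does not state it.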
Expression and purification of recombinant
macrocypins
Whole-length cDNA clones of Mcp1a and Mcp3a were
used as templates in the PCR amplification with Pfu DNA
polymerase (Promega) and primers that introduced NdeI
and BamHI restriction sites to the 5′ and 3′ ends of the
insert, respectively. After resequencing the resulting
product, and digesting the inserts and vectors with NdeI/BamHI
(New England Biolabs, Frankfurt am Main, Germany), the
inserts were subcloned into pET3a and pET11a vectors
(Novagen) to generate recombinant proteins without tags.
Similarly, NcoI and NdeI restriction sites were introduced
into the 5′ and 3′ ends of the whole-length cDNA of
Mcp4a. After digestion of the Mcp4a insert and vector with
NcoI/NdeI (New England Biolabs), the insert was
subcloned into the pET14b expression vector (Novagen) to
generate protein without tags. All constructed expression
vectors with inserts were transformed into E. coli
BL21(DE3), which was grown in LB (Luria–Bertani)
medium supplemented with 100 µg·mL⁻¹ of ampicillin at 37 °C.
The construct pET14b::Mcp4a was also transformed into
the E. coli BL21(DE3) pLysS strain, which was grown at
37 °C in LB medium supplemented with 100 µg·mL⁻¹ of
ampicillin and 34 µg·mL⁻¹ of chloramphenicol. When the
attenuance (D) at 600 nm reached 1–1.2, the inducer
isopropyl thio-β-D-galactoside was added to a final
concentration of 0.4 mM for strains transformed with pET3a or
pET14b constructs and to a final concentration of 1 mM
for strains transformed with the pET11a construct. Cells
were then grown for an additional 6 or 8 h. They were
harvested by centrifugation for 15 min at 6000 g, resuspended
in buffer A (50 mM Tris–HCl, pH 7.5, 0.1% Triton X-100,
2 mM EDTA), frozen and thawed once, then sonicated at
4 °C. The insoluble fraction was separated by
centrifugation for 15 min at 10 000 g, redissolved in the same buffer
containing 3 M urea and solubilized by stirring for 4 h at
4 °C. The remaining insoluble material was removed by
centrifugation and the supernatant was applied to a
Sepharose S200 column (4 × 110 cm) equilibrated with Tris–HCl
buffer, pH 7.5, containing 0.3 M NaCl. Inhibitory fractions
were pooled and concentrated by ultrafiltration (Amicon
UM-3) to approximately 1 mg·mL⁻¹.
SDS–PAGE and IEF
Proteins were separated by 12.5% SDS–PAGE under
denaturing and reducing or non-reducing conditions, as
appropriate, and stained using Coomassie Brilliant Blue. The
molecular mass values of the separated proteins were
estimated using low-molecular-mass standard proteins of
14.4–97 kDa (LMW Calibration Kit; GE Healthcare). The
Phast System (Pharmacia) was used to perform SDS–PAGE
(precast 8–25% gradient gels) and IEF (precast pH
3–9 gradient gels), as described previously [6]. Theoretical
molecular mass and pI values were determined from
sequences using the ProtParam tool available at the ExPASy
server of the Swiss Institute of Bioinformatics
(http://www.expasy.org/tools/protparam.html).
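As a rough illustration of how a theoretical molecular mass is derived from a sequence, the sketch below sums average residue masses and adds one water molecule for the free termini. This describes the general approach and is not claimed to be the ProtParam implementation. The function name and the dipeptide used in the usage line are hypothetical, and pI estimation is not covered.

```python
# Average residue masses in Da (free amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N- and C-terminal H and OH


def theoretical_mass(sequence: str) -> float:
    """Return the average molecular mass (Da) of an unmodified peptide."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = sorted(set(seq) - AVERAGE_RESIDUE_MASS.keys())
    if unknown:
        raise ValueError(f"unsupported residue codes: {unknown}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


# Hypothetical usage with the dipeptide Gly-Ala.
mass_ga = theoretical_mass("GA")
```

Post-translational modifications and disulfide bonds are ignored in this sketch, so values for natural proteins may differ from measured masses.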
Inhibition assay
Inhibitory activities against papain were determined as
described previously [5] using Bz-DL-Arg-2-naphthylamide
(Sigma, Taufkirchen, Germany) as the substrate [23].
Enzymes and determination of kinetic constants
Human cathepsin H (EC 3.4.22.16) and cathepsin L (EC
3.4.22.15) were purified as described previously [24]. Papain
(2× crystallized) (EC 3.4.22.2) was further purified by
affinity chromatography [25]. Beta-trypsin (EC 3.4.21.4) was
prepared from type IX trypsin (Sigma), as described
previously [26]. Legumain (EC 3.4.22.34) was isolated from pig
kidney cortex following a previously described procedure
[27]. Recombinant human cathepsin K (EC 3.4.22.38) [28],
cathepsin S (EC 3.4.22.27) [29] and cathepsin B (EC
3.4.22.1) [30], all expressed in E. coli, were provided by
Prof. Boris Turk, and recombinant human cathepsin V (EC
3.4.22.43) [31] was provided by Prof. Dušan Turk (both
from the Department of Biochemistry and Molecular and
Structural Biology, Jožef Stefan Institute, Ljubljana,
Slovenia).
Inhibition kinetics for natural and recombinant
macrocypins and clitocypins were measured under
pseudo-first-order conditions, as previously described [6]. Continuous
kinetic assays were performed for the cysteine proteases
papain, cathepsin L and cathepsin V using
benzyloxycarbonyl (Z)-Phe-Arg-7-amido-4-methylcoumarin (AMC) as
the substrate, and for legumain with Z-Ala-Ala-Asn-AMC as
the substrate, while stopped assays were performed for
cathepsins K, S and B using Z-Phe-Arg-AMC as the
substrate and for cathepsin H with Arg-AMC as the substrate
[6]. Trypsin was assayed using the stopped kinetic assay
with the fluorogenic substrate Z-Phe-Arg-AMC in 0.1 M
Tris–HCl buffer, pH 8.0, containing 1.5 mM EDTA and 0.02 M
CaCl2. Data for continuous assays were analyzed by
nonlinear regression analysis according to Morrison [32], while
kinetic constants for cathepsins K, S, B and H, and trypsin,
were determined according to Henderson [33]. Porcine
pepsin (EC 3.4.23.1) from Sigma was assayed in 0.1 M acetate
buffer, pH 3.5, using the fluorogenic substrate fluorescein
isothiocarbamoyl–hemoglobin, as described for fluorescein
isothiocarbamoyl–casein [34].
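For readers unfamiliar with tight-binding analysis, the following sketch evaluates the quadratic rate equation commonly attributed to Morrison, v_i/v_0 = 1 − ([E] + [I] + Ki,app − sqrt(([E] + [I] + Ki,app)^2 − 4[E][I])) / (2[E]). The exact form fitted in this study is not given in the text, so the equation, the function name and the example concentrations are assumptions for illustration only.

```python
import math


def morrison_fractional_activity(e_total: float, i_total: float,
                                 ki_app: float) -> float:
    """Fractional residual activity v_i/v_0 for a tight-binding inhibitor.

    All three arguments must be in the same concentration units.
    """
    if e_total <= 0:
        raise ValueError("enzyme concentration must be positive")
    if i_total < 0 or ki_app < 0:
        raise ValueError("inhibitor concentration and Ki,app must be >= 0")
    total = e_total + i_total + ki_app
    # Concentration of enzyme-inhibitor complex from the quadratic solution.
    # The discriminant is never negative: it is at least (E - I)^2.
    complex_conc = (total - math.sqrt(total ** 2 - 4.0 * e_total * i_total)) / 2.0
    return 1.0 - complex_conc / e_total


# Hypothetical titration: 1 nM enzyme, Ki,app = 0.5 nM, inhibitor 0 to 8 nM.
inhibitor_series_nm = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
activities = [morrison_fractional_activity(1.0, i, 0.5)
              for i in inhibitor_series_nm]
```

In a real analysis, Ki,app would be estimated by nonlinear regression of measured rates against inhibitor concentration. For a competitive inhibitor it is commonly converted to Ki by dividing by (1 + [S]/Km); whether that correction applies here depends on the inhibition mechanism.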
CD spectroscopy
CD spectra measurements and thermal-unfolding transitions were performed on an Aviv model 60