freestanding enzymes belonging to the eucaryotic
nonribosomal peptide synthetase-like family
Leonardo Di Vincenzo1, Ingeborg Grgurina1 and Stefano Pascarella1,2
1 Dipartimento di Scienze Biochimiche ‘A. Rossi Fanelli’, Università di Roma ‘La Sapienza’, Roma, Italy
2 Centro Interdipartimentale di Ricerca per l’Analisi dei Modelli e dell’Informazione nei Sistemi Biomedici (CISB), Università di Roma ‘La Sapienza’, Roma, Italy
Nonribosomal peptide synthetases (NRPSs) are multi-domain, multifunctional enzymes involved in the biosynthesis of many bioactive microbial peptides [1,2]. This class of natural products includes a variety of compounds with interesting biological activities (phytotoxins, siderophores, biosurfactants, and antiviral agents), as well as several clinically valuable drugs [3,4]. NRPSs are organized in iterative modules, one for each amino acid to be built into the peptide product. The minimal module required for a single monomer addition consists of a condensation domain (C), an adenylation domain (A) and a peptidyl carrier protein (PCP), also denoted as thiolation (T) domain. The A domain is involved in the selection and activation of the amino acid substrate, which is then covalently attached to the enzyme via a thioester bond with the phosphopantetheine residue of the T domain. C domains are localized between every consecutive pair of A domains and PCPs and catalyze the formation of the peptide bond between the upstream aminoacyl or peptidyl moiety tethered to the phosphopantetheinyl group and the free amino group of the downstream aminoacyl moiety, thus facilitating the translocation of the growing chain onto the next module. The structural diversity of NRPS products is enriched through the occasional presence of epimerization (E),
Keywords
nonribosomal peptide synthetase; homology
modelling; docking; specificity-conferring
code; freestanding NRPSs
Correspondence
S. Pascarella, Dipartimento di Scienze Biochimiche ‘A. Rossi Fanelli’, Università di Roma ‘La Sapienza’, 00185 Roma, Italy
Fax: +39 06 49917566
Tel: +39 06 49917574
E-mail: Stefano.Pascarella@uniroma1.it
Website: http://w3.uniroma1.it/bio_chem/homein.html
(Received 8 August 2004, revised 30
November 2004, accepted 9 December
2004)
doi:10.1111/j.1742-4658.2004.04522.x
This work presents a computational analysis of the molecular characteristics shared by the adenylation domains from traditional nonribosomal peptide synthetases (NRPSs) and the group of the freestanding homologous enzymes: α-aminoadipate semialdehyde dehydrogenase, α-aminoadipate reductase and the protein Ebony. The results of systematic sequence comparisons allow us to conclude that a specificity-conferring code, similar to that described for the NRPSs, can be recognized in such enzymes. The structural and functional roles of the residues involved in substrate selection and binding are proposed through the analysis of the predicted interactions of the model active sites and their respective substrates. The indications deriving from this study can be useful for planning experiments aimed at a better characterization and at the engineering of this emerging group of single NRPS modules that are responsible for amino acid selection, activation and modification in the absence of other NRPS assembly line components.
Abbreviations
A domain, adenylation domain; AASDH, α-aminoadipate semialdehyde dehydrogenase; ACV synthetase, [L-δ-(α-aminoadipoyl)-L-cysteine-D-valine] synthetase; AS, putative amine-selecting domain; GrsA, gramicidin S synthetase A; HMM, hidden Markov model; NRPS, nonribosomal peptide synthetase; PQQ, pyrroloquinoline quinone; T domain, thiolation domain; RMSD, root mean square deviation.
cyclization (Cy), N-methylation (N-Met) and oxidation (Ox) domains [1].
Intense research work carried out in the last decade led to the characterization of a number of new gene clusters and to the discovery of nonclassical NRPS systems [2,5]. The crystallographic structures of three members of the adenylate-forming enzyme family, firefly luciferase of Photinus pyralis [6], the A domain of the gramicidin S synthetase A (GrsA) from Bacillus brevis [7] and, recently, DhbE (2,3-dihydroxybenzoate activating module) [8], have been solved. Likewise, the structure of VibH, representative of C domains, is now available [9]. The wealth of sequence and structure information pertaining to the A domains has been exploited to understand the molecular bases of their substrate specificity [10,11]. Systematic comparative analyses identified 10 sequence positions lining the active site pocket that are responsible for substrate recognition and selection. The nature of the residues at such positions was correlated with the known substrates and a specificity-conferring code, also with predictive potential, was proposed [10].
Recently, it was pointed out that modules composed of an adenylation and a thiolation domain, followed by a domain having a redox function and not inserted in the context of a typical NRPS cluster, can be found in eucaryotes [12]. Indeed, α-aminoadipate semialdehyde dehydrogenase (AASDH) and α-aminoadipate reductase (Lys2), enzymes involved in lysine metabolism in eucaryotes, display a 3-domain architecture where the two N-terminal domains are homologous to the A and T domains from NRPS systems and the C-terminal part contains a redox cofactor binding site for either pyrroloquinoline quinone (PQQ) or NADPH. In particular, AASDH, containing a PQQ binding domain, is supposed to be involved in lysine degradation and to convert α-aminoadipate semialdehyde to α-aminoadipate [12]. Lys2, possessing an NADPH-binding domain, is involved in lysine biosynthesis; it converts α-aminoadipate to α-aminoadipate semialdehyde [13,14]. Furthermore, the protein Ebony, an enzyme from Drosophila melanogaster involved in conjugation of β-alanine to histamine and sharing homology to NRPS domains A and T, was recently characterized [15].
The occurrence, in evolutionarily higher organisms, of gene arrangements typically encountered in the microbial world is intriguing. It appears worthwhile to investigate in greater depth the extent of similarity between the A domains of the freestanding aminoacyl adenylate-forming enzymes and those of the traditional NRPS systems. In particular, how many sequences of freestanding A domains are known and what are their evolutionary relationships to the NRPSs? Can the nonribosomal code of the traditional NRPS systems be applied to the freestanding A domains and, if so, what is the potential role of the residues involved? To address these issues, systematic sequence comparisons, homology modelling and docking simulations were employed to predict the structure of the active site of such enzymes and to propose functional roles for the conserved residues.
Results and Discussion
Databank searches and sequence comparison
The available sequences of the freestanding NRPS modules from eucaryotic organisms were collected by means of exhaustive databank searches. The PSI-BLAST [16] suite was applied over the NR and UniProt databanks. Query sequences were Ebony from Drosophila melanogaster, AASDH from Mus musculus and Lys2 from yeast. Each sequence is representative of a domain pattern: A-T-AS (AS stands for putative amine-selecting domain [15]), A-T-PQQ and A-T-NADPH, respectively. Only the A and T domains were included in the query sequence. Table 1 reports the homologous sequences collected by these databank searches, including 10 sequences from genes coding for putative A domains not yet annotated in the protein databanks, which were predicted through genome scans. Overall, 39 sequences were identified from different eucaryotic species and the domain assignments were confirmed by CDD [17] and Pfam [18] queries. The sequence subset formed by the A-T domains was aligned utilizing the HMMER package [19]. A set of 62 sequences, corresponding to the domain A of microbial NRPSs, was extracted from the seed alignment of the Pfam AMP-binding family (Pfam code: PF00501). The adjacent T domain was subsequently added to each sequence. The extended sequences were aligned with ClustalW [20] and the final alignment was manually refined to match functionally important residues such as the Asp235 that binds the α-amino group of the substrate [11] and Ser573, the site of the phosphopantetheine attachment. The resulting alignment was finally utilized to train the HMM. The resulting HMM was used to align a subset of the A-T domains listed in Table 1. The alignment was manually refined and used in turn to train the final HMM, now specific for the eucaryotic A-T domains, to carry out the alignment of all the 39 sequences (Fig. 1).
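Residues in such an alignment are referred to by the numbering of a reference sequence (GrsA in this work). The following Python sketch illustrates, on toy data only, how a reference residue number can be mapped to an alignment column and how the residues of every aligned sequence at chosen reference positions can then be read off. The function names, the toy sequences and the starting residue number are hypothetical and are not taken from the paper; real alignments would be parsed from the output of the alignment software.

```python
def column_for_reference_position(ref_aligned, ref_start, position):
    """Return the 0-based alignment column that holds residue number
    `position` of the reference sequence.

    ref_aligned: the reference row of the alignment, gaps written as '-'.
    ref_start:   residue number of the first non-gap character in that row.
    """
    current = ref_start - 1
    for col, char in enumerate(ref_aligned):
        if char != "-":
            current += 1
            if current == position:
                return col
    raise ValueError(f"position {position} is not covered by the reference row")


def residues_at(alignment, ref_name, ref_start, positions):
    """For every sequence, collect the characters found at the alignment
    columns equivalent to the given reference residue numbers."""
    ref = alignment[ref_name]
    cols = [column_for_reference_position(ref, ref_start, p) for p in positions]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences, reference numbered from residue 10).
toy = {
    "reference": "AC-DE-FG",
    "query_1":   "ACWD-YFG",
    "query_2":   "G--DEYFA",
}
codes = residues_at(toy, "reference", 10, [11, 12, 14])
print(codes)
```

A gap character in the returned string signals that the sequence has no residue equivalent to that reference position.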
On the basis of the structural equivalences contained in this multiple alignment, the occurrence of a specificity-conferring code similar to that described for the NRPS systems [10] was tested. Substrate specificities were assigned either on the basis of literature data or by use of the NRPS prediction server [11]. The sequence positions equivalent, in the multiple sequence alignment, to those involved in the described nonribosomal specificity code [10] are reported in
Table 1. List of freestanding and NRPS-like enzymes retrieved from databanks. All accession numbers refer to the UniProt database except where noted. Boldface names denote in silico predicted proteins not included in databanks. A stands for adenylation domain, T for thiolation, C for condensation, AS for amine-selecting, PQQ for PQQ-binding domain, NADPH for NADPH-binding domain, X for other domains not commonly present in NRPSs. Numeric subscripts to parentheses indicate the repetition of those modules. MNRPS stands for monomodular NRPS. Question marks denote unassigned function. Every PSI-BLAST search was performed with three iterations using as a probe the sequence of GrsA (P14687).
Column headings: Protein length; Putative function; BLAST E-value
a EMBLCDS entry name and b EnsEMBL peptide databank. c,d,e denote that the TBLASTN searches against genomes used as input query sequences P07702, Q8L5Z8 and Q80WC9, respectively. The genes were predicted from the nucleotide sequences: f EMBL accession number AABZ01000259 (positions coding for the protein: 3300–7700), g EMBL AACF01000123 (15700–20800), h EMBL AAAA01021459 (854–2040), i EMBL AAAA01023971 (610–2070), j EMBL AAAA01000789 (18780–31200), k EnsEMBL ctg11952 (800001–1000000), l EnsEMBL Chr_scaffold_632 (38782–48782), m EMBL AABS01000029, n EMBL AACT01000010, o EnsEMBL scaffold_37623 (4535897–4735897).
Fig. 1. Multiple sequence alignment of adenylation domains. Only conserved portions from the multiple sequence alignment obtained as described in Results are shown. Dashes represent insertion and deletion. Numbers above the sequences refer to the sequence numbering of the gramicidin synthetase; is used as block separator. The sequence positions equivalent to those involved in the nonribosomal specificity-conferring code described for the A domain of the gramicidin synthetase are marked with blue triangles. The positions of the core motifs are marked underneath with grey bars labelled according to [1]. Secondary structure assignments are shown for GrsA: α-helices and β-strands are rendered as squiggles and arrows, respectively; T stands for turn; blank for coil and irregular conformations; dots represent gaps introduced in the alignment. Identically conserved residues are displayed as white characters on red background. Conserved regions are denoted by boxed red characters.
Table 2. The residues equivalent to those which in GrsA were observed to interact with the α-amino and α-carboxyl groups of the amino acid substrate, Asp235 and Lys517, respectively [7], are conserved, the only exceptions being the freestanding NRPS from Leptosphaeria maculans (UniProt accession no. Q873Z1) (lacks the Asp235) and AASDH from Acremonium chrysogenum (UniProt accession no. Q9HDP9) (lacks
Table 2. Nonribosomal specificity-conferring code in the freestanding enzymes. All accession numbers refer to the UniProt database except those noted. Boldface codes denote in silico predicted proteins not included in databanks. MNRPS is monomodular NRPS; ACV stands for [L-δ-(α-aminoadipoyl)-L-cysteine-D-valine] tripeptide synthetase. Question marks denote unassigned function or substrate. L-α-Aa stands for L-α-aminoadipate, L-α-Aas stands for L-α-aminoadipate semialdehyde, β-Ala for β-alanine, Hty for hydroxyl tyrosine; the other three-letter codes stand for standard amino acid abbreviations.
Column headings: Activated substrate; Residue position according to GrsA A domain numbering
a EMBLCDS entry name and b EnsEMBL peptide databank. c Module 1 of ACV synthetase is included for comparison with the code of Lys2 and AASDH. d Predicted using the NRPS prediction BLAST server [11].
the Lys517), and the sequences of Ebony from Drosophila melanogaster and from Anopheles gambiae (UniProt accession no. Q7QKF0) where the Asp235 is missing. In this latter case, the absence of the Asp235 can be explained in the light of the model of the interaction of the substrate with the active site (vide infra). It should be noted that the specificity code for the A domains recognizing the substrate β-alanine (Table 2) is similar to that already predicted for the A module of exochelin synthetase from Mycobacterium smegmatis (UniProt accession no. O87313) [11], the only differences being in Ebony, at the positions 239 (Ser vs. Thr), 278 (Val vs. Leu), 299 (Val vs. Ile) and 322 (Phe vs. Ser). The specificity codes of Lys2 and AASDH share the residues Asp235 and Pro236. The residue Pro236 seems to be specific for the aminoadipate substrates. Indeed, the only other system in which it is present in the same position is the module 1 of the chloroeremomycin synthetase (UniProt accession no. O52821) from Amycolatopsis orientalis specific for 3,5-hydroxy-L-phenylglycine [11]. The specificity code of the module 1 of ACV [L-δ-(α-aminoadipoyl)-L-cysteine-D-valine] synthetase from Penicillium chrysogenum, which activates L-α-aminoadipate, displays strong similarities to the Lys2 code (Table 2) with the remarkable differences at position 235, where a Glu residue replaces the conserved Asp, and at position 330, where a Phe residue replaces the conserved Arg/His. The marginal resemblance of the AASDH code to that of Lys2 and ACV module 1 provides a structural basis for the current view that the physiological substrate of the dehydrogenase is L-α-aminoadipate semialdehyde rather than L-α-aminoadipate.
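The comparison of specificity codes described above amounts to listing the positions at which two codes carry different residues. A minimal Python sketch follows. The Ebony residues are those stated in the text (positions 235, 236, 239, 278, 299, 301, 322 and 331). The exochelin synthetase residues at positions other than 239, 278, 299 and 322 are an assumption inferred from the statement that these four are the only differences. Position 330 is omitted because its residue is not given in the text, and the function name is hypothetical.

```python
def code_differences(code_a, code_b):
    """Return (position, residue_a, residue_b) for every position that is
    defined in both codes and carries different residues."""
    shared = sorted(set(code_a) & set(code_b))
    return [(p, code_a[p], code_b[p]) for p in shared if code_a[p] != code_b[p]]


# Ebony residues as stated in the text (GrsA numbering); position 330 is
# left out because the text does not give it.
ebony = {235: "V", 236: "D", 239: "S", 278: "V",
         299: "V", 301: "S", 322: "F", 331: "D"}

# Assumption: exochelin synthetase matches Ebony everywhere except at the
# four positions the text lists as different.
exochelin = dict(ebony)
exochelin.update({239: "T", 278: "L", 299: "I", 322: "S"})

for position, res_ebony, res_exochelin in code_differences(ebony, exochelin):
    print(position, res_ebony, "vs", res_exochelin)
```

Representing each code as a mapping from GrsA position to residue keeps the comparison valid even when some positions are unknown for one of the enzymes.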
Traditional and freestanding A and T domains also share some conserved core motifs. In particular, the core motifs A3 to A10 [1] are conserved in the eucaryotic NRPS-like domains while the motifs A1 and A2 are positioned in a nonconservative section of the alignment (not shown in the figure). However, A1 and A2 are away from the active site and probably only conserved in the NRPSs for structural reasons [1].
Phylogenetic analysis
To visualize evolutionary relationships among the freestanding NRPS A domains and the corresponding domains of the traditional NRPS in a phylogenetic tree, the A domains of 25 bacterial NRPS and the A domain of the ACV synthetase from Penicillium chrysogenum were added to the multiple sequence alignment shown in Fig. 1. The 25 bacterial NRPS sequences were selected taking one representative from each of the different substrate specificity groups defined by Challis et al. [11] to have a view of the substrate range utilized by these enzymes. The phylogenetic tree shown in Fig. 2A was built from the portion of the multiple sequence alignment shown in Fig. 1 comprised between the positions 190–331, which contains the specificity code residues and the core motifs A3 to A5, using the neighbor-joining method as implemented in the module neighbor of the PHYLIP package [21]. The tree accuracy was tested with 1000 bootstrap replicates. On the basis of the assumption that the nine amino acids lining the binding pocket determine substrate specificity [11], we used the maximum parsimony method implemented in the program protpars of the PHYLIP package [21] to establish a relationship between these important residues and substrate specificities in the 65 A domains considered, i.e. 39 freestanding plus 26 NRPS A domains. Therefore, the tree in Fig. 2B was derived considering only nine sequence positions corresponding to the eight involved in the nonribosomal specificity code [11] and the Asp235, which was included because it is not always conserved. On the contrary, Lys517 was not included because it was conserved in all cases considered. The resulting tree obviously has no phylogenetic meaning. The phylogenetic tree based on the positions 190–331 of the complete alignment revealed two clusters containing the α-aminoadipate reductase from fungi and the α-aminoadipate semialdehyde dehydrogenase from metazoa, with independent segregation from the other bacterial sequences. This pattern parallels that observed in the specificity code tree reported in Fig. 2B and confirms that Lys2 and AASDH recognize different substrates. Another independent cluster in both trees is made by the two Ebony proteins (UniProt accession nos. Q7QKF0 and O76858), domain A of exochelin synthetase from Mycobacterium smegmatis module 2 (UniProt accession no. O87313) and the two plant hypothetical NRPS-like proteins (UniProt accession no. Q8L5Z8 and in silico predicted protein O_SATIVA3 in Table 1). This segregation could suggest that β-alanine or a very similar compound might be the substrate of the two plant proteins. ACV synthetase module 1 from Penicillium chrysogenum (UniProt accession no. P26046) displays a substrate specificity identical to fungal Lys2 although its sequence is more similar to that of the metazoa AASDH. Finally, it is interesting to observe the unexpected position in the trees of the protein sequences from Caenorhabditis elegans and Caenorhabditis briggsae (UniProt accession nos. Q95Q02 and Q17301). These proteins, containing 2870 and 4767 residues, respectively, display a typical NRPS modular structure and,
in the phylogenetic tree, are grouped with bacterial NRPSs. Two sequences in the same species homologous to AASDH are observed to cluster, as expected, in the AASDH group (UniProt accession no. Q9XUJ4 and EnsEMBL accession no. ENSCBRP00000001007).
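The bootstrap test mentioned above rests on resampling alignment columns with replacement and rebuilding the tree from each pseudo-replicate. The Python sketch below illustrates only the resampling step and a simple uncorrected pairwise distance (the fraction of differing residues) on toy sequences. It is not the PHYLIP implementation, the function names and sequences are hypothetical, and tree building itself is left to dedicated software.

```python
import random


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over the columns where neither
    sequence has a gap ('-')."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns in common")
    return sum(a != b for a, b in pairs) / len(pairs)


def bootstrap_replicate(alignment, rng):
    """Build one pseudo-replicate by drawing alignment columns with
    replacement; the replicate has the same length as the original."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences).
toy = {"a": "MKVLA", "b": "MKILA", "c": "MRI-A"}

rng = random.Random(0)  # fixed seed so that the run is reproducible
replicates = [bootstrap_replicate(toy, rng) for _ in range(1000)]

print(p_distance(toy["a"], toy["b"]))
print(len(replicates))
```

In a full analysis a distance matrix would be computed for each replicate, a tree inferred from it, and the support of a branch reported as the number of replicate trees in which the same partition of sequences occurs.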
Evolutionary trace analysis [22] (results not shown) was also applied to confirm the presence of functionally
Fig. 2. Phylogenetic trees based on the multiple alignment of A domain sequences. Metazoa, plants, fungi and bacteria are represented with red, green, brown and black colours, respectively. All names and numbers used in the phylogenetic trees are defined in Table 1 except for the following UniProt accession numbers: P35854, D-alanine activating enzyme, Lactobacillus casei; Q50857, saframycin Mx1 synthetase B, Myxococcus xanthus; O87313, FxbB, Mycobacterium smegmatis; O30409, tyrocidine synthetase 3, Brevibacillus brevis; Q9Z4X5, CDA peptide synthetase II, Streptomyces coelicolor; P19828, AngR protein, Listonella anguillarum; Q45295, LchAA protein, Bacillus licheniformis; P39845, putative fengycin synthetase, Bacillus subtilis; P45745, dhbF, Bacillus subtilis; O68008, bacitracin synthetase 3, Bacillus licheniformis; O68006, bacitracin synthetase 1, Bacillus licheniformis; O52819, PCZA363.3, Amycolatopsis orientalis; O68007, bacitracin synthetase 2, Bacillus licheniformis; O87606, peptide synthetase, Bacillus subtilis; Q9ZGA6, FK506 peptide synthetase, Streptomyces sp.; O07944, Pristinamycin I synthetase 3 and 4, Streptomyces pristinaespiralis; P11454, enterobactin, Escherichia coli; O52820, PCZA363.4, Amycolatopsis orientalis; O52821, PCZA363.5, Amycolatopsis orientalis; P71717, phenyloxazoline synthetase MBTB, Mycobacterium tuberculosis; Q9Z4X6, CDA peptide synthetase I, Streptomyces coelicolor; Q50858, saframycin Mx1 synthetase A, Myxococcus xanthus; O69246, LchAB protein, Bacillus licheniformis; P26046, N-(5-amino-5-carboxypentanoyl)-L-cysteinyl-D-valine synthetase, Penicillium chrysogenum. The ‘M’ followed by a number in bacterial NRPS refers to the module A used for building the trees. Enzyme substrates are indicated at the end of the databank code with the standard one-letter code for amino acids or with the following abbreviations. Aa: L-α-aminoadipate; Orn: L-ornithine; DHPG: 3,5-hydroxy-L-phenylglycine; PGly: L-phenylglycine; β-A: β-alanine; Aas: L-α-aminoadipate semialdehyde; 3hTyr: 3-hydroxy-L-tyrosine; HPG: 4-hydroxy-L-phenylglycine; 3h4mF: 3-hydroxy-4-methyl-phenylalanine. (A) Neighbor-joining phylogenetic tree based on the comparison of alignment positions 190–331. The numbers on the branches indicate the number of times the partition of the species into the two sets which are separated by that branch occurred among the 1000 bootstrap trees; (B) maximum parsimony tree calculated with the nine amino acids lining the substrate binding pocket of adenylation domains.
important residues conserved at different levels of partition of the freestanding NRPS family. This method exploits the information inherent in a family of homologous proteins by dividing it to maximize functional similarity within the groups and functional variation between the single groups. The analysis was conducted using the TraceSuite II server (http://www.cryst
bioc.cam.ac.uk/jiye/evoltrace/evoltrace.html) with the same multiple sequence alignment used for building the tree shown in Fig. 2A. The results showed that the core motifs A3 to A5 are conserved in almost all partitions and are characteristic of the NRPS A domains. Furthermore, the variability of the residues of the specificity code confirms that they are group-specific except for the residues Asp235 and Pro236 that are shared by the two groups, AASDH and Lys2, which bind similar substrates (Table 2).
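The distinction drawn above between residues conserved in all partitions and group-specific residues can be illustrated with a simplified Python sketch. It classifies each alignment column at a single, fixed partition level, whereas the evolutionary trace method derives a series of partitions from a phylogenetic tree. The group names, sequences and function names are hypothetical, and gap characters are treated like any other symbol.

```python
def classify_column(residues_by_group):
    """Classify one alignment column.

    residues_by_group: dict mapping a group name to the list of residues
    that the members of that group carry at this column.
    """
    all_residues = {r for residues in residues_by_group.values() for r in residues}
    if len(all_residues) == 1:
        return "invariant"        # same residue in every sequence
    if all(len(set(residues)) == 1 for residues in residues_by_group.values()):
        return "group-specific"   # conserved within each group, differs between groups
    return "variable"             # varies inside at least one group


def classify_alignment(groups):
    """groups: dict mapping a group name to a list of aligned sequences of
    equal length. Returns one label per alignment column."""
    length = len(next(iter(groups.values()))[0])
    return [
        classify_column({g: [s[i] for s in seqs] for g, seqs in groups.items()})
        for i in range(length)
    ]


# Toy partition into three groups (hypothetical sequences).
toy_groups = {
    "group_1": ["GDPA", "GDPS"],
    "group_2": ["GDPT", "GDPC"],
    "group_3": ["GVDA", "GVDA"],
}
labels = classify_alignment(toy_groups)
print(labels)
```

In this toy case the second and third columns are labelled group-specific even though two of the three groups share the same residues, which mirrors the situation of Asp235 and Pro236 being shared by the AASDH and Lys2 groups.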
Modelling of active sites and docking studies
Molecular modelling and manual and automated docking have been utilized to map the conserved residues onto a hypothetical active site structure, to understand the role of these residues and to predict their interaction with the substrates. Figures 3, 4 and 5 report the model active sites of Ebony from Drosophila melanogaster, Lys2 from Saccharomyces cerevisiae and the AASDH from Homo sapiens, respectively.
The reliability of docking experiments using homology models built at a sequence identity to the template of 25–30%, as in the reported case, can be
Fig. 3. Model structure of Ebony from Drosophila melanogaster. The Ebony model is represented in teal blue cartoons. The AMP molecule is rendered as a stick model. The specificity code residues are shown as stick models with superimposed slate blue CPK models. Carbon, oxygen, nitrogen and phosphorus atoms are displayed with green, red, blue and purple colors, respectively. β-Alanine is represented as a stick model with grey carbon atoms. Dashes indicate hydrogen bonds. This figure was rendered using PyMOL [31].
Fig. 4. Model structure of the active site of Lys2 from Saccharomyces cerevisiae. The AMP molecule is shown as a stick model. Carbon, oxygen, nitrogen and phosphorus atoms are displayed with green, red, blue and purple colors, respectively. The two possible arrangements of the substrate L-α-aminoadipate (L-α-Aa) are superimposed and represented as sticks. Carbon atoms are colored in two different ways: cyan for L-α-Aa in which the δ-carboxyl group forms a hydrogen bond with Lys517; green for L-α-Aa in which the α-carboxyl forms a hydrogen bond with Lys517. The other atoms are colored as in AMP. All the residues in the active site are rendered as CPK and colored in slate blue. This figure was rendered using PyMOL [31].
Fig. 5. Model structure of AASDH from Homo sapiens. The AASDH main chain is represented in teal blue cartoons; AMP is shown as a stick model. Carbon, oxygen, nitrogen and phosphorus atoms are displayed with violet, red, blue and purple colours, respectively. L-α-Aminoadipate semialdehyde (L-α-Aas) is represented as sticks and its carbon atoms are green. The specificity code residues are shown as sticks and CPK. Sticks are colored as in AMP except for carbon atoms, which are in grey, and CPK, which are colored in blue marine. This figure was rendered using PyMOL [31].
questionable. Indeed, the superposition of the three structures related to the freestanding A domains, namely GrsA, firefly luciferase and DhbE, which share 16% sequence identity on average, shows that the average RMSD over the Cα atoms of the entire structures is 2.6 Å. On the contrary, the average RMSD calculated over the Cα atoms enclosed in a sphere of radius 9 Å centered at the GrsA residue Asp235 in the active site is 0.95 Å. Indeed, the active sites of the enzymes tend to be structurally more conserved during evolution [23]. Therefore the error affecting the active site is expected to be lower than that regarding the rest of the protein. Consequently, the docking studies can still provide useful and testable indications.
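The comparison between a global RMSD and one restricted to a sphere around an active site residue can be sketched in Python as follows. The sketch assumes that the two structures have already been superposed and that equivalent Cα atoms are supplied in the same order. It performs no structural fitting, the coordinates are toy values chosen for easy arithmetic, and the function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) coordinates of equivalent atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def rmsd_within_sphere(coords_a, coords_b, center, radius):
    """RMSD restricted to the atom pairs whose atom in structure A lies
    within `radius` of `center`."""
    selected = [(a, b) for a, b in zip(coords_a, coords_b)
                if math.dist(a, center) <= radius]
    if not selected:
        raise ValueError("no atoms inside the sphere")
    return rmsd([a for a, _ in selected], [b for _, b in selected])


# Toy coordinates in angstrom (hypothetical, already superposed).
structure_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
structure_b = [(0.0, 1.0, 0.0), (3.0, 0.0, 1.0), (20.0, 5.0, 0.0)]

global_rmsd = rmsd(structure_a, structure_b)
local_rmsd = rmsd_within_sphere(structure_a, structure_b,
                                center=(0.0, 0.0, 0.0), radius=9.0)
print(global_rmsd, local_rmsd)
```

With these toy values the atom far from the chosen center deviates most, so the RMSD inside the sphere is lower than the global one, which is the pattern reported for the three template structures.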
In the active site of Ebony (Fig. 3), two residues of the traditional nonribosomal code, Asp235 and Pro236, are replaced by Val and Asp, respectively. The aspartate in position 236 can form a hydrogen bond to the β-amino group of the β-alanine substrate, which interacts also via hydrogen bonds with Ser301 and Asp331. The other residues line the active site pocket. A bulky aromatic residue (Phe322) serves as the floor of the active site pocket. Apparently, the rearrangement of the side chains at the active site enabled the enzyme to recognize a substrate with a β-amino instead of an α-amino group. Interestingly, substitution of Asp235 is indicative of the substrate structure. For example, in the case of DhbE position 235 is occupied by Asn and the relative substrate lacks an α-amino group [8].
It has been proposed, for Lys2 from S. cerevisiae, that the L-α-aminoadipate substrate could be adenylated at the δ-carboxylate rather than the α-carboxylate and that the α-amino and α-carboxyl groups of the substrate bind at the bottom of the pocket interacting with the Arg239 and Glu322 [14]. An analogous arrangement was proposed also for the binding of L-α-aminoadipate to the adenylation domain of the ACV synthetase from Penicillium chrysogenum [7]. The results of the docking experiments indicated (Fig. 4) that the possible binding modes cluster into two solutions. According to the first possibility, the substrate α-aminoadipate is bound to the active site with a salt bridge between the α-amino group and the carboxyl group of Asp235 and a hydrogen bond to the carbonyl group of Arg330. In yeast Lys2, the δ-carboxylate group of the substrate forms a salt bridge with Arg239. Finally, the substrate α-carboxylate interacts via hydrogen bonds with the ε-amino group of Lys517. The other residues of the putative specificity code line the walls of the active site. In particular, the conserved Pro236 shapes the pocket to host the substrate. An alternative binding mode of the substrate in the active site involves the formation of a salt bridge between the δ-carboxylate group and the ε-amino group of Lys517 and between the α-carboxylate and Arg239. The α-amino group interacts via hydrogen bonds with the carbonyl oxygens of Met322, Gly324 and Arg330. The first binding mode of the substrate (the α-carboxylate interacting with Asp235) is supported by the invariance of Asp235, which usually stabilizes the α-amino group of the amino acid substrate. The importance of Asp235 in Lys2 is evidenced also by mutational analysis, which showed a complete loss of catalytic activity for the mutant Asp235→Asn, while the mutant Asp235→Glu retained only 4% of catalytic activity [24]. Also, this binding mode is in line with the absence of a negatively charged side chain in the position 322 of the putative α-aminoadipate specificity code (Table 2) whose role is to stabilize the α-amino group of the substrate. Such a residue (Glu322) is present in ACV synthetase. However, most importantly, the same binding mode does not account for the experimental evidence of the existence of the α-aminoadipoyl-C6-AMP [13], which can be explained by the binding mode with the δ-carboxylate in proximity of Asp235.
The results of the docking studies of α-aminoadipate semialdehyde, assumed to be the substrate of AASDH [12] (Fig. 5), show that the substrate can interact with the active site in only one orientation. It involves the formation of a salt bridge between the α-amino group of the substrate and the carboxylic group of Asp235 and a hydrogen bond to the carbonyl atom of Ser330. The δ-aldehyde group of the substrate interacts with Gln278. Finally, the substrate α-carboxylate, as expected, interacts via hydrogen bonds with the ε-amino group of Lys517 in both enzymes. Once again, this binding mode can explain the invariance of Asp235 and this model can account for the lack of a negatively charged side chain at position 322 of the putative specificity code (Table 2) able to stabilize the α-amino group of the substrate, which is instead present in ACV synthetase (Glu322).
The results reported in this work demonstrate that a specificity-conferring code can be recognized also in the freestanding eucaryotic NRPS-like enzymes. A role for some of the specificity residues could be predicted on the basis of in silico studies. These indications can be useful for planning experiments aimed at a better characterization and at the engineering of this emerging group of single NRPS modules responsible for amino acid selection, activation and modification in the absence of other NRPS assembly line components.