Phylogenetic relationships in class I of the superfamily of bacterial, fungal, and plant peroxidases
Marcel Zámocký
Institute of Molecular Biology, Slovak Academy of Sciences, Bratislava, Slovakia
Molecular phylogeny among catalase–peroxidases, cytochrome c peroxidases, and ascorbate peroxidases was analysed. Sixty representative sequences covering all known subgroups of class I of the superfamily of bacterial, fungal, and plant heme peroxidases were selected. Each sequence analysed contained the typical peroxidase motifs evolved to bind effectively the prosthetic heme group, enabling peroxidatic activity. The N-terminal and C-terminal domains of catalase–peroxidases matching the ancestral tandem gene duplication event were treated separately in the phylogenetic analysis to reveal their specific evolutionary history. The inferred unrooted phylogenetic tree obtained by three different methods revealed the existence of four clearly separated clades (C-terminal and N-terminal domains of catalase–peroxidases, ascorbate peroxidases, and cytochrome c peroxidases), which were segregated early in the evolution of this superfamily. From the results, it is obvious that the duplication event in the gene for catalase–peroxidase occurred in the later phase of evolution, in which the individual specificities of the peroxidase families distinguished were already formed. Evidence is presented that class I of the heme peroxidase superfamily is spread among prokaryotes and eukaryotes, obeying the birth-and-death process of multigene family evolution.
Keywords: ascorbate peroxidase; catalase–peroxidase; cytochrome c peroxidase; birth-and-death process; lateral gene transfer.
Heme peroxidases are very abundant enzymes present in all living forms. These oxidoreductases are involved in a wide array of physiological processes, the most important of which are responses to various forms of oxidative stress [1,2]. Attention was mainly drawn to the family of catalase–peroxidases by the representative encoded by KatG in Mycobacterium tuberculosis, which is capable of oxidative activation of isoniazid (isonicotinic acid hydrazide) [3], still the most widely used antituberculosis drug. All heme peroxidases have important features of their catalytic mechanism in common. After their initial oxidation with a molecule of hydrogen peroxide, they oxidize, from the reactive intermediate known as compound I, a wide variety of substrates according to the simplified reaction scheme: H2O2 + 2AH → 2H2O + 2A.
The detailed reaction mechanism and substrate specificity of numerous peroxidases have been investigated for decades, and a large amount of experimental data has accumulated (e.g. [4], for review). It was suggested that similar heme-containing peroxidases which are very abundant in plants, fungi and some bacteria should constitute the plant peroxidase superfamily [5]. They were further classified into three subclasses according to their cellular localization and function. All representatives possess the same heme prosthetic group containing high-spin ferric iron, so the reaction specificity is apparently determined by the protein surroundings of the heme. Catalase–peroxidases, which belong to class I, are the only group of this superfamily that possess notable catalase activity (i.e. they can both oxidize and reduce hydrogen peroxide; see [6] for details). All other members of the superfamily can only reduce hydrogen peroxide with subsequent oxidation of a secondary substrate. These 'noncatalase' members of class I exhibit strong specificity for electron donors: the preferred substrate is ascorbate in the case of ascorbate peroxidases and cytochrome c for cytochrome c peroxidases.
Several crystal structures of heme peroxidases have been solved, now covering all subgroups of this superfamily. In class I, the crystal structures of cytochrome c peroxidase (CCP) from Saccharomyces cerevisiae [7] and ascorbate peroxidase (APX) from Pisum sativum [8] are known; the former has served as a benchmark for peroxidase structures for two decades. APX has also been crystallized in a complex with its substrate [9]. After many unsuccessful attempts, crystals of several catalase–peroxidases were also obtained (e.g. [10,11]). Recently, the structure of catalase–peroxidase from the halophilic archaeon Haloarcula marismortui was solved to high resolution [12], and this was followed by the highly resolved structure of KatG from Burkholderia pseudomallei [13].
The phylogenetic relations of heme peroxidases have only been analysed to a certain extent: the evolutionary analysis of the mammalian peroxidase superfamily has been performed, and even a prokaryotic member has been detected [14]. The phylogenetic relations in the plant peroxidase superfamily have been analysed only partially: the common phylogeny of catalase–peroxidases and APXs has been outlined [15], and a dendrogram of 29 lignin and manganese peroxidases has been presented [16]. The present study should contribute to our understanding of the possible modes of evolution of multigene families. In principle, two possible schemes have been suggested for this type of phylogeny: (a) concerted evolution and (b) evolution by a birth-and-death process. In the first case, multigene families arise in the genomes after gene duplications by the mechanisms of unequal crossing over and gene conversion, followed by natural selection [17]. In the second case, new genes are created by repeated gene duplication; some are maintained in the genome for a long time, whereas others are deleted or become nonfunctional [18–20]. It is well documented that almost all KatGs contain two fused copies of the primordial peroxidase gene [5,15]. The copy translated into the N-terminal domain participates in catalysis and possesses the prosthetic heme group, but the catalytic function of the C-terminal domain is not apparent [12], and thus the role of the corresponding part of the gene is also unknown. These gene fusions together with single-copy genes of APXs and CCPs are ideal for investigating the evolution of a widespread multigene family. Hence, identifying the most probable evolutionary route leading to clades of extant heme peroxidases will help to explain the occurrence and function of multigene families present in both prokaryotic and eukaryotic genomes.
Correspondence to M. Zámocký, Institute of Molecular Biology, Slovak Academy of Sciences, Dúbravská cesta 21, SK-845 51 Bratislava, Slovakia. Fax: +4212 59307416, Tel.: +4212 59307441, E-mail: umikmzam@savba.sk
Abbreviations: APX, ascorbate peroxidase; CCP, fungal cytochrome c peroxidase; CP, catalase–peroxidase; CPn, N-terminal domain of a catalase–peroxidase; CPc, C-terminal domain of a catalase–peroxidase; KatG, gene for catalase–peroxidase; NJ, neighbor-joining.
(Received 22 April 2004, revised 10 June 2004, accepted 21 June 2004)
Experimental Procedures
Sequence data
All protein sequences used in this study were obtained from the UniProt database and are listed in Table 1 together with their accession numbers and the organisms from which they originate. The protein sequences were used to infer the phylogenetic relationships. Protein rather than DNA sequences were chosen because both the codon usage bias of the analysed sequences, which range from archaea to higher plants, and the presence of introns in only some of the members analysed (plant APXs) can cause serious problems when DNA sequences are analysed directly. Owing to the currently unequal availability of the sequences of the proposed groups of heme peroxidases (known from the previous analysis in [15]), only representative sequences from each group and from each kingdom were selected for a statistically equilibrated phylogenetic analysis. All 34 catalase–peroxidases analysed here were divided into N-terminal and C-terminal domains because of the apparent tandem gene-duplication event reported previously [5,15]. The border between the domains was easily discernible because of conserved residues and motifs present in all known KatGs. All sequenced N-terminal domains are longer (average length 430 amino acids) because of several insertions not present in C-terminal domains (309 amino acids on average [6]). Twenty-one APXs, from red algae to higher plants, were selected for the analysis. APX genes expressed in the cytoplasm and in chloroplasts were chosen in equal numbers. Two APXs from Euglenozoa were also included. The 25 amino acid-long fragment of APX from bovine eye (accession no. PC4445) could not be used in this analysis. This N-terminal stretch is insufficient in length and of rather unclear origin. Moreover, no other homologs of APX are known in the whole kingdom Animalia. Besides the well-known S. cerevisiae CCP sequence, two additional ascomycetous CCP sequences (as putative ORFs from sequencing projects) were also included in this study. No homologs of yeast CCP from other kingdoms are known, and, for example, bacterial CCPs belong to a different protein family.
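The bookkeeping behind the data set can be checked with a short sketch. It uses only the counts stated in this section (34 catalase–peroxidases, 21 APXs plus two from Euglenozoa, and three CCPs) and treats each catalase–peroxidase as two separately aligned domains; the variable names are illustrative only.

```python
# Counts as stated in the Sequence data section (variable names are illustrative).
n_katg = 34      # catalase-peroxidases, each later split into N- and C-terminal domains
n_apx = 21 + 2   # 21 APXs from red algae to higher plants, plus 2 from Euglenozoa
n_ccp = 3        # S. cerevisiae CCP plus two additional ascomycetous sequences

n_enzymes = n_katg + n_apx + n_ccp       # whole enzymes selected
n_aligned = 2 * n_katg + n_apx + n_ccp   # domains counted separately

print(n_enzymes, n_aligned)  # 60 94
```

These totals agree with the sixty representative sequences mentioned in the abstract and with the 94 sequences reported for the profile alignment.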
Multiple sequence alignments
Multiple sequence alignments of catalase–peroxidases, and of APXs with CCPs, were performed using CLUSTALX [21]. In the case of catalase–peroxidases, two partial alignments were performed, one for the N-terminal and one for the C-terminal domain. Suitable parameters for all three partial alignments were: gap opening penalty, 10.0; gap extension penalty, 0.2; and gap separation distance, 8. The Blosum 62 series protein weight matrix was used in all three cases. These parameters were the same as those used for the first alignment of class I of the peroxidase superfamily [15]. Varying the gap opening penalty setting in the range 5.0–20.0 did not change the alignment output significantly. The sequence alignments were displayed with GENEDOC [22] and refined manually with respect to known structural homology.
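To give a feeling for the relative weight of the two gap parameters, the following minimal sketch evaluates a generic affine gap cost. It is an assumption made for illustration: CLUSTALX additionally adjusts penalties in position- and residue-specific ways, so this is not its exact scoring, and the function name is hypothetical.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Generic affine gap cost: one opening charge plus a per-position extension charge.

    Assumption for illustration only; conventions differ on whether the first
    gap position is also charged the extension penalty.
    """
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


# Parameters quoted in the text for the partial alignments (10.0 / 0.2).
print(affine_gap_cost(5, 10.0, 0.2))   # 11.0
# Same gap at the upper end of the tested opening-penalty range (20.0).
print(affine_gap_cost(5, 20.0, 0.2))   # 21.0
```

Under this simplified scheme the opening penalty dominates the cost of short gaps, which is why it is the natural parameter to vary when testing the robustness of an alignment.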
Profile alignments
The profile alignment mode of CLUSTALX was used stepwise on the partial alignments. Firstly, the N-terminal domains of catalase–peroxidases were aligned with the prealigned group of APXs and CCPs, where the known secondary-structure elements were taken into account. This new profile was finally used to align the group of C-terminal domains of catalase–peroxidases, which share the lowest sequence similarity with other superfamily members in catalytically essential regions. Suitable parameters used for all profile alignments were: gap opening penalty, 10.00; gap extension penalty, 0.1; Blosum 30 protein weight matrix; helix and strand gap penalty, 4; and loop gap penalty, 1. Finally, areas of extensive gaps (i.e. longer than 10 amino acid positions and present in more than 90% of sequences) were omitted from the entire alignment to prevent long-branch attraction in the following procedures.
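The gap-trimming rule can be sketched as follows. This is one possible reading of the criterion (runs of more than 10 consecutive columns, each with gaps in more than 90% of sequences), not the procedure actually used; the function name, defaults and toy alignment are hypothetical.

```python
def drop_gap_rich_runs(alignment, min_run=10, gap_fraction=0.9, gap_char="-"):
    """Remove runs of more than `min_run` consecutive columns in which more
    than `gap_fraction` of the sequences carry a gap."""
    n_seqs = len(alignment)
    n_cols = len(alignment[0])
    # Flag every column whose gap fraction exceeds the threshold.
    gappy = [
        sum(seq[c] == gap_char for seq in alignment) / n_seqs > gap_fraction
        for c in range(n_cols)
    ]
    keep = [True] * n_cols
    c = 0
    while c < n_cols:
        if gappy[c]:
            start = c
            while c < n_cols and gappy[c]:
                c += 1
            # Only sufficiently long gap-rich runs are discarded.
            if c - start > min_run:
                for k in range(start, c):
                    keep[k] = False
        else:
            c += 1
    return ["".join(ch for ch, k in zip(seq, keep) if k) for seq in alignment]


# Toy alignment: 19 of 20 sequences have a 12-column gap, one has an insertion.
toy = ["AAA" + "-" * 12 + "CCC"] * 19 + ["AAA" + "G" * 12 + "CCC"]
trimmed = drop_gap_rich_runs(toy)
print(trimmed[0], len(trimmed[0]))  # AAACCC 6
```

Shorter gap-rich stretches are left untouched, so ordinary indels still contribute to the alignment.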
Phylogenetic analysis
The profile alignment used for the phylogenetic analysis comprised 94 sequences (each KatG divided into the two corresponding domains) and had a total length of 398 amino acid positions. Three different phylogenetic methods were applied.
First, the phylogenetic relationships were inferred using the neighbor-joining (NJ) method selected from the package MEGA [23]. The following parameters were used: the Poisson correction of substitutions; the option of 'complete deletion' for handling gaps; and 100 bootstrap replications as a test of inferred phylogeny. The resulting unrooted tree topology was visualized in the TREE EXPLORER.
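The distance underlying this NJ analysis can be illustrated with a minimal sketch. It assumes the standard form of the Poisson correction, d = -ln(1 - p), where p is the proportion of differing sites, and it models 'complete deletion' as removal of every column containing a gap in any sequence; the helper names and toy sequences are hypothetical and this is not MEGA's code.

```python
import math


def complete_deletion(alignment, gap_char="-"):
    """Keep only the columns that contain no gap in any sequence."""
    keep = [
        c for c in range(len(alignment[0]))
        if all(seq[c] != gap_char for seq in alignment)
    ]
    return ["".join(seq[c] for c in keep) for seq in alignment]


def poisson_distance(a, b):
    """Poisson-corrected distance d = -ln(1 - p) between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)


print(complete_deletion(["AC-E", "ACDE", "A-DE"]))                # ['AE', 'AE', 'AE']
print(round(poisson_distance("ACDEFGHIKL", "ACDEFGHIKV"), 4))     # 0.1054
```

The correction matters more as sequences diverge: at p = 0.1 it is small, whereas at p = 0.5 the distance already rises to about 0.69.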
Table 1. Sequences of enzymes used in this study. Abbreviations for all peroxidases included in this evolutionary analysis, with their accession numbers from the UniProt database and organisms from which they originate. In the case of catalase–peroxidases, the parts coding for the N-terminal and C-terminal domains of the corresponding genes (KatG) were treated separately. Sequence data for Candida albicans was obtained from the Stanford Genome Technology Center website at http://www-sequence.stanford.edu/group/candida.
Abbreviation Accession number Enzyme Organism (strain)
ArathaAPXc Q05431 Ascorbate peroxidase 1 Arabidopsis thaliana
ArathaAPXt Q42593 Ascorbate peroxidase (thylakoid) Arabidopsis thaliana
ArchfulCP O28050 Catalase–peroxidase Archaeoglobus fulgidus
AspefumCP Q7Z7W6 Catalase–peroxidase Aspergillus fumigatus
AspenidCP Q96VT4 Catalase–peroxidase Emericella nidulans
BacihalCP Q9KEE6 Catalase–peroxidase Bacillus halodurans
BacisteCP P14412 Catalase I Geobacillus stearothermophilus
BlumgraCP Q8X1N3 Catalase–peroxidase Blumeria graminis
BurkcepCP Q9AP06 Catalase–peroxidase Burkholderia cepacia
BurkpseCP Q939D2, pdb: 1MWV Catalase–peroxidase Burkholderia pseudomallei
CandalbCCP Contig19–10046* Cytochrome c peroxidase Candida albicans
CapsannAPX Q84UH3 Ascorbate peroxidase Capsicum annuum
CaulcreCP O31066 Peroxidase/catalase Caulobacter crescentus
ChlamspAPX Q9SXL5 Ascorbate peroxidase Chlamydomonas sp. W80
ChlareiAPX O49822 Ascorbate peroxidase Chlamydomonas reinhardtii
CucusatAPX Q96399 Ascorbate peroxidase (cytosolic) Cucumis sativus
CucurcAPXt O04873 Ascorbate peroxidase (thylakoid-bound) Cucurbita cv. Kurokawa Amakuri
DesulfiCP ZP_00096951 Catalase–peroxidase Desulfitobacterium hafniense
E_coliHPI P13029 Catalase HPI Escherichia coli
E_coliPCP P77038 EHEC-strain catalase peroxidase Escherichia coli (strain O157:H7)
EuglgraAPX Q8LP26 Ascorbate peroxidase Euglena gracilis
FraganaAPX O48919 Ascorbate peroxidase Fragaria x ananassa
GaldparAPX Q8GT26 Hybrid-type ascorbate peroxidase Galdieria partita (Rhodophyta)
GeobactCP AAR35476 Catalase–peroxidase Geobacter sulfurreducens
GloeobaCP Q7NGW6 Catalase–peroxidase Gloeobacter violaceus
GlycmaxAPX Q43758 Ascorbate peroxidase 1 Glycine max
GosshirAPX Q39780 Ascorbate peroxidase Gossypium hirsutum
HalomarCP O59651, pdb: 1ITK Catalase–peroxidase Haloarcula marismortui
HalosalCP Q9HHP5 Catalase–peroxidase Halobacterium salinarum
LegipneCP Q9ZGM4 Catalase–peroxidase Legionella pneumophila
LycoesAPXt Q8LSK6 Ascorbate peroxidase (thylakoid) Lycopersicon esculentum
MesecryAPX Q42909 Ascorbate peroxidase Mesembryanthemum crystallinum
MesolotCP Q987S0 Catalase–peroxidase Mesorhizobium loti
MethaceCP Q8TS34 Catalase–peroxidase Methanosarcina acetivorans
MycoforCP O08404 Catalase–peroxidase Mycobacterium fortuitum
MycosmeCP Q59557 Catalase–peroxidase Mycobacterium smegmatis
MycospeCP Q9R2E9 Catalase–peroxidase Mycobacterium vanbaalenii
MycotubCP Q08129 Catalase–peroxidase Mycobacterium tuberculosis
NcrassaCP Q8X182 Catalase–peroxidase Neurospora crassa
Ncrassahyp Q7SDV9 Hypothetical protein Neurospora crassa
NictabAPXc Q42941 Ascorbate peroxidase (cytosolic) Nicotiana tabacum
NictabAPXt Q9XPR6 Ascorbate peroxidase (thylakoid-bound) Nicotiana tabacum
OryzsatAPX P93404 Ascorbate peroxidase Oryza sativa
PenimarCP Q8NJN2 Catalase–peroxidase Penicillium marneffei
PisusatAPX P48534, pdb: 1APX Ascorbate peroxidase Pisum sativum
PorpyezAPX Q7Y1X0 Ascorbate peroxidase (cytosolic) Porphyra yezoensis (Rhodophyta)
PseuputCP Q88GQ0 Catalase–peroxidase HPI Pseudomonas putida KT2440
RhizlegCP Q8RJZ6 Catalase–peroxidase Rhizobium leguminosarum
SacchceCCP P00431, pdb: 2CYP Cytochrome c peroxidase Saccharomyces cerevisiae
ShewoneCP Q8EIV5 Catalase–peroxidase HPI Shewanella oneidensis
SpinolAPXt O46921 Ascorbate peroxidase (thylakoid) Spinacia oleracea

The same profile alignment of 94 sequences was subjected to the bootstrap procedure of the PHYLIP package [24]. After 100 bootstrap cycles, the data set was subjected to the pairwise protein distance calculation method, in which the JTT protein matrix [24] was used. This output was used as input for the Fitch–Margoliash 'least squares' phylogenetic tree estimation method, in which the search for the best trees was allowed. In addition, global rearrangement of the sequence order after each cycle in the FITCH program was activated. The series of trees produced was analysed by the Consense method to reveal the majority rule consensus tree. This tree was visualized with the program TREEVIEW [25].
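The resampling step behind the 100 bootstrap cycles can be sketched as follows. This is a generic illustration of column bootstrapping, under the assumption that alignment columns are drawn with replacement until the original alignment length is reached; it is not PHYLIP's code, and the function name, seed and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by sampling alignment columns with replacement."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
toy = ["ACDEFG", "ACDQFG", "TCDEFA"]   # hypothetical three-sequence alignment
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]

print(len(replicates), len(replicates[0][0]))  # 100 6
```

Each replicate would then be converted to a distance matrix and a tree, and the consensus step counts how often each grouping recurs among the resulting trees.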
The maximum likelihood unrooted phylogenetic tree was also calculated using the program PUZZLE, version 5.0 [26]. The WAG model of amino acid substitution was applied [27]. Slow and accurate parameter estimation and 50 000 puzzling steps were used on the set of data subjected to the above methods. The γ-distribution of rate heterogeneity with parameter estimation from the actual data set was used (value obtained for parameter Gamma = 0.62). In total, 230 300 quartets were analysed, and an unrooted quartet puzzling tree was produced. This tree was also visualized with the program TREEVIEW [25]. The highest likelihood trees resulting from all three methods described were compared to arrive at the expected tree.
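The number of quartets available to a quartet-puzzling analysis follows directly from the number of sequences: n sequences yield C(n, 4) distinct four-sequence subsets. A minimal sketch of this count, with a hypothetical helper name:

```python
import math


def n_quartets(n_sequences):
    """Number of distinct four-sequence subsets (quartets) among n sequences."""
    return math.comb(n_sequences, 4)


print(n_quartets(94))  # 3049501
print(n_quartets(50))  # 230300
```

The reported figure of 230 300 quartets coincides with C(50, 4), and Figs 1–3 show 50 selected representatives. Whether the quartet analysis was run on such a 50-sequence subset is not stated in the text, so this is offered only as an observation.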
Structural comparisons
Experimental 3D co-ordinates of two catalase–peroxidases, one APX and one CCP were obtained from the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ, USA (http://www.rcsb.org). Their codes are given in Table 1 next to the corresponding sequences. The secondary-structure elements of all structures used for comparison were outlined by PDBsum (http://www.biochem.ucl.ac.uk/bsm/pdbsum) [28]. The secondary-structure content was quantified from the resulting plots with the program PROMOTIF implemented in PDBsum.
Results and discussion
Conserved regions and typical motifs in the sequences of class I peroxidases
Sixty heme peroxidases belonging to class I of the plant peroxidase superfamily were aligned with the option of profile alignment in CLUSTALX. The overall sequence similarity is 28.5%, as calculated from the 398 amino acid-long alignment used for the phylogenetic analysis. The three most important sequence areas possibly involved in the catalytic mechanism are presented in Figs 1–3. Region A is located on the distal side of the prosthetic heme group, and regions B and C are located on the proximal side. The unambiguous sequence similarities in these regions can also be traced in the known 3D crystal structures of class I peroxidases presented in Fig. 4 for members of each group analysed.
The greatest sequence conservation is found in the area on the distal side of the heme prosthetic group (known as peroxidase consensus pattern PS00436 in the Prosite database) surrounding the active site, where it reaches 76% (Fig. 1). The catalytic triad Arg92, Trp95, and His96 in HalomarCPn (abbreviations of all sequences analysed are listed in Table 1), located in the distal heme cavity, is invariantly conserved among all N-domains of catalase–peroxidases, all CCPs and all APXs. Whereas the essential arginine (Arg92) and histidine (His96) are responsible for compound I formation [4], the latter allowing the heterolytic cleavage of the peroxide bond via acid–base catalysis [29], the coessential tryptophan (Trp95) facilitates the two-electron reduction of compound I by hydrogen peroxide [30]. Site-directed mutagenesis in E_coliHPIn [31], MycotubCPn [32], and SyncyspCPn [29], as well as in PisusatAPX [33] and SacchceCCP [34], supported the role of the catalytic triad by affecting the typical reactivity. The level of decrease in the peroxidase activity (in contrast with catalase activity) correlated with the ability of the respective mutants to bind heme. Residues corresponding to the catalytic triad are not conserved in the C-domains of catalase–peroxidases, which do not bind heme. The position corresponding to Arg92 (the numbering corresponds to HalomarCP, in which the residues can also be found; Fig. 4A,B) is variable in C-domains, but a similar basic residue occurs there (e.g. Lys465; Fig. 4B). The position corresponding to His96 is even more variable in all C-domains of the catalase–peroxidases investigated. In contrast, the positions Trp95 and Trp468 were invariantly conserved among all class I representatives except MesecryAPX. From Fig. 1 it is obvious that the extension of the distal active site exhibits high sequence conservation, although lower than the region directly involved in the reaction with the peroxidic substrate. An essential asparagine (Asn126 of HalomarCPn) is located here, and its role is supported by a mutagenesis study [35]. The hydrogen-bonding network in which this residue is involved has subtle differences from that present in PisusatAPX [36], visible in the sequence alignment (e.g. Asn121 in HalomarCPn and Glu65 in PisusatAPX).

Table 1. (Continued).
Abbreviation Accession number Enzyme Organism (strain)
StreretCP O87864 Catalase–peroxidase Streptomyces reticuli
SyncyspCP P73911 Catalase HPI Synechocystis sp. (PCC6803)
SynecspCP Q55110 Catalase–peroxidase Synechococcus sp. (PCC7942)
TrypcruAPX Q8I1N3 Ascorbate-dependent peroxidase Trypanosoma cruzi
VibrchoCP Q9KRS6 Catalase–peroxidase Vibrio cholerae
VignungAPX Q41712 Ascorbate peroxidase Vigna unguiculata
XantcamCP Q8PBB7 Catalase–peroxidase Xanthomonas campestris
XylefasCP Q9PBB2 Catalase–peroxidase Xylella fastidiosa
YerspesCP Q9X6B0 Catalase–peroxidase Yersinia pestis
ZeamaysAPX Q41772 Ascorbate peroxidase Zea mays

Fig. 1. Multiple sequence alignment of 50 selected representatives of the superfamily of bacterial, fungal and plant heme peroxidases: region on the distal side of the prosthetic heme group. Abbreviations of enzyme sources are defined in Table 1. Numbers indicate the position of each presented segment within the corresponding sequence. Sequences are grouped together as discussed in the text (i.e. catalase–peroxidases divided into two separate domains, CCPs, and APXs). Sequence similarity is graded from light grey (low similarity) to black (highest similarity). Functionally important residues involved in the catalytic mechanism are marked with an asterisk. This figure was constructed using GENEDOC [22]. The complete alignment of these sequences is available upon request.

The area around the proximal heme ligand (His259 in HalomarCPn, His163 in PisusatAPX, His175 in SacchceCCP; Fig. 2) is less conserved. The overall sequence similarity around this iron ligand is only 48%. Nevertheless, the corresponding peroxidase consensus pattern PS00435 (Prosite database) is discernible in all the peroxidases analysed. The lower sequence similarity compared with the distal side can be explained by the fact that this rather variable region contributes significantly to the reaction specificity of the respective groups, and therefore each family has its own typical feature in this region. The iron of the prosthetic heme group is invariantly co-ordinated by the above essential proximal histidine (Fig. 4A,C,D). However, in all C-domains of catalase–peroxidases, there is a conserved arginine (Arg622 in HalomarCPc) in the corresponding position, indicating that these domains lost their ability to co-ordinate the heme. Further, a conserved tryptophan (Trp311 in HalomarCPn, Trp179 in PisusatAPX, and Trp191 in SacchceCCP) is thought to participate in an important hydrogen-bond network on the proximal side of the heme. This residue is not conserved in all C-domains of catalase–peroxidases and some APXs (e.g. position Phe171 in MesecryAPX), supporting the theory that it is not essential for the reaction mechanism of APXs [33]. In contrast, for CCPs, Trp191 has been suggested to be the site of the free-radical formation of the corresponding CCP compound I [37]. Site-directed mutagenesis was performed in the proximal heme cavity of SyncyspCPn [38] and SacchceCCP [34] and focused on the function of the two residues. In mutated catalase–peroxidases, substitutions in both residues had a pronounced effect: a decrease in activity and loss of the prosthetic heme group. As in the distal heme region, there are highly conserved positions of unknown function close to the essential His and Trp among all the sequences investigated (Fig. 2). Even though the C-domains of the catalase–peroxidases had lost the ability to bind the prosthetic heme group, the structural elements remained conserved.

Fig. 2. Multiple sequence alignment of 50 selected representatives of the superfamily of bacterial, fungal and plant heme peroxidases: region on the proximal side of the prosthetic heme group. Abbreviations of enzyme sources are defined in Table 1. Numbers indicate the position of each presented segment within the corresponding sequence. In the case of catalase–peroxidases and some APXs a large insertion is present here. Sequences are grouped together as discussed in the text (i.e. catalase–peroxidases divided into two separate domains, CCPs, and APXs). Sequence similarity is graded from light grey (low similarity) to black (highest similarity). Functionally important residues involved in the catalytic mechanism are marked with an asterisk. This figure was constructed using GENEDOC [22].

In addition, in the N-domains of the catalase–peroxidases, there is a large, 36 amino acid-long insertion (between residues Asp268 and Thr304 in HalomarCPn) which has been suggested to influence the strength of Fe–N co-ordination on the proximal side [6]. This unique sequence motif in the known KatG structure(s) is in principle a large loop [12] leading from the edge on the proximal side of the heme to the molecular surface on the distal side. Part of this loop on the surface, around Glu271 of HalomarCP (Fig. 4A, shown in green), forms an entrance to the substrate access channel, and the remainder interacts with the C-domain of the neighboring subunit [12]. This loop contributes to the typical organization of the substrate channel to the active site (compare Fig. 4A with Fig. 4C and Fig. 4D), not surprisingly as it is a very flexible region which could not be located in the electron density map. The role of this unique insertion has been examined in MycotubCPn by mutating Ser315 to a threonine [39]. The mutated protein did not activate isoniazid because of the introduction of a steric hindrance in the access channel. Hence, it is very likely that this extension from the proximal side to the substrate channel guarantees efficient catalytic reaction of catalase–peroxidases via rapid diffusion through a channel to the heme in the active site, similarly to monofunctional catalases [40].

A third conserved sequence pattern is located nearer the C-termini of the investigated sequences. With a sequence similarity of 52%, it is above the average for all the sequences. Asp372 and Asp686 in HalomarCP (marked in Fig. 3 with an asterisk) are 100% conserved in all the peroxidases analysed. This invariant aspartate forms an important hydrogen bond with the proximal heme ligand (His259), facilitating the reactivity of the heme iron [8]. Site-directed mutagenesis was performed in this region in SyncyspCPn [38] and SacchceCCP [34], with a large effect on the reactivity of the engineered peroxidases. In contrast with other catalytically important regions, this essential aspartate remained conserved in all known C-domains of catalase–peroxidases, although its function in these domains is not apparent. In this third analysed region, residues in positions for which the function has not yet been determined are highly conserved (Fig. 3).

Fig. 3. Multiple sequence alignment of 50 selected representatives of the superfamily of bacterial, fungal and plant heme peroxidases: conserved region around the essential aspartate on the proximal heme side. Abbreviations of enzyme sources are defined in Table 1. Numbers indicate the position of each presented segment within the corresponding sequence. Sequences are grouped together as discussed in the text (i.e. catalase–peroxidases divided into two separate domains, CCPs, and APXs). Sequence similarity is graded from light grey (low similarity) to black (highest similarity). Functionally important residues involved in the catalytic mechanism are marked with an asterisk. This figure was constructed using GENEDOC [22].
Phylogenetic relationships
The consensus phylogenetic trees produced by the NJ and Fitch distance methods, as well as the reconstructed tree produced by the PUZZLE method, revealed four main clades of heme peroxidases belonging to class I of the superfamily of bacterial, fungal, and plant heme peroxidases. In Fig. 5 the simplified inferred FITCH tree is presented, and in Fig. 6 the same tree in a simplified form with an outgroup is shown. The NJ-reconstructed tree exhibited identical topology with slightly different branch lengths, and a very similar maximum likelihood tree was revealed by PUZZLE. The latter method with the '+G' option for rates of heterogeneity produced the parameter α = 1.96 estimated from the actual data set of 94 analysed peroxidase sequences. The bootstrap support in all main nodes is strong; only for some minor nodes, which resolve particular species within groups, is the support moderate. In the case of PUZZLE, the likelihood mapping analysis also revealed strong support of all main nodes. Hence, the four distinct clades can be understood as four diverse peroxidase families: ascomycetous CCPs; APXs; C-domains of catalase–peroxidases; and N-domains of catalase–peroxidases. This evolutionary branching also matches the reaction specificity of the corresponding enzymes, and, in the case of all known catalase–peroxidases, the two domains are fused together in one KatG. It is obvious that CCPs are closely related to APXs, and, although the active centre of catalase–peroxidases located exclusively in the N-domains of KatGs resembles the active centres of APX and CCP (Fig. 4), the N-domains are phylogenetically more closely related to the catalytically inactive C-domains of catalase–peroxidases.

Fig. 4. Structural comparison of four representatives of class I of the superfamily of bacterial, fungal, and plant peroxidases. (A) N-terminal domain and (B) C-terminal domain of catalase–peroxidase from Haloarcula marismortui (PDB code 1ITK); (C) S. cerevisiae CCP (PDB code 2CYP); (D) cytosolic APX from Pisum sativum (PDB code 1APX). All figures are in solid ribbon presentation. In (A), (C) and (D) the prosthetic heme group is presented in ball and stick presentation. Functionally important conserved residues discussed in the text and marked also in Fig. 1 are shown with their corresponding number in the amino acid sequence. Those on the distal side of the prosthetic heme group are coloured yellow and those on the proximal side blue. The large loop in catalase–peroxidases with an essential residue in the entrance of a substrate channel is coloured green in (A).

Fig. 5. Unrooted phylogenetic tree of 60 peroxidase genes. The inferred tree obtained with the Fitch method [24] is presented. This tree is essentially identical with the majority rule consensus tree obtained by the NJ method [23]. A very similar maximum likelihood tree was also obtained by PUZZLE [26]. Numbers represent the bootstrap values on the branches calculated for NJ/Fitch, respectively. The third value gives likelihood output from PUZZLE. The scale bar represents 10% of the estimated sequence divergence. Abbreviations of the species are identical with those used in Table 1. In the case of catalase–peroxidases, the N-terminal and C-terminal domains are analysed separately (giving rise to a total of 94 analysed sequences) due to the evident tandem gene-duplication event discussed in the text. Colour scheme for catalase–peroxidases: brown, Archaeons; cyan, Cyanobacteria; orange, Proteobacteria; magenta, Firmicutes; black, Actinobacteria; dark blue, Ascomycota.

The complete sequences are known for catalase–peroxidases from various prokaryotes, both eubacteria and archaea. The systematic analysis reveals that KatGs are distributed unequally among closely related genomes. Whereas in some complete genomes no KatG is present (as discussed below), some bacteria even contain two different ones (e.g. Mycobacterium fortuitum). This unequal distribution of KatGs can be attributed to lateral gene transfer [41] between otherwise phylogenetically unrelated micro-organisms. In the case of KatGs, it was first proposed to occur between archaea and eubacteria based on the analysis of three archaeal and 16 bacterial KatGs [42]. Later it was postulated that this phenomenon often occurs in all lineages of hydroperoxidases capable of catalytic reaction [43]. From the phylogenetic tree presented here (with 34 KatGs divided into the separate domains), it is obvious that archaeal and eubacterial KatGs are phylogenetically more closely related than the rest of their genomes. Moreover, several lateral gene transfer events are discernible in the phylogenetic tree in both branches of catalase–peroxidases (Fig. 5). Interestingly, the sequence of ArchfulCP segregated very early on from the remaining known KatGs. In this paper, I focus on the analysis of lateral gene transfer between KatGs of Firmicutes, Cyanobacteria, and Proteobacteria, which is also supported by high bootstrap values for both domains. Their positions on the branches indicate that KatGs from pathogenic proteobacteria are descendants of genes from Firmicutes and Cyanobacteria. There is also an obvious discrepancy between the rather high GC content of proteobacterial KatGs and the GC content of the whole organism (Table 2), supporting the hypothesis on the direction of the lateral gene transfer. Pathogenic and soil proteobacteria could profit from such a mode of lateral gene transfer by acquiring new genes that help them resist more efficiently the harmful effects of oxidative stress often caused by the host immune response or the environment. However, no KatG was found in the completed genome of Bacillus subtilis, indicating gene loss in some Firmicutes.
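The quantity compared in this argument, GC content, is simple to compute. The sketch below is only an illustration: the values in Table 2 were taken from a codon usage database rather than calculated this way, and the function name and toy sequences are hypothetical.

```python
def gc_content(dna):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in dna.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


# Hypothetical toy sequences, not real KatG or genome data.
print(gc_content("ATGC"))              # 50.0
print(round(gc_content("GGCCAT"), 1))  # 66.7
```

A gene whose GC content differs markedly from the average of its host genome is a commonly used, though not conclusive, indicator of relatively recent acquisition.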
Fig. 6. Simplified presentation of the inferred tree with the use of an outgroup (manganese peroxidase from Phanerochaete chrysosporium belonging to class II of this superfamily) presented to demonstrate the order of evolutionary events in class I of the peroxidase superfamily.
Table 2. Analysis of GC content in KatGs thought to be involved in lateral gene transfer between organisms. Values were obtained from the codon usage database at kazusa.or.jp. Abbreviations of enzymes are described in Table 1.
Type of peroxidase GC content of KatG (%) GC content of whole organism (%) Type of organism