heteromeric amino acid transporters rBAT and 4F2hc
within the GH13 α-amylase family
Marek Gabriško1 and Štefan Janeček1,2
1 Institute of Molecular Biology, Slovak Academy of Sciences, Bratislava, Slovakia
2 Department of Biotechnology, Faculty of Natural Sciences, University of SS Cyril and Methodius, Trnava, Slovakia
Introduction
To fulfil its metabolic needs, a cell uses specialized transport proteins to perform and control the uptake and efflux of crucial compounds (e.g. sugars, amino acids, nucleotides, inorganic ions and drugs) across the plasma membrane. These proteins have been classified into the phylogenetically derived solute carrier (SLC) families; the current classification counts almost 50 SLC families [1,2]. The sequence similarity between the heavy-chain subunits of heteromeric amino acid transporters (hcHATs) and the α-glucosidases from the α-amylase family [3] was first recognized more than 15 years ago [4]. HATs are proteins composed of a light subunit (SLC7 members) and a heavy subunit (known as rBAT or 4F2hc; SLC3
Keywords
4F2hc; evolutionary relatedness; oligo-1,6-glucosidase subfamily; rBAT; α-amylase family
Correspondence
Š. Janeček, Institute of Molecular Biology, Slovak Academy of Sciences, Dúbravská cesta 21, SK-84551 Bratislava, Slovakia
Fax: +421 2 59307416
Tel: +421 2 59307420
E-mail: Stefan.Janecek@savba.sk
(Received 15 July 2009, revised
18 September 2009, accepted 12 October
2009)
doi:10.1111/j.1742-4658.2009.07434.x
In an effort to shed more light on the early evolutionary history of the heavy-chain subunits of heteromeric amino acid transporters (hcHATs) rBAT and 4F2hc within the α-amylase family GH13, a bioinformatics study was undertaken. The focus of the study was on a detailed sequence comparison of rBAT and 4F2hc proteins from as wide a taxonomic spectrum as possible and enzyme specificities from the α-amylase family. The GH13 enzymes were selected from the so-called GH13 oligo-1,6-glucosidase and neopullulanase subfamilies that represent the α-amylase family enzyme groups most closely related to hcHATs. Within this study, more than 30 hcHAT-like proteins, designated here as hcHAT1 and hcHAT2 groups, were identified in basal Metazoa. Of the GH13 catalytic triad, only the catalytic nucleophile (aspartic acid 199 of the oligo-1,6-glucosidase) could have its counterpart in some 4F2hc proteins, whereas most rBATs contain the correspondences for the entire GH13 catalytic triad. Moreover, the 4F2hc proteins lack not only domain B typical for GH13 enzymes, but also a stretch of 40 amino acid residues succeeding the β4-strand of the catalytic TIM barrel. rBATs have the entire domain B as well as a longer loop 4. The higher sequence–structural similarity between rBATs and GH13 enzymes was reflected in the evolutionary tree. At present it is necessary to consider two different scenarios on how the chordate rBAT and 4F2hc proteins might have evolved. The GH13-like protein from the cnidarian Nematostella vectensis might nowadays represent a protein close to the eventual ancestor of the hcHAT proteins within the GH13 family.
Abbreviations
ATG, amino acid transporter glycoprotein; CSR, conserved sequence regions; GH, glycoside hydrolase; HAT, heteromeric amino acid transporter; hcHAT, heavy-chain subunits of heteromeric amino acid transporter; SLC, solute carrier.
members), connected by a disulfide bridge [2]. Because of their significance in human pathology (their defects lead to primary inherited aminoacidurias, e.g. failed renal reabsorption of amino acids), HATs have attracted much attention in medical studies (e.g. [2,5–7]). The light subunit is a nonglycosylated hydrophobic 12-helix transmembrane protein, whereas the heavy subunit is a type II membrane N-glycoprotein with an intracellular N-terminal end, a single transmembrane region and a large extracellular C-terminal domain [2]. It is the light subunit that possesses the amino acid transport activity, although without interacting with the heavy subunit it is unable to reach the plasma membrane. Thus, the role of the heavy subunit is to recognize the light subunit and to chaperone it to a proper position in the plasma membrane, i.e. this subunit is not absolutely necessary for the transport activity [8], but interestingly its C-terminal extracellular domain exhibits sequence similarities to the α-amylase family enzymes [4,9].
The α-amylase family [3] forms, in the sequence-based classification of glycoside hydrolases (GHs), the GH-H clan [10], consisting of three GH families: GH13, GH70 and GH77. These enzymes (about 30 different EC numbers) should satisfy the following requirements: (a) the catalytic domain is formed by the (β/α)8-barrel fold (i.e. TIM barrel) with a small distinct domain B protruding out from the barrel between the β3-strand and the α3-helix; (b) the catalytic machinery consists of the β4-strand aspartate (nucleophile), β5-strand glutamate (proton donor) and β7-strand aspartate (transition-state stabilizer); (c) the enzymes employ a retaining reaction mechanism; and (d) the sequences contain between four and seven conserved sequence regions (CSRs) covering mainly the β-strands of the catalytic TIM barrel [3,11–13]. Of the three GH families of the clan GH-H, it was the family GH13 that was originally established as the α-amylase family [14–17]. At present it is one of the largest families in the entire classification of GHs [10]. Although the overall sequence identity within GH13 is extremely low [13], it contains several groups of enzymes exhibiting a higher degree of mutual sequence similarity, so that the family has recently been divided into subfamilies [18]. Of these, the best resemblance to hcHATs was revealed for the members of the so-called oligo-1,6-glucosidase subfamily [9,19,20]. This was recently confirmed by solving the three-dimensional structure of the C-terminal domain of 4F2hc [21], which most resembles the oligo-1,6-glucosidase from Bacillus cereus [22] and the α-glucosidase from Geobacillus sp. HTA-462 [23].
In hcHATs, the regions of similarity cover the sequence segments within the C-terminal extracellular domain. The segments correspond, in fact, with some of the α-amylase family CSRs, namely the β-strands β2, β3, β4 and β8 of the (β/α)8-barrel domain, and for rBAT also with the short stretch near the C-terminus of domain B [9,19]. From the sequence–structure point of view, the basic difference between rBAT and 4F2hc is that rBAT possesses the segment that corresponds with domain B of GH13 enzymes, whereas 4F2hc does not have it [9,19,24,25].
The main goal of the present study was to investigate further the resemblance between hcHAT proteins and the enzymes from the α-amylase family. We therefore carried out a bioinformatics study focused on a detailed comparison of all available rBAT and 4F2hc sequences with GH13 enzyme representatives covering mainly the oligo-1,6-glucosidase subfamily. This could help to elucidate the origin of the hcHAT proteins within the GH13 α-amylase enzyme family, as well as shed some light on the possible evolutionary events leading to separation of the heavy-chain subunit of these amino acid transporters from the enzymes involved in the metabolism of starch and related saccharides.
Results and Discussion
Evolutionary relationships and sequence–structural comparison
This study delivers the in silico analysis of 134 sequences consisting of 92 hcHAT proteins (representing known rBATs and 4F2hc proteins as well as their newly identified putative homologues) and 42 GH13 enzymes (including four GH13-like sequences) (Table 1). Their global multiple sequence alignment (not shown) covers: (a) the N-terminal region, transmembrane segment, central TIM barrel domain, including domain B and the C-terminal domain C for rBAT proteins (669 residues on average); (b) the catalytic TIM barrel domain, including domain B and the C-terminal domain C for GH13 enzymes (572 residues on average); and (c) the N-terminal region, transmembrane segment, central TIM barrel domain and the C-terminal domain C for 4F2hc proteins (542 residues on average). The length of the entire amino acid sequence alignment was 1099 positions, but it should be taken into account that, if the gaps are excluded, the overall number of comparable positions would be < 100.
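The drop from 1099 alignment positions to fewer than 100 comparable ones follows from discarding every column in which at least one sequence has a gap. A minimal Python sketch of that count is given below; the function name and the toy alignment are hypothetical illustrations and are not taken from the study, which does not state how the figure was obtained.

```python
def gap_free_columns(alignment, gap_chars="-."):
    """Return the indices of alignment columns in which no sequence has a gap."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        col
        for col in range(length)
        if all(seq[col] not in gap_chars for seq in alignment)
    ]


# Hypothetical toy alignment (invented for illustration only)
toy_alignment = [
    "MK--DLNHE",
    "MKAADLN-E",
    "-KA-DLNHE",
]

columns = gap_free_columns(toy_alignment)
print(f"{len(columns)} of {len(toy_alignment[0])} columns are gap-free")
```

With many divergent sequences, a single gapped sequence is enough to exclude a column, which is why complete deletion shrinks a long alignment so strongly.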
Figure 1 illustrates the evolutionary relationships between the studied hcHAT proteins and GH13 enzymes from the α-amylase family. The tree was calculated using the neighbour-joining method [26]. Other approaches, such as maximum likelihood [27], maximum parsimony [28], minimum evolution [29] and UPGMA [30], were also used, but they delivered basically similar topologies (not shown).
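For readers unfamiliar with neighbour joining, the sketch below shows only its first agglomeration step: computing the Q criterion from a distance matrix, choosing the pair to join, and assigning the two branch lengths. It is an illustrative Python sketch, not the software used in the study; the function name and the five-taxon distance matrix are hypothetical.

```python
def nj_first_join(dist):
    """Select the first pair joined by neighbour joining.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Returns ((i, j), (length_i, length_j)), where the lengths are the
    branch lengths from taxa i and j to the new internal node.
    """
    n = len(dist)
    if n < 3:
        raise ValueError("neighbour joining needs at least three taxa")
    row_sums = [sum(row) for row in dist]

    best_pair, best_q = None, None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: small values mark pairs that are close to each
            # other and far from all remaining taxa
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q

    i, j = best_pair
    length_i = 0.5 * dist[i][j] + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    length_j = dist[i][j] - length_i
    return best_pair, (length_i, length_j)


# Hypothetical distances between five taxa (not data from the study)
toy_distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair, lengths = nj_first_join(toy_distances)
print(pair, lengths)
```

A full tree is obtained by replacing the joined pair with the new node, recomputing the distances to it and repeating the step until three nodes remain.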
The two main groups of hcHATs (Fig. 1), i.e. those of rBAT and 4F2hc, form their own clusters within which taxonomy is respected: (a) for the rBATs from human via representatives of mammals, birds, lizard, frogs and fishes to Urochordata (sea squirts) and Cephalochordata (lancelet); and (b) for the 4F2hc proteins from human via mammals, perhaps omitting birds (as it is not found in chicken and zebra finch), lizard, frogs, fishes and platypus to Petromyzon (sea lamprey), Urochordata (sea squirts) and even Ixodes (tick). What is also clear is the grouping of the GH13 enzymes, which cover: (a) the representatives of the individual enzyme specificities from both the oligo-1,6-glucosidase and neopullulanase GH13 subfamilies [20]; and (b) the additional GH13 α-glucosidases from fungi (yeast), insects and thermophilic (and soil) bacteria. It is worth mentioning that the fungal (yeast) α-glucosidases are clustered with their counterparts from bacilli and the closely related specificities, such as oligo-1,6-glucosidase, dextran glucosidase, trehalose-6-phosphate hydrolase and isomaltulose synthase, whereas the representatives of trehalose synthase, amylosucrase and sucrose phosphorylase share the branch leading also to members of the neopullulanase GH13 subfamily together with the intermediary enzymes (Fig. 1). The overall arrangement of the tree is that the clusters of true rBAT and 4F2hc proteins are separated from each other by the GH13 enzymes.
All remaining sequences (except those from nematodes) that could not be classified as true rBAT or true 4F2hc proteins were first designated as hcHAT-like proteins. Then, based on an approximate alignment, which served to construct a preliminary evolutionary tree, these hcHAT-like proteins were divided into hcHAT1 and hcHAT2 groups (Table 1). It is worth mentioning that most of them are hypothetical proteins that in some cases were retrieved from recent complete genome sequencing projects containing raw sequence data still without appropriate annotation. Most hcHAT1 proteins cover the insects and, in a wider sense, the Arthropoda (Daphnia), which are completed by Cephalochordata and Echinodermata (both Deuterostomia) and one representative from Cnidaria (Nematostella). The group of hcHAT2 proteins also consists of Arthropoda, i.e. insects accompanied by Daphnia and Ixodes, and two representatives of schistosomes. Interestingly, although present in the subgenus Drosophila, hcHAT2 proteins seem to be lacking in the melanogaster group (subgenus Sophophora).
With regard to hcHAT1 from Aedes aegypti [31] and Drosophila melanogaster [32], these two proteins have already been experimentally confirmed as heavy-chain subunits (CD98hc, i.e. 4F2hc) in the amino acid transporter system analogous to that known in mammals [2,21]. A similar observation was reported for the SPRM1hc from Schistosoma mansoni [33], which in the present study is classified in the hcHAT2 group (Table 1). Obviously, although the hcHAT1 and hcHAT2 groups remain independent of each other, both seem to be more closely related to typical 4F2hc proteins than to rBATs (Fig. 1).
Concerning the above-mentioned hcHAT sequences from nematodes, these proteins from Caenorhabditis elegans [34] have been named amino acid transporter glycoproteins (ATG). Of the two groups, ATG1 and ATG2 (Table 1), the relevant light chains exhibited the transporter function only when combined with ATG2 [34]. From the evolutionary tree (Fig. 1), both ATG clusters (ATG1 and ATG2) from all studied nematodes could represent a counterpart group to hcHAT2 proteins.
As far as the sequence similarities and differences between the hcHAT proteins and GH13 enzymes are concerned, the basic feature discriminating the 4F2hc proteins from both rBATs and GH13 enzymes is the lack of domain B protruding out of the TIM barrel in the place of loop 3 connecting the β3-strand to the α3-helix [9,21]. The sharing of domain B by rBATs and GH13 enzymes, and especially the sequence of the fifth CSR (QPDLN for both human rBAT and Bacillus cereus oligo-1,6-glucosidase) [20] (Fig. 2), may indicate a shorter evolutionary distance for rBATs from the GH13 ancestor common for both rBAT and 4F2hc proteins. Complete domain B with well-conserved β-strands is also present in hcHAT1 proteins. In all other groups, this domain is more or less distorted, culminating in complete loss in 4F2hc proteins. The presence of full GH13 domain B in hcHAT1 and the absence of its parts in hcHAT2 indicate the eventual intermediary or primordial character of both hcHAT1 and hcHAT2 with regard to the appearance of typical rBAT and typical 4F2hc proteins in animals. This seems to be obvious, according to our present knowledge, from Urochordata (Fig. 1).
The second sequence feature clearly visible from the alignment is whether the individual catalytic residues, or even the entire catalytic triad of the GH13 α-amylase family, could be found in the hcHAT representatives. Fort et al. [21] reported that the human 4F2hc does not exhibit any α-glucosidase activity. This is consistent with an almost complete lack of the catalytic triad in all 4F2hc proteins (Fig. 2). It is worth mentioning that, especially in higher animals (mammals and also in frogs and fishes), an aspartate (aspartic acid 248 in human 4F2hc; aspartic acid 380 in Fig. 1 as both the N-terminal and transmembrane segments are involved) could be a relic of the GH13 β4-strand catalytic nucleophile [3,11–13], although shifted one position to the C-terminus (Fig. 2). On the other hand, most rBAT representatives contain all three catalytic residues (Fig. 2) with the exception of those from birds, lizards and frogs (lacking both essential aspartates at the β4- and β7-strands) and also from some fishes (lacking the β4-strand aspartate). This may mean that the eventuality of α-glucosidase activity of true rBATs cannot be unambiguously eliminated.
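The presence or absence of the catalytic triad described above amounts to reading three alignment columns (β4 aspartate, β5 glutamate, β7 aspartate) in each sequence. The following Python sketch illustrates such a check under stated assumptions: the sequence fragments, their names and the column indices are all hypothetical and do not reproduce the alignment of Fig. 2.

```python
# Expected residues: β4 nucleophile, β5 proton donor, β7 transition-state stabilizer
TRIAD_RESIDUES = ("D", "E", "D")
TRIAD_ROLES = ("nucleophile", "proton_donor", "stabilizer")


def triad_status(aligned_seq, triad_columns):
    """Report which triad positions of one aligned sequence hold the expected residue."""
    return {
        role: aligned_seq[col] == expected
        for role, col, expected in zip(TRIAD_ROLES, triad_columns, TRIAD_RESIDUES)
    }


# Hypothetical aligned fragments and columns, invented for illustration
toy_sequences = {
    "enzyme_like": "GFRIDAVKHEGWNHD",
    "transporter_like": "GFRVNAVKHAGWNHS",
}
toy_columns = (4, 9, 14)

for name, seq in toy_sequences.items():
    status = triad_status(seq, toy_columns)
    print(name, "complete triad" if all(status.values()) else status)
```

A strict column check of this kind would not report a residue displaced by one position, such as the shifted aspartate noted for 4F2hc, so neighbouring columns would need to be inspected separately.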
Fig. 1. Evolutionary tree of the hcHAT proteins and the GH13 α-amylase family members. The tree is based on the alignment of complete sequences and calculated including gaps. The numbers represent the bootstrap values. The individual proteins and enzymes are abbreviated as follows (see also Table 1): rBAT, true rBAT proteins; 4F2, true 4F2hc proteins; ATG1 and ATG2, ATGs from nematodes; hcHAT1 and hcHAT2, hcHAT-like proteins covering basal metazoans and arthropods; GH13, GH13-like proteins or enzymes; OGLU, oligo-1,6-glucosidase; AGLU, α-glucosidase; DGLU, dextran glucosidase; T6PH, trehalose-6-phosphate hydrolase; ASU, amylosucrase; SPH, sucrose phosphorylase; IMSY, isomaltulose synthase; TSY, trehalose synthase; CMD, cyclomaltodextrinase; MGA, maltogenic amylase; NPU, neopullulanase; INT, intermediary group between oligo-1,6-glucosidase and neopullulanase subfamilies.
The selected CSRs (Fig. 2) characteristic of the α-amylase enzyme family GH13 [13] illustrate the additional sequence features conserved mutually between
the hcHAT and hcHAT-like proteins and GH13 enzymes, as well as within the individual groups of hcHAT representatives, i.e. rBAT, 4F2hc, hcHAT1, hcHAT2 and ATG groups (Table 1). Overall, and interestingly, the residues that have not yet been revealed to be essential for the GH13 enzymes seem to be well conserved, e.g. (a) a stretch of three hydrophobic aliphatic residues (207_LII in human rBAT) preceding the important aspartate (aspartic acid 98 in oligo-1,6-glucosidase) in region I covering the β3-strand; (b) a segment of up to five residues (307_GVDGF in human rBAT) preceding the functional arginine (arginine 197 in oligo-1,6-glucosidase) in region II of the β4-strand; and (c) more or less the entire region VII, i.e. the β8-strand. The fact that rBATs exhibit more sequence similarities with the GH13 enzymes than the 4F2hc proteins is also clearly visible in the selected CSRs (Fig. 2). It concerns mainly: (a) tryptophan (tryptophan 161 in human rBAT) in region VI (β2-strand); (b) histidine (histidine 215) at the end of region I (β3-strand); (c) the entire region V in loop 3 (i.e. domain B) being 282_QPDLN in human rBAT; and (d) conservation of the catalytic residues (often the entire catalytic triad). Some of these features can be traced in the sequences of hcHAT1 and hcHAT2 groups as well as of the ATG proteins (Fig. 2), indicating evolutionary relationships of all these enzymes and proteins and hinting at their eventual evolutionary histories. It is worth mentioning that to understand the common evolutionary history of hcHAT proteins and GH13 enzymes it is necessary to re-evaluate the CSR VII covering the β8-strand [13,20], as this segment – obviously without the GH13 functionally important residues – belongs to their best conserved shared sequence parts (Fig. 2). It is also of importance to note that if the CSRs (Fig. 2) serve to calculate the evolutionary tree (not shown), all hcHAT1 proteins (covering basal metazoans and arthropods) and both ATG groups (ATG1 and ATG2 from nematodes; Table 1) cluster together with rBAT proteins and GH13 enzymes (although with low bootstrap values), whereas the entire hcHAT2 group shares the branch with the 4F2hc proteins.
As no α-glucosidase activity was detected for the human 4F2hc [21], reflecting that only the catalytic nucleophile (aspartic acid 380; Fig. 2) may be preserved, it was of interest to identify the CSRs covering the GH13 functionally important residues in hcHATs. Of all of them (Fig. 2), CSR III (β5-strand with the glutamate acting as a proton donor) is not easily identifiable, even for the enzymatically active GH13 members [13]. Therefore, one of the goals was to align correctly the β5-strands of the hcHAT sequences, which was especially problematic for the 4F2hc proteins completely lacking the catalytic glutamate (Fig. 2). In this regard, the putative GH13-like sequence from the cnidarian Nematostella vectensis containing the β5-strand segment 273_RLLIGE (Fig. 2) should be of special importance from an evolutionary point of view, as it contains the features of both the GH13 enzymes (i.e. the glutamic acid residue in a corresponding position) and typical 4F2hc proteins (i.e. arginine or lysine followed by the stretch of three aliphatic hydrophobic residues, e.g. 405_RLLIAG in human 4F2hc; Fig. 2). This segment preceding the catalytic β5-strand glutamate is also conserved in most insect α-glucosidases, supporting the possibility that the ancestry of the hcHAT proteins within the GH13 α-amylase enzyme family could be rooted in basal metazoans, currently represented by Nematostella vectensis.
A comparison of the three-dimensional structures of representatives of hcHATs (human 4F2hc, 417 residues [21] and a model of the human rBAT with 535 residues) and GH13 enzymes (Geobacillus sp. HTA-462 α-glucosidase; 531 residues [23]) confirmed the expected higher similarity between rBAT proteins and GH13 enzymes (root-mean-square deviation 1.62 Å between the Cα atoms of 436 corresponding residues) than between 4F2hc proteins and GH13 enzymes (1.67 Å for 293 Cα atoms) as well as rBATs and 4F2hc proteins mutually (1.80 Å for 271 Cα atoms). However, what could be more interesting is the observation that human 4F2hc lacks not only domain B, but also a stretch of 40 amino acid residues succeeding the β4-strand (not shown). The human 4F2hc thus possesses a very short loop 4 connecting the β4-strand to the α4-helix, in contrast to what is seen in both the Geobacillus α-glucosidase and the human rBAT protein (having the entire domain B). Regardless of whether domain B in the GH13 oligo-1,6-glucosidase subfamily members (and also in rBATs) operates in conjunction with the prolonged loop 4, it seems that the consecutive loss of domain B in 4F2hc proteins is connected with adequate shortening of loop 4, as the observation can be generalized to all 4F2hc proteins. Note that the GH13 neopullulanase subfamily members [20], possessing a shorter domain B [9,35–37], also lack the longer excursion of the loop 4 segment.
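The root-mean-square deviations quoted above are computed over paired Cα atoms after the two structures have been superimposed. The Python sketch below shows only the final RMSD formula and assumes that the superposition and the residue pairing have already been done; the function name and the coordinates are hypothetical and unrelated to the structures compared in the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the points are already paired residue by residue and that the
    two structures have already been superimposed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Cα coordinates in Å (invented for illustration)
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

print(rmsd(structure_a, structure_b))
```

Because the value depends on how many atom pairs are included, the three figures in the text (436, 293 and 271 Cα atoms) are best read together with their pair counts rather than compared as bare numbers.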
Selection pressure
With regard to the close sequence similarity between the GH13 enzymes and the hcHAT proteins (especially rBATs), it is interesting to compare the selection pressure acting on corresponding stretches of amino acid sequences. For this purpose, the SELECTON tool [38] was chosen. Figure 3 illustrates the similarities and differences in selection pressure acting on the three studied protein groups – mammalian 4F2hc proteins, vertebrate rBATs and insect α-glucosidases. In agreement with the higher degree of sequence similarity between rBAT proteins and GH13 enzymes, the selection pressure was also found to be more similar for these two groups than that observed for 4F2hc and rBAT proteins, as well as for 4F2hc proteins and GH13 enzymes (Fig. 3). Remarkably, there are a few segments, namely those at or around the β2-, β3- and β8-strands (CSRs VI, I and VII, respectively), that exhibit similar selection pressure for all three groups, i.e. rBAT, 4F2hc and α-glucosidases. This indicates that the residues from the above-mentioned segments of both rBAT and 4F2hc proteins, sharing the value of selection pressure with their counterparts from α-glucosidases, may also share their functions. Although for the β3-strand at least the histidine (histidine 103 in Bacillus cereus oligo-1,6-glucosidase) is known to be involved in the active site of GH13 enzymes [3,11,22], no functional role has been assigned to any residue from either the β2- or the β8-strand. The results shown here (Fig. 3) could therefore mean that they contribute to the overall structural integrity of the TIM barrel domain. Concerning the GH13 catalytic triad, it is worth mentioning that in spite of their presence in rBATs, their positions (especially for the β4 catalytic nucleophile and β7 transition-state stabilizer) are selection neutral in contrast to strict purifying selection observed here for α-glucosidases (Fig. 3).
Eventual evolutionary scenarios
This study has delivered not only evolutionary relationships (Fig. 1) based on a detailed sequence comparison of all currently available sequences of rBAT, 4F2hc and hcHAT-like proteins with their GH13 enzymatic counterparts (Table 1), but it has also tried to trace the ancestry of hcHAT proteins within the GH13 α-amylase family. In fact, two different evolutionary scenarios could be taken into account: (a) in one single event in basal Metazoa and a subsequent split into rBAT and 4F2hc (probably via hcHAT1 group) in
chordates; and (b) in two independent branching events, i.e. 4F2hc in the basal Metazoa via HAT-like proteins and rBAT directly from enzymes in deuterostomes. It is worth mentioning here that both scenarios reflect the ancestry of both rBATs and 4F2hc proteins anchored within the GH13 α-amylase family. The difference is only in the way leading from the GH13 enzymes either to rBAT and 4F2hc together or to rBAT and 4F2hc separately. At present it is not possible to draw the evolutionary picture unambiguously.
The first evolutionary scenario, basically consistent with the one proposed originally [9], means that in basal Metazoa an ancestor of both the present-day 4F2hc and rBAT proteins was separated from the GH13 enzymes. The ancestor acquired the N-terminal and transmembrane segments and, eventually (in most taxa), duplicated and evolved to give in chordates: (a) rBATs that have kept most of the GH13 sequence–structural features, including domain B as well as catalytic residues (often the entire catalytic triad); and (b) 4F2hc that has consecutively lost almost all of the GH13 characteristic sequence–structural features, including domain B as well as functional residues (mainly the catalytic triad). The weak points of this scenario are: (a) the striking similarity between rBATs and GH13 enzymes; (b) the higher similarity between 4F2hc and hcHAT-like proteins than between 4F2hc and rBATs; and (c) the seeming absence of rBAT ancestors in nematodes and arthropods (Fig. 1).
The other completely different scenario that would seemingly obey the observation of a generally higher degree of sequence–structural similarity between rBATs and GH13 enzymes than between 4F2hc proteins and GH13 enzymes would assume the independent evolution of rBATs and 4F2hc proteins. This eventuality would leave both hcHAT1 and hcHAT2 groups in the history leading to the 4F2hc proteins. The problems in this scenario would be: (a) the independent acquisition of both the N-terminal segment and
the transmembrane region in rBAT and 4F2hc proteins, which should appear more parsimoniously only once; and (b) the gain of the analogous function.
Because the family GH13 enzymes are spread throughout the whole taxonomic spectrum from prokaryotes to eukaryotes and are therefore more ancient than the hcHATs (present only in Metazoa), there is only one possible place for rooting the tree, that is, on the branch leading to the enzymes originating from non-Metazoa (the eventual outgroup).
Fig. 2. The CSRs of the hcHAT proteins and the GH13 α-amylase family members. A list of the abbreviations of proteins and enzymes can be found in Fig. 1. The segments covering the strands β2, β3, loop 3 (near the C-terminus of domain B connecting the β3-strand and helix 3), β4, β5, β7 and β8 represent the individual CSRs of the α-amylase family [13]. The positions corresponding with the GH13 catalytic triad are boxed. The individual selected residues are highlighted as follows: aspartate and glutamate – red; glycine and proline – black; valine, leucine and isoleucine – grey; phenylalanine and tyrosine – blue; tryptophan – magenta; histidine – cyan; arginine and lysine – green; cysteine – yellow.
It is worth mentioning, however, that if the evolutionary tree of all proteins studied here is based on the alignment of CSRs (Fig. 2), the ATG proteins from nematodes [34] and all hcHAT-like proteins designated here as hcHAT1 group (Table 1), i.e. hcHAT-like proteins covering the basal Metazoa and Arthropoda, cluster together with both the rBAT proteins and GH13 enzymes, leaving the 4F2hc proteins with the hcHAT2 group at a different branch (tree not shown). It should be pointed out that despite the fact that the GH13 CSRs could be considered to be something like sequence fingerprints of the GH13 α-amylase family members [13], the tree based on the CSRs is supported by low bootstrap values. It is thus not possible to say which one, hcHAT1 or hcHAT2, is orthologous to rBAT or 4F2hc, if any. Although both hcHAT1 (insect) and hcHAT2 (Schistosoma) representatives have already been shown to function rather as 4F2hc than as rBAT [31–33], their rBAT-like role has not as yet been investigated. However, as seen in Fig. 1 (the tree based on the complete alignment), the hcHAT2 group (Arthropoda) clusters with both ATG1 and ATG2 (Nematoda), indicating that the hcHAT2 and ATG proteins are orthologues. Because hcHAT2/ATG are present only in Arthropoda and Nematoda, they probably came from one hcHAT protein (i.e. hcHAT1; cf. Fig. 1) originating from a common ancestor of Ecdysozoa. However, it should be stressed that hcHAT2 proteins (except for those from Schistosoma [33]) were first identified in this study, so further research on their function, and on identifying the light subunit to which they bind, could throw more light on the relationships between various hcHAT proteins. Finally, it should be taken into account that the α-amylase family GH13 is one of the largest GH families, covering several tens of specificities and several thousand sequences [3,13,18], where, for example, it is still complicated to trace clearly the evolutionary history, even just for the animal α-amylase [39].
Conclusions
The examples of a close evolutionary relatedness between the TIM barrel enzymes and their counterparts without the catalytic function are not so exceptional. For example, in the family GH18 chitinases, several plant proteins, such as narbonin [40] and concanavalin B [41], have been recognized to be former chitinases that have lost their catalytic residues. Even in the GH13 α-amylase family, an enzymatically inactive remote paralogous Amyrel (amylase-related) gene
Fig. 3. Selection pressure acting on rBAT and 4F2hc proteins and GH13 insect α-glucosidases (AGLU). Yellow highlighting (1 and 2) indicates a positive selection, whereas red highlighting (4–7) indicates a purifying selection. The sequences used for the SELECTON analysis [38] are marked by an asterisk in Table 1. The individual CSRs of the GH13 α-amylase family [13] are boxed; the GH13 catalytic residues are indicated by small yellow boxes. The individual structural parts of the proteins, i.e. the N-terminal and the transmembrane segments, domain A (TIM barrel), domain B and domain C, are indicated by green, yellow, blue and grey shadowing, respectively.