CRYSTAL STRUCTURE OF AMYB, AN
α-AMYLASE FROM HALOTHERMOTHRIX ORENII,
AND COMPARISON WITH ITS HOMOLOGS
Tien Chye Tan
Submitted 14 April 2007
A THESIS SUBMITTED FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
DEPARTMENT OF BIOLOGICAL SCIENCES
NATIONAL UNIVERSITY OF SINGAPORE
2007
PUBLICATIONS
Paper I
Tan, T.C., Yien, Y.Y., Patel, B.K., Mijts, B.N., and Swaminathan, K. (2003)
Crystallization of a novel alpha-amylase, AmyB, from the thermophilic halophile
Halothermothrix orenii.
Acta Crystallographica sect D 59, 2257-2258.
Paper II
Huynh, F., Tan, T.C., Swaminathan, K., and Patel, B.K. (2005)
Expression, purification and preliminary crystallographic analysis of sucrose
phosphate synthase (SPS) from Halothermothrix orenii.
Acta Crystallographica sect F 61, 116-117.
ACKNOWLEDGEMENTS
In my list of acknowledgements, there are people with specific contributions and also people who have helped me in many small but important ways that are impossible to list. The following list is by no means complete, nor are the contributions of those listed limited to what is listed. It is just a weak attempt to thank everyone for their help and kindness shown to me during my time here.
Dr Victor Wong and Dr Kunchithapadam Swaminathan for their patience and guidance. Dr Wong for getting me started in research and for believing in me. Dr Swami for burning the midnight oil to help me get my thesis out on time.
My collaborator Dr Bharat Patel for spending hours educating me on his field. Dr Jayaraman Sivaraman for putting up with my endless list of questions.
Thanks also go out to my ‘editor’ Maykalavaane d/o Narayanan for helping make sense of my disjointed thoughts and my dyslexic spelling.
My lab mates, members of the Structural Biology Lab, for the mayhem and chaos that spiced up our stay here. Not forgetting the department staff, who have definitely helped a lot in their own way.
Dr Christina Divne for supervision on additional experimental results presented in the revised version of the thesis, and advice on revision of the thesis. And for pulling my brain out of the gutter and getting it to stay focused.
Prof Birte Svensson and Dr Karen Marie Jakobsen at BioCentrum-DTU (Lyngby, Denmark) for supplying the acarbose inhibitor, which was put to good use.
Of course there is my mom, but that would take up another thesis on its own, so I shall just say “THANKS MOM”.
Table of Contents
1.2.2 Carbohydrate-active enzymes are classified in CAZy
1.2.3 Reaction mechanisms of glycoside hydrolases
1.2.5 The TIM barrel is a recurrent fold in GH enzymes
1.4.2 Halothermothrix orenii produces two α-amylases
1.4.3 Biochemical characteristics of AmyA and AmyB
1.4.4 Thermal inactivation studies on AmyA and AmyB
1.4.6 Content of charged amino acids in AmyA and AmyB
2.1.4 Preparation of the expression vector
2.3.2 Optimization of crystallization conditions and ligand soaks
2.3.4 Structure determination and refinement
3 RESULTS
3.2.3 Starch degradation studies on AmyB and ΔAmyB
3.2.4 Starch binding studies on AmyB and ΔAmyB
3.2.5 Stability analysis of AmyB and ΔAmyB
3.5 Structure determination, model building and refinement
3.6.2 Domains A and B – the catalytic module
3.7.1 Binding of acarbose-derived oligosaccharide
3.7.2 Binding of maltoheptaose and α-cyclodextrin
4 DISCUSSION
4.1.1 Natural habitat of Halothermothrix orenii
4.1.2 Possible natural substrates for H. orenii AmyB
4.2.1 Influence of negatively charged surfaces
4.2.4 Thermal stability as a function of salt concentration and pH
4.3 AmyB represents a unique member of the α-amylase family
4.3.3 AmyB is unique compared with AmyA and other α-amylases
Paper I
Paper II
LIST OF TABLES
Table 1.1 Categories of halophiles
Table 1.2 Clans and folds of glycoside hydrolases
Table 1.3 CBM fold families
Table 1.4 Types of carbohydrate binding platforms
Table 1.5 Characteristics of exoamylases
Table 3.1 Statistics for data collection
Table 3.2 Statistics for crystallographic refinement
Table 3.3 Interface parameter analysis for domain A/N association
Table 3.4 Mapping of sugar residues of acarbose-derived oligosaccharides to the active-site subsites of the α-amylases BA2, AmyB and BHA
Table 3.5 Interactions with a nonasaccharide in the A-B groove of AmyB ACR
Table 3.6 Interactions with acarbose in the N-C groove of AmyB ACR
Table 3.7 Interactions with α-D-glucose in the B1 and B2 sites of AmyB ACR
Table 3.8 Interactions with maltotetraose in the A-B groove of AmyB MAL7-ACX
Table 3.9 Interactions with maltotetraose in the N-C groove of AmyB MAL7-ACX
LIST OF FIGURES
Figure 1.1 Chair representation of cellobiose
Figure 1.2 Reaction mechanisms for glycoside hydrolases
Figure 1.3 Active-site topologies of glycoside hydrolases
Figure 1.4 GH clans with the TIM-barrel fold
Figure 1.5 Sugar-binding platforms in CBMs
Figure 1.6 Structure of starch components
Figure 1.7 Helical structure of V- and A-amylose
Figure 1.8 Enzymes involved in starch processing
Figure 1.9 The domain-organization of α-amylases
Figure 1.10 The active site in Bacillus circulans strain 251 CGTase
Figure 1.11 The structure of a lipoprotein secretion-signal peptide
Figure 2.1 Schematic representation of AmyB constructs
Figure 2.2 Schematic representation of the amyB-containing pTHAB template
Figure 3.1 Lipoprotein signal peptide in AmyB
Figure 3.2 Analysis of protein purity by SDS-PAGE
Figure 3.3 Gel-filtration chromatogram for AmyB (B2) and ΔAmyB (B3)
Figure 3.4 Analysis of protein purity by SDS-PAGE
Figure 3.5 Rate of starch degradation by AmyB and ΔAmyB
Figure 3.6 Binding of AmyB and ΔAmyB to raw starch as a function of [NaCl]
Figure 3.7 Tm values for AmyB and ΔAmyB as a function of NaCl concentration at different pH values
Figure 3.8 Tm values for AmyB and ΔAmyB as a function of pH at different NaCl concentrations
Figure 3.9 Morphology of slow-growing, non-optimized AmyB crystal forms I-III
Figure 3.10 Morphology of AmyB crystal form IV
Figure 3.11 Diffraction pattern of the AmyBACR crystal
Figure 3.12 Crystal packing in the C2 unit cell of the AmyB-III crystal form
Figure 3.13 Ramachandran analysis of the AmyB models
Figure 3.14 Representative electron density for AmyB models
Figure 3.15 Overall structure of AmyB
Figure 3.16 Topology diagram for the catalytic A/B domain in H. orenii AmyB
Figure 3.17 Topology diagram of the AmyB-C domain
Figure 3.18 Overall fold of the AmyB-N domain
Figure 3.19 Structural superposition of H. orenii AmyB-N with P. syringae CopC
Figure 3.20 Domains that are topographically similar to the AmyB-N domain
Figure 3.21 The location of the N domain in α-amylases
Figure 3.22 Chair configuration of acarbose
Figure 3.23 Electron density around the –2 subsite in the A-B groove
Figure 3.24 Binding of the nonasaccharide in the A-B groove of AmyB
Figure 3.25 Binding of acarbose in the N-C groove of AmyB
Figure 3.26 Picture showing the positions of carbohydrate bound to AmyBACR
Figure 3.27 Picture showing the positions of carbohydrate bound to AmyBMAL7-ACX
Figure 3.28 The active site in AmyB
Figure 3.29 Comparison of the active-site loops in AmyB and AmyA
Figure 4.1 Electrostatic potential surfaces at different salt concentrations
Figure 4.2 Model of full-length AmyB on the lipid membrane
LIST OF ABBREVIATIONS USED
amyA Halothermothrix orenii α-amylase type A gene
AmyB Halothermothrix orenii α-amylase type B, full-length enzyme
amyB Halothermothrix orenii α-amylase type B gene
AmyBACR AmyB in complex with the inhibitor acarbose
AmyB-N N domain of H. orenii AmyB
ASA solvent-accessible surface area
BLAST Basic Local Alignment Search Tool
BSA bovine serum albumin
CAZy Carbohydrate-Active Enzyme database
CBH cellobiohydrolase
CBM carbohydrate-binding module
CE carbohydrate esterase
CGTase cyclodextrin glycosyltransferase
Cop copper resistance protein
cop copper resistance operon
C-terminal carboxy-terminal
ΔAmyB Halothermothrix orenii α-amylase type B, lacking N domain
DNA deoxyribonucleic acid
Hepes 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
IDA iminodiacetic acid
IMAC immobilized metal affinity chromatography
IPTG isopropyl β-D-1-thiogalactopyranoside
IUBMB International Union of Biochemistry and Molecular Biology
Mes 2-(N-morpholino)ethanesulfonic acid, or 4-morpholineethanesulfonic acid
Mops 3-(N-morpholino)propanesulfonic acid, or 4-morpholinepropanesulfonic acid
MPD 2-methyl-2,4-pentanediol
NTA nitrilotriacetic acid
OD600 optical density measured at 600 nm
PCR Polymerase Chain Reaction
PEG polyethylene glycol
PL polysaccharide lyase
r.m.s. root-mean-square
r.m.s.d. root-mean-square deviation
SDS-PAGE sodium dodecyl sulfate polyacrylamide gel electrophoresis
ThMA Thermus maltogenic amylase
TIM triosephosphate isomerase
TLS translation, libration, screw-rotation
Tris 2-amino-2-(hydroxymethyl)-1,3-propanediol
TVA I Thermoactinomyces vulgaris α-amylase I
TVA II Thermoactinomyces vulgaris α-amylase II
X-gal 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside
molecule, the N-C groove. Results from starch-binding studies using full-length AmyB and a truncated mutant lacking the N domain show that the presence of the N domain enhances binding to the insoluble substrate. Moreover, the present study has confirmed a sequence signal for a lipoprotein peptide in AmyB that serves to anchor the enzyme to the bacterial membrane. Based on the above observations, we have produced a tentative model of how the full-length enzyme is immobilized on the membrane surface. Results presented in this thesis show that AmyB is indeed unique compared with other α-amylases in that it is membrane bound, monomeric, and carries an N-terminal domain between the membrane linker and domain A that forms a large groove for binding of raw starch. We observe that for AmyB the conditions for maximal stability to unfolding and stability at maximum catalytic performance do not coincide, and we provide a rational explanation for the tendency of the other H. orenii amylase, AmyA, to aggregate in the absence of salt.
1 INTRODUCTION
1.1 Extremophiles
Over the years, a large number of microorganisms have been isolated from a variety of different habitats, some of which are termed extreme environments and are inhospitable to humans. Extreme environments include those with extremely low or high temperatures, low or high pH, high salinity or high pressure. While there is basic biological interest in understanding how microorganisms are able to adapt and survive under extreme conditions, there is also biotechnological and industrial interest in their enzymes. Increased understanding of the underlying adaptive mechanisms would help to better utilize the enzymes industrially and to tailor them for specific industrial bioprocessing purposes.
1.1.1 Adaptation to high salinity
The main problem that microorganisms face in a saline environment is water loss by osmosis. As the cytoplasmic membrane is permeable to water, one possible solution would be to regulate the osmotic potential of the cytoplasm such that it equals that of the outside environment. The osmotic potential of the cytoplasm can be increased by accumulation of either inorganic salts (“salt-in” strategy), or osmolytes, i.e., low molecular weight non-salt compounds (“compatible-solute” strategy; Madigan & Oren, 1999).
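As a rough illustration of the osmotic balance described above, the ideal-solution (van 't Hoff) relation π = iMRT can be used to estimate how much uncharged osmolyte would match a given external NaCl concentration. The Python sketch below is not part of the thesis: the function names, the van 't Hoff factors (2 for fully dissociated NaCl, 1 for an uncharged osmolyte) and the temperature of 298.15 K are assumptions made for illustration, and real solutions at molar concentrations deviate from ideal behaviour.

```python
GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def osmotic_pressure_mpa(molarity, vant_hoff_factor, temperature_k=298.15):
    """Ideal (van 't Hoff) osmotic pressure in MPa: pi = i * M * R * T.

    molarity is in mol/L; it is converted to mol/m^3 so the result is in Pa
    before scaling to MPa.
    """
    if molarity < 0 or vant_hoff_factor <= 0:
        raise ValueError("molarity must be >= 0 and the van 't Hoff factor > 0")
    concentration = molarity * 1000.0  # mol/m^3
    return vant_hoff_factor * concentration * GAS_CONSTANT * temperature_k / 1e6


def balancing_osmolyte_molarity(external_nacl_molarity,
                                nacl_factor=2.0, osmolyte_factor=1.0):
    """Osmolyte molarity giving the same ideal osmotic pressure as external NaCl.

    Assumes a salt-free cytoplasm and ideal behaviour (illustrative only).
    """
    return external_nacl_molarity * nacl_factor / osmolyte_factor


# Example: an external medium of 1 M NaCl (hypothetical value)
external = osmotic_pressure_mpa(1.0, 2.0)
osmolyte = balancing_osmolyte_molarity(1.0)
internal = osmotic_pressure_mpa(osmolyte, 1.0)
```

Under these idealized assumptions, a cell using the compatible-solute strategy would need roughly twice the molar concentration of an uncharged osmolyte as the external NaCl concentration to reach osmotic balance.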
The compatible-solute strategy is used by most halophilic bacteria, eukaryotic algae, fungi and even some halophilic methanogenic archaea. Here, the strategy is to balance the osmotic pressure of the medium by organic compatible solutes, or osmolytes (Madigan & Oren, 1999). This strategy does not require any adaptation of the intracellular system, and thus, enzymes that are not adapted to high-salt conditions would still be stable and active in the cytoplasm.
Possible osmolytes include carbohydrates, amino acids, methylamine, and methylsulphonium zwitterions. These are highly water-soluble, polar molecules that are uncharged or zwitterionic at physiological pH. The concentration of osmolytes is regulated according to the external salt concentration. Although the salt concentration of the cytoplasm would be low, it contains high concentrations of osmolytes that ensure osmotic balance (Grant et al., 1998). In addition to increasing the osmotic potential of the cytoplasm, osmolytes are also able to stabilize proteins that are under the stress of heat or pressure. This strategy provides the host organism with a high degree of flexibility and adaptability to differing environmental conditions (Grant et al., 1998). The categorization of salt tolerance and optimal salt conditions adopted by Grant and co-workers (1998) is listed in Table 1.1.
Table 1.1 Categories of halophiles
been purified from a group of the Halobacteriaceae family. Most members of this family use the salt-in strategy, and accumulate high concentrations of potassium chloride in the cytoplasm. Proteins from halophilic microbes would generally unfold or become inactivated in low salt conditions (<1 M NaCl). Typically, these proteins show a higher content of negatively charged amino-acid residues compared to their mesophilic counterparts and concomitantly lower isoelectric points (DasSarma et al., 2006). Concomitant with the higher density of acidic amino-acid residues, there is a significant reduction of lysine residues, as well as an increased number of small hydrophobic amino-acid residues, and a reduction in the number of aliphatic amino-acid residues. As a result, the molecular surfaces of halophilic proteins feature highly negative electrostatic potentials that have been suggested to be an important mechanism in halophilic adaptation (Madern et al., 2000). The negative electrostatic surface potential appears to be attributable mainly to an increase in aspartate residues (Fukuchi et al., 2003).
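The compositional bias described above can be made concrete with a simple count of acidic and basic residues in a protein sequence. The following Python sketch is illustrative only: the function name, the choice to count Asp/Glu as acidic and Lys/Arg as basic (histidine is ignored), and the toy sequence are all assumptions, and the "net charge" is a crude residue count rather than a pH-dependent calculation.

```python
ACIDIC_RESIDUES = frozenset("DE")  # aspartate, glutamate
BASIC_RESIDUES = frozenset("KR")   # lysine, arginine (histidine ignored here)


def charge_profile(sequence):
    """Return acidic/basic residue fractions and a crude net-charge estimate.

    The sequence is a one-letter amino-acid string. The net-charge estimate
    is simply (number of Lys + Arg) - (number of Asp + Glu).
    """
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    acidic = sum(1 for aa in residues if aa in ACIDIC_RESIDUES)
    basic = sum(1 for aa in residues if aa in BASIC_RESIDUES)
    return {
        "acidic_fraction": acidic / len(residues),
        "basic_fraction": basic / len(residues),
        "net_charge_estimate": basic - acidic,
    }


# Toy sequence invented for illustration (not a real protein)
profile = charge_profile("DDEEKRAG")
```

A halophilic protein of the kind described in the text would be expected to show a higher acidic fraction and a more negative net-charge estimate than a mesophilic homolog, although a surface-restricted analysis would require a 3-D structure.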
It should be noted that, although the term “halophilic protein” implies an enzyme that is active and stable only at high salt conditions, there are halotolerant proteins that, while stable and active at high salt concentrations, have evolved to function at lower salt concentrations by mechanisms that remain poorly understood.
1.1.2 Thermal adaptation
There are no firm rules that govern the thermostability of proteins; however, there are several strategies by which thermostability can be attained. Structural strategies include highly hydrophobic cores, reduced surface-to-volume ratios, a decrease in glycine content, a high number of electrostatic interactions, higher states of oligomerization, and shortening of surface loops (Madigan & Oren, 1999). A key feature of thermophilic proteins is a bias in amino acid composition. Thermophilic proteins are usually rich in charged amino-acid residues while having few polar amino-acid residues. Thermophilic proteins, unlike their mesophilic and non-halophilic counterparts, show an increase in alanine and threonine residues at the expense of asparagine residues (Fukuchi et al., 2003). Comparisons of the amino-acid composition of thermophilic and mesophilic proteins show that the bias in composition occurs mainly at the molecular surface (Fukuchi & Nishikawa, 2001).
Both halophilic and thermophilic proteins show an increase in the number of charged amino-acid residues compared with their mesophilic homologs. However, the halophilic proteins show a bias towards acidic amino-acid residues, while the thermophilic proteins have an equal partitioning of acidic and basic amino-acid residues on the surface. The simultaneous increase in acidic and basic amino acids enables more ion pairs to form at the surface, which may help stabilize thermophilic proteins (Karshikoff & Ladenstein, 2001; Fukuchi et al., 2003). Ion pairs are often formed between side chains that are distant in the amino-acid sequence, and they tend to be organized into networks that can be found on the protein surface, partially buried inside the protein, or at domain or subunit interfaces. Such networks show a high degree of cooperativity, to the extent that the stabilization effect cannot be reduced to merely the sum of ion-pair interactions. Although ion pairs have an important role in the stabilization of thermophilic proteins, they are not the sole determinants of thermostability.
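To illustrate how ion pairs and ion-pair networks of the kind described above might be identified from atomic coordinates, the Python sketch below pairs oppositely charged groups that lie within a distance cutoff and then groups connected pairs into networks. The 4.0 Å cutoff, the input format (label, charge sign, coordinates), the function names and the coordinates are all assumptions chosen for illustration; they are not taken from the thesis or from any particular analysis program.

```python
import math
from itertools import combinations


def find_ion_pairs(charged_groups, cutoff=4.0):
    """Return pairs of oppositely charged groups closer than `cutoff` (in Å).

    charged_groups: list of (label, sign, (x, y, z)) with sign +1 or -1.
    The 4.0 Å default is an assumed, commonly quoted criterion.
    """
    pairs = []
    for (name_a, sign_a, xyz_a), (name_b, sign_b, xyz_b) in combinations(charged_groups, 2):
        if sign_a * sign_b < 0 and math.dist(xyz_a, xyz_b) <= cutoff:
            pairs.append((name_a, name_b))
    return pairs


def ion_pair_networks(pairs):
    """Group ion pairs that share a residue into networks (connected components)."""
    adjacency = {}
    for a, b in pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, networks = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        networks.append(sorted(component))
    return networks


# Hypothetical charged groups with invented coordinates (Å)
groups = [
    ("Asp10", -1, (0.0, 0.0, 0.0)),
    ("Lys50", +1, (3.0, 0.0, 0.0)),
    ("Glu90", -1, (6.0, 0.0, 0.0)),
    ("Arg200", +1, (20.0, 0.0, 0.0)),
]
pairs = find_ion_pairs(groups)
networks = ion_pair_networks(pairs)
```

In this toy example, Lys50 bridges two acidic residues that are far apart in sequence, giving a three-member network, whereas Arg200 has no partner within the cutoff. A geometric count of this kind says nothing about the cooperativity noted in the text, which is why the stabilization cannot be read off as a simple sum of pairs.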
A major shortcoming of most studies that attempt to explain the mechanisms of thermal adaptation is that only small sets of proteins are compared and analyzed. However, well into the post-genomic era, it is now possible to take full advantage of genomic data and the outcomes of structural-genomics projects to analyze, with statistical significance, various factors responsible for the adaptation process. In a recent study (Robinson-Rechavi et al., 2006), a large dataset of protein structures from the hyperthermophilic bacterium Thermotoga maritima was compiled and analyzed together with structures of close protein homologs of mesophilic origin. The results showed that, contrary to what has been suggested previously, factors such as oligomerization order, hydrogen bonds, and secondary structures are of minor importance to the adaptation process in bacteria. Statistically significant contributions to stability were observed for density of salt bridges and compactness, which accounted for changes in 96% of the protein pairs studied.
1.1.3 Stabilization mechanisms
One hypothesis (Mevarech et al., 2000) suggests that stabilization of halophilic proteins by means of excess acidic amino-acid residues is best explained by the solvation-stabilization model. In this model, acidic surface-exposed amino acids bind cooperatively to a network of hydrated salt ions to which water molecules become associated to form a solvation shell. At reduced salt concentration, however, the protein-associated solvation shell is depleted of salt ions, which may destabilize the protein and induce unfolding (Mevarech et al., 2000). The strength of solvent-protein interactions is solvent and salt dependent, and thus, complex ion-pair networks, weak protein-protein interactions, and specific ion binding also contribute to the stability and solubility of halophilic proteins (Premkumar et al., 2005).
1.2 Classification of carbohydrate-active enzymes
1.2.1 Carbohydrates as enzyme substrates
Carbohydrates are present in large amounts everywhere on Earth, with functions ranging from building blocks and energy reserves in our bodies to mechanical reinforcement in trees. At the cellular level, carbohydrates are involved in a large spectrum of intracellular and extracellular processes ranging from energy storage to delicately controlled and specific molecular signaling reactions. Sugar compounds display high stereochemical diversity: a hexasaccharide can give rise to more than 10^12 different isomers, and living organisms have efficiently taken advantage of this variation by producing enzymes that can degrade virtually all different types of saccharides, whether simple or complex, non-polymeric or polymeric, crystalline or non-crystalline. A vast number of enzymes act to cleave, synthesize or modify glycosidic bonds in carbohydrate compounds, and therefore, the need for a robust method for classification of carbohydrate-active enzymes was realized early on. Before discussing the classification of carbohydrate-active enzymes and their reaction mechanisms in depth, a few definitions regarding carbohydrates will be provided (Fig. 1.1).
Figure 1.1 Chair representation of cellobiose. The disaccharide consists of two D-glucose units linked covalently by a β-1,4 glycosidic bond. The unblocked hemiacetal group is at the reducing end of the disaccharide.
Carbohydrates containing a six-member ring such as D-glucose are referred to as pyranoses. In aqueous solution, D-glucose exists in equilibrium with an open-chain aldehyde, two pyranose forms, two furanose forms (five-member rings) and the hydrated form of the open chain (the keto form). The two pyranose rings differ such that one form has the C1 hydroxyl group in equatorial position (i.e., the β form, with the hydroxyl group cis to the exocyclic C6-O6 group) and the other in axial position (i.e., the α form, with the hydroxyl group trans to the C6-O6 group). Unless the anomeric C1 carbon is protected, inter-conversion (tautomerisation) will occur between the α and β forms. The two forms are referred to as the α and β anomers, or stereoisomers, of D-glucose. In the case of D-glucose, the C1 carbon is also known as the anomeric carbon, or the hemiacetal carbon, indicating that this is the position where the chain can open up to yield the open form.
The covalent linking of two carbohydrates involves a dehydration synthesis during which a hydrogen atom is removed from one sugar unit and a hydroxyl group is removed from the other, with the formation of one water molecule. The new bond is termed a glycosidic bond, and cleavage of a glycosidic bond occurs by hydrolysis. When several sugar units are linked by glycosidic bonds, different types of polymers are formed depending on the types of carbohydrate building blocks used. In the polymeric form, the end of the polymeric chain that has a free, unprotected anomeric carbon is referred to as the reducing end. The name refers to the ability of the ring to open up at the anomeric carbon to give the open aldehyde form. The aldehyde group readily reduces other molecules and ions, whereby the aldehyde becomes oxidized to the carboxylic acid form. When the reducing-end sugar of a chain is linked to the hydroxyl group of another sugar, it is converted to an acetal that is unable to open to the aldehyde or keto form. The sugar is then said to be non-reducing. Thus, an extended carbohydrate chain has directionality where one end is referred to as the non-reducing end, and the other as the reducing end.
1.2.2 Carbohydrate-active enzymes are classified in CAZy
The historical classification of enzymes provided by the International Union of Biochemistry and Molecular Biology (IUBMB) Enzyme nomenclature (www.chem.qmul.ac.uk/iubmb) classifies enzymes based on their substrate specificity, and in some cases also on the reaction mechanism. This classification does not take into consideration the three-dimensional (3-D) structure of the enzymes, nor does it account for the fact that some enzymes use multiple substrates; for instance, many endoglucanases that hydrolyze cellulose are also able to cleave xylan, xyloglucan, β-glucan as well as some artificial substrates. In the early 1990s, the need for a better classification system for carbohydrate-active enzymes prompted Henrissat and co-workers to investigate the relation between enzymes in more detail, and over the past decade, an impressive amount of data has been collected and implemented into a rigorous classification system (Henrissat, 1991; Henrissat & Bairoch, 1993; Davies & Henrissat, 1995; Henrissat & Davies, 1997; Davies et al., 2005a) that is now available at the Carbohydrate-Active Enzyme database (afmb.cnrs-mrs.fr/CAZY).
This method uses similarities in amino-acid sequence analyzed by hydrophobic-cluster analysis, and provides useful information beyond substrate specificity, such as protein structure and evolutionary relationships, in addition to functioning as a tool to derive and predict mechanistic information. The method better reflects the structural and evolutionary features of the enzymes and, in addition to sequence similarity, members of a given family will share a common 3-D structure and reaction mechanism. Carbohydrate-active enzymes are often modular, containing a catalytic domain (module) linked to one or more modules with other functions such as carbohydrate binding. Currently, catalytic modules of carbohydrate-active enzymes belonging to any of the four CAZy groups have been defined as: i) Glycoside Hydrolases (GH) that hydrolyze glycosidic bonds in sugar compounds; ii) Glycosyl Transferases (GT) that synthesize glycosidic bonds by transferring activated donor molecules to specific acceptor sugars; iii) Polysaccharide Lyases (PL) that cleave polysaccharide chains by a β-elimination reaction to give a double bond at the newly formed non-reducing end; and, iv) Carbohydrate Esterases (CE) that catalyze de-O- or de-N-acylation of substituted saccharides by using the sugar either as an acid (pectin methyl esterases), or as an alcohol.
In addition to the classification of catalytic modules, the non-catalytic modules often associated with the catalytic domains of carbohydrate-active enzymes constitute a separate group, the Carbohydrate-Binding Modules (CBM). In this group, modules are found attached to catalytic modules, but are not catalytic per se. Usually the CBMs function to bind to polymeric carbohydrate substrates (e.g., cellulose, xylan, starch, chitin), but for some CBMs, carbohydrate binding has not been demonstrated, and thus, their functions remain unclear. As of November 2006, there are 108 GH families, 87 GT families, 18 PL families, 14 CE families, and 48 CBM families. To date, 25760 amino-acid sequences of GHs have been grouped into the 108 GH families, 2178 3-D structures of GHs have been deposited with the Protein Data Bank (www.rcsb.org), and of the 108 GH families, 42 still lack information about their 3-D structures.
1.2.3 Reaction mechanisms of glycoside hydrolases
Enzymatic hydrolysis of glycosidic bonds by GH enzymes involves general-acid catalysis. As mentioned above, glycoside hydrolases are classified into as many as 108 different GH families; however, the mechanism whereby the glycosidic bond between sugar units is hydrolyzed can only be one of two possible types (Koshland, 1953; Sinnott, 1990): an inverting (bimolecular SN2 nucleophilic substitution), or a retaining (unimolecular SN1 nucleophilic substitution, or double-displacement) mechanism (Fig. 1.2). The two mechanisms differ in their stereochemical outcome, but have some common features. In both mechanisms, the proton donor is located within hydrogen-bonding distance of the glycosidic oxygen of the susceptible bond, and the reaction proceeds via an oxocarbenium ion-like transition state.
In the inverting mechanism (Fig. 1.2 a), a single displacement occurs at the anomeric carbon. The reaction is catalyzed by two carboxylate residues: a proton donor acting as a general acid, and a nucleophile acting as a general base. The reaction starts by protonation of the glycosidic oxygen and release of the leaving group concomitant with the nucleophilic attack by a water molecule, i.e., the existing bond is broken and the new bond is formed in one concerted operation. The reaction requires that the two catalytic amino-acid residues be on opposite sides of the substrate, which results in net inversion of configuration at the anomeric carbon.
Figure 1.2 Reaction mechanisms for glycoside hydrolases. The (a) inverting and (b) retaining mechanism (picture adapted from CAZy). See text for details on the reaction schemes.
The retaining reaction mechanism (Fig. 1.2 b) involves two steps with two successive displacements at the anomeric carbon, the glycosylation step and the deglycosylation step. In the first step, glycosylation, the substrate is bound to the enzyme. The general acid donates a proton to the glycosidic oxygen, and the leaving group departs before the protein nucleophile attacks at the anomeric carbon to form a covalent intermediate. As a result of the single displacement, the resulting glycosyl-enzyme intermediate has an inverted configuration at the anomeric carbon relative to the original configuration. During the next step, deglycosylation, the glycosyl-enzyme intermediate is hydrolyzed by a general-base catalyzed attack by water on the asymmetric center. This displacement causes another inversion of configuration at the anomeric carbon and the configuration assumes that of the original state; thus, the completed reaction gives a net retention of configuration. In the active sites of retaining and inverting GH enzymes, the two catalytic amino-acid residues are typically positioned roughly 5.5 Å and 10 Å apart, respectively (Davies & Henrissat, 1995; McCarter & Withers, 1994). The rather long distance between the proton donor and nucleophile in inverting enzymes is important as it accommodates the catalytic water molecule.
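The stereochemical bookkeeping of the two mechanisms, and the rule of thumb relating carboxylate separation to mechanism, can be summarized in a short Python sketch. It is illustrative only: the function names are hypothetical, each displacement is modelled as a simple flip of the anomeric configuration, and the nearest-reference-distance rule built on the approximate 5.5 Å and 10 Å values quoted above is an assumed heuristic, not a validated predictor.

```python
RETAINING_SEPARATION = 5.5   # Å, approximate value quoted in the text
INVERTING_SEPARATION = 10.0  # Å, approximate value quoted in the text


def anomeric_outcome(start_configuration, displacements):
    """Configuration at the anomeric carbon after a number of displacements.

    Each nucleophilic displacement inverts the configuration, so an odd
    number of displacements inverts it and an even number retains it.
    """
    if start_configuration not in ("alpha", "beta"):
        raise ValueError("configuration must be 'alpha' or 'beta'")
    if displacements < 0:
        raise ValueError("displacements must be non-negative")
    if displacements % 2 == 0:
        return start_configuration
    return "beta" if start_configuration == "alpha" else "alpha"


def guess_mechanism(carboxylate_separation):
    """Guess the mechanism from the catalytic-residue separation (Å).

    Assumed heuristic: pick whichever reference distance is closer.
    """
    to_retaining = abs(carboxylate_separation - RETAINING_SEPARATION)
    to_inverting = abs(carboxylate_separation - INVERTING_SEPARATION)
    return "retaining" if to_retaining <= to_inverting else "inverting"


# Inverting enzymes: one displacement; retaining enzymes: two displacements
inverting_product = anomeric_outcome("alpha", 1)
retaining_product = anomeric_outcome("alpha", 2)
```

For an α-linked substrate, the single displacement of an inverting enzyme gives a β product, whereas the two successive displacements of a retaining enzyme return the α configuration.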
1.2.4 Modes of action in GH enzymes
The mode with which a glycoside hydrolase binds and attacks its sugar substrate is typically reflected by the shape of the enzyme molecule, and the distribution of amino acids at the molecular surface. Three principal active-site topologies can be distinguished based on the requirements of the different modes of action and the pre-hydrolytic requirements: a pocket, cleft, or tunnel (for a review, see Davies & Henrissat, 1995; Fig. 1.3). The pocket (Fig. 1.3 a) is usually employed for recognition of a monosaccharide present at the non-reducing end of a chain. This mode is observed mainly in monosaccharidases (e.g., β-galactosidase, β-glucosidase, sialidase and neuraminidase), and exopolysaccharidases (e.g., glucoamylases and β-amylases).
In the case of the amylases, the need for a pocket is nicely demonstrated by the architecture of the substrate, in that starch assumes a helical structure that exposes a large number of protruding non-reducing chain ends. A pocket is clearly not optimal for a fibrous substrate like cellulose, which has few free, exposed chain ends.
Figure 1.3 Active-site topologies of glycoside hydrolases. Different types of active-site topologies and the mode of action of GH enzymes: (a) pocket; (b) cleft or canyon; and (c) tunnel. Enzymes shown: CBH, cellobiohydrolase; EG, endoglucanase; MS, monosaccharidases; ExP, exopolysaccharidases.
The cleft, or canyon (Fig. 1.3 b), allows random hydrolysis of glycosidic bonds in polymeric substrates and is commonly found in endopolysaccharidases such as lysozyme, endocellulases, chitinases, α-amylases and xylanases. A tunnel (Fig. 1.3 c) is formed mainly by extended loops that come together to close off the binding site. This type of substrate-binding site is used primarily by cellobiohydrolases that hydrolyze polymeric chains from either the reducing or the non-reducing end. The tunnel is thought to be important for processivity in that the product can be released successively while the enzyme is still bound to the substrate (Rouvinen et al., 1990; Divne et al., 1994, 1998).
1.2.5 The TIM barrel is a recurrent fold in GH enzymes
Similar substrate specificity is found in members of different families, indicating convergent evolution. However, mechanisms of divergent evolution are also observed, since different substrate specificities are encountered within the same family. The 3-D structures of proteins are more conserved than their amino-acid sequences, which makes it possible to define clusters of related GH families, termed clans, with similar folds and both substrate and product configuration. The 108 GH families can be grouped into 14 different clans denoted A-N (Table 1.2).
The most common fold among the catalytic domains of glycoside hydrolases is the TIM barrel, or (β/α)8 barrel, that is encountered in four clans (A, D, H and K), representing 24 different GH families (Table 1.2, Fig. 1.4). The name TIM refers to the enzyme triosephosphate isomerase in which it was first discovered (Banner et al., 1975; Wierenga, 2001).
The idealized TIM barrel is composed of eight twisted parallel β strands arranged close together into a barrel. Each parallel β strand is connected to an α helix that packs onto the β sheet on the outside of the barrel. The structure can be regarded as consisting of repeated βαβ motifs. TIM barrels are observed predominantly in enzymes, and a canonical feature is the location of the active sites at the C-terminal end of the barrel (i.e., the C-terminal end of the parallel β strands).
Table 1.2 Clans and folds of glycoside hydrolases
Clan    Fold            GH families
GH-A    (β/α)8          1, 2, 5, 10, 17, 26, 30, 35, 39, 42, 50, 51, 53, 59, 72, 79, 86
GH-B    β-jelly roll    7, 16
GH-C    β-jelly roll    11 (xylanases), 12 (endoglucanases)
Figure 1.4 GH clans with the TIM-barrel fold. Ribbon drawings showing the four GH
clans that represent the (β/α)8-barrel fold. Clan A is represented by Streptomyces β-glucosidase (family GH1, PDB code 1GNX; Guasch et al., 1999); clan D by Trichoderma reesei α-galactosidase (family GH27, PDB code 1SZN; Golubev et al., 2004); clan H by
Pyrococcus woesei α-amylase (family GH13, PDB code 1MXG; Linden et al., 2003), and
clan K by Serratia marcescens chitinase (family GH18, PDB code 1CTN; Perrakis et al.,
1994).
1.2.6 Linking CBMs to catalytic domains
The catalytic domains of GH proteins are frequently associated with a non-catalytic carbohydrate-binding module (CBM). As mentioned above in section 1.2.2, 48 CBM families have hitherto been defined. For many CBMs in the CAZy database, the ability of the module to bind carbohydrate has been shown, and the specificity determined. However, for most CBMs, binding studies have either not been performed, or results from binding studies have proved inconclusive as to whether the module binds or interacts with the carbohydrate. For this reason, these modules may sometimes be referred to simply as non-catalytic modules (NCMs). As will be discussed later in this thesis, starch-degrading enzymes often contain NCMs that do not appear to bind to sugar but are nevertheless classified as CBMs by CAZy. The CBM families can be grouped into seven fold families (outlined in Table 1.3) according to their structural similarities (Boraston
et al., 2004). More recent structure determinations show additional members of
the β-sandwich CBM family: CBM20, 21, 25, 26, 30, 31, 33, 41, 44, 47 and 48; as well as a β-trefoil structure, CBM42; and a lectin-like fold, CBM40.
Table 1.3 CBM fold families
The CBMs can also be classified into one of three groups depending on their mode of carbohydrate binding (Table 1.4): type A, surface binding; type B, glycan-chain binding; and type C, small-sugar binding.
Table 1.4 Types of carbohydrate binding platforms
Type    Fold family    CBM families
B       1              2b, 4, 6, 15, 17, 20, 22, 27, 28, 29, 34, 36
C       1, 2, 6, 7     9, 13, 14, 18, 32
As seen in Table 1.4, all three types of binding modes can be
accommodated by the fold family 1 representing the β sandwich (Boraston et al.,
2004), which clearly emphasizes the versatility of this fold. The major factors that shape the binding sites in CBMs are aromatic protein side chains and the conformation of loop structures. The different types of CBM platforms (planar, twisted, and sandwich) that can be formed using aromatic side chains are shown
in Figure 1.5.
Type-A CBMs bind to insoluble, highly crystalline cellulose and/or chitin
using a flat, platform-like binding surface (Fig 1.5 a). Type-B CBMs have their
binding site located in a cleft or groove where the ligand is bound to a series of
subsites (Fig 1.5 b-c). These CBMs have limited interactions with the
oligosaccharides, and affinity increases with chain length (highest affinity for hexasaccharides). Type-C CBMs have a lectin-like mode of binding to mono-, di-
or trisaccharides.
Figure 1.5 Sugar-binding platforms in CBMs. (a) Planar platform with three Trp
residues in a type-A CBM (Pseudomonas fluorescens CBM10; PDB code 1QLD; Raghothama
et al., 2000); (b) twisted platform using one Tyr and two Trp residues in a type-B CBM
(Piromyces equi CBM29; PDB code 1GWL; Charnock et al., 2002), and (c) sandwich platform with three Tyr residues in a type-B CBM (Cellulomonas fimi CBM4; PDB code 1GU3; Boraston et al., 2002). The Cα backbone is shown as beige coil with carbohydrate-
interacting aromatic residues drawn as stick objects. The carbohydrates in (b) and (c) are colored green.
1.3 Starch and starch-processing enzymes
1.3.1 Starch - the enzyme substrate
Starch is the major carbohydrate reserve in plant tubers and seed
endosperm where it is found as spherical granules (Buléon et al., 1998), and the
pure polymer consists predominantly of α-glucan. Polymerization of starch involves the covalent linking of α-D-glucose units in the 4C1 chair conformation by
α-glycosidic bonds, α-1,4 or α-1,6 linkages. The α-1,4 glycosidic bond (Fig 1.6 a)
joins the C1 carbon atom of a glucosyl residue i to the C4 carbon atom of a glucosyl residue (i–1) to form the disaccharide maltose. Successive addition of
glucosyl residues joined by α-1,4 linkages results in the formation of a roughly
linear, non-branched polymeric chain known as amylose (Fig 1.6 b). In addition,
α-1,6 glycosidic bonds can form between the C1 and C6 atoms such that a new branch is
formed, as in the starch polymer amylopectin (Fig 1.6 c).
Figure 1.6 Structure of starch components. Chair-configuration representations of (a)
maltose, (b) amylose, (c) amylopectin. The numbering convention of the carbon atoms in the pyranose ring is shown for the reducing-end glucosyl moiety of maltose. The α-1,4 and α-1,6 glycosidic bonds are indicated. Amylose is linear and unbranched whereas amylopectin is branched with α-1,6 glycosidic bonds at the branch points.
The relative ratios of amylose and amylopectin vary, with typical values of 15-25% and 75-85%, respectively. The degree of polymerization (DP) of the non-branched, linear amylose molecule is in the range 1000 to 6000 glucose units
(Fig 1.6 b). Amylopectin, the branched polymer, is built from linear chains with a typical length of 12-120 glucose units linked by α-1,4 bonds. The branches are linked at the branch points by α-1,6 bonds, and range in size from 15 to 45
glucose units (Fig 1.6 c). The length of both the α-1,4 backbone and the α-1,6-linked branches varies depending on the botanical origin. The DP of amylopectin can be
as high as 2,000,000 glucose units with roughly 95% α-1,4- and 5% α-1,6-bond content. In addition, hydrogen bonding between O3 and O2 atoms of adjacent sugar residues generates a helical twist of the polymer.
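The bond composition quoted above can be turned into rough counts. The following Python sketch is an illustration only and is not part of the original analysis. It assumes a tree-like polymer with a single reducing end, so that a molecule of DP n contains n - 1 glycosidic bonds, and it treats the 5% α-1,6 content as a fraction of all glycosidic bonds. The function name and its default value are hypothetical.

```python
def linkage_counts(dp, alpha_1_6_fraction=0.05):
    """Return (alpha_1_4, alpha_1_6) bond counts for a glucan of `dp` units.

    Assumption: the polymer is an acyclic tree with one reducing end, so it
    contains dp - 1 glycosidic bonds, and each alpha-1,6 bond is one branch point.
    """
    if dp < 1:
        raise ValueError("dp must be at least 1")
    total_bonds = dp - 1
    alpha_1_6 = round(total_bonds * alpha_1_6_fraction)
    alpha_1_4 = total_bonds - alpha_1_6
    return alpha_1_4, alpha_1_6


# Amylopectin at the upper DP limit quoted in the text.
a14, a16 = linkage_counts(2_000_000)
print(a14, a16)         # counts of alpha-1,4 and alpha-1,6 bonds
print(2_000_000 / a16)  # average number of glucose units per branch point
```

Under these assumptions, a DP of 2,000,000 corresponds to on the order of 100,000 branch points, that is, roughly one branch point per 20 glucose units. Setting `alpha_1_6_fraction=0.0` gives the unbranched amylose case.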
Amylose in starch is present as double-helical A- and B-amyloses, and the single-helical V-amylose. V-amylose is found naturally in non-A and non-B segments of amylose and is folded into a left-handed single helix. The 1.0-Å
resolution crystal structure of V-amylose (Gessler et al., 1999) shows that the
molecule folds into two short, antiparallel left-handed helices that are related by
two-fold rotational pseudo-symmetry (Fig 1.7 a). The V-form of amylose contains
a channel along the helical axis that is filled with disordered water molecules and can also bind molecules such as iodine, which gives rise to the characteristic "iodine's blue" color.
Figure 1.7 Helical structure of V- and A-amylose. Stick models of (a) V-amylose,
diameter of 9-10 Å; (b) A-amylose (perpendicular to helical axis), diameter 6-7 Å; (c) A-amylose (down the helical axis), and (d) A-amylose (perpendicular to tilted helical axis).
Based on techniques such as X-ray fiber diffraction, electron diffraction, solid-state 13C cross-polarization/magic angle spinning NMR spectroscopy and computer-aided modeling, the structures of A- and B-amylose have been shown
to contain parallel double helices, where each helix contains six glucose residues
per helical turn over 21.38 Å (Fig 1.7 b-d). Due to the inherent limitations of
these techniques, the handedness of the helices cannot be determined
unambiguously, and thus, the helices can have either a right-handed (Murphy et al., 1975; Brisson et al., 1991; Veregin et al., 1987; Gidley & Bociek, 1988) or left-handed (Imberty et al., 1988) twist. The A- and B-forms differ only in the
precise packing arrangement and their water content. The A-form is found preferentially in cereals, and the B-form in tubers.
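The helical parameters quoted above imply a rise and a rotation per glucose residue. The short Python sketch below works through that arithmetic. It assumes that 21.38 Å is the pitch of a single strand, and that the six residues per turn are evenly spaced. The function name is hypothetical and used only for illustration.

```python
def helix_parameters(pitch_angstrom, residues_per_turn):
    """Return (rise per residue in Å, rotation per residue in degrees).

    Assumption: residues are evenly spaced along an ideal helix.
    """
    rise = pitch_angstrom / residues_per_turn
    twist = 360.0 / residues_per_turn
    return rise, twist


# Values quoted in the text for the strands of A- and B-amylose.
rise, twist = helix_parameters(21.38, 6)
print(f"rise per residue: {rise:.2f} Å, rotation per residue: {twist:.0f}°")
```

With these inputs the rise is about 3.56 Å per glucose residue and the rotation is 60° per residue. The sign of the rotation is left open because, as noted above, the handedness is not determined unambiguously.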
exoamylase; (iii) debranching enzymes; and (iv) transferases (Fig 1.8)
Endoamylases, the most common member being α-amylase (EC 3.2.1.1), perform random cleavage of the α-1,4 glycosidic bonds in both amylose and amylopectin to produce shorter linear and branched oligosaccharides. Exoamylases act specifically at the non-reducing end of a starch polymer to liberate either glucose or maltose from the chain end. Examples of exoamylases are β-amylases (EC 3.2.1.2), glucoamylases (EC 3.2.1.3) and α-glucosidases (EC 3.2.1.20). β-Amylase cleaves α-1,4 glycosidic bonds to release maltose, glucoamylase produces glucose with β-anomeric configuration, and α-glucosidase produces glucose with α-anomeric configuration. Glucoamylase and α-glucosidase are able to cleave both α-1,4 and α-1,6 glycosidic bonds. These
amylolytic enzymes display multi-domain organization, and belong to one of the following GH families: GH13 and GH57, α-amylases; GH14, β-amylases; and GH15, glucoamylases. The three types of amylolytic enzymes do not share any sequence similarity, and about 10% carry a non-catalytic domain that can bind to raw starch. Whereas α-amylases are retaining enzymes, β-amylases and glucoamylases are inverting.
Figure 1.8 Enzymes involved in starch processing. Grey and yellow rings represent
D-glucose units, where the yellow rings are reducing-end glucosyl moieties. The various
enzymatic activities are indicated. Adapted from van der Maarel et al., 2002.
Debranching enzymes catalyze mainly the hydrolysis of α-1,6 glycosidic bonds. There are three members in this group: isoamylase (EC 3.2.1.68), pullulanases type I (EC 3.2.1.41), and pullulanases type II (neopullulanase, EC 3.2.1.135). All three are able to hydrolyze the α-1,6 glycosidic bonds in amylopectin, resulting in the generation of long linear α-D-glucose chains. The pullulanases have the additional ability to hydrolyze pullulan, a polysaccharide that
is made up of α-1,6-linked maltotriose units. Type II pullulanases are also known as α-amylase-pullulanases or amylopullulanases because they are able to hydrolyze both α-1,4 and α-1,6 glycosidic bonds.
Transferases cleave the α-1,4 glycosidic bond of the donor molecule and transfer part of the donor to a glycosidic acceptor. The enzymes amylomaltase (EC 2.4.1.25) and cyclodextrin glycosyltransferase (EC 2.4.1.19) form new α-1,4 glycosidic bonds, while glucan-branching enzymes form new α-1,6 glycosidic bonds. Transglycosylation by amylomaltase produces a linear product, while cyclodextrin glycosyltransferase generates a cyclic product. Glucan-branching enzymes are mainly involved in the synthesis of glycogen in microorganisms, where they synthesize α-1,6 glycosidic bonds in the side chains of glycogen. The main characteristics of some of the exoamylases are given in Table 1.5.
Table 1.5 Characteristics of exoamylases
Enzyme           Bonds cleaved                       Product               Substrate preference
β-amylase        α-1,4 glycosidic bond               Maltose               None
glucoamylase     α-1,4 and α-1,6 glycosidic bonds    β-anomeric glucose    Long-chain polysaccharides
α-glucosidase    α-1,4 and α-1,6 glycosidic bonds    α-anomeric glucose    Short malto-oligosaccharides
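For readers who wish to query these characteristics programmatically, the entries of Table 1.5 can be held in a small lookup structure. The Python sketch below is an illustration only: the class name, field names and helper function are hypothetical, and the EC numbers are those quoted in the text for the three exoamylases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exoamylase:
    name: str
    ec_number: str
    bonds_cleaved: tuple
    product: str
    substrate_preference: str


# Rows transcribed from Table 1.5; EC numbers as quoted in the text.
EXOAMYLASES = (
    Exoamylase("β-amylase", "3.2.1.2", ("α-1,4",),
               "maltose", "none"),
    Exoamylase("glucoamylase", "3.2.1.3", ("α-1,4", "α-1,6"),
               "β-anomeric glucose", "long-chain polysaccharides"),
    Exoamylase("α-glucosidase", "3.2.1.20", ("α-1,4", "α-1,6"),
               "α-anomeric glucose", "short malto-oligosaccharides"),
)


def enzymes_cleaving(bond):
    """Return the names of the exoamylases that cleave the given bond type."""
    return [e.name for e in EXOAMYLASES if bond in e.bonds_cleaved]


print(enzymes_cleaving("α-1,6"))
```

A tuple of frozen records is used here so that the table contents cannot be modified by accident. Any other mapping would serve equally well.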
Detailed investigations by MacGregor and co-workers (2001) revealed that the four reactions catalyzed by starch-processing enzymes, namely α-1,4 and α-1,6 hydrolysis and α-1,4 and α-1,6 transglycosylation, all proceed by net retention of configuration at the anomeric carbon (Kuriki, 1999). Like many other
GH enzymes, they display a modular, multi-domain organization in which the catalytic domain has a TIM-barrel structure. The catalytic module is attached, either directly or indirectly, to various non-catalytic modules. By varying the
non-catalytic modules, the enzymes are able to adopt a wide range of substrate
and product specificities (van der Maarel et al., 2002).
1.3.3 The α-amylase superfamily
Members of the α-amylase superfamily (Janecek et al., 1997, 2003; MacGregor et al., 2001; Svensson, 1994) are found in two CAZy families, GH13
and GH57. The α-amylases hydrolyze α-1,4-glycosidic bonds with net retention of configuration at the anomeric carbon, and are strictly calcium dependent. The principal products from starch degradation by α-amylases are maltotriose and maltose from amylose, and maltose and glucose from amylopectin. Three distinct domains are commonly present in α-amylases, referred to as domains A, B and C
(MacGregor et al., 2001; Fig 1.9 a-c). The A/B domains constitute the catalytic domain, and assume the fold of a TIM barrel (Wierenga, 2001). Domain A
contains the active site with three catalytic carboxylate residues (one glutamic and two aspartic acids) and, together with domain B, is also responsible for the cooperative binding of at least one Ca2+ ion. Domain B constitutes an insertion between the third β strand and the third α helix of domain A, and features an
irregular β-rich structure whose precise appearance varies between different enzymes.
In some members of CAZy family GH13, domain B forms part of the substrate-binding cleft. Domain C is found in most members of the family, and is located immediately after domain A. The principal structure is a β-sandwich
domain containing a Greek key motif. In deletion studies of the GH13 enzyme PalI
(Zhang et al., 2003), the B domain has been shown to stabilize domain A, and the
catalytic activity of the enzyme is correlated to the point of truncation in domain
C. In addition, domain B helps to shield hydrophobic amino-acid residues in domain A from the solvent, and for some enzymes, domain C has also been
implicated in substrate binding. It should be noted that the catalytic domain of
GH57 α-amylases differs from the GH13 enzymes in that the structure lacks one β strand and one α helix, creating a (β/α)7 barrel, i.e., a TIM-like barrel.
Figure 1.9 The domain organization of α-amylases. Ribbon drawings showing the
domain organization of representative α-amylases of CAZy family 13: (a) Bacillus
licheniformis α-amylase (PDB code 1BLI; Machius et al., 1998); (b) isoamylase from Pseudomonas amyloderamosa (PDB code 1BF2; Katsuya et al., 1998); and (c) the Thermoactinomyces vulgaris α-amylase II (TVA II) monomer (PDB code 1BVZ; Kamitori et al., 1999). Secondary structure elements are shown as follows: β-strands as arrows; α-helices as
spirals, and loops as coils. The domains discussed in the text are colored as follows: A, yellow; B, red; C, magenta; N, light blue; and linker between N and A, blue. Calcium and
sodium ions are shown as blue and green spheres, respectively. The configuration of three metal ions in series, Ca2+-Na+-Ca2+, in the B. licheniformis enzyme is unique.
In addition to the three common domains, some members of family GH13
carry an extra domain, denoted N because it is found at the N terminus (Fig 1.9 b-c). The N domain precedes domain A, to which it is connected by a peptide linker of varying length. The typical structure of domain N is a β sandwich; however, some diversity exists among different GH13 members (Janecek et al., 2003). Comparison of domain N-containing amylases shows that the precise function and location of their N domains differ (Fig 1.9 b-c).
In some N-domain containing enzymes (Fig 1.9 b), the N domain is involved in the formation of the active site cleft. Domain N interacts with both domains A and B to help stabilize their flexible regions. The B domain is
roughly half the size of that in most other enzymes, and its role in forming
the binding cleft has been partly taken over by domain N. In addition to its contributions to stability, domain N has been suggested to have a role in the initial steps of substrate binding (Timmins et al., 2005). However, the N
domains of these enzymes differ in the precise length and conformation of loop regions. Examples
of this type include isoamylase from Pseudomonas amyloderamosa (Katsuya et al., 1998), maltooligosyltrehalose trehalohydrolase from Deinococcus radiodurans (Timmins et al., 2005), and Thermoactinomyces vulgaris R-47 α-amylase I (TVA
I; Abe et al., 2004; 2005).
Another type (Fig 1.9 c) is represented by the dimeric enzymes α-amylase
II (TVA II) from Thermoactinomyces vulgaris R-47 (Kamitori et al., 1999) and maltogenic amylase (ThMA) from Thermus (Kim et al., 1999; Lee et al., 2002). In the crystal structures, the N domain from one subunit of the dimer forms extensive interactions with that of the other subunit, which indicates that the N
domain has a function in the dimerization process. Within the subunit structure,
the N domain is located too far away from the active site to participate directly in catalysis. In the dimer, however, the N domain of one subunit interacts extensively with the active site and substrate-binding cleft of the A/B domains of the other subunit, suggesting that the N domain may participate in catalysis and substrate binding in the dimer (Kamitori et al., 1999). Studies on TVA II mutants in which the N domain had been deleted suggest that the domain may have three
functions: to stabilize the enzyme under conditions of extreme pH and temperature; to participate in substrate recognition and hydrolysis; and to have a
role in dimerization of the TVA II molecule as a connector (Yokota et al., 2001). It
should be emphasized, however, that the limited biochemical and structural data
available for enzymes carrying N domains prevent a more detailed discussion of the function of this domain. The homodimeric Bacillus stearothermophilus neopullulanase (PDB code 1J0J; Hondoh et al., 2003) also contains an N domain,
which appears to be involved in dimer association and, to some extent, substrate binding.
As expected for retaining GHs, two amino-acid side chains in the active site play important roles in catalysis by α-amylases. A glutamate side chain acts as acid/base to protonate the glycosidic oxygen of the bond to be cleaved in the first step of the retaining reaction, and in the second step, it deprotonates the attacking hydroxyl group. A second amino acid, an aspartate carboxyl group, acts
as a nucleophile that attacks the substrate sugar at C1, to which it forms a covalent linkage to produce a covalent enzyme-substrate intermediate. The roles
of these amino acids have been demonstrated elegantly by Dijkstra and
co-workers (Uitdehaag et al., 1999) for the enzyme cyclodextrin glycosyltransferase
(CGTase), a member of the α-amylase family. They presented a structure of CGTase with a covalently bound reaction intermediate (4-deoxy-maltotrioside) at 1.8 Å resolution (PDB code 1CXL) that allowed a detailed picture of substrate distortion and characteristics of the intermediate, as well as the structure of CGTase in complex with a maltononaose oligosaccharide at 2.1 Å resolution (PDB code 1CXK). The architecture of the non-liganded active site in CGTase is shown
in Figure 1.10, highlighting the catalytic amino acids Asp229 (nucleophile), Glu257 (acid/base) and Asp328 (helps to induce substrate distortion). Other conserved amino acids include: His327 and Arg227, which help to stabilize the intermediate; Tyr100, which provides stacking interactions with the –1 sugar ring; and Asp135 and Trp75, which help provide a suitable binding site for the intermediate.