Crystal structure of AmyB, an α-amylase from Halothermothrix orenii, and comparison with its homologs


CRYSTAL STRUCTURE OF AMYB, AN

α-AMYLASE FROM HALOTHERMOTHRIX ORENII,

AND COMPARISON WITH ITS HOMOLOGS

Tien Chye Tan

Submitted 14 April 2007

A THESIS SUBMITTED FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY DEPARTMENT OF BIOLOGICAL SCIENCES

NATIONAL UNIVERSITY OF SINGAPORE 2007


PUBLICATIONS

Paper I

Tan, T.C., Yien, Y.Y., Patel, B.K., Mijts, B.N., and Swaminathan, K. (2003)

Crystallization of a novel alpha-amylase, AmyB, from the thermophilic halophile

Halothermothrix orenii

Acta Crystallographica sect D 59, 2257-2258

Paper II

Huynh, F., Tan, T.C., Swaminathan, K., and Patel, B.K. (2005)

Expression, purification and preliminary crystallographic analysis of sucrose

phosphate synthase (SPS) from Halothermothrix orenii

Acta Crystallographica sect F 61, 116-117


ACKNOWLEDGEMENTS

In my list of acknowledgements, there are people with specific contributions and also people who have helped me in many small but important ways that are impossible to list. The following list is by no means complete, nor are the contributions of those listed limited to what is listed. It is just a weak attempt to thank everyone for their help and kindness shown to me during my time here.

Dr Victor Wong and Dr Kunchithapadam Swaminathan for their patience and guidance. Dr Wong for getting me started in research and for believing in me. Dr Swami for burning the midnight oil to help me get my thesis out on time.

My collaborator Dr Bharat Patel for spending hours educating me on his field. Dr Jayaraman Sivaraman for putting up with my endless list of questions.

Thanks also go out to my ‘editor’ Maykalavaane d/o Narayanan for helping make sense of my disjointed thoughts and my dyslexic spelling.

My lab mates, members of the Structural Biology Lab, for the mayhem and chaos that spiced up our stay here. Not forgetting the department staff, who have definitely helped a lot in their own way.

Dr Christina Divne for supervision on additional experimental results presented in the revised version of the thesis, and advice on revision of the thesis. And for pulling my brain out of the gutter and getting it to stay focused.

Prof Birte Svensson and Dr Karen Marie Jakobsen at BioCentrum-DTU (Lyngby, Denmark) for supplying the acarbose inhibitor, which was put to good use.

Of course there is my mom, but that would take up another thesis on its own, so I shall just say “THANKS MOM”.


Table of Contents

1.2.2 Carbohydrate-active enzymes are classified in CAZy

1.2.3 Reaction mechanisms of glycoside hydrolases

1.2.5 The TIM barrel is a recurrent fold in GH enzymes

1.4.2 Halothermothrix orenii produces two α-amylases

1.4.3 Biochemical characteristics of AmyA and AmyB

1.4.4 Thermal inactivation studies on AmyA and AmyB

1.4.6 Content of charged amino acids in AmyA and AmyB

2.1.4 Preparation of the expression vector

2.3.2 Optimization of crystallization conditions and ligand soaks

2.3.4 Structure determination and refinement

3 RESULTS

3.2.3 Starch degradation studies on AmyB and ΔAmyB

3.2.4 Starch binding studies on AmyB and ΔAmyB

3.2.5 Stability analysis of AmyB and ΔAmyB

3.5 Structure determination, model building and refinement

3.6.2 Domains A and B – the catalytic module

3.7.1 Binding of acarbose-derived oligosaccharide

3.7.2 Binding of maltoheptaose and α-cyclodextrin

4 DISCUSSION

4.1.1 Natural habitat of Halothermothrix orenii

4.1.2 Possible natural substrates for H. orenii AmyB

4.2.1 Influence of negatively charged surfaces

4.2.4 Thermal stability as a function of salt concentration and pH

4.3 AmyB represents a unique member of the α-amylase family

4.3.3 AmyB is unique compared with AmyA and other α-amylases

Paper I

Paper II


LIST OF TABLES

Table 1.1 Categories of halophiles

Table 1.2 Clans and folds of glycoside hydrolases

Table 1.3 CBM fold families

Table 1.4 Types of carbohydrate binding platforms

Table 1.5 Characteristics of exoamylases

Table 3.1 Statistics for data collection

Table 3.2 Statistics for crystallographic refinement

Table 3.3 Interface parameter analysis for domain A/N association

Table 3.4 Mapping of sugar residues of acarbose-derived oligosaccharides to the active-site subsites of the α-amylases BA2, AmyB and BHA

Table 3.5 Interactions with a nonasaccharide in the A-B groove of AmyB ACR

Table 3.6 Interactions with acarbose in the N-C groove of AmyB ACR

Table 3.7 Interactions with α-D-glucose in the B1 and B2 sites of AmyB ACR

Table 3.8 Interactions with maltotetraose in the A-B groove of AmyB MAL7-ACX

Table 3.9 Interactions with maltotetraose in the N-C groove of AmyB MAL7-ACX


LIST OF FIGURES

Figure 1.1 Chair representation of cellobiose

Figure 1.2 Reaction mechanisms for glycoside hydrolases

Figure 1.3 Active-site topologies of glycoside hydrolases

Figure 1.4 GH clans with the TIM-barrel fold

Figure 1.5 Sugar-binding platforms in CBMs

Figure 1.6 Structure of starch components

Figure 1.7 Helical structure of V- and A-amylose

Figure 1.8 Enzymes involved in starch processing

Figure 1.9 The domain-organization of α-amylases

Figure 1.10 The active site in Bacillus circulans strain 251 CGTase

Figure 1.11 The structure of a lipoprotein secretion-signal peptide

Figure 2.1 Schematic representation of AmyB constructs

Figure 2.2 Schematic representation of the amyB-containing pTHAB template

Figure 3.1 Lipoprotein signal peptide in AmyB

Figure 3.2 Analysis of protein purity by SDS-PAGE

Figure 3.3 Gel-filtration chromatogram for AmyB (B2) and ΔAmyB (B3)

Figure 3.4 Analysis of protein purity by SDS-PAGE

Figure 3.5 Rate of starch degradation by AmyB and ΔAmyB

Figure 3.6 Binding of AmyB and ΔAmyB to raw starch as a function of [NaCl]

Figure 3.7 Tm values for AmyB and ΔAmyB as a function of NaCl concentration at different pH values

Figure 3.8 Tm values for AmyB and ΔAmyB as a function of pH at different NaCl concentrations

Figure 3.9 Morphology of slow-growing, non-optimized AmyB crystal forms I-III

Figure 3.10 Morphology of AmyB crystal form IV

Figure 3.11 Diffraction pattern of the AmyBACR crystal

Figure 3.12 Crystal packing in the C2 unit cell of the AmyB-III crystal form


Figure 3.13 Ramachandran analysis of the AmyB models

Figure 3.14 Representative electron density for AmyB models

Figure 3.15 Overall structure of AmyB

Figure 3.16 Topology diagram for the catalytic A/B domain in H. orenii AmyB

Figure 3.17 Topology diagram of the AmyB-C domain

Figure 3.18 Overall fold of the AmyB-N domain

Figure 3.19 Structural superposition of H. orenii AmyB-N with P. syringae CopC

Figure 3.20 Domains that are topographically similar to the AmyB-N domain

Figure 3.21 The location of the N domain in α-amylases

Figure 3.22 Chair configuration of acarbose

Figure 3.23 Electron density around the –2 subsite in the A-B groove

Figure 3.24 Binding of the nonasaccharide in the A-B groove of AmyB

Figure 3.25 Binding of acarbose in the N-C groove of AmyB

Figure 3.26 Picture showing the positions of carbohydrate bound to AmyBACR

Figure 3.27 Picture showing the positions of carbohydrate bound to AmyBMAL7-ACX

Figure 3.28 The active site in AmyB

Figure 3.29 Comparison of the active-site loops in AmyB and AmyA

Figure 4.1 Electrostatic potential surfaces at different salt concentrations

Figure 4.2 Model of full-length AmyB on the lipid membrane


LIST OF ABBREVIATIONS USED

amyA Halothermothrix orenii α-amylase type A gene

AmyB Halothermothrix orenii α-amylase type B, full-length enzyme

amyB Halothermothrix orenii α-amylase type B gene

AmyBACR AmyB in complex with the inhibitor acarbose

AmyB-N N domain of H. orenii AmyB

ASA solvent-accessible surface area

BLAST Basic Local Alignment Search Tool

BSA bovine serum albumin

CAZy Carbohydrate-Active Enzyme database

CBH cellobiohydrolase

CBM carbohydrate-binding module

CE carbohydrate esterase

CGTase cyclodextrin glycosyltransferase

Cop copper resistance protein

cop copper resistance operon

C-terminal carboxy-terminal


ΔAmyB Halothermothrix orenii α-amylase type B, lacking N domain

DNA deoxyribonucleic acid

Hepes 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid

IDA iminodiacetic acid

IMAC immobilized metal affinity chromatography

IPTG isopropyl β-D-1-thiogalactopyranoside

IUBMB International Union of Biochemistry and Molecular Biology

Mes 2-(N-morpholino)ethanesulfonic acid, or 4-morpholineethanesulfonic acid

Mops 3-(N-morpholino)propanesulfonic acid, or 4-morpholinepropanesulfonic acid

MPD 2-methyl-2,4-pentanediol


NTA nitrilotriacetic acid

OD600 optical density measured at 600 nm

PCR Polymerase Chain Reaction

PEG polyethylene glycol

PL polysaccharide lyase

r.m.s root-mean-square

r.m.s.d root-mean-square deviation

SDS-PAGE sodium dodecyl sulfate polyacrylamide gel electrophoresis

ThMA Thermus maltogenic amylase

TIM triosephosphate isomerase

TLS translation, libration, screw-rotation

Tris 2-amino-2-(hydroxymethyl)-1,3-propanediol

TVA I Thermoactinomyces vulgaris α-amylase I

TVA II Thermoactinomyces vulgaris α-amylase II

X-gal 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside


molecule, the N-C groove. Results from starch-binding studies using full-length AmyB and a truncated mutant lacking the N domain show that the presence of the N domain enhances binding to the insoluble substrate. Moreover, the present study has confirmed a sequence signal for a lipoprotein peptide in AmyB that serves to anchor the enzyme to the bacterial membrane. Based on the above observations we have produced a tentative model as to how the full-length enzyme is immobilized to the membrane surface. Results presented in this thesis show that AmyB is indeed unique compared with other α-amylases in that it is membrane bound, monomeric, and carries an N-terminal domain between the membrane linker and domain A that forms a large groove for binding of raw starch. We observe that for AmyB the conditions for maximal stability to unfolding and stability at maximum catalytic performance do not coincide; and we provide a rational explanation for the tendency of the other H. orenii amylase, AmyA, to aggregate in the absence of salt.


1 INTRODUCTION

1.1 Extremophiles

Over the years, a large number of microorganisms have been isolated from a variety of different habitats, some of which are termed extreme environments and are inhospitable to humans. Extreme environments include those with extremely low or high temperatures, low or high pH, high salinity or high pressure. While there is basic biological interest in understanding how microorganisms are able to adapt and survive under extreme conditions, there is also biotechnological and industrial interest in their enzymes. Increased understanding of the underlying adaptive mechanisms would help to better utilize the enzymes industrially and to tailor them for specific industrial bioprocessing purposes.

1.1.1 Adaptation to high salinity

The main problem that microorganisms face in a saline environment is water loss by osmosis. As the cytoplasmic membrane is permeable to water, one possible solution would be to regulate the osmotic potential of the cytoplasm such that it equals that of the outside environment. The osmotic potential of the cytoplasm can be increased by accumulation of either inorganic salts (“salt-in” strategy), or osmolytes, i.e., low molecular weight non-salt compounds (“compatible-solute” strategy; Madigan & Oren, 1999).

The compatible-solute strategy is used by most halophilic bacteria, eukaryotic algae, fungi and even some halophilic methanogenic archaea. Here, the strategy is to balance the osmotic pressure of the medium by organic compatible solutes, or osmolytes (Madigan & Oren, 1999). This strategy does not require any adaptation of the intracellular system, and thus, enzymes that are not adapted to high-salt conditions would still be stable and active in the cytoplasm.


Possible osmolytes include carbohydrates, amino acids, methylamine, and methylsulphonium zwitterions. These are highly water-soluble, polar molecules that are uncharged or zwitterionic at physiological pH. The concentration of osmolytes is regulated according to the external salt concentration. Although the salt concentration of the cytoplasm would be low, it contains high concentrations of osmolytes that ensure osmotic balance (Grant et al., 1998). In addition to increasing the osmotic potential of the cytoplasm, osmolytes are also able to stabilize proteins that are under the stress of heat or pressure. This strategy provides the host organism with a high degree of flexibility and adaptability to differing environmental conditions (Grant et al., 1998). The categorization of salt tolerance and optimal salt conditions adopted by Grant and co-workers (1998) is listed in Table 1.1.

Table 1.1 Categories of halophiles


been purified from a group of the Halobacteriaceae family. Most members of this family use the salt-in strategy, and accumulate high concentrations of potassium chloride in the cytoplasm. Proteins from halophilic microbes would generally unfold or become inactivated in low salt conditions (<1 M NaCl). Typically, these proteins show a higher content of negatively charged amino-acid residues compared to their mesophilic counterparts and concomitant lower isoelectric points (DasSarma et al., 2006). Concomitant with the higher density of acidic amino-acid residues, there is a significant reduction of lysine residues, as well as an increased number of small hydrophobic amino-acid residues, and a reduction in the number of aliphatic amino-acid residues. As a result, the molecular surfaces of halophilic proteins feature highly negative electrostatic potentials that have been suggested to be an important mechanism in halophilic adaptation (Madern et al., 2000). The negative electrostatic surface potential appears to be attributed mainly to an increase in aspartate residues (Fukuchi et al., 2003).

It should be noted that, although the term “halophilic protein” implies an enzyme that is active and stable only at high salt conditions, there are halotolerant proteins that, while stable and active at high salt concentrations, have evolved to function at lower salt concentrations by mechanisms that remain poorly understood.

1.1.2 Thermal adaptation

There are no firm rules that govern the thermostability of proteins; however, there are several strategies with which thermostability can be attained. Structural strategies include highly hydrophobic cores, reduced surface-to-volume ratios, a decrease in glycine content, a high number of electrostatic interactions, higher states of oligomerization, and shortening of surface loops (Madigan & Oren, 1999). A key feature of thermophilic proteins is a bias in amino acid composition. Thermophilic proteins are usually rich in charged amino-acid residues while having few polar amino-acid residues. Thermophilic proteins, unlike their mesophilic and non-halophilic counterparts, show an increase in alanine and threonine residues at the expense of asparagine residues (Fukuchi et al., 2003). Comparisons of the amino-acid composition between the molecular surfaces of thermophilic and mesophilic proteins show that the bias in composition occurs mainly at the molecular surface (Fukuchi & Nishikawa, 2001).

Both halophilic and thermophilic proteins show an increase in the number of charged amino-acid residues compared with their mesophilic homologs. However, the halophilic proteins show a bias towards acidic amino-acid residues, while the thermophilic proteins have an equal partitioning of acidic and basic amino-acid residues on the surface. The simultaneous increase in acidic and basic amino acids enables more ion pairs to form at the surface that may help stabilize thermophilic proteins (Karshikoff & Ladenstein, 2001; Fukuchi et al., 2003). Ion pairs are often formed between side chains that are distant in the amino-acid sequence, and they tend to be organized into networks that can be found on the protein surface, partially buried inside the protein, or at domain or subunit interfaces. Such networks show a high degree of cooperativity to the extent that the stabilization effect cannot be reduced to merely the sum of ion-pair interactions. Although ion pairs have an important role in the stabilization of thermophilic proteins, they are not the sole determinants of thermostability.
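As a rough illustration of the compositional bias discussed in this section, the following minimal Python sketch computes the fractions of acidic (Asp, Glu) and basic (Lys, Arg) residues in a sequence. The function name, the decision to ignore histidine, and the two toy sequences are assumptions made purely for illustration; they are not taken from AmyB, AmyA, or any dataset cited here, and whole-sequence composition is only a crude proxy for the surface composition that the cited studies analyze.

```python
from collections import Counter

ACIDIC = frozenset("DE")  # aspartate, glutamate
BASIC = frozenset("KR")   # lysine, arginine (histidine ignored: an assumption)


def charged_residue_profile(sequence: str) -> dict:
    """Return the fractions of acidic and basic residues and their difference."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)
    acidic = sum(counts[aa] for aa in ACIDIC) / n
    basic = sum(counts[aa] for aa in BASIC) / n
    return {"acidic": acidic, "basic": basic, "excess_acidic": acidic - basic}


# Hypothetical toy sequences, invented for this example only.
halophile_like = "DEDEADEEVDLDEK"    # strongly biased towards acidic residues
thermophile_like = "KEKERELKDRVEAK"  # acidic and basic residues roughly balanced

for name, seq in [("halophile-like", halophile_like),
                  ("thermophile-like", thermophile_like)]:
    profile = charged_residue_profile(seq)
    print(name, {key: round(value, 2) for key, value in profile.items()})
```

A large positive `excess_acidic` would be in line with the acidic bias described for halophilic proteins, whereas a value near zero would be in line with the balanced partitioning described for thermophilic proteins.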

A major shortcoming of most studies that attempt to explain the mechanisms of thermal adaptation is that only small sets of proteins are compared and analyzed. However, well into the post-genomic era it is now possible to take full advantage of genomic data and the outcomes of structural-genomics projects to analyze, with statistical significance, various factors responsible for the adaptation process. In a recent study (Robinson-Rechavi et al., 2006), a large dataset of protein structures from the hyperthermophilic bacterium Thermotoga maritima was compiled and analyzed together with structures of close protein homologs of mesophilic origin. The results showed that, contrary to what has been suggested previously, factors such as oligomerization order, hydrogen bonds, and secondary structures are of minor importance to the adaptation process in bacteria. Statistically significant contributions to stability were observed for density of salt bridges and compactness, which accounted for changes in 96% of the protein pairs studied.

1.1.3 Stabilization mechanisms

One hypothesis (Mevarech et al., 2000) suggests that stabilization of halophilic proteins by means of excess acidic amino-acid residues is best explained by the solvation-stabilization model. In this model, acidic surface-exposed amino acids bind cooperatively to a network of hydrated salt ions to which water molecules become associated to form a solvation shell. At reduced salt concentration, however, the protein-associated solvation shell is depleted of salt ions, which may destabilize the protein and induce unfolding (Mevarech et al., 2000). The strength of solvent-protein interactions is solvent and salt dependent, and thus, factors such as complex ion-pair networks, weak protein-protein interactions, and specific ion binding also contribute to the stability and solubility of halophilic proteins (Premkumar et al., 2005).

1.2 Classification of carbohydrate-active enzymes

1.2.1 Carbohydrates as enzyme substrates

Carbohydrates are present in large amounts everywhere on Earth with functions ranging from building blocks and energy reserves in our bodies to mechanical reinforcement in trees. At the cellular level, carbohydrates are involved in a large spectrum of intracellular and extracellular processes ranging from energy storage to delicately controlled and specific molecular signaling reactions. Sugar compounds display high stereochemical diversity: a hexasaccharide can give rise to more than 10¹² different isomers, and living organisms have efficiently taken advantage of this variation by producing enzymes that can degrade virtually all different types of saccharides; simple or complex, non-polymeric or polymeric, crystalline or non-crystalline. A vast number of enzymes act to cleave, synthesize or modify glycosidic bonds in carbohydrate compounds, and therefore, the need for a robust method for classification of carbohydrate-active enzymes was realized early on. Before discussing the classification of carbohydrate-active enzymes and their reaction mechanisms in depth, a few definitions regarding carbohydrates will be provided (Fig. 1.1).

Figure 1.1 Chair representation of cellobiose. The disaccharide consists of two D-glucose units linked covalently by a β-1,4 glycosidic bond. The unblocked hemiacetal group is at the reducing end of the disaccharide.

Carbohydrates containing a six-member ring such as D-glucose are referred to as pyranoses. In aqueous solution, D-glucose exists in equilibrium with an open-chain aldehyde, two pyranose forms, two furanose forms (five-member rings) and the hydrated form of the open chain (the keto form). The two pyranose rings differ such that one form has the C1 hydroxyl group in equatorial position (i.e., β form with the hydroxyl group cis to the exocyclic C6-O6 group) and the other in axial position (i.e., α form with the hydroxyl group trans to the C6-O6 group). Unless the anomeric C1 carbon is protected, inter-conversion (tautomerisation) will occur between the α and β forms. The two forms are referred to as the α and β anomers, or stereoisomers, of D-glucose. In the case of D-glucose, the C1 carbon is also known as the anomeric carbon, or the hemiacetal carbon, indicating that this is the position where the chain can open up to yield the open form.

The covalent linking of two carbohydrates involves a dehydration-synthesis during which a hydrogen atom is removed from one sugar unit and a hydroxyl group is removed from the other with the formation of one water molecule. The new bond is termed a glycosidic bond, and cleavage of a glycosidic bond occurs by hydrolysis. When several sugar units are linked by glycosidic bonds, different types of polymers are formed depending on the types of carbohydrate building blocks used. In the polymeric form, the end of the polymeric chain that has a free, unprotected anomeric carbon is referred to as the reducing end. The name refers to the ability of the ring to open up at the anomeric carbon to give the open aldehyde form. The aldehyde group readily reduces other molecules and ions, whereby the aldehyde becomes oxidized to the carboxylic form. When the reducing-end sugar of a chain is linked to the hydroxyl group of another sugar, it is converted to an acetal that is unable to open to the aldehyde or keto form. The sugar is then said to be non-reducing. Thus, an extended carbohydrate chain has directionality where one end is referred to as the non-reducing end, and the other as the reducing end.
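The bookkeeping of dehydration-synthesis can be made concrete with a small calculation: each glycosidic bond formed releases one water molecule, so a linear chain of n glucose units contains n - 1 glycosidic bonds and has lost n - 1 waters. The Python sketch below is a minimal illustration under that assumption (unbranched chain of D-glucose only); the function name is hypothetical, and the constants are the standard average molar masses of glucose and water.

```python
GLUCOSE_MASS = 180.156  # g/mol, C6H12O6
WATER_MASS = 18.015     # g/mol, H2O


def linear_glucan_mass(n_units: int) -> float:
    """Molar mass of an unbranched chain of n glucose units.

    Each of the (n - 1) glycosidic bonds is formed with loss of one water.
    """
    if n_units < 1:
        raise ValueError("a chain needs at least one glucose unit")
    n_bonds = n_units - 1
    return n_units * GLUCOSE_MASS - n_bonds * WATER_MASS


# A disaccharide such as cellobiose or maltose (one glycosidic bond)
print(round(linear_glucan_mass(2), 1))  # 342.3
# Maltoheptaose (seven glucose units, six glycosidic bonds)
print(round(linear_glucan_mass(7), 1))  # 1153.0
```

Hydrolysis runs the same arithmetic in reverse: cleaving one glycosidic bond consumes one water molecule, so the summed mass of the two products exceeds that of the substrate by the mass of one water.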

1.2.2 Carbohydrate-active enzymes are classified in CAZy

The historical classification of enzymes provided by the International Union of Biochemistry and Molecular Biology (IUBMB) Enzyme nomenclature (www.chem.qmul.ac.uk/iubmb) classifies enzymes based on their substrate specificity, and in some cases also on the reaction mechanism. This classification does not take into consideration the three-dimensional (3-D) structure of the enzymes, nor does it account for the fact that some enzymes use multiple substrates; for instance, many endoglucanases that hydrolyze cellulose are also able to cleave xylan, xyloglucan, β-glucan as well as some artificial substrates. In the early 90’s, the need for a better classification system for carbohydrate-active enzymes prompted Henrissat and co-workers to investigate the relation between enzymes in more detail, and over the past decade, an impressive amount of data has been collected and implemented into a rigorous classification system (Henrissat, 1991; Henrissat & Bairoch, 1993; Davies & Henrissat, 1995; Henrissat & Davies, 1997; Davies et al., 2005a) that is now available at the Carbohydrate-Active Enzyme database (afmb.cnrs-mrs.fr/CAZY).

This method uses similarities in amino-acid sequence analyzed by hydrophobic-cluster analysis, and provides useful information beyond substrate specificity, such as protein structure and evolutionary relationships, in addition to functioning as a tool to derive and predict mechanistic information. The method better reflects the structural and evolutionary features of the enzymes and, in addition to sequence similarity, members of a given family will share a common 3-D structure and reaction mechanism. Carbohydrate-active enzymes are often modular, containing a catalytic domain (module) linked to one or more modules with other functions such as carbohydrate binding. Currently, the catalytic modules of carbohydrate-active enzymes are assigned to one of four CAZy groups: i) Glycoside Hydrolases (GH) that hydrolyze glycosidic bonds in sugar compounds; ii) Glycosyl Transferases (GT) that synthesize glycosidic bonds by transferring activated donor molecules to specific acceptor sugars; iii)

Polysaccharide Lyases (PL) that cleave polysaccharide chains by a β-elimination reaction to give a double bond at the newly produced non-reducing end; and, iv) Carbohydrate Esterases (CE) that catalyze de-O- or de-N-acylation of substituted saccharides, in which the sugar acts either as an acid (pectin methyl esterases), or as an alcohol.

In addition to the classification of catalytic modules, the non-catalytic modules often associated with the catalytic domains of carbohydrate-active enzymes constitute a separate group, the Carbohydrate-Binding Modules (CBM). In this group, modules are found attached to catalytic modules, but are not catalytic per se. Usually the CBMs function to bind to polymeric carbohydrate substrates (e.g., cellulose, xylan, starch, chitin), but for some CBMs, carbohydrate binding has not been demonstrated, and thus, their functions remain unclear. As of November 2006, there are 108 GH families, 87 GT families, 18 PL families, 14 CE families, and 48 CBM families. To date, 25760 amino-acid sequences of GHs have been grouped into the 108 GH families, 2178 3-D structures of GHs have been deposited with the Protein Data Bank (www.rcsb.org), and of the 108 GH families, 42 still lack information about their 3-D structures.

1.2.3 Reaction mechanisms of glycoside hydrolases

Enzymatic hydrolysis of glycosidic bonds by GH enzymes involves general-acid catalysis. As mentioned above, glycoside hydrolases are classified into as many as 108 different GH families; however, the mechanism whereby the glycosidic bond between sugar units is hydrolyzed can only be one of two possible types (Koshland, 1953; Sinnott, 1990): an inverting (bimolecular SN2 nucleophilic substitution), or a retaining (unimolecular SN1 nucleophilic substitution, or double-displacement) mechanism (Fig. 1.2). The two mechanisms differ by their stereochemical outcome, but have some common features. In both mechanisms, the proton donor is located within hydrogen-bonding distance of the glycosidic oxygen of the susceptible bond, and the reaction proceeds via an oxocarbenium ion-like transition state.

In the inverting mechanism (Fig. 1.2 a), a single displacement occurs at the anomeric carbon. The reaction is catalyzed by two carboxylate residues: a proton donor acting as a general acid, and a nucleophile acting as a general base. The reaction starts by protonation of the glycosidic oxygen and release of the leaving group concomitant with the nucleophilic attack by a water molecule, i.e., the existing bond is broken and the new bond is formed in one concerted operation. The reaction requires that the two catalytic amino-acid residues be on opposite sides of the substrate, which results in net inversion of configuration at the anomeric carbon.

Figure 1.2 Reaction mechanisms for glycoside hydrolases. The (a) inverting and (b) retaining mechanism (picture adapted from CAZy). See text for details on the reaction schemes.


The retaining reaction mechanism (Fig. 1.2 b) involves two steps with two successive displacements at the anomeric carbon, the glycosylation step and the deglycosylation step. In the first step, glycosylation, the substrate is bound to the enzyme. The general acid donates a proton to the glycosidic oxygen, and the leaving group departs before the protein nucleophile attacks at the anomeric carbon to form a covalent intermediate. As a result of the single displacement, the resulting glycosyl-enzyme intermediate has an inverted configuration at the anomeric carbon relative to the original configuration. During the next step, deglycosylation, the glycosyl-enzyme intermediate is hydrolyzed by a general-base catalyzed attack by water on the asymmetric center. This displacement causes another inversion of configuration at the anomeric carbon and the configuration assumes that of the original state; thus the completed reaction gives a net retention of configuration. In the active sites of retaining and inverting GH enzymes, the two catalytic amino-acid residues are typically positioned roughly 5.5 Å and 10 Å apart, respectively (Davies & Henrissat, 1995; McCarter & Withers, 1994). The rather long distance between the proton donor and nucleophile in inverting enzymes is important as it accommodates the catalytic water molecule.
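The stereochemical outcome of the two mechanisms reduces to counting displacements at the anomeric carbon: each displacement inverts the configuration, so one displacement gives net inversion and two give net retention. The Python sketch below illustrates this parity argument together with a deliberately crude heuristic that picks whichever typical carboxylate separation quoted above (5.5 Å or 10 Å) lies closer to a measured distance. All names are hypothetical, and the nearest-distance rule is an assumption for illustration only, not a validated way of assigning mechanism.

```python
# Number of displacements at the anomeric carbon for each mechanism.
DISPLACEMENTS = {"inverting": 1, "retaining": 2}

# Typical separation of the two catalytic carboxylates quoted in the text (in Å).
TYPICAL_DISTANCE = {"retaining": 5.5, "inverting": 10.0}

FLIP = {"alpha": "beta", "beta": "alpha"}


def product_anomer(substrate_anomer: str, mechanism: str) -> str:
    """Anomeric configuration of the product; each displacement inverts it."""
    if substrate_anomer not in FLIP:
        raise ValueError("substrate_anomer must be 'alpha' or 'beta'")
    if mechanism not in DISPLACEMENTS:
        raise ValueError("mechanism must be 'inverting' or 'retaining'")
    anomer = substrate_anomer
    for _ in range(DISPLACEMENTS[mechanism]):
        anomer = FLIP[anomer]
    return anomer


def closest_mechanism(carboxylate_distance: float) -> str:
    """Illustrative heuristic: mechanism whose typical distance is nearest."""
    return min(TYPICAL_DISTANCE,
               key=lambda m: abs(TYPICAL_DISTANCE[m] - carboxylate_distance))


print(product_anomer("alpha", "retaining"))  # alpha
print(product_anomer("alpha", "inverting"))  # beta
print(closest_mechanism(6.0))                # retaining
```

For an α-linked substrate, a retaining enzyme therefore releases an α-anomeric product, while an inverting enzyme releases the β anomer.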

1.2.4 Modes of action in GH enzymes

The mode with which a glycoside hydrolase binds and attacks its sugar substrate is typically reflected by the shape of the enzyme molecule, and the distribution of amino acids at the molecular surface. Three principal active-site topologies can be distinguished based on the requirements of the different modes of action and the pre-hydrolytic requirements: a pocket, cleft, or tunnel (for a review, see Davies & Henrissat, 1995; Fig. 1.3). The pocket (Fig. 1.3 a) is usually employed for recognition of a monosaccharide present at the non-reducing end of a chain. This mode is observed mainly in monosaccharidases (e.g., β-galactosidase, β-glucosidase, sialidase and neuraminidase), and exopolysaccharidases (e.g., glucoamylases and β-amylases).

In the case of the amylases, the need for a pocket is nicely demonstrated by the architecture of the substrate in that starch assumes a helical structure that exposes a large number of protruding non-reducing chain ends. A pocket is clearly not optimal for a fibrous substrate like cellulose, which has few free, exposed chain ends.

Figure 1.3 Active-site topologies of glycoside hydrolases. Different types of active-site topologies and the mode of action of GH enzymes: (a) pocket; (b) cleft or canyon; and (c) tunnel. Enzymes shown: CBH, cellobiohydrolase; EG, endoglucanase; MS, monosaccharidases; ExP, exopolysaccharidases.

The cleft, or canyon (Fig. 1.3 b), allows random hydrolysis of glycosidic bonds in polymeric substrates and is commonly found in endopolysaccharidases such as lysozyme, endocellulases, chitinases, α-amylases and xylanases. A tunnel (Fig. 1.3 c) is formed mainly by extended loops that come together to close off the binding site. This type of substrate-binding site is used primarily by cellobiohydrolases that hydrolyze polymeric chains from either the reducing or the non-reducing end. The tunnel is thought to be important for processivity in that the product can be released successively while the enzyme is still bound to the substrate (Rouvinen et al., 1990; Divne et al., 1994, 1998).

1.2.5 The TIM barrel is a recurrent fold in GH enzymes

Similar substrate specificity is found in members of different families indicating convergent evolution However, mechanisms of divergent evolution are also observed since different substrate specificities are encountered within the same family The 3-D structures of proteins are more conserved than their amino-acid sequences which makes it possible to define clusters of related GH families, termed clans, with similar folds and both substrate and product configuration The

108 GH families can be grouped into 14 different clans denoted A-N (Table 1.2)

The most common fold among the catalytic domains of glycoside hydrolases is the TIM barrel, or (β/α)8 barrel, that is encountered in four clans (A, D, H and K), representing 24 different GH families (Table 1.2, Fig. 1.4). The name TIM refers to the enzyme triosephosphate isomerase in which it was first discovered (Banner et al., 1975; Wierenga, 2001).

The idealized TIM barrel is composed of eight twisted parallel β strands arranged close together into a barrel. Each parallel β strand is connected to an α helix that packs onto the β sheet on the outside of the barrel. The structure can be regarded as consisting of repeated βαβ motifs. TIM barrels are observed predominantly in enzymes, and a canonical feature is the location of the active sites at the C-terminal end of the barrel (i.e., the C-terminal end of the parallel β strands).

Table 1.2 Clans and folds of glycoside hydrolases

Clan    Fold          GH families
GH-A    (β/α)8        1, 2, 5, 10, 17, 26, 30, 35, 39, 42, 50, 51, 53, 59, 72, 79, 86
GH-B    β-jelly roll  7, 16
GH-C    β-jelly roll  11 (xylanases), 12 (endoglucanases)

Figure 1.4 GH clans with the TIM-barrel fold. Ribbon drawings showing the four GH clans that represent the (β/α)8-barrel fold. Clan A is represented by Streptomyces β-glucosidase (family GH1, PDB code 1GNX; Guasch et al., 1999); clan D by Trichoderma reesei α-galactosidase (family GH27, PDB code 1SZN; Golubev et al., 2004); clan H by Pyrococcus woesei α-amylase (family GH13, PDB code 1MXG; Linden et al., 2003); and clan K by Serratia marcescens chitinase (family GH18, PDB code 1CTN; Perrakis et al., 1994).

1.2.6 Linking CBMs to catalytic domains

The catalytic domains of GH proteins are frequently associated with a non-catalytic carbohydrate-binding module (CBM). As mentioned above in section 1.2.2, 48 CBM families have hitherto been defined. For many CBMs in the CAZy database, the ability of the module to bind carbohydrate has been shown and the specificity determined. However, for most CBMs, binding studies have either not been performed, or the results have proved inconclusive as to whether the module binds or interacts with the carbohydrate. For this reason, these modules may sometimes be referred to simply as non-catalytic modules (NCMs). As will be discussed later in this thesis, starch-degrading enzymes often contain NCMs that do not appear to bind to sugar but are nevertheless classified as CBMs by CAZy. The CBM families can be grouped into seven fold families (outlined in Table 1.3) according to their structural similarities (Boraston et al., 2004). More recent structure determinations show additional members of the β-sandwich CBM family: CBM20, 21, 25, 26, 30, 31, 33, 41, 44, 47 and 48; as well as a β-trefoil structure, CBM42; and a lectin-like fold, CBM40.

Table 1.3 CBM fold families

The CBMs can also be classified into one of three groups depending on their mode of carbohydrate binding (Table 1.4): type A, surface binding; type B, glycan-chain binding; and type C, small-sugar binding.

Table 1.4 Types of carbohydrate binding platforms

Type  Fold family  CBM families
B     1            2b, 4, 6, 15, 17, 20, 22, 27, 28, 29, 34, 36
C     1, 2, 6, 7   9, 13, 14, 18, 32

As seen in Table 1.4, all three types of binding modes can be accommodated by fold family 1, representing the β sandwich (Boraston et al., 2004), which clearly emphasizes the versatility of this fold. The major factors that shape the binding sites in CBMs are aromatic protein side chains and the conformation of loop structures. The different types of CBM platforms (planar, twisted, and sandwich) that can be formed using aromatic side chains are shown in Figure 1.5.

Type-A CBMs bind to insoluble, highly crystalline cellulose and/or chitin using a flat, platform-like binding surface (Fig. 1.5 a). Type-B CBMs have their binding site located in a cleft or groove where the ligand is bound to a series of subsites (Fig. 1.5 b-c). These CBMs have limited interactions with the oligosaccharides, and affinity increases with chain length (highest affinity for hexasaccharides). Type-C CBMs have a lectin-like mode of binding to mono-, di- or trisaccharides.

Figure 1.5 Sugar-binding platforms in CBMs. (a) Planar platform with three Trp residues in a type-A CBM (Pseudomonas fluorescens CBM10; PDB code 1QLD; Raghothama et al., 2000); (b) twisted platform using one Tyr and two Trp residues in a type-B CBM (Piromyces equi CBM29; PDB code 1GWL; Charnock et al., 2002); and (c) sandwich platform with three Tyr residues in a type-B CBM (Cellulomonas fimi CBM4; PDB code 1GU3; Boraston et al., 2002). The Cα backbone is shown as a beige coil with carbohydrate-interacting aromatic residues drawn as stick objects. The carbohydrates in (b) and (c) are colored green.

1.3 Starch and starch-processing enzymes

1.3.1 Starch - the enzyme substrate

Starch is the major carbohydrate reserve in plant tubers and seed endosperm, where it is found as spherical granules (Buléon et al., 1998), and the pure polymer consists predominantly of α-glucan. Polymerization of starch involves the covalent linking of α-D-glucose units in the ⁴C₁ chair conformation by α-glycosidic bonds, either α-1,4 or α-1,6 linkages. The α-1,4 glycosidic bond (Fig. 1.6 a) joins the C1 carbon atom of a glucosyl residue i to the C4 carbon atom of a glucosyl residue (i–1) to form the disaccharide maltose. Successive addition of glucosyl residues joined by α-1,4 linkages results in the formation of a roughly linear, non-branched polymeric chain known as amylose (Fig. 1.6 b). In addition, α-1,6 glycosidic bonds can form between the C1 and C6 atoms such that a new branch is formed, as in the starch polymer amylopectin (Fig. 1.6 c).

Figure 1.6 Structure of starch components. Chair-configuration representations of (a) maltose, (b) amylose, (c) amylopectin. The numbering convention of the carbon atoms in the pyranose ring is shown for the reducing-end glucosyl moiety of maltose. The α-1,4 and α-1,6 glycosidic bonds are indicated. Amylose is linear and unbranched whereas amylopectin is branched with α-1,6 glycosidic bonds at the branch points.

The relative proportions of amylose and amylopectin vary, with typical values of 15-25% and 75-85%, respectively. The degree of polymerization (DP) of the non-branched, linear amylose molecule is in the range 1000 to 6000 glucose units (Fig. 1.6 b). Amylopectin, the branched polymer, consists of linear chains with a typical length of 12-120 glucose units linked by α-1,4 bonds. The branches are linked at the branch points by α-1,6 bonds, and range in size from 15 to 45 glucose units (Fig. 1.6 c). The length of both the α-1,4 backbone and the α-1,6-linked branches varies depending on the botanical origin. The DP of amylopectin can be as high as 2,000,000 glucose units with roughly 95% α-1,4- and 5% α-1,6-bond content. In addition, hydrogen bonding between O3 and O2 atoms of adjacent sugar residues generates a helical twist of the polymer.
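The bond percentages quoted above imply a rough number of branch points and a mean chain length. The following is a minimal Python sketch of that arithmetic. It assumes (this is not stated in the text) a tree-like molecule in which n residues are joined by n − 1 glycosidic bonds and every α-1,6 bond starts exactly one new chain; the function name is illustrative only.

```python
def estimate_branching(dp, alpha_1_6_fraction=0.05):
    """Rough branch-point count and mean chain length for a branched glucan.

    Assumption (not from the text): the molecule is tree-like, so dp residues
    are joined by dp - 1 glycosidic bonds, and each alpha-1,6 bond starts
    exactly one new chain.
    """
    total_bonds = dp - 1
    branch_points = round(total_bonds * alpha_1_6_fraction)
    chains = branch_points + 1  # the original chain plus one chain per branch
    mean_chain_length = dp / chains
    return branch_points, mean_chain_length


# Upper-end DP quoted in the text, with 5% alpha-1,6 bonds.
branch_points, mean_length = estimate_branching(2_000_000)
print(branch_points, round(mean_length, 1))
```

Under these assumptions, a DP of 2,000,000 corresponds to on the order of 100,000 branch points and a mean chain length of about 20 glucose units, which falls within the 15 to 45 unit range quoted for the branches.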

Amylose in starch is present as double-helical A- and B-amyloses and as single-helical V-amylose. V-amylose is found naturally in non-A and non-B segments of amylose and is folded into a left-handed single helix. The 1.0-Å resolution crystal structure of V-amylose (Gessler et al., 1999) shows that the molecule folds into two short, antiparallel left-handed helices that are related by two-fold rotational pseudo-symmetry (Fig. 1.7 a). The V-form of amylose contains a channel along the helical axis that is filled with disordered water molecules and that can also bind molecules such as iodine, giving the characteristic blue complex (“iodine’s blue”).

Figure 1.7 Helical structure of V- and A-amylose. Stick models of (a) V-amylose, diameter of 9-10 Å; (b) amylose (perpendicular to helical axis), diameter 6-7 Å; (c) amylose (down the helical axis); and (d) A-amylose (perpendicular to tilted helical axis).

Based on techniques such as X-ray fiber diffraction, electron diffraction, solid-state 13C cross-polarization/magic angle spinning NMR spectroscopy and computer-aided modeling, the structures of A- and B-amylose have been shown to contain parallel double helices, where each helix contains six glucose residues per helical turn over 21.38 Å (Fig. 1.7 b-d). Due to the inherent limitations of these techniques, the handedness of the helices cannot be determined unambiguously, and thus the helices can have either a right-handed (Murphy et al., 1975; Brisson et al., 1991; Veregin et al., 1987; Gidley & Bociek, 1988) or a left-handed (Imberty et al., 1988) twist. The A- and B-forms differ only in the precise packing arrangement and their water content. The A-form is found preferentially in cereals, and the B-form in tubers.
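The helical repeat quoted above (six glucose residues per turn over 21.38 Å) fixes the axial rise and the rotation per residue for each strand. The following is a small Python sketch of that calculation; the function names are illustrative, and the length estimate assumes an ideal, perfectly regular helix.

```python
def helix_parameters(pitch_angstrom=21.38, residues_per_turn=6):
    """Return (axial rise per residue in Å, rotation per residue in degrees)."""
    rise = pitch_angstrom / residues_per_turn
    twist = 360.0 / residues_per_turn
    return rise, twist


def strand_axial_length(n_residues, pitch_angstrom=21.38, residues_per_turn=6):
    """Axial length in Å spanned by n_residues of one strand.

    Assumption: an ideal helix with a constant rise per residue.
    """
    rise, _ = helix_parameters(pitch_angstrom, residues_per_turn)
    return n_residues * rise


rise, twist = helix_parameters()
print(f"rise per residue: {rise:.2f} Å, rotation per residue: {twist:.0f}°")
print(f"12 residues span about {strand_axial_length(12):.2f} Å")
```

With the values from the text, each glucose residue advances roughly 3.6 Å along the helical axis and is rotated by 60° relative to its neighbor.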

exoamylase; (iii) debranching enzymes; and (iv) transferases (Fig 1.8)

Endoamylases, the most common member being α-amylase (EC 3.2.1.1), perform random cleavage of the α-1,4 glycosidic bonds in both amylose and amylopectin to produce shorter linear and branched oligosaccharides. Exoamylases act specifically at the non-reducing end of a starch polymer to liberate either glucose or maltose from the chain end. Examples of exoamylases are β-amylases (EC 3.2.1.2), glucoamylases (EC 3.2.1.3) and α-glucosidases (EC 3.2.1.20). β-Amylase cleaves α-1,4 glycosidic bonds to release maltose, glucoamylase produces glucose with β-anomeric configuration, and α-glucosidase produces glucose with α-anomeric configuration. Glucoamylase and α-glucosidase are able to cleave both α-1,4 and α-1,6 glycosidic bonds. These amylolytic enzymes display multi-domain organization and belong to one of the following GH families: GH13 and GH57, α-amylases; GH14, β-amylases; and GH15, glucoamylases. The three types of amylolytic enzymes do not share any sequence similarity, and about 10% carry a non-catalytic domain that can bind to raw starch. Whereas α-amylases are retaining enzymes, β-amylases and glucoamylases are inverting.

Figure 1.8 Enzymes involved in starch processing. Grey and yellow rings represent D-glucose units, where the yellow rings are reducing-end glucosyl moieties. The various enzymatic activities are indicated. Adapted from van der Maarel et al., 2002.

Debranching enzymes catalyze mainly the hydrolysis of α-1,6 glycosidic bonds. There are three members in this group: isoamylase (EC 3.2.1.68), pullulanases type I (EC 3.2.1.41), and pullulanases type II (neopullulanase, EC 3.2.1.135). All three are able to hydrolyze the α-1,6 glycosidic bonds in amylopectin, resulting in the generation of long linear α-D-glucose chains. The pullulanases have the additional ability to hydrolyze pullulan, a polysaccharide that is made up of α-1,6 glycosidic linked maltotriose units. Type II pullulanases are also known as α-amylase-pullulanases or amylopullulanases because they are able to hydrolyze both α-1,4 and α-1,6 glycosidic bonds.

Transferases cleave the α-1,4 glycosidic bond of the donor molecule and transfer part of the donor to a glycosidic acceptor. The enzymes amylomaltase (EC 2.4.1.25) and cyclodextrin glycosyltransferase (EC 2.4.1.19) form new α-1,4 glycosidic bonds, while glucan-branching enzymes form new α-1,6 glycosidic bonds. Transglycosylation by amylomaltase produces a linear product, while cyclodextrin glycosyltransferase generates a cyclic product. Glucan-branching enzymes are mainly involved in the synthesis of glycogen in microorganisms, where they synthesize α-1,6 glycosidic bonds in the side chains of glycogen. The main characteristics of some of the exoamylases are given in Table 1.5.

Table 1.5 Characteristics of exoamylases

Enzyme         Bonds cleaved                     Product             Substrate preference
β-amylase      α-1,4 glycosidic bond             Maltose             None
glucoamylase   α-1,4 and α-1,6 glycosidic bonds  β-anomeric glucose  Long-chain polysaccharides
α-glucosidase  α-1,4 and α-1,6 glycosidic bonds  α-anomeric glucose  Short malto-oligosaccharides
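The exoamylase characteristics in Table 1.5, together with the EC numbers given earlier in this section, can be captured in a small lookup structure. The Python sketch below only illustrates how such a table might be queried; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exoamylase:
    name: str
    ec_number: str
    bonds_cleaved: tuple
    product: str
    substrate_preference: str


# Entries transcribed from Table 1.5 and the EC numbers quoted in the text.
EXOAMYLASES = (
    Exoamylase("beta-amylase", "3.2.1.2", ("alpha-1,4",),
               "maltose", "none"),
    Exoamylase("glucoamylase", "3.2.1.3", ("alpha-1,4", "alpha-1,6"),
               "beta-anomeric glucose", "long-chain polysaccharides"),
    Exoamylase("alpha-glucosidase", "3.2.1.20", ("alpha-1,4", "alpha-1,6"),
               "alpha-anomeric glucose", "short malto-oligosaccharides"),
)


def enzymes_cleaving(bond):
    """Return the names of the tabulated exoamylases that cleave the given bond."""
    return [e.name for e in EXOAMYLASES if bond in e.bonds_cleaved]


print(enzymes_cleaving("alpha-1,6"))
```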

Detailed investigations by MacGregor and co-workers (2001) revealed that the four reactions catalyzed by starch-processing enzymes, namely α-1,4 and α-1,6 hydrolysis and α-1,4 and α-1,6 transglycosylation, all proceed by net retention of configuration at the anomeric carbon (Kuriki, 1999). Like many other GH enzymes, they display a modular, multi-domain organization in which the catalytic domains have TIM-barrel structures. The catalytic module is attached, either directly or indirectly, to various non-catalytic modules. By means of varying the non-catalytic modules, the enzymes are able to adopt a wide range of substrate and product specificities (van der Maarel et al., 2002).

1.3.3 The α-amylase superfamily

Members of the α-amylase superfamily (Janecek et al., 1997, 2003; MacGregor et al., 2001; Svensson, 1994) are found in two CAZy families, GH13 and GH57. The α-amylases hydrolyze α-1,4-glycosidic bonds with net retention of configuration at the anomeric carbon, and are strictly calcium dependent. The principal products from starch degradation by α-amylases are maltotriose and maltose from amylose, and maltose and glucose from amylopectin. Three distinct domains are commonly present in α-amylases, referred to as domains A, B and C (MacGregor et al., 2001; Fig. 1.9 a-c). The A/B domains constitute the catalytic domain, and assume the fold of a TIM barrel (Wierenga, 2001). Domain A contains the active site with three catalytic carboxylate residues (one glutamic and two aspartic acids) and, together with domain B, is also responsible for the cooperative binding of at least one Ca2+ ion. Domain B constitutes an insertion between the third β strand and the third α helix of domain A, and features an irregular β-rich structure whose precise appearance varies between different enzymes.

In some members of CAZy family GH13, domain B forms part of the substrate-binding cleft. Domain C is found in most members of the family, and is located immediately after domain A. The principal structure is a β-sandwich domain containing a Greek key motif. In deletion studies of the GH13 enzyme PalI (Zhang et al., 2003), the B domain has been shown to stabilize domain A, and the catalytic activity of the enzyme is correlated to the point of truncation in domain C. In addition, domain B helps to shield hydrophobic amino-acid residues in domain A from the solvent, and for some enzymes, domain C has also been implicated in substrate binding. It should be noted that the catalytic domain of GH57 α-amylases differs from the GH13 enzymes in that the structure lacks one β strand and one α helix, creating a (β/α)7 barrel, i.e., a TIM-like barrel.

Figure 1.9 The domain organization of α-amylases. Ribbon drawings showing the domain organization of representative α-amylases of CAZy family 13: (a) Bacillus licheniformis α-amylase (PDB code 1BLI; Machius et al., 1998); (b) isoamylase from Pseudomonas amyloderamosa (PDB code 1BF2; Katsuya et al., 1998); and (c) the Thermoactinomyces vulgaris α-amylase II (TVA II) monomer (PDB code 1BVZ; Kamitori et al., 1999). Secondary structure elements are shown as follows: β-strands as arrows, α-helices as spirals, and loops as coils. The domains discussed in the text are colored as follows: A, yellow; B, red; C, magenta; N, light blue; and linker between N and A, blue. Calcium and sodium ions are shown as blue and green spheres, respectively. The configuration of three metal ions in series, Ca2+-Na+-Ca2+, in the B. licheniformis enzyme is unique.

In addition to the three common domains, some members of family GH13 carry an extra domain, denoted N because it is found at the N terminus (Fig. 1.9 b-c). The N domain precedes domain A, to which it is connected by a peptide linker of varying length. The typical structure of domain N is a β sandwich; however, some diversity exists among different GH13 members (Janecek et al., 2003). Comparison of domain N-containing amylases shows that the precise function and location of their N domains differ (Fig. 1.9 b-c).

In some N-domain containing enzymes (Fig. 1.9 b), the N domain is involved in the formation of the active-site cleft. Domain N interacts with both domains A and B to help stabilize their flexible regions. The B domain is roughly half the size of that in most other enzymes, and its role in forming the binding cleft has been partly taken over by domain N. In addition to its contributions to stability, domain N has been suggested to have a role in the initial steps of substrate binding (Timmins et al., 2005). However, the N domains of these enzymes differ in the precise length and conformation of loop regions. Examples of this type include isoamylase from Pseudomonas amyloderamosa (Katsuya et al., 1998), maltooligosyltrehalose trehalohydrolase from Deinococcus radiodurans (Timmins et al., 2005), and Thermoactinomyces vulgaris R-47 α-amylase I (TVA I; Abe et al., 2004; 2005).

Another type (Fig. 1.9 c) is represented by the dimeric enzymes α-amylase II (TVA II) from Thermoactinomyces vulgaris R-47 (Kamitori et al., 1999) and maltogenic amylase (ThMA) from Thermus (Kim et al., 1999; Lee et al., 2002). In the crystal structures, the N domain from one subunit of the dimer forms extensive interactions with that of the other subunit, which indicates that the N domain has a function in the dimerization process. Within the subunit structure, the N domain is located too far away from the active site to participate directly in catalysis. In the dimer, however, the N domain of one subunit interacts extensively with the active site and substrate-binding cleft of the A/B domains of the other subunit, suggesting that the N domain may participate in catalysis and substrate binding in the dimer (Kamitori et al., 1999). Studies on TVA II mutants in which the N domain had been deleted suggest that the domain may have three functions: to stabilize the enzyme under conditions of extreme pH and temperature; to participate in substrate recognition and hydrolysis; and to act as a connector in the dimerization of the TVA II molecule (Yokota et al., 2001). It should be emphasized, however, that the limited biochemical and structural data available for enzymes carrying N domains prevent a more detailed discussion of the function of this domain. The homodimeric Bacillus stearothermophilus neopullulanase (PDB code 1J0J; Hondoh et al., 2003) also contains an N domain, which appears to be involved in dimer association and, to some extent, substrate binding.

As expected for retaining GHs, two amino-acid side chains in the active site play important roles in catalysis by α-amylases. A glutamate side chain acts as acid/base: in the first step of the retaining reaction it protonates the glycosidic oxygen of the bond to be cleaved, and in the second step it deprotonates the attacking hydroxyl group. A second amino acid, an aspartate, acts through its carboxyl group as a nucleophile that attacks the substrate sugar at C1, to which it forms a covalent linkage to produce a covalent enzyme-substrate intermediate. The roles of these amino acids have been demonstrated elegantly by Dijkstra and co-workers (Uitdehaag et al., 1999) for the enzyme cyclodextrin glycosyltransferase (CGTase), a member of the α-amylase family. They presented a structure of CGTase with a covalently bound reaction intermediate (4-deoxy-maltotrioside) at 1.8 Å resolution (PDB code 1CXL) that allowed a detailed picture of substrate distortion and characteristics of the intermediate, as well as the structure of CGTase in complex with a maltononaose oligosaccharide at 2.1 Å resolution (PDB code 1CXK). The architecture of the non-liganded active site in CGTase is shown in Figure 1.10, highlighting the catalytic amino acids Asp229 (nucleophile), Glu257 (acid/base) and Asp328 (helps to induce substrate distortion). Other conserved amino acids include His327 and Arg227, which help to stabilize the intermediate; Tyr100, which provides stacking interactions with the –1 sugar ring; and Asp135 and Trp75, which help provide a suitable binding site for the intermediate.
