carbohydrate binding modules – an NMR, X-ray crystallography and computational chemistry approach
Aldino Viegas1,*, Natércia F Brás2,*, Nuno M F S A Cerqueira2,*, Pedro Alexandrino Fernandes2, José A M Prates3, Carlos M G A Fontes3, Marta Bruix4, Maria João Romão1, Ana Luísa Carvalho1, Maria João Ramos2, Anjos L Macedo1 and Eurico J Cabrita1
1 REQUIMTE–CQFB, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, Portugal
2 REQUIMTE, Departamento de Química, Faculdade de Ciências do Porto, Portugal
3 Centro Interdisciplinar de Investigação em Sanidade Animal, Faculdade de Medicina Veterinária, Lisbon, Portugal
4 Instituto de Química Física Rocasolano, CSIC, Madrid, Spain
Keywords
cellulosome; Clostridium thermocellum;
CtCBM11; STD-NMR; molecular modelling;
X-ray crystallography
Correspondence
E J Cabrita, REQUIMTE-CQFB,
Departamento de Química, Faculdade de
Ciências e Tecnologia, Universidade Nova
de Lisboa, 2829-516 Caparica, Portugal
Fax: +351 212948550
Tel: +351 212948358
E-mail: ejc@dq.fct.unl.pt
M J Ramos, REQUIMTE, Departamento de
Química, Faculdade de Ciências do Porto,
4169-007 Porto, Portugal
Fax: +351 226082959
Tel: +351 226082806
E-mail: mjramos@fc.up.pt
A L Carvalho, REQUIMTE-CQFB,
Departamento de Química, Faculdade de
Ciências e Tecnologia, Universidade Nova
de Lisboa, 2829-516 Caparica, Portugal
Fax: +351 212948550
Tel: +351 212948300
E-mail: alcarvalho@dq.fct.unl.pt
*These authors contributed equally to this
work
(Received 7 February 2008, revised 7 March
2008, accepted 13 March 2008)
doi:10.1111/j.1742-4658.2008.06401.x
The direct conversion of plant cell wall polysaccharides into soluble sugars is one of the most important reactions on earth, and is performed by certain microorganisms such as Clostridium thermocellum (Ct). These organisms produce extracellular multi-subunit complexes (i.e. cellulosomes) comprising a consortium of enzymes, which contain noncatalytic carbohydrate-binding modules (CBM) that increase the activity of the catalytic module. In the present study, we describe a combined approach by X-ray crystallography, NMR and computational chemistry that aimed to gain further insight into the binding mode of different carbohydrates (cellobiose, cellotetraose and cellohexaose) to the binding pocket of the family 11 CBM. The crystal structure of C. thermocellum CBM11 has been resolved to 1.98 Å in the apo form. Since the structure with a bound substrate could not be obtained, computational studies with cellobiose, cellotetraose and cellohexaose were carried out to determine the molecular recognition of glucose polymers by CtCBM11. These studies revealed a specificity area at the CtCBM11 binding cleft, which is lined with several aspartate residues. In addition, a cluster of aromatic residues was found to be important for guiding and packing of the polysaccharide. The binding cleft of CtCBM11 interacts more strongly with the central glucose units of cellotetraose and cellohexaose, mainly through interactions with the sugar units at positions 2 and 6. This model of binding is supported by saturation transfer difference NMR experiments and linebroadening NMR studies.
Abbreviations
AMBER, assisted model building and energy refinement; CBM, carbohydrate-binding modules; Ct, Clostridium thermocellum; STD,
saturation transfer difference.
The enzymatic degradation of insoluble polysaccharides and of cellulose, in particular, is one of the most important reactions on earth. This subject is currently under intense research because glucose derivatives can be obtained from degradation of polysaccharides. After fermentation processes, compounds such as glucose derivatives [1,2], acetone, alcohols and volatile fatty acids [3,4] can be obtained that are essential for biotech and pharmaceutical industries. Furthermore, the biofuel industry has a great interest in this field because ethanol can also be directly obtained from glucose monomers [2].
Efficient methods for degrading cellulose chains have been intensively investigated worldwide within the last decade. The degradation of plant cell wall polysaccharides into soluble sugars has been found to be possible either by chemical means or by certain microorganisms. The latter method has become the most attractive for reasons of economy and efficiency [2]. However, the enzymatic degradation of this type of polysaccharide was shown to be relatively inefficient in most cases because the targets (i.e. the glycosidic bonds) are often inaccessible to the active site of the appropriate enzymes [5]. Even so, it was found that some microorganisms (e.g. Clostridium thermocellum) have evolved and improved their catalytic capabilities. These organisms have a consortium of enzymes associated together in high molecular weight cellulolytic multi-subunit complexes, normally called cellulosomes, which exist at the extracellular level [6]. The enzymes are generally modular proteins that contain noncatalytic carbohydrate-binding modules (CBM), which increase the activity of the catalytic module [7–9].
The catalytic mechanisms of the enzymes present in the cellulosome are well understood [2], but the function and behaviour of the noncatalytic modules have not yet been fully elucidated. It has been proposed that the latter may play different roles in the cellulosome consortium, including promotion of the association of the enzyme with the substrate and guiding the substrate to the catalytic site of the enzyme. Moreover, it is believed that they serve as an ‘anchor’ that promotes an increase in the concentration of the enzyme on the surface of the substrate polymers, leading to a faster degradation of the polysaccharide [5,8].
Generally, CBMs can be grouped into several families taking into account ligand specificity (http://afmb.cnrs-mrs.fr/CAZY), the conservation of the protein fold, and structural and functional similarities. In this last case, the protein modules have been grouped into three subfamilies: ‘surface-binding’ CBMs (type A), ‘glycan-chain-binding’ CBMs (type B), and ‘small sugar-binding’ CBMs (type C) [5].
The focus of the present study is on the noncatalytic modules present in C. thermocellum. In this organism, bifunctional cellulosomes are found that contain two catalytic modules (GH5 and GH26), each one with a family 11 CBM (CtCBM11). This CtCBM11 is part of the type B subfamily and is characterized by the binding of a single polysaccharide chain [10]. It has been observed that this type of CBM can bind to a diversity of ligands and its specificity depends mostly on the aromatic residues present in the binding cleft. Direct hydrogen bonds also play a key role in defining the affinity and ligand specificity of type B glycan chain binders [5,8,11–13].
Additionally, it has been shown that the specificity of CtCBM11 is consistent with the type of substrates that are hydrolyzed by the associated catalytic domains [14].
To increase the current knowledge of the molecular interactions that define the ligand specificity in cellulosomal CBMs and the mechanism by which they recognize and select their substrates, we used X-ray crystallography, NMR and computational chemistry approaches to identify the molecular determinants of ligand specificity of CtCBM11. By means of NMR studies, we have analyzed various cello-oligosaccharides of different sizes. This approach enabled us to identify a range of cello-oligosaccharides with an affinity for the binding cleft. This information was complemented with docking and molecular mechanics studies that allowed localized structural information to be obtained on the pocket site of CtCBM11 and, in particular, the identification of the atoms of the ligand that are closer to the protein when the complex is formed. The ligands cellobiose, cellotetraose and cellohexaose were studied.
Results and Discussion
The crystal structure of CtCBM11, the binding cleft and its ligand specificity
In a previous study [14], isothermal titration calorimetry of wild-type CtCBM11 with oligosaccharides and polysaccharides was used to analyse and determine the binding affinities of CtCBM11 for substrates such as lichenan, β-glucan, cellohexaose, cellotetraose, cellopentaose and G4G4G3G. CtCBM11 exhibits a preference for β-1,3-1,4 glucans and a considerable affinity for β-1,4 linked glucose polymers. No affinity for β-1,3 glucans was observed. The same study also described the affinity gel electrophoresis results obtained from binding of wild-type CtCBM11 and its mutant derivatives [14]. Tyrosines 22, 53 and 129 appear to play a central role in carbohydrate recognition.
The 3D structure of CtCBM11 has been resolved to 1.98 Å resolution and is deposited in the protein databank under the accession code 1v0a. Its 3D structure has been fully characterized and a complete description of its fold has been performed, including a compilation of the residues that compose the binding cleft [14]. It folds as a β-jelly roll [8] of two six-stranded anti-parallel β-sheets that form a convex side (β-strands 1, 3, 4, 6, 9 and 12) and a concave side (β-strands 2, 5, 7, 8, 10 and 11). The concave side is decorated by the side chains of several residues, with a probable substrate recognition role. Most relevant is the presence of four tyrosine residues (numbers 22, 53, 129 and 152), as well as four aspartate, two arginine and two histidine residues. The cleft is also decorated by the side chains of three serine and two methionine residues. Due to symmetry constraints, the reported structure of 1v0a exhibits a binding cleft occupied by the C-terminus residues (an engineered six-histidine tail) of a symmetry-related molecule. The structure details of 1v0a suggest that residues Ser59, Asp99, Tyr53, Arg126, Tyr129 and Tyr152 might be involved in the binding mechanisms of possible ligands. However, the presence of the His-tag residues appears to have impaired crystal soaking and co-crystallization experiments with candidate ligands.
The hypothesis that the histidine tail was preventing ligand binding led us to design a new protein production strategy that would allow CtCBM11 to be obtained with an unoccupied binding cleft. The crystallization conditions of the newly purified protein are different from those of the tagged one (data not shown), and the new crystals belong to a different space group. The deposited structure of 1v0a belongs to the P2₁2₁2 space group whereas, in the absence of the six-histidine tail, CtCBM11 crystals grow in the P2₁ space group. However, crystal soaking and co-crystallization of CtCBM11 with candidate ligands were unsuccessful. Nevertheless, the engineered six-histidine tag appears to be important for crystallization because the crystals, in the absence of these extra residues, are comparatively more fragile and exhibit a lower diffraction quality (data not shown).
Confronted with these negative results from the crystallographic approach, we considered complementary NMR experiments and computational calculations.
NMR interaction studies
NMR spectroscopy can provide different kinds of information on protein–carbohydrate complexes in solution. In the present study, we focused our attention on those methods that allow us to obtain information on the bound carbohydrate.
The identification and mapping of the ligand epitopes (i.e. atoms of the ligand that are closer to the protein when the complex is formed) was performed using the saturation transfer difference (STD)-NMR technique [15,16]. The interaction between cellohexaose and CtCBM11 was used as a model to study the interaction between the soluble protein and cellulose because cellohexaose is the longest readily available cello-oligosaccharide that can be used to mimic the glucose chain of cellulose [17]. Line broadening effects on cellohexaose resonances upon addition of increasing amounts of CtCBM11 were also explored as an aid to identify those sugar resonances that are more affected upon binding to the protein.
Line broadening studies
The simple measure or estimation of linewidths may serve as a basis to deduce the occurrence of binding or recognition (a dynamic process). Because the relaxation properties of the oligosaccharides are affected upon protein binding due to their dependence on molecular motion, we studied the linebroadening effects (related to T2 relaxation) of cellohexaose resonances upon addition of CtCBM11.
In general, a progressive line broadening of all the cellohexaose protons was observed during titration with increasing amounts of protein, which can be understood as a result of the loss of local mobility caused by binding of the sugar to the protein. Chemical shifts are only slightly affected, suggesting fast equilibrium between free ligand and protein bound forms. The cellohexaose proton resonances are identified in Fig. 1I.
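As a rough illustration of why the lines broaden progressively in a fast-exchange titration, the following minimal sketch computes the bound fraction of ligand for simple 1:1 binding and the resulting population-weighted linewidth. The ligand and protein concentrations are those of the titration; the dissociation constant and the free and bound linewidths are assumed values chosen only for illustration (they are not measured in this study), and any additional exchange contribution to the linewidth is neglected.

```python
import math

def bound_fraction(ligand_total, protein_total, kd):
    """Fraction of ligand bound for 1:1 binding (all concentrations in the same unit)."""
    total = ligand_total + protein_total + kd
    # Smaller root of [PL]^2 - (Lt + Pt + KD)[PL] + Lt*Pt = 0
    complex_conc = (total - math.sqrt(total * total - 4.0 * ligand_total * protein_total)) / 2.0
    return complex_conc / ligand_total

def observed_linewidth(p_bound, lw_free, lw_bound):
    """Population-weighted linewidth in the fast-exchange limit (exchange term neglected)."""
    return (1.0 - p_bound) * lw_free + p_bound * lw_bound

LIGAND_MM = 0.787                                # cellohexaose concentration in the titration
PROTEIN_MM = [0.0, 0.031, 0.060, 0.116, 0.168]   # CtCBM11 titration points
KD_MM = 0.1         # assumed dissociation constant, illustration only
LW_FREE_HZ = 1.0    # assumed linewidth of the free ligand
LW_BOUND_HZ = 20.0  # assumed linewidth of the bound ligand

for protein in PROTEIN_MM:
    fb = bound_fraction(LIGAND_MM, protein, KD_MM)
    lw = observed_linewidth(fb, LW_FREE_HZ, LW_BOUND_HZ)
    print(f"{protein:.3f} mM protein: bound fraction {fb:.3f}, linewidth {lw:.2f} Hz")
```

Under these assumptions the predicted linewidth grows monotonically with protein concentration, which is the qualitative trend described above.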
A detailed comparison of the cellohexaose spectra showed that the most significant linebroadening was observed for protons 6 and 2, from glucose units b to e (Fig. 1III–V), which could indicate that the corresponding hydroxyl groups are involved in protein binding.
The results for the linebroadening measurements of protons H1a in the α and β configurations, αH1a and βH1a (Fig. 1II,V), showed that these protons are almost unaffected by protein binding, as would be expected for protons on the terminal end of the sugar located out of the binding cavity. However, a slight effect can be detected for βH1a compared to αH1a, which may indicate a higher affinity of the protein for the β form.
STD-NMR
To understand how CtCBM11 distinguishes and selects the different ligands, it is extremely important to identify which atoms of the ligand are closer to the protein when the complex is formed (epitope mapping). Identification and mapping of the epitopes can be achieved using the STD-NMR technique. The ability of the STD-NMR technique to detect the binding of low molecular weight compounds to large biomolecules has been demonstrated previously [16,18–20]. This technique offers several advantages over other methods in detecting binding activity. First, the binding component can usually be directly identified, even from a substance mixture, allowing it to be utilized in screening for ligands with dissociation constants KD ranging from approximately 10⁻³ to 10⁻⁸ M. Second, the building block of the ligand having the strongest contact with the protein shows the most intense NMR signals, enabling mapping of the ligand’s binding epitope. Finally, and most importantly for an NMR-based detection system, its high sensitivity allows the use of as little as 1 nmol of protein with a molecular mass > 10 kDa [16,18,21].
STD-NMR spectroscopy was used to analyze the binding of cellohexaose to CtCBM11. The STD-NMR spectrum of the hexasaccharide in a 20-fold excess over CtCBM11 is shown in Fig. 2 along with the cellohexaose reference spectrum. Comparison of both spectra clearly shows that the residues of the hexasaccharide are involved in the binding in different ways. From Fig. 2, it can be seen that the more intense signals are those corresponding to H2 and H6 from glucose units b to e, indicating that, when the complex is formed, these protons are those that are closer to the protein.
The fact that only one of the diastereotopic protons H6/H6′ from the methylene groups shows a relevant peak in the STD spectrum is indicative of the precise orientation of the methylene groups upon binding to the protein.
No STD signals could be detected for protons αH1a and βH1a, the anomeric protons of the reducing end of the oligosaccharide.
Fig. 1. Line broadening studies. (I) Spectral assignment of 1H NMR cellohexaose resonances. (II–IV) Series of spectral regions of a solution of cellohexaose 0.787 mM in D2O, corresponding to protons H1a, H6 and H2, respectively, acquired at 298 K as a function of peptide (CtCBM11) concentration (A, 0.0 mM; B, 0.031 mM; C, 0.060 mM; D, 0.116 mM; E, 0.168 mM). (V) Linewidths (Δν1/2) of selected cellohexaose protons, determined after spectral deconvolution, as a function of peptide (CtCBM11) concentration, for αH1a; βH1a; H2b-e; H6′b-e, βH6′a and αH5a; and H6b-e, βH6a and αH6a.
In the region between 3.63 and 3.52 p.p.m., despite the presence of STD signals, the individual contributions of protons αH4a, βH3a, H4b-e and H5b-e to the binding cannot be determined due to signal overlap. Nevertheless, information concerning the relative binding contribution can be obtained by comparing the intensity of the signals in this region with that of protons H2 and H6. By comparison of the STD intensity relative to the reference, a binding epitope map can be created. This is described by the STD factor (ASTD):
$$A_{\mathrm{STD}} = \frac{I_0 - I_{\mathrm{sat}}}{I_0} \times \text{ligand excess} \qquad (1)$$
where I0 is the signal intensity in the reference spectrum and Isat is the intensity of the same signal in the saturated spectrum.
The STD epitope map of cellohexaose binding to CtCBM11 (Fig. 3) was obtained by normalizing the largest value to 100%.
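The calculation behind Eqn (1) and the normalization of the epitope map can be sketched as follows. This is a minimal illustration, not the processing pipeline used in this work: the signal intensities are made-up numbers, and only the ligand excess is taken from the concentrations reported for Fig. 3 (18 µM CtCBM11 and 364 µM cellohexaose).

```python
def std_amplification(i0, isat, ligand_excess):
    """STD factor of Eqn (1): fractional saturation scaled by the ligand excess."""
    return (i0 - isat) / i0 * ligand_excess

def epitope_map(factors):
    """Normalize a dict of STD factors so that the largest value is 100%."""
    top = max(factors.values())
    return {proton: 100.0 * value / top for proton, value in factors.items()}

ligand_excess = 364.0 / 18.0  # approximately 20-fold, from the Fig. 3 concentrations

# Hypothetical (I0, Isat) pairs for illustration only
intensities = {"H2b-e": (1.00, 0.90), "H6b-e": (1.00, 0.95)}

factors = {p: std_amplification(i0, isat, ligand_excess)
           for p, (i0, isat) in intensities.items()}
for proton, percent in epitope_map(factors).items():
    print(f"{proton}: {percent:.1f}%")
```

Because every factor is multiplied by the same ligand excess, the normalized map depends only on the fractional saturation of each proton.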
From these data, it is clear that, regardless of the large number of protons in the region between 3.63 and 3.52 p.p.m. (16 protons), the relative intensity of their signal in the STD is smaller than that from protons H2 (four protons) and H6 (six protons). In this way, we can clearly distinguish between those protons very close to the protein (protons H2 and H6 from subunits b to e) and those other protons that, in spite of having a STD signal, are more distant from the protein.
Subunits a and f should not contribute significantly to the binding because the signals of their protons do not appear in the STD spectrum, meaning that these protons are more distant from the protein.
STD-NMR spectroscopy experiments were also performed with cellobiose and cellotetraose. With cellobiose, no STD signals could be detected, which is in accordance with a previous report demonstrating a weak binding of cellobiose to CtCBM11 [14] in the limits of STD detection. The STD results obtained for cellotetraose are very similar to those obtained for cellohexaose. Again, not all protons give a STD signal and the maximum intensity is found for protons H2 and H6 of the central glucose units and α-H1 of the reducing end.
These results indicate that the binding cleft of CtCBM11 interacts more strongly with the central glucose units, mainly through interactions with positions 2 and 6 of the sugar units, which is consistent with previous studies [14] and with the ligands accommodated by other type B CBMs. The fact that only one of the methylene protons at position 6 gives a STD signal, together with the presence of a STD signal from the anomeric proton, suggests a very well defined geometry upon binding.
Computational studies
As the X-ray structure of CtCBM11 with a bound substrate is not available, it is difficult to evaluate the importance and function of each residue at the CtCBM11 cleft in the binding process of carbohydrates. Consequently, computational studies were used to deduce this kind of information and complement the NMR studies. These studies can provide localized structural information about the binding pocket of CtCBM11 and identify which atoms of the ligand and of CtCBM11 interact preferentially. Calculations were performed with cellobiose, cellotetraose and cellohexaose carbohydrates. Moreover, for each ligand, the α and β isomers were considered.
Initial attempts to simulate the interaction between the carbohydrates and the CtCBM11 cleft resorted to standard docking methodologies. The ligands were built independently and the structure was optimized using the assisted model building and energy refinement (AMBER) force field.
Fig. 2. STD-NMR of cellohexaose with CtCBM11. (A) Reference 1H NMR cellohexaose spectrum. (B) STD spectra of the solution of cellohexaose (50 µM) with the protein (5 µM). Protons H6b-e and H2b-e show the more intense signals, indicating that these are the ones closer to the protein upon binding. In the region between 3.63 and 3.52 p.p.m. (*), the signal overlap does not allow determination of the individual contributions of protons αH4a, βH3a, H4b-e and H5b-e to the binding.
Fig. 3. Structure of cellohexaose. Relative degrees of saturation of the individual protons normalized to that of protons H2b-e: H2b-e, 100%; H6b-e, 48.4% and 36.6% (two non-equivalent protons), determined from 1D STD NMR spectra at a 20-fold ligand excess. The concentrations of CtCBM11 and cellohexaose were 18 µM and 364 µM, respectively.
The results obtained from these simulations were, however, disappointing because the conformations of some residues near the binding pocket (i.e. Tyr22, Tyr53, Tyr129 and Tyr152) give rise to a steric obstacle that precluded the efficient binding of the ligands. The importance of these residues in the binding process had already been noted in several previous studies [13,14], in agreement with our own observations. To overcome this obstacle, we used madamm software [22], which allows the introduction of a certain degree of protein flexibility in standard docking processes.
The process tries to mimic a conformational binding model, in which the receptor is assumed to pre-exist in a number of energetically similar conformations. Accordingly, the ligand binds preferentially to one of these conformers, displacing the equilibrium towards this particular conformer and, in this way, increasing its proportion relative to the total protein population. In the present study, flexibility was applied to Tyr22, Tyr53, Tyr129 and Tyr152. At the end of this process, a group of complexes is obtained, with optimized affinities between CtCBM11 and each studied ligand.
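The population shift implied by this binding model can be illustrated with a small thermodynamic sketch. It is not the madamm algorithm: it only assumes that each pre-existing conformer has a Boltzmann weight and that binding a ligand multiplies this weight by (1 + [L]/Kd). The conformer energies, dissociation constants and ligand concentration below are hypothetical values chosen for illustration.

```python
import math

RT_KCAL = 1.987204e-3 * 298.0  # RT in kcal/mol at 298 K

def populations(energies, ligand_conc=0.0, kds=None):
    """Populations of receptor conformers, optionally in the presence of ligand.

    Each conformer has the statistical weight exp(-E/RT) * (1 + [L]/Kd).
    """
    weights = []
    for index, energy in enumerate(energies):
        weight = math.exp(-energy / RT_KCAL)
        if kds is not None:
            weight *= 1.0 + ligand_conc / kds[index]
        weights.append(weight)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: three energetically similar conformers (kcal/mol)
energies = [0.0, 0.2, 0.3]
kds = [1e-2, 1e-5, 1e-3]  # hypothetical per-conformer dissociation constants (M)

free = populations(energies)
bound = populations(energies, ligand_conc=1e-3, kds=kds)
print("without ligand:", [round(p, 3) for p in free])
print("with ligand:   ", [round(p, 3) for p in bound])
```

With these assumed numbers, the conformer with the tightest binding goes from a minority of the population to the dominant species once the ligand is present, which is the equilibrium displacement described above.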
To refine these results, molecular dynamics simulations were performed on the best solution. This process was repeated for all the studied ligands, including the α and β isomers.
The simulations showed that all ligands have common binding poses at the CtCBM11 cavity, near the aromatic amino acids that were made flexible. Furthermore, the ligands bind in an equidistant mode at the CtCBM11 cleft, which suggests an apparent symmetry at the binding cavity. Most of the interaction between the CtCBM11 cleft and each carbohydrate occurs through hydrogen bonds, namely with the equatorial OH groups of the glucose monomers, and also by several van der Waals contacts that are promoted by the aromatic side chains present at the interface, namely with Tyr22, Tyr53, Tyr129 and Tyr152. The only exception was cellobiose, which shows no specificity, and different binding poses at the CtCBM11 cleft could be observed (Fig. 4). This is in agreement with the experimental work, where no specific interaction could be detected with this ligand.
The orientation of the CH2OH groups in all docked solutions did not change significantly, and they commonly appeared in alternate positions in the carbohydrate oligomer chain (above and below the plane of the sugar rings) even if the initial calculations were performed on a conformation in which all these groups were on the same plane.
The docking results obtained with madamm also revealed that there are no substantial differences between the α and β conformations of carbohydrates. However, we found that, in some carbohydrates, the C1-terminal of the α conformation is turned towards the left hand side of the binding cavity, whereas the β conformation is in the opposite direction. Considering that the monomers constituting the ligands are equal among themselves, this change in orientation is of no great importance for the establishment of the binding interactions between the ligand and CtCBM11, and this kind of behaviour should occur commonly in nature.
From the studied carbohydrates, cellotetraose was the one that fitted perfectly inside the binding cleft of CtCBM11. In the case of β-cellotetraose, the hydrogen bonds were established with the amino acids Glu25, Asp99, Arg126, Asp128, Asp146 and Ser147 (Fig. 5), which closely match the amino acids that interact with the α isomer, differing only in the Glu25 residue. In the case of the β-cellohexaose ligand, the carbohydrate oligomer interacts mainly with the amino acids Asp51, Trp54, Thr56, Gly96, Gly98, Asp99, Arg126, Asp128 and Asp146. In the case of the α-isomer, some hydrogen bonds with amino acids Tyr22, Thr50 and Ala153 can also be observed, but not with Trp54, Gly96 and Gly98.
Table 1 summarizes the most important interactions that occur between all the analyzed carbohydrate ligands, including the α and β isomers, and the neighbouring amino acids of the CtCBM11 cleft. These average values were obtained after 2 ns of molecular dynamics simulations, with the best solution obtained with madamm as reference.
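Average distances of this kind are typically obtained by measuring the same atom pair in every saved frame of the trajectory and taking the mean. The sketch below shows the idea on three made-up frames; the atom labels and coordinates are hypothetical and do not come from the simulations reported here, where a real trajectory would contain thousands of frames.

```python
import math

def mean_distance(frames, atom_a, atom_b):
    """Average distance (in Å) between two named atoms over a list of frames."""
    distances = [math.dist(frame[atom_a], frame[atom_b]) for frame in frames]
    return sum(distances) / len(distances)

# Hypothetical frames: each maps an atom label to (x, y, z) coordinates in Å
frames = [
    {"Asp99:OD1": (0.0, 0.0, 0.0), "GlcB:HO6": (2.2, 0.0, 0.0)},
    {"Asp99:OD1": (0.0, 0.0, 0.0), "GlcB:HO6": (0.0, 2.3, 0.0)},
    {"Asp99:OD1": (0.0, 0.0, 0.0), "GlcB:HO6": (0.0, 0.0, 2.4)},
]

print(f"mean distance: {mean_distance(frames, 'Asp99:OD1', 'GlcB:HO6'):.2f} Å")
```

Averaging over frames rather than reporting a single snapshot smooths out thermal fluctuations, which is why a contact can appear in the table with a mean distance close to, but not exactly at, an ideal hydrogen-bond length.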
Comparing all the simulated complexes, it is clear that there is a common binding site at the CtCBM11 cleft and that all the studied polysaccharides make several hydrogen bonds with the Asp99, Arg126, Asp128 and Asp146 amino acids and, in the case of the larger ligands, with Asp51 as well. Most of the hydrogen bonds occur via the hydroxyl groups associated with the C2 and C6 carbon atoms of each glucose ring, which is in agreement with the results obtained experimentally by NMR.
Fig. 4. Representation of the conformations of the 3D structure of binding of the different ligands obtained by docking. (A) α- (red) and β-cellobiose (green); (B) α- (red) and β-cellotetraose (green); (C) α- (red) and β-cellohexaose (green). The picture was constructed using the programme VMD 1.8.3 [26].
We also found that the central glucose units interact closely with several tyrosine residues. The function of these residues appears to be more related to the guiding and packing of the carbohydrate ligands at the CtCBM11 cleft, leading to the overall conformation of the bound carbohydrate chain. The same type of interaction also appears to control the overall carbohydrate conformation in the X-ray structures of CBM4 and CBM17 complexed with cellopentaose and cellohexaose, respectively [13,23]. The involvement of the tyrosine residues in the stabilization of the complex cannot be excluded because recent theoretical work, as well as NMR, has demonstrated the existence of an important dispersive component between the hydrogens of the sugar and the aromatic ring of the tyrosine residues, which gives rise to three so-called nonconventional hydrogen bonds that help stabilize the complex [24,25]. The initial conformations adopted by these residues were responsible for the unsatisfactory results of the initial docking trials, and only after exploring the configurational space of these residues, through a multi-stage docking with an automated molecular modelling protocol (madamm software), were more reliable results obtained that are in agreement with the experimental data. Previous site-directed mutagenesis experiments have shown that mutating these residues to alanine causes a significant drop in the activity of the associated enzymes. Considering these observations, we hypothesize that the main function of these residues is to guide the polysaccharide chain and direct it to a specific polar region in the protein populated with several aspartate residues. This would disconnect the chain from other attached polysaccharide chains, such as crystalline cellulose.
We also compared the computational results with another type B CBM that was crystallized in complex with a pentasaccharide (Fig. 6).
Fig. 5. (A,B) Representation of the most important interactions between the β-cellotetraose and β-cellohexaose with the CtCBM11 binding cleft. The distances correspond to the average of the last 2 ns of the molecular dynamics simulations (for further details, see Table 1).
Many similarities were found, both in the binding region that comprises a flat platform of the CBM and in the type of interactions between the carbohydrates and CtCBM11. Regardless of the CBM, we have generally found that the central carbohydrate interacts with aromatic residues and several charged amino acids that are located at the border of the CBM cleft. In the particular case of CtCBM11, close interactions with several tyrosines (Tyr22, Tyr53, Tyr129 and Tyr152), one arginine (Arg126) and several aspartate residues (Asp99, Asp128 and Asp146) were observed that closely resemble what we found in CfCBM4 (Fig. 6). The interaction leads to a slight alteration of the normal chain dihedral angles of the fifth glucose ring that is reflected in the overall conformation of the bound oligosaccharide. We propose that this common CH–π stacking is responsible for the reorientation of the carbohydrate chain and for directing it to the regions that are populated with aspartate residues. Accordingly, we propose that these residues have a preponderant role in the reorientation of the carbohydrate chain.
Table 1. Summary of the distances involved in the main interactions between the carbohydrates and the neighbouring amino acids of the CBM cleft.
Columns: Residue; α-Cellotetraose interaction, d (Å); β-Cellotetraose interaction, d (Å); α-Cellohexaose interaction, d (Å); β-Cellohexaose interaction, d (Å)
COO−···OH (C2) Glc d 2.3
COO−···OH (C3) Glc e 1.9; COO−···OH (C6) Glc f 2.4
Asp99 COO−···OH (C6) Glc b 3.0; COO−···OH (C6) Glc b 2.3; COO−···OH (C6) Glc e 2.3; COO−···OH (C2) Glc d 2.4
COO−···H (C3) Glc a 2.3
COO−···OH (C3) Glc a 2.2
Arg126 NH2···OH (C2) Glc c 1.9; NH2···OH (C2) Glc c 1.9; NH2···OH (C3) Glc c 1.9; NH2···H (C2) Glc d 3.0; NH2···OH (C2) Glc d 2.3
Asp128 COO−···OH (C6) Glc d 1.9; COO−···OH (C6) Glc d 2.9; COO−···H (C1) Glc c 2.9; COO−···OH (C6) Glc e 2.3
COO−···H (C5) Glc c 2.9
Asp146 COO−···OH (C1) Glc a 2.7; COO−···OH (C3) Glc a 2.7; COO−···OH (C2) Glc f 2.4; COO−···OH (C2) Glc a 2.6
COO−···OH (C2) Glc a 2.5; COO−···OH (2) Glc a 2.1; COO−···OH (C3) Glc f 2.1
NH···OH (C3) Glc a 2.7
Fig. 6. Schematic representation of the main interaction between (A) the pentasaccharide with the CfCBM4 (protein databank entry: 1GU3) [23] and (B) the hexasaccharide with CtCBM11. Interactions involving neighbouring tyrosine residues are shown in (A1) and (B1). Residues that establish several hydrogen bonds with the equatorial hydroxyl groups of the glucose units are shown in (A2) and (B2).
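A change in chain dihedral angles such as the one discussed above can be quantified from the coordinates of the four atoms that define each torsion. The following minimal sketch computes a signed dihedral angle from four points; the coordinates are arbitrary test points rather than atoms from the CtCBM11 complexes, and the choice of which four atoms define a glycosidic torsion is left to the reader.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Arbitrary test points: the fourth point is rotated a quarter turn about the z axis
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Comparing such angles between the free and bound forms of the oligosaccharide is one way to express how far the chain departs from its usual conformation.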
Conclusions
X-ray crystallography, NMR and computational chemistry have been shown to be complementary methodologies. These techniques were combined to derive structural information on the binding interaction of cello-oligosaccharides and CtCBM11 at the molecular and atomic levels because it is still unclear whether polysaccharides adopt their normal conformation when bound to CBMs or whether these proteins cause a change in the structure of the sugar chain upon binding.
In the present study, it was not possible to use cello-oligosaccharides longer than cellohexaose due to their limited solubility in aqueous buffers [17]. To overcome this limitation, we used cellobiose, cellotetraose and cellohexaose as model compounds.
Both the theoretical and experimental results suggest that all ligands interact mainly by hydrogen bonds, with a central area of CtCBM11 containing the amino acids Asp99, Arg126, Asp128 and Asp146 and, in the case of the larger ligands, with Asp51. It is important to emphasize that most of the hydrogen bonds occur via the hydroxyl groups associated with the C2 and C6 carbon atoms of each ring of glucose. This model of binding is supported by the STD and linebroadening NMR studies performed with cellohexaose, which have shown that the protons of the central glucose units are closer to the protein than those from both ends. Our theoretical and experimental results are further supported by 3D structures of CBM–cellohexaose complexes, namely CBDCBHI, CBDCBHII, CBDEGI [17], PeCBM29-2 [27,28] and CfCBM2a [29].
We also observed that there are key aromatic residues at the CtCBM11 interface (i.e. Tyr22, Tyr53, Tyr129 and Tyr152) that appear to have a preponderant role in guiding and packing the carbohydrate chain and therefore in the binding process. The initial conformations of these residues were responsible for the negative results of the initial docking calculations, and only after exploring the configurational space of these residues, through a multi-stage docking with an automated molecular modelling protocol (madamm software), were more reliable results obtained that are in agreement with the experimental data. No significant differences in the binding conformations were detected regarding α and β isomers.
Moreover, we propose that these residues have a preponderant role in the reorientation of the carbohydrate chain, directing it to a specific polar region in the protein that is populated with aspartate residues.
From the overall evaluation of the results obtained in the present study, we can infer a general mechanism for the interaction between CtCBM11 and cellulose. A minimum number of glucose units in the polymer chain is necessary for stable binding (four in this case). Another feature is the strong interaction of some residues in the putative binding site with the hydroxyl groups at positions 2 and 6 of the central glucose units of the ligand. The guiding and packing of the carbohydrates is achieved through the interaction of the oligosaccharide with tyrosine residues that direct it towards polar amino acids responsible for zipping the oligosaccharide at the CBM cleft. As CtCBM11 is topologically similar and structurally homologous to CBMs of families 4, 6, 15, 17, 22, 27 and 29 [8], we can infer that the binding mechanism of these CBMs to their substrates should be very similar to that of CtCBM11.
Because these residues are conserved in type B CBMs, a multidisciplinary NMR, molecular modelling and X-ray crystallography study is currently in progress to determine their role in the global mechanism of interaction for several CBMs.
Experimental procedures
Sources of sugars
Cellobiose, cellotetraose and cellohexaose were obtained from Seikagaku Corporation (Tokyo, Japan) and were used without further purification.
Protein expression and purification
To express CtCBM11 in Escherichia coli, the region of the Lic26A-Cel5A gene (lic26A-cel5A) encoding the internal family 11 CBM was amplified from C. thermocellum as described previously [14]. The protein was purified by metal ion affinity chromatography. Fractions containing the purified protein were buffer exchanged into water using PD-10 Sephadex G-25M gel filtration columns (Amersham Pharmacia Biosciences, Piscataway, NJ, USA). The purified protein was then concentrated with Amicon 10 kDa molecular-mass centrifugal membranes (Millipore, Billerica, MA, USA).
NMR spectroscopy
All NMR experiments were performed with a Bruker ARX
400, a Bruker Avance 600 or a Bruker
Avance 400 spectrometer (Bruker, Wissembourg, France)
and conducted at 300.4 K. All spectra were processed with
the software TopSpin 2.0 (Bruker).
The ¹H spectrum of cellohexaose was acquired at 600 MHz
with 16 scans and a spectral width of 6009.6 Hz, centered
at 2820.93 Hz. The solution of the sugar was prepared in
90% H2O and 10% (v/v) D2O.
The interaction between CtCBM11 and cellohexaose was
studied by STD-NMR (the pulse sequence from the Bruker
library was used) and by broadening of the resonances of
the ¹H spectrum of the sugar [16]. The 1D STD-NMR was
performed using a solution of 95 µM cellohexaose and
5 µM CtCBM11 in D2O. The spectra were recorded at
600 MHz with 8192 scans and a spectral window of
8980 Hz centered at 2824.35 Hz. Selective saturation of
protein resonances at 0.6 p.p.m. (12 p.p.m. for reference
spectra) was performed using a series of 40 Gaussian
shaped pulses (50 ms, 1 ms delay between pulses) for a
total saturation time of 2.0 s. Subtraction of saturated
spectra from reference spectra was performed by phase cycling.
Measurement of enhancement intensities was performed by
direct comparison of STD-NMR. The broadening studies
were performed at 400 MHz by titration of a solution of
0.79 mM cellohexaose prepared in D2O with CtCBM11. A
first spectrum of the pure sugar was acquired. Then the
peptide was added in 5 µL and 10 µL volumes to obtain
the titration plots. The peptide concentration in the
cellohexaose solution at the end of the titration was 0.23 mM.
All the spectra were acquired with 128 scans and a spectral
window of 1991.6 Hz, centered at 1881.0 Hz. The spectra
were deconvoluted into individual Lorentzian lines to
determine the full linewidth at half-height.
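As a quick consistency check on the STD-NMR parameters quoted above, the following minimal Python sketch recomputes the length of the saturation train and the ligand excess. It assumes that one 1 ms delay follows each of the 40 pulses; the function and variable names are illustrative only and do not come from the Bruker pulse program.

```python
# Parameters quoted in the text above.
N_PULSES = 40        # Gaussian shaped pulses in the saturation train
PULSE_MS = 50.0      # duration of each pulse (ms)
DELAY_MS = 1.0       # delay between pulses (ms)
LIGAND_UM = 95.0     # cellohexaose concentration (micromolar)
PROTEIN_UM = 5.0     # CtCBM11 concentration (micromolar)


def saturation_time_s(n_pulses, pulse_ms, delay_ms):
    """Length of the pulse train in seconds (assumes one delay per pulse)."""
    return n_pulses * (pulse_ms + delay_ms) / 1000.0


def ligand_excess(ligand_um, protein_um):
    """Molar ratio of ligand to protein in the STD sample."""
    return ligand_um / protein_um


total_s = saturation_time_s(N_PULSES, PULSE_MS, DELAY_MS)
excess = ligand_excess(LIGAND_UM, PROTEIN_UM)
print(f"saturation time: {total_s:.2f} s")
print(f"ligand:protein ratio: {excess:.0f}:1")
```

Under this assumption the train lasts 2.04 s, in line with the reported total saturation time of 2.0 s, and the sugar is present in 19-fold molar excess over the protein.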
The interaction between calcium and cellohexaose was
studied by titration of a solution of 8 mM cellohexaose
prepared in D2O with 0.16 M CaCl2. A first ¹H-NMR
spectrum was acquired on the sugar alone. Five further spectra
were acquired with 0.5, 1.0, 2.0, 3.0 and 6.0 equivalents of
CaCl2, respectively. All the spectra were acquired at
400 MHz, with 128 scans and a spectral width of
6636.36 Hz, centered at 1879.78 Hz.
Molecular modelling
The structure of CtCBM11 deposited in the Protein Data
Bank (entry 1v0a) [14] was used as the starting point for all the
computational studies. All waters and sulfate ions (SO₄²⁻)
were deleted and only the protein atoms were kept.
Furthermore, all selenium atoms were substituted by sulfur
atoms.
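The clean-up described above can be illustrated with a short Python sketch that works on fixed-column PDB coordinate records. It is not the procedure actually used by the authors, whose tooling for this step is not described; it assumes that the selenium atoms belong to selenomethionine residues (MSE), which are rewritten as methionine (MET) with the SE atom renamed SD. The helper `pdb_line` and the example records are hypothetical and serve only to make the block self-contained.

```python
def pdb_line(record, serial, atom, res, chain, seq, element):
    """Build an 80-column PDB coordinate record with dummy coordinates."""
    return (
        f"{record:<6}{serial:>5} {atom:<4} {res:>3} {chain}{seq:>4}    "
        + f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{0.0:6.2f}"
        + " " * 10
        + f"{element:>2}  "
    )


def clean_pdb_lines(lines):
    """Keep protein atoms only and convert MSE residues to MET.

    Waters, sulfates and any other HETATM groups are dropped, as are
    non-coordinate records (TER, END, ...).
    """
    cleaned = []
    for raw in lines:
        line = raw.rstrip("\n").ljust(80)
        record = line[0:6]
        if record not in ("ATOM  ", "HETATM"):
            continue
        res_name = line[17:20]
        if res_name == "MSE":
            atom_name, element = line[12:16], line[76:78]
            if atom_name.strip() == "SE":
                # Selenium becomes the methionine sulfur.
                atom_name, element = " SD ", " S"
            line = ("ATOM  " + line[6:12] + atom_name + line[16:17]
                    + "MET" + line[20:76] + element + line[78:])
        elif record == "HETATM":
            continue  # HOH, SO4 and other non-protein groups
        cleaned.append(line)
    return cleaned


# Hypothetical records: one protein atom, one selenium, one water, one sulfate.
example = [
    pdb_line("ATOM", 1, " CA ", "GLY", "A", 80, "C"),
    pdb_line("HETATM", 2, "SE  ", "MSE", "A", 90, "SE"),
    pdb_line("HETATM", 3, " O  ", "HOH", "A", 201, "O"),
    pdb_line("HETATM", 4, " S  ", "SO4", "A", 301, "S"),
]
cleaned = clean_pdb_lines(example)
for line in cleaned:
    print(line)
```

Only the glycine atom and the converted methionine sulfur survive in this example; the residue numbers and chain identifier are placeholders, not values taken from the 1v0a entry.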
The protein is composed of 173 amino acids but the
crystallographic file lacks three amino acids in a loop between
Val78 and Ala82. These residues were modelled with the help of the software Insight II [30] to generate the correct sequence (i.e. Val78, Asp79, Gly80, Ser81 and Ala82). Once the structure was ready, hydrogen atoms were added using Insight II, considering all residues in their physiological protonation state.
To evaluate the selectivity of CtCBM11 towards saccharides, several ligands were designed, namely cellobiose, cellotetraose and cellohexaose [14]. As glucose can exist in two forms, α-glucose and β-glucose, and as these monomers can interconvert very easily at the considered temperature (333 K), each ligand was modelled in both forms.
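The resulting ligand set can be enumerated with a few lines of Python. This is only a bookkeeping sketch: the number of glucose units per ligand follows from the sugar names, whereas the tuple layout and the labels "alpha" and "beta" are illustrative choices.

```python
# Number of glucose units in each ligand named in the text.
LIGANDS = {"cellobiose": 2, "cellotetraose": 4, "cellohexaose": 6}
ANOMERS = ("alpha", "beta")


def build_ligand_set():
    """Return one (anomer, name, glucose_units) entry per modelled ligand."""
    return [
        (anomer, name, units)
        for name, units in LIGANDS.items()
        for anomer in ANOMERS
    ]


ligand_set = build_ligand_set()
for anomer, name, units in ligand_set:
    print(f"{anomer}-{name}: {units} glucose units")
```

Three sugars in two forms give six structures to be docked.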
Molecular docking
The six modelled substrates were initially docked in the structure of the unbound CtCBM11, and the best docking solutions were taken as starting structures for the subsequent molecular dynamics simulations. The docking procedure resorted to GOLD [31], a program that calculates the docking modes of small molecules into protein binding sites. The program is based on a genetic algorithm that is used to place different ligand conformations in the protein binding site, recognized by a fitting points strategy. Two scoring functions are available a posteriori to rank the obtained solutions (i.e. GoldScore and ChemScore) [32]. In our calculations, we used GoldScore as the scoring function, which has four terms:
$$\text{GoldScore fitness} = S_{\mathrm{hb\_ext}} + S_{\mathrm{vdw\_ext}} + S_{\mathrm{hb\_int}} + S_{\mathrm{vdw\_int}} \qquad (2)$$
in which $S_{\mathrm{hb\_ext}}$ is the protein–ligand hydrogen bond score and $S_{\mathrm{vdw\_ext}}$ is the van der Waals score. $S_{\mathrm{hb\_int}}$ is the contribution due to intramolecular hydrogen bonds and $S_{\mathrm{vdw\_int}}$ is the sum of the internal torsion strain energy and internal van der Waals terms in the ligand. In general, the GoldScore function appears to give better binding energy predictions than the ChemScore function, which justifies our choice [5].
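To make the use of the four-term fitness concrete, the sketch below sums the terms exactly as written in the equation above and ranks docking poses by the result, with a higher fitness taken as better. It does not call the GOLD program, which may weight the individual terms differently internally; the class name, the pose labels and all numerical values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoseTerms:
    """The four GoldScore terms of one docking pose (arbitrary units)."""
    s_hb_ext: float   # protein-ligand hydrogen bond score
    s_vdw_ext: float  # protein-ligand van der Waals score
    s_hb_int: float   # intramolecular hydrogen bonds in the ligand
    s_vdw_int: float  # internal torsion strain plus internal van der Waals


def goldscore_fitness(terms):
    """Unweighted sum of the four terms, following the equation in the text."""
    return terms.s_hb_ext + terms.s_vdw_ext + terms.s_hb_int + terms.s_vdw_int


def rank_poses(poses):
    """Return pose names sorted from highest to lowest fitness."""
    return sorted(poses, key=lambda name: goldscore_fitness(poses[name]),
                  reverse=True)


# Hypothetical values, chosen only to illustrate the ranking.
poses = {
    "pose_a": PoseTerms(10.0, 30.0, 0.0, -5.0),
    "pose_b": PoseTerms(12.5, 20.0, 0.0, -2.5),
}
ranking = rank_poses(poses)
print(ranking)
```

In this toy example pose_a is ranked first because its larger van der Waals contribution outweighs the stronger hydrogen bond score of pose_b.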
Molecular dynamics
All geometry optimizations and molecular dynamics were performed with the parameterization adopted in AMBER 8 [33], using the general AMBER force field for the protein and the Glycam-04 parameters for the carbohydrates [34–36].
In all simulations, an explicit solvation model was used with a truncated octahedral box of 12 Å with pre-equilibrated TIP3P water molecules using periodic boundaries [37].
In the initial stage, the structure was minimized in two stages. In the first stage, we kept the protein fixed and only minimized the position of the water molecules and ions. In