CRYSTAL STRUCTURE OF ARABIDOPSIS THALIANA CYCLOPHILIN 38
(ATCYP38)
DILEEP VASUDEVAN (M Fisheries Sc.)
A THESIS SUBMITTED FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
DEPARTMENT OF BIOLOGICAL SCIENCES NATIONAL UNIVERSITY OF SINGAPORE
2007
ACKNOWLEDGEMENTS
This thesis is the most significant scientific accomplishment in my life so far, and it is my pleasure to thank those who made it possible and who supported me in one way or another.
I am grateful to my PhD supervisor, Dr Kunchithapadam Swaminathan. With his enthusiasm, inspiration and great efforts to explain things clearly and simply, he helped me understand what crystallography is. Without his patience and help, the structure of this thesis would not have been possible.
The support of our collaborator Prof Sheng Luan and his group at University of California, Berkeley, USA is deeply acknowledged. I also thank Dr Leslie Haire of National Institute of Medical Research, London, UK for her suggestion to use vapor batch plates for my crystallization experiments. I would like to thank the beamline staff at National Synchrotron Light Source, USA for all their help during my data collection.
I am indebted to my peers at NUS for providing a stimulating and fun environment in which I could learn and grow. I am especially grateful to my DBS labmates Asha, Gayathri and Tien Chye, as well as my friends Dileep, Thakur and Jobi.
I convey my heartfelt thanks to Dr Curt Davey of School of Biological Sciences (SBS), Nanyang Technological University (NTU) for accommodating me in his lab as a Research Associate even before the completion of my PhD and letting me finish my PhD thesis alongside my job. I also thank my friends at NTU.
Words are not enough to thank my dearest wife Anuradha for all her support both in the lab and back home. Had she not been there, my PhD project would not have reached this point. I wish to dedicate this thesis to her. Also, I offer my special thanks to my mother and our entire family for providing a loving environment.
The NUS research scholarship, which supported my research and stay in Singapore, is gratefully acknowledged.
PUBLICATION
Parts of this thesis have already been or will be published in due course:
Vasudevan D, Gopalan G, He Z, Luan S, Swaminathan K. 2005. Expression, purification, crystallization and preliminary X-ray diffraction analysis of Arabidopsis thaliana cyclophilin 38 (AtCyp38). Acta Cryst. F 61: 1087-1089.
Vasudevan D, Radhakrishnan A, Luan S, Swaminathan K. 2007. Crystal structure of an intermediate form of Arabidopsis thaliana cyclophilin 38 (AtCyP38) (to be submitted).
TABLE OF CONTENTS
Acknowledgements
Publication
Table of contents
Summary
List of abbreviations
List of figures
List of tables
CHAPTER 1 MACROMOLECULAR X-RAY CRYSTALLOGRAPHY
1.1 PROTEIN STRUCTURE DETERMINATION
1.2 PROTEIN CRYSTALLIZATION
1.3 BASIC CONCEPTS IN PROTEIN CRYSTALLOGRAPHY
1.3.1 Unit-cell, lattices and Miller indices
1.3.2 Symmetry, point groups and space groups
1.3.3 Crystals and X-rays
1.3.4 X-ray diffraction
1.3.5 Bragg’s law
1.3.6 Reciprocal space
1.3.7 The Ewald sphere
1.3.8 Fourier transform and structure factor
1.3.9 Phase problem
1.4 GEOMETRIC DATA COLLECTION
1.4.1 Data reduction
1.5 STRUCTURE DETERMINATION
1.5.1 Phasing methods
1.5.1.1 The Multi-wavelength anomalous dispersion (MAD) method
1.5.1.2 Principle of anomalous scattering
1.5.2 Phase improvement
1.5.3 Model building
1.5.4 Refinement
1.5.5 Validation
1.5.6 Presentation
CHAPTER 2 BIOLOGICAL BACKGROUND
2.1 ORGAN TRANSPLANTATION & IMMUNOSUPPRESSIVE DRUGS
2.1.1 Cyclosporin
2.1.2 FK506
2.1.3 Rapamycin
2.2 IMMUNOPHILINS
2.2.1 PPIase activity of immunophilins and protein folding
2.2.2 Immunosuppressive activity of immunophilins
2.3 DIVERSITY OF IMMUNOPHILINS
2.3.1 Archaeal cyclophilins
2.3.2 Bacterial cyclophilins
2.3.3 Fungal cyclophilins
2.3.4 Animal cyclophilins
2.3.5 Plant cyclophilins
2.3.5.1 Domain organization & phylogenetic analysis in Arabidopsis cyclophilins
2.3.5.2 Evolutionary dynamics of the lumenal cyclophilins
2.4.2.1 Structural features of cyclosporin A binding
2.4.2.2 Structural features of prolyl-peptide binding
CHAPTER 3 MATERIALS AND METHODS
3.1 EXPRESSION AND PURIFICATION OF RECOMBINANT ATCYP38
3.2 EXPRESSION AND PURIFICATION OF SELENOMETHIONYLATED
3.3 CRYSTALLIZATION AND TESTING OF CRYSTAL QUALITY
3.3.3 Cryo-protection of the crystals and testing of diffraction quality
3.5 DATA ANALYSIS AND STRUCTURE DETERMINATION
CHAPTER 4 RESULTS AND DISCUSSION
4.1 EXPRESSION AND PURIFICATION OF NATIVE WILD TYPE
4.2 EXPRESSION AND PURIFICATION OF SELENOMETHIONYLATED
4.3 CRYSTALLIZATION, DATA COLLECTION AND ANALYSIS FOR
4.4 SELECTIVE MUTATION OF ATCYP38 RESIDUES TO AID
4.5.1 Expression, purification and crystallization
4.7 THREE-DIMENSIONAL STRUCTURE OF ATCYP38 (83-437)
4.7.1 N-terminal domain
4.10 INSIGHTS FROM STUDIES ON SPINACH TLP40 and PSBQ
SUMMARY
Cyclophilin 38 (CyP38) is one of the highly divergent multi-domain cyclophilins from Arabidopsis thaliana. A recombinant form of intermediate AtCyP38 (residues 83-437) was expressed in Escherichia coli and purified to homogeneity. The protein was crystallized in the C222₁ space group using the vapor-batch technique with PEG 6000 and t-butanol as precipitants. Crystals of recombinant AtCyP38 diffracted X-rays to 2.5 Å resolution at the NSLS synchrotron. The seleno-methionine derivative of the AtCyP38 protein was over-expressed, purified and crystallized in the same space group, and Multi-wavelength Anomalous Dispersion (MAD) data were collected to 3.5 Å at the same synchrotron. Structure determination was attempted using the MAD method. However, the low resolution of the data did not allow complete model building. A strategy of selectively mutating a few leucine residues to methionine was therefore chosen. A mutant AtCyP38 (83-437), with five selected L→M mutations, was expressed and purified. The seleno-methionine derivative of this mutant AtCyP38 gave crystals in the same space group, which diffracted up to 2.46 Å. With the new MAD data, the structure was fully solved and refined.
The structure of the intermediate AtCyP38 reveals that the protein has two distinct domains: an N-terminal helical bundle and a C-terminal cyclophilin domain, which are connected by an acidic loop. The cyclophilin domain, even though similar in fold to all known cyclophilin structures, varies considerably in its loops and active site features. A part of the N-terminus enters the cyclophilin domain and forms part of the characteristic cyclophilin β-barrel. The N-terminal domain of the protein is unique in that it contains several typical elements for protein-protein interaction, which may be of great functional significance to the protein in its mature form. In the mature protein, the region between Val93 and Asp114 seems to be important in interacting with a trans-membrane phosphatase. The structure of this intermediate form is probably the first plant multi-domain cyclophilin structure, and it explains how the functional domains that are needed in a mature protein are protected from being active until the protein is transported to its specific functional location and becomes mature and active.
LIST OF ABBREVIATIONS
ADSC Area Detector Systems Corporation
BnP Buffalo and Pittsburgh Software
CaM calcium-loaded calmodulin
CCD charge coupled device
HEPES N-2-hydroxyethylpiperazine-N'-2-ethane sulfonic acid
HIV human immuno-deficiency virus
hnRNP heterogeneous nuclear ribonucleoprotein particle
HSP heat shock protein
IPTG isopropyl-beta-D-thiogalactopyranoside
MAD multi-wavelength anomalous dispersion
MALDI-TOF matrix assisted laser desorption/ionization – time of flight
MIR multiple isomorphous replacement
MPTP mitochondrial permeability transition pore
MWPC multi-wire proportional chamber
NMR nuclear magnetic resonance
PAGE polyacrylamide gel electrophoresis
PDB Protein Data Bank
PEG polyethylene glycol
PMSF phenylmethylsulfonyl fluoride
pNA para-nitroaniline
Ppi Peptidyl-prolyl isomerase
PPIase peptidyl-prolyl cis-trans isomerase
PS II photosystem II
RRM RNA recognition motif
SDS sodium dodecyl sulfate
snRNP small nuclear ribonucleoprotein
TLP thylakoid lumen protein
TM1367 Thermotoga maritima protein 1367
TPR tetratricopeptide repeat
LIST OF FIGURES
Chapter 1
Figure 2 Intersection of three (234) planes within a unit-cell
Figure 7 The anatomy of an X-ray diffractometer
Chapter 2
Figure 9 The chemical structure of Cyclosporin A
Figure 12 Peptidyl-prolyl isomerization reaction
Figure 13 Domain organization in Arabidopsis cyclophilins
Figure 14 The phylogenetic relationships of Arabidopsis cyclophilins
Figure 15 Arabidopsis cyclophilin isoforms & their sub-cellular localization
Figure 16 Structure overlap of various cyclophilin domains
Figure 18 Sequence alignment (by CLUSTALW) of AtCyP38 with TLP40
Chapter 4
Figure 19 SDS-PAGE showing the expression of soluble wt AtCyP38
Figure 20 SDS-PAGE showing the affinity chromatographic purification
Figure 21 SDS-PAGE showing the thrombin cleavage
Figure 22 Profile of size-exclusion chromatographic purification
Figure 23 SDS-PAGE of size-exclusion chromatography fraction
Figure 24 Native-PAGE of purified wt AtCyP38 (83-437)
Figure 25 Mass spectrometry for wt AtCyP38 (83-437)
Figure 27 Crystal of selenomethionylated AtCyP38
Figure 29 Secondary structural organization of AtCyP38
Figure 30 Overall structure of AtCyP38 (83-437)
Figure 31 Overall structure of AtCyP38 (83-437) in stereo view
Figure 32 Structure overlap for the helical bundle of AtCyP38 with PsbQ
LIST OF TABLES
Table 1 Crystal parameters and data-collection statistics for wt AtCyP38
Table 2 Crystal parameters and data-collection statistics for the multiple
Table 3 Structure refinement statistics for mutant AtCyP38
CHAPTER 1 MACROMOLECULAR X-RAY CRYSTALLOGRAPHY
1.1 PROTEIN STRUCTURE DETERMINATION
Proteins perform a variety of functions in living systems. Enzymes, which are proteins, catalyze and control the rate of chemical reactions in an organism. Other proteins function as structural, transport, signaling and defense entities of living systems. Even if we know the amino acid sequence of a protein, we still cannot appreciate its function and properties if we do not know what exactly the protein looks like or how the active site functions. Thus, knowing the 3-dimensional structure of a protein is of great significance.
Several methods are available for studying the 3-dimensional structure of proteins, such as Nuclear Magnetic Resonance (NMR) spectroscopy, X-ray crystallography and cryo-electron microscopy. At present, X-ray crystallography is the most effective of these techniques. Certainly, the other two techniques complement crystallography and are valued for their respective contributions to macromolecular structure determination. X-ray crystallographic techniques were first developed in 1912 and initially applied to small molecules. Building on this extraordinary success, the first diffraction pattern from a protein crystal, that of pepsin, was recorded in the 1930s (Bernal & Crowfoot, 1934). However, the complete application of X-ray diffraction to protein structure determination happened much later. The first protein structures, those of myoglobin (Kendrew et al., 1960) and hemoglobin (Perutz et al., 1960), were solved only in the late 1950s. Since then, macromolecular crystallography has progressed tremendously, and in the last 20 years nearly 35,000 protein structures have been determined. However, this collection represents only a very small fraction of the hundreds of thousands of protein molecules that actively define the process of ‘life’.
1.2 PROTEIN CRYSTALLIZATION
There are several bottlenecks in the determination of a crystal structure, of which obtaining a useful crystal is the most serious one. If one cannot collect diffraction data of suitable quality, protein structure determination will not be possible. Proteins have irregularly shaped surfaces, which result in the formation of large channels within a crystal. Therefore, the non-covalent bonds that hold together the lattice must often be hydrogen bonds, through intervening water molecules (Rhodes, 2000). Furthermore, the successful production of crystals for diffraction studies depends on a number of environmental factors such as pH, temperature and ionic strength, with each protein requiring a unique condition for successful crystallization. Therefore, protein crystallization is almost always a very tedious process.
Crystallization of proteins is achieved by driving a protein solution to super-saturation in several ways, such as cooling, evaporation of water, addition of an ionic solute (e.g. salt) and variation of pH. Once the protein solution becomes super-saturated, protein crystals tend to form upon the appearance of a microscopic nucleus, and crystallization can occur very rapidly.
Macromolecules always require some precipitants to induce nucleation and crystal growth. The most popular precipitants are inorganic salts (ammonium sulfate, sodium chloride etc.) and organic polyols, polyethylene glycol of different molecular weights, methylpentanediol and similar additives. A multitude of other parameters is varied, e.g. the buffer type and pH, temperature, purity and concentration of protein and other components. In order to obtain sufficient homogeneity, the protein should usually be at least 97% pure. Also, pH conditions, regulated by different buffers, are very important, as different pH values can result in different packing orientations within a crystal. Some proteins require small amounts of special additives such as phenol, isopropanol or various co-factors or effectors to produce good-quality crystals.
Many screening methods, such as full factorial, incomplete factorial, random and sparse matrix screens, are in use for the efficient identification of crystallization conditions of protein molecules. Of these, the most common one is the sparse matrix method, which utilizes conditions based on known successful crystallization conditions of macromolecules. Various sets of crystallization screening conditions, selected by sparse matrix sampling, have been proposed and many ready-made solutions are commercially available. These screens make use of several conditions that cover a wide range of pH, salts and precipitants. Additional information regarding a protein’s affinity for metal ions, ligands, etc. also helps in the crystallization process. Results from preliminary trials of crystallization screening can provide considerable information regarding the solubility of a protein and the choice of precipitant, pH and salts, and at times may even give crystals. Crystals obtained initially may not always diffract well and may need further optimization to give better diffraction quality.
There are several crystallization techniques, including sitting drop vapor diffusion, hanging drop vapor diffusion, sandwich drop, batch, micro-batch, under oil, vapor batch, micro-dialysis and free interface diffusion. The hanging drop vapor diffusion technique is the most popular method for the crystallization of macromolecules. The principle of vapor diffusion is quite straightforward. A drop containing a mixture of a protein sample and a reagent is placed in vapor equilibration with a liquid reservoir of the reagent. Typically, the drop contains a lower reagent concentration than the reservoir. To achieve equilibrium, water vapor leaves the drop and eventually ends up in the reservoir. As water leaves the drop, the protein undergoes an increase in relative super-saturation. Equilibrium is reached when the reagent concentration in the drop is approximately the same as that in the reservoir, favoring crystallization of the protein.
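The equilibration described above can be illustrated with a small mass-balance sketch. It rests on assumptions that are not part of the text: only water moves between drop and reservoir, the reservoir is large enough for its concentration to stay constant, and the protein stock contains no precipitant. The function name and units are hypothetical.

```python
def equilibrated_drop(v_protein_ul, v_reagent_ul, c_protein_stock, c_reagent_reservoir):
    """Idealized end state of a vapor-diffusion drop (illustrative assumptions only).

    Returns (final drop volume, initial protein conc., final protein conc.).
    """
    v_start = v_protein_ul + v_reagent_ul
    # Reagent is diluted by the protein solution when the drop is mixed.
    c_reagent_start = c_reagent_reservoir * v_reagent_ul / v_start
    # Water leaves until the drop matches the reservoir concentration;
    # the amount of reagent in the drop is conserved.
    v_end = v_start * c_reagent_start / c_reagent_reservoir
    # Protein is conserved too, so it is concentrated by the same factor.
    c_protein_start = c_protein_stock * v_protein_ul / v_start
    c_protein_end = c_protein_start * v_start / v_end
    return v_end, c_protein_start, c_protein_end


# 1 uL of protein at 10 mg/mL mixed with 1 uL of a 20% (w/v) precipitant reservoir.
v_end, c_start, c_end = equilibrated_drop(1.0, 1.0, 10.0, 20.0)
print(v_end, c_start, c_end)
```

Under these assumptions a 1:1 drop shrinks to half its volume, so the protein concentration doubles on the way to equilibrium.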
For decades, crystals were mounted in sealed glass (or quartz) capillaries, in the presence of a drop of mother liquor. A breakthrough in protein crystal handling is the cryo-cooling technique. Using this method, now commonly referred to as cryo-crystallography, the crystal and the mother liquor surrounding the crystal and inside its solvent channels can be solidified to a state of amorphous glass with the addition of a suitable cryoprotectant. The crystal is then scooped from the liquid with a small fiber loop and quickly plunged into liquid nitrogen or placed in a very cold nitrogen gas stream at temperatures around -173 °C (Teng, 1990). The crystal is preserved in its crystalline order to retain its diffraction properties (Hope, 1988). This method reduces mechanical stress and diminishes the amount of radiation damage induced by exposure to X-rays (Garman & Schneider, 1997). Various modern modifications in crystal handling protocols have made the whole process of getting good quality data-sets from a given protein crystal a lot easier than it was two decades ago.
1.3 BASIC CONCEPTS IN PROTEIN CRYSTALLOGRAPHY
1.3.1 Unit-cell, lattices and Miller indices
Molecules in a crystal are arranged in a translationally repeating volume, known as the unit-cell, with basis vectors a, b and c, and angles α, β and γ between them (Fig 1). The unit-cells in a crystal are stacked in three dimensions, in an orderly fashion, with the origins of the unit-cells forming a lattice.
If a ≠ b ≠ c and α ≠ β ≠ γ, the cell is called triclinic. If a ≠ b ≠ c, α = γ = 90° and β > 90°, the cell is monoclinic. If a = b ≠ c, α = β = 90° and γ = 120°, the cell is hexagonal. For cells in which all three cell angles are 90°: if a, b and c are equal, the cell is cubic; if a = b ≠ c, the cell is tetragonal; and if a ≠ b ≠ c, the cell is orthorhombic. The arrangement of molecules in a unit-cell is governed by symmetry (next section) and this symmetrical arrangement defines the system of that crystal. A particular crystal system is named after the unit-cell that its symmetry requires. However, the trigonal crystal system is treated with the hexagonal unit-cell in some symmetry arrangements and with the rhombohedral unit-cell (a = b = c; α = β = γ ≠ 90°) or its equivalent hexagonal unit-cell in others. Furthermore, in some cases, the geometry of a unit-cell may misleadingly suggest the crystal system named after that unit-cell without actually representing it. Only the symmetry elements that are present in the crystal decide the crystal system. For instance, if a unit cell is found to have all three angles very close to 90°, it does not imply that the crystal has to belong to the orthorhombic system. It could even be triclinic. To be orthorhombic, a crystal must have the minimum point group symmetry of 222 (next section).
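The cell-geometry rules listed above can be written as a short sketch. The function name and tolerance are hypothetical, and conventional axis settings are assumed (for example a = b for tetragonal and hexagonal cells). As the text stresses, the result is only the shape of the cell: the true crystal system is decided by the symmetry elements present, not by the metric alone.

```python
def metric_cell_type(a, b, c, alpha, beta, gamma, tol=1e-3):
    """Name the unit-cell shape suggested by the cell lengths and angles (degrees).

    This reports cell geometry only; the actual crystal system may be of
    lower symmetry than the geometry suggests.
    """
    def eq(x, y):
        return abs(x - y) <= tol

    if eq(alpha, 90) and eq(beta, 90) and eq(gamma, 90):
        if eq(a, b) and eq(b, c):
            return "cubic"
        if eq(a, b):
            return "tetragonal"
        return "orthorhombic"
    if eq(a, b) and eq(alpha, 90) and eq(beta, 90) and eq(gamma, 120):
        return "hexagonal"
    if eq(a, b) and eq(b, c) and eq(alpha, beta) and eq(beta, gamma):
        return "rhombohedral"
    if eq(alpha, 90) and eq(gamma, 90):
        return "monoclinic"  # beta is the unique, non-90 degree angle
    return "triclinic"


print(metric_cell_type(57.0, 96.0, 121.0, 90.0, 90.0, 90.0))
```

A cell reported as "orthorhombic" here would still need the 222 point-group symmetry to be confirmed from the diffraction data.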
In a crystal, unit-cells are imagined to be arranged contiguously to fill space. Imagine that the origin of each unit-cell is represented by a point. In the ‘Primitive’ arrangement, designated as P, only one lattice point is present per unit-cell. For convenience, when smaller primitive cells are enclosed in a larger non-primitive cell, two or more lattice points will be present in the non-primitive unit-cell. Such cells are designated A, B or C: A if the bc face is centered (bears a lattice point of the original small unit-cell), B for ac centering and C for ab centering. If all the faces are centered, the designation is F, and if there is a lattice point at the center of the non-primitive unit-cell, it is body-centered, designated as I. The cubic crystal system can have P, F or I lattices; the hexagonal system has P; the trigonal system can have a primitive hexagonal lattice or a rhombohedral lattice; the tetragonal system can have P and I; the orthorhombic system can have P, F, I and A/B/C; the monoclinic system can have the P or C lattice; and the triclinic system can only have a primitive lattice. These 14 possible lattices are collectively called the ‘Bravais lattices’.
For the purpose of X-ray diffraction (section 1.3.4), the unit-cell is imagined to be sliced into planes and the planes are labeled using ‘Miller indices’. The directions of the lattice vectors a, b and c are first identified, where a, b and c are the dimensions of the axes. A set of parallel planes is denoted (h k l), where h, k and l are the integral numbers of divisions made by the planes on the a, b and c axes of each unit-cell, respectively. Thus the (1 1 1) plane intercepts all three axes at 1, the end of each axis. The (1 0 0) plane intercepts the a axis at 1 but never intercepts the b and c axes; thus, the (1 0 0) plane is perpendicular to the a axis and lies parallel to the bc plane. However, no set of planes is assumed with fractional index values: it is not possible to have a set of parallel planes that cut the unit-cell axes fractionally and still obey the law of rational indices (Fig 2).
Figure 2 Intersection of three (234) planes within a unit-cell
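The relation between Miller indices and axis intercepts can be sketched in a few lines. The helper name is hypothetical; it returns, for the plane of the (h k l) set that lies nearest the origin, the intercept on each axis as a fraction of the axis length, with `None` standing for "never intercepts".

```python
from fractions import Fraction


def axis_intercepts(h, k, l):
    """Intercepts of the (h k l) plane nearest the origin, as fractions of a, b, c.

    An index of zero means the plane runs parallel to that axis (no intercept).
    """
    return tuple(Fraction(1, idx) if idx != 0 else None for idx in (h, k, l))


print(axis_intercepts(2, 3, 4))  # cuts a, b and c into 2, 3 and 4 parts
print(axis_intercepts(1, 0, 0))  # parallel to the bc plane
```

Because the indices are integers, the intercepts are always of the form 1/h, 1/k and 1/l, which is the law of rational indices in its simplest form.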
1.3.2 Symmetry, point groups and space groups
Symmetry is defined as the operation that, when applied to a molecular object, generates a copy of the molecular structure that is crystallographically indistinguishable from the original object (Drenth, 2001). However, pseudo symmetry differs in that it generates a copy of a similar object that is not crystallographically identical.
When a molecule moves along a, b or c of the unit cell, translational symmetry is applied and the generated objects are identical with the original object. Other symmetry operators in crystallography are rotation, reflection and inversion. In the rotational symmetry operation, only rotational angles of 60, 90, 120, 180 and 360° are allowed. The type of rotation is represented by an integer n (n = 1, 2, 3, 4 or 6), where n is the number of times a molecule is repeated within 360°; 360°/n thus gives the rotational angle. Another type of rotational symmetry axis is the screw axis, which combines rotation with translation. For example, in a two-fold screw operation, a molecule is first rotated by 180° and then translated by half of the unit-cell length in the positive direction of the axis. Reflection and inversion symmetry operations are not applicable to protein crystals, because they would require protein molecules built from D-amino acids, which do not occur in natural proteins.
Point groups describe the assembly of crystallographic symmetry elements without any translation in the unit-cell. For example, point group 222 (the three digits represent operations carried out on the a, b and c axes, respectively) has two-fold rotations on the a, b and c axes. Space groups, on the other hand, provide more detailed information on how molecules are arranged in the unit-cell. A space group symbol includes information about the lattice type of the crystal, the point group and the translation operations. For example, space group P2₁2₁2 has two-fold screw operations on both the a and b axes and a two-fold rotation on the c axis. The arrangement of unit-cells in the crystal is primitive, as indicated by ‘P’. The smallest unit that can be rotated and translated to generate one unit cell using only the symmetry operators allowed by the crystallographic symmetry is called the asymmetric unit. It may be one molecule or one subunit of an oligomeric protein, but it can also be more than one.
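As a minimal sketch of how such operators act on fractional coordinates, the code below applies a two-fold screw axis and lists the equivalent positions for P2₁2₁2. The function names are hypothetical. The screw axis is placed through the origin for simplicity, and the four general positions are those commonly tabulated for the standard setting of this space group; both choices should be treated as assumptions and checked against the International Tables before any real use.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def twofold_screw_b(x, y, z):
    """2(1) screw along b through the origin: rotate 180 degrees about b, shift by b/2."""
    return (-x % 1, (y + HALF) % 1, -z % 1)


def p21212_equivalents(x, y, z):
    """General equivalent positions assumed for P2(1)2(1)2, reduced into one unit-cell."""
    ops = [
        (x, y, z),                    # identity
        (-x, -y, z),                  # two-fold rotation along c
        (HALF - x, HALF + y, -z),     # two-fold screw along b
        (HALF + x, HALF - y, -z),     # two-fold screw along a
    ]
    return [tuple(Fraction(coord) % 1 for coord in op) for op in ops]


atom = (Fraction(1, 10), Fraction(1, 5), Fraction(3, 10))
print(twofold_screw_b(*twofold_screw_b(*atom)) == atom)  # two screws = one lattice translation
for position in p21212_equivalents(*atom):
    print(position)
```

Applying the screw operation twice moves the atom by one full cell edge, which is why it returns to the same fractional coordinates; the four printed positions correspond to four asymmetric units per unit-cell.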
1.3.3 Crystals and X-rays
In a simple microscope, an illuminated object is placed within the focal point of the biconvex lens, which forms a virtual and larger image beyond the focal point at the opposite side of the lens. Unfortunately, X-rays cannot be focused by any lens and thus one has to measure the directions and strengths (intensities) of the X-rays diffracted by a molecule and then reconstruct the image of the molecule using a computer. X-ray scattering from a single molecule would be unimaginably weak and could never be detected above the noise level, which would include scattering from air and water. However, a crystal accommodates a huge number of molecules, so that the waves scattered by these molecules can add up and raise the signal to a measurable level. In a sense, a crystal acts as an amplifier. The goal of crystallization is usually to produce a well-ordered crystal, free of contaminants and large enough to provide a diffraction pattern when exposed to X-rays. The resulting diffraction pattern can then be analyzed to discern the protein’s three-dimensional structure.
If we want to visualize an object using electromagnetic radiation, the radiation should have a wavelength comparable to the smallest features that we wish to resolve. We often use the characteristic Kα X-rays that are emitted from a copper target when bombarded with high energy electrons. Their wavelength of 1.5418 Å is quite close to the distance between two bonded carbon atoms and hence is well suited for protein structure studies.
1.3.4 X-ray diffraction
X-rays, which are electromagnetic waves, interact with matter, particularly electrons. A wave can be described by its amplitude (the height of the peak) and its frequency (how many times it repeats per unit time). If we call the amplitude A, the time t and the circular velocity (angular frequency) ω, then the equation describing this wave is given by:
$$E(t) = A\cos(\omega t) \qquad (1.1)$$
X-rays interact with matter and get scattered (or re-emitted) in all directions from the electrons they encounter. These scattered rays travel different distances, as they originate from different positions in a crystal, and hence they differ in their relative phases. When two waves interfere with each other, the resulting amplitude is the sum of the individual amplitudes if they are in phase with each other, or the difference of the individual amplitudes if they are out of phase with each other (Fig 3).
Figure 3 Interference of two waves. In the left panel, two waves that are in exact phase add up, while in the right panel two waves with opposing phases annul each other.
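The two limiting cases of interference can be checked numerically by representing each wave as a complex number (amplitude and phase) and adding them. This is a minimal sketch; the function name is hypothetical and both waves are assumed to have the same frequency.

```python
import cmath
import math


def resultant_amplitude(a1, a2, phase_difference):
    """Amplitude of the sum of two equal-frequency waves.

    phase_difference is the phase of the second wave relative to the first,
    in radians.
    """
    return abs(a1 + a2 * cmath.exp(1j * phase_difference))


print(resultant_amplitude(1.0, 2.0, 0.0))          # in phase: amplitudes add
print(resultant_amplitude(1.0, 2.0, math.pi))      # out of phase: amplitudes subtract
print(resultant_amplitude(1.0, 1.0, math.pi))      # equal and opposite: annulled
```

Any intermediate phase difference gives an amplitude between these two extremes.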
The atoms, and hence the electrons, of a crystal interact with X-ray waves in such a way as to produce interference. This interaction can be described as if the atoms reflect the incoming X-ray waves. Furthermore, the reflections appear to occur from regularly arranged planes within a crystal, which is composed of an orderly arrangement of atoms.
1.3.5 Bragg’s law
When X-rays of wavelength λ impinge upon a set of parallel planes with index hkl and inter-planar spacing $d_{hkl}$ at an angle θ and reflect at the same angle, they will produce a diffracted beam only if θ meets the condition:
$$2 d_{hkl} \sin\theta = n\lambda \qquad (1.2)$$
where n is an integer (Rhodes, 2000).
Consider two parallel rows of lattice points with inter-planar spacing $d_{hkl}$ (Fig 4). Two rays R1 and R2 are reflected from them at an angle θ. Line AC is drawn from the point of reflection A of R1, perpendicular to ray R2. If ray R2 is reflected at B, then the diagram shows that R2 travels the same distance as R1 plus an additional distance of 2BC. Because AB in the triangle ABC is perpendicular to the atomic plane, and AC is perpendicular to the incident ray, the angle CAB equals θ, the angle of incidence. Because ABC is a right-angled triangle, the sine of angle θ is BC/AB or BC/$d_{hkl}$. Thus BC equals $d_{hkl}\sin\theta$, and the additional distance 2BC traveled by ray R2 is $2d_{hkl}\sin\theta$.
If this difference in path length for rays reflected from successive planes is equal to an integral number of wavelengths of the impinging X-rays (satisfying equation 1.2), then the rays are in phase with each other, interfering constructively to produce a strong diffracted beam.
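Bragg's condition is easy to evaluate numerically. The sketch below solves it for θ; the function name is hypothetical and the default wavelength is the Cu Kα value quoted in section 1.3.3. When nλ exceeds 2d, no angle satisfies the condition and the function returns `None`.

```python
import math


def bragg_angle_deg(d_spacing, wavelength=1.5418, order=1):
    """Diffraction angle theta (degrees) from 2 d sin(theta) = n lambda.

    d_spacing and wavelength are in angstroms. Returns None when the planes
    are too closely spaced to diffract at this wavelength.
    """
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if sin_theta > 1.0:
        return None
    return math.degrees(math.asin(sin_theta))


for d in (10.0, 2.5, 1.5418, 0.5):
    print(d, bragg_angle_deg(d))
```

The printed angles grow as the spacing shrinks, and the last spacing (0.5 Å) cannot diffract with Cu Kα radiation at all.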
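1.3.6 Reciprocal space
According to Bragg’s law, the diffraction angle increases as the inter-planar spacing decreases, so the reflection from a set of planes with a smaller inter-planar spacing is recorded farther from the direct beam position than that for a set of planes with a greater inter-planar spacing.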
Figure 5 The reciprocal lattice
In Fig 5, the Bragg planes and the incoming and reflected rays are shown, as before, for two diffraction angles, but now a vector perpendicular to the Bragg plane is added for each. The phase of a wave diffracting from an object lying between two Bragg planes depends on the fraction of the distance of the object from one Bragg plane to the next. If the position of the object is considered as a vector, then the distance of that object from one of the Bragg planes can be obtained by projecting that vector onto the plane normal. The phase shift can then be obtained by dividing the projected distance by the Bragg spacing between the planes. Mathematically, we can carry out the projection and division in one step by giving the plane normal a length equal to the reciprocal of the Bragg spacing, and then computing the dot product between the position vector and the plane normal. Because the plane normal is a vector with a length reciprocal to the spacing in the object, we call it a vector in reciprocal space.
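The projection-and-division argument above can be sketched as a dot product. To keep the example short it assumes a cell with all angles equal to 90°, so that the reciprocal-space vector for (h k l) is simply (h/a, k/b, l/c); a general cell would need the full reciprocal basis. All function names are hypothetical.

```python
import math


def reciprocal_vector(hkl, cell):
    """Plane normal of length 1/d for an orthogonal cell (a, b, c) in angstroms."""
    return tuple(index / length for index, length in zip(hkl, cell))


def d_spacing(hkl, cell):
    """Bragg spacing: the reciprocal of the length of the reciprocal-space vector."""
    return 1.0 / math.sqrt(sum(s * s for s in reciprocal_vector(hkl, cell)))


def phase_of_point(hkl, cell, position):
    """Phase (radians) of a scatterer at a Cartesian position given in angstroms."""
    s = reciprocal_vector(hkl, cell)
    # Dot product = projected distance divided by the Bragg spacing.
    fraction = sum(si * ri for si, ri in zip(s, position))
    return 2.0 * math.pi * (fraction % 1.0)


cell = (10.0, 20.0, 30.0)
print(d_spacing((1, 0, 0), cell), d_spacing((2, 0, 0), cell))
print(phase_of_point((2, 0, 0), cell, (2.5, 0.0, 0.0)))  # halfway between planes
```

A scatterer lying halfway between two (2 0 0) planes comes out with a phase of π, i.e. exactly out of phase with scatterers on the planes themselves.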
1.3.7 The Ewald sphere
Bragg’s law can be rearranged in the reciprocal space.
$$\sin\theta = \frac{1/d_{hkl}}{2/\lambda} \qquad (1.3)$$
Fig 6 demonstrates how each reciprocal lattice point must be arranged with respect to the X-ray beam in order to satisfy Bragg’s law and produce a reflection from the crystal. Since the Ewald sphere is three dimensional, reflections by hkl planes when exposed to X-rays form a three dimensional network of diffraction spots. If Ewald’s sphere has a diameter of 2/λ, then any reciprocal-lattice point within a distance 2/λ from the origin can be rotated into contact with the sphere to form a diffraction spot.
Figure 6 Ewald sphere
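The Ewald construction can be tested for a single reciprocal-lattice point with a few lines of geometry. The coordinate convention below is an assumption made for this sketch: the beam travels along +x, the origin of the reciprocal lattice sits at (0, 0, 0) where the direct beam leaves the sphere, and the sphere of radius 1/λ is therefore centred at (-1/λ, 0, 0). The function name and tolerance are hypothetical.

```python
import math


def ewald_check(point, wavelength, tol=1e-6):
    """Return (on_sphere, reachable) for a reciprocal-lattice point in 1/angstrom.

    on_sphere: the point currently satisfies Bragg's law.
    reachable: the point lies within 2/wavelength of the origin, so some
               crystal rotation can bring it onto the sphere.
    """
    radius = 1.0 / wavelength
    x, y, z = point
    distance_to_centre = math.sqrt((x + radius) ** 2 + y ** 2 + z ** 2)
    distance_to_origin = math.sqrt(x ** 2 + y ** 2 + z ** 2)
    on_sphere = abs(distance_to_centre - radius) <= tol
    reachable = 0.0 < distance_to_origin <= 2.0 * radius
    return on_sphere, reachable


print(ewald_check((-1.0, 1.0, 0.0), wavelength=1.0))  # diffracting now
print(ewald_check((0.5, 0.0, 0.0), wavelength=1.0))   # needs a rotation first
print(ewald_check((3.0, 0.0, 0.0), wavelength=1.0))   # beyond the 2/lambda limit
```

The third point lies outside the limiting distance 2/λ and can never be recorded at this wavelength, which is the resolution limit implied by the construction.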
1.3.8 Fourier transform and structure factor
The atomic arrangement (or precisely, the electron density) in a crystal is related to all diffraction spots through the Fourier transformation. Thus, the electron density at any point can be calculated by Eq 1.4:
$$\rho(x,y,z) = \frac{1}{V}\sum_{hkl} F_{hkl}\, e^{-2\pi i(hx+ky+lz)} \qquad (1.4)$$
where V is the volume of the unit-cell and ρ(x, y, z) is the electron density. This transformation is accurate and in principle complete. In the above equation, if we know $F_{hkl}$, the structure factors (the reciprocal-space description of the diffraction by electrons), we can calculate the actual real structure (the density of electrons in real space).
The structure factor $F_{hkl}$ for a reflection h,k,l is a complex number, derived in a straightforward manner as follows:
$$F_{hkl} = \sum_{j=1}^{n} f_j\, e^{2\pi i(hx_j + ky_j + lz_j)} \qquad (1.5)$$
This is a simple summation, which extends over all atoms j, with fractional coordinates $x_j$, $y_j$, $z_j$. The term $f_j$ is the scattering factor of atom j and depends on its atomic number and the diffraction angle of the corresponding reflection (h,k,l). The X-ray scattering power of an atom, f, is higher for heavier atoms and decreases exponentially with increasing scattering angle. A plot of the scattering factor f in units of electrons vs sinθ/λ shows this behavior. Note that for scattering angle zero, the value of f equals the number of electrons of the atom. In actual cases there will be additional weakening of the scattering power of atoms by the temperature factor of the atoms, so that
$$f_B = f\, e^{-B(\sin\theta/\lambda)^2} \qquad (1.6)$$
where B is related to the mean-square displacement ⟨u²⟩ of a vibrating atom by the Debye-Waller equation, $B = 8\pi^2\langle u^2\rangle$.
1.3.9 Phase problem
Equation 1.5 shows that the atomic positions are needed to calculate the structure factor of a reflection. This approach is logically unacceptable, since we would have to know the atomic distribution of a unit-cell to calculate an intermediate structure factor $F_{hkl}$, which would then be used to calculate the electron density distribution (or the atomic distribution). In other words, to solve a structure, information on all diffracted waves
is required, that is, their amplitude, frequency and phase. The frequency is determined by the X-ray wavelength, while the amplitude is directly obtained by measuring reflection intensities. The phase information, which is dependent on the position of each atom in the unit-cell, is however not measurable. This is called the ‘phase problem’ in crystallography. Solving the phase problem is the crucial step in crystal structure determination.
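The dependence of the phase on atomic positions can be made concrete by evaluating Eq 1.5 for a toy structure. In this sketch the scattering factors are taken as constants (the zero-angle values, i.e. the number of electrons), which is a simplifying assumption; the atom list and function names are hypothetical. Shifting every atom by a quarter of the cell along a leaves the measurable amplitude unchanged but alters the phase.

```python
import cmath


def structure_factor(hkl, atoms):
    """Eq 1.5 with angle-independent scattering factors.

    atoms is a list of (f, x, y, z) tuples with fractional coordinates.
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)


def shifted(atoms, dx):
    """Copy of the structure translated by dx (fractional) along a."""
    return [(f, x + dx, y, z) for f, x, y, z in atoms]


atoms = [(6, 0.10, 0.20, 0.30), (8, 0.40, 0.10, 0.70)]  # a carbon and an oxygen
f_original = structure_factor((1, 2, 3), atoms)
f_shifted = structure_factor((1, 2, 3), shifted(atoms, 0.25))

print(abs(f_original), abs(f_shifted))                   # same amplitude
print(cmath.phase(f_original), cmath.phase(f_shifted))   # different phase
```

Since only the intensity, proportional to the squared amplitude, is recorded, the two structures cannot be told apart from this reflection alone without phase information.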
The methods by which the phase angle can be estimated are discussed later. The phase obtained from these methods is, however, only an estimate. To improve its accuracy, refinement is carried out in both reciprocal and real space, and the Fourier transformation allows interconversion between the two spaces.
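As a minimal sketch of Eq. 1.5 and Eq. 1.6, the following Python fragment sums the atomic contributions to one structure factor and then separates it into the measurable amplitude and the unmeasurable phase. The two-atom "structure", its scattering factors and B values are made up for illustration, and the function name `structure_factor` is hypothetical, not taken from any crystallographic package.

```python
import cmath
import math

def structure_factor(hkl, atoms, sin_theta_over_lambda=0.0):
    """Eq. 1.5, with each f damped by the temperature factor of Eq. 1.6.

    atoms: iterable of (f, x, y, z, B); f in electrons, x/y/z fractional,
    B in A^2. All numbers used below are illustrative assumptions.
    """
    h, k, l = hkl
    total = 0j
    for f, x, y, z, b in atoms:
        f_b = f * math.exp(-b * sin_theta_over_lambda ** 2)
        total += f_b * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total

# Hypothetical two-atom structure (roughly a carbon and an oxygen), B = 0.
atoms = [(6.0, 0.10, 0.20, 0.30, 0.0), (8.0, 0.40, 0.15, 0.25, 0.0)]
# The same atoms after an arbitrary shift of the unit-cell origin.
shifted = [(f, x + 0.17, y + 0.05, z + 0.31, b) for f, x, y, z, b in atoms]

F = structure_factor((1, 2, 3), atoms)
F_shifted = structure_factor((1, 2, 3), shifted)

amplitude, phase = abs(F), cmath.phase(F)   # only the amplitude is measurable
intensity = amplitude ** 2                  # measured I is proportional to |F|^2
```

Both atom sets give the same amplitude for the (1,2,3) reflection but different phases, which is one way to see that the measured intensities alone do not fix the phase.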
1.4 GEOMETRIC DATA COLLECTION
For crystal structure determination, the intensities of all diffracted beams must be measured. All corresponding reciprocal lattice points must be brought into the diffracting position by rotating the crystal, and with it the reciprocal lattice. First, the geometry of diffraction, including the shape, size and symmetry of the reciprocal and direct lattices, is established. Second, the intensity at every point in the reciprocal lattice is measured; these intensities are ultimately related to the distribution of diffracting electrons in the unit-cell.
Figure 7. The anatomy of an X-ray diffractometer.
An X-ray diffraction instrument consists of two parts: a mechanical part that rotates the crystal and a detecting device that measures the intensities of the diffracted beams (Fig. 7). A crystal can be rotated about three independent axes (ω, χ and φ) so that any reciprocal lattice point can be brought into the diffraction position. Different detectors use different physical mechanisms to record X-ray reflections. The most popular systems include photographic film, the multi-wire proportional chamber (MWPC), solid-state TV and charge-coupled device (CCD) detectors, and the image plate (IP, which records the signal via color centers that are subsequently read out by a laser scan).
1.4.1 Data reduction
In a diffraction experiment we measure the intensities and positions of reflections. From the position of a reflection we can determine its index triple (h,k,l) and associate it with the measured intensity. This intensity is proportional to the square of the structure amplitude, |F_hkl|. Correct Miller indices are assigned and intensities are measured for all observed reflections. The reflections are then scaled to remove systematic errors introduced by effects such as absorption (arising from irregular crystal shape), non-linearity in the monitoring of the incident beam intensity by the detector, and changes in the average diffracted intensity due to variation in the total diffracting volume as part of the crystal moves into or out of the incident beam. The data must also be corrected for the thermal motion of atoms (which causes the fall-off of intensity with increasing scattering angle) and for radiation damage (which reduces the intensity as a function of resolution).
The usefulness of a dataset depends on its high-resolution limit. It has been shown that the correctness of a solved structure increases with the resolution of the data used (Hubbard and Blundell, 1987). Before the availability of sophisticated crystal cooling devices, which reduce radiation damage to the crystal, diffraction data would be collected from several crystals. In practice, multiple observations of symmetry-related reflections (either from different crystals or from the same crystal) are merged to give only the unique reflections for that particular crystal system. The quality of the merged data is assessed by Rsym (Eq. 1.8), with typical values of less than 3% for low-resolution data and up to 20% for data near the high-resolution diffraction limit (Ealick, 1995).
R_{sym} = \frac{\sum_{hkl} \sum_{i} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}{\sum_{hkl} \sum_{i} I_i(hkl)}    (1.8)
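The merging statistic of Eq. 1.8 can be sketched in a few lines of Python. The function name `r_sym`, the dictionary layout and the intensity values are illustrative assumptions; skipping reflections that were measured only once is likewise an assumption made here, since such reflections cannot contribute to the numerator.

```python
def r_sym(observations):
    """Eq. 1.8 for a dict mapping each unique (h, k, l) to its observations.

    Each value is a list of intensities I_i(hkl) for symmetry-related
    measurements of the same unique reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            continue  # assumed convention: single measurements are left out
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator

# Made-up observations for three unique reflections.
obs = {
    (1, 0, 0): [100.0, 104.0, 96.0],
    (1, 1, 0): [50.0, 52.0],
    (2, 0, 1): [10.0, 10.0, 10.0, 10.0],
}
value = r_sym(obs)   # 10 / 442, i.e. about 2.3%
```

For these made-up numbers the summed absolute deviations are 8 + 2 + 0 = 10 and the summed intensities are 300 + 102 + 40 = 442.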
Cowtan and Main (1996) have shown how missing reflections affect the reconstruction of an image from diffraction data. The completeness of the low-resolution data is important for placing missing parts of the structure, while refinement may benefit from the inclusion of high-resolution data even if the merging R is as high as 40% (Dodson et al., 1996). Depending on the structure determination method, the Bijvoet pairs of a reflection can either be treated separately to give h,k,l,F+,F- (acentric) or averaged to give h,k,l,F (centric).
1.5 STRUCTURE DETERMINATION
1.5.1 Phasing methods
As discussed earlier, to solve a structure, information on all diffracted waves that originate from a crystal is required. The frequency is determined by the X-ray wavelength, while the amplitude is obtained directly from the measured reflection intensities. The phase information, which depends on the position of each atom in the unit-cell, is however not measurable and must be obtained in some other way. Most new structures are solved by experimental phasing methods such as isomorphous replacement and anomalous scattering. The molecular replacement method is used when a protein molecule has high sequence, and hence structural, similarity to an already solved protein structure. Some structures, under favorable conditions, may be solved by direct methods as well.
The original method used to obtain phase angles for unknown protein structures was a purely experimental one, developed in the 1940s, and is known as multiple isomorphous replacement (MIR). In this technique, an X-ray data set is first collected for the protein crystal (known as the native crystal). Following this, a specific heavy atom (i.e., one with a high atomic number), such as mercury, platinum or gold, is introduced into the protein crystal. This is achieved either by soaking the metal into pre-formed protein crystals or by co-crystallization of the protein-metal complex. A complete X-ray diffraction data set for the protein-metal complex is also collected. Since the heavy atoms contain more electrons than the light atoms that make up the protein, they strongly influence the resulting diffraction pattern and alter the intensities of the spots. In addition, these heavy atoms act as reference markers whose locations in the protein crystal can be determined unambiguously with Patterson methods. The heavy atom positions, in combination with the observed changes in the spot intensities, allow an initial set of phases to be determined, based largely on the phases calculated for the heavy atom positions alone. For the MIR method to work, two criteria must be satisfied. First, the investigator must collect a full X-ray diffraction data set for the unmodified protein and additional data sets for at least two different heavy atom derivatives (e.g., a mercury derivative and a platinum derivative), in which the heavy atoms bind to different parts of the protein and reside at different positions in the unit-cell. This requirement for multiple heavy atom derivatives accounts for the name of the multiple isomorphous replacement method. Second, the crystal structure of the protein containing the heavy atoms must be essentially identical (isomorphous, meaning ‘same shape’) to that of the unmodified protein alone.
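Why at least two derivatives are needed can be illustrated with a short numerical sketch. It assumes perfect isomorphism, so that F_PH = F_P + F_H, from which the standard cosine-rule relation |F_PH|² = |F_P|² + |F_H|² + 2|F_P||F_H|cos(α_P - α_H) follows; this relation is not written out in the text above. All amplitudes and phases are made-up values and the function names are hypothetical. One derivative leaves two possible native phases, and the second derivative selects the one they have in common.

```python
import cmath
import math

def candidate_phases(f_p, f_ph, f_h):
    """Two native phases (radians) compatible with a single derivative.

    f_p, f_ph: measured amplitudes |F_P| and |F_PH|.
    f_h: complex heavy-atom structure factor, calculated from the heavy
    atom positions located beforehand.
    """
    cos_term = (f_ph ** 2 - f_p ** 2 - abs(f_h) ** 2) / (2 * f_p * abs(f_h))
    cos_term = max(-1.0, min(1.0, cos_term))   # guard against rounding
    delta = math.acos(cos_term)
    alpha_h = cmath.phase(f_h)
    return (alpha_h + delta, alpha_h - delta)

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))

# Made-up "true" native structure factor; its phase is unknown in practice.
F_P = cmath.rect(100.0, math.radians(50.0))
F_H1 = cmath.rect(30.0, math.radians(120.0))   # heavy atoms of derivative 1
F_H2 = cmath.rect(25.0, math.radians(-40.0))   # heavy atoms of derivative 2

pair1 = candidate_phases(abs(F_P), abs(F_P + F_H1), F_H1)
pair2 = candidate_phases(abs(F_P), abs(F_P + F_H2), F_H2)

# The candidate on which both derivatives agree is taken as the native phase.
best = min(((a, b) for a in pair1 for b in pair2),
           key=lambda ab: angle_diff(*ab))
alpha_p = math.degrees(best[0]) % 360.0
```

With real data the measured amplitudes carry errors and isomorphism is imperfect, so the two pairs never agree exactly and the phase is treated as a probability distribution rather than a single value.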
If an experimenter knows beforehand that the protein in question is similar in structure to a protein of known structure, the known structure can be used as a model to search for the orientation and position of the unknown structure in the unit-cell, thus solving the ‘phase problem’. Here the phases calculated from the known protein are grafted onto the experimentally determined intensities. The known protein is called the phasing model and the method is called the molecular replacement method. This method uses the Patterson maps (essentially inter-atomic vector maps) of both the phasing model and the unknown protein. The Patterson map of the phasing model is rotated and translated in the unit-cell, and the best match between the two Patterson maps reveals the phase information. Molecular replacement works well if the starting assumption is correct, that is, if the structures of the query protein and the reference protein are truly similar. If the starting assumption is invalid, the technique can give poor or misleading results, or no solution at all.
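The statement that a Patterson map is an inter-atomic vector map can be checked with a one-dimensional sketch. It assumes the usual definition of the Patterson function as a Fourier series with |F|² as coefficients and no phases; the two-atom cell, the resolution cut-off `H_MAX` and the grid size `N_GRID` are arbitrary choices made for illustration.

```python
import cmath
import math

# Hypothetical 1D "crystal": (scattering factor, fractional coordinate).
atoms = [(6.0, 0.10), (8.0, 0.40)]
H_MAX = 10      # reflections h = -10 ... 10
N_GRID = 100    # sampling points across the unit-cell

def intensity(h):
    """|F_h|^2 from the 1D form of Eq. 1.5; the phase is discarded."""
    f_h = sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)
    return abs(f_h) ** 2

def patterson(u):
    """P(u) = sum over h of |F_h|^2 cos(2 pi h u)."""
    return sum(intensity(h) * math.cos(2 * math.pi * h * u)
               for h in range(-H_MAX, H_MAX + 1))

p = [patterson(i / N_GRID) for i in range(N_GRID)]

# Local maxima on the periodic grid, strongest first.
peaks = [i for i in range(N_GRID)
         if p[i] > p[i - 1] and p[i] > p[(i + 1) % N_GRID]]
peaks.sort(key=lambda i: p[i], reverse=True)
top_three = sorted(peaks[:3])   # grid indices of the three strongest peaks
```

The three strongest peaks fall at u = 0 (every atom paired with itself) and at u = 0.30 and 0.70, that is, at plus and minus the inter-atomic vector 0.40 - 0.10. The atomic positions themselves do not appear, only their differences.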
Direct, or ab initio, methods rely on the existence of mathematical relationships among certain combinations of phases. They are less used in protein crystallography, as these algorithms are likely to succeed only if the protein molecule has fewer than 100 residues and the data extend to high resolution, say 1.5 Å or better. However, direct methods are used successfully in conventional crystallographic structure determination of small molecules, where these limitations pose few problems. With direct methods, all that is required is a single set of native diffraction intensities. The methods use sophisticated probability theory and the assumption of approximately equal, resolved atoms to estimate reflection phases from the measured intensities.
1.5.1.1 The Multi-wavelength anomalous dispersion (MAD) method
As the structure reported in this thesis was determined using this method, it is explained in relative detail. In the vicinity of the X-ray absorption edge (essentially resonance absorption) of an atom, its scattering factor f is given by
of the selenomethionine-containing protein are grown and are used for MAD experiments.
When X-rays interact with atoms heavier than carbon, nitrogen and oxygen, resonance occurs in addition to normal scattering. A fraction of the incident X-rays is absorbed, causing electrons to undergo a quantum transition. This is followed by re-emission of X-rays at the same wavelength, but out of phase by 90º. This type of scattering is referred to as ‘anomalous’ dispersion or scattering. For most heavy atoms at typical X-ray wavelengths, this effect is quite small and is usually ignored; it was ignored, for example, in the discussion of MIR above. However, the selenium in selenomethionine (like certain other heavy atoms) scatters anomalously quite efficiently at an appropriate wavelength.
In a typical MAD experiment, a crystallographer collects three complete X-ray diffraction data sets from a single crystal that contains an anomalous scatterer, say selenium. One data set is collected at the resonance wavelength (called the peak wavelength), where anomalous scattering by selenium is at its maximum. A second data set is collected at a wavelength corresponding to the edge of X-ray resonance absorption (known as the inflection). The third data set is collected at a wavelength remote from the resonance wavelength for selenium, where the anomalous scattering effect can largely be ignored. The use of multiple wavelengths to exploit anomalous scattering is the reason why this technique is named multi-wavelength anomalous dispersion, or MAD.
The MAD method helps the crystallographer solve the protein structure in two ways. First, the selenium atoms, like any heavy atom, alter the intensities of spots in the diffraction pattern. When the anomalous scattering effect is maximized, Friedel’s law no longer holds, i.e. |F_hkl| ≠ |F_-h-k-l|. This difference in intensity between symmetry-related (Friedel) reflections allows the positions of the selenium atoms to be determined, and an initial set of phase angles can then be assigned to each spot in the diffraction pattern. Second, once the selenium positions are known, they act as anchor points or landmarks for methionine residues and thus help in tracing and modeling the protein chain.
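The breakdown of Friedel’s law can be reproduced numerically by allowing the scattering factor in Eq. 1.5 to be complex. The sketch below assumes the common notation f = f0 + f' + i f'' for an anomalous scatterer; the coordinates and the values f' = -8 and f'' = 4 are illustrative and are not tabulated selenium values. The function names are hypothetical.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Eq. 1.5 with scattering factors that may be complex.

    atoms: iterable of (f, x, y, z) with x/y/z fractional; for an anomalous
    scatterer f = f0 + f' + i f'' (assumed notation).
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

def friedel_difference(atoms, hkl=(1, 2, 3)):
    """Absolute difference between |F(hkl)| and |F(-h,-k,-l)|."""
    h, k, l = hkl
    f_plus = structure_factor((h, k, l), atoms)
    f_minus = structure_factor((-h, -k, -l), atoms)
    return abs(abs(f_plus) - abs(f_minus))

# Made-up light atoms (C, N, O) of a hypothetical structure.
light = [(6.0, 0.10, 0.20, 0.30),
         (7.0, 0.45, 0.05, 0.60),
         (8.0, 0.70, 0.35, 0.15)]

# Selenium-like scatterer far from the edge: anomalous terms neglected.
se_remote = (34.0 + 0.0j, 0.25, 0.60, 0.40)
# Same atom near the edge, with illustrative f' = -8 and f'' = 4.
se_peak = (34.0 - 8.0 + 4.0j, 0.25, 0.60, 0.40)

diff_remote = friedel_difference(light + [se_remote])
diff_peak = friedel_difference(light + [se_peak])
```

With purely real scattering factors the two amplitudes agree to rounding error, whereas the imaginary term f'' makes them differ. This difference is the signal from which the anomalous scatterers are located.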
The starting phase angles are combined with the observed intensities of the spots as input to a Fourier transform, the result of which is an initial, rough electron density map of the protein that scattered the X-rays. An initial structure for the unknown molecule is then built and used to derive new and improved phase angles, which are combined with the observed intensities to calculate an improved electron density map. This iterative process of phase refinement and model building continues until the map and the model converge and no further refinement is needed. MAD phasing is now more or less the main phasing technique in modern X-ray structural studies.
1.5.1.2 Principle of anomalous scattering
The effect of anomalous scattering on a heavy-atom structure factor F_PH, consisting of two perpendicular contributions, the real ΔF_r and the imaginary ΔF_i, is depicted in Fig. 8.
When data are collected at wavelength λ1, which is away from the absorption edge, there is no anomalous scattering. Let us represent this structure factor as F_PH^{λ1}. When data are collected at wavelength λ2, near the absorption edge of the heavy atom, anomalous scattering occurs. If we call this structure factor F_PH^{λ2}, then

F_{PH}^{\lambda_2} = F_{PH}^{\lambda_1} + \Delta F_r + \Delta F_i
At λ1, Friedel's law still holds: |F_hkl| = |F_-h-k-l| and α_hkl = -α_-h-k-l. So F_PH^{λ1-} is the mirror image of F_PH^{λ1+} in the real axis. The real contributions ΔF_r^+ and ΔF_r^- to the reflections of a Friedel pair are, like the structure factors themselves, reflections of each other in the real axis. However, the imaginary contribution to F_PH^{λ1-} is the inverted reflection of that to F_PH^{λ1+}. That is, ΔF_i^- is obtained by reflecting ΔF_i^+ in the real axis and then reversing its sign, i.e. pointing it in the opposite direction. Because of this difference between the imaginary contributions to the two reflections, F_PH^{λ2-} is not the mirror image of F_PH^{λ2+}. From this disparity between Friedel pairs, the phase information can be extracted.