STRUCTURAL CHARACTERIZATION AND
BIOCHEMICAL ANALYSIS OF ID2, AN INHIBITOR OF
NATIONAL UNIVERSITY OF SINGAPORE
2012
ACKNOWLEDGEMENTS
I would like to thank my supervisor Dr Prasanna R Kolatkar for the opportunity to
work in his lab and for the valuable insight given during the course of this project.
I would like to thank Dr Paaventhan Palasingham and Dr Jeremiah Joseph for
their mentorship and help in the structural determination at various stages.
I am grateful to Dr Robert Robinson and Dr Howard Robinson for their assistance
with X-ray beamtime and data collection.
I am thankful to my parents and sister who are always there when help is needed.
I am also grateful to my husband for his support.
Finally, I would like to acknowledge all the students and lab mates who made life
in the lab a great experience.
TABLE OF CONTENTS
SUMMARY
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
CHAPTER 1: INTRODUCTION
1.1 Classes of basic helix-loop-helix (bHLH) proteins
1.2 bHLH structures
1.3 IDs are Group D HLH proteins
1.4 ID proteins in development
1.5 ID proteins and myogenesis
1.6 ID proteins and neurogenesis
1.7 IDs in cancer
1.8 Properties and roles of ID2
1.9 Aim and Scope of Project
CHAPTER 2: MATERIALS and METHODS
2.1 Cloning
2.2 Site directed mutagenesis
2.3 Protein expression optimization
2.4 Native protein expression
2.5 Seleno-Methionine (Se-Met) substituted protein expression
2.6 Cell Harvesting
2.7 Protein Purification
2.8 Electrophoretic mobility shift assay
2.9 Crystallization
2.10 X-ray data collection and processing
CHAPTER 3: RESULTS and DISCUSSION
(Expression to X-ray Data Collection)
3.1 Cloning and Small-scale Protein Expression
3.2 Protein Expression and Purification
3.3 Protein Identification
3.4 Crystallization
3.5 Data Collection
CHAPTER 4: RESULTS and DISCUSSION
(Structure Solution and Insights)
4.1 Structure solution and Refinement
4.2 Overall Structure
4.3 Dimer Interface
4.3.1 Hydrophobic Core
4.3.2 Hydrogen Bonds
4.3.3 Comparison of ID3 homology model homodimer interactions
4.3.4 Disulfide bond in ID2 homodimer formation
4.4 Loop region
4.5 N-terminal Helix-1 region
CHAPTER 5: RESULTS and DISCUSSION
(Biochemical Studies)
5.1 ID2 protein activity
5.2 ID heterodimer binding specificity and affinity
5.3 ID helix-1 residues in binding specificity
5.4 Exploring other differences in ID residues
5.5 MASH1 and the ID proteins
CHAPTER 6: CONCLUSION and FUTURE DIRECTIONS
6.1 Conclusions
6.2 Future Directions
BIBLIOGRAPHY
LIST OF PUBLICATIONS
Appendix 1: Protein Sequences (Human)
Appendix 2: Purified proteins used in EMSA studies
Appendix 3: E47 & MYOD1 cloning, expression and purification for EMSA studies
Appendix 4: Summary of expression and purification protocols for ID mutants
Appendix 5: ID1 & ID3 cloning, expression and purification
Appendix 6: ID2 as a dimer in solution Gel filtration profile
Appendix 7: ID2 coordinates
SUMMARY
The ID proteins, a class of transcription regulators, were named for their role as
inhibitors of DNA-binding and differentiation. They contained a helix-loop-helix
(HLH) domain without a basic DNA-binding domain and worked by dimerizing with
basic-HLH transcription factors to inactivate their DNA-binding abilities. Although the
HLH domain was highly conserved and shared similar topology, the IDs preferentially
antagonized group A bHLHs such as E47 (TCF3) but not the group B MYC.
In general, group A bHLHs contained proteins that bound the enhancer-box
(E-box) motif CANNTG, and the consequences of their transcriptional inactivation were
implicated in cell cycle regulation, cell lineage determination, differentiation,
myogenesis, neurogenesis and tumourigenesis.
ID2, a member of the ID family, was used to study this protein family. Cloning
strategies to overcome the instability of this protein family were explored in addition
to the expression and purification approaches required to produce enough soluble
protein for crystallization.
The crystal structure of ID2 was solved to 2.1 Å using a seleno-methionine
template model in molecular replacement. The structure revealed a loop ion that
was previously unreported in HLH structures. Residues involved in
ion interactions were investigated for their roles in the structure of ID2. Besides the
hydrophobic core, an inspection of the ID2 structure showed that specific hydrogen
bonds were required for dimerization. The ID2 crystal structure was compared with
homology models, previous studies of specific residues, and the ID3 NMR structure
to examine how these residues might play a role in the structure and
function of ID2.
Finally, mutations to key residues were made and their activities tested in
competitive EMSAs to gauge their importance in dimerization of the ID protein family.
LIST OF TABLES
Table 1: Representative structures of bHLH-containing proteins from the PDB for each group.
Table 2: ID2 constructs and their theoretical biochemical properties estimated by ProtParam (Wilkins, et al., 1999). Constructs described in detail (yellow highlight).
Table 3: Primer base for BP cloning (Invitrogen) to create the entry clone for Gateway LR reaction (Invitrogen). attB sites (italics), sequence transferred into pDonr vector during BP reaction (bold), protease sites (underlined). Final selected protease is highlighted in yellow.
Table 4: Sequences for each construct were added to the primer base in Table 3 to complete the primer sequences used for BP cloning.
Table 5: Mutagenesis primers. Mutation shown after first underscore and changed residue denoted by red bold letter. Forward and reverse primers denoted by _F and _R respectively. Changed nucleotide(s) denoted by grey highlight.
Table 6: Domain prediction results for ID2 from Ensembl release 67.
Table 7: LC/MS/MS mass spectrometry top hits for the purified proteins (Figure 8). Searches were done against all nr as well as human nr to show that the fragments captured always belonged to ID2. Note that the N-HLH-82-L contained the intact N-terminus (matched peptides in bold red) whereas the shorter form HLH24-82-L and the seleno-methionine version did not.
Table 8: Crystallographic Data Collection Statistics.
Table 9: Phasing statistics of Se-Met construct HLH24-82-L-Se-Met.
Table 10: Refinement statistics for native ID2 N-HLH82-L construct.
Table 11: Positions of 3 residues thought to be important for heterodimerization with MYOD1.
Table 12: Constructs created for use in protein expression for E47 and MYOD1 human proteins showing their theoretical biochemical properties estimated by ProtParam (Wilkins, et al., 1999).
Table 13: Changes to ID2 protocol for expression and purification of ID2 and ID3 mutants.
Table 14: ID1 & ID2 constructs and their theoretical biochemical properties estimated by ProtParam (Wilkins, et al., 1999).
Table 15: Changes to ID2 protocol for expression and purification of ID1 and ID3 HLH domains.
LIST OF FIGURES
Figure 1: Hydrophobic core packing of bHLH-containing proteins.
Figure 2: Cartoon representation of ID3 (PDB: 2LFH) NMR structure. Monomer shown as dark blue N-terminal residual tag, green unfolded N-terminus, pale red helix 1, green loop, red helix 2.
Figure 3: T-Coffee multiple alignment of full-length ID proteins showing the highly conserved HLH region and the divergent N- and C-termini, with only a few small regions of similarity such as the D-box (destruction box) element.
Figure 4: Reported binders and non-binders of ID proteins. The general structure of binders had shorter helices unlike non-binders, such as MYC, which had the additional leucine zipper. Overall, topology conformed to the same 4 helical bundle.
Figure 5: Cartoon representation of ID3 (PDB: 2LFH) aligned with E47 (Ellenberger private communication) to illustrate how the heterodimerization might take place. ID3 in red, E47 in blue.
Figure 6: Representative small-scale protein expression tests.
Figure 7: Stability of HLH24-82-L containing polypeptide stabilizer over 6 days at room temperature (25°C). SDS-PAGE 12% gel: marker (lane M), Day 0 (lane 1), Day 1 (lane 2), Day 3 (lane 3), Day 6 (lane 4).
Figure 8: ID2 proteins’ expression and purification.
Figure 9: ID2 proteins’ purity check by SDS-PAGE: marker (lane M, kDa), N-HLH82-L (gel A, lane 1), HLH24-82-L (gel B, lane 2), HLH24-82-L-Se-Met (gel C, lane 3).
Figure 10: HLH24-82-L crystals in 0.1 M MES pH 6.5, 2.5 M Lithium Acetate grown at 18°C.
Figure 11: Crystals from manual hanging-drop optimization grown at 18°C.
Figure 12: HKL view of reflections in the kl plane in reciprocal space for N-HLH82-L crystal at 2.1 Å resolution.
Figure 13: Ramachandran plot of ID2 N-HLH82-L by RAMPAGE (http://www-cryst.bioc.cam.ac.uk/rampage/) (Lovell, et al., 2003).
Figure 14: Diagrammatic representation of ID2 HLH structure.
Figure 15: Cartoon representation of the crystal structure of ID2 at 2.1 Å resolution showing the positive loop ion and missing basic region.
Figure 16: Ribbon representations of ID2 homodimer interactions. ID2: chain A in purple, chain B in brown, loop in green and potassium ion in grey.
Figure 17: Loop region mutants of ID2 and ID3. SDS-PAGE: marker (kDa, lane M), before induction (lane U), insoluble pellet fraction (lane P), soluble fraction (lane S). Red boxes denote expected expression region. Gel A and B expression vector was pDest-565 induced at 17°C. Gel C expression vector was pDest-HisMBP induced at 17°C.
Figure 18: Predicted interactions based on ID3 homology model (Wibley, et al., 1996) were not found in either the ID2 crystal structure or the ID3 NMR structure.
Figure 19: Structural alignment of the bHLH domain of ID proteins and their binding partners. Alignments were done manually using PyMOL’s align function as a guide.
Figure 20: E47 homodimer showing the network of glutamines that were predicted to form hydrogen bonds, but the distances were too far for most of them. Perhaps E47 also had a positive ion in the loop (grey sphere) coordinated by two of the glutamines that held it rigid?
Figure 21: Ribbon representation of ID2 and ID3 opposing chains to illustrate 3 residues thought to play an important role in heterodimerization with MYOD1. Residues from ID2 (Y37, D41) and ID3 (D42, H46) pointed away from the dimer interface. ID2-K47 and potentially ID3-R52 had interactions with the loop ion that was necessary for homodimer formation of ID2.
Figure 22: EMSA controls.
Figure 23: EMSA 6% native gel showing that increasing concentration of ID2 inhibited E47 binding to DNA. Lanes without ID2 (lanes 1 and 2) denoted by “-”. Number of “+” denoted relative concentration of ID2 added. All lanes contained 2 μM E47. This showed that the purified ID2 used for crystallization was active.
Figure 24: EMSA 6% native gel showing the different ID-HLH binding affinities to 0.05 µM human E47. Residues for each human ID protein given in parentheses. “+” denoted presence of E47. All lanes contained 200 nM DNA. Concentrations of each ID protein provided in the table above the gel. All ID proteins bound E47 to varying degrees.
Figure 25: EMSA 6% native gel showing the different ID-HLH binding affinities to 0.2 µM human MYOD1 (tagged with His-MBP). Residues for each human ID protein given in parentheses. “+” denoted presence of MYOD1. All lanes contained 100 nM DNA. Concentrations of each ID protein provided in the table above the gel. ID1 and ID2 showed weak interactions with MYOD1 where a large fraction seemed to form an intermediate rather than complete inhibition. ID3 did not bind MYOD1.
Figure 26: EMSA 6% native gel showing the different ID-HLH binding affinities to 0.2 µM human MYOD1 (tagged with His-MBP) heterodimerized with E47 (0.05 µM). Residues for each human ID protein given in parentheses. “+” denotes presence of MYOD1 and/or E47. All lanes contained 200 nM DNA. Concentrations of each ID protein provided in the table above the gel. MYOD1 had high propensity to bind E47. All IDs showed the same binding pattern as seen in Figures 24 and 25.
Figure 27: EMSA 6% native gel showing ID2 helix-1 mutants binding affinities to 0.2 µM human E47. “+” denotes presence of E47. All lanes contained 100 nM DNA. Concentrations of each ID protein provided in the table above the gel. All mutants bound to E47.
Figure 28: EMSA 6% native gel showing ID2 helix-1 mutants binding affinities to 0.2 µM human MYOD1 (HisMBP tagged). “+” denotes presence of MYOD1. All lanes contained 100 nM DNA. Concentrations of each ID protein provided in the table above the gel. All ID2 helix-1 mutants bound to MYOD1 weakly just like wild-type ID2.
Figure 29: EMSA 6% native gel showing ID2 helix-1 mutants binding affinities to 0.2 µM human MYOD1 (HisMBP tagged) heterodimerized with 0.2 µM E47. “+” denotes presence of MYOD1 and/or E47. All lanes contained 100 nM DNA. Concentrations of each ID protein provided in the table above the gel. IDs bound with similar affinities as with the E47 and MYOD1 homodimers.
Figure 30: EMSA 6% native gels showing ID2 loop region mutants. wt = wild-type ID2, E47 concentration = 100 nM, DNA concentration = 100 nM, MYOD1 concentration = 200 nM. Concentrations of ID2 are labeled. Top gel shows binding to E47, bottom gel to MYOD1. Apart from the double mutant, the other ID2 mutants bound to E47 and MYOD1 as well as wild-type ID2.
Figure 31: EMSA 6% native gel showing ID2 loop mutants. wt = wild-type ID2, E47 concentration = 100 nM, DNA concentration = 100 nM, MYOD1 concentration = 200 nM. Concentrations of ID2 are labeled. Top gel shows binding to E47, bottom gel to MYOD1. Both mutants showed partial binding loss compared to wild-type ID2.
Figure 32: EMSA 6% native gels showing ID3 loop region mutants. wt = wild-type ID3 (His-MBP tag), E47 concentration = 100 nM, DNA concentration = 100 nM, MYOD1 concentration = 200 nM. Concentrations of ID3 are labeled. Top gel shows binding to E47, bottom gel to MYOD1. R60Q and R60A were both tagged with His-MBP. R60Q appeared to bind better than wild-type ID3.
Figure 33: EMSA 6% native gels showing ID3 loop region mutants. wt = wild-type ID3 (His-MBP tag), E47 concentration = 100 nM, DNA concentration = 100 nM, MYOD1 concentration = 200 nM. Concentrations of ID3 are labeled. Top gel shows binding to E47, bottom gel to MYOD1. R60Q and R60A were both untagged. Tagged (Figure 32) or untagged, R60Q showed better binding than wild-type ID3.
Figure 34: EMSA 6% native gel showing ID proteins bound to MASH1 (left gel) and MASH1-E47 heterodimer (right gel). E47 concentration = 50 nM, MASH1 concentration = 0.5 µM, DNA concentration = 100 nM. Concentrations of ID2 are labeled on top of the gels. IDs did not bind MASH1, only E47.
Figure 35: EMSA 6% native gel showing ID proteins bound to MASH1-MYOD1 heterodimer. MYOD1 concentration = 0.2 µM, MASH1 concentration = 0.5 µM, DNA concentration = 100 nM. Concentrations of ID2 are labeled on top of the gel. IDs bound weakly to MYOD1 but did not bind to MASH1.
Figure 36: SDS-PAGE 4-12% gels showing proteins used in EMSA studies in Chapter 5. Marker in kDa (lane M), U = before induction. Gel A & B are the ID2 helix-1 mutants. Gel C is ID1-HLH, Gel D is His-MBP-ID3 fusion protein. Gel E is E47. Gel F has both the fusion MYOD1 as well as untagged MYOD1.
LIST OF SYMBOLS
BP – lambda recombination reaction involving attB & attP sites
cDNA – complementary DNA
Cy5 – Cyanine 5
C-terminal – carboxy-terminal
D-box – destruction box
E-box – enhancer box
emc – extramacrochaetae gene in Drosophila melanogaster
EMSA – Electrophoretic mobility shift assay
Ile (I) – Isoleucine
IPTG – Isopropyl β-D-1-thiogalactopyranoside
LB – Luria broth
Leu (L) – Leucine
LR – lambda recombination reaction involving attL & attR sites
MBP – maltose binding protein
MAD – multiple anomalous dispersion
Met (M) – Methionine
MCK – muscle creatine kinase
N-box – variation of enhancer box (E-box)
N-terminal – amino terminal
TB – Terrific broth
PAGE – polyacrylamide gel electrophoresis
PCR – polymerase chain reaction
Pro (P) – Proline
SAD – single anomalous dispersion
SDS – sodium dodecyl sulphate
CHAPTER 1: INTRODUCTION
Helix-loop-helix (HLH) proteins are characterized by two alpha helices linked
together by a loop of varying length. A group of transcription factors (TFs) containing
this domain is found in virtually all eukaryotes. In addition, these TFs include a
basic domain, usually found at the N-terminal end of the HLH, that binds DNA and
initiates transcription. They tend to exist as dimers and have key roles in the
regulation of developmental events such as cell lineage determination and
differentiation, as well as developmental processes such as neurogenesis and
myogenesis.
With over 200 helix-loop-helix proteins identified from yeast to humans,
this section aims to introduce some of the members of the basic-helix-loop-helix
(bHLH) family of transcription factors and then focus on a special group of
HLH-containing proteins that antagonize the function of these bHLH TFs.
1.1 Classes of basic helix-loop-helix (bHLH) proteins
The bHLH family of transcription factors is generally known for its members' ability to
homo- or hetero-dimerize on the canonical Enhancer box (E-box) motif (CANNTG)
(Ephrussi, et al., 1985) that is found in the muscle creatine kinase (MCK) promoter.
Due to the large number of bHLH-containing proteins identified, and the sheer
diversity of their functions, several groups have come up with classification
schemes to cluster them.
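As an illustration of the E-box consensus mentioned above, the short Python sketch below scans a DNA sequence for CANNTG sites. It is not part of the methods used in this work: the function name `find_e_boxes` and the demo sequence are hypothetical and chosen only for illustration, and N is assumed to mean any of A, C, G or T.

```python
import re
from typing import List, Tuple

# "N" is taken to mean any nucleotide, so CANNTG becomes CA[ACGT]{2}TG.
# The lookahead lets overlapping sites (e.g. CACATGTG) be reported separately.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, hexamer) for every CANNTG site in the sequence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(seq)]


if __name__ == "__main__":
    demo = "ggCAGCTGttCACGTGaa"  # made-up sequence, for illustration only
    for start, site in find_e_boxes(demo):
        print(start, site)
```

Because the reverse complement of CANNTG is again CANNTG, scanning one strand is enough to locate the sites on both strands.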
According to the classical groupings based on tissue distribution, dimerization
capabilities and DNA-binding specificities (Murre, et al., 1994), bHLHs were divided
into 7 classes. Class I contained ubiquitous proteins such as E12 and E47 (Murre,
et al., 1989). Class II comprised the tissue-specific MyoD (Davis, et al., 1987) and NeuroD
(Poulin, et al., 1997). Class III contained the Myc family of transcription factors