Extra terminal residues have a profound effect on the folding
Sushil Prasad Sati1, Saurabh Kumar Singh1, Nirbhay Kumar2 and Amit Sharma1
1 Malaria Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi, India;
2 Department of Molecular Microbiology and Immunology, Hopkins Malaria Research Institute, The Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
The presence of extra N- and C-terminal residues can play a major role in the stability, solubility and yield of recombinant proteins. Pfg27 is a 27K soluble protein that is essential for sexual development in Plasmodium falciparum. It was over-expressed using the pMAL-p2 vector as a fusion protein with the maltose binding protein. Six different constructs were made and each of the fusion proteins was expressed and purified. Our results show that the fusion proteins were labile and only partially soluble in five of the constructs, resulting in very poor yields. Intriguingly, in the sixth construct, the yield of soluble fusion protein with an extended carboxyl terminus of 17 residues was several fold higher. Various constructs with either N-terminal or smaller C-terminal extensions failed to produce any soluble fusion protein. Furthermore, all five constructs produced Pfg27 that precipitated after protease cleavage from its fusion partner. The sixth construct, which produced soluble protein in high yields, also gave highly stable and soluble Pfg27 after cleavage of the fusion. These results indicate that extra amino acid residues at the termini of over-expressed proteins can have a significant effect on the folding of proteins expressed in E. coli. Our data suggest the potential for development of a novel methodology, which will entail construction of fusion proteins with maltose binding protein as a chaperone on the N-terminus and a C-terminal solubilization tag. This system may allow large-scale production of those proteins that have a tendency to misfold during expression.
Keywords: expression; fusion protein; precipitation; protein folding; solubility
Despite the widespread use of fusion protein-based over-expression vectors for the production and purification of proteins in Escherichia coli, the molecular or structural elements which determine protein stability and solubility for recombinant proteins are not well understood. The solubility of non membrane-bound proteins is a complex biochemical phenomenon, and it is generally believed that properly folded proteins are reasonably soluble in aqueous solutions. There are many factors that affect protein solubility, and one such factor is the amino acid sequence variation at the amino (N-) and carboxyl (C-) termini. In a cellular environment, partially folded or misfolded proteins are generally prone to aggregate formation, and the cell machinery gets rid of these aggregates by the combined action of chaperones and intracellular proteases [1–3]. Several studies have shown that the nature of terminal residues in proteins (i.e. polar or nonpolar) can play a role in recognition and subsequent action by cellular proteases [4,5]. In many cases, polar residues at the carboxyl terminus are able to prevent recognition by tail-specific proteases [4–7]. Together, these studies point to the complex and multifactorial nature of protein folding, and indicate that phenomena like protein stability or lability are not yet completely understood. The molecular and structural elements which determine protein folding are significant players in the success or failure of over-expression techniques.
We were interested in producing large amounts of a 27K cytoplasmic protein (Pfg27) from Plasmodium falciparum for biochemical and biophysical studies. This protein plays a crucial role in the sexual development of P. falciparum, and parasites lacking its gene fail to develop sexually [8]. Initial efforts to express soluble Pfg27 as a fusion protein with a His-tag in the pRSET vector system (Invitrogen) were unsuccessful, as the over-expressed Pfg27 aggregated after purification, resulting in precipitation. Expression systems with the maltose binding protein (MBP) have been used routinely to enhance the solubility and yield of fusion products. Aside from being an efficient tag for affinity chromatography, MBP is able to act as a molecular chaperone to enhance the solubility of fused partners [9,10]. Therefore, various MBP-Pfg27 fusion constructs were engineered to study the behavior of the expressed proteins. These constructs had variations in the sequence and length of extra amino acid residues at the termini of Pfg27. All six MBP-Pfg27 constructs produced equivalent amounts of fusion protein as judged by induction analysis. Intriguingly, only one construct provided proteins of both high stability and solubility which could be used in biochemical and structural studies. Our results indicate that the critical element in obtaining high quality and quantity of Pfg27 was the presence of a carboxyl terminal extension of 17 residues, which seems to function as a solubilization tag. The exact mechanism by which an extra stretch of residues is able to confer enhanced solubility properties on a protein is not yet known.

Correspondence to A. Sharma, Malaria Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India. Tel/Fax: +91 11 6711731, E-mail: asharma@icgeb.res.in
Abbreviations: MBP, maltose binding protein.
(Received 11 July 2002, revised 26 August 2002, accepted 6 September 2002)
MATERIALS AND METHODS
Oligonucleotides were purchased from Genosys, USA. Various enzymes, the pMAL-p2 vector and amylose resin were bought from New England Biolabs (NEB), USA. All other chemicals were obtained from Sigma-Aldrich Co., USA.
Expression constructs
The Pfg27 constructs were PCR amplified from P. falciparum (3D7 strain) genomic DNA by using the following primers (the restriction enzymes whose sites are incorporated in each primer pair are given in parentheses):
Pfg27A: 5′-AAACTGCAGATGAGTAAGGTACAAAAG-3′ and 5′-AAAAAGCTTAATATTGTTGTGATGTGGTTCATC-3′ (PstI-HindIII);
Pfg27B: 5′-AAACTGCAGATGAGTAAGGTACAAAAG-3′ and 5′-AAACTGCAGTTAAATATTGTTGTGATGTGGTTCATC-3′ (PstI);
Pfg27C: 5′-AAAAAGCTTATGAGTAAGGTACAAAAG-3′ and 5′-AAAAAGCTTTTAAATATTGTTGTGATGTGGTTCATC-3′ (HindIII);
Pfg27D: 5′-AAAGAATTCATGAGTAAGGTACAAAAG-3′ and 5′-AAACTGCAGTTAAATATTGTTGTGATGTGGTTCATC-3′ (EcoRI-PstI);
Pfg27E: 5′-AAACTGCAGATGAGTAAGGTACAAAAG-3′ and 5′-AAAAGCTTTCACTTCGAATTCCATGGTACCAG-3′ (PstI-HindIII);
Pfg27F: 5′-AAACTGCAGATGAGTAAGGTACAAAAG-3′ and 5′-AAAAAGCTTTTACGACGTTGTGTGATGTGGTTCATC-3′ (PstI-HindIII).
PCR was performed using standard protocols and the product purified using Qiagen PCR purification kits. DNA fragments were cloned into the expression vector pMAL-p2, resulting in constructs designated Pfg27A–F. All constructs were verified by DNA sequencing in an ABI 310 automated sequencer using a forward primer that anneals to the malE gene upstream of the polylinker region and a reverse primer that anneals to the lacZ alpha sequence downstream of the first in-frame stop codon in the pMAL vector.
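As a quick consistency check on the primer design, the reverse primers that carry a stop codon can be reverse-complemented and translated to read off the C-terminal residues they encode. The sketch below is illustrative only and is not part of the original methods: it uses the standard genetic code, assumes that the coding frame starts at the first base of the reverse complement (which holds for the three primers shown), and all function and variable names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Reverse primers copied from the primer list (5' to 3').
REVERSE_PRIMERS = {
    "Pfg27B": "AAACTGCAGTTAAATATTGTTGTGATGTGGTTCATC",
    "Pfg27E": "AAAAGCTTTCACTTCGAATTCCATGGTACCAG",
    "Pfg27F": "AAAAAGCTTTTACGACGTTGTGTGATGTGGTTCATC",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def encoded_c_terminus(reverse_primer):
    """Translate the coding strand of a reverse primer up to the first stop.

    Assumption: the reading frame begins at the first base of the
    reverse complement.
    """
    coding = reverse_complement(reverse_primer)
    residues = []
    for i in range(0, len(coding) - 2, 3):
        amino_acid = CODON_TABLE[coding[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the open reading frame
            break
        residues.append(amino_acid)
    return "".join(residues)


if __name__ == "__main__":
    for name, primer in REVERSE_PRIMERS.items():
        print(name, encoded_c_terminus(primer))
```

Under these assumptions the Pfg27B primer closes the reading frame at the native …HHNNI end, the Pfg27F primer at …HHTTS, and the Pfg27E primer encodes the seven residues LVPWNSK followed by a stop codon.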
Expression and analysis of recombinant proteins
DNA constructs were transformed into the BL21 (B834 DE3) strain of E. coli by the heat shock method and transformants were grown in LB broth in the presence of carbenicillin (50 mg·mL⁻¹) and 0.2% glucose. For protein production, 100 mL of bacterial cultures were induced with 0.3 mM IPTG at a D of 0.6 and grown for another two hours. The cultures were spun down at 6000 g, the pellets suspended in lysis buffer (20 mM Tris pH 7.5, 200 mM NaCl, 1 mM EDTA and 1 mM phenylmethanesulfonyl fluoride) and subjected to sonication. The cell suspensions were then centrifuged at 26 000 g for 30 min, and the lysates loaded on pre-equilibrated amylose columns. The columns were washed with 12 volumes of lysis buffer and the MBP-Pfg27 proteins eluted with 10 mM maltose. Protein fractions were analyzed by SDS/PAGE and the fusion proteins cleaved with factor Xa in lysis buffer containing 10 mM CaCl2. In the case of constructs Pfg27B–F, the protease cleavage mixtures were centrifuged at 15 000 g for 10 min in a microcentrifuge. The resulting supernatants and pellets were loaded on an SDS/PAGE gel to check the solubility of MBP and Pfg27 after cleavage. All protein concentrations were measured by UV absorption at 280 nm and verified by SDS/PAGE. Standard protocols were followed for western analysis, where Pfg27 was probed with polyclonal anti-Pfg27 Ig produced in mice.
RESULTS
Design of the pRSET construct
We first attempted to express recombinant Pfg27-(His)6 fusion protein by taking advantage of a prokaryotic expression vector, pRSET-C (Invitrogen). The coding sequence of Pfg27 was PCR amplified using gene specific primers. The antisense primer lacked the final stop codon and the PCR product was cloned into the PvuII site of pRSET-C. Subsequent sequence analysis as well as expression upon induction confirmed production of the fusion protein. However, this construct produced insoluble protein which aggregated and precipitated upon purification, making it unacceptable for either functional or structural studies.
Design of various MBP + Pfg27 fusion constructs
The different constructs generated to examine the role of extra amino acid residues at the N- and C-termini were as follows (Fig. 1):
Pfg27A: This construct encodes a fusion protein in which Pfg27 has 12 extra residues at its N-terminus (from the vector backbone) and 17 extra amino acids at its C-terminus. The natural end of Pfg27 is …HHNNI, but in this construct Pfg27 ends with …HHNNI + LVPWNSKLGTGRRFTTS (the terminal polar residues are TTS). It also has an unexpected D to N mutation at the seventh residue position of Pfg27. This construct produces highly soluble and stable protein due to its 17 residue extension.
Fig. 1. Diagrammatic representation of the seven Pfg27 over-expression constructs. The arrow shows the factor Xa cleavage site, and the wedge represents the D to N mutation in some constructs. Extra N- and C-terminal sequences are shown along with the original pRSET vector construct (Pfg27HIS) for reference. The underlined residues indicate the polar end residues.
Pfg27B: This construct has the same 12 extra residues at the N-terminus as Pfg27A and retains the D to N mutation, but has no C-terminal extension. Any difference observed between the behavior of fusion proteins Pfg27A and Pfg27B can therefore be directly attributed to the 17 residue extension in Pfg27A. This construct produced insoluble protein, as it lacks the 17 residue C-terminal extension. In addition, it clearly shows that the accidental D to N mutation does not affect protein solubility.
Pfg27C: This construct has 15 extra amino acid residues at the N-terminus, does not have the D to N mutation and also lacks any C-terminal extension. This construct indicates that the N-terminal variation has no effect on protein solubility.
Pfg27D: This construct does not have extra residues at either terminus and contains no mutation. It is a control construct designed to express wild-type Pfg27 in its native state, without extensions or the D to N mutation, and therefore represents a typically preferred construct.
Pfg27E: This construct has extensions at the N- and C-termini and also the D to N mutation. It is closest to Pfg27A in design but has a truncated C-terminal extension of only seven amino acids, which ends in polar residues. This construct indicates that the length of the C-terminal extension plays an important role.
Pfg27F: This construct has the same N-terminal extension as Pfg27A, Pfg27B and Pfg27E but lacks a C-terminal extension. It was designed to address whether the presence of polar residues at the C-terminus is sufficient to confer increased solubility on the fusion protein. The end sequence of native Pfg27 (…NNI) was changed to (…TTS). Therefore, the terminal three residues are identical to the ones in Pfg27A.
Stability and solubility of various MBP-Pfg27 proteins
The Pfg27A–F construct DNAs were freshly transformed into E. coli cells, and the transformants were grown, induced and processed under identical conditions (see Materials and methods). The cell pellets were sonicated and the resulting lysates were loaded onto pre-equilibrated amylose columns. Sufficient and equivalent amounts of amylose resin were used for each construct so that the yield differences were normalized. All experiments were conducted in 100 mL cultures and the data presented in Table 1 have been scaled to represent the yields for one liter of cell culture. These constructs were used to express and purify MBP-Pfg27A–F fusion proteins by maltodextrin affinity chromatography according to the manufacturer's instructions. All constructs showed equivalent levels of protein induction but the resulting fusion proteins had varying solubilities (Fig. 2, Table 1). The six constructs each produced 200 mg·L⁻¹ of fusion protein. However, the behavior of the proteins varied once the cells had been lysed for downstream processing and protein purification. Pfg27A produced the highest protein levels in a stable and soluble form (Table 1). Approximately 33% of the total induced protein could be purified off the amylose column for Pfg27A. In contrast, only 7.5% was eluted off the amylose resin for Pfg27B–F. Further, no protein was detected in the Pfg27A column flow through, but 5% of the fusion proteins from Pfg27B–F did not bind to the amylose resin and were found in the flow through fractions (Table 1). The latter observation suggests improper folding of Pfg27 in the Pfg27B–F fusions, because of which these interacted poorly with the amylose resin, a behavior noted earlier [9].
In the MBP fusion system, a factor Xa protease site has been engineered in the multiple cloning site such that MBP can be released from the over-expressed fusion protein. In an attempt to obtain native Pfg27, the six fusion proteins were incubated with appropriate amounts of the factor Xa protease for cleavage. All fusion proteins were cleaved successfully under identical reaction conditions. The stability and solubility of MBP after factor Xa cleavage were found to be high and identical in all cases. However, the resulting Pfg27 proteins from constructs Pfg27B–F were labile, and precipitated immediately after cleavage (Fig. 3). The identity of the precipitated protein (Pfg27) was confirmed by western blot analysis using anti-Pfg27 polyclonal antibodies. In sharp contrast, Pfg27A produced Pfg27 that remained soluble, could be purified to homogeneity, and was subsequently crystallized. The final yield of Pfg27 from Pfg27A was 24 mg·L⁻¹ of starting culture, while it was 0 for Pfg27B–F.

Table 1. Expression and purification analysis of six MBP + Pfg27 constructs. Column headings: Construct; Amount of fusion protein (mg): Flow through, Eluted fusion protein, Final yield of Pfg27.

Fig. 2. (A–F) Protein expression and purification profile of Pfg27A–F constructs by SDS/PAGE analysis. Lane 1, protein standards; lane 2, uninduced cell pellet; lane 3, induced cell pellet; lane 4, induced cell supernatant; lane 5, induced cell pellet; lane 6, amylose column flow-through; lane 7, eluted MBP-Pfg27 70K fusion protein, marked by an arrow.
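The yield figures quoted in this section can be cross-checked with simple arithmetic. The sketch below is not part of the original analysis: it takes the 200 mg per liter induction level and the 33%, 7.5% and 5% fractions from the text, and assumes that Pfg27 (27K) accounts for 27/70 of the mass of the roughly 70K fusion protein mentioned in the Fig. 2 legend. All names are illustrative.

```python
# Values taken from the text; the mass ratio is an assumption (27K of ~70K).
INDUCED_FUSION_MG_PER_L = 200.0
ELUTED_FRACTION = {"Pfg27A": 0.33, "Pfg27B-F": 0.075}
FLOW_THROUGH_FRACTION = {"Pfg27A": 0.0, "Pfg27B-F": 0.05}
PFG27_MASS_K = 27.0
FUSION_MASS_K = 70.0


def eluted_fusion_mg_per_l(construct):
    """Fusion protein recovered from the amylose column, per liter of culture."""
    return INDUCED_FUSION_MG_PER_L * ELUTED_FRACTION[construct]


def flow_through_mg_per_l(construct):
    """Fusion protein that failed to bind the amylose resin, per liter."""
    return INDUCED_FUSION_MG_PER_L * FLOW_THROUGH_FRACTION[construct]


def max_pfg27_mg_per_l(construct):
    """Upper bound on cleaved Pfg27, assuming complete cleavage and no losses."""
    return eluted_fusion_mg_per_l(construct) * PFG27_MASS_K / FUSION_MASS_K


if __name__ == "__main__":
    for name in ELUTED_FRACTION:
        print(
            name,
            round(eluted_fusion_mg_per_l(name), 1),
            round(flow_through_mg_per_l(name), 1),
            round(max_pfg27_mg_per_l(name), 1),
        )
```

With these inputs, Pfg27A gives about 66 mg of eluted fusion protein per liter and an upper bound of roughly 25 mg of Pfg27, which is in line with the reported final yield of 24 mg per liter.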
Effect of temperature on the expression profiles
We examined the role of temperature in enhancing the solubility properties of the various fusion proteins. For some constructs, cell cultures were grown at 37 °C but the temperature was lowered to 30 °C or 25 °C after induction. However, no discernible difference in the relative protein solubility profiles was observed. Indeed, the drastic difference in fusion protein quality between Pfg27A and the rest of the constructs was retained. Therefore, the differing protein solubilities were inherent in the constructs and could not be modulated by varying the ambient growth conditions.
DISCUSSION
The routine production of recombinant proteins in soluble and biologically active form remains a challenge. In this context, MBP and other fusion protein expression systems have been widely used, both because the fusions serve as efficient purification tags and because they are able to promote the solubility of the fused partner [9,10]. In the present study, we engineered several constructs with the overall aim of obtaining high quality Pfg27 protein for biochemical and biophysical applications. In the first instance, we used one of the most common vehicles for over-expression, namely the pET vector as part of the pRSET system. Although reasonable amounts of Pfg27 were produced upon induction, this construct failed to produce protein in a soluble form. Subsequently, the MBP system was used and proved useful in producing at least partially soluble proteins from constructs Pfg27A–F. We found that fusion to MBP did indeed promote the proper folding of Pfg27 into a native and stable form, but with a caveat. Our results suggest that the dramatic positive effect on stability and solubility of Pfg27 produced using the construct Pfg27A can be attributed to the presence of 17 extra residues at the C-terminal end. It is probable that MBP and the C-terminal extension both contribute to the high yield and solubility of Pfg27A in a complex process of assisted protein folding.
To dissect further the structural elements responsible for the enhanced solubility phenomenon observed in Pfg27A, we identified three key issues: (a) the role of the N-terminal extension; (b) the role of the D to N mutation; and (c) the role of the sequence and length of the C-terminal extension. To address whether the presence of extra residues at the N-terminus of Pfg27A was responsible for its high yields, we engineered constructs Pfg27B, Pfg27C, Pfg27E and Pfg27F, which share similar N-terminal extensions. Clearly, the presence of these extensions was not enough to yield soluble fusion protein. Next, we addressed whether the differing solubilities were due to the accidental D to N mutation in Pfg27A. It is well documented that the solubility of proteins can be affected severely by single amino acid mutations. However, in the present study we can exclude this possibility, as the D to N mutation is conserved in four of the six constructs (Fig. 1), which nonetheless retain the contrasting solubility profiles for their respective fusion proteins. Finally, to address whether the presence of terminal polar residues and the exact length of the C-terminal extension were responsible for the observed solubilization effect in Pfg27A, we engineered two constructs, Pfg27E and Pfg27F. Once again, these constructs yielded only partially soluble fusion protein.
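The elimination argument in this paragraph can be restated as a small comparison of construct features. The sketch below is only an illustration of that reasoning, not a statistical analysis. The feature values are transcribed from the construct descriptions, with two labeled inferences: the D to N mutation in Pfg27F is inferred from the statement that four of the six constructs carry it, and the "NSK" ending of Pfg27E is inferred from the first seven residues of the Pfg27A extension. All names are hypothetical.

```python
# Feature table for the six MBP-Pfg27 constructs (values from the text;
# the entries marked "inferred" are assumptions, not stated explicitly).
CONSTRUCTS = {
    "Pfg27A": {"n_ext_len": 12, "d_to_n": True, "c_ext_len": 17, "c_end": "TTS"},
    "Pfg27B": {"n_ext_len": 12, "d_to_n": True, "c_ext_len": 0, "c_end": "NNI"},
    "Pfg27C": {"n_ext_len": 15, "d_to_n": False, "c_ext_len": 0, "c_end": "NNI"},
    "Pfg27D": {"n_ext_len": 0, "d_to_n": False, "c_ext_len": 0, "c_end": "NNI"},
    # c_end inferred from LVPWNSK, the first seven residues of the A extension
    "Pfg27E": {"n_ext_len": 12, "d_to_n": True, "c_ext_len": 7, "c_end": "NSK"},
    # d_to_n inferred from "four of the six constructs"
    "Pfg27F": {"n_ext_len": 12, "d_to_n": True, "c_ext_len": 0, "c_end": "TTS"},
}


def features_unique_to(soluble, constructs):
    """List features whose value in `soluble` is shared by no other construct."""
    reference = constructs[soluble]
    unique = []
    for feature, value in reference.items():
        shared = any(
            other[feature] == value
            for name, other in constructs.items()
            if name != soluble
        )
        if not shared:
            unique.append(feature)
    return unique


if __name__ == "__main__":
    print(features_unique_to("Pfg27A", CONSTRUCTS))
```

Under these assumptions the only feature that Pfg27A does not share with any poorly soluble construct is the length of its C-terminal extension, which matches the conclusion drawn in the text.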
The postcleavage precipitation of Pfg27 from constructs Pfg27B–F indicated misfolding of Pfg27. It is probable that these fusions were maintained in solution due to the well known solubilization effects of MBP. Although MBP remained soluble after cleavage, Pfg27 precipitated. These experiments also highlight the central issue that solubility of fusion proteins is not necessarily indicative of either proper folding or increased solubility of the protein of interest. We propose that the success of fusion protein systems should be ascertained only once the protein of interest has been cleaved off and shown to retain both its folded state and its biological activity.
It is possible that the exact sequence and length of the 17 residue extension together contribute to the solubilizing and stabilizing effects observed in Pfg27A. This phenomenon, where extra C-terminal residues affect the stability of an over-expressed protein, has some precedent (see Table 2). A more extensive study can now be undertaken to verify whether segments like the 17 residue extension at the C-terminus of Pfg27A can be used in a more generic fashion.
In the postgenomic era of functional genomics, structural genomics and an expanding scope for biotechnology products, the production of large amounts of soluble native protein has necessarily taken center stage. Our findings highlight the contribution of extra C-terminal residues in producing recombinant Pfg27 in a native, folded state which is now suitable for biochemical, biophysical and immunological characterization. Structural elements like the 17 residue C-terminal extension may have widespread application in recombinant protein production. More significantly, this study highlights the complex and multifactorial nature of protein folding.

Fig. 3. SDS/PAGE analysis of Pfg27B (A) and Pfg27A (B) cleavage with factor Xa. (A) Lane 1, protein standards; lane 2, cleavage mixture; lane 3, cleavage mixture supernatant; and lane 4, cleavage mixture pellet. Proteins Pfg27C–F gave identical precipitation profiles. (B) Lane 1, Pfg27A fusion protein; and lane 2, cleavage mixture of MBP-Pfg27A fusion protein.

Table 2. Effect of C-terminal extensions on various proteins expressed in E. coli.
Protein                    Variation        Effect               Reference
Arc repressor              C-terminal tail  Increased stability  [4]
Lambda repressor protein   C-terminal tail  Increased stability  [5]
AraC                       C-terminal tail  Increased stability  [11]
Aldehyde dehydrogenase     C-terminal tail  Increased stability  [12]
ACKNOWLEDGEMENTS
We thank the past and present members of the Malaria Group, ICGEB, New Delhi for help and discussions. N.K. is supported by the National Institutes of Health grant AI46760. A.S. is supported by an International Wellcome Trust Senior Research Fellowship.
REFERENCES
1. Maurizi, M.R., Trisler, P. & Gottesman, S. (1985) Insertional mutagenesis of the lon gene in Escherichia coli: lon is dispensable. J. Bacteriol. 164, 1124–1135.
2. Keller, J.A. & Simon, L.D. (1988) Divergent effects of a dnaK mutation on abnormal protein degradation in Escherichia coli. Mol. Microbiol. 2, 31–41.
3. Straus, D.B., Walter, W.A. & Gross, C.A. (1988) Escherichia coli heat shock gene mutants are defective in proteolysis. Genes Dev. 2, 1851–1858.
4. Bowie, J.U. & Sauer, R.T. (1989) Identification of C-terminal extensions that protect proteins from intracellular proteolysis. J. Biol. Chem. 264, 7596–7602.
5. Silber, K.R., Keiler, K.C. & Sauer, R.T. (1992) Tsp: a tail-specific protease that selectively degrades proteins with nonpolar C termini. Proc. Natl Acad. Sci. USA 89, 295–299.
6. Keiler, K.C. & Sauer, R.T. (1996) Sequence determinants of C-terminal substrate recognition by the Tsp protease. J. Biol. Chem. 271, 2589–2593.
7. Gottesman, S., Roche, E., Zhou, Y. & Sauer, R.T. (1998) The ClpXP and ClpAP proteases degrade proteins with carboxy-terminal peptide tails added by the SsrA-tagging system. Genes Dev. 12, 1338–1347.
8. Lobo, C.A., Fujioka, H., Aikawa, M. & Kumar, N. (1999) Disruption of the Pfg27 locus by homologous recombination leads to loss of the sexual phenotype in P. falciparum. Mol. Cell 3, 793–798.
9. Kapust, R.B. & Waugh, D.S. (1999) Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci. 8, 1668–1674.
10. Riggs, P. (2000) Expression and purification of recombinant proteins by fusion to maltose-binding protein. Mol. Biotechnol. 15, 51–63.
11. Ghosh, M. & Schleif, R.F. (2001) Stabilizing C-terminal tails on AraC. Proteins 42, 177–181.
12. Rodriguez-Zavala, J. & Weiner, H. (2001) Role of the C-terminal tail on the quaternary structure of aldehyde dehydrogenases. Chem. Biol. Interact. 130–132, 151–160.