natively unfolded protein, induced by addition of a seven-residue peptide fragment
Mitsugu Araki and Atsuo Tamura
Graduate School of Science, Kobe University, Nada, Japan
In order to elucidate the architectural principle of protein structure, the relationship between protein sequence and tertiary structure has been studied from various perspectives. Factors determining the mechanism of structural stability have long been explored, mostly by studying the stability and folding kinetics of natural and mutated proteins [1–3]. Recently, computational protein designs have become advanced and provide new insights into the factors determining protein structure, stability and folding [4], i.e. the redesign of naturally occurring proteins [5–8] and the de novo design of novel structures [9,10]. These studies suggest that each well-packed structure is stabilized by a number of intra- or intermolecular interactions, invoked by the appropriate alignment of amino acid residues, and the number of proteins with well-packed structures seems extremely small compared with all possible sequences (primary sequence space). In such cases, how have the existing proteins attained their well-packed structures? If the majority of the possible
Keywords
folding; intrinsically unstructured protein;
protein stability; self-association; solubility
Correspondence
A Tamura, Graduate School of Science,
Kobe University, Nada, Kobe 657-8501,
Japan
Fax ⁄ Tel: +81 78 803 5692
E-mail: tamuatsu@kobe-u.ac.jp
Database
The atomic coordinates have been
deposited in the Protein Data Bank (PDB
ID code 2KFQ for FP1)
(Received 26 September 2008, revised 10
February 2009, accepted 12 February 2009)
doi:10.1111/j.1742-4658.2009.06961.x
To elucidate the architectural principle of protein structure, we focused on sequestration from solvent, which is a common characteristic of folding and self-associative precipitation. Because protein solubility can be regarded as a basis for the potential ability to sequester from solvent, we assume that poorly soluble proteins tend not only to precipitate, but also to form solution structures. To examine this, the solubility of a 25-residue, natively unfolded protein, modified from a zinc-finger domain of transcription factor Sp1, was decreased by adding a seven-residue hydrophobic peptide fragment to the C-terminus. NMR and ultracentrifuge measurements of the resulting sequence showed that a dissolved species forms an α-helical structure in a 15–20 molecule oligomer. To elucidate the mechanism by which the structure forms, we prepared two variants in which the added fragments are less hydrophobic; the structural stabilities were then measured at various pH values. A fairly good correlation was observed between stability and hydration potential, whereas a much stronger correlation was observed between stability and solubility, indicating that the stability is more strongly dependent on the ability to precipitate than on dehydration. These results show that, among poorly soluble protein molecules, dissolved species can be transformed from the solvent-exposed unfolded state into a loosely packed structure via intermolecular interactions. Because decreasing the protein solubility does not require the primary sequence to have a sophisticated design, such a protein structure might form readily and frequently, compared with the well-packed structure found in native proteins.
Abbreviations
FP, final protein; IP, initial protein; ΔGdissol, dissolution free energy; ΔGf, folding free energy; ΔGhyd, hydration potential.
sequences result in the unfolded state, the probability of natural proteins exploring and evolving well-packed structures would be extremely limited. It is thus assumed that moderately structured states needed for decent biological function might arise frequently. This assumption is supported by protein-folding studies, which shed light on the early evolution of natural proteins. First, in a 74-residue library based on a binary patterning of polar or nonpolar residues, most proteins formed fluctuating structures reminiscent of molten globule intermediates [11]. Second, using the lattice model, folding simulations implied that a randomly chosen sequence of amino acids frequently encodes a globular conformation [12]. This simulation is based on the concept that hydrophobicity is a driving force in protein folding, in which a protein excludes part of the molecule from the solvent water in a geometry-specific manner [13].
In this study, we attempted to identify a crucial factor in the formation of a moderately structured state by focusing on the fact that sequestering from a solvent is a common characteristic of folding and self-associative precipitation or aggregation, the latter frequently occurring during the handling of natural and artificial proteins. Because the tendency of a protein molecule to precipitate is represented by its solubility, this physicochemical property can be regarded as a criterion for the potential ability to squeeze out solvent. In the case of a poorly soluble protein, which disfavors exposure to solvent, it is likely that the dissolved species tend to form solution structures sequestered from the solvent. In order to confirm this, we attempted to transform a 25-residue, soluble unfolded protein (the initial protein; IP) into a structured protein (the final protein; FP) by altering the solubility. First, the amino acid sequence of the IP was determined on the basis of a zinc-finger domain of transcription factor Sp1, which folds into a well-defined structure consisting of a β hairpin and an α helix upon binding to Zn2+; it is unfolded in the absence of the metal [14]. Next, the solubility of the IP was decreased by the addition of seven hydrophobic amino acids to the C-terminus; we anticipated that the added fragment would induce long-range interactions between amino acids separated in the primary sequence [15]. As a result, NMR, CD and ultracentrifuge measurements showed that a dissolved species of the resulting sequence, FP1, takes the form of an α-helical structure in a 15–20 molecule oligomer, without addition of Zn2+. We thus scrutinized the dependence of the structural stability on protein solubility or hydration potential, using two variants of FP1 and varying the pH. A strong correlation between the stability and solubility elucidated the mechanism of formation of the loosely packed structure, induced by the association with other copies of the same chain.
Results
Sequence and structural property of IP
Among several candidates for IP, which needs to be unstructured in the native condition, we chose the third zinc-finger domain Sp1f3 of transcription factor Sp1, with two histidines (His21, His25) and two cysteines (Cys5, Cys8), to bind coordinately to Zn2+ [14]. Sp1f3 is known to fold into a well-defined structure upon binding to Zn2+, and is unfolded in the absence of the metal. To suppress the excessively high solubility of Sp1f3, which is caused by the high frequency of ionizable amino acid residues (Table 1), residues 26–29 (Gln-Asn-Lys-Lys) in the C-terminal region were removed and Lys1, Lys2, Glu7 and His17 were replaced by alanine or tyrosine. In addition, His25, which binds to Zn2+ in Sp1f3, was replaced by alanine to suppress any possible interactions with trace amounts of metal ions in solution, resulting in the sequence of IP given in Table 1. In NOESY and TOCSY spectra of 3 mM IP at pH 3.0, most of the NOE peaks overlapped with TOCSY cross-peaks, indicating that these NOE peaks are intraresidual. The remaining NOE peaks, identified as non-intraresidual, came from the sequential signals, CαH–NH, CβH–NH, NH–NH and those related to CδH of prolines. In addition, far-UV CD spectra of IP, which are typical of unfolded proteins, did not change in the concentration range 0.4–2.9 mM (Fig. 1). All of these NMR and CD analyses show that IP remains unfolded up to a concentration of 3 mM.
Sequence and structural properties of FP
We attempted to add a peptide fragment to the C-terminus of the IP, anticipating that a solution structure might be formed throughout the length of the molecule. The number of additional residues was limited to six, because contiguous hydrophobic residues in a protein resulted in a reduced yield in peptide synthesis.
Table 1 Sequences of Sp1f3, IP and FPs.
Sp1f3 KKFACPECPK 10 RFMRSDHLSK 20 HIKTHQNKK
FP1 YAFACPACPK RFMRSDALSK HIKTAFIVVA 30 LG
Among the various hydrophobic scales reported, we used the hydration potential from the gaseous to the aqueous phase (ΔGhyd) of amino acids [16,17], because it is quantitatively related to each side-chain and main-chain component. We chose Gly, Pro, Leu, Ile, Val, Ala, Phe, Cys and Met, which have notably larger hydration potentials (greater than −2 kcal·mol⁻¹) than those of any other amino acids (less than −5 kcal·mol⁻¹), as candidates for amino acid residues in the fragment. Next, the sequence (X1, X2, ..., X6) of the extra region was chosen to be complementary [18] to P6, C5, A4, F3, A2 and Y1 in the N-terminal region, assuming that the interactions between the N- and C-terminal regions are important for folding of the whole molecule. The resulting sequence of the extra region in the final protein, FP1, became FIVVAL (Table 1). In a NOESY spectrum of 3 mM FP1 at pH 3.0, a number of NOE peaks, including long-range NOEs, i.e. Y1 CδH–I27 CγH and A2 CβH–V29 CβH, and medium-range NOEs, i.e. cross-peaks of dαN(i,i+2) and dαβ(i,i+3), were observed in addition to intraresidue and sequential NOEs. However, in the case of 0.9 mM FP1, mostly intraresidue and sequential NOE peaks, like those detected for IP, were observed. In a NOESY spectrum of 1.5 mM FP1, the long- and medium-range NOEs observed at 3.0 mM FP1 were partially observed. In addition to the NMR analyses, far-UV CD spectra of FP1 showed that the shape is dependent on the protein concentration (Fig. 1A). The [θ] value at 222 nm for 0.4 mM FP1 was approximately −2000, which is close to that for 0.4 mM IP, whereas it became more negative, to approximately −4000, with an increase in the protein concentration to 3 mM (Fig. 1B). All of these NMR and CD analyses show that the amount of secondary and tertiary structure in FP1 increases with the protein concentration.
Self-association of FP1
We examined the degree of protein association at various concentrations by measuring the sedimentation equilibrium (Fig. 2A) and sedimentation velocity (Fig. 2B). At 0.9 mM FP1, sedimentation equilibrium measurements showed that most plots of the apparent molecular mass are distributed over the range 0–10 000 Da, which corresponds to a molecular mass for a monomeric or dimeric form of FP1 of 3500 or 7000 Da, respectively, although a few plots are > 10 000. At 1.5 mM FP1, although a large number of plots are distributed over the range 0–10 000 Da, the number of plots above 10 000 becomes noticeably higher. At 3.0 mM, most plots are in the range 50 000–70 000, which corresponds to a molecular mass for an oligomer of 15–20 molecules. Sedimentation velocity measurements were also performed by changing the FP1 concentration. At 3.0 mM, the distribution of the sedimentation coefficients showed two main peaks, one at 4.4 and the other at 4.6 (Fig. 2B). These peaks could be assigned to oligomers of 15–20 molecules, because it was shown that most FP1 molecules form this type of oligomer at 3 mM, according to the equilibrium measurements. At 2.3 mM, the distribution showed a distinct peak at 1, which is the smallest observed sedimentation coefficient, in addition to the main peak at 4.5. The sedimentation coefficient of monomeric FP1 can be estimated by using the equation correlating the sedimentation coefficient (S) and the
Fig. 1. Protein concentration dependence of far-UV CD spectra of IP and FP1. (A) Spectra of FP1. (B) [θ] values at 222 nm for IP and FP1. [θ] is molar ellipticity per residue.
molecular mass (M): $S = M(1 - \rho v_s)D/RT$, where ρ is the density of the solution, vs is the partial specific volume of the solute, D is the diffusion constant of the solute, R is the gas constant and T is the absolute temperature. S was calculated to be 0.4 by using M = 3500 g·mol⁻¹, ρ = 1 g·cm⁻³, vs = 0.7 cm³·g⁻¹, which is the general value for native proteins [19], R = 8.3 J·K⁻¹·mol⁻¹, T = 293 K and D = 9.3 × 10⁻¹¹ m²·s⁻¹, obtained using pulsed-field gradient NMR spectroscopy, as described previously [20,21]. Therefore, the peak at 1 can be assigned to the monomer or a small oligomer. These sedimentation equilibrium and velocity measurements indicate that the amount of monomeric FP1 decreases with an increase in the concentration, whereas the number of 15–20 molecule oligomers increases.
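The estimate of the monomer sedimentation coefficient can be reproduced with a short calculation. The following is a minimal sketch, assuming SI units throughout and the parameter values quoted in the text; the function names `sedimentation_coefficient` and `oligomer_size` are hypothetical helpers introduced only for illustration.

```python
# Sketch of the relation S = M(1 - rho*v_s)D / (RT), evaluated in SI units.
# All numerical inputs are the values quoted in the text; the helper names
# are illustrative and not part of the original work.

R_GAS = 8.3  # gas constant in J K^-1 mol^-1, as quoted in the text


def sedimentation_coefficient(mass_g_per_mol, density_g_per_cm3,
                              v_s_cm3_per_g, diffusion_m2_per_s,
                              temperature_k):
    """Return the sedimentation coefficient in Svedberg units (1 S = 1e-13 s)."""
    mass_kg_per_mol = mass_g_per_mol / 1000.0
    # The buoyancy factor is dimensionless because g/cm^3 * cm^3/g cancels.
    buoyancy = 1.0 - density_g_per_cm3 * v_s_cm3_per_g
    s_seconds = (mass_kg_per_mol * buoyancy * diffusion_m2_per_s
                 / (R_GAS * temperature_k))
    return s_seconds / 1e-13


def oligomer_size(apparent_mass_da, monomer_mass_da=3500.0):
    """Number of monomers implied by an apparent molecular mass."""
    return apparent_mass_da / monomer_mass_da


s_monomer = sedimentation_coefficient(3500.0, 1.0, 0.7, 9.3e-11, 293.0)
print(f"monomer S = {s_monomer:.2f}")
print(f"oligomer size = {oligomer_size(50000):.1f} to {oligomer_size(70000):.1f}")
```

The same mass ratio applied to the 50 000–70 000 range from the equilibrium measurements gives roughly 14 to 20 monomers per oligomer, in line with the 15–20 molecule assignment.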
pH dependence of solubility and stability
The peptide fragment added to the C-terminus of IP in FP1 consists of hydrophobic amino acid residues, which have notably large hydration potentials, ΔGhyd. In order to identify a determining factor in the structural formation, we prepared two variants whose hydrophobicity is lower than that of FP1: FP2 (Phe26Tyr) and FP3 (Phe26Tyr and Val28Ser) (Table 1). The hydration potential of FP2 or FP3 can be calculated to be 5.4 or 12.5 kcal·mol⁻¹ lower, respectively, than that of FP1, based on ΔGhyd derived from model compounds [16]. The solubility of these mutants was measured at various pH values (Fig. 3A). The individual solubility of FPs increases gradually with a decrease in pH in the range 6.5–7.3, presumably because of an increment in the net charges caused by protonation of the imidazole group in His21 and anionic sulfhydryls in Cys5 and Cys8 (Fig. 3B). At all these pH values, the solubility of FP2 is as high as that of FP1, and that of FP3 is clearly the highest. Plots of the solubility at pH < 6.4 were omitted because they were severely scattered owing to the steep slope. Experimentally obtained plots for each FP were fitted using Eqn (5), where rp values for the FPs were set to 384 Å, because the amino acid compositions of the FPs were almost identical. This rp value was obtained by fitting FP3, for which the error in the determination of rp is smallest among FPs. Values of $(\mu_{p(\mathrm{s})} - \mu^{0\prime}_{p(\mathrm{sol})})/RT$ obtained by fitting FP1, FP2 and FP3 were −22.8 ± 0.1, −22.5 ± 0.1 and −20.5 ± 0.0, respectively.
The pH dependence of the structural stabilities represented by NOE intensities for 3 mM FP1 is given in Fig. 4A,B. With an increase in pH, integrated intensities of long-range NOE peaks (Fig. 4A), as well as short- and medium-range NOE peaks (Fig. 4B), increased. The increment was also confirmed for FP2 and FP3 at 3 mM (data not shown).
Discussion
Solution structures of FPs
Complete assignment of the proton chemical-shift resonances was achieved for FPs, excluding the amide
Fig. 2. Sedimentation equilibrium and sedimentation velocity measurements of FP1. (A) The distribution of apparent molecular mass (Mw,app) against the location in the cell, obtained from sedimentation equilibrium measurements at protein concentrations of 0.9 (black), 1.5 (blue) and 3.0 mM (red). Apparent molecular mass was calculated at respective points in the cell, i.e. the higher the A250, the closer to the bottom of the cell. (B) Distribution of sedimentation coefficients obtained from sedimentation velocity measurements at protein concentrations of 2.3 (blue) and 3.0 mM (red).
protons of the N-termini, which were not detected because of rapid exchange with solvent. Some of the NOE peaks in the NOESY spectra of 3 mM FPs, however, overlapped and could not be identified separately. Torsion angle restraints, obtained from DQF-COSY, and distance restraints, obtained from clearly separated NOE peaks at a mixing time of 200 ms, are given in Table 2 for 3 mM FP1 at pH 3.0. Because the sedimentation equilibrium and velocity measurements of FP1 showed that most of the dissolved species form oligomers consisting of 15–20 monomers at a protein concentration of 3 mM, it is likely that some distance restraints arise from intermolecular interactions caused by the added fragment. However, the ratio of long-range NOEs related to the added fragment to the total of long-range NOEs is 73%, which is notably higher than that of intraresidue (21%), short-range (30%) and medium-range (28%) NOEs, showing that long-range interactions are generated mainly by the added fragment. We deduced that long-range distance restraints are caused by intermolecular interactions, whereas other distance restraints are produced intramolecularly. The final structural calculation of an FP1 molecule was performed using a total of 342 intraresidue, short-range and medium-range distance restraints and 25 backbone φ dihedral angle restraints (Table 2). The resulting r.m.s.d. from the mean structure for the backbone atoms is 2.79 ± 0.71 Å, which is worse than that derived from typical native proteins (< 0.5 Å) because long-range distance restraints were not included in the structural calculation. A stereoview of the 10 best FP1 structures (Fig. 5A) shows that the backbone residues Phe12–Ile22 adopt an α helix. By contrast, in the Sp1f3–Zn2+ complex, the backbone residues Asp16–Gln26 form an α helix or a 3₁₀ helix, whereas Phe12–Ser15 forms a turn between the second β strand and the helix. Intermolecular interactions were drawn on the lowest energy structures of FP1, assuming that long-range restraints excluded in the
Fig. 3. pH dependence of the solubilities of FPs. (A) Experimentally obtained plots of the natural logarithm of the solubility, S (mol·L⁻¹), in the pH range 6.4–7.4. (B) Solubilities calculated using rp = 384 Å and $(\mu_{p(\mathrm{s})} - \mu^{0\prime}_{p(\mathrm{sol})})/RT$ values of −22.8 ± 0.1, −22.5 ± 0.1 and −20.5 ± 0.0 for FP1, FP2 and FP3, respectively, in the pH range 2.5–6.0. Errors in ln Scalc for FPs are < 0.1. (Inset) Net charge curves of FPs calculated using pK values for the amino acid side chains, α-COOH and α-NH3+ termini: Tyr = 10.9, Cys = 8.3, Lys = 10.8, Arg = 12.5, Asp = 3.9, His = 6.0; NH3+ of the N-terminus = 9.1, and COOH of the C-terminus = 2.4 [22].
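A net charge curve of the kind shown in the inset can be sketched from the listed pK values. The example below is a minimal illustration, assuming independent titration sites that follow the Henderson–Hasselbalch relation, free N- and C-termini, and free (unoxidized) cysteine thiols; the FP1 sequence is taken from Table 1 and the function name `net_charge` is hypothetical.

```python
# Sketch: net charge of FP1 versus pH from the pK values in the Fig. 3 legend.
# Assumptions (not stated in the original): independent sites, free termini,
# and cysteines present as free thiols.

FP1 = "YAFACPACPKRFMRSDALSKHIKTAFIVVALG"  # sequence from Table 1

PK_BASIC = {"K": 10.8, "R": 12.5, "H": 6.0}   # carry +1 when protonated
PK_ACIDIC = {"Y": 10.9, "C": 8.3, "D": 3.9}   # carry -1 when deprotonated
PK_N_TERM = 9.1
PK_C_TERM = 2.4


def net_charge(sequence, ph):
    """Return the average net charge of the peptide at the given pH."""
    positive = 1.0 / (1.0 + 10.0 ** (ph - PK_N_TERM))
    negative = 1.0 / (1.0 + 10.0 ** (PK_C_TERM - ph))
    for residue in sequence:
        if residue in PK_BASIC:
            positive += 1.0 / (1.0 + 10.0 ** (ph - PK_BASIC[residue]))
        elif residue in PK_ACIDIC:
            negative += 1.0 / (1.0 + 10.0 ** (PK_ACIDIC[residue] - ph))
    return positive - negative


for ph in (3.0, 5.0, 7.0):
    print(f"pH {ph:.1f}: Z = {net_charge(FP1, ph):+.2f}")
```

Under these assumptions the net charge is about +6 at pH 3 and about +4 at pH 7, which matches the qualitative statement that the net charge grows as the pH is lowered.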
Table 2 Structural statistics for the 10 lowest energy structures of FP1.
Number of distance restraints
  Short-range (|i − j| = 1 residues): 118
  Medium-range (|i − j| = 2–4 residues): 74
  Long-range (|i − j| > 4 residues): (52) a
Number of torsion angle restraints
Geometric statistics
  r.m.s.d. from the mean structure (Å)
    Backbone atoms (residues 3–30): 2.79 ± 0.71
    All heavy atoms (residues 3–30): 3.51 ± 0.81
Ramachandran analysis
a Long-range distance restraints were not used in the structure calculation (see text).
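The restraint classes in Table 2 are defined purely by the sequence separation |i − j| of the two residues involved. A minimal sketch of that classification follows; the function name `classify_restraint` is a hypothetical helper, and the example residue pairs are the long-range NOEs named in the text.

```python
# Sketch: classify an NOE distance restraint by sequence separation,
# following the |i - j| ranges listed in Table 2.

def classify_restraint(residue_i, residue_j):
    """Return the restraint class for a pair of residue numbers."""
    separation = abs(residue_i - residue_j)
    if separation == 0:
        return "intraresidue"
    if separation == 1:
        return "short-range"
    if separation <= 4:
        return "medium-range"
    return "long-range"


# Y1-I27 and A2-V29 are the long-range NOEs mentioned in the text.
for pair in [(1, 27), (2, 29), (12, 14), (5, 6), (8, 8)]:
    print(pair, classify_restraint(*pair))
```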
structural calculation are induced intermolecularly (Fig. 5B). The long-range interactions can be divided into three categories: (a) interactions between the N-terminus (Tyr1–Cys8) and the C-terminus (His21–Ala30), (b) interactions between the N-terminus and the middle (Arg11–Arg14), and (c) interactions between the middle and the C-terminus. The N- and C-termini contain mainly hydrophobic amino acids, and the middle also includes the hydrophobic amino acids Phe12 and Met13. Because most of the long-range NOEs were assigned to these hydrophobic residues, the intermolecular interactions are presumably hydrophobic. CD analysis and structural determination showed that the amount of α helix increases with an increase in FP1 concentration. Therefore, it is likely that the α helix, whose constituent interactions are mainly intramolecular, is induced by the hydrophobic interactions between FP1 molecules. It should also be noted that the structural specificity is apparently low and the structure is therefore loosely packed, because the proton chemical shifts do not disperse compared with those observed in typical native proteins, despite the appearance of a number of NOE peaks.
Fig. 4. pH dependence of the structural stabilities of FPs. (A) Integrated intensities of long-range NOE peaks of FP1. (B) Integrated intensities of short- and medium-range NOE peaks of FP1. (C) Fractions of structured molecules of FP1 (red), FP2 (blue) and FP3 (green).

Physicochemical factors that determine the structural stability
Here, we scrutinize the dependence of structural stability on protein solubility or hydration potential to elucidate the determining factor in structural formation. First, the stability was derived from the fraction of the structured molecules at each pH for 3 mM FPs by using six clearly separated short- and medium-range NOE peaks, in which distances related to NOE intensities were set to the best structure of FP1 [15]. The fractions obtained are shown in Fig. 4C. Fractions of FP1 and FP2, which are close to 0 at pH 2.5, increase to 0.4 as pH is increased to 4.2. The structural stabilities could not be calculated at pH values > 4.2 because of an increase in the aggregation rate. By contrast, the FP3 fraction is close to 0 at pH 3.8, and increases to 0.4 as pH increases to 5.6. The folding free energies (ΔGf) can be calculated using the fractions (Table 3) according to the following simple scheme: 15D ⇌ N15, because ultracentrifuge measurements suggested that, up to 3 mM, the dissolved species contain mainly the monomer and 15–20 molecule oligomer. Second, the hydration potential (ΔGhyd) was evaluated at each pH value as follows: ΔGhyd values for the FPs were calculated using the hydration potentials of the amino acid side chains and the backbone [16,17]. In addition, because the individual hydration potential for ionizable side chains, α-COOH and α-NH3+ termini depends on the pH of the solution, i.e. a protonated cation or deprotonated anion is stabilized or destabilized, respectively, with a decrease in pH [16], pH dependence was taken into account using pK values (Table 3) [22]. Third, solubility was represented as the dissolution free energy of solute p (ΔGdissol), which can be calculated using ΔGdissol = −RT ln Sp, where R is the gas constant, T is the absolute temperature and Sp is the solubility of solute p. ΔGdissol reflects the free energy of transfer of solute p from the solid phase to aqueous solution. For a protein as a solute, solubility is known to depend on polarity, hydrophobicity and net charge [23–25], the latter of which increases positively with a decrease in pH. The solubility at each pH value for the FPs was calculated by taking these factors into account and using Eqn (5) (Fig. 3B and Table 3).
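The conversion from solubility to dissolution free energy is a one-line calculation. The sketch below assumes a temperature of 293 K (the value quoted for the sedimentation estimate; the temperature of the solubility measurements is not recoverable from the text) and solubility expressed in mol·L⁻¹; the function name is hypothetical.

```python
import math

# Sketch of dG_dissol = -RT ln(S_p), with S_p in mol/L.
# The temperature default of 293 K is an assumption for illustration.

R_KCAL = 1.987e-3  # gas constant in kcal K^-1 mol^-1


def dissolution_free_energy(solubility_mol_per_l, temperature_k=293.0):
    """Return the dissolution free energy in kcal/mol."""
    return -R_KCAL * temperature_k * math.log(solubility_mol_per_l)


# Example input: a solubility of 10 micromolar, the order of magnitude
# reported for FP1 near neutral pH.
dg_example = dissolution_free_energy(10e-6)
print(f"dG_dissol = {dg_example:.2f} kcal/mol")
```

A lower solubility gives a larger positive ΔGdissol, so the less soluble FP1 and FP2 sit at higher ΔGdissol than FP3 at the same pH.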
Plots of ΔGf, obtained using NOE peaks as described above, against ΔGhyd show a fairly good correlation (r = −0.70; Fig. 6A), i.e. the less hydrated the protein, the more stable the solution structure. However, plots of ΔGf against ΔGdissol show a much stronger correlation (r = −0.86; Fig. 6B), i.e. the more insoluble the protein, the more stable the structure. These results indicate that the structural stability is more strongly dependent on the precipitation capability than on the dehydration capability. This means that, even if it becomes more hydrated, a protein that prefers precipitation retains the stable structure, as shown in the case of Phe26Tyr replacement. These precipitable proteins might tend to form the solution structure because both structure formation and precipitation require the self-association of protein molecules to sequester from solvent.
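The folding free energies discussed here follow the definition in the Table 3 legend, ΔGf = −RT ln([N15]/[D]^15). The sketch below shows one way to evaluate it from a fraction of structured molecules. It rests on several assumptions that are not stated in the text: the fraction is taken as the share of chains residing in the 15-mer, concentrations are referenced to a 1 mol·L⁻¹ standard state, and the temperature is 293 K. The function name is hypothetical.

```python
import math

# Sketch of dG_f = -RT ln([N15] / [D]^15) for the scheme 15 D <-> N15.
# Assumptions: fraction = share of chains in the oligomer, 1 M standard
# state, T = 293 K. None of these are confirmed by the source.

R_KCAL = 1.987e-3  # gas constant in kcal K^-1 mol^-1


def folding_free_energy(fraction_structured, total_conc_mol_per_l,
                        n_mer=15, temperature_k=293.0):
    """Return dG_f in kcal per mole of oligomer."""
    if not 0.0 < fraction_structured < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    oligomer = fraction_structured * total_conc_mol_per_l / n_mer
    monomer = (1.0 - fraction_structured) * total_conc_mol_per_l
    # Work with logarithms to avoid underflow in monomer ** n_mer.
    ln_ratio = math.log(oligomer) - n_mer * math.log(monomer)
    return -R_KCAL * temperature_k * ln_ratio


dg_f_low = folding_free_energy(0.1, 3e-3)
dg_f_high = folding_free_energy(0.4, 3e-3)
print(f"f = 0.1: {dg_f_low:.1f} kcal/mol; f = 0.4: {dg_f_high:.1f} kcal/mol")
```

Because the monomer concentration enters to the fifteenth power, modest changes in the structured fraction shift ΔGf by several kcal·mol⁻¹ under these assumptions.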
Mode of formation of loosely packed structure through intermolecular interactions
When the hydrophobic peptide fragment of FIVVALG was added to the C-terminus of unstructured IP consisting of 25 residues, formation of the overall protein structure was induced in FP1, with a drastic
Fig. 5. NMR structures of FP1. (A) Backbone traces of the 10 best structures. Backbones of residues 12–22, which adopt an α helix, are drawn in red. (B) Schematic diagram of long-range interactions between FP1 molecules. Long-range interactions are represented by blue lines between the lowest energy structures of FP1. The side chains of Tyr1, Phe3, Ala4, Cys5, Pro6, Ala7, Cys8, Arg11, Phe12, Met13, Arg14, His21, Ile22, Lys23, Ala25, Phe26, Ile27, Val28, Val29 and Ala30, which are related to the long-range interactions, are indicated in green. In addition, NOE peaks including aromatic protons of Phe3 and Phe12 in FP1 are not clearly separated because chemical shifts of aromatic protons of Phe3 and Phe12 are close to those of Phe26. Therefore, long-range interactions including aromatic protons of Phe3 and Phe12 in FP2, which is the variant Phe26Tyr of FP1, are added.
decrease in solubility, i.e. the solubility of IP was > 2.2 mM at pH 7.1 (data not shown), whereas that of FP1 was 10 μM (Fig. 3A). This loosely packed structure is maintained by intermolecular interactions, indicating that the added peptide fragment confers the ability to form the protein structure through low-specificity interactions. How much hydrophobicity is needed in the added fragment to form this structure? Phe26Tyr replacement in the added fragment resulted in a similar stability in FP2, while keeping the same solubility. However, the additional Val28Ser replacement led to a drastic decrease in the stability of FP3, which showed higher solubility than FP2, indicating that hydrophilic replacement of two of seven hydrophobic residues in the added fragment deprives the whole protein of the ability to form structures. In fact, a decrease of 1.5 kcal·mol⁻¹ in the dissolution free energy by the replacements Phe26Tyr and Val28Ser resulted in a decrease of 5 kcal·mol⁻¹ in structural stability (Fig. 6B). These results demonstrate that, among poorly soluble proteins, dissolved species tend to be transformed from the solvent-exposed unfolded state into a loosely packed solution structure through intermolecular interactions.
The mechanism of structural formation through intermolecular interactions is similar to that of intrinsically unstructured proteins, which are devoid of the well-defined secondary and tertiary structure in
Fig. 6. Relationships between structural stability and hydrophobic indices. The correlation between the folding free energy (ΔGf) and (A) the hydration potential (ΔGhyd), defined as the free energy of transfer of a protein solute molecule from the gaseous phase into water, or (B) the dissolution free energy (ΔGdissol = −RT ln S), for FP1 (red), FP2 (blue) and FP3 (green). Lines represent linear fits with correlation coefficients of −0.70 and −0.86, respectively. To calculate ΔGhyd of FPs, hydration potentials of 18 amino acid side chains, excluding Pro and Arg, were taken from the values measured by Wolfenden et al. [16]. Those of Pro and Arg side chains and the backbone are taken from the values measured by Privalov et al. [17]. pK values of the amino acid side chains, α-COOH and α-NH3+ termini are taken from the values in the legend to Fig. 3 [22].
Table 3 pH dependence of folding free energy (ΔGf), dissolution free energy (ΔGdissol) and hydration potential (ΔGhyd) of FPs. Errors in pH and ΔGdissol are < 0.02 and 0.1, respectively. The folding free energy was calculated according to ΔGf = −RT ln([N15]/[D]^15), where [N15] and [D] are the concentrations of the structured oligomer and unfolded monomer, respectively. An error in the ΔGf value for FP3 at pH 5.65 could not be obtained because only the spectrum at a mixing time of 300 ms was analyzed owing to an increase in the aggregation rate.
pH | ΔGf (kcal·mol⁻¹) | ΔGdissol (kcal·mol⁻¹) | ΔGhyd (kcal·mol⁻¹)
isolation, but adopt relatively rigid conformations upon binding a specific molecular partner of ligands or substrates [26–29]. FPs are unstructured in isolation, as in the case of intrinsically unstructured proteins, whereas an α helix (Phe12–Ile22) is induced with a gain in the concentration. One reason that a local conformation consistent with an α helix is formed could be attributed to the potential ability [30,31] and/or local structural preference [32] in the sequence, because the zinc-finger domain Sp1f3, initially chosen as a basis for the FPs, also forms a tertiary structure containing an α helix (Asp16–Gln26) upon binding to Zn2+ [14]. We show that the local structure, stabilized originally by coordinate bonds to the metal ion, could be induced by interactions with other copies of the same chain, after the addition of a proper hydrophobic segment responsible for the decrease in solubility. Furthermore, this inductive mechanism indicates that hydrophobicity could be regarded as a driving force in the structural formation of FPs, as observed in protein folding in general [13]. By using a simple model of short, self-avoiding flexible chains on lattices, in which the only energetic feature of the sequence is the hydrophobic interaction, protein folding simulations imply a significant probability that a random sequence of amino acids will encode a globular conformation, in general, and a particular native structure, in particular [12]. The globular conformation is interpreted as being like a molten globule, stabilized by intramolecular hydrophobic interactions [12,33]. However, our results show that the unfolded protein can be transformed into the structured assembly by altering the solubility. Because decreasing the protein solubility does not require a sophisticated design for the primary sequence, it is implied that the loosely packed structures with intermolecular interactions shown in FPs may arise readily and frequently compared with well-packed structures. The existence of this moderately structured state might serve as an intermediate stage in the search for the well-packed structures of natural proteins in the vast primary sequence space. In addition, the high occurrence of a primary sequence that prefers self-association seems closely connected to the inherent tendency of natural proteins to aggregate and form potentially harmful deposits such as amyloid fibrils [34–38].
Materials and methods
Protein synthesis and purification
Proteins were synthesized using the Pioneer Peptide Synthesis System (PerSeptive Biosystems, Foster City, CA, USA) with Fmoc solid-phase chemistry, and were cleaved from the resin with a solution containing 82.5% trifluoroacetic
trifluoroacetic acid). Protein identity was confirmed by a laser
(SHIMADZU, Kyoto, Japan). Protein samples for all studies were lyophilized and stored under anaerobic conditions. These
CD measurements
spectropolarimeter with 0.1, 0.2, 1, 5 and 10 mm pathlength cuvettes on 0.004 to 3 mM protein samples. After each protein was dissolved in a buffer containing 25 mM acetic acid,
NaOH or HCl. Protein concentrations were determined spectroscopically by measuring the amount of protein sulfhydryls with Ellman's reagent [39].
Ultracentrifuge measurements
Each protein sample was prepared as described above. Sedimentation velocity and sedimentation equilibrium measurements were performed using a Beckman-Coulter Optima XL-I analytical ultracentrifuge (Fullerton, CA, USA) with an An-60 rotor and two-channel charcoal-filled Epon cells
was measured at protein concentrations of 0.9, 1.5 and 3.0 mM; sedimentation velocity was measured at 2.3 and 3.0 mM. Data were analyzed using UltraScan 6.01 (http://www.ultrascan.uthscsa.edu/).
NMR spectroscopy
NMR measurements were performed on a Bruker
samples. After each protein was dissolved in a buffer containing 25 mM acetic acid, 0–4 mM NaOH and 50 mM NaCl in
objective pH with NaOH or HCl. Pulsed-field gradient
containing 40 mM 1,4-dioxane at 0.5 mM FP1, at which concentration CD and sedimentation equilibrium measurements suggested that FP1 was in the monomer state. All chemical shifts were referenced to the sodium salt of trimethylsilylpropionate. Pulsed-field gradient NMR spectroscopy, 2QF COSY, TOCSY (mixing time of 30 and 80 ms) and NOE
suppression was achieved by selective presaturation or field-gradient pulses [40]. Proton resonances were assigned using the sequential assignment procedure [41]. Fractions of structured molecules were obtained by analyzing NOESY spectra at mixing times of 100, 150, 200, 250 and 300 ms for the 3.0 mM protein solution.
Structure calculations
Distance restraints were obtained by converting integrated NOE peak intensities into distance upper limits, using the macro CALIBA in DYANA [42]. Standard pseudo atom distances were used when they were needed. Torsion
7.5–8.5 and > 8.5 Hz, respectively. With a cut-off of 0.2 Å for upper bound NOE violations, 50 structures were generated by using DYANA and the 10 lowest energy structures were selected to represent 3D structures. Ramachandran analysis was evaluated by using PROCHECK [43].
Solubility measurements
Solubility was estimated using saturated protein solutions as follows. Samples of 200–400 μL protein suspensions in buffers containing 25 mM phosphate and 50 mM NaCl in
incubated samples were centrifuged at 17 000 g for 20 min at
of the individual supernatant solutions were measured. Protein concentrations were determined spectroscopically by measuring the amount of sulfhydryls with Ellman's reagent [39].
Analysis of protein solubility
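The chemical potential of a solute p in a real solution ($\mu_{p(\mathrm{sol})}$) is generally expressed by

\[ \mu_{p(\mathrm{sol})} = \mu^{0}_{p(\mathrm{sol})} + RT \ln \gamma_p S_p \tag{1} \]

where $\mu^{0}_{p(\mathrm{sol})}$ is the chemical potential in the ideal solution at a standard concentration of p, R is the gas constant, T is the absolute temperature, $\gamma_p$ is the activity coefficient and $S_p$ is the solubility. $\mu^{0}_{p(\mathrm{sol})}$ could be divided into the free energy of solvation, $\Delta G^{0}_{\mathrm{solv},p}$, which depends on the charge of the ion, Z, and a term independent of the charge of the ion, $\mu^{0\prime}_{p(\mathrm{sol})}$:

\[ \mu^{0}_{p(\mathrm{sol})} = \mu^{0\prime}_{p(\mathrm{sol})} + \Delta G^{0}_{\mathrm{solv},p} \tag{2} \]

\[ \Delta G^{0}_{\mathrm{solv},p} = -\frac{Z^{2} e^{2} N_a}{8 \pi \varepsilon_0 r_p} \left( 1 - \frac{1}{\varepsilon_r} \right) \tag{3} \]

The activity coefficient $\gamma_p$ is given by the extended Debye–Hückel law:

\[ \log \gamma_p = -\frac{A Z^{2} \sqrt{I}}{1 + B r_p \sqrt{I}} \tag{4} \]

where A and B are constants, and I is the ionic strength. Because the chemical potential of the solid, $\mu_{p(\mathrm{s})}$, equals $\mu_{p(\mathrm{sol})}$ in a saturated solution, one can obtain an equation for the solubility of p by using Eqns (1–4):

\[ \ln S_p = \left( \frac{\ln 10 \cdot A \sqrt{I}}{1 + B r_p \sqrt{I}} + \frac{C}{r_p} \right) Z^{2} + \frac{\mu_{p(\mathrm{s})} - \mu^{0\prime}_{p(\mathrm{sol})}}{RT} \tag{5} \]

where

\[ C = \frac{e^{2} N_a}{8 \pi \varepsilon_0 RT} \left( 1 - \frac{1}{\varepsilon_r} \right) \tag{6} \]

Assuming that the dissolved protein is a spherical ion in solvent [24], experimentally obtained plots of individual protein solubility were fitted using Eqn (5) with A = 0.512 (L^{1/2}·mol^{−1/2}), B = 0.329 (Å⁻¹·L^{1/2}·mol^{−1/2}) and C = 281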
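The fitted solubility model, Eqn (5), can be evaluated directly once the net charge is known. The sketch below uses the constants A, B and C and the fitted values quoted in the text (rp = 384 Å and an intercept of −22.8 for FP1). The ionic strength of 0.1 mol·L⁻¹ and the net charge of +4 in the example are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math

# Sketch of Eqn (5):
#   ln S_p = (ln10 * A*sqrt(I) / (1 + B*r_p*sqrt(I)) + C/r_p) * Z^2 + intercept
# where intercept = (mu_p(s) - mu0'_p(sol)) / RT is the fitted constant.

A_DH = 0.512    # L^(1/2) mol^(-1/2), from the text
B_DH = 0.329    # A^-1 L^(1/2) mol^(-1/2), from the text
C_BORN = 281.0  # value quoted in the text (taken here to be in angstroms)


def ln_solubility(net_charge, ionic_strength, r_p_angstrom, intercept):
    """Return ln S_p (S_p in mol/L) for a spherical ion of radius r_p."""
    sqrt_i = math.sqrt(ionic_strength)
    debye_huckel = (math.log(10.0) * A_DH * sqrt_i
                    / (1.0 + B_DH * r_p_angstrom * sqrt_i))
    born = C_BORN / r_p_angstrom
    return (debye_huckel + born) * net_charge ** 2 + intercept


# Example with assumed inputs: Z = +4 and I = 0.1 mol/L are illustrative only.
ln_s = ln_solubility(net_charge=4.0, ionic_strength=0.1,
                     r_p_angstrom=384.0, intercept=-22.8)
print(f"ln S = {ln_s:.2f}, S = {math.exp(ln_s) * 1e6:.1f} uM")
```

With these illustrative inputs the model returns a solubility in the ten-micromolar range, the same order of magnitude as the value reported for FP1 near neutral pH; because Z enters as Z², the predicted solubility rises steeply as the net charge increases at lower pH.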
Acknowledgements
We thank Miyo Sakai (Institute for Protein Research, Osaka University) for the ultracentrifuge measurements. This work was supported in part by Grants-in-Aid for Science from the Ministry of Education, Culture, Sports, Science and Technology of Japan.
References
1 Carlsson U & Jonsson BH (1995) Folding of beta-sheet proteins. Curr Opin Struct Biol 5, 482–487.
2 Chakrabartty A & Baldwin RL (1995) Stability of alpha-helices. Adv Protein Chem 46, 141–176.
3 Jackson SE (1998) How do small single-domain proteins fold? Fold Des 3, R81–R91.
4 Kuhlman B & Baker D (2004) Exploring folding free energy landscapes using computational protein design. Curr Opin Struct Biol 14, 89–95.
5 Ponder JW & Richards FM (1987) Tertiary templates for proteins. Use of packing criteria in the enumeration