ARFP/Core+1/S protein
Anissa Boumlic1,2,*, Yves Nominé1,*, Sebastian Charbonnier1, Georgia Dalagiorgou2, Niki Vassilaki2, Bruno Kieffer3, Gilles Travé1, Penelope Mavromara2 and Georges Orfanoudakis1
1 Oncoproteins Group, Université de Strasbourg, CNRS FRE 3211, Ecole Supérieure de Biotechnologie de Strasbourg, Illkirch, France
2 Molecular Virology Laboratory, Hellenic Pasteur Institute, Athens, Greece
3 Biomolecular NMR Group, UMR CNRS 7104, Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France
Introduction
Hepatitis C virus (HCV) is the major etiological agent of chronic hepatitis, with more than 170 million people infected worldwide [1,2]. Persistent HCV infection progresses, in 20% of cases, to liver cirrhosis within 20 years of infection, with the possible development of hepatocellular carcinoma (HCC) in 1–4% of cases [3]. No prophylactic vaccine against HCV exists, and the efficiency of therapies is hindered by the extreme heterogeneity of the HCV genome [4,5]. HCV, a Hepacivirus genus member of the Flaviviridae family, is a small, enveloped RNA virus [6]. Its genome is a positive, single-stranded 9.6 kb RNA containing 5′ and 3′ UTRs involved in viral protein translation and viral replication [7–9]. The genome encodes a large precursor polyprotein that undergoes proteolysis, generating HCV structural proteins (Core, E1, and E2) and nonstructural proteins (p7, NS2, NS3, NS4A, NS4B, NS5A, and NS5B). An alternative reading frame (Core+1 ORF) overlapping the Core protein gene in the +1 frame was recently reported [10–13].
Keywords
ARFP/Core+1/S; hepatitis C virus (HCV);
intrinsic disorder; IUP/IDP; NMR
Correspondence
G. Orfanoudakis, Oncoproteins Group, Université de Strasbourg, CNRS FRE 3211, Ecole Supérieure de Biotechnologie de Strasbourg, Illkirch, France
Fax: +33 3 68 85 47 70
Tel: +33 3 68 85 47 65
E-mail: georges.orfanoudakis@unistra.fr
*These authors contributed equally to this work
(Received 25 October 2009, revised 30 November 2009, accepted 1 December 2009)
doi:10.1111/j.1742-4658.2009.07527.x
The hepatitis C virus (HCV) Core+1/S polypeptide, also known as alternative reading frame protein (ARFP)/S, is an ARFP expressed from the Core coding region of the viral genome. Core+1/S is expressed as a result of internal initiation at AUG codons (85–87) located downstream of the polyprotein initiator codon, and corresponds to the C-terminal part of most ARFPs. Core+1/S is a highly basic polypeptide, and its function remains unclear. In this work, untagged recombinant Core+1/S was expressed and purified from Escherichia coli in native conditions, and was shown to react with sera of HCV-positive patients. We subsequently undertook the biochemical and biophysical characterization of Core+1/S. The conformation and oligomeric state of Core+1/S were investigated using size exclusion chromatography, dynamic light scattering, fluorescence, CD, and NMR. Consistent with sequence-based disorder predictions, Core+1/S lacks significant secondary structure in vitro, which might be relevant for the recognition of diverse molecular partners and/or for the assembly of Core+1/S. This study is the first reported structural characterization of an HCV ARFP/Core+1 protein, and provides evidence that ARFP/Core+1/S is highly disordered under native conditions, with a tendency for self-association.
Abbreviations
ARFP, alternative reading frame protein; DLS, dynamic light scattering; HCC, hepatocellular carcinoma; HCV, hepatitis C virus; HSQC, heteronuclear single quantum coherence; IDP, intrinsically disordered protein; IMAC, immobilized metal ion affinity chromatography; MBP, maltose-binding protein; OG, n-octyl-β-D-glucoside; SSP, secondary structure propensity; TEV, tobacco etch virus.
This ORF is responsible for the expression of various alternative reading frame proteins (ARFPs), also named Core+1 proteins, resulting from mechanisms such as ribosomal frameshifting and internal initiation at alternative AUG or non-AUG codons [10–12,14–17]. Core+1 proteins were recently shown not to be required for HCV replication [18,19]. However, the presence of specific antibodies and T-cell-mediated immune responses in serum from HCV-infected patients suggests the expression of the Core+1 ORF during HCV infection [10–12,20,21]. Furthermore, Core+1 proteins were found to interfere with apoptosis and cell cycle regulation [22,23], suggesting a possible role of these proteins in HCV pathogenesis.
One remarkable ARFP is Core+1/S, a small polypeptide with a length varying from 38 to 76 residues among HCV genotypes. Core+1/S corresponds to the C-terminal fragment of the Core+1 ORF, and to date is the shortest ARFP form described. Its translation results from internal initiation at alternative AUG codons (85–87) located downstream of the polyprotein initiator codon. Recently, two different groups observed that Core+1/S is the predominant alternative form when the Core+1 ORF is introduced into mammalian expression systems [16,24]. In addition, Core+1/S was found to be downregulated by the Core protein and degraded in a proteasome-dependent manner [25,26].
In order to further our understanding of these proteins, we undertook biochemical and biophysical studies of the Core+1/S proteins derived from HCV-1a and HCV-1b isolates. The Core+1/S proteins were produced in bacteria and purified in native conditions. ELISA experiments using the purified recombinant Core+1/S of HCV-1b demonstrated the ability of the protein to react with sera from HCV-infected patients. We subsequently investigated the biophysical features of HCV-1a and HCV-1b Core+1/S proteins using sequence analysis and complementary biophysical approaches [fluorescence, CD, dynamic light scattering (DLS), and NMR]. We provide evidence that ARFP/Core+1/S is highly disordered under native conditions, with a tendency for self-association.
Results
Sequence analysis of Core+1/S predicts the largely disordered character of the protein
Sequence alignments were performed to analyze the degree of Core+1/S amino acid conservation among reference sequences of different HCV genotypes (Fig. 1A) [5]. The N-terminal sequence is well conserved and exhibits hydrophobic patches, encompassing residues 1–6, 14–25, and 32–35. In contrast, considerable variability was observed in the location of the stop codon on the RNA sequence (data not shown), leading to variation in the lengths of protein sequences. Amino acid sequences were analyzed using the disorder prediction tools GLOBPLOT and PONDR. GLOBPLOT evaluates the sum of the disorder propensity for each amino acid along the sequence, and PONDR analyzes the mean net charge and hydrophobicity of the polypeptide chain. This combination of properties seems to be a prerequisite for the absence of compact structure in native conditions [27]. GLOBPLOT predicted disordered regions encompassing amino acids 6–28 and 42–52 for HCV-1a Core+1/S, and amino acids 6–28 and 42–58 for HCV-1b Core+1/S, whereas PONDR suggested that most of the Core+1/S sequence is disordered.
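The charge and hydrophobicity reasoning described above can be illustrated with a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale rescaled to 0–1 and the empirical charge-hydropathy boundary ⟨R⟩ = 2.785⟨H⟩ − 1.151 proposed by Uversky and co-workers. Whether the PONDR analysis reported here used exactly these parameters is not stated in the text, and the two sequences below are hypothetical, not Core+1/S sequences.

```python
# Kyte-Doolittle hydropathy values (the scale is an assumption of this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Mean hydropathy, rescaled so that each residue lies between 0 and 1."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute mean net charge, counting K/R as +1 and D/E as -1."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def predicted_disordered(seq):
    """True when the sequence lies on the disordered side of the boundary."""
    boundary_charge = 2.785 * mean_hydropathy(seq) - 1.151
    return mean_net_charge(seq) > boundary_charge


# Hypothetical sequences for illustration only (not from Fig. 1).
basic_polar = "RRGSRAGSRA"
hydrophobic = "IVLIVLIVLA"
print(predicted_disordered(basic_polar), predicted_disordered(hydrophobic))
```

In this simplified picture, a highly basic, weakly hydrophobic sequence falls on the disordered side of the boundary, which is the qualitative behavior expected for a polypeptide such as Core+1/S.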
In order to assess whether the disorder prediction is also confirmed by the absence of secondary structure, four algorithms (PHD, GOR4, SIMPA96, and SOPMA) were used to predict the secondary structure contents of both HCV-1a and HCV-1b Core+1/S proteins (Fig. 1B). A consensus was drawn for residues with at least three out of four identical secondary structure predictions. This consensus suggested that the majority of residues are not embedded in secondary structure elements, with the exception of short residue stretches mainly located in the second and third hydrophobic patches. The combination of secondary structure and disordered region predictions strongly suggests that the N-terminal and C-terminal regions of HCV Core+1 proteins are largely unstructured and highly disordered (Fig. 1A). These predictions are supported by the high degree of conservation of several disorder-promoting residues, such as alanines, arginines, glycines, and serines (Fig. 1B) [28].
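The three-out-of-four consensus rule can be sketched as follows. The function name, the "-" symbol for positions without consensus, and the example strings are illustrative assumptions; the per-residue predictions shown are hypothetical and are not those of Fig. 1B.

```python
from collections import Counter


def consensus(predictions, min_votes=3, no_call="-"):
    """Column-wise consensus over equal-length per-residue prediction strings.

    A state (H, E, T or C) is reported only when at least `min_votes`
    predictors agree; otherwise `no_call` is emitted for that position.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all prediction strings must have the same length")
    calls = []
    for column in zip(*predictions):
        # Case is ignored here: lowercase letters only flag a lower prediction rate.
        state, votes = Counter(s.upper() for s in column).most_common(1)[0]
        calls.append(state if votes >= min_votes else no_call)
    return "".join(calls)


# Hypothetical output of four predictors for an 8-residue stretch.
example = ["CCEEECCC", "CCEEHCCC", "CHEECCCC", "CCEECCTC"]
print(consensus(example))  # prints CCEE-CCC
```

Position 5 of the example receives no consensus because only two of the four predictors agree on a state.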
Expression and purification of Core+1/S proteins in native conditions
We cloned and expressed the HCV-1a and HCV-1b Core+1/S proteins encompassing residues 85–160 and 85–142 of the full Core+1 ORF, respectively. These constructs were fused to the C-terminus of either His6, His6–maltose-binding protein (MBP) (Fig. S1), or His6–NusA (Fig. 2A). Screenings of optimal yield and solubility conditions were first performed on the three constructs of HCV-1a Core+1/S by varying the induction temperature between 37 and 22 °C. Analysis on Tris/Tricine SDS gels showed expression of proteins at the expected molecular mass (Fig. 2B; Fig. S1), with an optimum induction temperature of 22 °C. However, both His-tagged and MBP-tagged HCV-1a Core+1/S proteins were found largely in the insoluble fractions after cell lysis (Fig. S1), even after incubation at low temperatures. In contrast, NusA-tagged HCV-1a Core+1/S was largely soluble even after tobacco etch virus (TEV) protease cleavage (Fig. 2C). The solubilizing properties of NusA have already been described in the literature [29]. However, it was surprising to observe MBP fusion proteins in the insoluble fractions, as the MBP carrier is also a well-known protein solubilizer. Despite its small size, Core+1/S seems, therefore, to promote aggregation of the fusion protein when fused to the MBP carrier. As NusA solubilized HCV-1a Core+1/S, we fused the same carrier protein to the HCV-1b Core+1/S. After TEV protease-mediated proteolysis, both HCV Core+1/S proteins remained soluble (Fig. 2C).
When Core+1/S production was scaled up, the use of the optimal expression and purification conditions described above led to protein aggregation. In order to prevent this, we lowered the expression temperature to 15 °C and systematically supplemented the purification buffer with L-arginine and L-glutamic acid at a final concentration of 50 mM each. These additives are known to prevent protein aggregation [30]. Finally,
Fig. 1. Sequence analysis of Core+1/S proteins. (A) Alignment of 17 Core+1/S amino acid reference sequences for different HCV genotypes. Protein sequences were obtained after translation of the Core+1 ORF nucleotide sequences retrieved from the GenBank database (accession numbers are given in parentheses). Core+1/S amino acid sequences were aligned using CLUSTALW. Similarity percentages are indicated on the right, according to CLUSTALW calculations. Hydrophobic residues are boxed. (B) Disorder and structure predictions. Disorder predictions were made using GLOBPLOT and PONDR. Disordered and ordered regions are indicated by ‘D’ and ‘.’, respectively. Secondary structure predictions were performed with GOR4, SOPMA, SIMPA96 and PHD, using both HCV-1a and HCV-1b Core+1/S amino acid sequences as inputs (see Experimental procedures). Structure predictions for each residue position are indicated as α-helix (H), extended strand (E), β-turn (T), or random coil (C). Uppercase letters indicate a prediction rate higher than 80%. A consensus was reported when three or more of the four algorithms provided identical secondary structure predictions. Residues are numbered from the start of Core+1/S and correspond to residues 85–161 and 85–144 of the Core+1 ORF, and nucleotides 599–827 and 599–776 of the Core/Core+1 RNA sequence, for HCV-1a and HCV-1b, respectively.
buffers were routinely supplemented with dithiothreitol and argon to reduce protein oxidation [31]. Final yields were approximately 1 mg of expressed protein per liter of bacterial culture.
Upon size exclusion chromatography, both HCV Core+1/S proteins eluted as monomers, according to column calibration (Fig. 3A). MS analysis of the purified proteins gave experimental masses of 7630.7 ± 0.8 and 6076.0 ± 0.1 Da for HCV-1a and HCV-1b Core+1/S, respectively. The mass of HCV-1b Core+1/S corresponds to the calculated value (6075.9 Da), whereas that of HCV-1a Core+1/S showed loss of the GA sequence that is usually left after TEV protease proteolysis and of the N-terminal methionine. Purified recombinant Core+1/S proteins were also verified through SDS/PAGE (Fig. 3B), and were specifically recognized by polyclonal antibodies against the Core+1 ORF in western blot experiments (Fig. 3C).
Sera from HCV-1-infected patients are reactive against native HCV-1b Core+1/S
HCV-1b Core+1/S was used in ELISA to test the reactivity of sera from patients positive for HCV genotype 1. Figure 4 shows a high prevalence (approximately 60%) of Core+1 antibodies in patient sera as compared with the cutoff value, defined as the average of the negative controls plus two standard deviations. The presence of antibodies against Core+1/S indicates that the purified recombinant untagged protein remains immunoreactive, and suggests that the protein is present in patients infected with HCV of genotype 1.
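The cutoff definition (mean of the negative controls plus two standard deviations) can be written out as a short sketch. The absorbance values below are hypothetical, and the use of the sample standard deviation (rather than the population value) is an assumption, as the text does not specify it.

```python
from statistics import mean, stdev


def elisa_cutoff(negative_controls, n_sd=2.0):
    """Cutoff = mean of negative-control absorbances + n_sd sample standard deviations."""
    return mean(negative_controls) + n_sd * stdev(negative_controls)


def fraction_reactive(patient_values, cutoff):
    """Fraction of patient sera whose absorbance exceeds the cutoff."""
    reactive = [value for value in patient_values if value > cutoff]
    return len(reactive) / len(patient_values)


# Hypothetical absorbance readings, for illustration only.
negatives = [0.05, 0.07, 0.06, 0.08, 0.04]
patients = [0.30, 0.05, 0.22, 0.09, 0.15]

cutoff = elisa_cutoff(negatives)
print(f"cutoff = {cutoff:.3f}, reactive fraction = {fraction_reactive(patients, cutoff):.2f}")
```

With these made-up readings, three of the five patient sera exceed the cutoff; the prevalence reported in Fig. 4 is obtained in the same way from the measured absorbances.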
Intrinsic fluorescence of Core+1/S proteins
HCV-1a and HCV-1b Core+1/S proteins contain tryptophans at positions 34, 49, 66, and 74, and positions 6, 34, and 49, respectively. Intrinsic fluorescence spectroscopy was therefore used to evaluate the solvent accessibility of these residues. As all tryptophans are simultaneously excited, the emission spectrum results from the sum of the signals of individual emitters. The maxima of fluorescence emission for HCV-1a and HCV-1b Core+1/S proteins were observed at wavelengths of 354 and 353 nm, respectively (Fig. 5A). These values are close to that of soluble tryptophan in aqueous solution (355 nm) [32], indicating that all tryptophans of Core+1/S proteins are exposed to the solvent. In a second step, an HCV-1b Core+1/S sample was subjected to a 20 min heat pulse 16 h prior to fluorescence analysis (Fig. 5B). No change in either the wavelength or the intensity of the maximum fluorescence emission was observed. This observation indicates an absence of precipitation, suggesting resistance of the protein to heat treatment, a feature that is often associated with disordered proteins [33].
Fig. 2. Expression and purification screenings of native NusA–HCV Core+1/S proteins. (A) Cloning strategy for expression of Core+1/S. The sequence His6–NusA is fused at the 5′-terminus of the Core+1/S DNA sequence. (B) Pellet/supernatant assays. After transformation, expression of recombinant proteins was monitored for 2 h, 4 h or overnight at 37, 28 or 22 °C, respectively. Fifty microliters of bacterial culture was sonicated and centrifuged for 15 min at 16 000 g. Supernatants (S) and pellets (P) were analyzed by Tris/Tricine SDS/PAGE. (C) IMAC purification of NusA–Core+1/S proteins followed by TEV protease digestion. Labeled or unlabeled His6–NusA–Core+1/S proteins were expressed under optimized conditions, and purified on Ni2+–nitrilotriacetic acid resin in the presence of arginine and glutamic acid (50 mM each). After IMAC purification, fusion proteins were desalted and subjected to TEV protease cleavage to release Core+1/S proteins. Lane 1: bacterial lysate. Lane 2: IMAC elution at 250 mM imidazole. Lane 3: desalted NusA–HCV-1a Core+1/S before TEV protease cleavage. Lane 4: NusA–HCV-1a Core+1/S after TEV protease cleavage. Lane 5: desalted NusA–HCV-1b Core+1/S before TEV protease cleavage. Lane 6: NusA–HCV-1b Core+1/S after TEV protease cleavage. Arrows on the right indicate the bands for soluble NusA–HCV Core+1/S, NusA, TEV and Core+1/S proteins.
Self-assembly of HCV Core+1/S proteins
DLS allows the oligomeric status of proteins in solution to be evaluated. Hydrodynamic radius distributions were derived from DLS data recorded for each protein sample under various conditions, assuming a coil model as implemented in DYNALS (Fig. 6). In the absence of any treatment or additive (Fig. 6, upper panels), the average hydrodynamic radii (Rh) were 4.5 ± 2.4 and 2.5 ± 1.2 nm for purified HCV-1a and HCV-1b Core+1/S proteins, respectively. Assuming a coil model, these radii are equivalent to particles of nearly 15 and five monomers for HCV-1a and HCV-1b, respectively. The radius distribution indicates the polydisperse character of both isoforms. As the proteins eluted as monomers from a size exclusion chromatography column, it appears that multimerization occurs during and/or after concentration.
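As background to the hydrodynamic radii quoted above, DLS measures a translational diffusion coefficient, which is conventionally converted to a hydrodynamic radius with the Stokes-Einstein relation, Rh = kT/(6πηD). The sketch below illustrates only this general relation; the temperature and viscosity defaults (25 °C, water at 0.89 mPa·s) are assumptions, and it does not reproduce the DYNALS coil model used to estimate the number of monomers.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def diffusion_coefficient(rh_nm, temperature_k=298.15, viscosity_pa_s=0.89e-3):
    """Translational diffusion coefficient (m^2/s) of a sphere of radius rh_nm."""
    return BOLTZMANN * temperature_k / (6 * math.pi * viscosity_pa_s * rh_nm * 1e-9)


def hydrodynamic_radius_nm(d_m2_s, temperature_k=298.15, viscosity_pa_s=0.89e-3):
    """Hydrodynamic radius (nm) from a measured diffusion coefficient (m^2/s)."""
    return BOLTZMANN * temperature_k / (6 * math.pi * viscosity_pa_s * d_m2_s) * 1e9


# Radii reported in the text for untreated HCV-1a and HCV-1b Core+1/S.
for label, radius in (("HCV-1a", 4.5), ("HCV-1b", 2.5)):
    print(label, f"{diffusion_coefficient(radius):.2e} m^2/s")
```

Because D is inversely proportional to Rh, the larger HCV-1a particles diffuse 1.8 times more slowly than the HCV-1b particles under identical conditions.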
We previously showed that HCV-1a Core+1/S is localized in the endoplasmic reticulum membranes [24]. Under the hypothesis that HCV-1b Core+1/S contains membrane localization determinants, we added octyl glucoside [n-octyl-β-D-glucoside (OG)], a nonionic detergent that is frequently used to solubilize integral membrane proteins. The presence of OG sharpened the size distributions of Core+1/S proteins as observed with DLS, and thus lowered the polydispersity in particle sizes, although the average hydrodynamic radii were not significantly altered (Fig. 6, middle panels).
When the proteins were subjected to a heat pulse, the average hydrodynamic radii shifted from 4.5 ± 2.4 to 1.8 ± 0.6 nm for HCV-1a Core+1/S, and from 2.5 ± 1.2 to 1.6 ± 0.6 nm for HCV-1b Core+1/S (Fig. 6, lower panels), suggesting a transition to lower-size oligomers. In addition, the polydispersity significantly decreased. Thus, high temperature is able to disrupt Core+1/S multimers without leading to protein precipitation.
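For size distributions too narrow to be fitted with a Gaussian, the Fig. 6 legend states that the average radius was obtained by weight averaging of the intensities. A minimal sketch of such an intensity-weighted mean is given below; the two-bin histogram is hypothetical and is not taken from the DLS data.

```python
def intensity_weighted_radius(radii_nm, intensities):
    """Intensity-weighted mean radius: sum(I_i * R_i) / sum(I_i)."""
    if len(radii_nm) != len(intensities):
        raise ValueError("radii and intensities must have the same length")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return sum(r * i for r, i in zip(radii_nm, intensities)) / total


# Hypothetical two-bin size distribution (radius in nm, relative intensity).
print(round(intensity_weighted_radius([2.0, 3.0], [0.4, 0.6]), 2))  # prints 2.6
```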
Fig. 3. Biochemical analysis of purified native HCV Core+1/S proteins. (A) Size exclusion chromatography of HCV Core+1/S proteins. After TEV protease proteolysis, proteins were injected onto a HiLoad 16/60 Superdex 75 column in the presence of arginine and glutamic acid (50 mM each). The mass distribution in the eluant is indicated at the top. Both HCV-1a Core+1/S (dotted line) and HCV-1b Core+1/S (bold line) eluted as monomers, according to the column calibration. (B) Coomassie blue staining of purified proteins by Tris/Tricine SDS/PAGE. Molecular masses are given on the left, and arrows indicate the expected expression products. (C) Western blot analysis of purified Core+1/S proteins. After purification and concentration, Core+1/S proteins were analyzed by western blotting using anti-HCV-1a Core+1 or anti-HCV-1b serum. Left panel: Ponceau staining of HCV Core+1/S proteins and HPV16 E6. Middle panel: HCV-1a Core+1/S revealed by anti-HCV-1a Core+1 serum. Right panel: HCV-1b Core+1/S revealed by anti-HCV-1b Core+1 serum. Molecular masses are indicated on the left.
CD analysis of potential secondary structure of Core+1/S proteins
CD spectra were recorded for both proteins in the far-UV region. Globally, CD spectra for HCV-1a (Fig. 7A) and HCV-1b (Fig. 7B) Core+1/S proteins did not show the characteristics of a full random coil conformation (a strong negative minimum at 195–198 nm, and a weak negative signal at 220 nm) [34]. Instead, we observed a maximum at 195 nm and a minimum at 220 nm, suggesting the existence of β-sheet secondary structure. Deconvolution of the CD data was performed using three sets of reference proteins and the algorithms provided by the CDPRO suite [35]. As SELCON3 failed several times to fit the CD data, this program was not used for data analysis. However, both CDSSTR and CONTIN/LL gave consistent results, and allowed the contributions of structural elements to be estimated. The percentages of α-helix (α), β-sheet (β) and unordered (U) structures were 5%, 30% and 65%, respectively, with a typical range of variation of 10–20% (Fig. 7C). Although the high content of unordered structure is consistent with disorder prediction, a significant amount of β-sheet content seems to be present. Such a signal might be due to the presence of intrinsic β-sheet structure in Core+1/S protein. Alternatively, it might correspond to β-sheet structure formed at the interface of Core+1/S monomers upon multimerization, as it has been shown that β-sheet structure is predominant in aggregates and is often associated with intrinsically disordered proteins [36].
Finally, the CD spectrum recorded for an HCV-1b Core+1/S sample subjected to a heat pulse was slightly different from that of an unheated sample (Fig. 7D). In contrast, the addition of OG induced drastic changes in the CD spectrum as compared with the untreated sample spectrum for both Core+1/S proteins (Fig. 7A,B), suggesting an effect of OG on the conformation of HCV-1b Core+1/S. However, the addition of OG prevented the recording of data at wavelengths below 206 nm, hindering the deconvolution of data.
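The CD data in Fig. 7 are expressed as molar ellipticity per residue. The standard conversion from a raw signal θ (in mdeg) is [θ] = θ/(10·c·l·n), with c the molar protein concentration, l the path length in cm and n the number of residues. The sketch below illustrates this conversion; the raw signal, the 0.1 cm path length and the residue count are assumptions for the example (only the 4 µM concentration is taken from the figure legend), and some authors normalize by the number of peptide bonds (n − 1) instead.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert a raw CD signal (mdeg) to ellipticity per residue (deg cm^2 dmol^-1)."""
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, path length and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical reading: -2 mdeg for a 58-residue polypeptide at 4 uM in a 0.1 cm cell.
value = mean_residue_ellipticity(-2.0, 4e-6, 0.1, 58)
print(f"{value:.0f} deg cm^2 dmol^-1")
```

Normalizing per residue is what allows spectra of HCV-1a and HCV-1b Core+1/S, which differ in length, to be compared on the same scale and submitted to deconvolution.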
NMR analysis of HCV-1b Core+1/S
In order to further investigate the structural properties of Core+1/S proteins, NMR 1H–15N heteronuclear single quantum coherence (HSQC) experiments were performed for both HCV-1b Core+1/S (Fig. 8A) and HCV-1a Core+1/S (Fig. S2). Both spectra exhibit a rather narrow amide proton chemical shift dispersion, limited to 0.7 p.p.m. Such a range is characteristic of a lack of structural organization of the backbone [37]. The spectrum recorded for HCV-1a Core+1/S showed a high number of overlapping peaks, impeding the accurate counting of peaks. In contrast, the HSQC spectrum of HCV-1b Core+1/S allows the counting of a number of peaks consistent with that expected from the protein sequence.
In order to assign backbone frequencies of the polypeptide, three-dimensional NMR experiments were performed on a 15N,13C-labeled HCV-1b Core+1/S
Fig. 4. Reactivity of sera from genotype 1 HCV-infected patients against HCV-1b Core+1/S. The sera from HCV-infected patients were tested by enzyme immunoassay, using the native HCV-1b Core+1/S. Controls correspond to HCV-negative patient sera. The cutoff value was determined as the average of HCV-negative sera absorbance plus two standard deviations.
Fig. 5. Intrinsic fluorescence of Core+1/S proteins. UV fluorescence emission spectra of Core+1/S proteins were recorded in 20 mM sodium phosphate buffer (pH 6.8, 2 mM). (A) Fluorescence emission spectra of HCV-1a and HCV-1b Core+1/S proteins in buffer. (B) Fluorescence emission spectra of HCV-1b Core+1/S proteins were recorded after boiling the protein for 20 min and cooling to room temperature.
sample. Near-complete 1HN, 15N-backbone and 13C-resonance assignment could be achieved for HCV-1b Core+1/S (Fig. 8A), with the exception of His14 and Ser38, as well as the first two residues (Gly-Ala) remaining from the TEV protease site. The lack of His14 resonances might be due to protonation–deprotonation equilibrium of the imidazole ring [38]. The absence of Ser38 resonances needs to be further investigated. We used experimental carbon chemical shifts to probe the presence of helical or β-sheet secondary structures. For all residues of HCV-1b Core+1/S, Cα secondary chemical shifts were below 1.0 p.p.m. (positive or negative) (Fig. 8B), confirming the absence of stable secondary structure elements in HCV-1b Core+1/S. However, a consensus was observed for residues encompassing the region between residues 32 and 35, suggesting that this region might have a tendency to β-sheet character. Interestingly, the same region was predicted to contain β-sheet elements by the majority of the secondary structure prediction methods, and also corresponds to a nondisordered region according to GLOBPLOT analysis (Fig. 1B).
Methods based on chemical shifts are often used to depict secondary structure elements, but quantitative interpretation of secondary chemical shifts alone remains difficult, because the expected values for fully formed secondary structures vary for different amino acids [39]. In order to quickly visualize the fractional deviation of the experimental chemical shifts from pure α-helix or β-sheet secondary shifts, residue-specific secondary structure propensity (SSP) scores of HCV-1b Core+1/S were calculated on the basis of SSP software recommendations [40]. SSP combines chemical shifts from different nuclei weighted by their sensitivity to α-helix or β-sheet structures into a single SSP score varying between 0 and 1, or 0 and −1, for α-helix and β-sheet structures, respectively. These scores represent the expected fraction of α-helix or β-sheet secondary structure for each residue. Calculated scores of HCV-1b Core+1/S are very close to zero, indicating an overall low SSP. In particular, the SSP profile shows almost no propensity to adopt a helical conformation along the protein sequence. Although a mild propensity to adopt a β-sheet conformation is visible for residues encompassing the regions between 3 and 8, 32 and 35, and 41 and 44, it is very limited as compared to the maximal amplitude expected for a full β-sheet conformation.
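The secondary chemical shift analysis can be illustrated with a simplified sketch. A secondary shift is the difference between the observed shift and a random coil reference value; α-helices typically give positive Cα and carbonyl deviations with negative Cβ deviations, and β-strands the opposite pattern. The code below only checks this sign consensus and is not the SSP algorithm, which additionally weights each nucleus and normalizes to fully formed secondary structure. The observed shifts are hypothetical, and the random coil values are illustrative placeholders that should be replaced by a published reference table.

```python
def secondary_shifts(observed, random_coil):
    """Secondary shift per nucleus: observed minus random coil value (p.p.m.)."""
    return {nucleus: observed[nucleus] - random_coil[nucleus] for nucleus in observed}


def tendency(delta):
    """Qualitative tendency from the sign consensus of CA, CB and CO deviations."""
    if delta["CA"] > 0 and delta["CO"] > 0 and delta["CB"] < 0:
        return "helix-like"
    if delta["CA"] < 0 and delta["CO"] < 0 and delta["CB"] > 0:
        return "sheet-like"
    return "no consensus"


# Hypothetical observed shifts for one valine residue (p.p.m.).
observed_val = {"CA": 61.6, "CB": 33.4, "CO": 175.6}
# Placeholder random coil reference for valine (assumed values).
random_coil_val = {"CA": 62.2, "CB": 32.9, "CO": 176.3}

delta = secondary_shifts(observed_val, random_coil_val)
print(tendency(delta), {k: round(v, 2) for k, v in delta.items()})
```

In this example all three deviations stay below 1 p.p.m. in magnitude yet agree in sign, which is the kind of weak but consistent pattern described for residues 32–35.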
Finally, the 1H–15N-HSQC NMR spectrum recorded for HCV-1b Core+1/S in the presence of 6% OG (Fig. 8D) showed a few notable changes for Val21, Ile33, Trp34, Val35, Thr47, and five glycines distributed all over the sequence (Gly7, Gly8, Gly22, Gly30, and Gly50). These results suggest a possible weak interaction of HCV-1b Core+1/S with OG.
Discussion
HCV Core+1/S proteins are intrinsically disordered
Core+1/S proteins correspond to the C-terminal parts of most of the described HCV ARFPs. To date, neither biochemical nor biophysical data have been described for ARFPs. Here, we succeeded in producing the Core+1/S proteins from HCV-1a and HCV-1b genotypes, using the standard Escherichia coli BL21
Fig. 6. Size distribution histograms of HCV-1a and HCV-1b Core+1/S proteins determined by DLS. Twenty microliters of 80–100 µM protein samples in 20 mM sodium phosphate buffer (pH 6.8, 400 mM NaCl) were directly analyzed, incubated with OG, or subjected to a heat pulse prior to analysis. Samples were analyzed by DLS, and the hydrodynamic radius distributions of Core+1 proteins were determined using DYNALS, assuming a coil model. Solid lines are the three-parameter nonlinear least squares fits of the size distribution profiles using a Gaussian model, yielding average radii (R ave) and widths at half-height (s). When the profile exhibits only two values, an average radius was determined by weight averaging of the intensities. Panel annotations (R ave; s): HCV-1a Core+1/S: control, 4.5 nm; 2.4 nm. OG, 4.0 nm; n/a. Heat pulse, 1.8 nm; 0.6 nm. HCV-1b Core+1/S: control, 2.5 nm; 1.2 nm. OG, 2.6 nm; n/a. Heat pulse, 1.6 nm; 0.6 nm.
bacterial system. We optimized the expression and purification processes under native conditions, and obtained substantial amounts of native, highly pure, untagged proteins. We detected antibodies against recombinant HCV-1b Core+1/S in the sera of HCV-infected patients, suggesting that the protein might be expressed during HCV infection, either alone or as a part of a larger ARFP.
Combining the results of complementary biophysical techniques, our study showed that Core+1/S proteins lack secondary and tertiary structure. 1H–15N-HSQC NMR experiments performed on both HCV-1a and HCV-1b Core+1/S constructs showed a limited chemical shift dispersion of amide proton resonances into a narrow range (0.7 p.p.m.). This is indicative of a disordered state, as inherent flexibility and rapid interconversion between multiple conformations generally lead to a poor chemical shift dispersion. Exceptions are the 15N-backbone resonances in 1H–15N-HSQC spectra of Core+1/S proteins. These resonances are influenced both by residue type and by the local amino acid sequence, and therefore remain well dispersed, even in fully unfolded states [41]. In addition, the distribution of correlation peaks around 10 p.p.m. in the HSQC spectrum, which are assigned to tryptophan side chains, indicates that these residues lie in a very similar environment, in agreement with fluorescence data indicative of solvent-exposed tryptophans. Together with the absence of consensus in the backbone carbon chemical shift differences, these observations suggest a lack of secondary structure for HCV-1b Core+1/S. This conclusion is further reinforced by the high content of unordered conformation (approximately 65%) determined by CD spectroscopy. Finally, the HSQC spectrum recorded for HCV-1a Core+1/S also displays a poor proton chemical shift distribution, suggesting that this protein is also disordered.
When subjected to a heat pulse, folded proteins commonly unfold and precipitate, owing to solvent exposure of hydrophobic residues, whereas nonfolded peptides may remain in solution [33]. We demonstrated that HCV-1b Core+1/S remains soluble after heat pulse treatment, as observed on fluorescence spectra. Moreover, DLS shows that the mass distribution shifts to lower molecular masses. This is confirmed by the observation in NMR spectra of more intense peaks following a heat pulse (data not shown). No significant change was observed in CD spectra after such treatment, indicating that this treatment does not influence the global conformation of the polypeptide.
Intrinsically disordered proteins (IDPs) are defined as proteins containing at least one disordered region, and were recently recognized as a new protein class [42]. Disordered proteins are gaining considerable attention, owing to their capacity to perform numerous biological functions despite their lack of defined
Fig. 7. Far-UV CD analysis of HCV Core+1/S proteins. Data are represented as molar ellipticity per residue as a function of wavelength (nm). Core+1/S proteins (4 µM) in 20 mM sodium phosphate buffer, 50 mM NaCl, and 0.15 mM dithiothreitol. (A, B) CD spectra of HCV-1a and HCV-1b Core+1/S proteins in buffer (solid line) or after incubation with 6% OG. (C) Far-UV data were analyzed with the CDPRO package, using two algorithms (CONTINLL and CDSSTR) and three protein databases (SP43, SMP56, and SDP48). α, α-helix; β, β-sheet; U, turns and unordered secondary structure.
structure [42–47]. Under native conditions, Core+1/S proteins remain unstructured, and should therefore be classed as IDPs. This character is also confirmed by disorder and structure predictions based on protein sequences. This is not the first time that an HCV protein has been reported to be at least partially disordered. Indeed, the first 82 amino acids of the N-terminal part of Core protein and domain 2 of NS5A protein have already been classed as IDPs [48–50]. Domain 3 of NS5A is also natively unfolded [51]. More generally, intrinsic disorder is commonly found in viruses. For instance, among Flaviviridae, Dengue virus, West Nile virus and bovine viral diarrhea virus capsid proteins contain flexible, basic regions [52–54]. Proteins from other virus families were also identified as being partially or completely disordered, such as the Nef protein of simian immunodeficiency virus [55], HIV Tat protein [56], and the nucleoprotein and phosphoprotein of the measles virus [57,58]. As virus genomes are restricted in molecular size, the flexible nature of disordered regions of proteins may allow efficient interaction with several targets [59].
HCV Core+1/S proteins tend to self-associate
The deconvolution of Core+1/S CD spectra suggested the presence of a significant proportion of β-sheet secondary structures (30%), in disagreement with the NMR-derived SSP. A first hypothesis to explain this is the difference in concentration range used to obtain CD and NMR data. However, the position and
[Figure 8: (A) 2D 1H–15N HSQC spectrum with residue assignments, 1H axis in p.p.m.; (B) Cα, Cβ and C0 chemical shift deviations versus amino acid sequence; (C) SSP score versus amino acid sequence; (D) HSQC overlay, without OG and with OG 6%]
Fig. 8. NMR results for HCV-1b Core+1/S. (A) Standard 2D 1H–15N HSQC spectrum recorded at 600 MHz and 22 °C on a 100 µM sample of HCV-1b Core+1/S. Each cross-peak corresponds to a correlation between an amide hydrogen atom and a nitrogen atom. Assignments have been deposited in the BMRB (Ref 16487). (B) Differences between experimental carbon chemical shifts and random coil values as a function of sequence number. (C) SSP of HCV-1b Core+1/S. Carbon chemical shifts were used to calculate the residue-specific SSP scores of HCV-1b Core+1/S by following the SSP software recommendations. Positive values ranging from 0 to 1 and negative values ranging from 0 to −1 represent the propensities to form pure α-helix and β-sheet structures, respectively. (D) Effects of the nonionic detergent OG on HCV-1b Core+1/S. The superimposition of 2D 1H–15N HSQC spectra of HCV-1b Core+1/S in the absence (blue) or presence (green) of 6% OG is shown.
bandwidth of peaks from HSQC spectra recorded with 30 or 400 µM HCV-1b Core+1/S samples are strictly identical (data not shown), suggesting the absence of a concentration effect, at least in this concentration range. On the other hand, because NMR reports on the protein at the atomic level, the discrepancy raises the question of potential problems with the experimental CD data collection and/or inappropriate reference databases used to fit the CD data. First, CD data were collected and analyzed following the key considerations described by Greenfield [60], allowing us to reasonably rule out data collection issues, although they cannot be fully excluded. Second, the reference databases are derived from globular soluble proteins, and include only a few disordered proteins. For instance, the SDP48 reference database employed in the present study contains only five denatured proteins in a total of 48 proteins. Therefore, these databases are poorly suited to nonglobular proteins, as peptides and disordered proteins tend to adopt multiple conformations in equilibrium rather than a single structure.
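The CD spectra discussed here are expressed as molar ellipticity per residue. As a minimal sketch of that normalization, the following Python function converts an observed ellipticity in millidegrees into mean residue ellipticity (deg·cm2·dmol−1) using the standard relation [θ]MR = θ(mdeg) / (10 × C × l × N). The function name, the cuvette path length, the residue count and the raw signal are assumptions made for illustration; only the 4 µM protein concentration is taken from the text. Some conventions divide by the number of peptide bonds (N − 1) instead of N.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity.

    Result is in deg·cm^2·dmol^-1, using
    [theta]_MR = theta_mdeg / (10 * C[mol/L] * l[cm] * N).
    """
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, path length and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical inputs: only the 4 uM concentration comes from the text.
raw_signal_mdeg = -2.0   # assumed raw reading at one wavelength
path_length_cm = 0.1     # assumed 1 mm cuvette
residue_count = 57       # assumed chain length, not stated in this section

value = mean_residue_ellipticity(raw_signal_mdeg, 4e-6, path_length_cm, residue_count)
print(f"[theta]_MR = {value:.0f} deg·cm^2·dmol^-1")
```

Normalizing per residue makes spectra from proteins of different lengths and concentrations comparable, which is what allows deconvolution against reference databases such as SDP48.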
Although the CD results might overestimate the β-sheet content, both CD and NMR data qualitatively indicate a β-sheet secondary structure propensity. This observation suggests that the detected β-sheet signal could be due to partial oligomerization of the natively disordered HCV Core+1/S proteins. This hypothesis is also supported by the DLS results, which reveal the existence of relatively high molecular mass particles in protein samples, even though the proteins had previously been purified in a monomeric form by size exclusion chromatography. The residues involved in such oligomerization might be located in the core of the protein between Ile33 and Val35, as suggested by the chemical shift deviations from random coil values. Despite their lack of folded and globular structure, intrinsically disordered states of proteins often possess significant amounts of transient structure [47].
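To make the notion of chemical shift deviations from random coil values concrete, the sketch below computes secondary shifts (observed minus random coil) for Cα and Cβ and combines them as ΔCα − ΔCβ, a simple indicator that tends to be positive in helical segments and negative in extended, β-like segments. This is not the SSP algorithm used in the study. The random coil values are approximate and included for illustration only, and the observed shifts, the 0.5 p.p.m. threshold and all function names are hypothetical.

```python
# Approximate random-coil 13C shifts (p.p.m.); illustrative values only.
RANDOM_COIL = {
    "I": {"CA": 61.1, "CB": 38.8},
    "W": {"CA": 57.5, "CB": 29.6},
    "V": {"CA": 62.2, "CB": 32.9},
}


def secondary_shifts(residue_type, observed):
    """Return observed minus random-coil shift for CA and CB."""
    reference = RANDOM_COIL[residue_type]
    return {atom: observed[atom] - reference[atom] for atom in ("CA", "CB")}


def ca_minus_cb(residue_type, observed):
    """Combine the CA and CB secondary shifts into one score."""
    delta = secondary_shifts(residue_type, observed)
    return delta["CA"] - delta["CB"]


def tendency(score, threshold=0.5):
    """Classify a score; the threshold is an arbitrary illustrative cutoff."""
    if score > threshold:
        return "helix-like"
    if score < -threshold:
        return "beta-like"
    return "coil-like"


# Hypothetical observed shifts, NOT the deposited assignments.
observed_shifts = [
    ("I33", "I", {"CA": 60.3, "CB": 39.6}),
    ("W34", "W", {"CA": 56.9, "CB": 30.3}),
    ("V35", "V", {"CA": 62.2, "CB": 32.9}),
]

for label, residue_type, shifts in observed_shifts:
    score = ca_minus_cb(residue_type, shifts)
    print(label, round(score, 2), tendency(score))
```

Subtracting the Cβ deviation from the Cα deviation cancels referencing offsets common to both nuclei, which is one reason combined scores are preferred over single-nucleus deviations when the expected effects are small, as in disordered proteins.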
Biological roles of Core+1/S proteins
Most HCV proteins contain membrane anchor domains [61]. The presence of hydrophobic patches on Core+1/S protein sequences supports the hypothesis that the proteins might contain membrane association determinants, which may partially explain the polydisperse behavior of the protein in aqueous solution. Interestingly, confocal microscopy and Triton X-100 cell fractionation have previously demonstrated that HCV-1a Core+1/S localizes in internal membranes and the endoplasmic reticulum of transiently transfected Huh7 cells [25]. Furthermore, the Core protein itself has been found to be associated with membranes [48]. The influence on Core+1/S behavior of OG, a nonionic detergent known to solubilize integral membrane proteins, was therefore investigated further. DLS showed that OG micelles reduce Core+1/S dispersity. Moreover, the CD spectrum showed a change on the addition of the detergent. Finally, HSQC experiments showed that only a few residues are affected by the presence of OG. Taken together, these results are indicative of a possible weak interaction with the detergent, as is often observed for IDPs. However, further experimental data on the structural characterization of a putative interaction between Core+1/S proteins and membranes, and comparison with the membrane association properties of the Core protein, would be required.
The presence of circulating antibodies against the HCV Core+1/S proteins suggests that their expression might occur at a certain stage of HCV infection. Furthermore, the facts that Core+1/S proteins are disordered under native conditions, and that their ORFs are well conserved among HCV genotypes, support the hypothesis that the disordered nature of Core+1/S proteins might play a role during HCV infection. The disordered nature of the Core+1/S proteins, which confers conformational and recognition plasticity to the proteins, may be required for the binding of different partners through the same region, as is typical for natively disordered proteins [59]. This feature is often found for proteins involved in cell signaling and regulation [44]. Our study contributes to the characterization of the Core+1/S proteins, providing new insights into their biophysical properties. Further studies will be required to identify the cellular targets of Core+1/S proteins, enabling the characterization of the role of Core+1/S proteins in HCV pathogenicity.
Experimental procedures
Protein sequence analysis
To analyze the degree of conservation of the Core+1/S amino acid sequence among HCV genotypes, Core+1/S amino acid sequences were deduced from the Core+1 ORFs of different HCV genotypes retrieved from the NCBI website (http://www.ncbi.nlm.nih.gov) [5] and aligned using
was performed using GlobPlot [63] and PONDR [64]. Secondary structure predictions were performed on HCV-1a and HCV-1b Core+1/S, using four algorithms (SOPMA,
website (http://pbil.ibcp.fr/)