Folding of epidermal growth factor-like repeats from human tenascin studied through a sequence frame-shift approach

Francesco Zanuttin, Corrado Guarnaccia, Alessandro Pintar and Sándor Pongor

International Centre for Genetic Engineering and Biotechnology (ICGEB), Protein Structure and Bioinformatics Group, Trieste, Italy

In order to investigate the factors that determine the correct folding of epidermal growth factor-like (EGF) repeats within a multidomain protein, we prepared a series of six peptides that, taken together, span the sequence of two EGF repeats of human tenascin, a large protein from the extracellular matrix. The peptides were selected by sliding a window of the average length of tenascin EGF repeats over the sequence of EGF repeats 13 and 14. We thus obtained six peptides, EGF-f1 to EGF-f6, that are 33 residues long, contain six cysteines each, and bear a partial overlap in the sequence. While EGF-f1 corresponds to the native EGF-14 repeat, the others are frame-shifted EGF repeats. We carried out the oxidative folding of these peptides in vitro, analyzed the reaction mixtures by acid trapping followed by LC-MS, and isolated some of the resulting products. The oxidative folding of the native EGF-14 peptide is fast and produces a single three-disulfide species with an EGF-like disulfide topology and a marked difference in the RP-HPLC retention time compared with the starting product. By contrast, frame-shifted peptides fold more slowly and give mixtures of three-disulfide species displaying RP-HPLC retention times that are closer to those of the reduced peptides. Unlike the native EGF-14, the three-disulfide products that could be isolated are mainly unstructured, as determined by CD and NMR spectroscopy. We conclude that both kinetics and thermodynamics drive the correct pairing of cysteines, and speculate about how cysteine mispairing could trigger disulfide reshuffling in vivo.

Keywords: EGF; folding; disulfide; extracellular proteins

Tenascin-C [1–3] is a large extracellular matrix glycoprotein expressed during embryonic development and in proliferative processes such as wound healing and tumorigenesis. Although the function of tenascin is not fully understood, careful studies on tenascin-C-deficient mice recently highlighted the function of tenascin in hematopoiesis [4] and identified behavioral abnormalities that point to a role of tenascin in the development and maintenance of proper brain chemistry [5]. The cloning of tenascin unraveled its modular architecture [6–8]. The N-terminal region, which is responsible for tenascin oligomerization, is followed by a series of 14 epidermal growth factor-like (EGF) tandem repeats, 15 fibronectin type III domains and a C-terminal fibrinogen-like domain. Several studies attempted to map the different biological activities of tenascin to selected domains. Recently, a role of tenascin EGF repeats as immobilized, low affinity ligands for the EGF receptor (EGFR) has been proposed [9], stemming from the observation that selected EGF repeats of tenascin-C bind and directly activate EGFR and induce mitogenesis in mouse fibroblasts.

EGF domains [10] are 30–50 residue long repeats characterized by the strict conservation of six cysteine residues that form three disulfide bonds with the topology 1–3, 2–4, 5–6. The common structural feature of EGF domains is a two-stranded β-sheet from which the three disulfide bonds depart to connect the N- and C-terminal loops, to make a rather compact structure [11]. Apart from the six cysteines, the stretches connecting them vary widely in length and composition. Probably because of its capability to accommodate very different sequences on a common scaffold, the EGF domain is one of the most frequently employed building blocks in modular proteins. EGF domains are found in more than 300 human extracellular proteins [12]. As EGF domains occur very frequently as multiple tandem repeats, the total number in human proteins exceeds 4000 [12]. The oxidative folding in vitro of human EGF [13], as well as the folding of other small three-disulfide domains [14], has been studied in detail. The only general conclusion is that while the disulfide topology of the final product is well conserved, the oxidative folding pathway of EGF domains is rather complex and unpredictable [14]. In human EGF, the rapid formation of the second and third disulfide bonds leads to an intermediate species that accumulates and acts as a kinetic trap [13]; the native product, however, does not arise through the formation of the third disulfide bond, which is slow, but rather from the scrambling of disulfide bonds in other three-disulfide, non-native products. The disulfide bonds lock the conformation of the protein into a stable structure [15] even though not all the disulfide bridges are equally important for the maintenance of the 3D structure [16]. The human EGF precursor protein itself is in fact expressed as a 150–180 kDa multidomain type I membrane protein [17] containing nine EGF-like domains. The soluble 53 residue EGF corresponds to the ninth EGF domain in the precursor, from which it is released by proteolytic cleavage.

Correspondence to A. Pintar, International Centre for Genetic Engineering and Biotechnology (ICGEB), Protein Structure and Bioinformatics Group, AREA Science Park, Padriciano 99, I-34012 Trieste, Italy. Fax: +39 040 226555, Tel.: +39 040 3757354, E-mail: pintar@icgeb.org

Abbreviations: EGF, epidermal growth factor; TBTU, O-(benzotriazol-1-yl)-1,1,3,3-tetramethyluronium tetrafluoroborate; Trt, trityl.

(Received 28 June 2004, revised 26 August 2004, accepted 8 September 2004)

A long stretch of 14 tandem EGF repeats is found in the N-terminal region of human tenascin [8]. The 14 EGF repeats, which are encoded by a single exon [7], show a variable degree of similarity to each other, ranging from 35 to 74% identity, and are unusual in being only 31 residues long, with a spacing of 25 residues between the first and the last cysteine. The correct pairing of cysteines to form disulfide bridges is critical to reach the final native fold, and we wondered, on the one hand, what determines the pairing of cysteines to give the correct disulfide bond pattern within each repeat and, on the other hand, what drives this topology to be repeated in the same frame along the amino acid sequence of a multirepeat protein. In fact, little is known about the folding of large modular proteins that are targeted to the extracellular environment, and the inherent complexity of oxidative folding in cysteine-rich proteins requires a simple model system that can be studied in detail by physico-chemical methods.

With this purpose, we prepared, by solid-phase synthesis, a series of six peptides (Fig. 1) that, taken together, span the sequence of the two last EGF repeats of human tenascin, EGF-13 and EGF-14. The peptides were designed by selecting a window of 33 amino acids, which corresponds to the average length of the tenascin EGF repeats, and sliding this window over the amino acid sequence of tenascin EGF repeats 13 and 14 (residues 560–622). The window was slid by one cysteine at each step, thus obtaining six peptides, named EGF-f1 to EGF-f6, that are all 33 residues long, contain six cysteines, and bear a partial overlap in their sequences. While f1 corresponds to the putative EGF-14 repeat, the others are frame-shifted EGF repeats. We carried out the oxidative folding of these peptides in the presence of a redox couple, analyzed the reaction mixture by acid trapping followed by LC-MS, compared the different folding profiles, and characterized some of the three-disulfide products that are formed.

We discuss the significance of the frame-shift approach in terms of the kinetic and thermodynamic aspects that drive the correct folding of EGF repeats within multidomain proteins, and in relation to the folding in vivo of disulfide-rich proteins.

Experimental procedures

Reagents

Fmoc-protected amino acids were purchased from Chem-Impex International (Wood Dale, IL, USA), Fluka (Buchs, Switzerland), Advanced Biotech Italia (Seveso, Italy) and NovaBiochem (Darmstadt, Germany). TentaGel S trityl (Trt) resins loaded with the required Fmoc-protected amino acids (Fluka) were chosen as solid supports. The resin capacity ranged from 0.18 to 0.2 mmol·g⁻¹. Synthesis-grade reagents employed in the peptide synthesis were from Biosolve LTD (Valkenswaard, the Netherlands) except 2,6-dimethylpyridine and diisopropylethylamine, which were obtained from Aldrich (Steinheim, Germany).

Chemicals used in cleavage and deprotection steps were from Aldrich and Fluka, trifluoroacetic acid from Biosolve. HPLC grade acetonitrile for chromatographic separations was obtained from Riedel-de Haën (Seelze, Germany). Endopeptidase AspN (27750 U·mg⁻¹) and thermolysin (8560 U·mg⁻¹) were from Calbiochem (Darmstadt, Germany).

Peptide synthesis

The 33 amino acid peptide corresponding to residues 590–622 of human tenascin-C (Swiss-Prot: TENA_HUMAN), EGF-14 (Fig. 1), was synthesized by a solid-phase Fmoc-based strategy. The synthesis was automatically performed with a PS3 Protein Technology (Tucson, AZ, USA) synthesizer on a 0.07-mmol scale. The Fmoc-protected amino acid, the coupling reagent [TBTU; O-(benzotriazol-1-yl)-1,1,3,3-tetramethyluronium tetrafluoroborate], and the base (diisopropylethylamine), in a ratio of 1 : 1 : 2, were dissolved in dimethylformamide using a four molar excess with respect to the initial resin substitution, to give a final amino acid concentration of 0.3 M. Each coupling step took 45 min from S2 to C16 and 1.5 h from C16 to G33. Fmoc deprotection was carried out in 20% (v/v) piperidine in dimethylformamide for 5 min and the reaction repeated twice. Cysteine residues were added manually as N-α-Fmoc-S-trityl-L-cysteine pentafluorophenyl ester [Fmoc-Cys(Trt)-OPfp] dissolved in dimethylformamide in a 2-h reaction in order to avoid cysteine racemization [18]. The side chain-protected peptide-resin was washed with dichloromethane, dried, cleaved and deprotected in 90% (v/v) trifluoroacetic acid, 5% (v/v) 1,2-ethanedithiol, 2.5% (v/v) triisopropylsilane, 2.5% (v/v) water and phenol (0.5 M) for 2 h at room temperature. The solution was filtered in order to remove the resin and trifluoroacetic acid was evaporated in vacuum. The deprotected peptide was dissolved in water, the solution extracted five times with 6–8 volumes of diethyl ether to remove scavengers, and finally freeze-dried. The crude EGF-14 peptide was purified by RP-HPLC on a Gilson chromatographic apparatus using a Zorbax 300SB-C18 9.4 × 250 mm column (Agilent) with a linear gradient of triethylammonium acetate buffer (25 mM, pH 7) and triethylammonium acetate buffer (25 mM, pH 7) in water/acetonitrile 1 : 9 (v/v).

Fig. 1. Amino acid sequence of human tenascin EGF-repeats 13 and 14. Amino acid sequence of human tenascin (Swiss-Prot: TENA_HUMAN, residues 560–622) EGF-repeats 13 and 14, and of the synthesized peptides; f1 corresponds to EGF-14, while f2–f6 correspond to the different frame-shifted peptides. Cysteines are highlighted in gray, non-native residues are in italics, the limit between the two EGF-repeats is shown by an arrow. A model of the tandem repeats is also shown on top as a Cα trace. After a search for a suitable template with 3D-PSSM [38] the model was built by MODELLER [39] using the structure of an EGF pair from fibrillin (PDB: 1emn) as template. Because of the low sequence similarity between tenascin and fibrillin EGF repeats (38% identity) the model is only approximate. To map the synthesized peptides over the structure, peptide limits are pinpointed by spheres and labeled by residue number.

Frame-shifted peptides f2 to f6 (Fig. 1) were manually synthesized by a standard stepwise solid-phase procedure on a 0.1-mmol scale. In f3, an Ala residue was inserted instead of Ile at the N-terminus and an extra Ser residue was added at the C-terminus to avoid the presence of bulky aliphatic residues or Cys, respectively, at the peptide ends. The coupling reactions were performed with 4 eq. of the Fmoc-protected amino acids and activator (TBTU), and 8 eq. of diisopropylethylamine in dimethylformamide for 1 h. Kaiser's ninhydrin test [19] was systematically applied after each coupling in order to check reaction completion. Fmoc protecting groups were removed by a 20% (v/v) solution of piperidine in dimethylformamide containing 0.1 M 1-hydroxybenzotriazole. When necessary, a second coupling reaction was made using 1H-benzotriazol-1-yl-oxy-tris(pyrrolidino)phosphonium hexafluorophosphate (PyBop) or O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate as coupling reagents in dimethylformamide. The double coupling procedure was systematically adopted with cysteine amino acids. Also in this case, Fmoc-Cys(Trt)-OPfp was employed in the first reaction to minimize Cys racemization [18]. In the second coupling, a four molar excess solution of Fmoc-Cys(Trt)-OH, TBTU, 2,6-dimethylpyridine, in 1 : 1 : 2 ratio in dichloromethane/dimethylformamide 1 : 1 (v/v) was used. Peptides were cleaved from the resin and deprotected as described for EGF-14. Preparative RP-HPLC of frame-shifted peptides f2–f6 was carried out on the same Gilson chromatographic apparatus using a PrePak Cartridge 25 × 100 mm (Agilent) mounted on a PrepLC Universal Base apparatus (Waters) and a Zorbax 300SB-C18 9.4 × 250 mm (Agilent). Samples were eluted using a linear gradient of water/trifluoroacetic acid 0.1% (v/v) (buffer A) and acetonitrile/trifluoroacetic acid 0.1% (v/v) (buffer B).

Analysis of all peptides was carried out on the same chromatographic system. Sample elution was followed by UV detection at 214 nm. Two Zorbax 300SB-C18 columns (Agilent) of different diameters were used: 1.0 × 150 mm, 3.5 µm and 4.6 × 150 mm, 3.5 µm with the same buffers. The identity of the peptides was checked by LC-MS (see below).

Oxidative folding

After purification by RP-HPLC in acidic conditions, the reduced and lyophilized peptides (EGF-14 and f2–f6) were dissolved in an acidic water solution [trifluoroacetic acid 0.01% (v/v)] and immediately diluted 10× in the refolding buffer [0.1 M ammonium acetate, 2 mM EDTA, Cys/cystine 20 : 1 (w/w), pH 8.5] previously flushed with nitrogen. The molar ratio between peptide cysteines and the cystine in the redox couple was 10 : 1. Comparable amounts of each peptide, as estimated by UV absorbance, were dissolved in a final volume of 5 mL and used in time course refolding experiments. Aliquots of the reaction mixtures were quenched at selected times (2.5, 5, 10, 15, 20, 30, 40, 60, 90, 120 min, 4 and 24 h) by acidification with trifluoroacetic acid to yield a final 2% (v/v) trifluoroacetic acid concentration and stored at −80 °C. The different species were identified by LC-MS analysis (see the Mass spectrometry section) and quantified by peak integration of the RP-HPLC profile (UV detection at 214 nm).

The final products from the folding reactions of EGF-14, f5, and f6 were purified by RP-HPLC using a Zorbax 300SB-C18 (4.6 × 150 mm, 3.5 µm) column with the same buffers A and B.

Disulfide bond determination of EGF-14

To define the intramolecular disulfide topology, EGF-14 was treated with thermolysin and AspN. In the first reaction, 40 µg of the purified peptide dissolved in a 10 mM, pH 6.0 buffer was digested with 12 µg of thermolysin for 12 h at 37 °C. A part of the digest was further incubated with AspN for 6 h at 37 °C. Peptide fragments obtained from the digestions were fractionated by RP-HPLC with a water/acetonitrile 0.1% (v/v) trifluoroacetic acid linear gradient and analyzed by LC-MS. Reactions in the absence of the enzyme and in the absence of the substrate were used as negative controls.

Mass spectrometry

MS analysis was carried out with an API 150 EX single quadrupole mass spectrometer (PE/Sciex, Thornhill, Canada) equipped with an ion spray source. The identity of the synthesized peptides was checked and the digestion mixtures analyzed by LC-MS using a Zorbax 300SB C18 column (1.0 × 150 mm, 3.5 µm) (Agilent) with a linear gradient of buffer A and B at a flow rate of 50 µL·min⁻¹. The analysis was performed in positive-ion mode. Time-course refolding experiments were followed by LC-MS in the same conditions. In order to detect all possible disulfide species, the MS spectrum was acquired over two 20 atomic mass units wide windows centered on the m/z values corresponding to the doubly and triply charged reduced peptide. The mass spectrometer was run in total ion count mode with a step of 0.1 atomic mass units and a 1.5-ms dwell time, the orifice voltage being set at 30 V. The reconstruction of the original molecular mass of the peptides was achieved using the BioMultiview software (Applied Biosystems).
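The choice of narrow acquisition windows follows from simple mass arithmetic: each disulfide removes two hydrogens, so the fully oxidized peptide lies only about 6 Da below the reduced one. The following is a minimal Python sketch, not part of the original analysis. It assumes the expected average mass of reduced EGF-14 from Table 1 and standard hydrogen and proton masses; the function names are illustrative.

```python
H_ATOM = 1.00794   # average mass of a hydrogen atom (Da), standard value
PROTON = 1.00728   # mass of a proton (Da), standard value


def oxidized_mass(reduced_mass, n_disulfides):
    """Average mass after n disulfides form; each bond removes two hydrogens."""
    return reduced_mass - 2 * H_ATOM * n_disulfides


def mz(mass, charge):
    """m/z of the [M + zH]z+ ion observed in positive-ion mode."""
    return (mass + charge * PROTON) / charge


reduced_egf14 = 3451.2  # expected mass of reduced EGF-14 (Table 1)

for n in range(4):  # 0 to 3 disulfide bonds
    m = oxidized_mass(reduced_egf14, n)
    print(f"{n} S-S: M = {m:.1f} Da, 2+ = {mz(m, 2):.2f}, 3+ = {mz(m, 3):.2f}")
```

Under these assumptions the doubly charged ions of the reduced and three-disulfide forms differ by about 3 m/z units, so all oxidation states fall inside a 20-unit window centered on the reduced species.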

NMR spectroscopy

Samples for NMR spectroscopy were prepared by dissolving the lyophilized peptides in H2O/D2O (90/10, v/v) and adjusting the pH to 5.5 with 0.1 M NaOH. Sample concentration was 2 mM for EGF-14, 50 µM for f5b and f6b, 10–15 µM for f5a and f6a. Spectra were recorded on a Bruker Avance DRX 500 operating at a ¹H frequency of 500.12 MHz and equipped with a triple resonance, z-axis gradient cryo-probe and on a Bruker Avance DRX 700 operating at a ¹H frequency of 700.13 MHz and equipped with a triple resonance, z-axis gradient probe. TOCSY and NOESY spectra were recorded using a mixing time of 60 ms and 150 ms, respectively, and a WATERGATE [20] pulse scheme for solvent signal suppression. Typically, 2D experiments on f5b and f6b were recorded on the 500 MHz spectrometer equipped with the cryo-probe, acquiring 16 scans (64 for the NOESY), 256 experiments in the t1 dimension, and 4 k complex points. 1D experiments on f5b and f6b were recorded with 256 scans and 16 k complex points. For f5a and f6a, 1D experiments only were acquired on the 700 MHz spectrometer, typically with 2048 scans and 16 k points. Amide temperature coefficients were calculated from 2D TOCSY and 1D spectra recorded between 298 K and 302 K with a 1 K step. Additional spectra were recorded at 308, 313, and 318 K. Data were transformed using Xwin-NMR (Bruker BioSpin) and analyzed using XEASY [21]. Chemical shifts were referenced to sodium trimethylsilylpropionate.

CD spectroscopy

Samples for CD spectroscopy were prepared by dissolving the lyophilized peptides in water. Peptide concentration was determined by amino acid analysis. Briefly, hydrolysis was carried out for 60 min in vacuo at 150 °C in the presence of 6 M HCl containing 2% phenol (w/w). Derivatization of the amino acid mixture with phenylisothiocyanate was achieved according to the standard protocol of the PicoTag system (Waters). Analysis of free amino acids was performed by RP-HPLC on a PicoTag 3.9 × 300 mm column. The resulting peptide concentrations were: 47 µM (EGF-14), 9 µM (f5a), 51 µM (f5b), 17 µM (f6a), 51 µM (f6b). CD spectra were recorded on a Jasco J-810 spectropolarimeter using 0.1 cm and 1 cm quartz cuvettes. CD spectra of f5b and f6b were recorded between 250 and 190 nm (0.1 cm cuvette) and between 350 and 250 nm (1 cm cuvette); CD spectra of f5a and f6a were recorded between 250 and 190 nm in a 1-cm cuvette. CD spectra of native EGF-14 were recorded between 250 and 190 nm using a 0.1-cm cuvette and between 350 and 250 nm with the same path length and a 5× solution. For each sample, five scans were acquired, the baseline subtracted from the raw spectra, and the mean residue ellipticity (MRE, deg·cm²·dmol⁻¹) was calculated by dividing the CD signal intensity (mdeg) by 10 × c × l × N, where c is the peptide concentration (M), l the path length (cm), and N the number of residues.
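The MRE conversion described above can be written as a one-line function. The sketch below is illustrative only: the concentration, path length and residue count are the EGF-14 values quoted in the text, while the −10 mdeg reading is a hypothetical input chosen for the example, not a measured value.

```python
def mean_residue_ellipticity(signal_mdeg, conc_molar, path_cm, n_residues):
    """Convert a CD signal (mdeg) to MRE in deg·cm²·dmol⁻¹.

    Implements MRE = signal / (10 × c × l × N) as given in the text.
    """
    return signal_mdeg / (10 * conc_molar * path_cm * n_residues)


# EGF-14: 47 µM, 0.1 cm cuvette, 33 residues (values from the text).
# The -10 mdeg reading is a hypothetical input, not a measured value.
mre = mean_residue_ellipticity(-10.0, 47e-6, 0.1, 33)
print(f"MRE = {mre:.0f} deg cm^2 dmol^-1")
```

Because the concentration enters the denominator directly, the accuracy of the amino acid analysis sets the accuracy of the MRE scale.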

A quantitative estimation of secondary structure content was carried out using different methods: SELCON3 [22], CONTINLL [23], CDSSTR [24,25], and K2D [26]. SELCON3, CONTINLL, and CDSSTR were run from the DichroWeb web server [27], K2D from the K2D web server [26]. SELCON3, CONTINLL, and CDSSTR were applied using a reference data set of 49 proteins, including five denatured proteins, with a wavelength range of 240–190 nm [28]. K2D does not require any reference data set and makes use of data between 240 and 200 nm only.

Results

Peptide synthesis

All peptides were prepared by standard solid-phase Fmoc-based methods, either in automatic or manual fashion. After cleavage/deprotection of the peptide-resin, the identity of the peptides was checked by LC-MS (Table 1) and the yield and purity estimated from RP-HPLC (Table 1). While products of Cys racemization were not observed, the most common side products were des-Cys-peptides and piperidide-derivative peptides. The extent of the latter modification was minimized (< 5%, as estimated by LC-MS) using 1-hydroxybenzotriazole in dimethylformamide during Fmoc deprotection and keeping deprotection times to a minimum. Deprotected, reduced peptides were purified by RP-HPLC to obtain final yields in the range 21–57% (Table 1) and purity > 99%.

Oxidative folding

The purified, lyophilized peptides were refolded in the presence of the Cys/cystine redox couple. The time course of the refolding kinetics was followed both by LC-MS and RP-HPLC. LC-MS was used to monitor the formation of disulfide bonds from the loss of two atomic mass units in molecular mass for each disulfide formed, while RP-HPLC was used to measure retention times and quantify the decrease of the starting product by UV detection at 214 nm. The reduced peptides convert rapidly into a mixture of one- and two-disulfide products (Fig. 2), which undergo a slower oxidation and reshuffling to give several three-disulfide isomers in frame-shifted peptides, or a largely predominant product for EGF-14. Under our refolding conditions, EGF-14 was rapidly oxidized and in 2 h transformed into the native three-disulfide species. By contrast, the oxidative folding of frame-shifted peptides f2–f6 resulted in a complex mixture of oxidized isomers in all cases. The equilibrium pattern was reached within 24 h (Fig. 3) and after this time changes in the relative abundance of the species or formation of new products were not observed. The LC-MS analysis confirmed that all products in the final mixtures are three-disulfide isomers.

Table 1. Peptide synthesis. Yield (%) of the crude deprotected peptide as estimated by weight; purity (%) of the crude deprotected peptide as estimated by RP-HPLC; final yield (%); expected and observed average molecular mass (Da) of the reduced, purified peptide.

Peptide   Yield (%)   Purity (%)   Final yield (%)   Expected mass (Da)   Observed mass (Da)
EGF-14    85.5        66.4         56.7              3451.2               3452.0
f2        84.6        46.5         39.3              3483.4               3484.3
f3        86.1        44.1         38.0              3398.3               3398.0
f4        86.5        58.6         50.7              3327.3               3328.0
f5        83.4        48.3         40.3              3477.3               3478.7
f6        83.1        25.0         20.7              3451.3               3451.5

The quantitative analysis of the RP-HPLC profiles showed that the rates of disappearance of the reduced forms are similar but not identical (Fig. 4A). A fit of experimental data with a three-parameter negative exponential curve (R > 0.99, data not shown) gave an apparent rate constant value of 0.54 min⁻¹ for EGF-14, and values in the range 0.22–0.26 min⁻¹ for f3–f6; the fit for f2 was less good, but still gave a value (0.4 min⁻¹) that is slightly smaller than that obtained for EGF-14. A noteworthy difference in the rate of formation of three-disulfide peptides was also observed (Fig. 4B). EGF-14 reached its fully oxidized form faster than the other peptides as demonstrated by LC-MS and RP-HPLC analysis (Figs 2 and 4).
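To put the apparent rate constants on a more intuitive scale, they can be converted to half-lives. The sketch below assumes that the three-parameter negative exponential has the form plateau + amplitude × exp(−k·t); the text does not give the exact parameterization, so this form and the function names are assumptions made for illustration.

```python
import math


def remaining_fraction(t_min, k_per_min, amplitude=100.0, plateau=0.0):
    """Assumed form of the fit: plateau + amplitude * exp(-k * t), in percent."""
    return plateau + amplitude * math.exp(-k_per_min * t_min)


def half_life(k_per_min):
    """Half-life (min) of a first-order decay with rate constant k."""
    return math.log(2) / k_per_min


# Apparent rate constants quoted in the text (min^-1).
rate_constants = {
    "EGF-14": 0.54,
    "f2": 0.40,
    "f3-f6 (lower bound)": 0.22,
    "f3-f6 (upper bound)": 0.26,
}

for name, k in rate_constants.items():
    print(f"{name}: t1/2 = {half_life(k):.1f} min")
```

With these values the reduced form of EGF-14 has a half-life of roughly 1.3 min, against about 2.7 to 3.2 min for f3–f6.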

A further difference between EGF-14 and the frame-shifted peptides is the change in the RP-HPLC retention time going from the reduced to the oxidized species. The final product of EGF-14 oxidative folding has a retention time that is considerably shorter than that of the reduced species (reduced form, 21.6 min; oxidized form, 7.7 min) (Table 2), while for frame-shifted peptides most products show retention time values only slightly shorter than that of the corresponding reduced peptide. Only in the case of f5 and f6 is the retention time of one of the final products significantly reduced compared with the starting product. To quantitatively compare the behavior of the different peptides, the chromatographic parameter α, defined as the ratio between the retention time of the oxidized product (RTox) and the retention time of the reduced peptide (RTred), was chosen. As shown in Table 2, EGF-14 displays the lowest α value.
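As a worked instance of the parameter α, the short sketch below applies the definition to the EGF-14 retention times quoted in the text. The function name is illustrative.

```python
def selectivity(rt_ox, rt_red):
    """Return (delta_rt, alpha), with delta_rt = RTred - RTox and alpha = RTox / RTred."""
    return rt_red - rt_ox, rt_ox / rt_red


# EGF-14 retention times quoted in the text (min).
delta_rt, alpha = selectivity(rt_ox=7.7, rt_red=21.6)
print(f"EGF-14: dRT = {delta_rt:.1f} min, alpha = {alpha:.2f}")
```

For EGF-14 this gives α ≈ 0.36, well below the values of 0.78–0.80 listed for f2b, f5b and f6b in Table 2.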

Fig. 2. Oxidative folding. RP-HPLC profiles of oxidative folding reactions, as detected by UV at 214 nm, of the different peptides at selected refolding times. The peak corresponding to the initial, fully reduced form is marked by the name of the peptide.

EGF-14 disulfide topology

The disulfide bond topology was determined by peptide mapping, tailored to the peptide sequence and the potential topology of disulfide bonds. EGF-14 was digested first by thermolysin. From the digestion two peptides were obtained, with molecular mass of 1403 and 2097 Da, respectively. The former product confirms the disulfide bridge between C611 and C620. The 2097 Da peptide, on the other hand, could not give an unequivocal answer about the two remaining bridges, which could be either C594–C604/C598–C609 or C594–C609/C598–C604. The 2097 Da peptide was therefore treated with AspN endopeptidase. The reaction gave two fragments of 810 and 985 Da, respectively, which can be produced by the AspN cleavage at D597 only in the case of a C594–C604/C598–C609 combination (Table 3). The experiment thus confirms that EGF-14 from human tenascin has a disulfide topology typical of EGF domains (1–3, 2–4, 5–6).
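The logic of the mapping experiment can be made explicit by enumerating the possible pairings. Six cysteines can form 15 distinct three-disulfide isomers; once the 5–6 bond is confirmed, three pairings remain for the other four cysteines, and the three-chain thermolysin fragment excludes the one in which C594 and C598, which lie on the same chain, are bonded to each other. The sketch below is an illustration of this counting, not the authors' procedure.

```python
def pairings(cysteines):
    """Yield every way to pair all cysteines into disulfide bonds."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


all_isomers = list(pairings([1, 2, 3, 4, 5, 6]))
print(len(all_isomers), "possible three-disulfide isomers")

# After thermolysin confirms C611-C620 (5-6), four cysteines remain.
four_cys = list(pairings([594, 598, 604, 609]))
# C594 and C598 sit on the same thermolysin chain, so a C594-C598 bond could
# not hold the three-chain 2097 Da fragment together.
candidates = [p for p in four_cys if (594, 598) not in p]
for p in candidates:
    print(p)
```

The two surviving candidates are the ones that the AspN digestion then discriminates, since cleavage at D597 separates C594 from C598.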

Fig. 3. Equilibrium mixtures. RP-HPLC profiles of oxidative folding reactions, as detected by UV at 214 nm, of the different peptides after 24 h. Retention times of labeled species are reported in Table 2.

Fig. 4. Oxidative folding kinetics. (A) Disappearance of the starting product (%, area of the initial reduced form with respect to the total integrated area) for EGF-14 (blue), f2 (green), f3 (red), f4 (light blue), f5 (black), f6 (orange). (B) Formation of three-disulfide species (%, area of a three-disulfide species with respect to the total integrated area) for EGF-14 (blue), f2 (green), f3 (red), f4 (light blue), f5 (black), f6 (orange); different species (a, b in Fig. 3) originating from the same peptide are shown as empty and filled triangles, respectively. Oxidative folding kinetics were followed by RP-HPLC and UV detection at 214 nm. Horizontal axes: time (min).

NMR

¹H NMR spectra of the peptides are reported in Fig. 5, and show the drastically different dispersion in the backbone amide chemical shifts of EGF-14 with respect to that of the frame-shifted peptides. TOCSY spectra of f5b and f6b (Fig. 6) were recorded at different temperatures between 298 K and 318 K. The chemical shift dispersion of the backbone NH groups did not change in this temperature range (7.8–8.8 p.p.m. at 318 K, 7.7–8.7 p.p.m. at 298 K for f5b; 7.6–8.6 p.p.m. at 318 K, 7.7–8.8 p.p.m. at 298 K for f6b), and neither did the dispersion in the Cα chemical shifts, but the appearance of TOCSY spectra in the NH/aliphatic region is different. At 318 K, strong and sharp cross-peaks are present in the fingerprint region (3.5–5.0 p.p.m.) of f5b and, although for some residues the magnetization transfer along the side chain was not very efficient, the number of identified spin systems corresponds to the expected value. By contrast, several cross-peaks are undetectable or have very low intensity at 298 K. After a tentative assignment of spin systems, the distribution of NH chemical shifts was compared with that expected for a random coil peptide of the same sequence [29], and with that of the native EGF-14, by plotting the percentage of NH peaks in each 0.1 p.p.m. chemical shift interval (Fig. 7). The chemical shift dispersion of backbone NHs in f5b (σ = 0.27) is two times larger than that expected for a random coil peptide of the same sequence (σ = 0.12), but less than half of that of the native EGF-14 (σ = 0.63). In a similar way, the chemical shift dispersion of backbone NHs in f6b (σ = 0.26) is three times larger than that expected for a random coil peptide of the same sequence (σ = 0.064), but considerably smaller than that of the native EGF-14 (σ = 0.63). Peptide f5a showed an even smaller dispersion in the backbone NH chemical shifts compared with f5b, with broad unresolved lines in the range 7.8–8.7 p.p.m. By contrast, f6a displayed a slightly larger chemical shift dispersion than f6b, with most of the peaks clustered in the region 7.8–8.9 p.p.m., but three NH resonances shifted downfield at 9.2–9.3 p.p.m. Similarly, a slightly larger chemical shift dispersion was also observed in the methyl region.

The amide NH temperature coefficients were measured between 298 K and 302 K. Such a small temperature interval was chosen to limit chemical shift variations due to temperature-induced conformational changes, and is nevertheless sufficient to measure temperature coefficients in a reliable way for most of the detectable spin systems. Measured values were more negative than −4.3 p.p.b.·K⁻¹ and −4.8 p.p.b.·K⁻¹ for f5b and f6b, respectively, suggesting that no stable H-bond involved in secondary structure elements is formed [30]. However, several NH amides had values in the borderline region around −4.5 p.p.b.·K⁻¹. The chemical shift of the aromatic protons in the three histidine residues was also compared. In f5b, the 4H protons all resonate between 7.05 and 7.10 p.p.m., while the 2H protons are well separated and resonate at 8.08, 8.27, and 8.40 p.p.m. at 298 K. In f6b, the 4H protons resonate between 7.10 and 7.15 p.p.m., and the 2H protons, which are not as well separated as in f5b, resonate between 8.32 and 8.39 p.p.m. at 298 K.
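An amide temperature coefficient of the kind reported above is simply the slope of chemical shift against temperature. The sketch below shows one way to obtain it by least squares over the 298–302 K range. The chemical shift values are hypothetical and chosen only for illustration, and the −4.5 p.p.b.·K⁻¹ cut-off is the borderline value mentioned in the text, used here as an assumed threshold.

```python
def temperature_coefficient(temps_k, shifts_ppm):
    """Least-squares slope of chemical shift vs temperature, in p.p.b. per K."""
    n = len(temps_k)
    t_mean = sum(temps_k) / n
    s_mean = sum(shifts_ppm) / n
    num = sum((t - t_mean) * (s - s_mean) for t, s in zip(temps_k, shifts_ppm))
    den = sum((t - t_mean) ** 2 for t in temps_k)
    return 1000.0 * num / den  # convert p.p.m./K to p.p.b./K


temps = [298, 299, 300, 301, 302]             # K, 1 K steps as in the text
shifts = [8.350, 8.344, 8.338, 8.332, 8.326]  # p.p.m., hypothetical amide NH

coef = temperature_coefficient(temps, shifts)
BORDERLINE = -4.5  # p.p.b./K, borderline value mentioned in the text
label = "no stable H-bond" if coef < BORDERLINE else "possibly H-bonded"
print(f"{coef:.1f} p.p.b./K: {label}")
```

A narrow temperature range keeps the shift change close to linear, which is why a straight-line fit is a reasonable model here.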

NOESY spectra of both f5b and f6b displayed very few cross-peaks, suggesting a correlation time for the molecules close to the zero-point of the NOE at that field.

CD

The CD spectrum of EGF-14 is dominated by a negative band in the far-UV region (Fig. 8A). This band has its minimum at 200 nm and a shoulder at 215 nm, and goes to zero at 190 nm. Two additional much weaker positive bands can be observed in the far-UV at 235 nm and in the near-UV at 270 nm (Fig. 8B). The CD spectra of f5b and f6b (Fig. 8) are also dominated by the negative band at 200 nm and resemble that of EGF-14, but the shoulder at 215 nm and the positive bands are missing; instead, the CD spectrum of f6b is slightly negative at 270 nm, and the intensity of this band is roughly four times weaker than that of EGF-14. The CD spectra of f5a and f6a could be recorded only in the far-UV region. Peptide f5a has two very weak negative bands at 205 and 230 nm, while the spectrum of f6a is characterized by a weak negative band shifted to 215 nm. The positive CD band in the spectrum of EGF-14 in the 250–300 nm region can arise both from the contribution of the only Tyr present and from the disulfide bonds. Peptides f5b and f6b do not contain any Tyr but one Phe instead, which does not contribute significantly to the absorption beyond 270 nm. The weak negative band displayed by f6b in this region might then arise from a partial order in the disulfide bonds. By contrast, f5b does not show any optical activity in this range, suggesting that the disulfides are flexible.

Table 2. RP-HPLC retention times. Retention times of the reduced (RT red, min) peptides and of the main three-disulfide species (RT ox, min); difference in retention times of the reduced and oxidized forms (ΔRT, min) and selectivity parameter (α, defined as RT ox/RT red) for three-disulfide species.

Peptide   RT red (min)   RT ox (min)   ΔRT (min)   α
f2b                      22.3          6.4         0.78
f5b                      18.2          4.6         0.80
f6b                      18.0          5.2         0.78
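The definitions in the caption of Table 2 can be checked with a short sketch. It assumes that the three values surviving in each row are RT ox, ΔRT and α, in that order, and that ΔRT = RT red − RT ox; under that assumption the retention time of the reduced peptide follows by addition, and the recomputed α can be compared with the listed one.

```python
# Consistency check of Table 2, assuming each row lists (RT ox, delta RT, alpha)
# and that delta RT = RT red - RT ox. The variable names are illustrative.

table2 = {
    "f2b": (22.3, 6.4, 0.78),
    "f5b": (18.2, 4.6, 0.80),
    "f6b": (18.0, 5.2, 0.78),
}

results = {}
for peptide, (rt_ox, delta_rt, alpha_listed) in table2.items():
    rt_red = rt_ox + delta_rt          # implied retention time of the reduced form
    alpha = rt_ox / rt_red             # selectivity parameter as defined in the caption
    results[peptide] = (round(rt_red, 1), round(alpha, 2), alpha_listed)
    print(peptide, results[peptide])
```

If the assumed column order is right, the recomputed α should agree with the listed value for every row to two decimals.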

Table 3. EGF-14 disulfide mapping. Determination of disulfide bond topology of EGF-14 by proteolysis and identification of the fragments by LC-MS. Cleavage sites are identified by a slash (/), Cys residues in bold. The disulfide pattern numbering refers to the consecutive positions of cysteines within the sequence.

Enzyme       Fragments                                           Mass (Da), ion     Mass, calculated (Da)   Disulfide pattern
Thermolysin  GQHSCPSDCNN(590–600)/LGQC(601–604)/VSGRC(605–609)   2097.5 (M+H)1+     2097.3                  1–3; 2–4
                                                                 1049.8 (M+2H)2+    1048.6
             ICNEGYSGEDCSE(610–622)                              1403.8 (M+H)1+     1403.4                  5–6
                                                                 702.5 (M+2H)2+     701.7
             ICNEG(610–614)/YSGEDCSE(615–622)                    1422.0 (M+H)1+     1421.3                  5–6
                                                                 711.5 (M+2H)2+     710.6
aspN         SCPS(593–596)/LGQC(601–604)                         810.1 (M+H)1+      811.5                   1–3
                                                                 406.1 (M+2H)2+     405.9
             DCNN(597–600)/VSGRC(605–609)                        985.5 (M+H)1+      985.1                   2–4
                                                                 493.5 (M+2H)2+     493.1
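As an illustration of how a calculated mass for a disulfide-linked fragment can be obtained, the sketch below sums standard average residue masses, adds one water per peptide chain, and subtracts two hydrogens per disulfide bond. That the calculated values in Table 3 are neutral average masses is an assumption, and the function name is illustrative; the authors' own procedure is not described in this section.

```python
# Average residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528     # added once per chain (free N- and C-termini)
HYDROGEN = 1.00794   # each disulfide bond removes two hydrogens


def linked_fragment_mass(chains, n_disulfides):
    """Neutral average mass of peptide chains held together by disulfides."""
    total = 0.0
    for chain in chains:
        total += sum(RESIDUE_MASS[aa] for aa in chain) + WATER
    return total - 2.0 * HYDROGEN * n_disulfides


# Thermolysin fragment with two disulfides (1-3 and 2-4) linking three chains.
three_chain_mass = linked_fragment_mass(["GQHSCPSDCNN", "LGQC", "VSGRC"], 2)
# Single chain with one internal disulfide (5-6).
single_chain_mass = linked_fragment_mass(["ICNEGYSGEDCSE"], 1)
print(round(three_chain_mass, 1), round(single_chain_mass, 1))
```

Under these assumptions the two results come out close to the 2097.3 Da and 1403.4 Da listed in Table 3.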


The positive band at 230 nm in the far-UV CD spectrum of EGF-14 can also arise from the contribution of Tyr. This band is not present in the spectra of the frame-shifted peptides. The other bands in this region are mainly dictated by the electronic transitions of the backbone chromophores and are sensitive to the presence of secondary structure elements. A qualitative analysis of the spectra suggests the absence of helical structure, and a dominant component of irregular structure in all the peptides.

A quantitative analysis of secondary structure content was carried out using different methods [26,27] (SELCON3 [22], CONTINLL [23], CDSSTR [24,25], K2D [26]). These CD spectra analysis programs did not produce satisfactory results in all cases. This is not surprising, given that in such small, disulfide-rich peptides containing relatively little regular secondary structure, the contribution of side chains to the overall CD spectrum can be significant. The amounts of β-sheet, turn, and unordered structure found by these methods are in the range 25–35%, 15–20%, and 40–65%, respectively, with no or negligible amounts of α-helix (data not shown).
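The CD data are reported as mean residue ellipticity, which is also the form in which spectra are usually supplied to secondary structure analysis programs. A minimal sketch of the standard conversion from a measured ellipticity is given below; the signal, concentration and path length are hypothetical, and dividing by the number of residues (33 for these peptides) rather than by the number of peptide bonds is an assumed convention.

```python
# Convert a measured ellipticity (mdeg) into mean residue ellipticity
# (deg cm^2 dmol^-1). All input values in the example are hypothetical.

def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """[theta]_MR = theta_mdeg / (10 * c * l * n), c in mol/L and l in cm."""
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical reading: -10 mdeg, 30 micromolar peptide, 0.1 cm cell, 33 residues.
example_mre = mean_residue_ellipticity(-10.0, 30e-6, 0.1, 33)
print(f"{example_mre:.0f} deg cm^2 dmol^-1")
```

Some laboratories divide by the number of peptide bonds (n − 1) instead, which changes the result by about 3% for a 33-residue peptide.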

Discussion

The ‘frame-shift’ approach

Proteins targeted to the extracellular environment can contain several tandem cysteine-rich domains [31], and the correct pairing of cysteines to form disulfide bridges is critical to reach the final native fold. In principle, two different factors can determine the pairing of cysteines to give disulfide bonds in multidomain proteins: the topology of the disulfides within each repeat, and the frame along which this topology is repeated over the amino acid chain. Human tenascin contains 14 EGF-like repeats [7,8], for a total of 84 cysteines that need to be correctly paired to form, within each repeat, the 1–3, 2–4, 5–6 disulfide bond pattern that is characteristic of EGF modules. To look into the factors that drive the consecutive modules to fold within this unique correct structural frame, we devised a simple model system that could be studied in detail by physico-chemical methods. In this approach, six peptides were selected using a window that corresponds to the average length of tenascin EGF repeats (Fig. 1). Sliding this window over the sequence of tenascin EGF repeats 13 and 14 (residues 560–622) by one cysteine at each step, we obtained six peptides that are all 33 residues long, contain six cysteines, and bear a partial overlap in the sequence. While the first peptide corresponds to the native EGF-14 repeat, the others are frame-shifted EGF repeats displaying a different pattern in the cysteine spacing. The oxidative folding of frame-shifted peptides simulates, in a way, the mispairing that would occur if inter- rather than intra-repeat disulfide bonds formed. In other words, we forced misfolding to occur within short peptides that nevertheless maintain their native sequence.
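To give a sense of the size of the pairing problem, the sketch below enumerates every way of fully pairing six cysteines and counts the pairings for larger numbers. It is a purely combinatorial illustration that ignores sequence separation, sterics and kinetics; the function names are illustrative and not taken from the paper.

```python
# Enumerate all fully paired disulfide patterns for a set of cysteines.

def pairings(cysteines):
    """Yield every way of pairing all cysteines (requires an even number)."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def count_pairings(n_cysteines):
    """Number of full pairings: (n - 1)!! = (n - 1) * (n - 3) * ... * 1."""
    count = 1
    for k in range(n_cysteines - 1, 0, -2):
        count *= k
    return count


six_cys_patterns = list(pairings([1, 2, 3, 4, 5, 6]))
egf_pattern = [(1, 3), (2, 4), (5, 6)]      # EGF-type topology
linear_pattern = [(1, 2), (3, 4), (5, 6)]   # linear arrangement

print(len(six_cys_patterns), egf_pattern in six_cys_patterns)
print(count_pairings(84))  # all 84 cysteines of the 14 repeats taken together
```

The EGF-type pattern is thus one of the possible three-disulfide isomers of a single six-cysteine peptide, and the count grows very rapidly once cysteines from neighbouring repeats are allowed to pair.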

Oxidative folding

Because the EGF repeat is one of the most commonly employed building blocks in extracellular proteins [10,12], we wondered if there might be a kinetic reason that largely favors the correct formation of disulfide bonds within the same EGF repeat or, in other words, if the EGF-type repeats are so successful because they fold fast. Experimental results at least partially support this hypothesis. The disappearance rates of the reduced peptides, frame-shifted ones and EGF-14 alike, are all within the same order of magnitude. The disappearance rate of the starting (reduced) product mainly reflects the rate of oxidation of cysteines to cystines to form a first disulfide bond and give a species that can be separated by RP-HPLC. As the redox potential is expected to be very similar for all cysteines in the sequence in the presence of a similar chemical environment, there are no gross variations in the disappearance rate of the starting product. However, significant albeit small differences are detectable, and the oxidation of EGF-14 is slightly faster. A possible explanation is that the formation of the first disulfide in EGF-14 is favored by some residual native-like structure in the reduced state or, alternatively, that the next oxidation steps are faster, as suggested by the fact that the frame-shifted peptides only slowly evolve towards three-disulfide species and remain trapped in a series of products, while EGF-14 quickly finds its pathway to the native form, which within 2 h is the major species. The burial of a disulfide bond in a native-like environment can, for example, alter its redox potential and render it less accessible to the external redox couple. Both kinetic (disappearance rate of the reduced peptide and convergence towards a unique product) and thermodynamic (stability of the three-disulfide species formed) factors therefore favor the EGF-like topology, determining a preferential folding frame in the cluster of highly repeated domains.

Fig. 5. NMR spectroscopy. 1H 1D spectra (amide/aromatic region, 9.5–6.5 ppm) of EGF-14 (A), f5a (B), f5b (C), f6a (D), f6b (E) at 298 K in H2O/D2O (90 : 10, v/v).

Fig. 6. NMR spectroscopy. Fingerprint region of TOCSY spectra at 298 K (left) and 318 K (right) for f5b (top, A and B) and f6b (bottom, C and D).

Peptide structure

Despite the complexity of the mixtures obtained in the oxidative folding reactions, we were able to isolate and characterize, by NMR and CD, some of the three-disulfide species that are formed. Our efforts were directed towards the characterization of those products that displayed a large difference in retention time between the reduced and fully oxidized species. This was considered an important indication of effective burial of hydrophobic residues upon folding, with the formation of a relatively compact structure. This, in turn, can be promoted by a crossed disulfide topology of the EGF type (1–3, 2–4, 5–6) or equivalent, while a linear arrangement of disulfides (1–2, 3–4, 5–6) is less likely to produce compact structures. NMR and CD studies suggest that the products of the oxidative folding of frame-shifted peptides (f5a, f5b, f6a, f6b) are highly flexible in solution and only partially structured, with some degree of conformational restraint given by the presence of three

Fig. 7. Backbone NH 1H chemical shifts. Distribution (%, black bars) of backbone NH chemical shifts (p.p.m.) for f5b (A) and f6b (B). The distribution of backbone NH chemical shifts of EGF-14 (gray bars) and that expected for a random coil peptide of the same sequence (white bars) are also shown.

Fig. 8. CD spectroscopy. CD spectra (mean residue ellipticity, ΘMR, deg·cm²·dmol⁻¹, against wavelength in nm) in the far-UV (190–250 nm, A) of EGF-14 (black), f5b (red), f6b (blue), f5a (orange) and f6a (light blue) and in the near-UV (250–350 nm, B) of EGF-14 (black), f5b (red), f6b (blue).
