Applications of diagonal chromatography for proteome-wide characterization of protein modifications and activity-based analyses
Kris Gevaert1,2, Francis Impens1,2, Petra Van Damme1,2, Bart Ghesquière1,2, Xavier Hanoulle3 and Joël Vandekerckhove1,2
1 Department of Medical Protein Research, VIB, Ghent, Belgium
2 Department of Biochemistry, Ghent University, Belgium
3 UMR 8576 CNRS – University of Sciences and Technologies of Lille, Structural and Functional Glycobiology Unit, Villeneuve d’Ascq, France
Introduction
Proteomics refers to a qualitative, differential and quantitative estimation of a proteome. Proteomes can be extremely complex, often encompassing more than 10 000 different components per cell. Two-dimensional gel electrophoresis [1] followed by electroblotting and microsequencing [2–4] or in-gel digestion combined with Edman sequencing [5] of the generated peptides or peptide mass fingerprinting [6–10] have been the methods of choice to reproducibly separate and identify complex protein mixtures. Although large-scale 2D gel electrophoresis separates thousands of proteins [11,12], probably no more than a few hundred different proteins have been identified from such gels. To obtain better proteome coverage, alternative methods were introduced. Groundbreaking methodologies became available when high-throughput genome sequencing started to cover the entire genetic information of several species. This information is now available for a
Keywords
activity-based probe; ATP-binding proteins;
COFRADIC; diagonal chromatography;
N-terminal peptides; peptide sorting; protein
N-glycosylation; protein processing
Correspondence
K Gevaert, Department of Biochemistry,
Faculty of Medicine and Health Sciences,
Ghent University, A Baertsoenkaai 3,
B-9000 Ghent, Belgium
Fax: +32 92649496
Tel: +32 92649274
E-mail: kris.gevaert@ugent.be
Website: http://www.proteomics.be
(Received 24 April 2007, revised 10 September 2007, accepted 17 October 2007)
doi:10.1111/j.1742-4658.2007.06149.x
Numerous gel-free proteomics techniques have been reported over the past few years, introducing a move from proteins to peptides as bits of information in qualitative and quantitative proteome studies. Many shotgun proteomics techniques randomly sample thousands of peptides in a qualitative and quantitative manner but overlook the vast majority of protein modifications that are often crucial for proper protein structure and function. Peptide-based proteomic approaches have thus been developed to profile a diverse set of modifications including, but not limited to, phosphorylation, glycosylation and ubiquitination. Typical here is that each modification needs a specific, tailor-made analytical procedure. In this minireview, we discuss how one technique, diagonal reverse-phase chromatography, is applied to study two different types of protein modification: protein processing and protein N-glycosylation. Additionally, we discuss an activity-based proteome study in which purine-binding proteins were profiled by diagonal chromatography.
Abbreviations
ABP, activity-based probe; COFRADIC, combined fractional diagonal chromatography; FSBA, 5′-p-fluorosulfonylbenzoyladenosine; FSBG, 5′-p-fluorosulfonylbenzoylguanosine; iTRAQ, isobaric tags for relative and absolute quantification; MudPIT, multidimensional protein identification technology; PNGaseF, peptide N-glycosidase F; SB, sulfobenzoyl; SILAC, stable isotope labeling by amino acids in cell culture; Nbs2, 2,4,6-trinitrobenzenesulfonic acid.
large number of species, and it now suffices to generate partial protein sequence information with which to access entire (predicted) protein sequences stored in expressed sequence tag, gene and protein sequence databases.
This brought the dawn of novel strategies for protein identification. Measured masses of peptides produced by cleaving a protein with a protease with well-known specificity (e.g. trypsin) were searched against a database of peptide masses calculated from protein sequences derived from genome sequences [6–10]. When these peptides are derived from a mixture of proteins, they are subjected to MS/MS fragmentation for identification [13]. Recently, top-down protein sequencing combining ESI MS and highly accurate FT MS [14] was shown to match proteins larger than 200 kDa to sequences in databases [15]. Such strategies only became possible following the availability of massive numbers of DNA sequences, recent developments in MS, and bioinformatics tools that link DNA and protein sequences to information generated by different types of mass spectrometers [16–18].
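The mass-matching idea described above can be illustrated with a minimal sketch. It is not the search engine used in the cited studies: the cleavage rule (after Lys or Arg, but not before Pro) is the common textbook simplification of trypsin specificity, the function names and the 0.01 Da tolerance are hypothetical, and the toy sequence is invented for illustration.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the sum of residue masses


def tryptic_digest(sequence):
    """Split after K or R unless the next residue is P (simplified rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def match_masses(measured, sequence, tol=0.01):
    """Return the in silico peptides whose mass matches a measured mass."""
    hits = []
    for pep in tryptic_digest(sequence):
        mass = peptide_mass(pep)
        if any(abs(mass - m) <= tol for m in measured):
            hits.append(pep)
    return hits


if __name__ == "__main__":
    toy_protein = "AKRPGGRM"  # invented sequence, for illustration only
    print(tryptic_digest(toy_protein))
    print(match_masses([217.1426], toy_protein))
```

A real search would also consider missed cleavages, modifications and charge states, which are left out here to keep the principle visible.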
Recently, peptides have increasingly become the center of analysis: protein mixtures, either partially purified by prefractionation or as such, are digested with trypsin, and the generated peptide mixture is analyzed. When cell or tissue lysates, or even isolated organelles, are analyzed, the number of peptides becomes so high that mass spectrometers can no longer analyze all of the peptides. This results in poor sample coverage, generally referred to as random sampling or undersampling [19], and it became crucial to add peptide prefractionation before MS analysis. Yates’ group introduced separation of peptides based on two parameters [20], net charge and hydrophobicity, and called their technique multidimensional protein identification technology (MudPIT [21]). MudPIT has since been used in several studies and has demonstrated its value, but it still suffers from undersampling [19].
Selecting a lower number of peptides representative of each protein originally present in the mixture may alleviate this problem. These so-called signature peptides [22] are then the only analyzed components, and in this way a less complex peptide mixture is presented to the mass spectrometer. The first reports using this strategy were selective for cysteinyl peptides, allowed quantification (differential analysis), and used biotin tagging for consecutive capture by immobilized avidin [23]. Later on, affinity selection was used to isolate, for instance, phosphopeptides [24], N-glycosylated peptides [25], ubiquitinated peptides [26], and N-terminal peptides [27].
COFRADIC as a peptide-sorting tool
Our peptide-centric proteome approach [28,29] sorts signature peptides and selects the part of a proteome containing the information of biological interest. Our technique is based on diagonal chromatography [30,31], consisting of two repeated, identical peptide separations with a specific modification reaction (sorting step) in between. Peptides that remain unchanged elute at the same position in the two chromatographic runs, whereas peptides that acquire a modification segregate from the unchanged peptides in either earlier or later fractions. To reduce the number of repetitive chromatographic runs, several fractions from the primary run can be combined and subjected to the sorting reaction (Fig. 1). For this reason, we call this adapted version of diagonal chromatography combined fractional diagonal chromatography (COFRADIC [32]).
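The diagonal principle can be restated as a small decision rule on retention times. The sketch below is only an illustration: the function names, the 0.5 min tolerance and the example retention times are hypothetical, and it relies on the general reverse-phase convention that earlier elution reflects a more hydrophilic peptide.

```python
def classify_shift(rt_primary, rt_secondary, tolerance=0.5):
    """Compare retention times (min) of one peptide in the two identical runs.

    tolerance is a hypothetical run-to-run reproducibility window.
    """
    delta = rt_secondary - rt_primary
    if abs(delta) <= tolerance:
        return "unchanged"  # stays on the diagonal
    # In reverse-phase HPLC, earlier elution means a more hydrophilic peptide.
    return "hydrophilic shift" if delta < 0 else "hydrophobic shift"


def sort_peptides(runs, tolerance=0.5):
    """runs maps a peptide label to (primary RT, secondary RT)."""
    return {name: classify_shift(p, s, tolerance) for name, (p, s) in runs.items()}


if __name__ == "__main__":
    example = {
        "peptide_1": (42.0, 42.2),  # unmodified by the sorting reaction
        "peptide_2": (42.0, 37.5),  # modified, elutes earlier
        "peptide_3": (42.0, 55.0),  # modified, elutes later
    }
    for name, label in sort_peptides(example).items():
        print(name, label)
```

Only peptides labelled with a shift would be collected for LC-MS/MS analysis; the unchanged bulk is discarded.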
It should be clear from the peptide-sorting principle that any chemical or enzymatic modification that is highly specific, is quantitative and produces a sufficiently large chromatographic shift can be implemented in COFRADIC. This is illustrated by applications in which selection for methionyl or cysteinyl peptides was carried out in tryptic digests of total cellular lysates [33,34], or where the N-terminal peptides of the proteins present in the mixture were selected [33–36]. Similarly, as we can select peptides based on the specific chemical nature of their amino acid side chains, we can also select peptides carrying post-translational modifications, either by removing this modification (e.g. by dephosphorylation of phosphopeptides [37]), or by converting it into a moiety with altered properties (e.g. by reducing nitrotyrosine to aminotyrosine [38]). An overview of the COFRADIC sorting protocols that have been developed is given in Table 1.
We here concentrate on applications of COFRADIC in studying selected post-translational modifications, namely protein processing [34,36] and N-glycosylation [39], and describe the use of COFRADIC for studying interactions between small molecules and proteins. The latter is a particular application of ‘post-translational COFRADIC’, by which a small molecule is covalently linked to a target protein and the corresponding modified tryptic peptide is then sorted using the principles of diagonal chromatography. The example given here is a global activity-based proteome analysis of purine-binding proteins in a total lysate of human Jurkat T-cells [40].
COFRADIC analysis of protein processing – protease degradomics
Protein processing introduces novel protein fragments that may be visualized on 2D polyacrylamide gels. For example, Canals et al. used the fluorescent 2D difference gel electrophoresis technique [41] to catalog quantitative differences in the protein composition of conditioned media of cells either expressing the metalloproteinase ADAMTS1 at physiological levels or overexpressing it [42]. The latter scenario led to an increase of fragments of proteins shed by ADAMTS1 into the medium that were picked by difference gel electrophoresis and identified by MS. In fact, this study led to the identification of five potential ADAMTS1 substrates, two of which (nidogens 1 and 2) were further validated. Gel-free proteomic approaches have been introduced for ‘degradomics’ [43] research as well. The group of Overall used isotope-coded affinity tag [23] combined with LC-MS/MS to quantify the levels of secreted extracellular matrix proteins in breast carcinoma cell cultures overexpressing a membrane type 1 matrix metalloproteinase [44] and, more recently, they multiplexed their analyses using isobaric tags for relative and absolute quantification (iTRAQ) reagents for the identification of matrix metalloproteinase-2 substrates in fibroblasts [45].
Clearly, both gel-based and gel-free approaches point to potential protease substrates; however, at this stage it is important to note that the characterization of the actual protein cleavage site has typically remained elusive. Nonetheless, the latter information is highly valuable, as it can lead to more rational design of protease inhibitors [46], it is vital for constructing precise algorithms that predict protease substrates [47], and, after all, protein processing is a post-translational modification that should preferably be characterized before any assumption concerning the protease substrate potential is made. Protein processing produces a novel C-terminal peptide (from the N-terminal fragment of a substrate) and a novel N-terminal peptide (from the C-terminal fragment). Hence, identifying either one of these ‘reporter peptides’ directly points to the actual processing site. As recently reviewed [29], in a whole proteomic background, C-terminal peptides are only poorly isolated. On the other hand, N-terminal COFRADIC [33], and the more or less ‘single-step’ isolations of N-terminal peptides by protein sequence tags [48] and positional proteomics [27] were shown to isolate N-terminal peptides from complex mixtures.
[Fig. 1 graphic: UV absorbance traces (mAU versus min) of the primary and the secondary separation, with the workflow steps: primary separation, combine primary fractions, COFRADIC sorting reaction, secondary separation, LC-MS/MS analysis.]
Fig. 1. The COFRADIC peptide sorting scheme. A peptide mixture is first separated by RP-HPLC (the primary COFRADIC separation). Here, the UV absorbance profile at 214 nm of a tryptic digest of a proteome preparation from human Jurkat T-cells is shown. Primary fractions (indicated in light gray boxes) are combined (here, four primary fractions, each 1 min wide, that are separated by a 13 min window) and undergo a chemical or enzymatic reaction (the COFRADIC sorting reaction). In this particular case, the side chains of methionines were oxidized by hydrogen peroxide, leading to the formation of methionine sulfoxide. During a second, identical separation (the secondary COFRADIC separation), such oxidized methionyl peptides undergo a hydrophilic shift and segregate away from the bulk of nonmethionyl peptides. Methionyl peptides are thus collected (dark gray boxes) and analyzed by LC-MS/MS.
However, only the N-terminal COFRADIC approach has thus far been applied to protease degradomics research [36,38] and is discussed here (potential drawbacks of the two affinity-based peptide isolation protocols are discussed in Conclusions).
In essence, N-terminal COFRADIC segregates peptides containing the protein N-termini from internal peptides. This is achieved following an initial acetylation or trideutero-acetylation reaction on a complete proteome prior to trypsin digestion. This blocks all free α-amines and ε-amines and, further down, distinguishes between in vivo blocked (acetylated) and in vivo free (trideutero-acetylated) protein N-termini. Trypsin no longer recognizes acetylated lysines, and, consequently, upon digestion, Arg-C type peptides are generated. In fact, two types of peptides are now apparent: N-terminal peptides with a blocked, acetylated/trideutero-acetylated α-amine, and internal peptides carrying a free α-amine. This peptide mixture is first separated by RP-HPLC and collected in a small number of primary fractions. Then, internal peptides present in each fraction are reacted with 2,4,6-trinitrobenzenesulfonic acid (TNBS), which is known to efficiently and quantitatively modify primary amines [49]. Internal peptides thereby acquire a trinitrophenyl group at their N-terminus and thus become very hydrophobic. Running such TNBS-modified primary fractions a second time on the same column and under identical chromatographic conditions will now segregate TNBS-nonreactive N-terminal peptides (all their amino groups were already blocked) from TNBS-reacted internal peptides, which underwent a very strong hydrophobic shift (Table 1). Following metabolic or postmetabolic labeling, N-terminal peptides of two (or more) proteomes can be weighed against each other and, importantly, neo-N-termini originating from protein processing are readily distinguished [34,36].
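The logic of the N-terminal sorting step can be followed in a short simulation. It is a sketch under stated assumptions: all lysines are taken to be blocked so that cleavage occurs only after arginine, the 'not before proline' exception is the usual simplification, and the function names, dictionary keys and toy sequence are hypothetical.

```python
import re


def arg_c_digest(protein):
    """Digest a fully amine-blocked protein: cleavage after R only.

    Returns dicts with the peptide sequence, its 1-based start position and
    whether its alpha-amine is blocked (true only for the protein N-terminus).
    """
    pieces = [p for p in re.split(r"(?<=R)(?!P)", protein) if p]
    peptides = []
    start = 1
    for index, pep in enumerate(pieces):
        peptides.append({
            "sequence": pep,
            "start": start,
            "blocked_alpha_amine": index == 0,
        })
        start += len(pep)
    return peptides


def sort_n_terminal(peptides):
    """Mimic the TNBS step: free alpha-amines react and shift away."""
    kept = [p for p in peptides if p["blocked_alpha_amine"]]
    shifted = [p for p in peptides if not p["blocked_alpha_amine"]]
    return kept, shifted


if __name__ == "__main__":
    toy_protein = "ACKDERGHKLRPW"  # invented sequence, for illustration only
    kept, shifted = sort_n_terminal(arg_c_digest(toy_protein))
    print("N-terminal (unshifted):", [p["sequence"] for p in kept])
    print("internal (hydrophobic shift):", [p["sequence"] for p in shifted])
```

A neo-N-terminus created by a protease before the blocking step would behave like the first peptide here, because its α-amine is blocked before digestion.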
The characterization of protease substrates by such a differential N-terminal COFRADIC approach is illustrated in Fig. 2. In an ongoing project, host cell substrates of the HIV-1 protease are catalogued in human Jurkat T-cells grown in stable isotope labeling by amino acids in cell culture (SILAC) medium supplemented with either natural, light 12C6-arginine or heavy 13C6-arginine [50]. Arginine is clearly the essential amino acid of choice, as all N-terminal peptides isolated by COFRADIC, by the nature of the process, will end on an arginine residue. This metabolic labeling introduces a mass spacing of 6 Da between light and heavy N-terminal peptides.
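The 6 Da spacing translates into a charge-dependent spacing on the m/z axis, which a short calculation makes explicit. The sketch assumes one labeled arginine per peptide by default and uses the 13C minus 12C mass difference of about 1.00335 Da; the function name and parameters are hypothetical.

```python
C13_C12_DELTA = 1.0033548  # Da, mass difference between 13C and 12C


def silac_mz_spacing(charge, n_labeled_carbons=6, n_arginines=1):
    """m/z distance between light and heavy forms of an Arg-labeled peptide."""
    return n_labeled_carbons * n_arginines * C13_C12_DELTA / charge


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(f"charge {z}+: spacing {silac_mz_spacing(z):.4f} m/z")
```

For a doubly charged precursor the light and heavy isotopic envelopes therefore lie about 3 m/z units apart, and about 2 m/z units apart for a triply charged one.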
Cells are typically lysed by repeated freeze–thawing, and the lysate is either incubated with recombinant HIV-1 protease or left untreated (control). Following protease incubation, both proteomes are S-alkylated and acetylated, and equal amounts are then mixed and subjected to N-terminal COFRADIC. In the setup depicted in Fig. 2, neo-N-termini generated by the retroviral protease are expected in the ‘light proteome’ and will only be present in the light 12C6-arginine form. Almost identical numbers of pre-existing N-termini (i.e. the N-termini of intact proteins), on the other hand, should appear as couples of light and heavy labeled peptides in ratios close to 1 : 1. This is illustrated by taking β-actin as an example: its acetylated N-terminal peptide is present in a near 1 : 1 ratio (Fig. 2B), whereas a second, now trideutero-acetylated peptide is only present in the light proteome (Fig. 2C). Following MS/MS analysis (Fig. 2D), the latter peptide is identified as TEAPLNPKANR(106–116) and constitutes a neo-N-terminus indicative of HIV-1-mediated protein processing. Processing of β-actin by the HIV-1 protease between Leu105 and Thr106 was already identified in previous studies [51], thereby validating our findings.
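The interpretation rule described above (light/heavy couples near 1 : 1 versus light-only signals) can be written as a small classifier. This is a sketch only: the ratio bounds of 0.5 and 2.0, the function name and the intensity values are hypothetical and would need to be chosen from real replicate data.

```python
def classify_n_terminus(light, heavy, low=0.5, high=2.0):
    """Classify an N-terminal peptide from its light and heavy intensities.

    light: signal from the protease-treated (12C6-arginine) proteome.
    heavy: signal from the control (13C6-arginine) proteome.
    low/high: hypothetical bounds for a ratio considered 'close to 1 : 1'.
    """
    if light > 0 and heavy == 0:
        return "candidate neo-N-terminus (light only)"
    if light > 0 and heavy > 0 and low <= light / heavy <= high:
        return "pre-existing N-terminus"
    return "unclassified"


if __name__ == "__main__":
    # Invented intensities, for illustration only.
    print(classify_n_terminus(1.0e6, 9.5e5))
    print(classify_n_terminus(4.0e5, 0.0))
    print(classify_n_terminus(1.0e5, 9.0e5))
```

Peptides that fall outside both categories, for example those with strongly skewed ratios, are deliberately left unclassified rather than forced into a group.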
Table 1. Overview of the different COFRADIC procedures that have been developed. The type of peptide, the sorting agent used in between the two consecutive RP-HPLC separation steps and the type of evoked shift are indicated. References to our original papers, in which full technical details can be found, are given.
acid-modified cysteine
free α-amine peptides
Hydrophobic (internal peptides)
[33]
hydrophobic
[39]
A similar approach, but now with postmetabolic, trypsin-mediated 18O-labeling [52], was used to characterize in vivo protein processing in Fas-induced apoptotic Jurkat cells [34]. In this study, 93 cleavage sites in 71 different proteins were characterized in a ‘proteomic background’ of more than 1800 proteins. At the time of reporting these results, the overall majority of the identified cleavage sites were uncharacterized. An analogous setup was used for an in vitro analysis of the substrates of the HtrA2/Omi protease [36]. In that
[Fig. 2 graphic: (A) workflow: human Jurkat T-cells grown in SILAC medium with 12C6-arginine or 13C6-arginine; freeze-thaw lysate; incubation with recombinant HIV-1 protease or control; combine; N-terminal COFRADIC analysis. (B–D) mass spectra. Labeled m/z values: 949.80, 950.06; labeled fragment ions: y5 630.43, y6 744.46, y8 954.66, b3 347.24, b4 444.41, b5 557.42, b6 671.37, b9 1012.66.]
Fig. 2. HIV-1 protease processes β-actin in vitro at Leu105. The experimental route is sketched in (A). Following N-terminal COFRADIC, two different peptides from β-actin were identified. Its N-terminal peptide, DDDIAALVVDNGSGMCKAGFAGDDAPR(2–28) (N-terminus acetylated, lysine trideuteroacetylated, methionine oxidized and cysteine carbamidomethylated), was present in both proteome digests [ion trap MS spectrum of triply charged precursor in (B)], whereas a second peptide was only present in the proteome treated with the HIV-1 protease [ion trap MS spectrum of doubly charged precursor in (C)]. Following MS/MS analysis [(D), b and y fragment ions indicated], this peptide was identified as TEAPLNPKANR(106–116) (N-terminus and lysine were both trideuteroacetylated), pointing to a previously characterized cleavage site of the HIV-1 protease in β-actin.
study, we identified 50 different cleavage events in 15 human proteins, and further validated these Omi substrates by nonproteomic methods. Finally, as our method directly points to the actual site of proteolytic cleavage, data interpretation and the design of follow-up analyses are straightforward.
COFRADIC-based sorting of
N-glycosylated peptides
Glycosylation of asparagines in the Asn-Xaa-Ser/Thr acceptor motif [53] is a widespread protein modification: a survey in the UniProtKB/Swiss-Prot database (release 52.2, 3 April 2007) indicates that 3694 human protein entries (i.e. about 23% of all human protein entries) have at least one feature key pointing to an N-glycosylation event.
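Candidate acceptor sites of the Asn-Xaa-Ser/Thr type can be located in a sequence with a short pattern search. The sketch below adds one assumption that is not stated in the text above, namely the widely used convention that Xaa is any residue except proline; the function name and the toy sequence are hypothetical.

```python
import re

# N followed by any residue except P, then S or T. The lookahead keeps
# overlapping motifs (e.g. "NNTS") from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines in an Asn-Xaa-Ser/Thr motif."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "ANGTNPSNKS"  # invented sequence, for illustration only
    print(find_sequons(toy_sequence))
```

Such a scan only yields potential sites; whether a site is actually occupied has to be shown experimentally.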
Different methods have been used to isolate and identify N-glycosylated proteins and characterize their glycosylation sites. In general, glycosylated proteins and peptides are affinity-isolated or chemically trapped prior to further analysis. Affinity-based isolation of N-glycosylated proteins is rather simple and is based on lectin-affinity chromatography. Lectins are proteins or glycoproteins that recognize oligosaccharides but generally favor certain classes of oligosaccharides [54]. Thus, to increase the overall coverage of N-glycosylated proteins, several lectins were combined in multilectin affinity chromatography [55,56]. Alternatively, the lectins’ glycan bias was exploited in a serial lectin approach separating N-glycosylated (concanavalin A) from O-glycosylated peptides (Jacalin) [57]. Chemical trapping and release of N-glycosylated peptides was introduced by the group of Aebersold in 2003 [25]. In their approach, aldehydes are first introduced into the glycan by periodate oxidation. These aldehydes then covalently bind to immobilized hydrazide groups, by which glycosylated proteins are retained and all nonglycosylated proteins are removed. Immobilized glycosylated proteins are then further trimmed by trypsin such that only tryptic peptides carrying glycans remain fixed. Such peptides are finally recovered by peptide N-glycosidase F (PNGaseF), which efficiently removes N-glycans from conjugated asparagines while converting these to aspartic acids [58]. The potential of this chemical trapping approach is evident from recent studies [59–62]; however, it requires several chemical and enzymatic modification steps, and it is therefore more complex than lectin-affinity methods; this could potentially obstruct its widespread introduction in proteomics laboratories.
We recently showed that N-glycosylated peptides can be isolated by diagonal chromatography [39]. In our approach, a protein mixture containing N-glycosylated proteins is digested with trypsin, and the resulting peptide mixture is separated by RP-HPLC. N-glycosylated peptides are then specifically targeted by PNGaseF and thus deglycosylated (COFRADIC sorting step). When separated a second time by RP-HPLC, deglycosylated peptides shift out of the primary interval of nonglycosylated peptides and are thereby isolated. Importantly, the shift evoked in this way can be either hydrophilic or hydrophobic, reflecting the nature of the glycan. Indeed, N-glycans can contain negatively charged sugars such as sialic acid [63] and sulfated carbohydrates [64], and removing such glycans with PNGaseF evokes a hydrophobic shift analogous to that observed for dephosphorylated peptides [37]. Following MS/MS analysis, former N-glycosylated asparagines in the Asn-Xaa-Ser/Thr motif are found deamidated to aspartic acids. This mass signature in the consensus N-glycosylation motif is used to distinguish deglycosylated peptides from artificially deamidated peptides, especially in Asn-Gly and Asn-Ser motifs [65], undergoing small hydrophilic shifts [39].
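The use of the deamidation signature within the consensus motif can be sketched as a simple check. The Asn to Asp mass increase of about 0.984 Da follows from the residue masses; the 'Xaa is not Pro' condition, the function name and the example peptides are assumptions made for illustration, and a site whose motif extends beyond the peptide C-terminus would need the protein context, which is not handled here.

```python
ASN_TO_ASP_SHIFT = 0.984016  # Da, monoisotopic mass increase on deamidation


def classify_deamidation(peptide, index):
    """Interpret a deamidated asparagine found by MS/MS.

    peptide: sequence as encoded by the genome (Asn at the modified position).
    index:   0-based position of the deamidated asparagine in the peptide.
    """
    if peptide[index] != "N":
        raise ValueError("position does not hold an asparagine")
    following = peptide[index + 1:index + 3]
    in_motif = (
        len(following) == 2
        and following[0] != "P"      # assumed convention: Xaa is not Pro
        and following[1] in "ST"
    )
    if in_motif:
        return "candidate former N-glycosylation site"
    return "possible artificial deamidation"


if __name__ == "__main__":
    # Invented peptides, for illustration only.
    print(classify_deamidation("LNVTK", 1))
    print(classify_deamidation("ANGAK", 1))
```

The motif check alone cannot exclude artificial deamidation inside a consensus motif, which is why the chromatographic shift is used as additional evidence in the approach described above.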
Our COFRADIC procedure was applied to a trypsin digest of 10 µL of mouse serum depleted for its three most abundant proteins (albumin, IgGs and transferrin), and resulted in the characterization of 127 different N-glycosylation sites (comprising 10 novel sites) in 82 proteins estimated to span a concentration range of at least five orders of magnitude [39]. Several N-glycosylation sites of the large subunit of mouse carboxypeptidase N (UniProtKB/Swiss-Prot entry Q9DBB9) were identified in this study (Table 2). This protein binds to the catalytic subunit of carboxypeptidase N, which functions in protecting organisms from circulating vasoactive and inflammatory peptides containing C-terminal arginine or lysine [66]. The large subunit of this complex binds and stabilizes the catalytic subunit and thereby keeps the complex in circulation. In silico predictions indicate that this protein potentially has nine different targets for N-glycosylation, six of which were identified in our study: the asparagines at positions 74, 111, 119, 348, 359 and 367 (Table 2). Three other asparagines at positions 266, 311 and 520 were missed and, as is evident from the annotations in the UniProtKB/Swiss-Prot database, have hitherto not been experimentally characterized. A closer look at the sequences of the tryptic peptides harboring the potential glycosylation sites at positions 266 and 311 clearly indicates that these peptides are very large (66 and 36 amino acids long, respectively). Therefore, they could have been missed either because they are insoluble or because our mass spectrometers, which have an empirical upper mass limit close to 3000 Da for producing MS/MS spectra that are unambiguously identified by Mascot [67], could not detect them. One obvious way to overcome this is by using proteases with nontryptic specificities, such as proteinase K, that generally produce smaller peptides [68] and thus increase the chance that more glycopeptides will be finally identified. However, such protease digests sharply augment the complexity of the analyte mixture.
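A rough estimate shows why peptides of 66 and 36 residues fall outside the stated mass range. The sketch uses a rule-of-thumb average residue mass of about 110 Da, which is an assumption and not a value from the study; the exact masses depend on the actual sequences, which are not given here.

```python
AVERAGE_RESIDUE_MASS = 110.0  # Da, rule-of-thumb average (assumption)
WATER = 18.0                  # Da, added once per peptide


def estimated_mass(n_residues):
    """Approximate peptide mass from its length alone."""
    return n_residues * AVERAGE_RESIDUE_MASS + WATER


def exceeds_limit(n_residues, limit=3000.0):
    """Compare the estimate with the empirical upper limit of about 3000 Da."""
    return estimated_mass(n_residues) > limit


if __name__ == "__main__":
    for length in (66, 36):
        print(length, "residues:", estimated_mass(length), "Da,",
              "above limit" if exceeds_limit(length) else "within limit")
```

With this approximation, peptides longer than roughly 27 residues would already exceed the 3000 Da limit.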
Activity-based proteome-wide profiling
of purine-binding proteins
In order to assign functions to the many uncharacterized (hypothetical) proteins that genome sequencing projects provide, several small compounds occupying and modifying active sites of enzymes have recently found their way into functional proteomics [69,70]. Many of these so-called activity-based probes (ABPs) are natural protein-reactive products or synthetic analogs [71]. ABPs in functional proteome studies generally consist of four parts: a reactive group targeting amino acids within the enzyme’s binding pocket, a structural moiety that is recognized by this binding pocket, a linker, and a tag for visualization and/or isolation of modified proteins [72]. Different classes of enzymes have already been studied using activity-based proteomics. Examples include biotinylated fluorophosphonates for monitoring serine hydrolases [73], and biotinylated α-bromobenzylphosphonates for detecting protein tyrosine phosphatases [74].
ATP, ADP and AMP are important sensors for the energy status of cells, interacting with and thereby regulating the activities of key enzymes in cellular metabolism. In addition, ATP and, to a lesser extent, GTP are known as carriers of high-energy phosphoryl groups that can be covalently linked to proteins and metabolites. Here, kinases play a pivotal role and transiently interact with triphospho derivatives before the β–γ phosphodiester bond is cleaved. To characterize ATP-binding and GTP-binding proteins in cells, we profiled purine-binding proteins on a proteome scale [40]. For this purpose, we used 5′-p-fluorosulfonylbenzoyladenosine (FSBA) (Fig. 3B), a known reactive homolog of ATP (Fig. 3A) that binds proteins in their nucleotide-binding region and then covalently modifies nucleophilic amino acids (especially tyrosine and lysine) in its proximity [75]. In the past, FSBA was mainly used to profile the ATP-binding features of selected, individual proteins. However, in 2004, Moore et al. published a study in which FSBA and 5′-p-fluorosulfonylbenzoylguanosine (FSBG) were used to profile ATP-binding and GTP-binding proteins, respectively, in the proteomes of different lymphoid cells [76]. In their approach, proteins were labeled with FSBA or FSBG in cell extracts, separated by 2D PAGE and electrotransferred onto a poly(vinylidene difluoride) membrane. Subsequent treatment of sulfobenzoyl adenosine/sulfobenzoyl guanosine (SBA/SBG)-labeled proteins with NaOH hydrolyzed the ester bond between the adenosine or guanosine and the sulfobenzoyl (SB) group and exposed the latter. Antibodies to SB were then used to immunodetect FSBA-targeted or FSBG-targeted proteins. Overlaying an image of the immunoblot with the 2D pattern of silver-stained proteins pointed to candidate ATP-binding or GTP-binding proteins that were selected from the 2D gel and identified by MS. In this way, 12 different proteins could be identified as FSBA-labeled proteins.
Given the fact that a mild alkaline treatment as used by Moore et al. [76] hydrolyzes the rather unstable benzoate ester bond between the adenosine and the SB group, we recently developed a COFRADIC protocol sorting for SBA-labeled peptides [40]. The central sorting reaction is shown in Fig. 3C and consists of a 25 min incubation of SBA-labeled peptides in 50 mM
Table 2. N-glycosylation sites in the large subunit of mouse carboxypeptidase N. Both N-glycosylation sites characterized in our study [39] and those that were missed are given. Known or potential glycosylation sites are in bold type. [M + H]+, mass of the singly protonated peptide ion.
Characterized N-glycosylation sites
Unidentified N-glycosylation sites
NaOH. We first tested our protocol on a tryptic digest of SBA-labeled recombinant Chinese hamster α-protein kinase A, and noted that removing the adenosine group in between the two COFRADIC separations resulted in a strong hydrophilic shift of only one peptide that was identified as HKETGNHYAMK*ILDK(62–76) (K* indicates the SB-labeled lysine). This peptide harbors the site known to be involved in the catalytic transfer of γ-phosphate from ATP to protein kinase A substrates [77].
When this sorting procedure was applied to a whole proteome (here, a human Jurkat T-cell proteome depleted of small compounds like ATP and GTP, which could compete with FSBA), 185 sites in 132 proteins were identified. Clearly, this is a significantly higher number of proteins than were detected in the previous gel-based study [76]. Therefore, our COFRADIC technique allows the functional interpretation of a larger part of a sampled proteome. More importantly, our approach directly points to the actual site that was modified and might thereby aid in interpreting structural features of ATP-binding proteins.
As expected, the majority of FSBA-labeled Jurkat proteins were known binders of small nucleotides, cofactors, or DNA and RNA molecules. However, several proteins and sites were not readily explained by the known affinity of FSBA for purine-binding pockets. Closer inspection revealed that at least 23 of such unexplainable sites were previously characterized as tyrosine phosphorylation sites. Therefore, we assume that when FSBA is recognized by an ATP-binding site, there are two options for SBA labeling: either the fluorosulfonyl group reacts with a target side chain located on the protein carrying the ATP-binding site (homo-reaction), or, through lack of a suitable reaction partner, it may react with a side chain present on proteins that interact with the protein carrying the actual ATP-binding site (crossover reaction). An illustration of the second case is observed for kinases that can transfer the SBA group onto their substrate proteins by a pseudocatalytic mechanism. In this way, the SBA group can be linked to proteins that have no ATP-binding site. We have verified this assumption by incubating a Src substrate peptide with FSBA in the presence or absence of Src; it was shown that Src ‘catalyzed’ the labeling of the substrate peptide, increasing it by a factor of more than 20 [40]. Hence, we concluded that care must be taken when interpreting the results of activity-based proteome studies, as not all identified proteins will actually carry out the function that was assessed by the used ABP.
Conclusions
As compared to other gel-free proteomics techniques [72], COFRADIC has a number of unique properties. As COFRADIC is essentially a peptide-sorting technology by which only a set of peptides representative of the proteomic problem is withdrawn from the complex analyte mixture, the sample-to-sample reproducibility is much higher than in shotgun approaches [78]. For instance, although MudPIT uses a powerful chromatographic technology combining two basic separation principles (peptide net charge and hydrophobicity), peptide separation still takes place on the entire, complex mixture. COFRADIC eliminates a
[Fig. 3 graphic: chemical structures of (A) ATP and (B) FSBA, and (C) the sorting reaction converting an SBA-labeled peptide into an SB-labeled peptide (25 min at 25 °C).]
Fig. 3. FSBA COFRADIC. The structures of ATP and its reactive homolog FSBA are shown in (A) and (B), respectively. The COFRADIC reaction sorting for SBA-labeled peptides is shown in (C).
Trang 9large number of peptides that are irrelevant to the
biological problem under consideration, thereby reducing the complexity of the problem without losing much information. Unlike targeted peptide-centric approaches such as isotope-coded affinity tags [23], COFRADIC is not based on affinity procedures, which are limited at two levels: first, the chemistries used to convert sets of peptides into affinity probes; and second, the limitations of mass transfer that are inherent to liquid–solid state chemistries [79]. At the first level, COFRADIC has a fundamental advantage because its chemistries do not need to create different affinity labels. For instance, an affinity tag specific for methionyl peptides is extremely difficult to establish; in contrast, a simple oxidation step by hydrogen peroxide will specifically produce methionyl-sulfoxide derivatives showing significant hydrophilic shifts in diagonal reverse-phase chromatography [32]. At the second level, affinity-based experiments [23,27,48] have limitations either at the level of incomplete or variable incorporation of the tag (for example, linking a biotinyl group to a specific set of peptides can be incomplete and partly unspecific) or at the level of interactions of tagged peptides with the affinity resin, where the highest affinities and avidities are not always reached. In contrast, using COFRADIC, we select subsets of peptides related to the biological question under investigation. For instance, for the study of the oxidation of protein methionines during oxidative stress, cells can be differentially labeled with [13C]methionine or [12C]methionine. With COFRADIC, we sort for methionine-containing peptides only: thus, we select out of the mixture only those peptides containing the differential information, while all other peptides, which are of no relevance, are discarded.
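The selection of methionine-containing peptides described above can be illustrated with a minimal Python sketch. It is not part of the published workflow: the peptide sequences and function names are hypothetical, and the only physical constant used is the monoisotopic mass of oxygen, which is added once per methionine on oxidation to the sulfoxide.

```python
# Illustrative sketch only: sequences and function names are hypothetical.
# Monoisotopic mass of one oxygen atom (Da), added per methionine when it is
# oxidized to methionine sulfoxide.
OXYGEN_MONO = 15.9949146


def methionine_peptides(peptides):
    """Return only the peptides that contain at least one methionine (M)."""
    return [p for p in peptides if "M" in p]


def sulfoxide_mass_shift(peptide):
    """Mass increase (Da) if every methionine in the peptide is oxidized."""
    return peptide.count("M") * OXYGEN_MONO


# Hypothetical one-letter peptide sequences from a digest.
peptides = ["AGMLK", "TPEELR", "MDMSK", "VLSGR"]
selected = methionine_peptides(peptides)

for p in selected:
    print(p, round(sulfoxide_mass_shift(p), 4))
```

Peptides without methionine are simply dropped, which mirrors the idea that only peptides carrying the differential information are retained for analysis.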
All kinds of peptide selections can be done without modifying or adapting the sorting apparatus itself each time. The latter is, in principle, an automated HPLC apparatus equipped with an autosampler, and can be purchased from a variety of companies; and, at least in our hands, HPLC solvent gradients and flow rates can nowadays be controlled such that the overall reproducibility of HPLC runs is very high, allowing efficient peptide sorting by COFRADIC. The only parameter that needs changing is the nature of the COFRADIC sorting reaction, which can be chemical or enzymatic, but should under all circumstances be highly specific and preferably quantitative. Together with a sufficiently large chromatographic shift (to segregate altered and unaltered peptides to the highest degree), the specificity and quantitative nature of the COFRADIC sorting reaction are clearly crucial for efficient peptide sorting. Unspecific sorting reactions and only slight alterations in peptide column retention will yield ‘impure’ sorted peptides, whereas nonquantitative sorting reactions will lead to irreproducible peptide sorting.
In Table 1, the chemical or enzymatic modification reactions that have been used successfully in a COFRADIC-based approach are listed. They cover sorting methods varying from modifications to specific side chains, such as cysteinyl [80] and methionyl [32] moieties, to post-translational modifications by phosphatase [37] and PNGaseF [39] treatments. In another application, peptides located at the N-termini of proteins or of their fragments are sorted [33]. In this way, we have successfully analyzed protein processing in highly complex proteomes by the target proteins and identified the exact cleavage site(s), creating the basis for fundamental protease degradomics [34,36].
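The role of the chromatographic shift in peptide sorting, as discussed above, can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the authors' software: retention times, the fraction interval and the `min_shift` safety margin are hypothetical, and a peptide is counted as sorted only when it elutes outside the primary collection interval in the secondary run.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    name: str
    rt_primary: float    # retention time in the primary run (min)
    rt_secondary: float  # retention time after the sorting reaction (min)


def sort_fraction(peptides, start, end, min_shift=0.0):
    """Split one primary fraction [start, end) into sorted and unsorted peptides.

    A peptide is 'sorted' when, in the secondary run, it elutes outside the
    original collection interval by more than min_shift minutes (hypothetical
    margin); otherwise it stays on the diagonal.
    """
    sorted_out, on_diagonal = [], []
    for p in peptides:
        if not (start <= p.rt_primary < end):
            continue  # peptide was not collected in this primary fraction
        outside = (p.rt_secondary < start - min_shift
                   or p.rt_secondary >= end + min_shift)
        if outside:
            sorted_out.append(p.name)
        else:
            on_diagonal.append(p.name)
    return sorted_out, on_diagonal


# Hypothetical peptides collected around a 40-41 min primary fraction.
peps = [
    Peptide("met_ox", 40.5, 36.0),       # large hydrophilic shift
    Peptide("unmodified", 40.3, 40.4),   # no reaction, stays in the interval
    Peptide("small_shift", 40.8, 39.9),  # only slight change in retention
    Peptide("other", 45.0, 45.0),        # belongs to a different fraction
]

print(sort_fraction(peps, 40.0, 41.0))
print(sort_fraction(peps, 40.0, 41.0, min_shift=0.5))
```

With a safety margin, the peptide showing only a slight retention change is no longer collected, which illustrates why small shifts make clean separation of altered and unaltered peptides difficult.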
As mentioned above, it is also possible to set up specific covalent interactions between proteins and small molecules such as drugs or mimetic molecules of natural metabolites such that chromatographic shifts can be evoked, thus allowing sorting of the conjugated peptides by COFRADIC. The example shown relates to a study with an ATP analog [40]; however, it can, in principle, be extended to drugs that covalently interact with their target protein, either directly or after being metabolized in the tissue or organism to form reactive products.
One of the drawbacks of COFRADIC relates to the segmentation of the peptide separation flow during the primary run: many peptides may end up in two consecutive fractions for their secondary analyses. When the same separation is repeated a second time or further times, peptides eluting at the boundaries of the primary selected time intervals may show a slight drift, thus ending up in different secondary fractions. By the same effect, peptides that are differentially labeled by isotopes, such as hydrogen and deuterium, the derivatives of which display slightly different chromatographic properties, may be artificially enriched in certain fractions. Such situations could impose a hindrance on any type of quantitative differential analyses performed with COFRADIC. In addition, if eluting peptide peaks are cut, the remaining material is diluted, imposing limitations on the overall sensitivity. This sensitivity issue remains one of the limitations of the technology; however, it is very well compensated by the efficiency of the other steps in the system: limited losses during consecutive chromatographic separations, more freedom in the selection of efficient reaction systems and better reaction conditions, which are performed in a homogeneous phase, and, finally, the fact that at the end only a selection of peptides is presented for analysis to mass spectrometers. The stability of the peptide elution profile, particularly in the primary runs of COFRADIC approaches, has not been found to be a big problem as long as the buffer conditions in which the sample has been prepared are kept constant.
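The boundary effect described above can be made concrete with a small Python sketch. The fraction width of 1 min, the retention times and the drift value are assumptions chosen for illustration only; the point is that a small run-to-run drift moves only those peptides that elute close to a fraction boundary.

```python
import math


def fraction_index(rt, start=0.0, width=1.0):
    """Index of the primary fraction that collects a peptide eluting at rt (min).

    Fractions are assumed to be consecutive intervals of equal width.
    """
    return math.floor((rt - start) / width)


def moved_by_drift(rts, drift, start=0.0, width=1.0):
    """Names of peptides that change fraction when all retention times drift."""
    return [
        name
        for name, rt in rts.items()
        if fraction_index(rt, start, width)
        != fraction_index(rt + drift, start, width)
    ]


# Hypothetical primary-run retention times (min).
rts = {"pepA": 40.50, "pepB": 40.97, "pepC": 41.02}

moved = moved_by_drift(rts, drift=0.05)
print(moved)
```

In this toy case only the peptide eluting just before the 41 min boundary changes fraction. The same logic applies to isotopic variants with slightly different retention, which can end up in neighbouring fractions.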
One general drawback of signature peptides for characterizing protein modifications is the fact that only those peptides that can be separated by RP-HPLC, ionize well in mass spectrometers and yield informative MS/MS spectra can be identified. An interesting, recent development is top-down protein sequencing, which enables researchers to focus on an increasing set of protein modifications [81,82]. Such top-down techniques focus on complete proteins, allow detection of normally labile protein modifications, and avoid several problems associated with signature peptides (see above). However, proteins of interest need to be rather pure (the number of contaminating proteins should be low), which may currently hinder the routine applicability of such approaches.
This review has shown that the COFRADIC technology is extremely versatile and flexible and provides profound insights into biological questions, often much more than what could be obtained by alternative proteomics procedures such as 2D gels or shotgun proteomics. Its strong point is its high flexibility in selecting specific chemistries or enzymatic modifications oriented towards the biological question(s) under consideration. It should be clear from the supporting concepts that the repertoire of applications can only be expected to grow in the future through the development of specific chemical or enzymatic sorting reactions that alter the chemical nature of a predetermined set of peptides.
Acknowledgements
F Impens is a research assistant of the Fund for Scientific Research - Flanders (Belgium). The work in this paper was supported by research grants from the Fund for Scientific Research - Flanders (Belgium) (project number G.0280.07) and the Inter University Attraction Poles (IAP-Phase VI).
References
1 O’Farrell PH (1975) High resolution two-dimensional electrophoresis of proteins. J Biol Chem 250, 4007–4021.
2 Vandekerckhove J, Bauw G, Puype M, Van Damme J & Van Montagu M (1985) Protein-blotting on Polybrene-coated glass-fiber sheets. A basis for acid hydrolysis and gas-phase sequencing of picomole quantities of protein previously separated on sodium dodecyl sulfate/polyacrylamide gel. Eur J Biochem 152, 9–19.
3 Aebersold RH, Teplow DB, Hood LE & Kent SB (1986) Electroblotting onto activated glass. High efficiency preparation of proteins from analytical sodium dodecyl sulfate-polyacrylamide gels for direct sequence analysis. J Biol Chem 261, 4229–4238.
4 Bauw G, De Loose M, Inze D, Van Montagu M & Vandekerckhove J (1987) Alterations in the phenotype of plant cells studied by NH(2)-terminal amino acid-sequence analysis of proteins electroblotted from two-dimensional gel-separated total extracts. Proc Natl Acad Sci USA 84, 4806–4810.
5 Rosenfeld J, Capdevielle J, Guillemot JC & Ferrara P (1992) In-gel digestion of proteins for internal sequence analysis after one- or two-dimensional gel electrophoresis. Anal Biochem 203, 173–179.
6 Henzel WJ, Billeci TM, Stults JT, Wong SC, Grimley C & Watanabe C (1993) Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases. Proc Natl Acad Sci USA 90, 5011–5015.
7 James P, Quadroni M, Carafoli E & Gonnet G (1993) Protein identification by mass profile fingerprinting. Biochem Biophys Res Commun 195, 58–64.
8 Mann M, Hojrup P & Roepstorff P (1993) Use of mass spectrometric molecular weight information to identify proteins in sequence databases. Biol Mass Spectrometry 22, 338–345.
9 Pappin DJ, Hojrup P & Bleasby AJ (1993) Rapid identification of proteins by peptide-mass fingerprinting. Curr Biol 3, 327–332.
10 Yates JR 3rd, Speicher S, Griffin PR & Hunkapiller T (1993) Peptide mass maps: a highly informative approach to protein identification. Anal Biochem 214, 397–408.
11 Challapalli KK, Zabel C, Schuchhardt J, Kaindl AM, Klose J & Herzel H (2004) High reproducibility of large-gel two-dimensional electrophoresis. Electrophoresis 25, 3040–3047.
12 Klose J (1999) Large-gel 2-D electrophoresis. Methods Mol Biol 112, 147–172.
13 Sadygov RG, Cociorva D & Yates JR 3rd (2004) Large-scale database searching using tandem mass spectra: looking up the answer in the back of the book. Nat Methods 1, 195–202.
14 Kelleher NL, Lin HY, Valaskovic GA, Aaserud DJ, Fridriksson EK & McLafferty FW (1999) Top down versus bottom up protein characterization by tandem high-resolution mass spectrometry. J Am Chem Soc 121, 806–812.
15 Han X, Jin M, Breuker K & McLafferty FW (2006) Extending top-down mass spectrometry to proteins with