REVIEW ARTICLE
Expressed protein ligation
Method and applications
Ralf David, Michael P. O. Richter and Annette G. Beck-Sickinger
Institute of Biochemistry, Faculty of Biosciences, Pharmacy and Psychology, University of Leipzig, Germany
The introduction of noncanonical amino acids and biophysical probes into peptides and proteins, and total or segmental isotopic labelling, have the potential to greatly aid the determination of protein structure, function and protein–protein interactions. To obtain peptides as large as possible by solid-phase peptide synthesis, native chemical ligation was introduced to enable the synthesis of proteins of up to 120 amino acids in length. After the discovery of inteins, with their self-splicing properties, and their application in protein synthesis, the semisynthetic methodology of expressed protein ligation was developed to circumvent size limitation problems. Today, diverse expression vectors are available that allow the production, in large amounts and high purity, of the N- and C-terminal fragments needed for ligation (protein α-thioesters and peptides or proteins with N-terminal Cys). Unfortunately, expressed protein ligation is still limited mainly by the requirement for a Cys residue. Additional Cys residues can be introduced into the sequence by site-directed mutagenesis or synthesis; however, such mutations may disturb protein structure and function. Recently, alternative ligation approaches have been developed that do not require Cys residues. Accordingly, it is theoretically possible to obtain any modified protein using ligation strategies.
Keywords: expressed protein ligation; IMPACT™ system; intein; native chemical ligation.
Introduction
Proteins and peptides that have been modified by introducing noncanonical amino acids, fluorescence tags, spin resonance labels or cross-linking agents have great potential for investigations into protein–protein interactions and can help to elucidate protein structures. Furthermore, artificial peptides and proteins with new properties and with a broad range of applications can be obtained. Further interest lies in fragmental or complete isotopic labelling for NMR studies to determine protein structures.
Solid-phase peptide synthesis (SPPS) provides the possibility of introducing noncanonical amino acids into peptides but is restricted to peptides of up to 60 amino acids in length. By using expression systems in bacteria or yeast, the recombinant generation of peptides and proteins and their complete isotopic labelling has become possible [1–3]. The size of the constructs is not restricted but the insertion of noncanonical amino acids is difficult [4,5]. The limitation of peptide size in SPPS was circumvented by several approaches developed for the synthesis of proteins by segment condensation [6]. Liu et al. used a glycolaldehyde peptide ester for the reaction of an unmasked aldehyde with the amino group of an N-terminal Cys or Ser to form a thiazolidine or oxazolidine ring. Rearrangement of the O-acyl ester resulted in an amide bond with a pseudoproline residue [7]. In the thiol capture approach, where only Cys sidechains have to be protected, a 4-mercaptodibenzofuran ester forms an asymmetric disulfide bond with the N-terminal Cys of a second peptide, which is activated with an S-(methoxycarbonyl)sulfenyl (Scm) group. The free amino function of this amino acid can attack the carbonyl group of the ester, and an O→N-acyl transfer results in an amide bond. Reductive cleavage of the disulfide releases the free Cys sidechain [8]. In a further approach, CNBr cleavage fragments refold and form noncovalent complexes, after which the missing peptide bonds are re-formed [9]. Cytochrome c CNBr fragments 1–65 and 66–104 were modified and religated by this method [10], but this technique is limited by the need for a Met residue at the cleavage site.
Dawson et al. introduced a simple and elegant method called native chemical ligation (NCL) for the synthesis of peptides by condensation of their unprotected segments. The coupling of synthetic peptide thioesters with peptides carrying an N-terminal Cys leads to an amide bond at the ligation site. This approach has proven to be useful for the synthesis of smaller proteins up to 120 amino acids in
Correspondence to A. G. Beck-Sickinger, Institute of Biochemistry,
University of Leipzig, Brüderstr. 34, D-04103 Leipzig, Germany.
Fax: +49 341 97 36 909, Tel.: +49 341 97 36 900,
E-mail: beck-sickinger@uni-leipzig.de
Abbreviations: BAL, backbone amide linker; CBD, chitin binding
domain; eGFP, enhanced green fluorescent protein; EPL, expressed
protein ligation; FRET, fluorescence resonance energy transfer;
GFP, green fluorescent protein; HOBt, 1-hydroxybenzotriazole;
IMPACT™, intein-mediated purification with an affinity chitin
binding tag; IPL, intein-mediated protein ligation; NCL, native
chemical ligation; PTPase, protein tyrosine phosphatase; SPPS,
solid-phase peptide synthesis; TROSY, transverse relaxation
optimized spectroscopy; TWIN, two intein system.
(Received 12 November 2003, revised 19 December 2003,
accepted 5 January 2004)
length; larger proteins cannot be obtained easily in one ligation step. Multistep NCL of different peptide segments, however, can lead to larger proteins [11]. An extension of this NCL strategy is the expressed protein ligation (EPL) method [12], which uses recombinant thioesters and/or αCys-peptides. This review gives an overview of this method and its applications in the past few years.
Native chemical ligation
The method of native chemical ligation was introduced by Dawson et al. [13,14] and is based on the reaction between a thioester and the sidechain of a Cys residue, which was reported for the first time by Wieland et al. [15]. Two fully unprotected synthetic peptides react to form an amide bond, so that they are connected as in the native peptide backbone. The reaction proceeds under aqueous conditions at neutral pH. The first step of this process is the chemoselective transthioesterification of an unprotected peptide Cα-thioester with the N-terminal Cys of a second peptide. The thioester thus formed spontaneously undergoes an S→N-acyl transfer to form a native peptide bond, and the resulting peptide product is obtained in its final form. Internal Cys residues within both peptide segments are permitted because the initial transthioesterification step is reversible and no side products are obtained; thus, no protecting groups are necessary. An alternative method was introduced by Tam et al. [16,17], in which a C-terminal thiocarboxylic acid is S-alkylated by an N-terminal α-bromoAla to form a covalent thioester. This rearranges by an S→N-acyl shift and builds an -X-Cys- peptide bond (Fig. 1).
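The sequence-level outcome of NCL described above can be sketched in a few lines. This is only an illustration under stated assumptions: peptides are written as one-letter strings, the function name and the example sequences are hypothetical, and the two chemical steps (transthioesterification and S→N-acyl shift) are collapsed into a single concatenation because the code does not model any chemistry.

```python
def native_chemical_ligation(thioester_segment: str, cys_segment: str) -> str:
    """Return the ligation product of two unprotected segments (sequence level only).

    thioester_segment: peptide assumed to carry a C-terminal thioester.
    cys_segment: peptide that must begin with an N-terminal Cys ("C").
    """
    if not thioester_segment or not cys_segment:
        raise ValueError("both segments must be non-empty")
    if cys_segment[0] != "C":
        raise ValueError("the second segment needs an N-terminal Cys")
    # Transthioesterification followed by the S->N acyl shift leaves a native
    # peptide bond, so the product is simply the joined sequence (-X-Cys-).
    return thioester_segment + cys_segment


# Hypothetical example segments
print(native_chemical_ligation("LYRAG", "CFKAL"))
```

Internal Cys residues in either segment need no special handling here, which mirrors the statement that they are tolerated because the first step is reversible.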
To prevent the thiol of the N-terminal Cys from being oxidized, and thus forming an unreactive disulfide-linked dimer, it is necessary to add thiols or other reducing reagents such as tris(2-carboxyethyl)phosphine (TCEP) [18] to the reaction mixture. Furthermore, the addition of an excess of thiols not only keeps the thiol functions reduced but also increases the reactivity by forming new thioesters through transthioesterification [19]. The addition of solubilizing agents such as urea or guanidinium hydrochloride does not affect the ligation reaction and can be used to increase the concentration of peptide segments, which results in higher yields. The compatibility and efficiency of all proteinogenic amino acids at the C-terminus of the thioester peptide in NCL were determined by Hackeng et al. [20]. All amino acids except Val, Ile and Pro are well suited to the -X-Cys- position in NCL; Val, Ile and Pro are reported to react slowly. Also, Asp and Glu as C-terminal residues are less favourable because of the formation of side products [21].
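When planning where to split a target sequence, the residue preferences above can be applied as a simple lookup. The following is a minimal sketch, assuming one-letter sequences; the function names, the three rating labels and the example sequence are hypothetical, and the classification only restates the qualitative findings cited in the text (Val, Ile and Pro react slowly; Asp and Glu give side products).

```python
SLOW_RESIDUES = {"V", "I", "P"}          # reported to react slowly
SIDE_PRODUCT_RESIDUES = {"D", "E"}       # reported to give side products


def rate_junction(x_residue: str) -> str:
    """Rate the residue X that would sit at the thioester C-terminus (-X-Cys-)."""
    if x_residue in SLOW_RESIDUES:
        return "slow"
    if x_residue in SIDE_PRODUCT_RESIDUES:
        return "side products"
    return "favourable"


def candidate_junctions(sequence: str):
    """List (Cys index, X residue, rating) for every internal Cys in the sequence."""
    sites = []
    for i, residue in enumerate(sequence):
        if residue == "C" and i > 0:     # a Cys at position 0 has no preceding X
            x = sequence[i - 1]
            sites.append((i, x, rate_junction(x)))
    return sites


# Hypothetical target sequence
for site in candidate_junctions("MKVCAGCDEC"):
    print(site)
```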
A useful application of NCL is solid-phase chemical ligation (SPCL) [22]. In this approach, one of the two segments is bound to a polymer, while the other is applied in aqueous solution and can be used in excess. A simple washing step completely removes the solubilized peptides, and the assembled full-length protein can be cleaved from the resin.
In the tandem peptide ligation approach, NCL is applied to the synthesis of peptides and proteins requiring two or more ligation steps. NCL is combined with a pseudoproline ligation by imine capture [23]; a third step can be a pseudoglycine ligation [24].
In addition to Cys, related amino acids, including selenoCys [25] and selenohomoCys [26], have been reported to work in a similar manner.
Thioester formation
The bottleneck in NCL is the generation of the thioester. Several approaches have been developed using solid-phase peptide synthesis. Most of the strategies to obtain peptide thioesters have used the Boc strategy [13,17] because of the base lability of the thioester. However, different attempts to synthesize thioesters have been made using the 9-fluorenylmethoxycarbonyl (Fmoc) method. In general, the Fmoc strategy has several advantages over the Boc strategy, the first being the milder conditions used for cleavage from the resin. To circumvent the susceptibility of the thioester linkage to nucleophiles such as piperidine, which is used for the removal of the Fmoc protecting group, several cocktails for deprotection have been developed, e.g. 1,8-diazabicyclo[5.4.0]undec-7-ene (DBU) with 1-hydroxybenzotriazole (HOBt) [27], 1-methylpyrrolidine with hexamethyleneimine and HOBt [28] or DBU and HOBt [29]. The final cleavage from the resin then results in the peptide thioester.
Further methods were introduced that used different resins. One is based on modifications of Kenner's sulfonamide 'safety catch' linker [30]. The C-terminus of the growing peptide chain is attached to the resin with an acid- and base-stable N-acyl sulfonamide linker. The sulfonamide is activated after peptide synthesis by N-alkylation using diazomethane or iodoacetonitrile. Cleavage occurs with nucleophiles such as thiols, which finally results in a peptide thioester [31,32]. In the backbone amide linker (BAL) strategy, the first carboxy-terminally protected amino acid is attached to the resin through the backbone nitrogen. The peptide chain grows in the N-terminal direction. Deprotection, activation and thioester formation at the carboxy terminus occur on the solid support. The peptide thioester can be cleaved from the resin with trifluoroacetic acid [33]. Another approach uses standard resins such as phenylacetamidomethyl (PAM) or 4-hydroxymethyl benzoic acid (HMBA), the Lewis acid Al(CH3)2Cl and thiols in
Fig. 1. Ligation of unprotected peptide segments. In native chemical ligation (A) the first step is a transthioesterification of a Cα-thioester by the thiol function of an N-terminal Cys, followed by a spontaneous S→N-acyl shift to obtain a native peptide bond. In an alternative approach (B), a Cα-thiocarboxylic acid reacts with an α-bromo amino acid by forming a thioester. This leads to the same product as in method A.
methylene chloride [34]. Unfortunately, the alkylaluminium thiolate method can lead to epimerization at the C-terminus and reactions at the sidechains, e.g. sidechain thioesters and aspartimide formation. This can be avoided by using a weaker Lewis acid, e.g. Al(CH3)3 [35]. A further possibility is the synthesis of peptides on Cl-trityl resin and the cleavage of the fully protected peptide chain with acetic acid and trifluoroethanol. The thioester can be obtained by the treatment of the protected peptide with activating reagents and thiols [36,37]. After deprotection of functional sidechains with trifluoroacetic acid, the thioester can be easily purified by HPLC (Fig. 2).
An alternative approach for the thioester synthesis of larger peptides and proteins in high yields and purity uses a bacterial expression system based on the intein-mediated self-splicing mechanism of precursor proteins, as discussed below.
Recombinant generation of proteins
with C-terminal thioester or N-terminal Cys
Inteins and their use in protein chemistry
Inteins are internal segments of precursor proteins that catalyze their own excision, in an intramolecular process called protein splicing, with the concurrent ligation of the two flanking external regions (N- and C-exteins) through a native peptide bond. This finally yields the host protein. Thus, inteins are analogues of self-splicing RNA introns. The first intein was discovered in 1987 and more than 100 inteins have been listed to date [38–40]. The origin of inteins is not yet clear. However, understanding of inteins, their evolution, distribution and properties is easier if they are considered as parasitic genetic elements. They do not contribute to an organism's fitness when they are propagated into the next generation. The insertion of an intein gene into a protein gene can be described through the so-called homing cycle. Homing is the transfer of a parasitic genetic element to a cognate allele that lacks the element. This process results in the duplication of the parasitic genetic element and its rapid spread in a population [41–43]. Inteins occur in organisms of all three domains of life as well as in viral and phage proteins. There they are predominantly found in enzymes involved in DNA replication and repair [40,44]. Inteins can be divided into four classes: the maxi inteins (with an integrated endonuclease domain), mini inteins (lacking the endonuclease domain), trans-splicing inteins (where the splicing junctions are not covalently linked) and Ala inteins (Ala as the N-terminal amino acid) [45]. The sequences of inteins have some characteristics in common. They appear in conserved regions of the host protein, and all intein sequences harbour different motifs, termed A and B (which contain a conserved Thr and His) at the N-terminal splicing domain, and F and G at the C-terminal splicing domain (Fig. 3). Endonuclease-containing inteins also bear the blocks C, D, E and H [38,46]. The N-terminal amino acids are typically Cys, Ser or Ala. The C-terminal block G contains a conserved His/Asn pair and a downstream Cys, Ser or Thr amino acid.
The nucleophilic thiol or hydroxyl sidechains of the conserved amino acid residues led to the assumption that (thio)esters formed by an N→S- or an N→O-shift are intermediates of the internal rearrangement steps of the splicing reaction. This was proven by various investigations.
Fig. 2. Formation of synthetic peptide α-thioesters. Peptide α-thioesters can be synthesized by the Fmoc strategy by using backbone amide linker resins (A), acidic cleavage from mercaptoalkyl linker resins (B), Lewis acid-activated cleavage from common resins (C), cleavage of fully protected peptides (Boc, t-butyloxycarbonyl; tBu, t-butyl) and deprotection after thioester generation (D), and by using sulfonamide safety catch linker resins (E).
Fig. 3. Characteristic positions of intein motifs and numbering. The inserted intein is flanked by the N-terminal extein (left shaded box) and the C-terminal extein (right shaded box). The residues important for the splicing process, as well as the conserved segment blocks (A, B, C, D, E, H, F, G) and some internal intein key amino acids, are depicted in the one-letter code within the respective segments (bold black). The amino acids of a precursor protein are numbered in the following way: the intein's N-terminal amino acid (Cys, etc.) is numbered as 1, whereas the C-terminal amino acid of the N-terminal extein is numbered as −1 and the N-terminal residue of the C-terminal extein is numbered beginning with +1.
Replacement, through site-directed mutagenesis, of the N-terminal amino acid residues containing a nucleophilic thiol or hydroxyl sidechain, and of the Asn at the C-terminus, resulted in a complete loss of splicing activity of the intein [47,48].
Splicing mechanism
The first step of the well understood standard splicing process of inteins (Fig. 4) is the transfer of the N-terminal extein unit to the sidechain -SH or -OH group of a Cys/Ser residue located at the immediate N-terminus of the intein (N→S-acyl shift). In some cases, inteins bear Ala at the ultimate position at their N-terminus. In such cases, the first step is circumvented [48,49] and the +1 nucleophile within the C-extein attacks the carbonyl carbon of the peptide bond at the N-terminal splicing junction. The N→S-acyl rearrangement seems to be thermodynamically highly unfavourable, but the molecular architecture of the intein forces the scissile peptide bond into a twisted conformation of higher energy and thereby pushes the equilibrium to the (thio)ester side. The following step is a further transfer of the N-terminal extein to the Cys/Ser/Thr at the +1 position of the C-extein, which leads to a branched intermediate. In the last step, which might be a concerted reaction, a conserved Asn residue at the C-terminus of the intein cyclizes and a peptide bond is formed between the two exteins through an S→N-acyl shift [50].
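The net result of the splicing steps can be modelled at the sequence level as follows. This is a rough sketch under stated assumptions, not a mechanistic simulation: sequences are one-letter strings, the class and function names and the toy sequences are hypothetical, and the only checks are the residue requirements named in the text (Cys, Ser or Ala at intein position 1, a C-terminal Asn on the intein, and Cys, Ser or Thr at position +1 of the C-extein).

```python
from dataclasses import dataclass


@dataclass
class Precursor:
    n_extein: str   # ends at position -1
    intein: str     # starts at position 1, ends with the C-terminal Asn
    c_extein: str   # starts at position +1


def splice(p: Precursor):
    """Return (spliced host protein, excised intein) or raise if splicing is blocked."""
    if p.intein[0] not in "CSA":
        raise ValueError("intein position 1 must be Cys, Ser or Ala")
    if p.intein[-1] != "N":
        raise ValueError("C-terminal Asn missing: succinimide formation cannot occur")
    if p.c_extein[0] not in "CST":
        raise ValueError("position +1 of the C-extein must be Cys, Ser or Thr")
    # Acyl shift, transfer to +1, Asn cyclization and the final S->N (or O->N)
    # shift together join the exteins through a native peptide bond.
    return p.n_extein + p.c_extein, p.intein


# Hypothetical toy precursor
host, intein = splice(Precursor("MKT", "CLSGDTHN", "CGE"))
print(host, intein)
```

An intein whose C-terminal Asn has been replaced by Ala raises an error in this model, which corresponds to the blocked splicing that is exploited for thioester generation.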
This splicing mechanism implies the importance of the conserved amino acids flanking the splicing junctions, such as the block B Thr and His, and the block G His [45].
In the case of C-terminal splicing, the cumulative data indicate that the penultimate His appears to assist the cyclization of the C-terminal Asn, although mutants of this residue have been reported that did not prevent splicing. The three-dimensional structure of the splicing domain at the N-terminal part of the intein forces the peptide bond into a twisted conformation. This bond could also be protonated by the penultimate His residue mentioned above. Mutation of this amino acid did not affect the first steps of the splicing up to the branched intermediate but abolished the final step. In the X-ray crystal structure of the Mycobacterium xenopi gyrase intein (Mxe GyrA) (Fig. 5), His197 is hydrogen bonded to Asn198 so that His197 is oriented for the donation of a proton from its Nδ position to the emerging α-amino group of the C-extein, prior to the S→N-acyl shift [51,52]. Some putative inteins that lack the penultimate His residue are either inactive or use other amino acids. Accordingly, the penultimate His is not absolutely required but increases the splicing rate. Block B contains Thr and His residues that are separated by two amino
Fig. 4. Mechanism of intein-mediated protein splicing. In the initial step a thioester intermediate is formed by an N→S-acyl shift at the N-terminal Cys of the intein (Cys1). Transthioesterification by nucleophilic attack of the sidechain of the N-terminal Cys of the C-extein (Cys+1) on the thioester formed in the first step results in a branched intermediate. Peptide bond cleavage coupled to succinimide formation of the C-terminal intein Asn releases the intein. The joined exteins undergo a spontaneous S→N-acyl shift and yield a peptide bond. Peptide bond cleavage can occur independently at both splicing sites. Mutation of Cys1 to Ala prevents splicing at the N-terminus and leads to a C-terminal extein bonded with the intein. C-terminal splicing cannot occur when the C-terminal Asn is substituted by an Ala residue, and the N-terminal extein is cleaved by nucleophilic attack.
acids. Both play a key role in the N-terminal splicing process. Substitution of the block B His by Leu in Sce VMA abolished splicing [53,54] and only C-terminal cleavage occurred. This implies that this His residue takes part in the first N→S rearrangement at the N-terminal splicing junction. X-ray crystal structures of Sce VMA1 [55–57] and Mxe GyrA [51] with exteins showed a protonation of the scissile peptide bond through the imidazole ring. This interaction promotes the breakdown of the tetrahedral intermediate formed by the +1 nucleophilic attack on the N-terminal thioester bond. These findings were further elucidated and confirmed through investigations of Ala inteins. The exact role of Thr is not yet fully understood because of the lack of available structural data. It has been postulated that the Mxe GyrA intein stabilizes the tetrahedral intermediate at the N-terminal splicing junction by the formation of an oxyanion hole through Nδ of Asn74 and the block B Thr.
Both effects, the spatial constraints and the electronic influence, lead to a reactive and accessible electrophilic carbon of the scissile peptide bond; an acid/base catalysis mechanism has been suggested.
Furthermore, divalent transition metal cations influence the protein splicing process. It was shown for the split inteins Ssp DnaE and Mtu RecA that micromolar concentrations of Zn²⁺ ions decreased the splicing rate and that Zn²⁺ ion concentrations in the millimolar range stopped the process completely through chelation of key amino acids. A similar effect was obtained for Cd²⁺ ions [58,59].
Classification of inteins
The elucidation of the splicing mechanism and the identification of the key amino acid residues involved in the scission and ligation of the peptide bonds facilitated the molecular engineering of artificial inteins as tools for different applications in protein chemistry. Currently, there are five general ways in which inteins are used in this field: (a) modified inteins with an inducible autocatalytic cleavage activity are used for protein purification; (b) inteins are used for trans-splicing, in which the intein is split into two fragments that can recombine and reconstitute their splicing activity in vivo or in vitro; (c) intein-mediated protein ligation (IPL) is used for the generation of specifically mono-activated proteins, which can be ligated further with peptide segments, providing access to artificially labelled proteins; (d) inteins facilitate the synthesis of cyclic proteins; and (e) inteins are used for the detection of protein–protein interactions [45,46].
Three-dimensional structures of inteins
The structure of the intein Sce VMA1, determined by X-ray crystallography, clearly shows two domains (Fig. 5) [55–57]. The structure of the splicing domain is similar to that of the mini intein in the Mycobacterium xenopi gyrase (Mxe GyrA) [51]. Residues from the endonuclease domain of Sce VMA1 contribute to target sequence-specific contacts, as do parts of the other domain that are distant from the Sce VMA1 cleavage site. Several photo-crosslinking studies have been carried out to identify these residues [60]. The splicing domains have predominantly all-β structures and show high similarity to the structure of the hedgehog proteins, which are important in the development of multicellular organisms [61].
Formation of C-terminal thioester-activated proteins
Protein engineering via NCL requires the specific generation of C-terminal thioester-tagged proteins, allowing ligation with a second peptide or protein containing an N-terminal Cys or Ser residue. An efficient synthesis of Cα-thioesters of bacterially expressed proteins was found through studies of the N-terminal cleavage mechanism of inteins. In general, the cleavage of the peptide bonds at either the N-terminus or the C-terminus of the intein can occur independently. Replacement of the C-terminal Asn by Ala blocked the splicing process in the Pyrococcus species GB-D intein. However, the lack of succinimide formation did not affect the preceding N→O-acyl rearrangement at the N-terminal splicing junction. The same was found previously for the N→S-acyl shift in the Sce VMA intein. Incubation of this modified intein with thiols, such as dithiothreitol, releases the corresponding free C-terminal thioester-tagged extein from the N-terminal splicing junction through transthioesterification. This thiol-inducible cleavage activity of an engineered intein was the beginning of the extensive exploitation of other intein mutants as workhorses in biotechnology to obtain mono-thioester labelled proteins and αCys-proteins [46,50].
Fig. 5. Comparison of Mxe GyrA (A) and Sce VMA (B) intein structure. The structures of both inteins have been determined by X-ray crystallography [51,55,56] (PDB files 1AM2 and 1LWS, http://www.rcsb.org/pdb/). Blue arrows indicate β-sheets whereas purple cylinders symbolize α-helices. The N-termini are coloured in green and C-terminal β-sheets in red. The endonuclease domain of Sce VMA (right part) is clearly separated from the self-splicing domain (left part).
The IMPACT™ system [62] [intein-mediated purification with an affinity chitin binding tag (Fig. 6)] is commercially available from New England Biolabs and allows the single-column isolation of protein thioesters by utilizing the thiol-induced self-cleavage activity of various inteins. In this system, the target gene is cloned into an expression vector directly at the N-terminus of a modified intein. An additional chitin binding domain (CBD) from Bacillus circulans is fused to the C-terminal part of the intein and enables the affinity purification of the expressed three-segment fusion protein. All other cell proteins can be washed away from the adsorbed fusion protein and, after induction of the cleavage with an excess of thiol and overnight incubation, the protein of interest can be eluted from the chitin resin as a C-terminal thioester. Several inteins are available (Table 1), which differ with respect to the thiols used at 4 °C. Additionally, there are recombinant inteins that cleave the C-terminal extein upon a change in pH or temperature. This can be applied to protein purification or, in EPL, to the synthesis of the Cys segment. In the case of C-terminal thioester synthesis, modified mini inteins with an Asn→Ala mutation are commonly used, derived from the genes of Mycobacterium xenopi (Mxe GyrA), Saccharomyces cerevisiae (Sce VMA), Methanobacterium thermoautotrophicum (Mth RIR1) and Synechocystis sp. PCC6803 (Ssp DnaB). The cleavage takes place only at the N-terminus of the intein because of the absence of the Asn cyclization. These inteins can be cleaved with great efficiency through induction with various thiols. Together with the thioester stability, this is an important chemical aspect for the subsequent protein ligation.
Choice of thiols
For the thiolysis of the intein fusion proteins, a broad range of thiols has been investigated. The choice of a certain thiol depends on the accessibility of the catalytic pocket of the intein/extein splicing domain and the properties of the target protein of interest. In general, the thiols should be small, nucleophilic molecules that can enter the catalytic pocket to attack the thioester bond connecting the extein and the intein. For the further application of protein thioesters in EPL, two things have to be considered, depending on the synthesis strategy. On the one hand, the protein thioester should be stable to hydrolysis in order to be isolated. On the other hand, the thioester should also be reactive enough in EPL. Simple alkyl thioesters are quite stable to hydrolysis but not very reactive. Mixtures of alkylthiols and thiophenol [12,19] or 2-mercaptoethanesulfonic acid (MESNA) [63] improved the reactivity. If there is no need to isolate the thioester, MESNA or thiophenol can be used directly for the induction of the cleavage and the subsequent reaction. Instead of thiols, other nucleophiles such as hydroxylamine [45] can also be used to induce protein splicing and the isolation of the target protein.
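Because the thiol used for cleavage ends up in the isolated product, the expected mass of a protein thioester depends on that choice. The sketch below is a simple bookkeeping aid under stated assumptions: formally, the thioester corresponds to the free C-terminal acid plus the thiol minus one water; standard average atomic masses are used; MESNA is taken in its free acid form; and the function names and the example protein mass are hypothetical.

```python
# Average atomic masses (g/mol) for the elements needed here
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "S": 32.06}

# Molecular formulas of thiols mentioned in the text
THIOLS = {
    "MESNA": {"C": 2, "H": 6, "O": 3, "S": 2},        # 2-mercaptoethanesulfonic acid
    "thiophenol": {"C": 6, "H": 6, "S": 1},
    "dithiothreitol": {"C": 4, "H": 10, "O": 2, "S": 2},
}
WATER = {"H": 2, "O": 1}


def formula_mass(formula: dict) -> float:
    """Sum the average atomic masses for a formula given as element counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def thioester_mass(protein_acid_mass: float, thiol: str) -> float:
    """Mass of R-CO-SR' formed formally from R-COOH and HSR' with loss of water."""
    return protein_acid_mass + formula_mass(THIOLS[thiol]) - formula_mass(WATER)


# Hypothetical protein with a free-acid average mass of 10000.0 Da
for name in THIOLS:
    print(name, round(thioester_mass(10000.0, name), 3))
```

Comparing such values with a measured mass is one way to tell a thioester from the hydrolysed free acid, which appears at the unshifted protein mass.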
Fig. 6. Expressed protein ligation. The synthesis of proteins with a C-terminal thioester (left) and proteins with an N-terminal Cys (right) can be performed by using the IMPACT™ system. Thioesters can be obtained by fusing the protein of interest to the N-terminus of an intein, proteins with N-terminal Cys by fusing to the C-terminus of a mutated intein. Separation occurs by using the chitin binding domain (CBD). Both fragments can also be synthesized by SPPS and specifically labelled at the N- or C-terminus of the protein. Ligation of both fragments proceeds under the conditions of NCL.
Table 1. Intein-based vectors and their potential applications. Mxe GyrA, Mycobacterium xenopi gyrase A; Mth RIR1, Methanobacterium thermoautotrophicum; Ssp DnaB, Synechocystis sp. PCC6803; Sce VMA, Saccharomyces cerevisiae.
Vector            Intein    Splice junction  Cleavage induction  References  Applications
pTXB1, 3          Mxe GyrA  C-terminus       Thiolᵃ              [64]        Purification, generation of C-terminal thioesters
pTYB1, 2          Sce VMA   C-terminus       Thiolᵃ              [62]        Purification, generation of C-terminal thioesters
pTWIN1            Ssp DnaB  N-terminus       pH and temperature  [88]        Purification, C-terminal thioesters, αCys-proteins, protein ligation, cyclization
                  Mxe GyrA  C-terminus       Thiolᵃ              [88]
pTWIN2            Ssp DnaB  N-terminus       pH and temperature  [111]       Purification, C-terminal thioesters, αCys-proteins, protein ligation, cyclization
                  Mth RIR1  C-terminus       Thiolᵃ
pTYB11, 12        Sce VMA   N-terminus       Thiolᵃ              [112]       Purification
pTYB3, 4, pKYB1   Sce VMA   C-terminus       Thiolᵃ              [40]        Purification, generation of C-terminal thioesters
ᵃ Other nucleophiles might be used for the induction of the protein cleavage.
Generation of αCys proteins
EPL requires a peptide or protein that contains an amino-terminal Cys residue (αCys) in addition to the α-thioester moiety. To synthesize proteins possessing an αCys, the protein cDNAs of interest can be cloned into various commercially available vectors as mentioned above (IMPACT™ system). Thus, after expression, the intein/CBD fusion protein can be purified on a chitin column and cleaved by changing the pH or temperature. This leads to the free αCys proteins. One drawback in the intein-based synthesis of αCys proteins is the possible spontaneous cleavage, which results in a loss of the purification tag [45,64].
Expressed protein ligation (EPL)
Expressed protein ligation [12,50,65,66], also named intein-mediated protein ligation [46], is an extension of the NCL method. A recombinant Cα-thioester reacts with a chemically synthesized or expressed peptide/protein possessing an N-terminal Cys under the conditions of NCL to form a native peptide bond. This ligation method combines the advantages of molecular engineering and chemical peptide synthesis and allows the site-specific introduction of unnatural amino acids and chemical or biophysical tags into large proteins. Previously, the difficulty of this strategy was the chemical preparation of peptides or proteins with a C-terminal thioester and the generation of peptides and proteins with N-terminal Cys residues in large quantities and high purity. Now, the expression of both segments in high yields is possible by using the IMPACT™ system introduced above. Thioesters can be obtained by fusing the protein of interest with the N-terminus of an intein, proteins with N-terminal Cys by fusing with the C-terminus of a mutated intein [64]. Alternatively, either fragment needed for ligation can be synthesized by SPPS as described already, so it is possible to introduce specific labels at either the N- or the C-terminus of the protein. The chemically synthesized section can be kept as small as possible, whereas the expressed part is not limited in size. This can lead to very large, specifically labelled proteins.
Expressed protein ligation can be performed directly on chitin beads and thiolysis and ligation can occur simulta-neously.It is disadvantageous if solubilizing agents are needed for the ligation, because urea or guanidinium hydrochloride for example denaturate the chitin binding domain at concentrations higher than 2M.Alternatively, the thioester may be eluted and the ligation reaction may proceed in a second step.Detergents, urea or guanidinium hydrochloride can be used in higher concentrations to increase the solubility of peptides which may result in a higher reaction yield
If an amino acid within the protein sequence, or several amino acids at both ends, were to be modified, the protein would have to be split into three or more fragments and two or more ligation steps would have to be carried out. The second peptide fragment, which carries an N-terminal Cys and an additional C-terminal thioester, has to be masked recombinantly at the N-terminus with a protease cleavage site, e.g. for factor Xa protease. After the first ligation step, the N-terminal Cys is liberated by protease treatment and the second ligation step can be performed [50]. The protein is thus assembled in the C- to N-terminal direction.
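The sequential ligation scheme described above can be sketched in a few lines of Python. This is only an illustrative planning aid, not a published protocol: the names `Fragment`, `plan_fragments` and `ligation_steps`, and the toy sequence, are hypothetical. The sketch assumes that every junction is placed directly before a Cys residue, that every fragment except the last needs a C-terminal thioester, and that internal fragments need their N-terminal Cys masked until the preceding ligation step is complete.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    sequence: str
    c_thioester: bool    # fragment must carry a C-terminal thioester
    n_cys_masked: bool   # N-terminal Cys must be masked (e.g. by a protease site)


def plan_fragments(sequence, junctions):
    """Split `sequence` at 0-based Cys positions so that each later fragment starts with Cys."""
    cuts = sorted(set(junctions))
    for i in cuts:
        if not 0 < i < len(sequence) or sequence[i] != "C":
            raise ValueError(f"position {i} is not an internal Cys residue")
    bounds = [0, *cuts, len(sequence)]
    n = len(bounds) - 1
    fragments = []
    for k in range(n):
        fragments.append(
            Fragment(
                sequence=sequence[bounds[k]:bounds[k + 1]],
                c_thioester=k < n - 1,        # all but the C-terminal fragment
                n_cys_masked=0 < k < n - 1,   # internal fragments only
            )
        )
    return fragments


def ligation_steps(fragments):
    """Return the intermediate products when assembling from the C- to the N-terminus."""
    product = fragments[-1].sequence
    steps = []
    for fragment in reversed(fragments[:-1]):
        product = fragment.sequence + product
        steps.append(product)
    return steps


if __name__ == "__main__":
    toy = "MKTACDEFGCHIK"  # hypothetical sequence with Cys at positions 4 and 9
    for fragment in plan_fragments(toy, [4, 9]):
        print(fragment)
    print(ligation_steps(plan_fragments(toy, [4, 9])))
```

For a three-fragment design, only the middle fragment is flagged as needing both a thioester and a masked N-terminal Cys, which matches the requirement stated in the text.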
Applications of expressed protein ligation
EPL chemistry applications are summarised in Table 2 and described in more detail below.
Site-specific protein modifications
The ability to change specific sidechains by the insertion of noncanonical amino acids has great potential in protein structure/function studies.
To determine the role of post-translational modifications it is necessary to insert phosphorylations or glycosylations at defined positions. A phosphotyrosine peptide was ligated to the C-terminus of the protein tyrosine kinase C-terminal Src kinase (Csk), which resulted in an intramolecular phosphotyrosine–Src homology 2 interaction and increased catalytic phosphoryl transfer to a substrate when compared with a nonphosphorylated control [12].
Table 2. Recent highlights show the scope of EPL chemistry. GFP, green fluorescent protein; CAR D1, immunoglobulin D1 domain of coxsackievirus-adenovirus receptor; MBP, maltose binding protein; proNPY, proneuropeptide Y; BBP, brain-binding peptide; RGD, (Arg-Gly-Asp)-containing peptide.
Investigation of protein–protein interactions: enhanced GFP [78,80,81]
Peptide and protein labelling with biophysical probes: c-Crk-II, hIL-8 [73,76]
The Csk–Src system was also investigated by Wang et al.,
who replaced the Src tyrosine with five unnatural Tyr
analogues to determine the role of the Tyr sidechain in
the affinity of Src for Csk [67]. Lu et al. [68] investigated
the influence of phosphorylation at two Tyr residues of
protein tyrosine phosphatase SHP-2 by introducing
non-hydrolyzable phosphotyrosine analogues at the
phosphorylation sites of SHP-2 by expressed protein ligation.
Their results showed that phosphorylation at Tyr542 leads
to the basal inhibition of protein tyrosine phosphatase
(PTPase) activity by interacting with the N-terminal SH2
domain, whereas phosphorylated Tyr580 stimulates the
PTPase by interacting with the C-terminal SH2 domain.
The role of phosphorylation of the eukaryotic initiation
factor eIF4E, which is implicated in the regulation of the
initiation step of translation, was examined using a
selectively phosphorylated version. The cap affinity of
phosphorylated and unphosphorylated eIF4E was determined
by fluorimetric time-synchronized titration [69].
The introduction of biophysical probes (spin labels or
fluorescence tags) allows the observation of protein–protein
interactions, membrane insertion or cellular uptake of
labelled peptides and proteins. Several fluorescence-based
approaches [70–72] have been developed in which the
fluorophore is attached to the sidechain of an amino acid
(mainly Lys) within the protein sequence.
Cotton et al. described the synthesis of a dual-labelled
version of the Crk-II adapter protein and its investigation
by fluorescence resonance energy transfer (FRET). A pair
of tetramethylrhodamine and fluorescein was attached to
the N- and C-termini by solid-phase expressed protein
ligation. The construct reported the phosphorylation of
Crk-II by the nonreceptor tyrosine kinase through a change
in fluorescence caused by structural changes [73]. The
same FRET pair was used to observe homo-oligomerization
of glutathione S-transferase, SH2 domain phosphatase-1
and serotonin N-acetyltransferase by measurement
of intermolecular FRET effects [74]. We recently succeeded
in the semisynthesis of the 69 amino-acid
proNPY and its analogues to study prohormone
processing. Five variants were synthesized that were either
unlabelled or labelled with carboxyfluorescein or biotin.
Western blot analysis was performed to determine the
binding site of anti-NPY and anti-proNPY antibodies
[75].
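The reason a FRET pair can report structural changes, as in the dual-labelled Crk-II construct above, is the steep distance dependence of the transfer efficiency. A minimal sketch of the standard Förster relation, E = 1 / (1 + (r/R0)^6), is given below. The function names are hypothetical, and the Förster radius of 55 Å is only an illustrative placeholder for a fluorescein/tetramethylrhodamine pair; it is not a value taken from this article.

```python
def fret_efficiency(distance, forster_radius):
    """Förster transfer efficiency for a donor-acceptor distance (same units as R0)."""
    if distance <= 0 or forster_radius <= 0:
        raise ValueError("distance and forster_radius must be positive")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def distance_from_efficiency(efficiency, forster_radius):
    """Invert the Förster relation to estimate the donor-acceptor distance."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    R0 = 55.0  # assumed Förster radius in angstrom (illustrative only)
    for r in (30.0, 55.0, 80.0):
        print(f"r = {r:5.1f} A  ->  E = {fret_efficiency(r, R0):.3f}")
```

At r = R0 the efficiency is 0.5 by definition, and because of the sixth-power dependence even modest changes in the distance between the termini near R0 produce a clearly measurable change in signal.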
Furthermore, we synthesized human interleukin-8 (hIL-8), a
chemotactic cytokine, and its C-terminally
carboxyfluorescein-labelled analogue by expressed protein ligation.
As hIL-8 possesses four Cys residues, the formation of two
disulfide bridges was necessary to obtain biological activity.
One of these Cys residues was chosen as the ligation site.
Internalization studies on HL60 cells expressing both
hIL-8 receptor subtypes and binding studies on
HL60 membranes provided an insight into the ligand–receptor
interaction and the internalization of the
interleukin-8–receptor complex [76].
Single atoms, such as isotopes or atom homologues like
F instead of H, or Se instead of S, can also serve as biophysical
probes. Wallace et al. simultaneously and site-specifically
introduced selenium and bromine as reporter atoms into the
sequence of cytochrome c without significant changes in
structure and function [77].
Intermolecular protein splicing in trans to study protein–protein interaction
Protein–protein interactions are essential for many biological processes such as receptor–ligand binding, protein polymerization and gene expression. To study these interactions in vivo, several methods have been developed, one example being the yeast two-hybrid system. The principle of these methods is that potentially interacting proteins are tagged with proteins that have a particular function [78]. This function is recovered if the tagged proteins interact. In protein splicing in trans [79], a split intein is fused to a split functional protein, which is reconstituted after interaction of the intein parts. Ozawa et al. used halves of enhanced green fluorescent protein (eGFP) as N- and C-terminal exteins and fused them to N- and C-terminal fragments of a modified intein [80,81]. No fluorescence was observed from any single construct expressed in E. coli. In contrast, coexpression of calmodulin and its target peptide M13, each connected to an intein fragment, led to fluorescence of eGFP, suggesting that the interaction of calmodulin and M13 triggers the refolding of the intein. A related approach using firefly luciferase was introduced by the same group for mammalian cells [82].
The conditional protein splicing approach of Mootz et al. [83,84] used the dimerization of the rapamycin receptor FKBP and the rapamycin binding domain in the presence of rapamycin to reconstitute a split intein in mammalian cells. Maltose binding protein (MBP) and a His-tag were used as exteins, and the splicing product was detected by Western blotting or by immunoprecipitation in the cells. In a related approach by this group, GFP was coupled to the N-terminal part of an intein and expressed in Chinese hamster ovary cells. The chemically synthesized C-terminal part of the intein was coupled to a FLAG epitope and transported through the membrane by using a protein transduction domain. The C-terminal intein part can associate with its N-terminal half within the cells, and GFP is then ligated to the FLAG epitope [85].
By using the EPL method, eGFP was ligated to an amidated carrier peptide derived from human calcitonin (hCT). Calcitonin and its C-terminal fragments had been shown to permeate membranes of nasal epithelium when covalently bound to cargo, but permeation was limited to peptides. Ligated eGFP-hCT(8–32) showed specific mucosal internalization, whereas eGFP alone was not internalized. The shuttle ability of hCT and its possible role in drug delivery were thus demonstrated using eGFP [86].
Generation of cyclic peptides and proteins
Backbone cyclization can improve the stability and the activity of peptides and proteins and reduce their conformational flexibility. The production of circular proteins may influence the rational design of enzymes and the development of new agents by structure–activity studies.
Cyclic structures can be obtained by disulfide formation, by formation of a peptide bond between the N- and C-termini or by sidechain cyclization. Several methods that use modified inteins have been developed to generate cyclic peptides and proteins. The aim is to create a protein with both an N-terminal Cys and a C-terminal
thioester. Such a peptide can be generated by flanking the
protein of interest with two inteins (Fig. 7). The N-terminal
modified intein can be cleaved by a pH and temperature
shift, whereas the C-terminal intein is cleaved by the
addition of thiols. This ‘two intein system’ (TWIN) also
allows purification by means of chitin binding domains fused
to the inteins. The reaction of the two reactive groups leads
to the formation of cyclic peptides and proteins, or of
multimers, through an amide bond [87,88].
Several approaches use intramolecular trans-splicing for
the generation of cyclic backbones in vivo and in vitro. In
these cases, the split intein is not fused to a split
protein or to two proteins that are to be joined; instead, the
intein parts flank one protein with an N-terminal Cys
residue. If the intein is reconstituted, a thioester intermediate
is formed that undergoes transthioesterification.
Cyclization of Asp and S→N acyl transfer lead to a cyclic
product [89–92].
A simple approach for in vivo cyclization in Escherichia
coli cells was introduced by Camarero et al. [93]. An SH3
domain from the murine c-Crk adapter protein with an
N-terminal Cys residue was fused to the N-terminus of an intein
carrying a chitin binding domain. After the expression of this
fusion protein, the N-terminal Met residue encoded by the
start codon is removed by Met-aminopeptidase, which
exposes a reactive Cys residue. The amide bond connecting the protein to the intein can rearrange to a thioester bond by an N→S acyl shift. As the protein now possesses a reactive N-terminal Cys residue and a C-terminal thioester, it can form an intramolecular amide bond by NCL.
Generation of cytotoxic proteins
In some cases, the expression of the desired proteins in bacteria can cause cytotoxic side-effects because the target protein competes with cellular components of the host. Another problem is that overexpressed proteins may aggregate as inclusion bodies in the cytosol. By using EPL techniques this can be avoided through modular synthesis of an artificial target protein as an intein fusion protein. Subsequently, through ligation and refolding, the native conformation and biological functionality of a cytotoxic protein are recovered. The potentially cytotoxic RNase A was produced by this method [63]. One part of this protein was expressed as a segment carrying an intein at its C-terminus. After thiol-induced, intein-mediated cleavage, the resulting thioester of the truncated RNase A was joined to a fragment synthesized by SPPS that contained a naturally occurring Cys residue at the N-terminus. Ligation of the two enzymatically inactive protein segments led to the full-length protein, which regained its enzymatic activity after several renaturation steps.
Another intein-based approach was used to purify the cytotoxic endonuclease I-TevI by insertional inactivation followed by pH-controllable splicing [94]. In this case, a mini-intein mutant (ΔI-SIM) of the full-length Mtu RecA intein was inserted into the I-TevI sequence, thereby inactivating the protein in vivo. The intein triggered the splicing of the protein after purification on a chitin column, and the endonuclease could be obtained in its native state. However, this method was only successful when an appropriate Cys residue was present in the target protein, allowing proper insertion of the intein. Furthermore, the toxicity has to be low and the splicing ratio in vitro/in vivo has to be as high as possible. Expression of the whole protein is one of the big advantages of this system, as the folding of the endonuclease does not interfere with the folding of the intein module. Intein-based trans-splicing systems with either native or artificial split inteins also seem to be adequate workhorses for the synthesis of cytotoxic proteins [91,95].
Segmental isotopic labelling
Expressed protein ligation is of great use for the introduction of stable isotopes into protein segments (Fig. 8) [96,97]. This approach circumvents the practical size limitation for structure determination by NMR spectroscopy. Generally, the loss of structural resolution is caused by several effects that scale with the number of amino acids. These include line broadening, longer rotational correlation times and an increased number of signals with similar chemical shifts. Even though new NMR techniques are available, such as transverse relaxation optimized spectroscopy (TROSY) [98], the standard isotopic labelling strategies based on uniform incorporation of 15N and
13C and on perdeuteration of amino acid sidechains bear the
Fig. 7. Generation of cyclic proteins. Intramolecular trans-splicing
(left). The two parts of a split intein flank one protein with an N-terminal
Cys. If the intein is reconstituted, a thioester intermediate is
produced that undergoes transthioesterification. After Asp cyclization
and S→N acyl transfer, a cyclic product is formed. Two intein (TWIN)
system (right). The protein of interest is cloned between two inteins.
The N-terminally modified intein can then be cleaved with a pH and
temperature shift, whereas the C-terminal intein is cleaved by addition
of thiols. The reaction of the two reactive groups leads to the formation
of cyclic peptides and proteins.
signal overlap of macromolecular systems. Yamazaki et al.
selectively labelled the C-terminal domain of the E. coli
RNA polymerase α-subunit [99] by using a trans-splicing
system based on a split PI-PfuI intein. Muir and coworkers
used the EPL strategy to introduce a 15N-labelled domain
within the Src-homology domain 3 and 2 segment derived
from Abl protein tyrosine kinase [100]. In both cases, the
part of the protein of interest was bacterially expressed in
media containing the 15N isotope. Fusion of this labelled
segment with the other, unlabelled recombinant protein part
led to the specifically labelled protein. One of the
great advantages of these labelling strategies is the
possibility to elucidate particular interactions of protein domains.
This was shown for a bacterial sigma
factor [101]. In this case, comparative NMR studies of
isotopically labelled model constructs of this protein obtained
by EPL revealed that the C-terminal DNA binding
domain does not interact directly with the N-terminal
autoregulatory domain. EPL and trans-splicing also have a
great impact on the preparation of labelled internal protein
segments. Yamazaki's group presented a method for central
segmental isotopic labelling by using a tandem trans-splicing
approach [102,103]. To label an inner segment of the maltose
binding protein, the target protein was expressed as three
split intein fusion proteins. The central segment was
expressed in isotope-containing media as a fusion protein
with parts of the PI-PfuI and PI-PfuII inteins attached at its termini.
Accordingly, the N- and C-terminal parts of the desired protein
were expressed as fusion proteins carrying the other halves of
the split inteins. Simultaneous splicing yielded the target
protein with an isotopically labelled inner fragment.
Alternative ligation methods
The main disadvantage of NCL and EPL is the necessity of a
Cys residue or a homologue at the ligation site. The
occurrence of this amino acid in globular proteins is very
low, and the insertion of additional Cys residues can alter
protein structure and function through the formation of disulfide
bridges. Several approaches have been developed to
circumvent this limitation (Fig. 9).
NCL with Cys-mimetics
The NCL methodology has been extended to -X-Gly- and
-Gly-X- ligation sites [104]. One peptide possessing a
C-terminal thioester reacts with a second peptide that carries
either an Nα-(ethanethiol) or an Nα-(oxyethanethiol) group. The thioester intermediate forms a 5- or 6-membered ring, and an amide bond is formed in a final S→N acyl transfer.
In a subsequent step, the substituent at the amide bond can
be removed by treatment with Zn and H+ to give a native peptide bond.
NCL combined with desulfurization
In this application, NCL is extended to proteins without Cys residues [105]. Ala is a common amino acid in peptides and proteins; thus, a specific Ala is replaced by a Cys residue at the ligation site within the sequence of the protein of interest. NCL is then performed to ligate the thioester and the Cys peptide. In the following step, the Cys is converted to an Ala by desulfurization using palladium or Raney nickel and hydrogen. This approach can be used for the synthesis of linear and cyclic proteins and extends the NCL methodology
to -X-Ala- sites. As the desulfurization reaction is not
selective, proteins that contain further Cys residues cannot
be made by this technique.
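The site requirements of the ligation variants discussed so far can be summarised in a short Python sketch. It is a simplified illustration under stated assumptions, not a design tool: the function name `candidate_sites` and the example sequences are hypothetical, junctions are reported as the 0-based index of the residue that would become the N-terminus of the C-terminal fragment, and no steric or folding considerations are taken into account.

```python
def candidate_sites(sequence):
    """List possible ligation junctions for three strategies described in the text.

    A junction i lies between residue i-1 and residue i (0-based indices).
    """
    seq = sequence.upper()
    has_cys = "C" in seq
    sites = {"cys": [], "x_gly_or_gly_x": [], "ala_desulfurization": []}
    for i in range(1, len(seq)):
        # NCL / EPL: the C-terminal fragment must start with Cys
        if seq[i] == "C":
            sites["cys"].append(i)
        # Cys-mimetic NCL: -X-Gly- or -Gly-X- junctions
        if seq[i] == "G" or seq[i - 1] == "G":
            sites["x_gly_or_gly_x"].append(i)
        # NCL with desulfurization: -X-Ala- junctions, only if no other Cys is
        # present, because the desulfurization step is not selective
        if seq[i] == "A" and not has_cys:
            sites["ala_desulfurization"].append(i)
    return sites


if __name__ == "__main__":
    for toy in ("MKAGLCA", "MKAGLA"):  # hypothetical sequences
        print(toy, candidate_sites(toy))
```

The check on `has_cys` reflects the limitation noted above: once a sequence contains a native Cys, the Ala-based route is excluded entirely rather than site by site.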
Staudinger ligation
This ligation method is inspired by the Staudinger reaction, in which a phosphine is used to reduce an azide to an amine. An intermediate iminophosphorane possesses a nucleophilic nitrogen that can react with an acyl donor to form an amide. A peptide bearing a C-terminal phosphinothioester is coupled to another peptide with an N-terminal α-azido group to form a peptide bond. The final product contains no residual atoms [106,107]. This ligation method may also be combined with NCL for tandem ligation applications. So far, however, the method has only been used for small peptides.
Expressed enzymatic ligation
This method combines the advantages of expressed protein ligation with the substrate mimetic strategy of protease-mediated ligation. The reverse hydrolysis potential of a protease, e.g. the Glu/Asp-specific serine protease from Staphylococcus aureus, is used to catalyze peptide bond formation [108]. The limiting substrate specificity of the enzyme and the possible proteolysis of peptides and ligated products
are overcome by substrate mimetics carrying a site-specific ester leaving group at the C-terminus of the former
Fig. 8. Segmental isotopic labelling. Protein segments are expressed in unlabelled or isotopically enriched media as fusion proteins with parts of split inteins. Reconstitution of the inteins results in trans-splicing that leads
to terminally (A) or centrally (B) labelled proteins.