A prokaryotic tetratricopeptide repeat protein with a globular fold
Christopher G. M. Wilson1, Tommi Kajander1 and Lynne Regan1,2
1 Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
2 Department of Chemistry, Yale University, New Haven, CT, USA
Repeat proteins in general, and tetratricopeptide repeats (TPRs) in particular, have recently attracted interest from the perspectives of structure, function, folding and design [1–6]. The TPR was first identified during sequence analysis of proteins CDC23 and nuc2+ from yeast [7,8], and has subsequently been found in a wide variety of polypeptides from all genera. It is a degenerate 34-residue motif, which adopts a helix-turn-helix structure. The first helix is usually termed the ‘A’ helix, while the second is referred to as the ‘B’ helix [6]. The most common number of tandem TPRs within a single protein is three, but as many as 16 have been predicted on the basis of sequence analysis [2]. Natural and designed TPRs whose structures have been determined share a common tertiary organization, which is dominated by interactions between A helices and the preceding AB pair (Fig. 1A). Local AB, BA′ and nonlocal AA′ helix packing generate an extended superhelical array with right-handed twist. The motif is often terminated by an additional A, or ‘capping helix’, whose exposed edge is hydrophilic in character, thus promoting favourable interactions with solvent. TPRs are therefore distinct from globular proteins, because they do not possess a single hydrophobic core. Instead, stabilizing interactions are distributed throughout the structure.
One significant consequence of TPR superhelicity is the creation of concave (‘front-face’) and convex (‘back-face’) surfaces. In the best-characterized examples to date, the 3-TPR domains TPR1 and TPR2A of human Hsp organizing protein (HOP), which bind the C-terminal residues of Hsc70 and Hsp90, respectively, ligand recognition occurs at the concave front-face [9,10]. Each of the 3-TPR domains of HOP behaves as a structurally and functionally independent domain in vitro. In vivo, functioning independently as part of
Keywords
crystal structure; NlpI; lipoprotein;
tetratricopeptide; TPR
Correspondence
L Regan, Yale University, PO Box 208114,
New Haven, CT 06520–8114, USA
Fax: +1 203432 5767
Tel: +1 203432 5566
E-mail: lynne.regan@yale.edu
(Received 7 September 2004, revised 19
September 2004, accepted 19 September
2004)
doi:10.1111/j.1432-1033.2004.04397.x
There are several different families of repeat proteins. In each, a distinct structural motif is repeated in tandem to generate an elongated structure. The nonglobular, extended structures that result are particularly well suited to present a large surface area and to function as interaction domains. Many repeat proteins have been demonstrated experimentally to fold and function as independent domains. In tetratricopeptide (TPR) repeats, the repeat unit is a helix-turn-helix motif. The majority of TPR motifs occur as three to over 12 tandem repeats in different proteins. The majority of TPR structures in the Protein Data Bank are of isolated domains. Here we present the high-resolution structure of NlpI, the first structure of a complete TPR-containing protein. We show that in this instance the TPR motifs do not fold and function as an independent domain, but are fully integrated into the three-dimensional structure of a globular protein. The NlpI structure is also the first TPR structure from a prokaryote. It is of particular interest because it is a membrane-associated protein, and mutations in it alter septation and virulence.
Abbreviations
HOP, Hsp organizing protein; Hsp, heat shock protein; SUPR, superhelical peptide repeat; TPR, tetratricopeptide repeat.
the full-length HOP, they act to facilitate the assembly of multichaperone regulatory complexes. The structural independence of these TPR domains, and the presence of independent ligand-binding sites in each, have been assumed to be characteristic properties of TPR domains. Methods that identify motifs from amino acid sequence (e.g. pFAM [11]) readily predict TPRs, with the implication that they are discrete domains. TPR domains, or even subsets of TPRs within a domain [12], are often studied independently.
In the course of a wider effort toward understanding TPR structure and function, a number of related observations intrigued us. First, the only structures of natural TPRs in the PDB (13 TPR-containing coordinate sets) consist of nonglobular, extended arrays of helices [9,10,13–21]. Second, the majority of these (11 structures) are for isolated domains, taken from larger parent proteins. Third, of those structures that consist of more than the TPR sequence alone, the non-TPR component can be unstructured, can be a C-terminal capping helix [14], can bind back as an extended polypeptide to the concave face [16], or can assume a completely independent domain organization [19,20]. Fourth, a single structure has been determined where TPR motifs are seen to fold back on themselves: in the seven-repeat peroxisomal targeting signal receptor PEX5, repeats 6 and 7 pack against repeats 1 and 2 through loop motifs [13]. This is an unusual example, however, because an alternative conformation for PEX5 has been determined in which the structure is extended [18]. This may represent a novel conformational switching mechanism where dynamic, long-range inter-TPR interactions are critical to function. Alternatively, these two conformations could reflect different crystallization conditions. Fifth, no structure of a prokaryotic TPR has yet been determined. Here we describe the first structure of a complete TPR-containing protein from Escherichia coli.
New lipoprotein I (NlpI) is a 32 kDa, 276 residue protein from E. coli K-12 [22]. The corresponding chromosomal gene encodes a 294 amino acid polypeptide, whose N-terminal 18 residues comprise a periplasmic export sequence and ‘lipobox’ motif (Fig. 1B). Following translocation across the inner plasma membrane, the prosequence and lipobox cysteine are recognized, enzymatically modified and proteolytically processed by components of the lipoprotein biosynthetic pathway. This yields an N-acyl-S-sn-1,2-diacylglyceryl-cysteine (residue 19) as the N-terminus of the mature, 276 residue, membrane-anchored protein [22]. The identity of the residue at the +2 position (Ser20) in the mature protein suggests that NlpI is not retained at the inner membrane, but is likely to be anchored at the outer membrane [23–25]. The precise topological location (periplasmic or extracellular face) is not known. NlpI has been proposed to play a role in bacterial septation, or regulation of cell wall degradation during cell division [22]. Disruption of the chromosomal copy of the nlpI gene, or plasmid-mediated overexpression of the protein, both lead to altered cell morphology and to osmotic sensitivity.
NlpI is of potential clinical interest, because loss of the nlpI gene affects the synthesis of pili and flagella, leading to changes in extracellular adhesion properties which are correlated with an invasive, pathogenic phenotype [26]. A BLAST search for similar sequences
Fig. 1. Canonical TPR structure and NlpI sequences. (A) Example of TPR extended helical structure, from the consensus design 1NA0.pdb [6]. Three repeats are the most common number seen. The AB and AA′ Ω packing angles are responsible for curvature and superhelicity of the motif. (B) Amino acid sequence of NlpI from the translation of the nlpI gene [22], including the signal prosequence (underscored) and lipobox cysteine modification site (boxed). Proposed TPR motifs are shaded grey [27]. (C) Alignment of the putative NlpI TPRs compared to the signature motif. Variations from the consensus are unshaded positions within the vertical shaded bars. The fourth repeat contains the fewest matches to the consensus.
finds highly conserved homologs from well-known pathogenic species (see below).
Initial studies of NlpI primary structure predicted three tandem TPR repeats [22]. A fourth repeat, immediately following the third, and a fifth independent TPR, have also been suggested [27], although the similarity of the fourth repeat to the consensus is weak (Fig. 1C). On the basis of these analyses, one might anticipate the presence of an independent, extended three-repeat. This could account for approximately 40% of the mature sequence, while the structural character of the remaining polypeptide is unknown.
Several properties make NlpI an attractive target for structural characterization. The gene can be obtained directly from laboratory strains of E. coli, while its size and periplasmic localization suggest it is likely to be a single domain, capable of independent folding. The single cysteine is responsible for tethering the protein to a plasma membrane; the absence of disulfide bridges or other structural cysteines simplifies the protein chemistry by removing the need for reducing agents during purification and handling. We therefore cloned and expressed NlpI, and determined the protein structure by X-ray crystallography. The structure reveals a fold in which the TPR is not an independent domain, but is an integral part of a globular protein.
Results and Discussion
Cloning and expression of NlpI
The gene for NlpI was obtained by direct PCR amplification from E. coli DH10B [28]. The sequence corresponding to the mature polypeptide (residues 20–294, lacking the signal prosequence and Cys20) overexpresses exceptionally well in BL21 (DE3), with yields of 100 mg·L⁻¹ (Fig. 2A). Purified mature NlpI is soluble to at least 200 mg·mL⁻¹ in 10 mM Tris/HCl pH 8.0, 10 mM NaCl.
To investigate the anticipated 3-TPR domain of NlpI, we subcloned residues 62–197. This region also expresses well, but in contrast to the mature polypeptide, the majority of protein is found in inclusion bodies (Fig. 2A). Material purified from the lysate soluble fraction precipitated after elution off Ni-nitrilotriacetic acid agarose. Alternative expression, purification, solubilization and refolding regimes were investigated, but we were unable to obtain soluble 3-TPR. This result was surprising, as we had anticipated the 3-repeat to be an independent domain. The insolubility of this region was the first indication that the 3-TPR might be participating in a more complex structure. We therefore pursued characterization of only the mature polypeptide (20–294).
Analytical gel filtration chromatography indicated that mature NlpI runs at nearly twice its anticipated size (~56 kDa vs. 31 kDa). This was not entirely unexpected, because an extended array of helices occupies a larger hydrodynamic radius than a globular molecule of similar polypeptide length. However, calculation of the Matthews coefficient from initial crystallographic data indicated that a dimer was most compatible with unit-cell dimensions (solvent content 56%). This was further supported by in vivo formaldehyde crosslinking (data not shown) that trapped a species corresponding to dimeric NlpI.
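The Matthews-coefficient argument above can be illustrated with a minimal sketch. It uses the standard relations V_M = V_cell / (Z · n · M) and solvent fraction ≈ 1 − 1.23 / V_M. The unit-cell edges below are hypothetical placeholders (the actual NlpI cell dimensions are not reproduced in this section); only the monomer mass (31.8 kDa) and the space group multiplicity (Z = 4 for P2₁2₁2₁) come from the text.

```python
def matthews_coefficient(cell_volume_A3, n_sym_ops, n_mol_per_asu, mw_da):
    """V_M in Å^3/Da: unit-cell volume divided by the total protein mass in the cell."""
    return cell_volume_A3 / (n_sym_ops * n_mol_per_asu * mw_da)


def solvent_fraction(vm):
    """Matthews' approximation: protein occupies roughly 1.23 Å^3 per Da."""
    return 1.0 - 1.23 / vm


# Hypothetical orthorhombic cell edges in Å (an assumption, NOT the NlpI cell).
a, b, c = 60.0, 100.0, 120.0
volume = a * b * c

MW_MONOMER = 31_800.0  # Da, anticipated mass of mature NlpI
Z = 4                  # general positions in space group P2(1)2(1)2(1)

# Try one, two or three molecules per asymmetric unit and compare solvent content.
for n in (1, 2, 3):
    vm = matthews_coefficient(volume, Z, n, MW_MONOMER)
    print(f"{n} molecule(s)/ASU: V_M = {vm:.2f} Å^3/Da, solvent = {solvent_fraction(vm):.0%}")
```

With a cell of this (assumed) size, only the two-molecule case gives a solvent content in the range typical of protein crystals, which is the kind of reasoning that points to a dimer in the asymmetric unit.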
Structure of mature NlpI
We have solved the crystal structure of mature NlpI to 2.0 Å. Data collection and refinement statistics are shown in Table 1. A region of the 2Fo-Fc electron density map for residues 158–162 is shown in Fig. 3. The protein crystallised in space group P2₁2₁2₁ with two monomers in the asymmetric unit, related by a twofold axis of noncrystallographic symmetry running through the dimer interface. We conclude that the contents of the asymmetric unit represent the biologically active protein. The two chains together form an arrow-shaped structure, wider than it is deep (Fig. 2B,C). N-termini of both molecules share a common point of origin, a feature compatible with membrane localization through N-terminal lipid anchors on both chains. Table 2 shows the secondary structure components, interhelix packing geometries, and the angle of rotation between the AB helix pairs present. With the exception of an extended, but not unstructured region of polypeptide (30–37), NlpI is composed of α-helix (64%) and turn motifs (23%). NlpI monomers can be described generally as a superhelical array of helix-turn-helix motifs, in which the C-terminus is folded (rolled-up) inside the N-terminus (Fig. 2D). A depression on one side of each monomer contains a bound Tris molecule. This cavity, formed by the curvature and packing of helices, is highly suggestive of a ligand binding pocket and we speculate that it may represent the functional site of the protein.
Helix packing interactions TPRs
Many features of the distribution of side chain contacts within NlpI are typical of a TPR protein. The side chain contact map (Fig. 4A) is dominated by a repetitive pattern of interactions parallel to the diagonal (i to i + 3, i to i + 4 within a continuous helix), orthogonal to the diagonal (helix A interacting with helix B), then parallel to the diagonal (helix A interacting with helix A′ of the next repeat), and finally returning to the diagonal (helix B interacting with A′). This distribution is more or less continuous, reflecting a progression of helix-turn-helix AB, AA′, BA′ interactions through the structure. The exception is helix 1, which interacts exclusively with helix 2 through hydrophobic packing of bulky groups (e.g. Leu44 against Leu77, Met47 against the aliphatic components of side chain Arg68). Of the six AB helix-turn-helix pairs (Table 2), four closely resemble TPRs: helices 2 and 3 (TPR1), 4 and 5 (TPR2), 6 and 7 (TPR3) and 12 and 13 (TPR4) contain the characteristic pattern of signature residues that coincides with helix-loop-helix lengths. Interhelix AB and AA′ Ω packing angles fall within those typical for TPRs [6]. These repeats correspond to the anticipated tandem 3-TPR, and the isolated fifth TPR predicted from the amino acid sequence (Fig. 1C) [22,27]. Helices 2 and 12 (A helices of TPRs 1 and 4, respectively) contain additional helical residues preceding the start of the TPR region. The final helix (14) does not contain the solvating polar groups associated with terminating ‘capping’ helices found in the other TPR structures (e.g. PP5, TPR1 and TPR2A of
Fig. 2. Solubility of NlpI constructs and the structure of mature NlpI. (A) 10–20% gradient SDS/PAGE of NlpI expression products, showing the insolubility of the 3-TPR construct vs. mature NlpI. BenchMark molecular mass markers (lanes 1 & 4); 3-TPR insoluble (lane 2, arrowhead) and soluble (lane 3); mature NlpI insoluble (lane 5) and soluble (lane 6, arrowhead); mature NlpI following TEV protease cleavage, and purification over Superdex 75 (lane 7, arrowhead, anticipated molecular mass of 31.8 kDa). (B) Side and (D) top views of the NlpI dimer. Chains are coloured from N- (dark blue) to C-termini (orange). Axis of noncrystallographic rotational symmetry runs through the center ‘x’. (C) Monomer of NlpI, showing the rolled-up array of helices with the C-terminus folding within the curvature of the N-terminus. Helix numbers are in brackets. Note that ‘A’ helices locate to the globular center, and the perpendicular arrangement of helices 10 and 11, against helices 8 and 9.
HOP), because the majority of these residues participate in the protein core.
NonTPR helix motifs
Packing interactions are more complex for the two remaining pairs of helices (8 and 9, 10 and 11). These are of particular significance since they are responsible for the compact structure of NlpI. Helix pair 10–11 (Fig. 4B) is, at 27 residues (17 of which are helix), too short to be termed a TPR. The interhelix AB Ω packing angle is the highest (+172°), bringing them close to parallel, and is also of the opposite sign to that which characterizes a TPR. Interaction with the following pair of helices (12 and 13) is distinguished by the only negative AA′ angle within the protein. Critically, this combination of nonTPR packing angles imparts left-handed superhelical character to the region. The pitch of the overall right-handed superhelix is therefore reduced, which brings the C-terminus up toward the N-terminus.
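The sign argument above can be checked directly against the packing angles listed in Table 2. The sketch below transcribes the AB and AA′ Ω values for the six helix pairs and flags those whose sign departs from the TPR pattern (negative AB, positive AA′). It is only a sign check on the tabulated numbers, not a general TPR classifier, and the variable names are illustrative.

```python
# (AB pair number, helices, Omega_AB in degrees, Omega_AA' in degrees), from Table 2.
PAIRS = [
    (1, "2-3", -160.5, 31.3),
    (2, "4-5", -165.5, 16.7),
    (3, "6-7", -153.0, 26.5),
    (4, "8-9", 163.3, 26.9),
    (5, "10-11", 172.6, -26.0),
    (6, "12-13", -158.3, 38.2),
]

# TPR-like AB packing has a negative Omega; pairs with a positive value stand out.
positive_ab = [pair for pair, _, omega_ab, _ in PAIRS if omega_ab > 0]

# TPR-like AA' packing has a positive Omega; a negative value marks local
# left-handed character in the helical array.
negative_aa = [pair for pair, _, _, omega_aa in PAIRS if omega_aa < 0]

print("AB pairs with non-TPR sign:", positive_ab)
print("AB pairs followed by a negative AA' angle:", negative_aa)
```

The flagged pairs are the same two (helices 8–9 and 10–11) that the text identifies as nonTPR motifs, with pair 5 carrying the single negative AA′ angle.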
Helices 8 and 9 correspond to the region of sequence postulated by some to be a fourth TPR, following on immediately after the tandem 3-repeat (Fig. 1C [27]). However, the interactions taking place within this pair indicate that it is not a TPR. Helix 8 contains the hydrophobic sequence LWLYL (168–171), and these groups are involved in long-range interactions (discussed below). Helix 8 participates more in these than in packing against its partner, helix 9. The signature glycine residue of the A helix is missing, a space occupied in a TPR by a bulky hydrophobic or aromatic ring from the partner B helix (knobs-in-holes complementarity). Instead, the glycine position is occupied by alanine, with the remaining space filled by a tyrosine (Tyr171) from the same (A) helix. The complementarity
Table 1. SeMet-NlpI data processing, MAD phasing and refinement statistics. FOM, figure of merit; the value from DM is at 2.0 Å, whereas SOLVE/RESOLVE values are at 2.5 Å.
I/Sigmaᵇ
Average B-factors (Å²)
Ramachandran plot (%) (most favoured/allowed/disallowed)    94.0/6.0/0.0
ᵃ Values are given with Friedel pairs (hkl and −h−k−l) kept separate. ᵇ Value for the high resolution shell is given in parentheses.
Fig. 3. Sample 2Fo-Fc density map for NlpI (σ = 1.0). Residues shown are Gln158-Asn162 (QDDPN) for chain A, which corresponds to the turn region between helices 7 and 8. Image was produced with BOBSCRIPT [57] and RASTER3D [58].
between helix 8 and 9 is therefore less marked, and they are less tightly packed together compared to AB pairs before or after (Fig. 4B). The apparent lack of contacts between helices 8 and 9 is, however, compensated for by an unusual association with the next pair of helices, 10 and 11. These pack against 8 and 9 at an angle of 96° (visible in Fig. 2D), which is the highest inter-repeat rotation angle within the structure. A unique, nonTPR interaction takes place where the indole ring of Trp200 (helix 10) inserts between helices 8 and 9, against Phe190 and the amide backbone of Phe165. This locks the pairs together (Fig. 4D). The abrupt increase in helical array curvature is the second factor responsible for bringing distal regions of sequence back toward proximal ones.
Long-range interactions & globularity
The presence of long-range contacts within NlpI is revealed by clusters in the contact map far from the diagonal (Fig. 4A). These interactions take place only between A helices, which dominate the inside of the NlpI helical roll (Fig. 2C). The clusters can be considered as four overlapping groups (Fig. 4E). Cluster 1 involves helices 12 and 14, packing against the N-terminal region of NlpI. These constitute the most distant interactions between elements of primary structure, and include a hydrogen bond between the backbone carbonyl of Asn263 and the backbone amide of Leu34. The positive AA′ Ω angle between helix 12 and helix 10 is in part responsible for this. Cluster 2 consists of loop against loop interactions between TPR1 and TPR2, with helix 14 (including a Cα backbone contact between Gly76 and His266). From a functional perspective this is perhaps significant, as this first TPR is more open than any other, forming an exposed ‘lip’ on the NlpI monomer, and therefore most closely resembles classical TPR front-face environments in providing an interaction surface. Cluster 3 contains solvent-inaccessible hydrophobic groups (Leu134, Ile138, Val269) and hydrogen bond interactions (Tyr142-Lys242) between the third and fourth TPR, with helix 14. Cluster 4, which overlaps with clusters 2 and 3, forms the core of NlpI long-range interactions. These include bulky hydrophobic, aromatic and surface solvent exposed groups from helices 6, 8, 12 and 14. For example, Trp169 (helix 8) has hydrophobic contacts with Leu134, Ile138, Tyr142 (helix 6), Phe238 (helix 12) and Val269 (helix 14). Trp169 and Tyr273 (helix 14) are on opposite sides of the protein core, and do not interact directly, but they are bridged by aliphatic and aromatic groups (Ile138, Tyr142, Phe238 and Val269) from helices 6 and 12. Aromatic ring interactions, including ‘T’ face-edge (Tyr131-Phe156,
Table 2. Primary, secondary and tertiary structure statistics for mature NlpI. Excluding the N-terminal 3₁₀ helix (H0), NlpI contains 14 α-helices. Ω packing angles characteristic of TPRs range from −160° to −174° (AB), 11° to 32° (AA′) and 40° to 53° (AB-A′B′ repeat rotation) [6]. AB pairs 4 and 5 (helices 8–9, 10–11) do not display true TPR characteristics, but are responsible for the sharp curvature of the array and reduced superhelical pitch, leading to the formation of a globular structure.

Helixᵃ  Residues  Deviationᵇ·ᶜ (Å)  AB pair  Ωᶜ AB    Ωᶜ AA′   Rotationᵈ (AB)(A′B′)  Helix pair sequence (signature TPR residues underscored)
1       38–51     6.4               –        –        –        –                     W LG Y A F A P
2       58–74     7.6               1        −160.5   +31.3    57.6                  DDERAQLLYERGVLYDSLGLRALARNDFSQALAIRPDM
3       78–91     5.9               (TPR1)                     (pair 1–2)
4       96–108    5.9               2        −165.5   +16.7    59.0                  PEVFNYLGIYLTQAGNFDAAYEAFDSVLELDPTY
5       112–125   5.1               (TPR2)                     (pair 2–3)
6       131–142   8.9               3        −153.0   +26.5    16.2                  YAHLNRGIALYYGGRDKLAQDDLLAFYQDDPND
7       146–159   12.2              (TPR3)                     (pair 3–4)
8       164–177   14.6              4        +163.3   +26.9    96.2                  PFRSLWLYLAEQKLDEKQAKEVLKQHFEKSDKEQW
9       179–192   3.9                                          (pair 4–5)
10      199–206   12.9              5        +172.6   −26.0    40.4                  GWNIVEFYLGNISEQTLMERLKADATD
11      212–222   7.4                                          (pair 5–6)
12      226–246   19.4              6        −158.3   +38.2    261                   NTSLAEHLSETNFYLGKYYLSLGDLDSATALFKLAVANNVHNF
13      250–261   8.6               (TPR4)                     (pair 1–6)
14      269–283   7.9               –        –        –        –

ᵃ Defined by PROCHECK [51]. ᵇ From ideal helix geometry. ᶜ Calculated by PROMOTIF [52]. ᵈ Obtained by transforming one helical pair onto another with lsqman [53].
Tyr141-Tyr142, Phe205-Tyr243) and offset pi-stacking (Phe85-Tyr101, Trp298-His232) are evident, with the spaces between these moieties filled by bulky aliphatics (Fig. 4F).
Long-range interactions are a characteristic property of globular protein structures. By virtue of their extended helical organization, TPR and other repeat proteins typically lack this feature, but rather have stabilizing apolar contacts distributed throughout the molecule, both within and between repeats. In contrast, NlpI contains a central hydrophobic core, composed of distant motifs from TPR and nonTPR helices. TPRs are therefore compatible with globular structures, but they do not appear to be capable of defining it. The increased curvature and reduced superhelical pitch required to form a compact, tertiary structure are derived from nonTPR elements within the fold.
Quaternary organization
NlpI is a homodimer in solution, and the crystal structure reveals monomers to be related by a twofold axis of symmetry. The dimer interface consists of the extended N-terminal region, helix 1 and TPR helices 2, 3, 11, 12, 13 and 14 (Table 3, Figs 5 and 6B). The values obtained for interface surface area, interaction type (two-thirds hydrophobic, but also hydrogen bonds and salt-bridges), gap volume index and planarity (which relate to the complementarity of the interface surfaces) fall within the ranges associated with known homodimeric states [29]. Three aspects are especially noteworthy. First, rotational symmetry places the N-termini of both monomers spatially close to each other. A lipid-modified dimer will therefore be anchored to a plasma membrane in a specific orientation (N-termini ‘face down’ toward the membrane). This is significant, because the potential ligand binding
Fig. 4. Contact map of mature NlpI and packing interactions. (A) Backbone (upper left from diagonal) and side chain (lower right from diagonal) contacts within 5 Å. Long-range contact clusters are boxed. (B) Packing interactions between nonTPR helices 10 (red) and 11 (blue), and (C) helices 8 (red) and 9 (blue). Space-filling atoms shown are large and small hydrophobic residues (F, Y, W, I, L, V, A and G). Bulky groups of helix 8 point toward the protein core. (D) View of NlpI helices 8 and 9 (with helix 7 removed), showing diminished association between the pair, and the insertion of Trp200 from helix 10. Right-handed superhelical curvature imparted by the first three TPRs appears to cease, allowing the subsequent structure to roll up. (E) Location of long-range packing clusters from (A), which define the core of NlpI. (F) Aromatic and bulky side chains surrounding Trp169 (orange).
Table 3. NlpI dimer interface statistics. Values were obtained with SURFNET [29,55]. SA = surface area.

Monomer buried SA (Å²)   1585.5    N-terminal   23–26, 30–36
% Monomer SA             12.6      Helix 1      38, 41, 44, 45, 48, 49, 51
% Nonpolar atoms         36.6      Helix 2      68, 76, 77
% Polar atoms            63.4      Helix 3      78, 79, 80, 83
Hydrogen bonds           16        Helix 12     237, 240, 244
                                                262, 264, 266
cleft (discussed below) of each monomer would then be exposed, and oriented roughly perpendicular to the plane of the membrane. NlpI localized in this manner could then serve as a tether to which other functional components would bind. Second, the dimer interface is made up of distant regions of monomer sequence. That is, N-terminal and C-terminal portions of monomer polypeptide come together, forming the molecular surface. Quaternary structure is therefore dependent on tertiary structure, and their formation may be interdependent (cooperative). Third, it was previously noted that the first TPR (helices 2 and 3) participates in long-range interactions within a monomer through loop residues (Asp73, Ser74, Leu75, Arg78), while the majority of the inner front-face assumes an open ‘lip’ conformation (Fig. 4D). In contrast, seven residues of the outer back-face participate in the dimer interface, packing against C-terminal portions of polypeptide from the partner molecule. Consideration of monomeric NlpI alone gives the impression that these helices make few molecular contacts when in fact they make many, albeit with a separate polypeptide chain. The insolubility of NlpI 3-TPR (fragment 62–197) is therefore understandable, in terms of the failure of an isolated motif to form critical intra- and intermolecular contacts. These observations demonstrate the capacity, and on occasion the necessity, of TPRs to participate at all levels of structure organization, and suggest that the fold is more versatile than was previously thought.
We now know the structure of NlpI, and observe that the TPRs in this protein do not form an independent domain. One could therefore ask if there are any features of the TPR sequences that hint at differences between these TPRs, and those that fold independently. Unfortunately, with the limited sequence–structure data available at this stage, there are no correlations strong enough to allow us to predict, or subclassify, which TPR sequences will form an extended array and which will adopt globular structures.
Implications for function: a putative binding cleft
We examined the conservation of NlpI structure through a sequence alignment of the 12 most similar sequences identified through a BLAST search (Fig. 6A). When conserved positions are mapped onto the structure of NlpI, they correlate to three distinct locations within the protein. Two of these are clearly structural in nature: the globular core (discussed previously) and the dimerization interface (also predominantly hydrophobic, Fig. 6B), suggesting that NlpI homologs share a common tertiary and quaternary organization. The third conserved region corresponds to the depression on one side of each NlpI monomer (Fig. 6C). The cleft is lined with polar (Asn267), acidic (Asp163, Glu235, Glu231 and Glu270), aromatic (Tyr131, Phe165, Trp198, Phe268) and hydrophobic groups (Ile104, Val269). Visually, the shape of the depression is highly suggestive of a binding site. The presence of four invariant acidic groups (one aspartate and three glutamate) implicates electrostatic interactions, possibly with a basic motif, in the putative binding event. The high degree of sequence conservation in the cleft suggests all homologs of NlpI bind the same ligand.
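Identifying invariant positions in an alignment, as done here for the cleft residues, can be sketched as a per-column count. This is a minimal illustration and not the CLUSTALX scoring used for Fig. 6A: it scores only identity (no similarity groups), and the three short sequences are invented placeholders rather than NlpI homologs.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences sharing the most common residue in each column.

    Gaps ('-') never count as a match, so a gapped column cannot score 1.0.
    """
    n = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / n)
    return scores


def invariant_positions(alignment):
    """1-based alignment columns at which every sequence has the same residue."""
    return [i + 1 for i, s in enumerate(column_conservation(alignment)) if s == 1.0]


# Hypothetical toy alignment, used only to exercise the functions.
toy_alignment = ["DEFAK", "DEYAR", "DEF-K"]
print(column_conservation(toy_alignment))
print(invariant_positions(toy_alignment))
```

Mapping the invariant columns back onto residue numbers of one reference sequence is the step that lets conserved positions be located on the structure.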
In addition, our attention was drawn to this cavity during the final stages of model building, because it contained a patch of 2Fo-Fc density that could not be accounted for by the polypeptide, water molecules, Mg²⁺ or Cl⁻ ions. The structure of Tris, also a component of the crystallization mother liquor, was found to fit the density envelope, making hydrogen bonds with carboxylate groups of Glu235 and Glu270, and with the backbone amide of Val269. Phe165 and Phe268 face each other, flanking the two carboxylates and Tris (Fig. 6D).
NlpI is thought to play a role in the regulation of the cell wall and extracellular surface, but its exact function is not known, and no ligand interactions have yet been described. There has been some suggestion that the C-terminus may associate with the periplasmic protease Tsp, and it has been proposed that removal of residues beyond Gly282 serves to activate the protein [27]. However, the C-terminus of NlpI does not contain a motif that resembles the canonical ‘WVAAA’ associated with
Fig. 5. NlpI dimer interface. Chain B has been translated and rotated to expose the surface in contact (yellow). The interface is composed of remote regions of sequence from the N- and C-termini. Val32 is indicated to illustrate the rotational symmetry of the interface. Contacts were obtained with SURFNET [55] and CONTACT from the CCP4 [42,43] suite of programs.
Tsp recognition [30]. Our structure suggests the functionality of NlpI in fact lies within the cleft associated with the globular body of the fold.
Structural homologs
A DALI search for homologous structures finds two PDB entries with significant similarity to NlpI. The first, p67phox from human (Z-score = 12.4, rmsd of 2.5 Å over 152 residues), consists of four TPR motifs, and an extended C-terminus that packs against the concave front-face groove through hydrophobic interactions. The intracellular ligand, Rac-GTP, is required for the assembly of the multiprotein NADPH oxidase complex, and binds to the surface formed by TPR connecting loops and the C-terminal polypeptide [16]. The ligand-binding mode is therefore distinct from that of TPR1 and TPR2A of HOP. The structural similarity between NlpI and p67phox is illustrated in Fig. 7A, and reveals a close match between the first four AB repeats of each protein.
The second structure, domain III from E. coli maltose transcriptional regulator MalT (Z-score = 12.3,
Fig. 6. Homologs of NlpI, structural conservation and the putative binding cleft. (A) CLUSTALX alignment [48] of the 12 sequences most similar to E. coli NlpI identified by BLAST [49]. Positions are colored as follows: red, identical (*); yellow, similar (:). 1, Escherichia coli; 2, Salmonella typhimurium; 3, Yersinia pestis; 4, Yersinia enterocolitica; 5, Vibrio haemolyticus; 6, Vibrio vulnificus; 7, Vibrio cholerae; 8, Photorhabdus luminescens; 9, Photobacterium profundum; 10, Haemophilus influenzae; 11, Haemophilus ducreyi; 12, Shewanella oneidensis. E. coli residues underscored locate to the hydrophobic core. (B) Molecular surface revealing conserved positions within the dimerization interface (mostly hydrophobic). (C) The surface of an NlpI monomer, showing the putative ligand binding cleft and bound Tris molecule. (D) Tris molecule, conserved acidic and aromatic side chains within the cleft. Orange dashes indicate hydrogen bonds between Tris and the amide backbone of Val269, and conserved side chain carboxylates of Glu235 and Glu270.
rmsd of 4.3 Å over 212 residues), consists of eight superhelical peptide repeat (SUPR) motifs that assume a superhelical fold [31]. SUPRs resemble TPRs, but their helices are slightly longer (16–18 residues) and the sequence consensus is more degenerate. The N-terminal region of MalT is responsible for functional dimerization, while the C-terminus is thought to contain a maltotriose binding site, formed by the concave surface of four SUPR repeats. Close structural similarity can be seen between the first three AB repeats of NlpI and MalT (Fig. 7B). However, by the sixth repeat the reduction in NlpI superhelical pitch has folded the protein back onto itself, while MalT continues in a more regular superhelix (and in consequence lacks hydrophobic core interactions).
In terms of their biological roles, p67phox and MalT both mediate intermolecular interactions, and are responsible for the assembly of multiprotein complexes. It is therefore interesting to speculate, on the basis of structural identity and the presence of a conserved surface cleft, whether NlpI participates in analogous multiprotein assemblies in E. coli.
Conclusion
We have determined the first structure of a complete TPR-containing globular protein, which is also the first TPR from a prokaryotic organism. The structure reveals an intimate association between the TPR motifs and the rest of the protein, showing how the TPR participates in the overall fold. Until now, many of the TPR-containing regions of proteins have behaved as separate domains: they fold and function completely independently of the rest of the protein. Here we show an alternate arrangement, in which the TPR is inseparably part of the whole structure. Nothing about the sequence, or any a priori considerations, suggested that this structure would be different from the TPRs from independent domains. The structure provides a strong hint at the location of the active site, though as yet the ligand bound by NlpI has not been identified. Its involvement in bacterial virulence, and likely presence of identical interactions in many pathogenic species, makes NlpI a potential target for new antibiotics.
Experimental procedures
Cloning NlpI constructs
DNA encoding NlpI sequences was obtained by PCR amplification from a single colony of E. coli DH10B [28] (Invitrogen, Carlsbad, CA, USA), grown on Luria–Bertani agar overnight. All oligonucleotides were chemically synthesized by the W. M. Keck Core Facility (Yale University, New Haven, CT, USA). Primers to amplify mature NlpI (residues 20–294) were 5′-aataatccatggggagtaatacttcctggcgtaaaagtgaagtcc-3′ and 5′-attattggatccctattgctggtccgattctgccag-3′. 3-TPR NlpI primers (residues 62–197) were 5′-aataatccatgggggcacagcttttatatgagcgcggag-3′ and 5′-aataatggatcctcactgttccttatccgatttttcgaagtgc-3′. PCR products were doubly digested with NcoI and BamHI (New England Biolabs, Beverly, MA, USA), and purified by agarose gel electrophoresis onto dialysis membrane, prior to ligation into doubly digested, dephosphorylated expression vector pET11a-HT. This vector was assembled in-house from vectors pProEX-HTa (Invitrogen) and pET11a [32] (Novagen, San Diego, CA, USA), and places cloned sequences under T7 promoter control. Expression in an E. coli DE3 bacterial host produces an N-terminal hexahistidine-tagged protein, cleavable with TEV protease. Ligation products were transformed into electrocompetent E. coli DH10B (Invitrogen), and transformants sequenced by the W. M. Keck Facility.
Expression and purification
Plasmids, verified by DNA sequencing, were transformed into E. coli BL21 (DE3) Gold (Stratagene), and grown in
carbenicillin at 37 °C until cell culture absorbance at 600 nm
induction with 100 µM isopropyl thio-β-d-galactoside. Cells were harvested by centrifugation (6000 g, 20 min) after 4 h
(SeMet)-labelled NlpI was expressed in E. coli methionine auxotroph B834 (DE3), grown in M9 medium [33]
thiamine. At a cell density of D600 = 0.4, bacteria were harvested and resuspended in fresh M9 supplemented with
Fig. 7. Structural homologues of NlpI. Structure alignment of NlpI (blue) with (A) p67phox and (B) domain III of MalT (yellow). Superimposition of coordinates was performed with LSQ_EXPLICIT and LSQ_IMPROVE [45].