The crystal structure of NlpI: a prokaryotic tetratricopeptide repeat protein with a globular fold

Christopher G. M. Wilson1, Tommi Kajander1 and Lynne Regan1,2

1 Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA

2 Department of Chemistry, Yale University, New Haven, CT, USA

Keywords
crystal structure; NlpI; lipoprotein; tetratricopeptide; TPR

Correspondence
L. Regan, Yale University, PO Box 208114, New Haven, CT 06520–8114, USA
Fax: +1 203 432 5767
Tel: +1 203 432 5566
E-mail: lynne.regan@yale.edu

(Received 7 September 2004, revised 19 September 2004, accepted 19 September 2004)

doi:10.1111/j.1432-1033.2004.04397.x

There are several different families of repeat proteins. In each, a distinct structural motif is repeated in tandem to generate an elongated structure. The nonglobular, extended structures that result are particularly well suited to present a large surface area and to function as interaction domains. Many repeat proteins have been demonstrated experimentally to fold and function as independent domains. In tetratricopeptide (TPR) repeats, the repeat unit is a helix-turn-helix motif. The majority of TPR motifs occur as three to over 12 tandem repeats in different proteins. The majority of TPR structures in the Protein Data Bank are of isolated domains. Here we present the high-resolution structure of NlpI, the first structure of a complete TPR-containing protein. We show that in this instance the TPR motifs do not fold and function as an independent domain, but are fully integrated into the three-dimensional structure of a globular protein. The NlpI structure is also the first TPR structure from a prokaryote. It is of particular interest because it is a membrane-associated protein, and mutations in it alter septation and virulence.

Abbreviations
HOP, Hsp organizing protein; Hsp, heat shock protein; SUPR, superhelical peptide repeat; TPR, tetratricopeptide repeat.

Repeat proteins in general, and tetratricopeptide repeats (TPRs) in particular, have recently attracted interest from the perspectives of structure, function, folding and design [1–6]. The TPR was first identified during sequence analysis of proteins CDC23 and nuc2+ from yeast [7,8], and has subsequently been found in a wide variety of polypeptides from all genera. It is a degenerate 34-residue motif, which adopts a helix-turn-helix structure. The first helix is usually termed the ‘A’ helix, while the second is referred to as the ‘B’ helix [6]. The most common number of tandem TPRs within a single protein is three, but as many as 16 have been predicted on the basis of sequence analysis [2]. Natural and designed TPRs whose structures have been determined share a common tertiary organization, which is dominated by interactions between A helices and the preceding AB pair (Fig. 1A). Local AB, BA′ and nonlocal AA′ helix packing generate an extended superhelical array with right-handed twist. The motif is often terminated by an additional A, or ‘capping’, helix, whose exposed edge is hydrophilic in character, thus promoting favourable interactions with solvent. TPRs are therefore distinct from globular proteins, because they do not possess a single hydrophobic core. Instead, stabilizing interactions are distributed throughout the structure.

One significant consequence of TPR superhelicity is the creation of concave (‘front-face’) and convex (‘back-face’) surfaces. In the best-characterized examples to date, the 3-TPR domains TPR1 and TPR2A of human Hsp organizing protein (HOP), which bind the C-terminal residues of Hsc70 and Hsp90, respectively, ligand recognition occurs at the concave front-face [9,10]. Each of the 3-TPR domains of HOP behaves as a structurally and functionally independent domain in vitro. In vivo, functioning independently as part of the full-length HOP, they act to facilitate the assembly of multichaperone regulatory complexes. The structural independence of these TPR domains, and the presence of independent ligand-binding sites in each, have been assumed to be characteristic properties of TPR domains. Methods that identify motifs from amino acid sequence (e.g. Pfam [11]) readily predict TPRs, with the implication that they are discrete domains. TPR domains, or even subsets of TPRs within a domain [12], are often studied independently.

In the course of a wider effort toward understanding TPR structure and function, a number of related observations intrigued us. First, the only structures of natural TPRs in the PDB (13 TPR-containing coordinate sets) consist of nonglobular, extended arrays of helices [9,10,13–21]. Second, the majority of these (11 structures) are for isolated domains, taken from larger parent proteins. Third, of those structures that consist of more than the TPR sequence alone, the non-TPR component can be unstructured, can form a C-terminal capping helix [14], can bind back as an extended polypeptide to the concave face [16], or can assume a completely independent domain organization [19,20]. Fourth, a single structure has been determined where TPR motifs are seen to fold back on themselves: in the seven-repeat peroxisomal targeting signal receptor PEX5, repeats 6 and 7 pack against repeats 1 and 2 through loop motifs [13]. This is an unusual example, however, because an alternative conformation for PEX5 has been determined in which the structure is extended [18]. This may represent a novel conformational switching mechanism where dynamic, long-range inter-TPR interactions are critical to function. Alternatively, these two conformations could reflect different crystallization conditions. Fifth, no structure of a prokaryotic TPR has yet been determined. Here we describe the first structure of a complete TPR-containing protein from Escherichia coli.

New lipoprotein I (NlpI) is a 32 kDa, 276 residue protein from E. coli K-12 [22]. The corresponding chromosomal gene encodes a 294 amino acid polypeptide, whose N-terminal 18 residues comprise a periplasmic export sequence and ‘lipobox’ motif (Fig. 1B). Following translocation across the inner plasma membrane, the prosequence and lipobox cysteine are recognized, enzymatically modified and proteolytically processed by components of the lipoprotein biosynthetic pathway. This yields an N-acyl-S-sn-1,2-diacylglyceryl-cysteine (residue 19) as the N-terminus of the mature, 276 residue, membrane-anchored protein [22]. The identity of the residue at the +2 position (Ser20) in the mature protein suggests that NlpI is not retained at the inner membrane, but is likely to be anchored at the outer membrane [23–25]. The precise topological location (periplasmic or extracellular face) is not known. NlpI has been proposed to play a role in bacterial septation, or regulation of cell wall degradation during cell division [22]. Disruption of the chromosomal copy of the nlpI gene and plasmid-mediated overexpression of the protein both lead to altered cell morphology and to osmotic sensitivity.

NlpI is of potential clinical interest, because loss of the nlpI gene affects the synthesis of pili and flagella, leading to changes in extracellular adhesion properties which are correlated with an invasive, pathogenic phenotype [26]. A BLAST search for similar sequences finds highly conserved homologs from well-known pathogenic species (see below).

Fig. 1. Canonical TPR structure and NlpI sequences. (A) Example of TPR extended helical structure, from the consensus design 1NA0.pdb [6]. Three repeats are the most common number seen. The AB and AA′ Ω packing angles are responsible for curvature and superhelicity of the motif. (B) Amino acid sequence of NlpI from the translation of the nlpI gene [22], including the signal prosequence (underscored) and lipobox cysteine modification site (boxed). Proposed TPR motifs are shaded grey [27]. (C) Alignment of the putative NlpI TPRs compared to the signature motif. Variations from the consensus are unshaded positions within the vertical shaded bars. The fourth repeat contains the fewest matches to the consensus.

Initial studies of NlpI primary structure predicted three tandem TPR repeats [22]. A fourth repeat, immediately following the third, and a fifth independent TPR, have also been suggested [27], although the similarity of the fourth repeat to the consensus is weak (Fig. 1C). On the basis of these analyses, one might anticipate the presence of an independent, extended three-repeat. This could account for approximately 40% of the mature sequence, while the structural character of the remaining polypeptide is unknown.

Several properties make NlpI an attractive target for structural characterization. The gene can be obtained directly from laboratory strains of E. coli, while its size and periplasmic localization suggest it is likely to be a single domain, capable of independent folding. The single cysteine is responsible for tethering the protein to a plasma membrane; the absence of disulfide bridges or other structural cysteines simplifies the protein chemistry by removing the need for reducing agents during purification and handling. We therefore cloned and expressed NlpI, and determined the protein structure by X-ray crystallography. The structure reveals a fold in which the TPR is not an independent domain, but is an integral part of a globular protein.

Results and Discussion

Cloning and expression of NlpI

The gene for NlpI was obtained by direct PCR amplification from E. coli DH10B [28]. The sequence corresponding to the mature polypeptide (residues 20–294, lacking the signal prosequence and Cys19) overexpresses exceptionally well in BL21 (DE3), with yields of 100 mg·L⁻¹ (Fig. 2A). Purified mature NlpI is soluble to at least 200 mg·mL⁻¹ in 10 mM Tris/HCl pH 8.0, 10 mM NaCl.

To investigate the anticipated 3-TPR domain of NlpI, we subcloned residues 62–197. This region also expresses well, but in contrast to the mature polypeptide, the majority of protein is found in inclusion bodies (Fig. 2A). Material purified from the soluble fraction of the lysate precipitated after elution from Ni-nitrilotriacetic acid agarose. Alternative expression, purification, solubilization and refolding regimes were investigated, but we were unable to obtain soluble 3-TPR. This result was surprising, as we had anticipated the 3-repeat to be an independent domain. The insolubility of this region was the first indication that the 3-TPR might be participating in a more complex structure. We therefore pursued characterization of only the mature polypeptide (20–294).

Analytical gel filtration chromatography indicated that mature NlpI runs at nearly twice its anticipated size (~56 kDa vs 31 kDa). This was not entirely unexpected, because an extended array of helices occupies a larger hydrodynamic radius than a globular molecule of similar polypeptide length. However, calculation of the Matthews coefficient from initial crystallographic data indicated that a dimer was most compatible with unit-cell dimensions (solvent content ~56%). This was further supported by in vivo formaldehyde crosslinking (data not shown) that trapped a species corresponding to dimeric NlpI.
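The Matthews-coefficient argument can be illustrated with a minimal sketch. The unit-cell dimensions are not given in the text, so the cell volume below is a hypothetical placeholder chosen only for illustration; the monomer mass (31.8 kDa) and the space group P2₁2₁2₁ (four asymmetric units per cell) are taken from the paper. The solvent estimate uses the common approximation that protein occupies about 1.23/V_M of the cell volume. Function and constant names are illustrative, not from any crystallographic package.

```python
# Sketch: Matthews coefficient (V_M) and approximate solvent content for
# different numbers of molecules per asymmetric unit.

def matthews_coefficient(cell_volume_a3, n_asym_units, n_copies, mol_mass_da):
    """V_M in cubic angstroms per dalton: cell volume / total protein mass in the cell."""
    return cell_volume_a3 / (n_asym_units * n_copies * mol_mass_da)


def solvent_fraction(v_m):
    """Common approximation: protein fills roughly 1.23 / V_M of the cell."""
    return 1.0 - 1.23 / v_m


# ASSUMPTION: hypothetical orthorhombic cell (a * b * c in cubic angstroms);
# the real NlpI cell dimensions are not reported in this text.
HYPOTHETICAL_CELL_VOLUME = 60.0 * 80.0 * 150.0
ASYM_UNITS_P212121 = 4      # general positions in space group P2(1)2(1)2(1)
MONOMER_MASS_DA = 31_800.0  # anticipated mass of mature NlpI (31.8 kDa)


def summarize(max_copies=3):
    """Return (copies per asymmetric unit, V_M, solvent fraction) tuples."""
    rows = []
    for n_copies in range(1, max_copies + 1):
        v_m = matthews_coefficient(
            HYPOTHETICAL_CELL_VOLUME, ASYM_UNITS_P212121, n_copies, MONOMER_MASS_DA
        )
        rows.append((n_copies, v_m, solvent_fraction(v_m)))
    return rows


if __name__ == "__main__":
    for n_copies, v_m, solvent in summarize():
        print(f"{n_copies} per ASU: V_M = {v_m:.2f} A^3/Da, solvent = {solvent:.0%}")
```

With these placeholder dimensions, two copies per asymmetric unit give a solvent content in the mid-50% range, whereas one or three copies give values that are much higher or lower; this is the kind of comparison that favours a dimer.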

Structure of mature NlpI

We have solved the crystal structure of mature NlpI to 2.0 Å. Data collection and refinement statistics are shown in Table 1. A region of the 2Fo-Fc electron density map for residues 158–162 is shown in Fig. 3. The protein crystallised in space group P2₁2₁2₁ with two monomers in the asymmetric unit, related by a twofold axis of noncrystallographic symmetry running through the dimer interface. We conclude that the contents of the asymmetric unit represent the biologically active protein. The two chains together form an arrow-shaped structure, wider than it is deep (Fig. 2B,C). N-termini of both molecules share a common point of origin, a feature compatible with membrane localization through N-terminal lipid anchors on both chains. Table 2 shows the secondary structure components, interhelix packing geometries, and the angle of rotation between the AB helix pairs present. With the exception of an extended, but not unstructured, region of polypeptide (30–37), NlpI is composed of α-helix (64%) and turn motifs (23%). NlpI monomers can be described generally as a superhelical array of helix-turn-helix motifs, in which the C-terminus is folded (rolled-up) inside the N-terminus (Fig. 2D). A depression on one side of each monomer contains a bound Tris molecule. This cavity, formed by the curvature and packing of helices, is highly suggestive of a ligand binding pocket and we speculate that it may represent the functional site of the protein.

Fig. 2. Solubility of NlpI constructs and the structure of mature NlpI. (A) 10–20% gradient SDS/PAGE of NlpI expression products, showing the insolubility of the 3-TPR construct vs mature NlpI. BenchMark molecular mass markers (lanes 1 & 4); 3-TPR insoluble (lane 2, arrowhead) and soluble (lane 3); mature NlpI insoluble (lane 5) and soluble (lane 6, arrowhead); mature NlpI following TEV protease cleavage, and purification over Superdex 75 (lane 7, arrowhead, anticipated molecular mass of 31.8 kDa). (B) Side and (D) top views of the NlpI dimer. Chains are coloured from N- (dark blue) to C-termini (orange). Axis of noncrystallographic rotational symmetry runs through the center ‘x’. (C) Monomer of NlpI, showing the rolled-up array of helices with the C-terminus folding within the curvature of the N-terminus. Helix numbers are in brackets. Note that ‘A’ helices locate to the globular center, and the perpendicular arrangement of helices 10 and 11, against helices 8 and 9.

Helix packing interactions: TPRs

Many features of the distribution of side chain contacts within NlpI are typical of a TPR protein. The side chain contact map (Fig. 4A) is dominated by a repetitive pattern of interactions parallel to the diagonal (i to i + 3, i to i + 4 within a continuous helix), orthogonal to the diagonal (helix A interacting with helix B), then parallel to the diagonal (helix A interacting with helix A′ of the next repeat), and finally returning to the diagonal (helix B interacting with A′). This distribution is more or less continuous, reflecting a progression of helix-turn-helix AB, AA′, BA′ interactions through the structure. The exception is helix 1, which interacts exclusively with helix 2 through hydrophobic packing of bulky groups (e.g. Leu44 against Leu77, Met47 against the aliphatic components of side chain Arg68). Of the six AB helix-turn-helix pairs (Table 2), four closely resemble TPRs: helices 2 and 3 (TPR1), 4 and 5 (TPR2), 6 and 7 (TPR3) and 12 and 13 (TPR4) contain the characteristic pattern of signature residues that coincides with helix-loop-helix lengths. Interhelix AB and AA′ Ω packing angles fall within those typical for TPRs [6]. These repeats correspond to the anticipated tandem 3-TPR, and the isolated fifth TPR predicted from the amino acid sequence (Fig. 1C) [22,27]. Helices 2 and 12 (A helices of TPRs 1 and 4, respectively) contain additional helical residues preceding the start of the TPR region. The final helix (14) does not contain the solvating polar groups associated with terminating ‘capping’ helices found in the other TPR structures (e.g. PP5, TPR1 and TPR2A of HOP), because the majority of these residues participate in the protein core.

NonTPR helix motifs

Packing interactions are more complex for the two remaining pairs of helices (8 and 9, 10 and 11). These are of particular significance since they are responsible for the compact structure of NlpI. Helix pair 10–11 (Fig. 4B) is, at 27 residues (17 of which are helix), too short to be termed a TPR. The interhelix AB Ω packing angle is the highest (+172°), bringing them close to parallel, and is also of the opposite sign to that which characterizes a TPR. Interaction with the following pair of helices (12 and 13) is distinguished by the only negative AA′ angle within the protein. Critically, this combination of nonTPR packing angles imparts left-handed superhelical character to the region. The pitch of the overall right-handed superhelix is therefore reduced, which brings the C-terminus up toward the N-terminus.
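To make the notion of an interhelix angle concrete, the following sketch computes the unsigned angle between two helix axes from Cα coordinates. It is a deliberate simplification: the Ω values in Table 2 come from PROMOTIF and carry a sign, which this sketch does not reproduce, and the axis is approximated here by joining the centroids of the first and last four Cα atoms. The ideal-helix generator (radius 2.3 Å, rise 1.5 Å, 100° per residue) and all function names are assumptions made for illustration.

```python
import math


def ideal_helix(n_res, radius=2.3, rise=1.5, turn_deg=100.0):
    """Generate C-alpha-like positions for an idealized alpha-helix along +z."""
    return [
        (
            radius * math.cos(math.radians(i * turn_deg)),
            radius * math.sin(math.radians(i * turn_deg)),
            i * rise,
        )
        for i in range(n_res)
    ]


def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def axis_vector(ca_coords, window=4):
    """Approximate helix axis: centroid of last `window` atoms minus centroid of first."""
    start = centroid(ca_coords[:window])
    end = centroid(ca_coords[-window:])
    return tuple(e - s for s, e in zip(start, end))


def axis_angle_deg(helix_a, helix_b):
    """Unsigned angle (0-180 degrees) between the approximate axes of two helices."""
    u, v = axis_vector(helix_a), axis_vector(helix_b)
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    helix_1 = ideal_helix(14)
    # Rotate 180 degrees about x and shift 10 angstroms: an antiparallel partner.
    helix_2 = [(x + 10.0, -y, -z) for x, y, z in helix_1]
    print(f"{axis_angle_deg(helix_1, helix_2):.1f}")
```

In this unsigned convention an antiparallel helix pair gives an angle close to 180°, comparable in magnitude to the AB values in Table 2; distinguishing the TPR-like negative angles from the positive angles of pairs 4 and 5 would additionally require the signed definition.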

Helices 8 and 9 correspond to the region of sequence postulated by some to be a fourth TPR, following on immediately after the tandem 3-repeat (Fig. 1C) [27]. However, the interactions taking place within this pair indicate that it is not a TPR. Helix 8 contains the hydrophobic sequence LWLYL (168–172), and these groups are involved in long-range interactions (discussed below). Helix 8 participates more in these than in packing against its partner, helix 9. The signature glycine residue of the A helix is missing, a space occupied in a TPR by a bulky hydrophobic or aromatic ring from the partner B helix (knobs-in-holes complementarity). Instead, the glycine position is occupied by alanine, with the remaining space filled by a tyrosine (Tyr171) from the same (A) helix. The complementarity between helices 8 and 9 is therefore less marked, and they are less tightly packed together compared to AB pairs before or after (Fig. 4B). The apparent lack of contacts between helices 8 and 9 is, however, compensated for by an unusual association with the next pair of helices, 10 and 11. These pack against 8 and 9 at an angle of 96° (visible in Fig. 2D), which is the highest inter-repeat rotation angle within the structure. A unique, nonTPR interaction takes place where the indole ring of Trp200 (helix 10) inserts between helices 8 and 9, against Phe190 and the amide backbone of Phe165. This locks the pairs together (Fig. 4D). The abrupt increase in helical array curvature is the second factor responsible for bringing distal regions of sequence back toward proximal ones.

Table 1. SeMet-NlpI data processing, MAD phasing and refinement statistics. FOM, figure of merit; the value from DM is at 2.0 Å, whereas SOLVE/RESOLVE values are at 2.5 Å.

I/σ b
Average B-factors (Å²)
Ramachandran plot (%) (most favoured/allowed/disallowed): 94.0/6.0/0.0

a Values are given with Friedel pairs (hkl and -h-k-l) kept separate. b Value for the high resolution shell is given in parentheses.

Fig. 3. Sample 2Fo-Fc density map for NlpI (σ = 1.0). Residues shown are Gln158-Asn162 (QDDPN) for chain A, which corresponds to the turn region between helices 7 and 8. Image was produced with BOBSCRIPT [57] and RASTER3D [58].

Long-range interactions & globularity

The presence of long-range contacts within NlpI is revealed by clusters in the contact map far from the diagonal (Fig. 4A). These interactions take place only between A helices, which dominate the inside of the NlpI helical roll (Fig. 2C). The clusters can be considered as four overlapping groups (Fig. 4E). Cluster 1 involves helices 12 and 14, packing against the N-terminal region of NlpI. These constitute the most distant interactions between elements of primary structure, and include a hydrogen bond between the backbone carbonyl of Asn263 and the backbone amide of Leu34. The positive AA′ Ω angle between helix 12 and helix 10 is in part responsible for this. Cluster 2 consists of loop against loop interactions between TPR1 and TPR2, with helix 14 (including a Cα backbone contact between Gly76 and His266). From a functional perspective this is perhaps significant, as this first TPR is more open than any other, forming an exposed ‘lip’ on the NlpI monomer, and therefore most closely resembles classical TPR front-face environments in providing an interaction surface. Cluster 3 contains solvent inaccessible hydrophobic groups (Leu134, Ile138, Val269) and hydrogen bond interactions (Tyr142-Lys242) between the third and fourth TPR, with helix 14. Cluster 4, which overlaps with clusters 2 and 3, forms the core of NlpI long-range interactions. These include bulky hydrophobic, aromatic and surface solvent exposed groups from helices 6, 8, 12 and 14. For example, Trp169 (helix 8) has hydrophobic contacts with Leu134, Ile138, Tyr142 (helix 6), Phe238 (helix 12) and Val269 (helix 14). Trp169 and Tyr273 (helix 14) are on opposite sides of the protein core, and do not interact directly, but they are bridged by aliphatic and aromatic groups (Ile138, Tyr142, Phe238 and Val269) from helices 6 and 12. Aromatic ring interactions, including ‘T’ face-edge (Tyr131-Phe156, Tyr141-Tyr142, Phe205-Tyr243) and offset pi-stacking (Phe85-Tyr101, Trp298-His232) are evident, with the spaces between these moieties filled by bulky aliphatics (Fig. 4F).

Table 2. Primary, secondary and tertiary structure statistics for mature NlpI. Excluding the N-terminal 3₁₀ helix (H0), NlpI contains 14 α-helices. Ω packing angles characteristic of TPRs range from −160° to −174° (AB), 11° to 32° (AA′) and 40° to 53° (AB-A′B′ repeat rotation) [6]. AB pairs 4 and 5 (helices 8–9, 10–11) do not display true TPR characteristics, but are responsible for the sharp curvature of the array and reduced superhelical pitch, leading to the formation of a globular structure.

Helix a   Residues   Deviation b,c (Å)   AB pair    Ω c AB    Ω c AA′            Rotation d (AB)(A′B′)
1         38–51      6.4                 –          –         –                  –
2         58–74      7.6                 1          −160.5    +31.3              57.6
3         78–91      5.9                 (TPR1)               (pair 1–2)
4         96–108     5.9                 2          −165.5    +16.7              59.0
5         112–125    5.1                 (TPR2)               (pair 2–3)
6         131–142    8.9                 3          −153.0    +26.5              16.2
7         146–159    12.2                (TPR3)               (pair 3–4)
8         164–177    14.6                4          +163.3    +26.9              96.2
9         179–192    3.9                                      (pair 4–5)
10        199–206    12.9                5          +172.6    −26.0              40.4
11        212–222    7.4                                      (pair 5–6)
12        226–246    19.4                6          −158.3    +38.2              261
13        250–261    8.6                 (TPR4)               (pair 1–6)
14        269–283    7.9                 –          –         –                  –

Helix pair sequences (signature TPR residues underscored in the original; signature motif: W LG Y A F A P)
Pair 1 (TPR1): DDERAQLLYERGVLYDSLGLRALARNDFSQALAIRPDM
Pair 2 (TPR2): PEVFNYLGIYLTQAGNFDAAYEAFDSVLELDPTY
Pair 3 (TPR3): YAHLNRGIALYYGGRDKLAQDDLLAFYQDDPND
Pair 4: PFRSLWLYLAEQKLDEKQAKEVLKQHFEKSDKEQW
Pair 5: GWNIVEFYLGNISEQTLMERLKADATD
Pair 6 (TPR4): NTSLAEHLSETNFYLGKYYLSLGDLDSATALFKLAVANNVHNF

a Defined by PROCHECK [51]. b From ideal helix geometry. c Calculated by PROMOTIF [52]. d Obtained by transforming one helical pair onto another with lsqman [53].

Long-range interactions are a characteristic property of globular protein structures. By virtue of their extended helical organization, TPR and other repeat proteins typically lack this feature, but rather have stabilizing apolar contacts distributed throughout the molecule, both within and between repeats. In contrast, NlpI contains a central hydrophobic core, composed of distant motifs from TPR and nonTPR helices. TPRs are therefore compatible with globular structures, but they do not appear to be capable of defining them. The increased curvature and reduced superhelical pitch required to form a compact, tertiary structure are derived from nonTPR elements within the fold.

Quaternary organization

NlpI is a homodimer in solution, and the crystal structure reveals monomers to be related by a twofold axis of symmetry. The dimer interface consists of the extended N-terminal region, helix 1 and TPR helices 2, 3, 11, 12, 13 and 14 (Table 3, Figs 5 and 6B). The values obtained for interface surface area, interaction type (two-thirds hydrophobic, but also hydrogen bonds and salt-bridges), gap volume index and planarity (which relate to the complementarity of the interface surfaces) fall within the ranges associated with known homodimeric states [29]. Three aspects are especially noteworthy. First, rotational symmetry places the N-termini of both monomers spatially close to each other. A lipid-modified dimer will therefore be anchored to a plasma membrane in a specific orientation (N-termini ‘face down’ toward the membrane). This is significant, because the potential ligand binding cleft (discussed below) of each monomer would then be exposed, and oriented roughly perpendicular to the plane of the membrane. NlpI localized in this manner could then serve as a tether to which other functional components would bind. Second, the dimer interface is made up of distant regions of monomer sequence. That is, N-terminal and C-terminal portions of monomer polypeptide come together, forming the molecular surface. Quaternary structure is therefore dependent on tertiary structure, and their formation may be interdependent (cooperative). Third, it was previously noted that the first TPR (helices 2 and 3) participates in long-range interactions within a monomer through loop residues (Asp73, Ser74, Leu75, Arg78), while the majority of the inner front-face assumes an open ‘lip’ conformation (Fig. 4D). In contrast, seven residues of the outer back-face participate in the dimer interface, packing against C-terminal portions of polypeptide from the partner molecule. Consideration of monomeric NlpI alone gives the impression that these helices make few molecular contacts when in fact they make many, albeit with a separate polypeptide chain. The insolubility of NlpI 3-TPR (fragment 62–197) is therefore understandable, in terms of the failure of an isolated motif to form critical intra- and intermolecular contacts. These observations demonstrate the capacity, and on occasion the necessity, of TPRs to participate at all levels of structure organization, and suggest that the fold is more versatile than was previously thought.

Fig. 4. Contact map of mature NlpI and packing interactions. (A) Backbone (upper left from diagonal) and side chain (lower right from diagonal) contacts within 5 Å. Long-range contact clusters are boxed. (B) Packing interactions between nonTPR helices 10 (red) and 11 (blue), and (C) helices 8 (red) and 9 (blue). Space-filling atoms shown are large and small hydrophobic residues (F, Y, W, I, L, V, A and G). Bulky groups of helix 8 point toward the protein core. (D) View of NlpI helices 8 and 9 (with helix 7 removed), showing diminished association between the pair, and the insertion of Trp200 from helix 10. Right-handed superhelical curvature imparted by the first three TPRs appears to cease, allowing the subsequent structure to roll up. (E) Location of long-range packing clusters from (A), which define the core of NlpI. (F) Aromatic and bulky side chains surrounding Trp169 (orange).

Table 3. NlpI dimer interface statistics. Values were obtained with SURFNET [29,55]. SA = surface area.

Interface parameter       Value     Region       Interface residues
Monomer buried SA (Å²)    1585.5    N-terminal   23–26, 30–36
% Monomer SA              12.6      Helix 1      38, 41, 44, 45, 48, 49, 51
% Nonpolar atoms          36.6      Helix 2      68, 76, 77
% Polar atoms             63.4      Helix 3      78, 79, 80, 83
Hydrogen bonds            16        Helix 12     237, 240, 244
                                                 262, 264, 266

We now know the structure of NlpI, and observe that the TPRs in this protein do not form an independent domain. One could therefore ask if there are any features of the TPR sequences that hint at differences between these TPRs, and those that fold independently. Unfortunately, with the limited sequence–structure data available at this stage, there are no correlations strong enough to allow us to predict, or subclassify, which TPR sequences will form an extended array and which will adopt globular structures.

Fig. 5. NlpI dimer interface. Chain B has been translated and rotated to expose the surface in contact (yellow). The interface is composed of remote regions of sequence from the N- and C-termini. Val32 is indicated to illustrate the rotational symmetry of the interface. Contacts were obtained with SURFNET [55] and CONTACT from the CCP4 [42,43] suite of programs.

Implications for function: a putative binding cleft

We examined the conservation of NlpI structure through a sequence alignment of the 12 most similar sequences identified through a BLAST search (Fig. 6A). When conserved positions are mapped onto the structure of NlpI, they correlate to three distinct locations within the protein. Two of these are clearly structural in nature: the globular core (discussed previously) and the dimerization interface (also predominantly hydrophobic, Fig. 6B), which suggests that NlpI homologs share a common tertiary and quaternary organization. The third conserved region corresponds to the depression on one side of each NlpI monomer (Fig. 6C). The cleft is lined with polar (Asn267), acidic (Asp163, Glu235, Glu231 and Glu270), aromatic (Tyr131, Phe165, Trp198, Phe268) and hydrophobic groups (Ile104, Val269). Visually, the shape of the depression is highly suggestive of a binding site. The presence of four invariant acidic groups (one aspartate and three glutamates) implicates electrostatic interactions, possibly with a basic motif, in the putative binding event. The high degree of sequence conservation in the cleft suggests all homologs of NlpI bind the same ligand.

In addition, our attention was drawn to this cavity during the final stages of model building, because it contained a patch of 2Fo-Fc density that could not be accounted for by the polypeptide, water molecules, Mg²⁺ or Cl⁻ ions. The structure of Tris, also a component of the crystallization mother liquor, was found to fit the density envelope, making hydrogen bonds with carboxylate groups of Glu235 and Glu270, and with the backbone amide of Val269. Phe165 and Phe268 face each other, flanking the two carboxylates and Tris (Fig. 6D).

NlpI is thought to play a role in the regulation of the cell wall and extracellular surface, but its exact function is not known, and no ligand interactions have yet been described. There has been some suggestion that the C-terminus may associate with the periplasmic protease Tsp, and it has been proposed that removal of residues beyond Gly282 serves to activate the protein [27]. However, the C-terminus of NlpI does not contain a motif that resembles the canonical ‘WVAAA’ associated with Tsp recognition [30]. Our structure suggests the functionality of NlpI in fact lies within the cleft associated with the globular body of the fold.

Structural homologs

A DALI search for homologous structures ﬁnds two

PDB entries with signiﬁcant similarity to NlpI The

ﬁrst, p67phox from human (Z-score¼ 12.4, rmsd of

2.5 A˚ over 152 residues) consists of four TPR motifs,

and an extended C-terminus that packs against the

concave front-face groove through hydrophobic

interactions The intracellular ligand, Rac-GTP, is required for the assembly of the multiprotein NADPH oxidase complex, and binds to the surface formed by TPR connecting loops and the C-terminal polypeptide [16] The ligand-binding mode is there-fore distinct from that of TPR1 and TPR2A of HOP The structural similarity between NlpI and p67phox is illustrated in Fig 7A, and reveals a close match between the ﬁrst four AB repeats of each protein

Fig. 6. Homologs of NlpI, structural conservation and the putative binding cleft. (A) CLUSTALX alignment [48] of the 12 sequences most similar to E. coli NlpI identified by BLAST [49]. Positions are colored as follows: red, identical (*); yellow, similar (:). 1, Escherichia coli; 2, Salmonella typhimurium; 3, Yersinia pestis; 4, Yersinia enterocolitica; 5, Vibrio haemolyticus; 6, Vibrio vulnificus; 7, Vibrio cholerae; 8, Photorhabdus luminescens; 9, Photobacterium profundum; 10, Haemophilus influenzae; 11, Haemophilus ducreyi; 12, Shewanella oneidensis. Underscored E. coli residues locate to the hydrophobic core. (B) Molecular surface revealing conserved positions within the dimerization interface (mostly hydrophobic). (C) The surface of an NlpI monomer, showing the putative ligand binding cleft and bound Tris molecule. (D) Tris molecule, and conserved acidic and aromatic side chains within the cleft. Orange dashes indicate hydrogen bonds between Tris and the backbone amide of Val269, and the conserved side chain carboxylates of Glu235 and Glu270.

The second structure, domain III from the E. coli maltose transcriptional regulator MalT (Z-score = 12.3, rmsd of 4.3 Å over 212 residues), consists of eight superhelical peptide repeat (SUPR) motifs that assume a superhelical fold [31]. SUPRs resemble TPRs, but their helices are slightly longer (16–18 residues) and the sequence consensus is more degenerate. The N-terminal region of MalT is responsible for functional dimerization, while the C-terminus is thought to contain a maltotriose binding site, formed by the concave surface of four SUPR repeats. Close structural similarity can be seen between the first three AB repeats of NlpI and MalT (Fig. 7B). However, by the sixth repeat the reduction in NlpI superhelical pitch has folded the protein back onto itself, while MalT continues in a more regular superhelix (and in consequence lacks hydrophobic core interactions).
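For readers unfamiliar with the rmsd values quoted for the two DALI hits, the following minimal Python sketch shows how a root-mean-square deviation over matched atom pairs is computed. It assumes the two coordinate sets have already been superimposed and paired one-to-one (for example, equivalent Cα atoms). The function name `rmsd` and the toy coordinates are illustrative only and are not taken from DALI or from the authors' workflow.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumption: the structures are already superimposed and the i-th point
    in coords_a is the structural equivalent of the i-th point in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between one matched pair of atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in angstroms (hypothetical, not from any PDB entry).
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
    print(round(rmsd(a, b), 3))
```

A value such as "2.5 Å over 152 residues" therefore reports this quantity for 152 matched residue pairs after optimal superposition. The superposition step itself is not shown here.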

In terms of their biological roles, p67phox and MalT both mediate intermolecular interactions, and are responsible for the assembly of multiprotein complexes. It is therefore interesting to speculate, on the basis of structural similarity and the presence of a conserved surface cleft, whether NlpI participates in analogous multiprotein assemblies in E. coli.

Conclusion

We have determined the first structure of a complete TPR-containing globular protein, which is also the first TPR structure from a prokaryotic organism. The structure reveals an intimate association between the TPR motifs and the rest of the protein, showing how the TPR participates in the overall fold. Until now, many of the TPR-containing regions of proteins have behaved as separate domains: they fold and function completely independently of the rest of the protein. Here we show an alternative arrangement, in which the TPR is inseparably part of the whole structure. Nothing about the sequence, or any a priori considerations, suggested that this structure would be different from the TPRs of independent domains. The structure provides a strong hint at the location of the active site, though as yet the ligand bound by NlpI has not been identified. Its involvement in bacterial virulence, and the likely presence of identical interactions in many pathogenic species, make NlpI a potential target for new antibiotics.

Experimental procedures

Cloning NlpI constructs

DNA encoding NlpI sequences was obtained by PCR amplification from a single colony of E. coli DH10B [28] (Invitrogen, Carlsbad, CA, USA), grown on Luria–Bertani agar overnight. All oligonucleotides were chemically synthesized by the W. M. Keck Core Facility (Yale University, New Haven, CT, USA). Primers to amplify mature NlpI (residues 20–294) were 5′-aataatccatggggagtaatacttcctggcgtaaaagtgaagtcc-3′ and 5′-attattggatccctattgctggtccgattctgccag-3′. Primers for 3-TPR NlpI (residues 62–197) were 5′-aataatccatgggggcacagcttttatatgagcgcggag-3′ and 5′-aataatggatcctcactgttccttatccgatttttcgaagtgc-3′. PCR products were doubly digested with NcoI and BamHI (New England Biolabs, Beverly, MA, USA), and purified by agarose gel electrophoresis onto dialysis membrane, prior to ligation into doubly digested, dephosphorylated expression vector pET11a-HT. This vector was assembled in-house from vectors pProEX-HTa (Invitrogen) and pET11a [32] (Novagen, San Diego, CA, USA), and places cloned sequences under T7 promoter control. Expression in an E. coli DE3 bacterial host produces an N-terminal hexahistidine-tagged protein, cleavable with TEV protease. Ligation products were transformed into electrocompetent E. coli DH10B (Invitrogen), and transformants sequenced by the W. M. Keck Facility.
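As an illustrative check on the primer design described above, the sketch below locates the NcoI (ccatgg) and BamHI (ggatcc) recognition sequences in each primer and reads the three bases that follow the BamHI site on the coding strand. It assumes that the spaces within the primer sequences as printed are line-break artifacts and can be removed. The dictionary keys and helper names are hypothetical and are not part of the authors' procedure.

```python
# Recognition sequences of the two restriction enzymes named in the text.
NCOI = "ccatgg"
BAMHI = "ggatcc"

# Primer sequences from the text, with internal spaces removed
# (assumption: the spaces are line-break artifacts of extraction).
PRIMERS = {
    "mature_fwd": "aataatccatggggagtaatacttcctggcgtaaaagtgaagtcc",
    "mature_rev": "attattggatccctattgctggtccgattctgccag",
    "tpr3_fwd": "aataatccatgggggcacagcttttatatgagcgcggag",
    "tpr3_rev": "aataatggatcctcactgttccttatccgatttttcgaagtgc",
}

COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase or uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def site_start(primer, site):
    """0-based index of the first occurrence of site in primer, or -1."""
    return primer.lower().find(site)


def codon_after_bamhi(reverse_primer):
    """Coding-strand reading of the three bases that follow the BamHI site."""
    start = site_start(reverse_primer, BAMHI)
    if start < 0:
        raise ValueError("no BamHI site found in primer")
    end_of_site = start + len(BAMHI)
    return reverse_complement(reverse_primer[end_of_site:end_of_site + 3])


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        site = NCOI if name.endswith("fwd") else BAMHI
        print(name, site, site_start(seq, site))
    print(codon_after_bamhi(PRIMERS["mature_rev"]),
          codon_after_bamhi(PRIMERS["tpr3_rev"]))
```

With the sequences as given, each recognition site begins at the seventh base, after a six-base 5′ overhang, and the bases following the BamHI site in the two reverse primers read as stop codons (tag and tga) on the coding strand.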

Expression and purification

Plasmids, verified by DNA sequencing, were transformed into E. coli BL21 (DE3) Gold (Stratagene), and grown in carbenicillin at 37 °C until cell culture absorbance at 600 nm […] induction with 100 µM isopropyl thio-β-D-galactoside. Cells were harvested by centrifugation (6000 g, 20 min) after 4 h […] (SeMet)-labelled NlpI was expressed in E. coli methionine auxotroph B834 (DE3), grown in M9 medium [33] […] thiamine. At a cell density of D600 = 0.4, bacteria were harvested and resuspended in fresh M9 supplemented with

Fig. 7. Structural homologues of NlpI. Structure alignment of NlpI (blue) with (A) p67phox and (B) domain III of MalT (yellow). Superimposition of coordinates was performed with LSQ_EXPLICIT and LSQ_IMPROVE [45].

Title	The crystal structure of NlpI: A prokaryotic tetratricopeptide repeat protein with a globular fold
Authors	Christopher G. M. Wilson, Tommi Kajander, Lynne Regan
Supervisor	Lynne Regan
Institution	Yale University
Field	Molecular Biophysics and Biochemistry
Type	Scientific report
Year of publication	2004
City	New Haven
