possess different types of nuclear localization signals
Gábor Merényi1, Emese Kónya1 and Beáta G. Vértessy1,2
1 Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
2 Department of Applied Biotechnology, Budapest University of Technology and Economics, Budapest, Hungary
Introduction
In eukaryotic organisms, proteins with cognate nuclear function must penetrate the nuclear envelope after translation in the cytoplasm. Nuclear import and export of proteins can proceed by active or passive transport, or as a member of a protein complex actively targeted into the nucleus [1–3]. For this latter mechanism, which is the major one for proteins larger than 30–35 kDa, specific and direct nuclear targeting requires the presence of a nuclear localization signal (NLS), which is the relevant sequence information in
Keywords
cellular trafficking; Drosophila melanogaster;
dUTPase; nuclear localization signal;
uracil-DNA degrading factor
Correspondence
B. G. Vértessy, Institute of Enzymology,
Biological Research Center, Hungarian
Academy of Sciences, Karolina út 29,
H-1113 Budapest, Hungary
Fax: +36 1 466 5465
Tel: +36 1 279 3116
E-mail: vertessy@enzim.hu
(Received 4 August 2009, revised 23
February 2010, accepted 1 March 2010)
doi:10.1111/j.1742-4658.2010.07630.x
Adequate transport of large proteins that function in the nucleus is indispensable for cognate molecular events within this organelle. Selective protein import into the nucleus requires nuclear localization signals (NLS) that are recognized by importin receptors in the cytoplasm. Here we investigated the sequence requirements for nuclear targeting of Drosophila proteins involved in the metabolism of uracil-substituted DNA: the recently identified uracil-DNA degrading factor, dUTPase, and the two uracil-DNA glycosylases present in Drosophila. For the uracil-DNA degrading factor, NLS prediction identified two putative NLS sequences [PEKRKQE(320–326) and PKRKKKR(347–353)]. Truncation and site-directed mutagenesis using YFP reporter constructs showed that only one of these basic stretches is critically required for efficient nuclear localization in insect cells. This segment corresponds to the well-known prototypic NLS of SV40 T-antigen. An almost identical NLS segment is also present in the Drosophila thymine-DNA glycosylase, but no NLS elements were predicted in the single-strand-specific monofunctional uracil-DNA glycosylase homolog protein. This latter protein has a molecular mass of 31 kDa, which may allow NLS-independent transport. For Drosophila dUTPase, two isoforms with distinct features regarding molecular mass and subcellular distribution were recently described. In this study, we characterized the basic PAAKKMKID(10–18) segment of dUTPase, which has been predicted to be a putative NLS by in silico analysis. Deletion studies, using YFP reporter constructs expressed in insect cells, revealed the importance of the PAA(10–12) tripeptide and the ID(17–18) dipeptide, as well as the role of the PAAK(10–13) segment in nuclear localization of dUTPase. We constructed a structural model that shows the molecular basis of such recognition in three dimensions.
Abbreviations
NLS, nuclear localization signal; LD-DUT, long isoform of dUTPase; NTT-DUT, N-terminally truncated short isoform of dUTPase; SMUG1, single-strand-specific monofunctional uracil-DNA glycosylase 1; T-ag, T-antigen; TDG, thymine-DNA glycosylase; UDE, uracil-DNA degrading factor.
sequence motifs have been identified to date, and there is no unique well-defined consensus amino acid sequence for all NLS [4,5]. However, major common characteristics of these sequences are (i) a high content of basic amino acid residues such as lysine (K) and arginine (R), and (ii) the presence of conserved proline(s) (P) potentially involved in breaking secondary structural elements within the NLS. One group of simple NLS includes monopartite motifs, generally defined as a short amino acid region consisting of 4–6 basic residues in a row, like the classic NLS of SV40 large T-antigen (SV40 T-ag) [6]. Another type of NLS, such as the NLS of nucleoplasmin in Xenopus laevis, comprises bipartite motifs, which contain two distinct stretches of positively charged clusters separated by a mutation-tolerant linker region [7]. In addition, sequences containing several neutral or even negatively charged conserved residues may also act as functional monopartite NLS, with the negatively charged aspartate/glutamate (D/E) also contributing to NLS function [8]. Interestingly, the NLS of human RanBP3 [9] is an unusual signal with close homology to the NLS of c-Myc [10].
Nuclear proteins containing NLS motifs can enter the nucleus via the nuclear pore, utilizing a strictly organized mechanism maintained by karyopherin molecules and the nuclear pore complex [11]. The nuclear pore complex is a large protein complex consisting of multiple subunits and located in the nuclear membrane. It is also the main route for exchange of small particles, e.g. ions and nucleotides, between the nuclear and cytosolic compartments. Importin β, a type of karyopherin molecule, is a nuclear transport receptor, which can bind its molecular cargo either directly or indirectly through adaptor proteins such as importin α. Importin β is unable to bind directly to classical nuclear targeting motifs such as the NLS of SV40 T-ag or the NLS of nucleoplasmin, but can mediate nuclear import indirectly in association with importin α. Importin α possesses two major domains for its adaptor function, the importin β binding (IBB) domain at its N-terminus and the C-terminal NLS-binding domain. In the absence of importin β, an auto-inhibiting part of the IBB domain forms an intramolecular interaction with the NLS-binding domain, preventing association with the NLS on the cargo protein. Thus, the presence or absence of importin β regulates the NLS binding ability of importin α. The relatively large NLS-binding domain of importin α consists of ten armadillo repeats, each comprising three α-helices. In association with each other, the armadillo repeats form a large concave superhelical
in extended conformation to the binding pockets of the superhelical surface of importin α. These binding pockets contain several conserved residues (e.g. asparagine, tryptophan and negatively charged residues) involved in hydrophobic and electrostatic interactions with the positively charged residues of the NLS (see [1] for a recent review).
Here, we wished to identify and characterize NLS for Drosophila melanogaster proteins involved in uracil-DNA metabolism. Four such major proteins have been described to date: (i) the newly identified uracil-DNA degrading factor (UDE) [12,13], (ii) dUTPase, which is responsible for prevention of uracil incorporation into DNA [14], and (iii) two DNA glycosylases, thymine-DNA glycosylase (TDG) [15] and the single-strand-specific monofunctional uracil-DNA glycosylase 1 (SMUG1) homolog protein.
The UDE protein, encoded by the CG18410 gene in the D. melanogaster genome, was recently identified in a pull-down screen on uracil-DNA from larval extracts [12]. In vitro studies have shown that this protein specifically degrades uracil-containing DNA, but lacks any appreciable homology to previously described uracil-DNA-recognizing proteins. CG18410 gene expression may be under developmental control, and the protein has been suggested to play a role in metamorphosis in Drosophila. The subcellular localization of this protein had not been characterized.
dUTPase catalyzes the cleavage of dUTP into dUMP to control cellular dUTP/dTTP ratios, and is an essential enzyme in both prokaryotes and eukaryotes [16,17]. Lack of dUTPase leads to uracil-substituted DNA that perturbs base excision repair, resulting in DNA fragmentation and thymine-less cell death [14]. Most dUTPases are homotrimers with native molecular masses of approximately 50–65 kDa [18–24]. Both human and D. melanogaster cells contain a nuclear isoform of dUTPase, and the NLS segment of the human enzyme has been investigated in detail [25]. In D. melanogaster dUTPase, a similar N-terminal segment was recently proposed as the NLS region [26]. In D. melanogaster, two physiological isoforms of the enzyme were identified, with apparent molecular masses of 69 and 63 kDa for the native homotrimers (termed long isoform, LD-DUT, and the N-terminally truncated short isoform, NTT-DUT, respectively) [27]. Only LD-DUT contains the complete putative NLS sequence [PAAKKMKID(10–18)], while NTT-DUT lacks 14 residues at the N-terminus. This segment shows a high degree of flexibility and cannot be located in the 3D structure of the protein determined by X-ray crystallography (PDB ID 3ECY) [21].
Uracil-DNA glycosylases are the key repair enzymes that remove uracil from DNA by catalyzing cleavage of the N-glycosidic bond [28]. To perform this function in eukaryotic cells, these enzymes must reside in the nuclear or mitochondrial compartments [29]. There are four or five major families of uracil-DNA glycosylases, but only two of these are encoded in the D. melanogaster genome [30]. The molecular masses of these two glycosylases, based on reported sequences [15], are 191 kDa for TDG and 31 kDa for the SMUG1 homolog. No quantitative data are available indicating potential oligomerization for the monomeric species, and the family member uracil-DNA glycosylase is a monomer [31].
In the present study, we aimed to (i) determine the subcellular distribution of UDE, (ii) identify sequence determinants essential for nuclear translocation in proteins involved in uracil-DNA metabolism in Drosophila, and (iii) functionally characterize these NLS. Based on in silico prediction, we fused various sequence segments from the ORF of UDE and dUTPase to the yellow fluorescent protein (YFP) and generated chimeric reporter constructs. In addition, to characterize the essential and sufficient amino acids of the NLS, we performed deletion studies and site-directed mutagenesis on the putative NLS regions. For transient transfection studies, we used the Sf9 homogeneous insect cell line, which has superior characteristics for subcellular sorting analysis compared with the Drosophila Schneider 2 cell line, including a convenient generation time and a morphology that allows straightforward microscopic detection of cellular compartments.
Results and Discussion
Subcellular targeting of UDE
Nuclear targeting of UDE may be critical for performance of the suggested degradation function on genomic DNA containing uracil [12]. In silico prediction (using PSORTII [32]; http://psort.ims.u-tokyo.ac.jp/) suggested two individual clusters of residues as a putative NLS region, separated by 21 amino acids, in the C-terminus of the protein (Fig. 1A, and Tables 1 and 2). The first cluster (NLS1), PEKRKQE(320–326), consists of both positively and negatively charged residues. The second stretch (NLS2), PKRKKKR(347–353)E, is located at the very end of the C-terminus and has a high proportion of positively charged amino acids. Underlined residues are predicted to be part of the NLS. Each sequence starts with the neutral amino acid proline and ends its context with glutamic acid. We fused the full-length UDE, containing these two predicted sequences, to the N-terminus of YFP. After Sf9 cell transfection using the chimera construct, fluorescence was observed on samples of fixed cells. The 22.2 kDa YFP alone, used as a control, could penetrate non-selectively through the nuclear pore, most probably because its smaller molecular mass allows passive diffusion. Fluorescence microscopy analysis showed that the YFP-tagged UDE has an exclusive nuclear localization in Sf9 cells (Fig. 2A and Table 3).
In the control experiment, YFP alone was observed throughout the cell (Fig. 2K and Table 3). These data demonstrate that the wild-type UDE is targeted specifically and exclusively into the nucleus, in agreement with its putative nuclear function in insect cells.
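The size argument used for the YFP control can be written down as a simple comparison against the 30–35 kDa range quoted in the Introduction. The sketch below is illustrative only: the function name and the three category labels are hypothetical, the masses are the monomer values quoted in the text, and real diffusion behavior also depends on shape and oligomeric state, which this comparison ignores.

```python
# Illustrative sketch: the 30-35 kDa range is the approximate limit quoted in
# the Introduction; the function name and labels are assumptions.

def import_route(mass_kda: float, lower: float = 30.0, upper: float = 35.0) -> str:
    """Classify a protein by mass relative to the quoted diffusion range."""
    if mass_kda < lower:
        return "passive diffusion plausible"
    if mass_kda <= upper:
        return "borderline"
    return "signal-mediated import expected"


# Molecular masses quoted in the text (kDa).
masses = {"YFP": 22.2, "SMUG1 homolog": 31.0, "TDG": 191.0}
routes = {name: import_route(mass) for name, mass in masses.items()}

for name, route in routes.items():
    print(f"{name}: {route}")
```

Under these assumptions YFP falls below the range, the SMUG1 homolog sits inside it, and TDG lies far above it, which mirrors the qualitative reasoning in the text.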
Subcellular distribution of C-terminal truncated forms of UDE
To test whether the nuclear import of UDE requires either or both of the predicted signals, various C-terminally truncated UDE species were linked to the N-terminus of the YFP reporter (Fig. 1). In the first construct, UDEΔ(316–355)–YFP, a large part of the C-terminus was deleted, including both putative NLS segments. In the second construct, UDEΔ(346–355)–YFP, the last ten residues of the C-terminus were removed, including the PKRKKKR(347–353) sequence. The reporter constructs were introduced into Sf9 cells and subsequently analyzed by fluorescence microscopy. The results show that lack of the full-length flexible C-terminal region, containing both of the predicted signals, totally abolished the nuclear distribution, causing significant cytoplasmic retention of UDE (Fig. 2B and Table 3). When the last ten residues of the C-terminus, including only the second predicted NLS, were deleted, the pattern of subcellular distribution was also exclusively cytoplasmic (Fig. 2C). These results suggest that the PEKRKQE(320–326) sequence on its own is not able to translocate the protein into the nuclear compartment. In contrast, the presence of the PKRKKKR(347–353) sequence, consisting of six contiguous positively charged amino acids, is critical for exclusive nuclear localization of UDE. The PKRKKKR(347–353) segment is almost identical to the NLS of SV40 T-ag, indicating a powerful capability for function as an NLS.
Subcellular targeting of UDE containing specific site mutations in the NLS sequence
To extend our investigations, we generated separate mutations to identify amino acids responsible for the nuclear targeting function of the PKRKKKR sequence (Fig. 1). The K350A/K351A double mutation slightly altered the pattern of subcellular distribution, indicating attenuation of the nuclear targeting effect (Fig. 2D and Table 3). The K350A/K351A/K352A/R353A quadruple mutation also perturbed the exclusive nuclear targeting of UDE, resulting in significant cytoplasmic retention (Fig. 2E). Based on these results, the PKRKKKR(347–353) sequence is suggested to be a strong NLS sequence with high mutation tolerance. In accordance with the putative segments defined by in silico prediction (Table 1), it was found that the presence of the KPKR(346–349) segment is sufficient for partial nuclear localization of the protein.
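The mutation nomenclature used here (for example K350A) can be checked mechanically against the NLS2 segment. The following is a minimal sketch, assuming the PKRKKKR segment is numbered 347–353 as stated in the text; the helper name `apply_mutations` is hypothetical and is not part of the original study.

```python
import re


def apply_mutations(segment: str, first_residue: int, mutations: list[str]) -> str:
    """Apply substitutions such as 'K350A' to a segment numbered from first_residue."""
    residues = list(segment)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        wild_type, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        index = position - first_residue
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation} lies outside the segment")
        # Guard against numbering mistakes: the stated wild-type residue must match.
        if residues[index] != wild_type:
            raise ValueError(f"{mutation}: found {residues[index]} at position {position}")
        residues[index] = replacement
    return "".join(residues)


NLS2 = "PKRKKKR"  # residues 347-353 of UDE, as given in the text
double_mutant = apply_mutations(NLS2, 347, ["K350A", "K351A"])
quadruple_mutant = apply_mutations(NLS2, 347, ["K350A", "K351A", "K352A", "R353A"])

print(double_mutant)     # PKRAAKR
print(quadruple_mutant)  # PKRAAAA
```

The quadruple mutant leaves only the PKR residues of NLS2 intact, which corresponds to the 350AAAA353 notation used for the reporter construct.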
Subcellular targeting potential of the predicted
UDE NLS1 and NLS2 sequences
To determine whether either of the two predicted NLS sequences possesses strong nuclear targeting potential on its own, the PEKRKQE (NLS1) and PKRKKKR (NLS2) coding sequences were fused as a C-terminal tag to YFP protein (Fig. 1). The constructs YFP–UDE-NLS1 and YFP–UDE-NLS2 were transiently transfected into Sf9 cells. Expression and intracellular appearance of the fluorescent proteins were observed by fluorescence microscopy. The results show that the NLS2 segment has selective and powerful targeting potential for accumulation of YFP in the nucleus (Fig. 2F and Table 3). The pattern of subcellular distribution of YFP–UDE-NLS1 was not exclusively nuclear or cytoplasmic, although some accumulation was observed within the nuclear compartment compared to the YFP control (compare Fig. 2G and K). Further, the C-terminal portion of UDE was fused to YFP and expressed in Sf9 cells. This UDEΔ(1–319)–YFP reporter construct containing both predicted NLS sequences was exclusively retained in the nucleus (Fig. 2H). After introducing quadruple mutations (K350A/K351A/K352A/R353A) into this construct, UDEΔ[1–319(350AAAA353)]–YFP, the exclusive nuclear distribution was highly perturbed, but increased nuclear accumulation was observed compared to YFP alone
Fig. 1. Scheme of D. melanogaster UDE constructs used in the present study. (A) Position and context of putative nuclear localization sequences (underlined) within the flexible C-terminus of D. melanogaster UDE are indicated. (B) Schematic representation of various UDE–YFP reporter constructs. The wild-type (wt), flexible C-terminally truncated [Δ(316–355)] and the NLS truncated [Δ(346–355)] coding sequences were fused in-frame to the N-terminus of YFP protein, resulting in UDE WT–YFP, UDEΔ(316–355)–YFP and UDEΔ(346–355)–YFP reporter constructs. The UDE(350AA351)–YFP reporter construct contains the K350A and K351A mutations, and the UDE(350AAAA353)–YFP reporter construct contains the K350A, K351A, K352A and R353A mutations. The truncated reporter constructs UDEΔ(1–319)–YFP and UDEΔ[1–319(350AAAA353)]–YFP are also indicated. The relevant regions, positions and mutations of the NLS of UDE are indicated by differently shaded boxes. (C) The predicted NLS sequences (NLS1 and NLS2) and the deleted variant of NLS2 were fused in-frame to the C-terminus of the YFP ORF, generating the YFP–UDE-NLS1, YFP–UDE-NLS2 and YFP–UDE-NLS2Δ(350–353) reporter constructs. Establishment of vector constructs was performed as described in Experimental procedures.
(Fig. 2I,K). The last examined reporter construct, YFP–UDE-NLS2Δ(350–353), which possesses only three basic residues [KPKR(346–349)] from the NLS2 segment fused to YFP, also showed localization in the nucleus and the cytoplasm, with some accumulation within the nucleus (Fig. 2J).
These observations indicate that the NLS2 segment is a strong monopartite NLS, and that the contribution of the predicted NLS1 to nuclear localization is negligible. Within the NLS2 segment, both the KPKR and the KKKR tetrapeptides contribute to nuclear localization.
Prediction of NLS signals in Drosophila
uracil-DNA glycosylases
Table 1 lists the predicted NLS signals for the TDG protein. Several clusters of putative localization signals were observed. Among these, the PKKRGRKKK(711–719) sequence is almost identical to the NLS of the SV40 T-ag and also to the UDE NLS segment. As the SV40 T-ag has been extensively characterized [33] and we also found in our present experiments that such a sequence has very strong nuclear localization potential, we propose that this sequence also acts as an NLS in the TDG protein. For the SMUG1 homolog protein, no nuclear localization signal was predicted by the PSORTII program (Table 1). Lack of predicted signals cannot be taken as evidence for the actual absence of NLS segments, as prediction performs well only for classical NLS. It is also worthwhile noting that the molecular size of SMUG1 may allow passive translocation to the nucleus.
Subcellular distribution of the D. melanogaster dUTPase isoforms
For D. melanogaster dUTPase, prediction identified the underlined segment within PAAKKMK(10–16)ID as a conventional NLS comprising a short cluster of non-polar and basic residues (Fig. 3, and Tables 1 and 2). To determine the subcellular distribution of D. melanogaster dUTPase isoforms in the Sf9 cell line,
Table 1. In silico predictions of putative nuclear localization signals of Drosophila dUTPase, UDE, TDG and SMUG1 homolog proteins. To identify the putative nuclear localization sites, the full-length open reading frame sequences of the proteins were obtained from the UniProt database (http://www.uniprot.org) and analyzed using PSORTII (http://psort.ims.u-tokyo.ac.jp/). Putative signal sequences, defined as potential NLS regions, are shown, with the number in parentheses indicating the number of the first residue.
Protein | UniProt ID of protein (UniProtKB/TrEMBL) | ORF length (amino acids) | Sequences defined as putative NLS segments
UDE: PKRK (347), KRKK (348), RKKK (349), KKKR (350), PEKRKQE (320), PKRKKKR (347)
TDG: RKKK (716), RKKH (760), KKKR (1088), RPKK (1093), PKKK (1141), KKKR (1142), RPKK (1147), PNNRKRQ (114), PMPKKRG (709), PKKRGRK (711), PKERKKH (757), PLEKKKR (1085), PKKIKGQ (1094), PKKKRGR (1141), PKKLKPA (1148)
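The four- and seven-residue segments listed in Table 1 are consistent with two simple pattern rules of the kind PSORT II applies to classical NLS. The sketch below is an approximation written for illustration, not the PSORT II source code: it assumes a four-residue rule (four K/R residues, or three K/R plus one H or P) and a seven-residue rule (a P followed within three residues by a four-residue stretch containing at least three K/R). The function names `find_pat4` and `find_pat7` are hypothetical.

```python
BASIC = set("KR")


def find_pat4(seq: str, first_residue: int = 1) -> list[tuple[int, str]]:
    """Four-residue windows with four K/R, or three K/R plus one H or P."""
    hits = []
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        n_basic = sum(res in BASIC for res in window)
        n_hp = sum(res in "HP" for res in window)
        if n_basic == 4 or (n_basic == 3 and n_hp == 1):
            hits.append((first_residue + i, window))
    return hits


def find_pat7(seq: str, first_residue: int = 1) -> list[tuple[int, str]]:
    """Seven-residue windows starting with P in which a four-residue stretch
    beginning 1-3 residues after the P holds at least three K/R."""
    hits = []
    for i in range(len(seq) - 6):
        window = seq[i:i + 7]
        if window[0] != "P":
            continue
        for offset in (1, 2, 3):
            stretch = window[offset:offset + 4]
            if sum(res in BASIC for res in stretch) >= 3:
                hits.append((first_residue + i, window))
                break
    return hits


# Segments quoted in the text, numbered by their first residue.
print(find_pat4("PKRKKKR", 347))   # four overlapping hits starting at 347-350
print(find_pat7("PEKRKQE", 320))   # the NLS1 heptapeptide
print(find_pat7("PAAKKMKID", 10))  # the dUTPase heptapeptide PAAKKMK
```

Run on the UDE NLS2 segment, the four-residue rule returns the same four overlapping tetrapeptides (347 to 350) that appear in the table, which is why a single basic stretch produces several table entries.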
Table 2. Comparison of UDE, dUTPase and TDG NLS segments with NLS sequences of various proteins. The monopartite sequences listed show close similarity to either the SV40 T-ag NLS or the c-Myc NLS segments. The NLS sequences of UDE and TDG show close homology to the SV40 T-ag NLS, but the D. melanogaster dUTPase NLS belongs to the c-Myc group. Interestingly, the NLS segment of human dUTPase is more similar to the first group of sequences. For comparison, the classic bipartite NLS sequence of X. laevis nucleoplasmin is shown, which possesses an additional short cluster of basic residues separated by 10 amino acids from the basic stretch, which has close homology with the NLS of SV40 T-ag. SV40 T-ag, simian virus 40 large T-antigen [6]; v-Jun, sarcoma virus 17 oncogene homolog [39]; H2B, histone 2B [40]; UDE, uracil-DNA degrading factor; human dUTPase [25]; c-Myc, myelocytomatosis cellular oncogene [10]; RanBP3, Ran binding protein 3 [9].
Monopartite
H2B of Saccharomyces cerevisiae GKKRSKV
dUTPase of D. melanogaster PAAKKMKID
Bipartite
reporter constructs were created by N-terminal fusion to YFP (Fig. 3B and Table 3). Cellular targeting of both isoforms was subsequently determined via cell transfection experiments followed by fluorescence microscopic detection. The results show that the long isoform of dUTPase (LD-DUT) is specifically targeted into the nucleus, but the short one (NTT-DUT) was not able to enter the nuclear compartment and remained exclusively in the cytoplasm (Fig. 4A,B). This is in agreement with studies performed in Drosophila Schneider S2 cells [26].
These results indicated that the presence of the predicted complete targeting sequence is necessary and sufficient for exclusive nuclear targeting of the long
Fig. 2. Subcellular localization of D. melanogaster UDE protein and its various sequence derivatives. Fluorescence microscopy observations show the subcellular distribution of chimeric UDE constructs. (A) Wild-type UDE (UDE WT–YFP) was targeted exclusively to the nucleus. (B,C) Deletion studies showed that removal of the entire flexible C-terminus or the last ten residues of the C-terminus of the UDE ORF results in exclusive cytoplasmic localization of chimeric constructs UDEΔ(316–355)–YFP and UDEΔ(346–355)–YFP, respectively. (D) The reporter construct UDE(350AA351)–YFP, which contains a double K/A mutation, is predominantly located in the nucleus and slightly in the cytoplasm. (E) Quadruple mutations in the reporter construct [UDE(350AAAA353)–YFP] have an attenuating effect on nuclear localization, with most of the construct accumulating within the nucleus, although cytoplasmic localization was also observed. (F) The YFP–UDE-NLS2 reporter localized almost exclusively in the nucleus. (G) The YFP–UDE-NLS1 construct was seen in both the nuclear compartment and the cytoplasm. (H) The UDEΔ(1–319)–YFP reporter, which contains both predicted NLS sequences, was exclusively retained in the nucleus. (I) The UDEΔ[1–319(350AAAA353)]–YFP construct was seen in both the nucleus and the cytoplasm, but seemed to accumulate in the nucleus. (J) The reporter construct YFP–UDE-NLS2Δ(350–353), which possesses only three basic residues from the NLS segment, did not show any selective compartmentalization, and was distributed almost equally in the nucleus and the cytoplasm. (K) YFP alone was used as a negative control. The cellular distribution of YFP was approximately the same within the nuclear and cytoplasmic compartments.
Table 3. Summary of results for the subcellular distributions of reporter constructs. Details of the reporter constructs for dUTPase and UDE are shown in the first three columns. The observed subcellular localizations of reporter constructs are indicated by plus and minus signs. Two plus signs indicate distribution between the nuclear and cytoplasmic compartments; one plus sign indicates exclusion from either the nucleus or the cytoplasm.
Protein | Name of reporter construct | NLS sequence present in reporter construct | Localization
Fig. 3. Scheme of D. melanogaster dUTPase constructs used in the present study. (A) The position and context of putative nuclear localization signals (underlined) are indicated in the N-terminus of the long isoform of D. melanogaster dUTPase. (B) The long (LD-DUT WT) and short (NTT-DUT WT) isoforms of the D. melanogaster dUTPase coding sequences were fused in-frame to the N-terminus of the YFP ORF to generate the LD-DUT–YFP and NTT-DUT–YFP chimeric constructs, respectively. The relevant motifs, regions and positions of the NLS of dUTPase are indicated by differently shaded boxes. (C) The NLS sequence (PAAKKMKID) and its truncated sequence variants (KKMKID and KMKID) were fused in-frame to the N-terminus of the YFP ORF, generating the DUT-NLS–YFP, DUT-NLSΔ(10–12)–YFP and DUT-NLSΔ(10–13)–YFP reporter constructs. Further reporter constructs, DUT-NLSΔ(17–18)–YFP, DUT-NLSΔ(10–12,17–18)–YFP and DUT-NLSΔ(10–13,17–18)–YFP, are also indicated, which were generated in the same way, but all lack the ID(17–18) dipeptide. Establishment of vector constructs was performed by the general cloning method described in Experimental procedures.
isoform (LD-DUT). The partial segment MKID(15–18), present on the short isoform, cannot drive nuclear import. In the case of the short isoform (NTT-DUT), absence of the first 14 residues of the N-terminus, including the PAAKK(10–14) segment, dramatically alters the translocation pattern of dUTPase.
Nuclear targeting potential of the dUTPase NLS
sequence and its truncated derivatives
To confirm that the complete putative NLS sequence has nuclear targeting potential of its own, the PAAKKMKID coding sequence was fused as an N-terminal tag to YFP protein (Fig. 3). The construct (DUT-NLS–YFP) was transiently transfected into Sf9 cells. After cell fixation, the expression and intracellular localization of the fluorescent protein were observed by fluorescence microscopy. The results show that this putative NLS sequence was able to confer nuclear localization to the YFP protein (Fig. 4C and Table 3). DUT-NLS–YFP is found predominantly in the nuclear compartment, demonstrating that this sequence, which possesses a cluster of basic amino acids flanked by non-polar and acidic residues, is a powerful NLS.
In order to identify amino acid residues that are essential for NLS function, we constructed truncated derivatives of the NLS sequence linked to the
Fig. 4. Subcellular localization of the isoforms of D. melanogaster dUTPase and its various NLS sequence derivatives. Fluorescence microscopy observations reveal the subcellular distribution of chimeric constructs. (A,B) The long isoform of dUTPase (LD-DUT WT–YFP) was localized to the nucleus exclusively, and the short isoform (NTT-DUT WT–YFP) was present exclusively in the cytoplasm. (C) NLS sequence studies show that, in the presence of the complete nuclear localization signal, the reporter construct DUT-NLS–YFP is located in the nucleus. (D) Deletion of the first three residues (PAA), producing the construct DUT-NLSΔ(10–12)–YFP, slightly perturbed exclusive nuclear localization, with some cytoplasmic localization observed. (E) Deletion of the first four residues (PAAK), producing the reporter construct DUT-NLSΔ(10–13)–YFP, resulted in localization to the nucleus and the cytoplasm in an approximately equal ratio. (F) The subcellular localization of the reporter construct DUT-NLSΔ(17–18)–YFP, lacking the ID(17–18) dipeptide, was nuclear, with some infiltration into the cytoplasm. (G) The DUT-NLSΔ(10–12,17–18)–YFP construct, which lacks the tripeptide PAA and the ID(17–18) dipeptide, shows an almost equal distribution in the nucleus and the cytoplasm. (H) The subcellular targeting of the DUT-NLSΔ(10–13,17–18)–YFP reporter was also not selective, showing close to equal distribution in the nucleus and the cytoplasm. (I) YFP alone was used as a negative control. The cellular distribution of YFP was approximately the same within the nuclear and cytoplasmic compartments.
N-terminus of the YFP reporter. In the first construct, DUT-NLSΔ(10–12)–YFP, the neutral PAA tripeptide was removed and the remaining part of the sequence, KKMKID, was fused to the YFP reporter. In the second construct, the PAAK residues were deleted and the KMKID stretch was fused to the reporter, resulting in the chimeric fluorescent construct DUT-NLSΔ(10–13)–YFP. After transfection and subsequent fixation of Sf9 cells, the NLS potential of the individual truncated derivatives was monitored by fluorescence microscopy. Observations show that deletion of the PAA tripeptide slightly perturbs nuclear localization, as cytoplasmic fluorescence was also observed (Fig. 4D). Although the PAA neutral tripeptide alone may not define subcellular compartmentalization for proteins, its position upstream of the short cluster of basic residues may be essential to relax the secondary structure of the polypeptide chain, facilitating the molecular interaction with importins. Removal of these three non-basic residues of the dUTPase NLS resulted in moderate perturbation of nuclear import and accumulation. In the truncated construct lacking the PAAK segment, we observed greatly increased cytoplasmic localization of the fluorescent reporter construct (Fig. 4E). This observation indicates that removal of only one positively charged residue in addition to the PAA tripeptide strongly alters recognition characteristics within the nuclear import machinery.
Furthermore, we established and examined three additional NLS–reporter constructs lacking the ID(17–18) dipeptide of the putative NLS sequence. The subcellular distribution of the DUT-NLSΔ(17–18)–YFP construct was nuclear, with some infiltration into the cytoplasm (Fig. 4F). The DUT-NLSΔ(10–12,17–18)–YFP construct, which lacks the first PAA tripeptide, shows an almost equal distribution within the nucleus and the cytoplasm (Fig. 4G). The subcellular targeting of the third reporter construct, DUT-NLSΔ(10–13,17–18)–YFP, which lacks the PAAK residues, was also close to equal distribution between the nucleus and the cytoplasm (Fig. 4H). These results indicate that the lack of ID(17–18) might slightly decrease the exclusive nuclear localization potential of the predicted NLS sequence. Additional oligopeptide deletions (PAA and PAAK) have a further negative effect on the nuclear targeting potential of the NLS sequence examined.
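The deletion nomenclature used for these constructs maps directly onto the residue numbering of the PAAKKMKID(10–18) segment. As a minimal sketch, assuming that numbering, the peptide carried by each reporter can be derived as follows; the helper name `delete_ranges` and the dictionary keys are hypothetical labels chosen for this illustration.

```python
def delete_ranges(segment: str, first_residue: int, ranges: list[tuple[int, int]]) -> str:
    """Remove inclusive residue ranges from a segment numbered from first_residue."""
    removed = set()
    for start, end in ranges:
        removed.update(range(start, end + 1))
    return "".join(
        residue
        for position, residue in enumerate(segment, start=first_residue)
        if position not in removed
    )


DUT_NLS = "PAAKKMKID"  # residues 10-18 of the long dUTPase isoform

deletions = {
    "10-12": [(10, 12)],
    "10-13": [(10, 13)],
    "17-18": [(17, 18)],
    "10-12,17-18": [(10, 12), (17, 18)],
    "10-13,17-18": [(10, 13), (17, 18)],
}
peptides = {name: delete_ranges(DUT_NLS, 10, spans) for name, spans in deletions.items()}

for name, peptide in peptides.items():
    print(f"DUT-NLS delta({name}): {peptide}")
```

The first two results, KKMKID and KMKID, match the truncated sequences named in the text; the remaining three are derived here from the same numbering rather than quoted from the source.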
Structural model of the Drosophila dUTPase NLS
segment in complex with importin α protein
Binding of the NLS segment to importin α has been characterized by in-depth structural studies that allow molecular insight into the specific interactions. Based on the published 3D structure of yeast importin α in complex with the c-Myc NLS segment peptide (PDB ID 1EE4) [34], and the close similarity between the NLS segments of c-Myc and Drosophila dUTPase (Table 2), we modeled this latter peptide onto the c-Myc peptide in the NLS peptide–yeast importin α structure. Figure 5A shows the alignment of the yeast and Drosophila importin α protein sequences, which show 69% similarity and 54% identity within the ten armadillo domains responsible for NLS recognition. Figure 5A also shows the aligned sequence of a mammalian importin α (mouse importin α, which is 94% identical to the human sequence) (PDB ID 1IAL) [35]. For mammalian importins, 3D structures of complexes with other types of NLS peptides have been reported [36–38]. The alignments in Fig. 5A show the high degree of conservation of helical structure and residues interacting with NLS peptides. Figure 5B shows the structural models of the two NLS peptides in complex with yeast importin α (c-Myc NLS peptide in turquoise, Drosophila dUTPase NLS peptide in green), indicating very close superposition of the two NLS segments. The close overlap is indicated by the observation that the two colors (green and turquoise) overlap considerably, and it is mostly the green color that is seen as the dUTPase NLS peptide was selected to be the ‘upper’ one in PyMOL. Consequently, most of the molecular interactions are equally present in both NLS peptides. Importantly, all importin α amino acids that contain atoms within 4 Å of the NLS peptides (displayed in orange in Fig. 5A,B) are conserved between the yeast and Drosophila importin α proteins, strengthening the assumption that the modeled recognition does take place in the physiological complex. There are two noteworthy differences between the NLS peptides of c-Myc and D. melanogaster dUTPase: lysine at position 14 in the dUTPase NLS is an arginine in c-Myc, while methionine at position 15 in the dUTPase NLS is a valine in c-Myc. With regard to the important role of the PAAK(10–13) segment in the NLS peptide, it is noteworthy that the ε-NH2 group of the lysine residue at position 13 makes numerous contacts: it is within H-bonding distance of three oxygen atoms of conserved amino acids within importin α (the main-chain oxygen of glycine at position 168, the side-chain hydroxyl oxygen of threonine at position 173, and the side-chain carboxylate oxygen of aspartate at position 210; the numbering of the Drosophila sequence is used). However, the subsequent lysine residue at position 14 (arginine in c-Myc) cannot establish polar interactions with the carboxylate oxygen of aspartate at position 237 (the electrostatic bonding partner of the arginine residue in the c-Myc peptide) due to its shorter side chain. The methionine residue at position 15,
Fig. 5. Modeling the interactions between the D. melanogaster dUTPase NLS segment and importin α protein. (A) Sequence alignment for armadillo domains of Mus musculus (M. mus.), D. melanogaster (D. mel.) and yeast importin α. Residues within the α-helices constituting the armadillo domains are shown on a pink background; residues that contain atoms within 4 Å of the NLS peptides of c-Myc or D. melanogaster dUTPase (see Fig. 5B) are on an orange background. Asterisks indicate identical residues, semicolons and dots show highly conserved or conserved replacements, respectively. Ten armadillo domains (ARM) are shown. (B) Three-dimensional structural model of the NLS peptide–importin α complex. The protein surface is shown for the first five armadillo domains in either pink (for the α-helices) or brown (for other protein parts). The NLS peptides of c-Myc or D. melanogaster dUTPase and importin α residues that contain atoms within 4 Å of the peptides are shown as stick models with atomic coloring (red, oxygen; blue, nitrogen; yellow, sulfur; orange, green or turquoise, carbon atoms of importin α, dUTPase NLS and c-Myc NLS, respectively). For orientation, most residues of the dUTPase NLS are labeled, together with four residues of importin α (see text for details). Note that the dUTPase NLS peptide can adopt a docking conformation equivalent to that of the c-Myc peptide on the importin protein surface.
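The 4 Å criterion used to select the highlighted importin α residues amounts to a pairwise distance test between receptor atoms and peptide atoms. The sketch below shows one way such a selection could be computed; it is not the procedure used by the authors, the function name `residues_within` is hypothetical, and the coordinates are made-up values for illustration rather than data from PDB entry 1EE4.

```python
import math

# (residue label, xyz coordinates in angstroms)
Atom = tuple[str, tuple[float, float, float]]


def residues_within(receptor: list[Atom], peptide: list[Atom], cutoff: float = 4.0) -> list[str]:
    """Return receptor residue labels having at least one atom within
    `cutoff` angstroms of any peptide atom, in first-seen order."""
    close: list[str] = []
    for label, xyz in receptor:
        if label in close:
            continue  # residue already selected through another atom
        if any(math.dist(xyz, peptide_xyz) <= cutoff for _, peptide_xyz in peptide):
            close.append(label)
    return close


# Made-up coordinates for illustration only.
receptor_atoms = [
    ("R1", (0.0, 0.0, 0.0)),
    ("R2", (3.0, 0.0, 0.0)),
    ("R3", (10.0, 0.0, 0.0)),
]
peptide_atoms = [("P1", (1.0, 2.0, 2.0))]

contacts = residues_within(receptor_atoms, peptide_atoms)
print(contacts)  # ['R1', 'R2']
```

With real structures the atom lists would be read from the coordinate file, and a spatial index would be preferable to this all-pairs loop for large complexes.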