The CRIPTO/FRL-1/CRYPTIC (CFC) domain of human Cripto
Functional and structural insights through disulfide structure analysis
Susan F Foley, Herman W T van Vlijmen, Raymond E Boynton, Heather B Adkins, Anne E Cheung, Juswinder Singh, Michele Sanicola, Carmen N Young and Dingyi Wen
Biogen, Inc., Cambridge Center, Cambridge, MA, USA
The disulfide structure of the CRIPTO/FRL-1/CRYPTIC
(CFC) domain of human Cripto protein was determined by
a combination of enzymatic and chemical fragmentation,
followed by chromatographic separation of the fragments,
and characterization by mass spectrometry and N-terminal
sequencing. These studies showed that Cys115 forms a
disulfide bond with Cys133, Cys128 with Cys149, and
Cys131 with Cys140. Protein database searching and
molecular modeling revealed that the pattern of disulfide
linkages in the CFC domain of Cripto is the same as that in
pars intercerebralis major peptide C (PMP-C), a serine
protease inhibitor, and that the EGF-CFC domains of
Cripto are predicted to be structurally homologous to the
EGF-VWFC domains of the C-terminal extracellular
portions of Jagged 1 and Jagged 2. Biochemical studies of the interactions of ALK4 with the CFC domain of Cripto by fluorescence-activated cell sorter analysis indicate that the CFC domain binds to ALK4 independently of the EGF domain. A molecular model of the CFC domain of Cripto was constructed based on the nuclear magnetic resonance structure of PMP-C. This model reveals a hydrophobic patch in the domain opposite to the presumed ALK4 binding site. This hydrophobic patch may be functionally important for the formation of intra- or intermolecular complexes.
Keywords: Cripto; disulfide structure; CFC domain; model
Cripto is a member of a family of proteins that includes
human Cripto and Cryptic, murine Cripto and Cryptic, frog
FRL-1, zebrafish one-eyed pinhead protein (oep) and chick
Cripto [1–3]. The involvement of these proteins in early
embryonic development is well established [1,2,4–11], and
other recent investigations indicate that Cripto is
overexpressed in a number of human cancers [2]. These proteins
are characterized by two cysteine-rich structural motifs: an
epidermal growth factor (EGF)-like domain and a
CRIPTO/FRL-1/Cryptic (CFC) domain, the latter of which is
considered unique to this family. Previous characterization
of human recombinant Cripto (residues 1–169) showed that
the mature protein begins at Leu31, that Asn79 is
N-glycosylated with >90% occupancy, that Ser40 and Ser161 are partially
O-glycosylated [2], and that Thr88 is modified with a single
O-linked fucose [12]. Ser161 is the predicted ω-site for
propeptide cleavage and glycosylphosphatidylinositol (GPI) attachment, and the segment comprising residues 170–188 is the predicted signal peptide for GPI anchorage (Fig. 1) [1]. Evidence of an EGF-like domain structure in Cripto-related proteins is based on amino acid sequence homology [1,2,4,13,14], molecular modeling [1,14], and gene structure [1,2,4,14]. In contrast, no predictive model exists for the disulfide linkage of the six cysteines in the CFC region.
The role of the EGF-CFC family of proteins in embryogenesis is still being elucidated, but information to date suggests that Cripto is required for Nodal binding to the ActRIIB/ALK4 receptor complex and for Nodal activation of similar to mothers against decapentaplegic peptide-2 (Smad-2) [15–18]. Moreover, point mutation experiments with Cripto have shown that the EGF domain is necessary for binding to Nodal and the CFC domain is responsible for binding to ALK4 [16,17,19]. A naturally occurring Pro125→Leu mutation in the CFC domain of Cripto has been correlated with developmental anomalies in the midline and forebrain in human fetuses, and an engineered construct with the same Pro125→Leu mutation was inactive in a rescue model of the oep phenotype in zebrafish [19]. These findings highlight the biological importance of Cripto and underline the functional significance of the CFC domain. In the present work, we have solved the disulfide structure of the CFC domain and have conducted biochemical studies that detail the interactions of ALK4 with the CFC domain. From molecular modeling studies, we have shown that the CFC domain of Cripto is structurally homologous to the von Willebrand Factor type C-like domain and that Cripto protein is structurally similar to the C-terminal extracellular portions of Jagged 1 and Jagged 2.
Correspondence to Dingyi Wen, Biogen, Inc., 14 Cambridge Center,
Cambridge, MA 02142, USA.
Fax: +1 617 679 2616, Tel.: +1 617 679 2362.
E-mail: Dingyi_Wen@biogen.com
Abbreviations: CFC, CRIPTO/FRL-1/Cryptic; PMP-C, pars
intercerebralis major peptide C; oep, zebrafish one-eyed pinhead protein;
EGF, epidermal growth factor; LTβR, lymphotoxin β receptor;
FACS, fluorescence-activated cell sorter; PTH, phenylthiohydantoin;
Cripto delC-Fc, Cripto (amino acids 1–169) fused to the hinge and
Fc region of human IgG1; NEM, N-ethylmaleimide; IAM,
iodoacetamide; NES, 2-(N-ethylsuccinimidyl); ESI, electrospray
ionization; VWFC, von Willebrand Factor C domain;
GPI, glycosylphosphatidylinositol.
(Received 7 May 2003, revised 3 July 2003,
accepted 11 July 2003)
Experimental procedures
Protein expression and purification
Recombinant human Cripto-1 was expressed in Chinese
hamster ovary cells as a C-terminally truncated form,
comprising amino acid residues 1–169, and was purified by
immunoaffinity chromatography on an anti-Cripto mAb
column [12].
Fluorescence-activated cell sorter (FACS) analysis
Analysis of Cripto–ALK4 interactions by flow cytometry
was performed essentially as described earlier [19]. Briefly,
human 293 cells were transfected with plasmids expressing
human ALK4 (provided by M. Whitman, Harvard
Medical School), full-length wild-type Cripto, Cripto
(N85G/T88A) or Cripto (H120G/W123G) using Fugene
(Roche) according to the manufacturer’s instructions.
After 48 h, cells were processed for flow cytometry.
Approximately 5 × 10⁵ cells were incubated with
10 µg·mL⁻¹ of either human Cripto delC-Fc, CFC-Fc,
EGF-Fc, LTβR (lymphotoxin β receptor)-Fc or human
ALK4-Fc (R&D Systems), followed by a PE
(phycoerythrin)-conjugated anti-human Fc secondary Ig (Jackson
Immunoresearch).
Estimation of free thiol groups
Approximately 50 µg of human Cripto (residues 74–169,
which covers the combined EGF-CFC domains) was
incubated in 100 mM N-ethylmaleimide (NEM), 6 M
guanidine HCl, 72 mM Mes, pH 6.0, at 37 °C for 1 h.
The control sample was incubated in the same way, but
without NEM. Samples were desalted using ethanol
precipitation [20]. This treatment was followed by
complete reduction of cystines with 30 mM dithiothreitol
in 8 M guanidine HCl, 150 mM Tris HCl, pH 8.5, at
45 °C for 45 min, followed by alkylation with 75 mM
iodoacetamide (IAM) at room temperature for 1 h.
After desalting using ethanol precipitation, the sample
was deglycosylated with PNGase F (Glyko) in 2 M urea,
50 mM sodium phosphate, pH 7.6, 20 mM methylamine
HCl, 5 mM EDTA, at 37 °C overnight. Intact mass was
measured on-line using a ZMD (electrospray ionization)
mass spectrometer (Waters). The molecular masses
were generated by deconvolution with the MAXENT 1
program.
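An electrospray spectrum of an intact protein shows a series of multiply charged ions, so a neutral mass has to be inferred from the m/z values. The sketch below is not the MAXENT 1 algorithm (a maximum-entropy method); it only illustrates the simpler arithmetic that relates two adjacent charge states to a neutral mass. The function names and the 12 000 Da species are hypothetical and serve purely as an illustration.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(neutral_mass, z):
    """m/z of the [M + zH]z+ ion of a species with the given neutral mass."""
    return (neutral_mass + z * PROTON) / z


def deconvolute_pair(mz_low_charge, mz_high_charge):
    """Infer charge and neutral mass from two adjacent charge states.

    mz_low_charge  : peak carrying z protons (the larger m/z value)
    mz_high_charge : peak carrying z + 1 protons (the smaller m/z value)
    """
    # (m2 - p) / (m1 - m2) equals z when m1 = M/z + p and m2 = M/(z+1) + p
    z = round((mz_high_charge - PROTON) / (mz_low_charge - mz_high_charge))
    return z, z * (mz_low_charge - PROTON)


# Hypothetical 12 000 Da species observed at charge states 10+ and 11+
p10, p11 = mz(12000.0, 10), mz(12000.0, 11)
z, mass = deconvolute_pair(p10, p11)
print(z, round(mass, 1))
```

In practice a glycosylated protein gives several overlapping charge-state series, which is one reason a dedicated deconvolution program is used rather than pairwise arithmetic.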
Generation of disulfide-linked CFC fragments of Cripto using CNBr and endoproteinase Lys-C
Nonreduced Cripto was treated with 1 M CNBr in 70% formic acid at room temperature for 36 h. To remove residual CNBr, the treated sample was dried under vacuum in a SpeedVac concentrator, suspended in HPLC-grade water and dried again. The water wash and drying steps were repeated, after which the final pellet was dissolved in 200 mM Mes, pH 6.0. Approximately 60 µg of protein in 160 mM Mes, pH 6.0, 20% 2-propanol, was digested with 6 µg of endoproteinase Lys-C (WAKO) at room temperature for 24 h. An additional 6 µg of the enzyme was added and the sample was incubated for a further 24 h. Solid guanidine HCl was added to a final concentration of 6 M to quench the reaction.
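The cleavage rules behind this double digest (CNBr cleaves C-terminal to Met, endoproteinase Lys-C cleaves C-terminal to Lys) can be sketched as a simple in-silico digestion. The sequence used here for residues 113–154 is assembled from peptides a and c of the Fig. 3 legend, on the assumption that the motif "FPQA" occurs once so that the segment has 42 residues; the helper name `cleave_after` is hypothetical. The sketch assumes complete cleavage and ignores disulfide bonds, whereas the Results describe a major species cleaved only after Lys126.

```python
def cleave_after(sequence, residues, start=1):
    """Split `sequence` C-terminal to every residue listed in `residues`.

    Returns (first_residue_number, last_residue_number, fragment) tuples.
    """
    fragments, begin = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues or i == len(sequence) - 1:
            fragments.append((start + begin, start + i, sequence[begin:i + 1]))
            begin = i + 1
    return fragments


# Residues 113-154 of human Cripto (assumption: taken from the Fig. 3
# legend, peptides a and c, with "FPQA" occurring once).
CFC_113_154 = "ENCGSVPHDTWLPK" + "KCSLCKCWHGQLRCFPQAFLPGCDGLVM"

lys_c = cleave_after(CFC_113_154, "K", start=113)  # Lys-C: after Lys
cnbr = cleave_after(CFC_113_154, "M", start=113)   # CNBr: after Met

for first, last, frag in lys_c:
    print(f"{first}-{last}: {frag}")
```

Because the only Met in this segment is Met154 at its C-terminus, CNBr leaves the segment intact, while the three Lys residues (126, 127 and 132) define the possible Lys-C fragments.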
Separation of disulfide-linked CFC fragments of Cripto
Fragments of Cripto were separated by reverse-phase HPLC (rp-HPLC) on a 1-mm × 250-mm Vydac C4 column using a Waters Separation Module, Model 2690. Solvent A was 0.1% (v/v) trifluoroacetic acid in water and solvent B was 0.08% (v/v) trifluoroacetic acid in 75% acetonitrile. A linear gradient from 0 to 60% acetonitrile over 160 min was applied at a flow rate of 0.07 mL·min⁻¹. The column temperature was 30 °C. Fractions were collected at 1-min intervals. Fractions that were determined by mass analysis to be enriched in the CFC-containing fragment were pooled and concentrated to dryness under vacuum. The pellet was dissolved in 160 mM Mes, pH 6.0, 20% (v/v) 2-propanol, 1 mM CaCl₂. Digestion was carried out at room temperature using the following regimen: 1.8 µg of thermolysin was added at time 0 and after 24 h, and then 1.8 µg of endoproteinase Lys-C was added after 48 h. The progress of the digestion reaction was monitored by MALDI-TOF MS. For this purpose, aliquots were removed after 24, 48 and 72 h and desalted using Millipore C18 ZIP TIPs™, with or without reduction with 25 mM dithiothreitol in 4 M guanidine HCl, 80 mM Tris HCl, pH 8.5, at room temperature for 1–2 h prior to mass analysis. The enzymatic digest was stopped by the addition of solid urea to 6 M. The thermolysin and endoproteinase Lys-C digests were separated on an rp-HPLC Vydac C18 column using a 100-min linear gradient of 0–75% acetonitrile. Solvent A was 0.03% (v/v) trifluoroacetic acid in water, and solvent B was 0.024% (v/v) trifluoroacetic acid in 75% (v/v) acetonitrile. One-minute fractions were collected and concentrated to dryness under vacuum, and the residue was resuspended in 5 µL of 0.1% (v/v) trifluoroacetic acid, 30% (v/v) acetonitrile.
Peptide analysis using MALDI-TOF MS
MALDI-TOF MS was carried out on a Voyager-DE™ STR mass spectrometer (Applied Biosystems) in either linear or reflector mode using α-cyano-4-hydroxycinnamic acid as matrix. Results generated using the linear mode are expressed as average, protonated masses; those collected in the reflector mode are expressed as protonated, monoisotopic masses. An aliquot of 0.5 µL or 1 µL of each test sample was applied to the target plate. After partial evaporation of the sample droplet at room temperature, 0.5 µL of matrix (10 mg·mL⁻¹ in 50% acetonitrile/0.1% trifluoroacetic acid, v/v) was applied. Data acquisition and analysis were controlled by GRAMS/32 software (version 4.11, Level 2).
Fig. 1. The predicted sequence of human Cripto-1 encoded from DNA is shown: the signal peptide is italicized and corresponds to residues 1–30. Cripto delC covers residues 31–169, with the CFC domain in bold. The arrow indicates the predicted ω-site for propeptide cleavage and GPI attachment; the signal region for GPI anchorage is underlined.
N-terminal sequencing
Sequencing was carried out on an Applied Biosystems
Procise 494 cLC sequencer that was run in the pulsed liquid
mode. The resulting PTH (phenylthiohydantoin) amino
acids were separated using an ABI 140D Solvent Delivery
System with a 0.8-mm × 250-mm C18 PTH column and
were monitored on-line using an ABI 785A programmable
absorbance detector. Data were analyzed using the
ABI 610A data analysis software.
Homology search
A BLAST search [21] of the SWISSPROT database was
carried out using the primary sequence of the CFC region
(residues 113–154) of human Cripto as the query motif.
Disulfide pattern search
The experimentally defined disulfide bond pattern of the
CFC region of Cripto was used to query an in-house
disulfide database built from annotations in SWISSPROT.
The search method reports all proteins with the same
disulfide topology, e.g. C1-C4/C2-C6/C3-C5 (C1 is the first
cysteine in the domain, C2 is the second, C3 is the third,
etc.), and ranks them according to sequence spacing
between the cysteines.
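The idea of a topology search can be illustrated with a minimal sketch. It is not the in-house program, whose scoring is not described here: the ranking metric (summed absolute difference in cysteine spacing), the function names and the three database entries are all assumptions made for illustration. Only the Cripto cysteine positions and bonds come from this paper. The sketch also enumerates the possible pairings of six cysteines, which shows why an experimental assignment is needed.

```python
def all_pairings(items):
    """Yield every way to pair up an even-length list (perfect matchings)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        for tail in all_pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def topology(cys_positions, bonds):
    """Canonical label such as 'C1-C4/C2-C6/C3-C5' from residue numbers."""
    rank = {pos: i + 1 for i, pos in enumerate(sorted(cys_positions))}
    pairs = sorted(tuple(sorted((rank[a], rank[b]))) for a, b in bonds)
    return "/".join(f"C{a}-C{b}" for a, b in pairs)


def spacing(cys_positions):
    """Number of residues between consecutive cysteines."""
    pos = sorted(cys_positions)
    return [b - a - 1 for a, b in zip(pos, pos[1:])]


def search(query_cys, query_bonds, database):
    """Return same-topology entries, closest cysteine spacing first."""
    q_top, q_sp = topology(query_cys, query_bonds), spacing(query_cys)
    hits = [
        (sum(abs(x - y) for x, y in zip(q_sp, spacing(cys))), name)
        for name, (cys, bonds) in database.items()
        if topology(cys, bonds) == q_top
    ]
    return [name for _, name in sorted(hits)]


# Query: CFC domain of human Cripto (residue numbers from this paper)
cripto_cys = [115, 128, 131, 133, 140, 149]
cripto_bonds = [(115, 133), (128, 149), (131, 140)]

# Hypothetical database entries, for illustration only
database = {
    "entry_A": ([4, 17, 19, 22, 28, 33], [(4, 22), (17, 33), (19, 28)]),
    "entry_B": ([10, 20, 30, 40, 50, 60], [(10, 20), (30, 40), (50, 60)]),
    "entry_C": ([5, 30, 32, 50, 70, 90], [(5, 50), (30, 90), (32, 70)]),
}

print(len(list(all_pairings(list(range(1, 7))))))  # possible pairings of 6 Cys
print(topology(cripto_cys, cripto_bonds))
print(search(cripto_cys, cripto_bonds, database))
```

Six cysteines can be paired in 5 × 3 × 1 = 15 ways, so matching a single experimentally defined topology is a fairly selective filter before spacing is considered.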
Comparative modeling
The 3-D structures of the EGF-like domain and the CFC
domain were modeled separately using the MODELER
module [22] of the INSIGHT II software package (Accelrys,
Inc., San Diego, CA, USA). The NMR structure of mouse
EGF [Protein Data Bank code (pdb): 1epi] was used for
modeling of the EGF-like domain. The CFC domain was
built using the NMR structure of the proteinase inhibitor,
PMP-C (pdb: 1pmc).
Motif search
Motif elements, identified independently by homology or
disulfide pattern, were used to query the nonredundant
database TREMBL using the DART program (a domain
motif search algorithm).
Results
The Cripto CFC domain is a functional unit
The predicted amino acid sequence (Fig 1) of mature
human Cripto contains 12 cysteine residues, six in the EGF
domain and six in the CFC domain To test whether the
CFC domain could retain its function independent of the
EGF domain, we generated a soluble form of the CFC
domain, comprised of the signal peptide and amino acids
112–169, fused to the hinge and Fc region of human IgG1
(CFC-Fc), and tested its ability to bind to ALK4.
Previously, we showed in a FACS assay that full-length Cripto (amino acids 1–169) fused to human Fc (Cripto delC-Fc) bound to human 293 cells expressing ALK4, but not to control 293 cells lacking ALK4 [19]. We have now evaluated the binding of soluble CFC-Fc to ALK4-expressing 293 cells by FACS assay, using Cripto delC-Fc as the positive control and LTβR-Fc as a negative control. Figure 2A shows the results of this comparison. A significant shift in mean fluorescence for ALK4-293 cells was seen in the presence of either Cripto delC-Fc or CFC-Fc, but not with LTβR-Fc (Fig. 2A2). A small shift was also seen with EGF-Fc, but this shift was not dependent on ALK4 expression. These experiments also show that the shift in mean fluorescence for CFC-Fc (Fig. 2A3) binding to ALK4-293 cells is of similar magnitude to the shift of the positive control, Cripto delC-Fc (Fig. 2A1), and therefore that the CFC domain is sufficient for the interaction of Cripto with ALK4.
To verify the role of the CFC domain in ALK4 binding, we analyzed the effects of point mutations in the EGF and CFC domains by FACS analysis, using mutations known to disrupt function, specifically either downstream signaling or ALK4 binding [12,16]. We compared the ability of both types of mutants, i.e. the CFC domain mutant, H120G/W123G, and the EGF domain mutant, N85G/T88A, to bind to ALK4 by FACS (Fig. 2B). The results showed that ALK4-Fc binds well to cells expressing either wild-type Cripto (Fig. 2B1) or the EGF domain mutant, N85G/T88A (Fig. 2B2), but does not bind to cells expressing the CFC mutant, H120G/W123G (Fig. 2B3). This and the previous experiments demonstrate that the CFC domain is involved in ALK4 binding.
Fig. 2. FACS analysis of the interactions between Cripto and ALK4. (A) Incubation of soluble Cripto delC-Fc (A1), EGF-Fc (A2), and CFC-Fc (A3) with 293 cells expressing ALK4. The cells expressing ALK4 (bold, solid curve) are compared to the control cells that do not express ALK4 (solid curve). Incubation of LTβR-Fc with 293 cells expressing ALK4 was used as a control for the Fc portion of the proteins (dashed curve). (B) Incubation of ALK4-Fc with 293 cells expressing full-length wild-type Cripto (B1), Cripto N85G/T88A (B2), or Cripto H120G/W123G (B3). Cells expressing Cripto or mutants (bold, solid curve) are compared to the control cells that do not express any Cripto proteins (solid curve).
Determination of disulfide linkages in the CFC domain
Determination of whether there are free thiol groups in the
protein was done by alkylation of the protein with NEM
under nonreducing conditions followed by alkylation with
IAM under reducing conditions. Alkylation of a cysteine
with NEM will add 125.1 Da to the mass of the protein or
peptide, whereas alkylation with IAM will add a mass of
57.1 Da. The results from ESI mass spectrometric analysis
showed a range of molecular masses corresponding to
residues 74–169 completely alkylated with IAM, with
heterogeneity in glycosylation. Masses corresponding to
protein containing 2-(N-ethylsuccinimidyl)-cysteine
(NES-Cys) residues were not detected. Therefore, we conclude that
all of the cysteine residues in the protein are disulfide-linked.
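The logic of this differential alkylation test can be made concrete with a short calculation. The sketch below derives the mass added per cysteine by each reagent from its elemental composition (NEM adds C6H7NO2; carbamidomethylation by IAM adds a net C2H3NO) using standard average atomic masses, and then gives the extra mass expected, relative to the fully IAM-alkylated protein, if some cysteines had been free. The function names are hypothetical.

```python
# Standard average atomic masses (Da)
ATOMIC = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def added_mass(formula):
    """Average mass of the atoms a reagent adds to one cysteine."""
    return sum(ATOMIC[element] * n for element, n in formula.items())


NEM = added_mass({"C": 6, "H": 7, "N": 1, "O": 2})  # N-ethylmaleimide adduct
IAM = added_mass({"C": 2, "H": 3, "N": 1, "O": 1})  # carbamidomethyl group


def shift_vs_all_iam(n_free_thiols):
    """Extra mass expected if n cysteines were free and captured by NEM."""
    return n_free_thiols * (NEM - IAM)


print(round(NEM, 1), round(IAM, 1))
for n in range(3):
    print(n, round(shift_vs_all_iam(n), 1))
```

Each free thiol would therefore shift the deconvoluted mass upward by roughly 68 Da relative to the all-IAM species, which is the signal whose absence supports the conclusion that every cysteine is disulfide-linked.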
To study the disulfide structures of the CFC domain, a double cleavage strategy was developed using CNBr treatment followed by endoproteinase Lys-C cleavage. This strategy took advantage of a Lys residue (Lys112) between the EGF-like domain and the CFC domain and a Met residue (Met154) between the last Cys in the CFC domain and the O-linked glycosylation site at residue Ser161. The dual digest was then separated by rp-HPLC, and the fractions containing the CFC domain were identified by MALDI-TOF MS and pooled for further analysis (see below). In the CFC region, there are three Lys residues that might be cleaved by endoproteinase Lys-C and two Trp residues that could be oxidized during CNBr treatment [23]. Additional cleavage can take place on the C-terminal side of oxidized Trp [24]. The observed protonated mass (MH+) of the major component in the pooled fractions containing the CFC domain was 4702.4 Da (Fig. 3), which is consistent with fragments having either two oxidized Trp residues and one cleavage at a Lys residue or one oxidized Trp and cleavages at two of the Lys residues. In-source fragment ions, MH+ = 1599.5 Da and MH+ = 3105.9 Da (Fig. 3), indicated that the 4702-Da component was derived mainly from the peptides 113–126 (calculated m/z = 1599.8) and 127–154 (calculated m/z = 3106.8), linked by a disulfide bond. A minor component generated by an additional cleavage after oxidized Trp123 (calculated MH+ = 4365.12 Da, based on disulfide-linked peptides 113–123 and 127–154) was also identified (Fig. 3). The pooled fractions were also analyzed by MALDI-TOF MS after reduction. The results support the identification of the CFC peptides predicted from in-source fragmentation. N-terminal sequencing results also supported this interpretation (data not shown).
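The arithmetic that connects the two in-source fragments to the disulfide-linked parent ion can be sketched as follows. The sketch assumes that the quoted calculated m/z values (1599.8 and 3106.8) are MH+ values of the separate peptides with the linking cysteines as free thiols, and that cleavage after Trp123 simply removes residues Leu124, Pro125 and Lys126; both are assumptions, and the function name is hypothetical.

```python
PROTON = 1.00728  # Da
H_AVG = 1.00794   # average mass of a hydrogen atom, Da


def linked_mh(mh_a, mh_b):
    """MH+ of two peptides joined by one disulfide bond.

    Inputs are the MH+ values of the separate, reduced peptides. Forming
    the bond removes two hydrogens, and the linked species carries one
    proton instead of two.
    """
    return mh_a + mh_b - 2 * H_AVG - PROTON


# Calculated values quoted in the text (linear mode, average masses)
major = linked_mh(1599.8, 3106.8)  # peptides 113-126 + 127-154

# Average residue masses of Leu, Pro and Lys (residues 124-126), assumed
# to be lost when cleavage occurs after Trp123
LPK = 113.1594 + 97.1167 + 128.1741
minor = major - LPK                # peptides 113-123 + 127-154

print(round(major, 1), round(minor, 1))
```

Under these assumptions the minor component comes out at about 4365.1 Da, in line with the calculated 4365.12 Da, and the major component at about 4703.6 Da, roughly 1.2 Da above the observed 4702.4 Da.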
Fig. 3. MALDI-TOF mass spectrum of the CFC domain-containing fractions under nonreducing conditions. Peptide a, ENCGSVPHDTW(OX)LPK; peptide b, ENCGSVPHDTW(OX); and peptide c, KCSLCKCW(OX)HGQLRCFPQAFLPGCDGLVM. The spectrum was obtained in the linear mode and all masses correspond to protonated average masses. Oxidized Trp residues are represented as W(OX) and the Met residue converted to homoserine lactone is in italics. Masses corresponding to intact CFC domain were not present. In-source fragments are indicated with asterisks.
Fig. 4. MALDI-TOF mass spectrum of the nonreduced CFC domain after all enzymatic treatments. The spectrum was derived in the reflector mode and all masses correspond to protonated monoisotopic mass. Enzyme fragment peaks are identified with asterisks and in-source fragments are underlined.
The CFC domain-containing fractions were further digested with thermolysin, followed by endoproteinase Lys-C. Twenty percent 2-propanol was added to the digest to promote preferential cleavages by thermolysin at the N-terminus of leucine, isoleucine, and phenylalanine [25]. The extent of proteolytic cleavage between cysteine residues was monitored by MALDI-TOF MS after reduction (data not shown). Figure 4 shows the mass spectrum of the nonreduced digest after all enzyme treatments. For simplicity, we use C1 for the first cysteine residue in the CFC domain, C2 for the second, C3 for the third, etc., and we use this nomenclature in the following discussion. Interpretation of the data for the nonreduced sample is supported by identification of the peptides necessary to form the predicted disulfide bonds. For example, the mass signal detected at m/z = 2243.1 was interpreted as a disulfide-linked component composed of peptides 113–126 [C1 (Cys115)] and 133–137 [C4 (Cys133)] (Fig. 4). The corresponding peptide 113–126 (calculated m/z = 1598.7) and peptide 133–137 (calculated m/z = 646.2) were detected both under reducing conditions and as in-source fragment ions derived from the disulfide-linked peptide (Fig. 4). The mass spectrometric data also clearly demonstrate that C3 (Cys131) forms a disulfide bond with C5 (Cys140), as evidenced by disulfide-linked peptides at masses 957.5, 1066.6, 1123.7, and 1194.7. In addition, certain in-source fragments expected from these disulfide-linked peptides are present, i.e. at 763.4 and 834.5 Da (Fig. 4). As C1 is disulfide-bonded to C4 and C3 is disulfide-bonded to C5, it can be deduced that C2 must be linked to C6, although the corresponding mass was not detected, presumably due to ion suppression. To confirm this deduction, the thermolysin digest was separated by rp-HPLC and the peaks were analyzed by MALDI-TOF MS. In one of the major peaks, masses of 1042.5 and 914.5, corresponding to the disulfide-linked peptides FLPGC(6)DG with KC(2)S and FLPGC(6)DG with C(2)S, respectively, were detected. Other disulfide-linked peptides, such as C1-C4 and C3-C5, were also identified by MALDI-TOF MS in different fractions. The fractions containing disulfide-linked peptides were evaluated by N-terminal sequencing. The mass spectrometric and N-terminal sequencing results confirmed that C1 is linked to C4, C2 to C6, and C3 to C5.
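The C2-C6 assignment can be checked from first principles. The following sketch computes the protonated monoisotopic mass of two peptides joined by one disulfide bond from standard monoisotopic residue masses, and also encodes the elimination argument (if C1-C4 and C3-C5 are assigned and all six cysteines are paired, only C2 and C6 remain). The function names are hypothetical, and the peptides are treated as unmodified.

```python
# Monoisotopic residue masses (Da) for the amino acids used below
RESIDUE = {
    "G": 57.02146, "S": 87.03203, "P": 97.05276, "C": 103.00919,
    "L": 113.08406, "D": 115.02694, "K": 128.09496, "F": 147.06841,
}
WATER, PROTON, H_ATOM = 18.01056, 1.00728, 1.00783


def peptide_mass(seq):
    """Neutral monoisotopic mass of a linear, reduced peptide."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def linked_mh(seq_a, seq_b):
    """MH+ of two peptides joined by a single disulfide bond."""
    return peptide_mass(seq_a) + peptide_mass(seq_b) - 2 * H_ATOM + PROTON


def remaining_pair(n_cys, known_pairs):
    """Deduce the last disulfide when all cysteines are known to be paired."""
    used = {c for pair in known_pairs for c in pair}
    left = [c for c in range(1, n_cys + 1) if c not in used]
    if len(left) != 2:
        raise ValueError("exactly two cysteines must remain unassigned")
    return tuple(left)


print(round(linked_mh("FLPGCDG", "KCS"), 1))
print(round(linked_mh("FLPGCDG", "CS"), 1))
print(remaining_pair(6, [(1, 4), (3, 5)]))
```

The two computed values fall within about 0.2 Da of the reported 1042.5 and 914.5, and their difference corresponds to one Lys residue, as expected for the KCS and CS partners.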
Primary structure search
The amino acid sequence information for the CFC region
of human Cripto was used to carry out a BLAST search of
the combined SWISSPROT/TREMBL database. An
initial search showed matches to the VWFC (von
Willebrand Factor C)-like domain in human and chicken
α-1 collagen, mouse and human NELL 2, and chicken
NEL, with low homology (e-value > 0.1), in addition to
other Cripto and Cripto-like proteins. The VWFC
domain is defined by a pattern of 10 cysteine residues
of undetermined connectivity, but the similarity of the
Cripto CFC domain to the above-listed proteins is confined to the portion of the VWFC motif containing the first six cysteine residues.
Motif search
Subsequent searches of the protein database with DART, using combined EGF-like/VWFC sequences as queries, provided additional matches, some of which are listed in Table 1. Many of the identified proteins contain both EGF-like and VWFC domains, but only in human Jagged 1 and Jagged 2, Drosophila Serrate, and NELL 1 were both domains adjacent and in the same order as the putative EGF-like and CFC regions of human Cripto. Furthermore, only in Jagged 2 are the regions adjacent to the membrane interface (transmembrane) region. To examine the strength of this relationship, we aligned the amino acid sequences of human Jagged 2 and Cripto (Fig. 5). The conservation of residues such as cysteine, proline, glycine, and tryptophan, which are important for protein folding, is highlighted in the alignment [26].
Table 1. Summary of some of the proteins identified as containing VWFC-like domains. Definitions of motifs as VWFC-like are based on SWISS-PROT annotations and NCBI DART predictions.
Protein name             No. of EGF-like domains   No. of VWFC-like domains   No. of Cys in VWFC-like domains
NELL2 (human and rat)    6                         5                          10 in domains 1–4, 8 in domain 5
Protein kinase C (BP)    6                         5                          10 in domains 1–4, 8 in domain 5
Fig. 5. Alignment of the sequences of human Cripto and human Jagged 2. Conserved residues are framed with solid lines and homologous residues are framed with dashed lines. The sequence identity over the alignment length is 26%.
Disulfide pattern search and comparative modeling
The disulfide pattern that was determined experimentally for human Cripto (i.e. C1-C4, C2-C6, C3-C5) was used to query a disulfide database compiled from SWISSPROT annotations (van Vlijmen, H. W. T., Gupta, A. & Singh, J., Biogen Corp., unpublished observations). This is an orthogonal method for exploring relationships, and it revealed two small, structurally related serine protease inhibitors, PMP-D2 and PMP-C [27], that were not uncovered using BLAST on the SWISSPROT/TREMBL database. Based on the NMR structure of PMP-C (Protein Data Bank code, 1pmc) and the sequence alignment shown in Fig. 6, a 3-D model was built for the CFC domain of Cripto (Fig. 7). In the computed model of the Cripto CFC domain, one side of the molecule has a high concentration of hydrophobic residues, including Trp134, Leu138, Phe141, Pro142, Phe145, and Leu146. These hydrophobic residues are on the side of the protein opposite to residues His120 and Trp123, which have been implicated in binding of Cripto to ALK4. The hydrophobic residues may play a role in the folding of Cripto by interacting with the EGF-like domain, or they may constitute the interaction site with other signaling components. A 3-D model of the EGF-like domain of Cripto was also built, based on the NMR structure of murine EGF (Protein Data Bank code, 1epi), by aligning the cysteine residues as described previously [28]. Two theoretical models of the full-length Cripto protein were constructed by connecting the EGF and CFC modules. In the first model (Fig. 7A) the domains are arranged in an extended conformation, analogous to the conformation found for the solution structure of a covalently linked pair of EGF domains from human fibrillin-1 [29]. The second model has a more globular structure in which the EGF and CFC modules have a large number of noncovalent contacts (Fig. 7B), analogous to the crystal packing of the EGF-like domains from human factor IX [30].
Fig. 6. Alignment of the sequences of the CFC domain of human Cripto and PMP-C. Conserved residues are framed with solid lines and homologous residues are framed with dashed lines.
Fig. 7. Model for EGF-CFC domains of Cripto. Hypothetical structures for EGF-CFC domains are shown in an extended conformation (A) or closed conformation (B). In the CFC domain, disulfide bonds Cys115-Cys133, Cys128-Cys149, and Cys131-Cys140 are indicated by DS1, DS2 and DS3, respectively. Residues H120 and W123 have been implicated in ALK4 binding and are shown in purple. Residues N79 and T88 (shown in red) are modified through N-linked glycosylation and O-linked fucosylation, respectively. Residues N79, N85, R104, and E107 (the latter three shown in blue) have been shown to be important in Nodal induction of Smad2 phosphorylation [16]. Nter designates the location of the expressed amino-terminus; Cter designates the location of the expressed carboxyl-terminus.
We have also modeled the EGF-like (15th EGF domain) and adjacent VWFC domains of human Jagged 2, using the same approach described for the Cripto EGF-like and CFC domains, and found that there are no structural incompatibilities.
Discussion
We have used chemical and enzymatic fragmentation, mass spectrometry, and N-terminal sequence analysis to characterize the disulfide linkages of the cysteine residues in the CFC region of human Cripto. From these studies, we show that the six cysteines are linked in three disulfide bonds, C1-C4, C3-C5 and C2-C6. We performed these experiments on a truncated, recombinant version of human Cripto, containing residues 31–169 of wild-type human Cripto. We consider the results a valid representation of the wild-type structure because a seventh Cys residue, Cys181, is located in the predicted GPI signal sequence that would normally be cleaved off during processing of the wild-type protein [31,32]. Furthermore, it has been demonstrated that this soluble, C-terminally truncated recombinant human Cripto protein is biologically active [2]. Using both the primary sequence of the CFC region and the experimentally defined disulfide pattern to query protein sequence and disulfide databases, we obtained matches to a group of proteins containing a VWFC-like motif. The VWFC-like motif is believed to play an important role in the formation of certain protein complexes; examples include thrombospondin 1 (TSP1), which binds to CD36 on endothelial cells [29], and procollagen IIA and chordin, which bind to bone morphogenic protein [33]. The binding properties of these proteins have led to the hypothesis that proteins containing VWFC-like (Cys-rich) domains act as TGFβ sinks in modulating development [33]. Although most of the documented VWFC-like motifs contain 10 cysteine residues, there are several instances where such regions have fewer than 10, e.g. the C-terminal VWFC domain of NEL (chicken), NELL 1 (rat) and NELL 2 (human and rat), and the last VWFC region of murine tectorin, all of which contain only eight cysteine residues. In all of the examples of proteins containing shortened VWFC-like domains, the motif is abbreviated by loss of the C-terminal region, covering residues Cys9 and Cys10. These observations suggest that the CFC region in Cripto can be considered as a truncated form of the VWFC-like domain. Assuming that the CFC region of Cripto is VWFC-like, we infer that the EGF-CFC family of proteins is a variation of an already described theme for which there are many examples in modular proteins. Among them are several that have the juxtaposition of EGF and CFC domains seen in Cripto, for example, NELL 1, NELL 2, JAGGED 1 and JAGGED 2, in which at least one of the EGF-like domains is N-terminal to a VWFC-like domain (Table 1). For JAGGED 1 and JAGGED 2, the similarity extends to the position of the membrane attachment sequence, specifically, a transmembrane domain that is C-terminal to the VWFC-like domain, and we found that there was a striking degree of structural similarity between Cripto EGF-CFC and human JAGGED 2 (Fig. 5). As with Cripto, human JAGGED 2 is involved in signal transduction as a ligand for the NOTCH receptor, another EGF homolog [34]. Moreover, similar to Cripto, a major function of JAGGED 2 is in patterning and morphogenesis in early embryonic development [35,36]. Although JAGGED 2 is not fucosylated as Cripto is [2,12], the function of the NOTCH ligand is reportedly regulated by fucosylation of the NOTCH receptor [35]. The specific role of the individual domains of human JAGGED has not been delineated, but Serrate, the Drosophila version of JAGGED, has been investigated. Hukriede et al. [37] have shown that a truncated form of Serrate, lacking the VWFC region [38], binds to NOTCH but does not activate NOTCH signaling. The functions of the domains in Cripto are still being investigated, but initial information published previously by Yeo et al. [16] and described here indicates that the EGF and CFC domains have different functions. Yeo et al. showed that ALK4 was coimmunoprecipitated with the CFC domain of murine Cripto, but not with the CFC mutant (H120G/W123G) [16]. Here we have confirmed and expanded upon these findings using ALK4 and human Cripto, and have demonstrated that the CFC domain alone is sufficient for ALK4 binding. These experiments highlight the important role of the CFC domain, like other VWFC domains [29,33], in complex formation.
Recently, Minchiotti et al. [7] postulated a structural model of human Cripto based on the beta-trefoil fold of basic FGF. In this model, the EGF-like and CFC domains form the second and third lobes of the trefoil structure, respectively. We now believe this model to be incorrect because it cannot accommodate the actual disulfide connectivities in the CFC domain of Cripto described here. Using our experimentally determined disulfide pattern in the CFC domain to search a disulfide database compiled from SWISSPROT, we identified a structurally known homologue, chymotrypsin inhibitor PMP-C. Because of amino acid sequence similarities and disulfide linkage identity between the Cripto CFC domain and PMP-C, we built a model of the Cripto CFC domain using the NMR solution structure of PMP-C as a template (Fig. 7). Our model is consistent with data from previous functional studies [7,16,19], as well as from the current study, in particular, the observation that mutations in the CFC domain at His120 and Trp123 abolish ALK4 interactions (Fig. 2B). In our model (Fig. 7), the side-chains of His120 and Trp123 are solvent-exposed, allowing for possible protein–protein interaction. Interestingly, in our CFC model, we have identified a hydrophobic patch consisting of Trp134, Leu138, Phe141 and Pro142. Leu138 and homologues of Trp134 and Phe141 are conserved throughout the Cripto family [1] and are clustered on the side of the CFC domain opposite the presumed ALK4 binding site (which includes His120 and Trp123). This hydrophobic patch may be important for protein–protein interactions.
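The residue numbering used for the ALK4 site, the hydrophobic patch and the six cysteines can be cross-checked against the peptide sequences given in the Fig. 3 legend. The sketch below assumes that residues 113–154 are the concatenation of peptides a and c from that legend, with the motif "FPQA" occurring once so that the segment has 42 residues; the helper names are hypothetical. It says nothing about the 3-D arrangement, which comes from the model.

```python
# Residues 113-154 of human Cripto (assumption: assembled from peptides a
# and c of the Fig. 3 legend, with "FPQA" occurring once).
CFC = "ENCGSVPHDTWLPK" + "KCSLCKCWHGQLRCFPQAFLPGCDGLVM"
FIRST = 113


def residue(number):
    """One-letter code of the residue at a given Cripto residue number."""
    return CFC[number - FIRST]


def positions(letter):
    """Residue numbers at which a given amino acid occurs."""
    return [FIRST + i for i, aa in enumerate(CFC) if aa == letter]


cysteines = positions("C")
patch = {n: residue(n) for n in (134, 138, 141, 142, 145, 146)}
alk4_site = {n: residue(n) for n in (120, 123)}

print(cysteines)
print(patch)
print(alk4_site)
```

With this sequence the cysteines fall at 115, 128, 131, 133, 140 and 149, and the named patch and ALK4-site positions carry the residue types stated in the text.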
Two possible structural models for full-length Cripto protein, a linear (open) configuration (Fig. 7A) and a closed configuration (Fig. 7B), have been constructed by connecting an EGF-like module [28] and our CFC module (Fig. 7). However, at this point we do not have enough data to favor one model over the other. Both models fulfill the predictions for the structure of the EGF-like domain, namely, solvent exposure of the fucosylation site at Thr88 and the N-linked glycosylation site at Asn79, and both allow for potential protein–protein interactions via the above-described hydrophobic patch. Structure determination of human Cripto by NMR is in progress to address these questions.
In summary, the disulfide bond pattern for the six cysteine residues in the CFC domain of human Cripto has been experimentally defined as C1-C4, C2-C6, C3-C5, and biochemical studies have shown that the CFC domain binds to ALK4 independent of the EGF domain. Database searches based on the primary sequence have uncovered similarities between Cripto EGF-CFC domains and the EGF-VWFC domains of the C-terminal extracellular portions of Jagged 1 and Jagged 2. A 3-D structural model of the CFC domain was constructed based on the NMR structure of PMP-C, a serine protease inhibitor having the same disulfide connectivity. This model revealed a hydrophobic patch that is probably important for protein binding. Two possible models for intact Cripto have also been proposed. By exploring the structural features of Cripto, as defined by our models, we hope to increase the understanding of the role of Cripto in the Nodal signal transduction pathway.
Acknowledgements
We would like to thank Dr R. Blake Pepinsky for his review and editing of this manuscript. We would also like to thank Dr Joseph Rosa, Drs Kevin Williams, Alphonse Galdes, and Alex Buko for their valuable insights.
References
1 Colas, J.F & Schoenwolf, G.C (2000) Subtractive hybridization
identifies chick-cripto, a novel EGF-CFC ortholog expressed
during gastrulation, neurulation and early cardiogenesis Gene
255, 205–217.
2 Salomon, D.S., Bianco, C., Ebert, A.D., Khan, N.I., De Santis,
M., Normanno, N., Wechselberger, C., Seno, M., Williams, K.,
Sanicola, M., Foley, S., Gullick, W.J & Persico, G (2000) The
EGF-CFC family: novel epidermal growth factor-related proteins
in development and cancer Endocr Relat Cancer 7, 199–226.
3 Bamford, R.N., Roessler, E., Burdine, R.D., Saplakoglu, U., de la
Cruz, J., Splitt, M., Towbin, J., Bowers, P., Ferrero, G.B., Marino,
B., Schier, A.F., Shen, M.M., Muenke, M & Casey, B (2000)
Loss-of-function mutations in the EGF-CFC gene CFC1 are
associated with human left-right laterality defects Nat Genet 26,
365–369.
4 Dono, R., Scalera, L., Pacifico, F., Acampora, D., Persico, M.G.
& Simeone, A (1993) The murine cripto gene: expression during
mesoderm induction and early heart morphogenesis Development
118, 1157–1168.
5 Ding, J., Yang, L., Yan, Y.T., Chen, A., Desai, N.,
Wynshaw-Boris, A & Shen, M.M (1998) Cripto is required for correct
orientation of the anterior-posterior axis in the mouse embryo.
Nature 395, 702–707.
6 Gritsman, K., Zhang, J., Cheng, S., Heckscher, E., Talbot, W.S &
Schier, A.F (1999) The EGF-CFC protein one-eyed pinhead is
essential for nodal signaling Cell 97, 121–132.
7 Minchiotti, G., Manco, G., Parisi, S., Lago, C.T., Rosa, F &
Persico, M.G (2001) Structure-function analysis of the EGF-CFC
family member Cripto identifies residues essential for nodal signalling Development 128, 4501–4510.
8 Salomon, D.S., Bianco, C & De Santis, M (1999) Cripto: a novel epidermal growth factor (EGF)-related peptide in mammary gland development and neoplasia Bioessays 21, 61–70.
9 Schier, A.F., Neuhauss, S.C., Helde, K.A., Talbot, W.S & Driever, W (1997) The one-eyed pinhead gene functions in mesoderm and endoderm formation in zebrafish and interacts with no tail Development 124, 327–342.
10 Xu, C., Liguori, G., Persico, M.G & Adamson, E.D (1999) Abrogation of the Cripto gene in mouse leads to failure of post-gastrulation morphogenesis and lack of differentiation of cardiomyocytes Development 126, 483–494.
11 Zhang, J., Talbot, W.S & Schier, A.F (1998) Positional cloning identifies zebrafish one-eyed pinhead as a permissive EGF-related ligand required during gastrulation Cell 92, 241–251.
12 Schiffer, S.G., Foley, S.F., Kaffashan, A., Hronowski, X., Zichittella, A.E., Yeo, C.Y., Miatkowski, K., Adkins, H.B., Domon, B., Whitman, M., Salomon, D., Sanicola, M & Williams, K.P (2001) Fucosylation of Cripto is required for its ability to facilitate nodal signaling J Biol Chem 276, 37767–37777.
13 Ciccodicola, A., Dono, R., Obici, S., Simeone, A., Zollo, M & Persico, M.G (1989) Molecular characterization of a gene of the EGF family expressed in undifferentiated human NTERA2 teratocarcinoma cells EMBO J 8, 1987–1991.
14 Shen, M.M., Wang, H & Leder, P (1997) A differential display strategy identifies Cryptic, a novel EGF-related gene expressed in the axial and lateral mesoderm during mouse gastrulation Development 124, 429–442.
15 Reissmann, E., Jornvall, H., Blokzijl, A., Andersson, O., Chang, C., Minchiotti, G., Persico, M.G., Ibanez, C.F & Brivanlou, A.H (2001) The orphan receptor ALK7 and the Activin receptor ALK4 mediate signaling by Nodal proteins during vertebrate development Genes Dev 15, 2010–2022.
16 Yeo, C.Y & Whitman, M (2001) Nodal signals to Smads through Cripto-dependent and Cripto-independent mechanisms Mol Cell
7, 949–957.
17 Bianco, C., Adkins, H.B., Wechselberger, C., Seno, M., Normanno, N., De Luca, A., Sun, Y., Khan, N., Kenny, N., Ebert, A., Williams, K.P., Sanicola, M & Salomon, D (2002) Cripto-1 activates nodal- and ALK4-dependent and -independent signaling pathways in mammary epithelial cells Mol Cell Biol 22, 2586–2597.
18 Yan, Y., Liu, J., Luo, Y.E.C., Haltiwanger, R.S., Abate-Shen, C.
& Shen, M.M (2002) Dual roles of Cripto as a ligand and coreceptor in the nodal signaling pathway Mol Cell Biol 22, 4439–4449.
19 De la Cruz, J.M., Bamford, R.N., Burdine, R.D., Roessler, E., Barkovich, A.J., Donnai, D., Schier, A.F & Muenke, M (2002) A loss-of-function mutation in the CFC domain of TDGF1 is associated with human forebrain defects Hum Genet 110, 422–428.
20 Pepinsky, R.B (1991) Selective precipitation of proteins from guanidine hydrochloride-containing solutions with ethanol Anal Biochem 195, 177–181.
21 Altschul, S.F., Gish, W., Miller, W., Myers, E.W & Lipman, D.J (1990) Basic local alignment search tool J Mol Biol 215, 403–410.
22 Sali, A & Blundell, T.L (1993) Comparative protein modelling by satisfaction of spatial restraints J Mol Biol 234, 779–815.
23 Morrison, J.R., Fidge, N.H & Gergo, B (1990) Studies on the formation, separation, and characterization of cyanogen bromide fragments of human AI apolipoprotein Anal Biochem 186, 145–152.
24 Boulware, D.W., Goldsworthy, P.D., Nardella, F.A & Mannik,
M (1985) Cyanogen bromide cleaves Fc fragments of pooled
human IgG at both methionine and tryptophan residues Mol.
Immunol 22, 1317–1322.
25 Welinder, K.G (1988) Generation of peptides suitable for
sequence analysis by proteolytic cleavage in reversed-phase
high-performance liquid chromatography solvents Anal Biochem 174,
54–64.
26 Naismith, J.H & Sprang, S.R (1998) Modularity in the
TNF-receptor family Trends Biochem Sci 23, 74–79.
27 Mer, G., Hietter, H., Kellenberger, C., Renatus, M., Luu, B &
Lefevre, J.F (1996) Solution structure of PMP-C: a new fold in
the group of small serine proteinase inhibitors J Mol Biol.
258, 158–171.
28 Lohmeyer, M., Harrison, P.M., Kannan, S., DeSantis, M.,
O’Reilly, N.J., Sternberg, M.J., Salomon, D.S & Gullick, W.J.
(1997) Chemical synthesis, structural modeling, and biological
activity of the epidermal growth factor-like domain of human
cripto Biochemistry 36, 3837–3845.
29 Dawson, D.W., Pearce, S.F., Zhong, R., Silverstein, R.L., Frazier,
W.A & Bouck, N.P (1997) CD36 mediates the in vitro inhibitory
effects of thrombospondin-1 on endothelial cells J Cell Biol 138,
707–717.
30 Rao, Z., Handford, P., Mayhew, M., Knott, V., Brownlee, G.G &
Stuart, D (1995) The structure of a Ca(2+)-binding epidermal
growth factor-like domain: its role in protein–protein interactions.
Cell 82, 131–141.
31 Ferguson, M.A & Williams, A.F (1988) Cell-surface anchoring of proteins via glycosyl-phosphatidylinositol structures Annu Rev Biochem 57, 285–320.
32 Englund, P.T (1993) The structure and biosynthesis of glycosyl phosphatidylinositol protein anchors Annu Rev Biochem 62, 121–138.
33 Larrain, J., Bachiller, D., Lu, B., Agius, E., Piccolo, S & De Robertis, E.M (2000) BMP-binding modules in chordin: a model for signalling regulation in the extracellular space Development
127, 821–830.
34 Muskavitch, M.A (1994) Delta-notch signaling and Drosophila cell fate choice Dev Biol 166, 415–430.
35 Hicks, C., Johnston, S.H., diSibio, G., Collazo, A., Vogt, T.F & Weinmaster, G (2000) Fringe differentially modulates Jagged1 and Delta1 signalling through Notch1 and Notch2 Nat Cell Biol.
2, 515–520.
36 Lanford, P.J., Lan, Y., Jiang, R., Lindsell, C., Weinmaster, G., Gridley, T & Kelley, M.W (1999) Notch signalling pathway mediates hair cell development in mammalian cochlea Nat Genet.
21, 289–292.
37 Hukriede, N.A & Fleming, R.J (1997) Beaded of Goldschmidt,
an antimorphic allele of Serrate, encodes a protein lacking transmembrane and intracellular domains Genetics 145, 359–374.
38 Hukriede, N.A., Gu, Y & Fleming, R.J (1997) A dominant-negative form of Serrate acts as a general antagonist of Notch activation Development 124, 3427–3437.