
Wheat germ cell-free platform for eukaryotic protein production

Dmitriy A Vinarov, Carrie L Loushin Newman and John L Markley

Center for Eukaryotic Structural Genomics, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA

Introduction

One of the most important tasks in biotechnology today is the development of improved systems and strategies for synthesizing any desired protein or protein fragment in its folded, soluble form on a preparative scale. This task is fundamental to the success of structural genomics projects, which promise to capitalize upon numerous advances in science and technology to change the appreciation and understanding of biological systems. Structural genomics implies a move away from hypothesis-driven research to a system of solving structures first and using these structures and other structures modeled from them as the source of hypotheses for further research. The medical incentives for understanding protein structure are great. Many diseases are caused by defects in a single protein that alter its folding, stability, or activity. The structures of proteins involved in diseases will move us a step closer to improving disease treatment, diagnosis, and prevention. Beyond their specific medical applications, structural genomics projects are teaching fundamental lessons about the structural basis of life on this planet.

Keywords

cell-free extract; in vitro; isotopic labeling; NMR screening; NMR structure determination; protein production; protein structure; transcription; translation; wheat germ

Correspondence

J L Markley, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI 53706, USA

Fax: +1 608 262 3759

Tel: +1 608 263 9349

E-mail: markley@nmrfam.wisc.edu

Website: http://uwstructuralgenomics.org

(Received 2 May 2006, revised 13 July 2006, accepted 26 July 2006)

doi:10.1111/j.1742-4658.2006.05434.x

We describe a platform that utilizes wheat germ cell-free technology to produce protein samples for NMR structure determinations. In the first stage, cloned DNA molecules coding for proteins of interest are transcribed and translated on a small scale (25 µL) to determine levels of protein expression and solubility. The amount of protein produced (typically 2–10 µg) is sufficient to be visualized by polyacrylamide gel electrophoresis. The fraction of soluble protein is estimated by comparing gel scans of total protein and soluble protein. Targets that pass this first screen by exhibiting high protein production and solubility move to the second stage. In the second stage, the DNA is transcribed on a larger scale, and labeled proteins are produced by incorporation of [15N]-labeled amino acids in a 4 mL translation reaction that typically produces 1–3 mg of protein. The [15N]-labeled proteins are screened by 1H-15N correlated NMR spectroscopy to determine whether the protein is a good candidate for solution structure determination. Targets that pass this second screen are then translated in a medium containing amino acids doubly labeled with 15N and 13C. We describe the automation of these steps and their application to targets chosen from a variety of eukaryotic genomes: Arabidopsis thaliana, human, mouse, rat, and zebrafish. We present protein yields and costs and compare the wheat germ cell-free approach with alternative methods. Finally, we discuss remaining bottlenecks and approaches to their solution.

Abbreviations

CESG, Center for Eukaryotic Structural Genomics; GST, glutathione S-transferase; HSQC, heteronuclear single-quantum correlation spectroscopy; IMAC, immobilized metal affinity chromatography; PDB, Protein Data Bank; [U-15N]-, uniform labeling with nitrogen-15; SAIL, stereo-array isotope labeled; Se-Met, selenomethionine.


Protein production remains a bottleneck in proteomics, for both structural and functional studies. Most structural biology groups and structural genomics centers utilize cell-based, heterologous protein production from Escherichia coli. However, this approach fails with many individual proteins, particularly those from eukaryotes. Failures result from no or low expression, low solubility, or degradation. Expression levels can be improved by producing the protein of interest as a cleavable fusion with a highly expressing protein. Low solubility can result from failure of the protein to fold properly, aggregation of folded protein, or from unfavorable properties of the construct (intrinsic insolubility of the native sequence or insolubility introduced by a non-native sequence, such as a purification tag or other cloning artifact). As indicated in TargetDB, the target registration database for structural genomics (http://targetdb.pdb.org/), the proportion of targets that code for ‘unique proteins’ that yield soluble protein is only about one-third for prokaryotic proteins and much lower for eukaryotic proteins. In this context, a unique protein is defined as one with a peptide sequence exhibiting ≤ 30% sequence identity to the sequence of any protein with a three-dimensional structure deposited in the Protein Data Bank. Solubility can be improved greatly by producing the protein of interest as a cleavable fusion with a highly soluble protein. This strategy may enable the protein to fold properly without aggregation so that it stays in solution following cleavage. Many eukaryotic proteins are ‘natively disordered’, that is, they do not adopt a single, stable, folded structure. Some natively disordered proteins require an additional factor for folding: a metal ion, a small molecule cofactor, another peptide chain, or an oligonucleotide. Other proteins may require extensive post-translational modification to achieve their native folded state. Platforms for structural investigations must support the production of proteins on the scale of 2–10 mg. For efficient structure determination by NMR spectroscopy, the proteins must be labeled with stable isotopes (15N or 13C+15N, or for larger proteins 2H+13C+15N). For X-ray crystallography, proteins normally are labeled with selenomethionine (Se-Met) to support multiwavelength anomalous dispersion data collection for phase determination. Because protein production and labeling on this scale is expensive, it is important to screen targets first on a smaller scale to identify which constructs are expressed, soluble without aggregation, folded, and stable under the conditions used for NMR structure determinations or crystallization trials.

In vitro cell-free methods for protein synthesis with extracts from prokaryotic [1] or eukaryotic [2] cells offer an alternative to the E. coli cell-based platforms. Cell-free approaches have a number of potential advantages over other alternatives to heterologous expression in E. coli cells. Stable isotope or Se-Met labeling is easier with cell-free systems than with yeast, mammalian, or insect cell systems [3–5]. Cell-free systems may permit successful production of proteins that undergo proteolysis [6,7] or accumulate in inclusion bodies [8] in cells. Cell-free systems support selective labeling strategies [9–12] that cannot be achieved in bacterial whole cell systems. An important emerging approach is the incorporation of stereo-array isotope labeled (SAIL) amino acids [13], chemically synthesized amino acids with stereo-specifically arrayed stable isotope (2H and 13C) labeling patterns that are optimal for NMR spectroscopy. SAIL amino acids are being commercialized by a start-up company in Japan (Sail Technologies, Inc., Yokohama, Japan) and when available will raise the threshold for high-throughput NMR structure determinations from 20 kDa to 40 kDa and above [13]. The SAIL amino acids must be incorporated into proteins by in vitro synthesis so as not to disturb the labeling pattern.

Cell-free systems have been used for the production of various kinds of proteins, including membrane proteins [14] and proteins that are toxic to cells [8,15]. It is possible to collect NMR spectra of [15N]-labeled proteins prior to isolation from the cell-free protein synthesis mixture [16,17]. One of the features of cell-free protein production is that only the protein of interest is labeled, so that contaminating proteins do not show up in normal multinuclear NMR spectra. Cell-free protein production protocols are streamlined compared to cell-based protocols, in that they do not require cell harvesting or cell lysis. Protein purification is usually simpler, because the protein of interest starts out more concentrated and is isolated from a smaller set of contaminants.

The RIKEN Structural Genomics Center in collaboration with Roche has pioneered the use of cell-free protein production through a coupled transcription-translation system employing E. coli extracts [18–22]. It has been found, however, that most of the proteins that produce well in E. coli cell-free systems are the same ones that are produced successfully from E. coli cells [10]. Thus, despite other potential advantages, the E. coli cell-free approach may not greatly expand the range of proteins that can be produced in a soluble, folded state, although it may be possible to overcome this limitation by redesigning the gene sequence (see below), by adding chaperones or other factors [22,24], or by reengineering ribosomal proteins [25].


One of the first in vitro translation systems to be investigated was prepared from wheat germ extracts, but yields from this eukaryotic extract were low [2]. Y. Endo and his group at Ehime University (Matsuyama, Japan) achieved a breakthrough in this technology by finding that an inhibitor of ribosomal protein synthesis, tritin, is associated with the coat of the wheat embryo [26]. They developed a process for removing this contaminating inhibitor and patented this process along with methods for utilizing the improved wheat germ extract [26–31]. Endo founded a company, CellFree Sciences Co., Ltd (Yokohama, Japan), to commercialize the technology. We found this approach to be promising and formed a cooperative undertaking with Ehime University and the CellFree Sciences Co., Ltd with the goal of investigating the potential of wheat germ cell-free protein production as an enabling technology in our structural genomics project, the Center for Eukaryotic Structural Genomics (CESG; Madison, WI). As discussed here, we have found this technology to be robust, and our wheat germ cell-free pipeline now supports high-throughput screening for protein production and solubility and provides stable isotope labeled protein samples for the majority of the NMR structures determined at CESG [32,33].

CESG’s wheat germ cell-free platform

Our detailed protocol for wheat germ cell-free protein production is available elsewhere [34]. In short, the approach consists of four steps (Fig. 1A): (1) creation of a plasmid used for in vitro transcription, (2) small scale (25–50 µL) screening to assay the level of protein production and solubility, (3) larger scale (4–12 mL) production of [U-15N]-protein used to evaluate whether solution conditions can be found that render the target suitable for NMR structure determination (soluble, monodisperse, folded, and stable), and (4) production of sufficient [U-13C,15N]-protein for multidimensional, multinuclear magnetic resonance data collection. We purchase the wheat germ extract from CellFree Sciences, Inc., the RNA polymerase from Promega (Madison, WI), and the labeled amino acids from Cambridge Isotope Laboratories (Andover, MA). Details about these and other reagents and supplies are found in our publications [32–34].

The purification workflow diagram is shown in Fig. 1B. In step (1), a defined series of cloning procedures is used to create a DNA plasmid containing the target gene and 5′ and 3′ extensions that promote efficient transcription and translation. In step (2), small scale protein expression and purification trials are carried out, generally in a 96 well format. Successful candidates from these screens (those estimated to yield > 0.5 mg·mL⁻¹ target protein with solubility > 75%) are then selected for larger scale protein production with incorporation of [15N]-labeled amino acids. Purified [U-15N]-protein samples produced in step (3) are then assayed by 1H-15N correlation spectroscopy (1H-15N HSQC) for their suitability as structural candidates (they must be folded, monodisperse, and stable at room temperature for at least 14 days). The solution conditions can be refined as part of this step. Targets that pass these tests are then prepared as [U-13C,15N]-protein samples, step (4).
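The selection rule applied after the small scale screen can be written down compactly. The following is a minimal sketch, not CESG's software: the record layout, field names, and function names are hypothetical, and only the two thresholds (estimated yield above 0.5 mg·mL⁻¹ and solubility above 75%) come from the text. Solubility is taken here as the ratio of the soluble-fraction band to the total-protein band from the gel scans.

```python
from dataclasses import dataclass

# Thresholds quoted in the text for the small scale screen.
MIN_YIELD_MG_PER_ML = 0.5    # estimated yield must exceed this value
MIN_SOLUBLE_FRACTION = 0.75  # soluble / total must exceed this value


@dataclass
class ScreenResult:
    """Hypothetical record for one construct in a 96 well screen."""
    target_id: str
    est_yield_mg_per_ml: float  # yield estimated from the gel scan
    total_band: float           # band intensity, total protein lane
    soluble_band: float         # band intensity, soluble fraction lane

    @property
    def soluble_fraction(self) -> float:
        # Guard against an empty lane (no detectable expression).
        if self.total_band <= 0:
            return 0.0
        return self.soluble_band / self.total_band


def passes_small_scale_screen(result: ScreenResult) -> bool:
    """Apply both thresholds; a construct must clear each of them."""
    return (result.est_yield_mg_per_ml > MIN_YIELD_MG_PER_ML
            and result.soluble_fraction > MIN_SOLUBLE_FRACTION)


def select_for_labeling(results):
    """Return the IDs of constructs that move on to [15N] production."""
    return [r.target_id for r in results if passes_small_scale_screen(r)]
```

Both criteria are strict inequalities, so a construct sitting exactly at a threshold would not be advanced in this sketch; in practice the gel-based estimates are approximate and borderline cases would presumably be judged by eye.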

[Figure 1, two-panel diagram. Panel A labels: Target selection; 1 Cloning (PCR from cDNA, ligation cloning; DNA plasmid prep); 2 Small scale (transcription, translation; screening at 50 µL scale; analysis of expression level, solubility, and tag cleavage); successful targets; 3 Production and analysis of [15N]-protein (DNA plasmid preps, transcription; translation on [15N]-amino acids, 4 mL reaction; isolation, purification, and tag removal; HSQC NMR analysis; solubility, stability, and MS analysis); 4 Production of [13C,15N]-protein (as above but with double labeling, 4–12 mL reaction); production for structural analysis (4–12 mL scale); structure determination by NMR. Panel B labels: Cell-free reaction (4–12 mL); protein product with cleavable N-GST tag (1st GSTrap column, concentrate, PreScission protease cleavage, 2nd GSTrap column); protein product with non-cleavable N-(His)6 tag (Ni-HiTrap chelating column, concentrate); Superdex75 in NMR buffer; concentration; NMR sample.]

Fig. 1. (A) Workflow diagram showing how the wheat germ cell-free platform is used to screen constructs for the expression of soluble protein, to produce [15N]-labeled protein for NMR screening for suitability as a structural candidate, and for the production of double-labeled [13C,15N]-protein for structure determination. (B) Schematic illustration of the steps involved in isolating and purifying proteins produced by the wheat germ cell-free platform depending on the type of tag: non-cleavable N-(His)6 tag or cleavable N-GST tag.


We have tested the wheat germ cell-free platform in the context of NMR-based structural genomics of eukaryotic proteins and have compared it with our parallel E. coli cell-based platform. Our experience is summarized briefly as follows. (a) Targets can be screened more quickly and more economically for protein expression and solubility by the cell-free approach than by the cell-based approach. The efficiency of this process is important, because we need to screen many targets or multiple constructs of a given target in order to find one that produces a protein that is soluble and well folded. As an example of multiple screening of a given target, we have screened targets with a noncleavable His6 tag, with a cleavable His6 tag, and with a cleavable glutathione S-transferase (GST) tag and have shown complementary success with these [35]. (b) Because of the smaller volumes involved, the isolation and purification of 1–5 mg quantities of labeled protein for NMR structural studies is faster and less labor intensive with proteins prepared by the cell-free approach than the cell-based approach. (c) Proteins produced with the wheat germ extract from CellFree Sciences and labeled amino acids generally show high levels of enrichment by mass spectrometry: > 95% 15N/(14N+15N) or 13C/(12C+13C). These high levels are excellent for NMR spectroscopy. (d) The cell-free system supports the production of proteins with a variety of labeling patterns: uniform labeling with 2H, 13C, and 15N, selective labeling by residue type, and SAIL (discussed above).

We recently carried out a detailed comparison of the wheat germ cell-free and E. coli cell-based approaches to protein production for NMR structure determination [35]. In this study 96 randomly chosen Arabidopsis thaliana targets were carried through CESG’s wheat germ cell-free and E. coli cell pipelines. If possible, [15N]-labeled versions of each protein were produced for analysis by 1H-15N correlation NMR spectroscopy. Of the 96 targets started with, only eight from the cell-free pipeline and five from the cell-based pipeline were found suitable for NMR structural analysis on the basis of the NMR results. In this comparison, the five targets that proved successful by the E. coli cell-based approach also were successful by the cell-free approach.

Our wheat germ cell-free approach appears to have advantages over published in vitro protein production protocols that utilize E. coli S30 extract. (a) Cell-free protocols utilizing E. coli extract usually call for the testing of multiple plasmids with sequence differences outside the protein coding region to determine one that produces protein in high yield [36]. By contrast, with the wheat germ cell-free protocol we have found no advantage of modifying the plasmid sequence outside the coding region, and hence utilize a single plasmid construct for all targets. (b) Protocols for E. coli S30 cell-free synthesis typically employ additives, such as polyethylene glycol, to improve protein yields [10]. These additives need to be removed prior to NMR structural studies. No such additives are required with the wheat germ cell-free approach, probably because the wheat germ extract contains chaperones and other factors that contribute to higher protein yields. (c) To achieve a high level of label incorporation from E. coli S30 extract it may be necessary to take pains to remove endogenous unlabeled amino acids [10]. (d) Proteins prepared from E. coli S30 extract may be heterogeneous as the result of incomplete cleavage of the N-terminal methionine. This heterogeneity can lead to doubling of NMR peaks [10]. An effective solution is to make all proteins with a cleavable N-terminal sequence. This complication does not occur with proteins produced in vitro from wheat germ extract. (e) Wheat germ extracts contain chaperones, and do not require the addition of chaperones as sometimes needed for high yields from E. coli S30 extract [37,38]. A comparison of protein production from wheat germ extract and E. coli S30 extract [39] demonstrated that a significantly higher proportion of multiple domain eukaryotic proteins were soluble when translated by wheat germ extract.

Automation

All of the cell-free operations can be carried out by hand, and this is how we started using the technology. Because of the small volume requirements for screening (25–50 µL) and protein production for structural studies (4–12 mL), cell-free methods have proved amenable to automation. CESG makes use of a CellFree Sciences GeneDecoder1000™ robotic system (Fig. 2) in automating the small scale screening of constructs for protein production and solubility. This unit makes it possible to carry out as many as 1052 small scale (25 µL) screening reactions per week. CESG has two prototype robotic units developed by CellFree Sciences for larger scale protein production (Fig. 2). The Protemist10™ robotic system requires preparation of the mRNA off-line, whereas the newer Protemist100™ starts with DNA and produces the mRNA transcript prior to the translation step. Each of these systems supports 24 transcription and translation reactions (4 mL each) per week. Typical yields for the Protemist runs are 0.3–0.5 mg purified protein per mL reaction mixture. These robotic systems handle the many steps that are tedious to carry out by hand, and work through the night. They have greatly reduced the manpower requirements of cell-free screening and protein production.
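The throughput figures quoted above imply a rough weekly output for one Protemist unit. The sketch below is simple arithmetic under the stated numbers (24 reactions per week, 4 mL per reaction, 0.3–0.5 mg purified protein per mL); the function name and its defaults are illustrative assumptions, and the result is an estimate that presumes every reaction gives a typical yield.

```python
def weekly_protein_mg(reactions_per_week=24,
                      volume_ml=4.0,
                      yield_mg_per_ml=(0.3, 0.5)):
    """Estimate the (low, high) mg of purified protein per unit per week.

    Defaults are the figures quoted in the text; the function itself is
    a hypothetical helper, not part of any instrument software.
    """
    total_volume_ml = reactions_per_week * volume_ml
    low, high = yield_mg_per_ml
    return total_volume_ml * low, total_volume_ml * high


low_mg, high_mg = weekly_protein_mg()
print(f"Estimated weekly output: {low_mg:.1f} to {high_mg:.1f} mg")
```

With the default values this gives roughly 29 to 48 mg per week per unit, spread over 24 separate samples, which is equivalent to about 1.2 to 2 mg per 4 mL reaction.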

Success rates with eukaryotic targets

The centers involved in the NIH Protein Structure Initiative (USA) are generating information about success rates in going from a selected target gene to a completed and deposited three-dimensional protein structure. The overall success rates still tend to be quite low, in the range of 2% to 20%, depending on the center and the types of targets selected. It is clear from all centers that the yields of structures for eukaryotic targets are much lower than for prokaryotic targets. In the interest of efficiency and cost savings, it is important to analyze where failures occur and to devise strategies to minimize these. The most effective routes for improvement involve a combination of bioinformatics and small scale screening. Bioinformatics relies on prior information and mathematical models for correlating success rates with gene sequences. Small scale screening offers the most economical way of testing whether a cloned and sequenced target will proceed through the critical stages leading to a structure. The initial screening step determines the level of gene expression and the solubility of the product. As described above, CESG’s wheat germ cell-free platform supports rapid and economical small scale screening for expression and solubility. We currently test constructs with and without an N-terminal tag and have shown success in rescuing failed targets by truncating the N- and/or C-termini. The second screening operation relevant to NMR structure determinations is the screening of the [15N]-labeled protein target by 1H-15N HSQC spectroscopy. This test, which is repeated after one week to determine if the protein is stable in solution, is highly diagnostic for the success of an NMR structure determination. Proteins that pass this test are then produced with [15N+13C]-labeling.

Fig. 2. Fully automated protein synthesizers from CellFree Sciences. (Left) GeneDecoder1000™, which operates in two small scale modes. In the screening mode, it handles up to four 96 well plates per overnight run, produces 2–10 µg protein per well, and uses 1.0–5.0 mL wheat germ extract per plate. In the small scale protein production mode it can handle up to two 96 well plates per overnight run, produces between 10 and 50 µg protein per well, and uses 5.0–10.0 mL wheat germ extract per plate. (Center) Protemist10™ robotic system, which is capable of carrying out 24 translation reactions (4 mL each) per week. The unit produces 1–3 mg protein per reaction and utilizes 3 mL wheat germ extract per reaction. This system requires off-line preparation of the mRNA. (Right) Protemist100™ robotic system, which supports 24 transcription and translation reactions (4 mL each) per week. Its capabilities are similar to those of the Protemist10, but it has the added feature of automated production of mRNA. These robotic systems carry out a variety of operations including solvent extraction, high level multichannel liquid handling, centrifugation, and incubation at various temperatures. An onboard microprocessor interfaced with the computer connected to the database keeps detailed log files that contain information about temperatures, volumes, and operational performance at every step.

We have accumulated experience in using the cell-free platform to produce proteins from several eukaryotic genomes. These include over 722 different structural genomics targets from human, mouse, and Arabidopsis (Table 1). Most of the targets selected for testing have coded for proteins less than 25 kDa, because this is the size limit for high-throughput structure determinations by NMR spectroscopy. In addition, we have carried out small scale wheat germ cell-free screening of approximately 150 larger proteins (25–70 kDa), and the success rates for expressing soluble proteins appear to be comparable to our earlier results with smaller targets presented in Table 1.

We define ‘highly soluble’ as ≥ 75% of the total protein being present in the soluble fraction. Of the same proteins produced with N-terminal GST tags and N-terminal (His)6 tags, 9% more were highly soluble with the GST tag. Only 5% of proteins soluble as GST fusions became insoluble following cleavage and removal of the GST tag. Thus the results show that proteins fused to GST can be more highly soluble and that the advantage may persist after the tag is removed (presumably through improved folding of the purified fusion protein prior to cleavage).

We have gathered statistics specific to human proteins. Of 174 human targets (most with unknown function) that were successfully cloned, 135 (78%) showed expression at levels suitable for structural investigations. Of these expressed proteins, 55 (41%) were soluble at levels needed for NMR spectroscopy. Of these, 36 (66%) gave [15N]-labeled samples at levels that could be evaluated by NMR spectroscopy. To date, nine of these human proteins yielded NMR structures. In total, CESG has determined NMR structures of 18 eukaryotic proteins produced by this methodology (Fig. 3). The average yield of purified, labeled, human proteins made for NMR structural studies has been 0.3 mg·mL⁻¹ reaction mixture.
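The human-target statistics form an attrition funnel in which each percentage is taken relative to the preceding stage. A short sketch of that calculation follows, using only the counts quoted in the paragraph; the stage labels and function names are illustrative choices rather than anything defined by CESG.

```python
# Counts for human targets as quoted in the text, in pipeline order.
HUMAN_FUNNEL = [
    ("cloned", 174),
    ("expressed", 135),
    ("soluble", 55),
    ("[15N] sample evaluated by NMR", 36),
    ("NMR structure", 9),
]


def stepwise_rates(funnel):
    """Success rate of each stage relative to the stage before it."""
    rates = []
    for (_, previous), (name, count) in zip(funnel, funnel[1:]):
        rates.append((name, count / previous))
    return rates


def cumulative_rate(funnel):
    """Fraction of the starting targets that reach the final stage."""
    return funnel[-1][1] / funnel[0][1]


for stage, rate in stepwise_rates(HUMAN_FUNNEL):
    print(f"{stage}: {rate:.1%} of previous stage")
print(f"overall: {cumulative_rate(HUMAN_FUNNEL):.1%} of cloned targets")
```

Reporting stepwise rather than cumulative rates makes it easier to see which stage loses the most targets; here the largest single drop is at the solubility step, and the cumulative yield of structures from cloned targets is about 5%.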

Costs

Labor savings, coupled with the high level of incorporation of labeled amino acids and the high yield of folded protein samples, make the overall cost of the wheat germ cell-free method comparable to that of the E. coli cell-based approach for NMR structure determinations of eukaryotic proteins. One of the main advantages of the automated wheat germ cell-free protein expression system is that the overall process requires much less time and effort compared to our current cell-based methods. Not including the cloning steps, it generally takes 48 h (using the GeneDecoder1000™), or 72 h (manually), to screen 96 targets for expression and solubility on the small scale. The purification protocols also require less time and effort than cell-based protocols because of the smaller volumes (4–12 mL versus 500–1000 mL) and higher initial purity. Using the latest General Electric Healthcare (Piscataway, NJ) HIS-TRAP purification technology, immobilized metal affinity chromatography (IMAC) purification of His tagged proteins requires 40 min of processing time and results in protein samples that are 75–85% pure. Gel filtration adds an additional 3 h and can increase the purity to > 95% for proteins < 15 kDa and to 90% for proteins < 20 kDa. GST purification results in > 95% purity regardless of size; however, the minimal time to process the sample is greater than 10 h.

Because stable isotope labeled amino acids required for NMR structure determinations are expensive, it is important that the protein yield per quantity of amino acid supplied be high. With cell-free systems (E. coli or wheat germ) 10% of the labeled amino acid mixture supplied is incorporated into the protein produced and purified.

Table 1. Statistics on eukaryotic proteins produced by CESG’s wheat germ cell-free platform. Small scale (µg): automated 96 well format production overnight. Large scale (mg): automated 8 × 4 mL production overnight.

Stage                                    Arabidopsis    Total
Targets selected                         381            722
Targets cloned successfully              351 (92%)      654 (91%)
Targets showing acceptable expression    269 (77%)      451 (69%)
Targets showing adequate solubility      120 (45%)      189 (42%)
[15N]-labeled proteins produced          76 (63%)       123 (65%)
Acceptable [15N-1H]-HSQC spectrum        17 (22%)       34 (28%)
Protein stable for > 10 days             9 (53%)        20 (59%)
[13C,15N]-labeled protein made (a)       9 (100%)       20 (100%)
3D structures by NMR                     8 (89%)        18 (90%)

(a) Average yield of purified double-labeled proteins used in structural investigations was 0.3 mg·mL⁻¹ reaction mixture. (b) Percentages represent the number of successful targets at a given step divided by the number coming from the previous step: (174/191) = 91% in the case indicated.

[Figure 3, panel labels:
(1) Hs.78877, 11 kDa, PDB: 2G2B
(2) At5g39720.1, 19 kDa, PDB: 2G0Q
(3) At5g66040.1, 14 kDa, PDB: 1TQ1
(4) Dr.13312, 12 kDa, PDB: 2FB7
(5) At3g01050.1, 13 kDa, PDB: 1SE9
(6) At2g24940.1, 11 kDa, PDB: 1T0G
(7) At3g51030.1, 14 kDa, PDB: 1XFL
(8) Hs.102419, 13 kDa, PDB: 1ZR9
(9) Hs.157607, 14 kDa, PDB: 2ETT
(10) At2g23090.1, 9 kDa, PDB: 1WVK
(11) P62627 dimer, 22 kDa, PDB: 1Y4O
(12) At2g46140.1, 19 kDa, PDB: 1YYC]

Fig. 3. Examples of three-dimensional solution structures of eukaryotic proteins determined by NMR spectroscopy from labeled samples produced by the wheat germ cell-free platform described here. All structures have been deposited in the Protein Data Bank under the accession codes indicated. The molecular masses of the proteins are indicated; these proteins are relatively small because they were chosen as targets for high-throughput NMR structure determination, which currently has a practical size limit of 25 kDa. (1) Hs.78877 is human allograft inflammatory factor 1. (2) At5g39720.1 is a protein of unknown function from A. thaliana. (3) At5g66040.1 is a single domain sulfurtransferase and is annotated as a senescence-associated protein (sen1-like protein) and ketoconazole resistance protein. (4) Dr.13312 is a protein of unknown function from zebrafish. (5) At3g01050.1 from A. thaliana has a ubiquitin-like fold, and may be prenylated at a putative C-terminal CAAX box motif so as to target the protein and its binding partners to a membrane compartment of the cell [32]. (6) At2g24940.1 from A. thaliana gave a structure with a cytochrome b5-like fold but with some resemblance to steroid binding proteins [42]; a subsequent NMR study showed that the protein binds progesterone. This protein failed to express in the E. coli cell-based pipeline. (7) At3g51030.1 is an h1 thioredoxin from A. thaliana [43]. This protein was also produced from E. coli cells; it gave an acceptable HSQC spectrum but failed to crystallize. (8) Hs.102419 is a human C2H2-type zinc finger protein. (9) Hs.157607 is a human sorting nexin 22 PX domain. (10) At2g23090.1 is an unknown, partially disordered protein from A. thaliana. (11) P62627 from mouse is isoform 1 of Roadblock/LC7, a light chain in the dynein complex [44]. (12) At2g46140.1 from A. thaliana is a late embryogenesis abundant (LEA) protein of a type expressed under conditions of cellular stress, such as desiccation, cold, osmotic stress, and heat [45].

Although the cell-free approach is much less labor intensive in comparison to our E. coli cell-based pipeline, it requires more expensive reagents and supplies. Current limitations of the method stem from the restricted availability and high cost of highly active wheat germ extract. These problems should ease as the wheat germ cell-free approach becomes more widespread and as increasing demands for cell-free extract stimulate improvements in production technology. The costs of stable isotope labeled amino acids also may be expected to decrease as demand accelerates. Average supplies costs currently are: US$47 per target for cloning and expression solubility testing (with unpurified reaction mixture assayed by SDS/PAGE), US$370 per mg for Se-Met protein, US$390 per mg for [15N]-protein, and US$470 per mg for [13C,15N]-protein (with proteins isolated and purified).
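These per-target and per-milligram figures can be combined into a rough supplies budget. The sketch below is illustrative only: the unit costs are the averages quoted above, while the function names and the example campaign (96 targets screened, one 2 mg [15N] sample, one 3 mg double-labeled sample) are hypothetical. Labor and instrument costs are not included.

```python
# Average supplies costs quoted in the text (US$).
SCREEN_COST_PER_TARGET = 47  # cloning plus expression/solubility testing
COST_PER_MG = {
    "Se-Met": 370,
    "15N": 390,
    "13C,15N": 470,
}


def production_cost(label, mg):
    """Supplies cost of producing `mg` of purified protein with `label`."""
    if mg < 0:
        raise ValueError("protein quantity must be non-negative")
    return COST_PER_MG[label] * mg


def campaign_cost(n_screened, samples):
    """Screening cost for all targets plus production cost of each sample.

    `samples` is an iterable of (label, mg) pairs.
    """
    screening = n_screened * SCREEN_COST_PER_TARGET
    production = sum(production_cost(label, mg) for label, mg in samples)
    return screening + production


# Hypothetical campaign: one 96 well screen, then two labeled samples.
example = campaign_cost(96, [("15N", 2.0), ("13C,15N", 3.0)])
print(f"Estimated supplies cost: US${example:,.0f}")
```

In this example the screening of 96 targets (US$4512) costs about twice as much as the two labeled samples together (US$2190), which illustrates why the fraction of screened targets that survive to the labeling stage has a large effect on the cost per structure.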

The major advantages of the wheat germ cell-free method over the E. coli cell-based pipeline are that it supports the production of a larger fraction of targets as folded, soluble protein and that it is much faster to prepare additional samples or truncated samples as needed for successful structure determinations. The E. coli approach has a cost advantage when its protein yields are much higher than those of the cell-free approach. The overall costs of each approach appear to be similar for NMR structure determinations.

Prospects

Because of the complementarity of cell-free and cell-based methods, we envision that it will be most efficient to screen each new target by both methods. Initially, we did not have an easy way to screen targets by the two approaches, because the cell-based pipeline was using ligation-independent cloning technology, whereas the cell-free pipeline used ligation cloning into the pEU vector. To remedy this, we recently implemented a cloning strategy that enables efficient small scale screening by cell-free and cell-based methods [40]; this approach utilizes Promega’s Flexi Vector technology to transfer the target gene from one plasmid to another. By comparing the small scale screening results from the two platforms, we can now choose the one more likely to be successful. If the cell-based approach is selected for an NMR target, we make use of a self-induction medium developed for producing [15N] or [13C+15N]-labeled protein from E. coli cells [41].

The largest remaining bottlenecks associated with the wheat germ cell-free protocol are the limited solubility, aggregation, or limited stability exhibited by many targets. Improvements in any of these areas would greatly lower the costs of structure determinations. Our ongoing research is aimed at investigating reasons for failures of these types and at developing approaches for rescuing failed targets. Some structural genomics centers start multiple constructs for each target selected (different N- and C-termini, different fusions, or different vectors and hosts) and choose the one that yields the most soluble protein. We have initiated a pilot study aimed at determining whether the initial production of constructs with multiple N- and C-termini for small-scale screening would be more efficient than our current approach of redesigning failed constructs.
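The multi-construct idea mentioned above (several N- and C-termini per target) can be illustrated with a short sketch. This is not the pilot study's design procedure: the helper name `enumerate_constructs`, the trim offsets, and the minimum length are hypothetical placeholders, and real construct boundaries would normally be chosen from domain or disorder predictions rather than fixed offsets.

```python
from itertools import product


def enumerate_constructs(sequence, n_trims=(0, 5, 10), c_trims=(0, 5, 10),
                         min_length=50):
    """List candidate constructs with varied N- and C-termini.

    Each construct is returned as (first_residue, last_residue, subsequence),
    using 1-based inclusive residue numbering. Trim values and min_length
    are illustrative assumptions, not values from the text.
    """
    constructs = []
    for n_trim, c_trim in product(n_trims, c_trims):
        start = n_trim
        end = len(sequence) - c_trim
        if end - start < min_length:
            continue  # skip constructs that would be too short
        constructs.append((start + 1, end, sequence[start:end]))
    return constructs


if __name__ == "__main__":
    # Placeholder 100-residue sequence, not a real target.
    target = "M" + "A" * 99
    for first, last, _ in enumerate_constructs(target):
        print(f"residues {first}-{last}")
```

Each candidate would then go through the same small-scale expression and solubility screen, and the construct giving the most soluble protein would be carried forward.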

Currently, CESG's X-ray structure pipeline requires on the order of 10 mg of Se-Met protein for each target. We anticipate that as reliable small-scale crystallization screening methods become available, the wheat germ cell-free method could become part of the X-ray crystallography pipeline. We have already determined by mass spectrometry that the wheat germ cell-free approach supports high-level incorporation of Se-Met, and we have made small quantities of Se-Met-labeled proteins for use in chip (Fluidigm, South San Francisco, CA) crystallization screening.

Acknowledgements

We gratefully acknowledge the work of all CESG staff members and collaborators and fruitful interactions with Professor Y. Endo and his group at Ehime University, Matsuyama, Japan, and staff members of CellFree Sciences Co., Ltd (Yokohama, Japan) in adapting their technology to research and production environments. Supported by NIH grants 1U54 GM074901 (which supports CESG) and P41 RR02301 (which supports the National Magnetic Resonance Facility at Madison, where NMR spectroscopy was carried out).

References

1 Kramer G, Kudlicki W, Hardesty B, Higgins SJ & Hames BD (1999) Cell-free coupled transcription-translation systems from Escherichia coli. In Protein Expression: A Practical Approach (Higgins SJ & Hames BD, eds), pp. 201–223. Oxford University Press, Oxford, UK.

2 Clemens MM, Prujin GJ, Higgins SJ & Hames BD (1999) Protein synthesis in eukaryotic cell-free systems. In Protein Expression: A Practical Approach (Higgins SJ & Hames BD, eds), pp. 129–165. Oxford University Press, Oxford, UK.

3 Cubeddu L, Moss CX, Swarbrick JD, Gooley AA, Williams KL, Curmi PM, Slade MB & Mabbutt BC (2000) Dictyostelium discoideum as expression host: isotopic labeling of a recombinant glycoprotein for NMR studies. Protein Expr Purif 19, 335–342.

4 Strauss A, Bitsch F, Cutting B, Fendrich G, Graff P, Liebetanz J, Zurini M & Jahnke W (2003) Amino-acid-type selective isotope labeling of proteins expressed in baculovirus-infected insect cells useful for NMR studies. J Biomol NMR 26, 367–372.

5 Bruggert M, Rehm T, Shanker S, Georgescu J & Holak TA (2003) A novel medium for expression of proteins selectively labeled with 15N-amino acids in Spodoptera frugiperda (Sf9) insect cells. J Biomol NMR 25, 335–348.

6 Goff SA & Goldberg AL (1987) An Increased Content of Protease LA, the Lon Gene-Product, Increases Protein-Degradation and Blocks Growth in Escherichia coli. J Biol Chem 262, 4508–4515.

7 Maurizi MR (1987) Degradation in vitro of bacteriophage lambda N protein by Lon protease from Escherichia coli. J Biol Chem 262, 2696–2703.

8 Chrunyk BA, Evans J, Lillquist J, Young P & Wetzel R (1993) Inclusion-Body Formation and Protein Stability in Sequence Variants of Interleukin-1-Beta. J Biol Chem 268, 18053–18061.

9 Shi J, Pelton JG, Cho HS & Wemmer DE (2004) Protein signal assignments using specific labeling and cell-free synthesis. J Biomol NMR 28, 235–247.

10 Torizawa T, Shimizu M, Taoka M, Miyano H & Kainosho M (2004) Efficient production of isotopically labeled proteins by cell-free synthesis: a practical protocol. J Biomol NMR 30, 311–325.

11 Yabuki T, Kigawa T, Dohmae N, Takio K, Terada T, Ito Y, Laue ED, Cooper JA, Kainosho M & Yokoyama S (1998) Dual Amino Acid-Selective and Site-Directed Stable-Isotope Labeling of the Human c-Ha-Ras Protein by Cell-Free Synthesis. J Biomol NMR 11, 295–306.

12 Kigawa T, Muto Y & Yokoyama S (1995) Cell-Free Synthesis and Amino Acid-Selective Stable Isotope Labeling of Proteins for NMR Analysis. J Biomol NMR 6, 129–134.

13 Kainosho M, Torizawa T, Iwashita Y, Terauchi T, Mei Ono A & Güntert P (2006) Optimal isotope labelling for NMR protein structure determinations. Nature 440, 52–57.

14 Klammt C, Lohr F, Schafer B, Haase W, Doetsch V, Rüterjans H, Glaubitz C & Bernhard F (2004) High level cell-free expression and specific labeling of integral membrane proteins. Eur J Biochem 271, 568–580.

15 Henrich B, Lubitz W & Plapp R (1982) Lysis of Escherichia coli by induction of Cloned Phi-X174 Genes. Mol Gen Gen 185, 493–497.

16 Guignard L, Ozawa K, Pursglove SE, Otting G & Dixon NE (2002) NMR analysis of in vitro-synthesized proteins without purification: a high-throughput approach. FEBS Lett 524, 159–162.

17 Kohno T (2005) Production of proteins for NMR studies using the wheat germ cell-free system. Methods Mol Biol 310, 169–185.

18 Kigawa T & Yokoyama S (2002) [High-throughput cell-free protein expression system for structural genomics and proteomics studies]. Tanpakushitsu Kakusan Koso 47, 1014–1019.

19 Yokoyama S (2003) Protein expression systems for structural genomics and proteomics. Curr Opin Chem Biol 7, 39–43.

20 Kigawa T, Yabuki T, Yoshida Y, Tsutsui M, Ito Y, Shibata T & Yokoyama S (1999) Cell-free production and stable-isotope labeling of milligram quantities of proteins. FEBS Lett 442, 15–19.

21 Kim DM, Kigawa T, Choi CY & Yokoyama S (1996) A Highly Efficient Cell-Free Protein Synthesis System from Escherichia coli. Eur J Biochem 239, 881–886.

22 Yokoyama S, Hirota H, Kigawa T, Yabuki T, Shirouzu M, Terada T, Ito Y, Matsuo Y, Kuroda Y, Nishimura Y, Kyogoku Y, Miki K, Masui R & Kuramitsu S (2000) Structural genomics projects in Japan. Nat Struct Biol 7 (Suppl.), 943–945.

23 Kim DM & Swartz JR (2000) Prolonging cell-free protein synthesis by selective reagent additions. Biotechnol Prog 16, 385–390.

24 Yin G & Swartz JR (2004) Enhancing multiple disulfide bonded protein folding in a cell-free system. Biotechnol Bioeng 86, 188–195.

25 Chumpolkulwong N, Hori-Takemoto C, Hosaka T, Inaoka T, Kigawa T, Shirouzu M, Ochi K & Yokoyama S (2004) Effects of Escherichia coli ribosomal protein S12 mutations on cell-free protein synthesis. Eur J Biochem 271, 1127–1134.

26 Madin K, Sawasaki T, Ogasawara T & Endo Y (2000) A highly efficient and robust cell-free protein synthesis system prepared from wheat embryos: Plants apparently contain a suicide system directed at ribosomes. Proc Natl Acad Sci USA 97, 559–564.

27 Endo Y (2001) Genomics to Proteomics: A High-throughput Cell-free Protein Synthesis System for Practical Use. The 3rd ORCS International Symposium on Ribosome Engineering, January 22–23, 2001, Tsukuba, Japan.

28 Kawasaki T, Gouda MD, Sawasaki T, Takai K & Endo Y (2003) Efficient synthesis of a disulfide-containing protein through a batch cell-free system from wheat germ. Eur J Biochem 270, 4780–4786.

29 Morita EH, Sawasaki T, Tanaka R, Endo Y & Kohno T (2003) A wheat germ cell-free system is a novel way to screen protein folding and function. Protein Sci 12, 1216–1221.

30 Sawasaki T, Ogasawara T, Morishita R & Endo Y (2002) A cell-free protein synthesis system for high-throughput proteomics. Proc Natl Acad Sci USA 99, 14652–14657.

31 Sawasaki T, Hasegawa Y, Tsuchimochi M, Kamura N, Ogasawara T, Kuroita T & Endo Y (2002) A bilayer cell-free protein synthesis system for high-throughput screening of gene products. FEBS Lett 514, 102–105.

32 Vinarov DA, Lytle BL, Peterson FC, Tyler EM, Volkman BF & Markley JL (2004) Cell-free protein production and labeling protocol for NMR-based structural proteomics. Nat Methods 1, 149–153.

33 Vinarov DA & Markley JL (2005) High-Throughput Automated Platform for NMR-Based Structural Proteomics. Expert Rev Proteomics 2, 49–55.

34 Vinarov DA, Tyler EM, Loushin Newman CL, Shahan MN & Markley JL (2006) Protein Production using the Wheat Germ Cell-Free Expression System. In Current Protocols in Protein Science (Coligan JE, Dunn BM, Ploegh HL, Speicher DW & Wingfield PT, eds; Series ed. Taylor G), pp. 5.18.1–5.18.18. Unlimited Learning Resources, Winston-Salem, NC.

35 Tyler RC, Aceti DJ, Bingman CA, Cornilescu CC, Fox BG, Frederick RO, Jeon WB, Lee MS, Newman CS, Peterson FC, Phillips GN Jr, Shahan MN, Singh S, Song J, Sreenath H, Tyler EM, Ulrich EL, Vinarov DA, Vojtik FC, Volkman BF, Wrobel RL, Zhao Q & Markley JL (2005) Comparison of cell-based and cell-free protocols for producing target proteins from the Arabidopsis thaliana genome for structural studies. Proteins 59, 633–643.

36 Betton JM (2003) Rapid translation system (RTS): a promising alternative for recombinant protein production. Curr Protein Pept Sci 4, 73–80.

37 Ryabov LA, Desplancq D, Spirin AS & Pluckthun A (1997) Functional antibody production using cell-free translation: effects of protein disulfide isomerase and chaperones. Nat Biotechnol 15, 79–84.

38 Kang SH, Kim DM, Kim HJ, Jun SY, Lee KY & Kim HJ (2005) Cell-free production of aggregation-prone proteins in soluble and active forms. Biotechnol Prog 21, 1412–1419.

39 Hirano N, Sawasaki T, Tozawa Y, Endo Y & Takai K (2006) Tolerance for random recombination of domains in prokaryotic and eukaryotic translation systems: Limited interdomain misfolding in a eukaryotic translation system. Proteins 64, 343–354.

40 Blommel PG, Martin PA, Wrobel RL, Steffen E & Fox BG (2006) High-efficiency single-step production of expression plasmids from cDNA clones using the Flexi Vector cloning system. Protein Expr Purif 47, 562–570.

41 Tyler RC, Sreenath H, Aceti DJ, Bingman CA, Singh S, Markley JL & Fox BG (2005) Auto-Induction Medium for the Production of [U-15N]- and [U-13C, U-15N]-labeled Proteins for NMR Screening and Structure Determination. Protein Expr Purif 40, 268–278.

42 Song J, Vinarov D, Tyler EM, Shahan MN, Tyler RC & Markley JL (2004) Hypothetical protein At2g24940.1 from Arabidopsis thaliana has a cytochrome b5 like fold. J Biomol NMR 30, 215–218.

43 Peterson FC, Lytle BL, Sampath S, Vinarov D, Tyler E, Shahan M, Markley JL & Volkman BF (2005) Solution structure of thioredoxin h1 from Arabidopsis thaliana. Protein Sci 14, 2195–2200.

44 Song J, Tyler RC, Lee MS, Tyler EM & Markley JL (2005) Solution structure of isoform 1 of Roadblock/LC7, a light chain in the dynein complex. J Mol Biol 354, 1043–1051.

45 Singh S, Cornilescu CC, Tyler RC, Cornilescu G, Tonelli M, Lee MS & Markley JL (2005) Solution structure of a late embryogenesis abundant protein (LEA14) from Arabidopsis thaliana, a cellular stress-related protein. Protein Sci 14, 2601–2609.

Title	Wheat Germ Cell-Free Platform For Eukaryotic Protein Production
Authors	Dmitriy A. Vinarov, Carrie L. Loushin Newman, John L. Markley
Supervisor	J. L. Markley
Institution	University of Wisconsin-Madison
Field	Biochemistry
Type	Minireview
Year of publication	2006
City	Madison
