Flavogenomics – a genomic and structural view of
flavin-dependent proteins
Peter Macheroux1,2, Barbara Kappes3 and Steven E. Ealick2
1 Institute of Biochemistry, Graz University of Technology, Austria
2 Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
3 Department of Parasitology, University Hospital Heidelberg, Germany
Keywords
enzymes; flavin adenine dinucleotide (FAD);
flavin mononucleotide (FMN); genomic
distribution; oxidoreductases; redundancy;
structures
Correspondence
P Macheroux, Institute of Biochemistry,
Graz University of Technology, Petersgasse
12/II, A-8010 Graz, Austria
Fax: +43 316 873 6952
Tel: +43 316 873 6450
E-mail: peter.macheroux@tugraz.at
(Received 17 March 2011, revised 11 May
2011, accepted 31 May 2011)
doi:10.1111/j.1742-4658.2011.08202.x
Riboflavin (vitamin B2) serves as the precursor for FMN and FAD in almost all organisms that utilize the redox-active isoalloxazine ring system as a coenzyme in enzymatic reactions. The role of flavin, however, is not limited to redox processes, as 10% of flavin-dependent enzymes catalyze nonredox reactions. Moreover, the flavin cofactor is also widely used as a signaling and sensing molecule in biological processes such as phototropism and nitrogen fixation. Here, we present a study of 374 flavin-dependent proteins analyzed with regard to their function, structure and distribution among 22 archaeal, eubacterial, protozoan and eukaryotic genomes. More than 90% of flavin-dependent enzymes are oxidoreductases, and the remaining enzymes are classified as transferases (4.3%), lyases (2.9%), isomerases (1.4%) and ligases (0.4%). The majority of enzymes utilize FAD (75%) rather than FMN (25%), and bind the cofactor noncovalently (90%). High-resolution structures are available for about half of the flavoproteins. FAD-containing proteins predominantly bind the cofactor in a Rossmann fold (~50%), whereas FMN-containing proteins preferably adopt a (βα)8-(TIM)-barrel-like or flavodoxin-like fold. The number of genes encoding flavin-dependent proteins varies greatly in the genomes analyzed, and covers a range from 0.1% to 3.5% of the predicted genes. It appears that some species depend heavily on flavin-dependent oxidoreductases for degradation or biosynthesis, whereas others have minimized their flavoprotein arsenal. An understanding of ‘flavin-intensive’ lifestyles, such as in the human pathogen Mycobacterium tuberculosis, may result in valuable new intervention strategies that target either riboflavin biosynthesis or uptake.
Abbreviations
PDB, Protein Data Bank; RI, redundancy index.
Introduction
Biological cofactors are generally employed by enzymes to enable a wide and diverse range of biochemical transformations necessary for all aspects of life. Some of these cofactors, such as vitamin B12 and vitamin H (biotin), catalyze a small but nevertheless important set of biochemical reactions. Other cofactors, in contrast, perform very different chemical tasks, and compete for the title of master of versatility, with vitamin B2 (riboflavin)-derived and vitamin B6-derived (e.g. pyridoxine and pyridoxamine) cofactors and cytochrome P450 being the most serious contenders. The yellow vitamin B2, or riboflavin, is synthesized by many bacteria and plants [1,2], and is then converted to FMN and FAD (for structures see Fig. 1) in two ATP-dependent reactions: riboflavin kinase phosphorylates the ribityl side chain attached to N10 of the isoalloxazine ring system to give FMN, and FAD synthetase adenylates FMN to give FAD [3–5]. These two modified forms of riboflavin occur exclusively in flavin-dependent enzymes. The biochemical utility of FMN and FAD is based on their redox-active isoalloxazine ring system, which is capable of one-electron and two-electron transfer reactions and, most importantly, of dioxygen activation [6]. Generations of enzymologists have marvelled at the astonishing diversity of flavin-dependent reactions, encompassing dehydrogenation [7], oxidation [8–10], monooxygenation [11–13], halogenation [14–16], and reduction (e.g. of disulfides and various types of double bond) [17], as well as their utility in biological sensing processes (e.g. light and redox status) [18–25]. Not surprisingly, this area has been the subject of numerous review articles that have attempted to fathom and rationalize the capabilities of the flavin cofactor [26–32]. The complexity of flavin-catalyzed reactions is further increased when they join forces with other redox-active cofactors, such as iron–sulfur clusters ([2Fe–2S], [3Fe–4S] and/or [4Fe–4S]) [33–35], heme [36], molybdopterin [37], or thiamine diphosphate [38].
Since the discovery of the first flavin-containing enzyme by Otto Warburg in the 1930s [39], the number of ‘yellow’ enzymes has steadily increased, and there has been a sharp rise in the last 20–30 years, owing to the rapid progress in molecular cloning and full genome sequencing. More recently, structural genomics has led to the structural characterization of many more and hitherto unknown flavoproteins. To gain an overview of flavoproteins, their genomic distribution, and their structural topologies, we have assembled a list of flavoproteins and searched for the encoding sequences in a selection of genomes. In addition, structural information on flavoproteins in the Protein Data Bank (PDB) was analyzed in order to define the flavin-binding pocket according to the PFAM classification scheme [40].
Nature’s flavoprotein arsenal
The list of flavin-dependent proteins was assembled mainly from three online databases. First, the enzyme database BRENDA (http://www.brenda-enzymes.org/) was searched for FMN-dependent and FAD-dependent enzymes to compile a preliminary list. This initial list contained many false positives and also missed several flavin-dependent enzymes, as well as flavoproteins with no catalytic or no known catalytic function (e.g. flavin storage proteins). To verify the dependence of a protein on flavin, the primary literature was consulted, and a complementary search for classified enzymes in the Enzyme Structures Database (http://www.ebi.ac.uk/thornton-srv/databases/enzymes/) and the PDB (http://www.pdb.org/pdb/home/home.do) was conducted to link the list of flavoproteins to the available structural information.
The current list of flavoproteins contains 276 fully classified enzymes and 98 entries for enzymes with no or incomplete classification as well as flavoproteins without a demonstrated enzymatic activity (cofactor storage, electron transfer, repressor and response proteins; 17 entries). As could be expected for a redox-active cofactor, the majority of flavoenzymes are found in enzyme class 1: oxidoreductases account for 91% (251 entries), whereas transferases, lyases, isomerases and ligases contribute only 4.3% (12 entries), 2.9% (eight entries), 1.4% (four entries), and 0.4% (one entry), respectively (Fig. 2A). Within the class of oxidoreductases, the three largest subgroups are enzymes in EC 1.14 (61 entries for monooxygenases/hydroxylases), EC 1.1 (38 entries for enzymes oxidizing a CH–OH group), and EC 1.3 (30 entries for enzymes oxidizing a CH–CH group) (Fig. 2B).
Fig. 1. Structure of riboflavin, FMN, and FAD. The redox-active isoalloxazine ring is shown in its oxidized and two-electron reduced state (red and blue). The numbering scheme for the isoalloxazine ring is indicated in the oxidized structure on the left.
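The class percentages quoted for the fully classified flavoenzymes can be rechecked from the entry counts alone. The following is a minimal sketch of that arithmetic; the dictionary labels and the function name `class_shares` are illustrative choices, and only the counts themselves come from the text.

```python
# Entry counts per enzyme class, as quoted in the text for the
# 276 fully classified flavoenzymes. The labels are illustrative.
CLASS_COUNTS = {
    "oxidoreductases (class 1)": 251,
    "transferases (class 2)": 12,
    "lyases (class 4)": 8,
    "isomerases (class 5)": 4,
    "ligases (class 6)": 1,
}


def class_shares(counts):
    """Return each class's share of the total in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = class_shares(CLASS_COUNTS)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The counts sum to 276, so the shares are taken relative to the fully classified enzymes only, not to all 374 entries in the list.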
FAD is clearly more common as a cofactor than FMN, with 289 proteins depending on FAD (75%) and 98 on FMN (25%) (note: entries where cofactor utilization is unclear were not considered; see Table S1). Riboflavin is not used in any enzymes (except for riboflavin kinase/FAD synthetase as a substrate), but appears to be the preferred storage form of the cofactor in some organisms (e.g. riboflavin-binding protein in chicken eggs and dodecin in archaea [41,42]). In addition, organisms (e.g. mammals) lacking vitamin B2 biosynthesis employ riboflavin-specific transporters to sequester it from dietary sources by facilitated diffusion [43].
In the majority of enzymes, the cofactor is noncovalently bound in the active site. Covalent attachment of the flavin cofactor has been confirmed in 40 cases (see Table S2), corresponding to 10.8% of all flavoproteins listed in Table S1. Apparently, covalent attachment of FMN (five entries) occurs rarely as compared with that of FAD (35 entries). Different types of covalent attachment have been found for FMN. It is linked either to the 8α-position (via N3 of a histidine) or to the 6-position (via the thiol group of a cysteine) of the isoalloxazine ring [44], or, in one case, it is bicovalently linked to N1 of a histidine and the thiol group of a cysteine [45]. Only recently, a novel attachment of FMN to redox-driven ion pumps (RnfG and RnfD) via an ester linkage between the hydroxyl group of a threonine and the ribitylphosphate side chain of the cofactor was discovered [46]. On the other hand, covalent linkage of FAD always occurs via the 8α-position, to either the N1 or N3 of a histidine, a cysteine thiol, a tyrosine hydroxyl, or an aspartate carboxyl group (Table S2) [44,47]. In five enzymes, FAD is bicovalently attached via the 8α-position and 6-position of the isoalloxazine ring system [48]. Bicovalent attachment was first discovered only 5 years ago, but appears to be more common than monocovalent attachment to the 8α-position via cysteine, tyrosine, or aspartate [49,50].
Flavoprotein structures
The first structure of a flavin-dependent protein was reported in 1972 for a bacterial flavodoxin [51,52]. Several years later, the structures of the FAD-dependent enzymes glutathione reductase (EC 1.8.1.7) and 4-hydroxybenzoate 3-monooxygenase (p-hydroxybenzoate hydroxylase; EC 1.14.13.2) were described [53,54]. Since that time, the numbers of deposited structures have risen to 646 and 1179 structures of FMN-dependent and FAD-dependent proteins, respectively (as of 31 December 2010), and this has been paralleled by efforts to relate the structures of flavoproteins to their functions [55–58]. The structure of flavodoxin, a small electron transfer protein that uses FMN as a cofactor, is not only the first but also by far the most frequently solved structure of all flavin-dependent proteins (> 120 entries in the PDB).
Currently, structures are available for 55 FMN-utilizing and 141 FAD-utilizing flavoproteins, accounting for 52% of all flavoproteins listed in Table S1. Overall, a total of 23 structural clans (according to the PFAM classification [40]) is represented by flavin-dependent proteins, and the structural topologies are therefore quite diverse in comparison with other cofactor-dependent enzyme families; for example, all pyridoxal 5′-phosphate-dependent enzymes adopt one of five different structural topologies [59].
Fig. 2. Pie chart of flavoproteins found in various enzyme classes: yellow, class 1 (oxidoreductases, 91%); orange, class 2 (transferases, 4.3%); red, class 4 (lyases, 2.9%); blue, class 5 (isomerases, 1.4%); and green, class 6 (ligases, 0.4%). The oxidoreductase subclasses shown are EC 1.1–1.8, 1.10–1.14, 1.16–1.18 and 1.21. This chart was generated by using the fully classified flavoenzymes (a total of 276) from Table S1.
As can be seen from Fig. 3, FMN and FAD binding are vastly different with respect to the topology of the binding pocket, indicating that the adenosine moiety strongly affects the mode of cofactor binding. The preferred structures for FMN binding are the classical (β/α)8-barrel (clan TIM_barrel), with 16 entries, and the flavodoxin-like fold (clan Flavoprotein), with 12 entries. Together, these two clans account for more than half of the currently known FMN-dependent structural types. Graphical representations of these two most common topologies in FMN-dependent proteins are shown in Fig. 4A,B. Within the clan TIM_barrel, five families are found in FMN-dependent enzymes: FMN_dh (six entries), Oxidored_FMN (five entries), and DHO_dh, Glu_synthase and NPD (one entry for each family). In the clan Flavoprotein, nine proteins adopt a Flavodoxin_1, two an FMN_red and one a recently discovered Flavodoxin_NrdI fold. All of the FMN-dependent proteins in this clan serve as electron transfer proteins or act as two-electron reductases for free flavin (FMN reductase, EC 1.5.1.29) or other electron acceptors (e.g. azobenzene reductase, EC 1.7.1.6). In addition to these two most abundant structural clans, FMN-dependent proteins are found in 12 rare folds. Some of these folds are unique structures, and are found in only one or a few enzymes, such as bacterial luciferase (Bac_luciferase), nitroreductase (Nitroreductase fold), phosphopantothenate-cysteine ligase (clan NADP_Rossmann/family DFP), and chorismate synthase (chorismate_syn). The latter two examples are very interesting, because these two enzymes do not catalyze net redox reactions and are not classical oxidoreductases, like most flavin-dependent enzymes (Fig. 2). This observation suggests that FMN-dependent enzymes used for ‘aberrant’ activities have evolved independently from the canonical FMN-dependent oxidoreductases, or, in other words, the folds necessary to carry out the enzymatic reaction were not ‘borrowed’ from the oxidoreductases, but instead novel topologies have arisen during the evolution of these enzymes. As will be discussed below, this tendency for unusual reactions to call for unusual folds is also found in FAD-dependent enzymes.
The topologies found for FAD binding are dominated by the Rossmann fold or variations thereof, contained in the clan NADP_Rossmann (Fig. 4C) [56]. This structural clan comprises a large number of families (148), with nine families reported to serve for FAD binding. Almost half of the FAD-dependent proteins exhibit a fold in this clan (Fig. 3, bottom panel). Second to the clan NADP_Rossmann is the clan FAD_PCMH (two families; for a graphical example, see Fig. 4D), followed by the clan FAD_Lum_binding (five families) and the clan Acyl-CoA_dh (four families). Together, the structures found in these four clans account for 75% of all FAD-dependent proteins. The clans that are rare appear to occur predominantly in proteins with special biological functions, such as light-dependent DNA repair (deoxyribodipyrimidine photolyase, EC 4.1.99.3), oxidoreductase activity in the endoplasmic reticulum (ERO1), or electron transfer from acyl-CoA dehydrogenases to the electron transport chain (clan 4Fe–4S). As discussed above for FMN-dependent proteins, this observation suggests that employment of FAD-dependent enzymes for novel or unusual functions requires the adaptation of already existing topologies and, in some cases, new structural designs to fulfill the desired role.
Fig. 3. Bar plot of the distribution of structural clans (according to the PFAM classification) in FMN-dependent (A) and FAD-dependent (B) flavoproteins.
Fig. 4. Graphical representation of the two most common structural clans for FMN-dependent (A, B) and FAD-dependent (C, D) proteins. The examples show the structures of (A) flavodoxin from Desulfovibrio vulgaris (PDB entry 1fx1), (B) old yellow enzyme from Sa. cerevisiae (PDB entry 1oyc), (C) glutathione disulfide reductase (PDB entry 3grs), and (D) UDP-N-acetylmuramate dehydrogenase (PDB entry 1mbt), representing the clans Flavoprotein, TIM_barrel, NADP_Rossmann, and FAD_PCMH, respectively. The structure representations were generated with PYMOL.
The majority of covalently bound flavins are present as FAD rather than FMN (Table S2). Interestingly, covalent attachment of FAD occurs only in the two most abundant clans, NADP_Rossmann and FAD_PCMH, and is almost equally distributed between these two clans (Table S2). Several families in the clan NADP_Rossmann are associated with covalent FAD linkage (DAO, GMC_oxred_N, FAD_binding_2, Amino_oxidase, and Trp_halogenase). This is in contrast to the clan FAD_PCMH, where covalent linkage is found in the family FAD_binding_4 but not in the family FAD_binding_5, which comprises FAD-containing and molybdopterin-containing enzymes, such as xanthine oxidase (EC 1.1.3.22) and quinoline-2-oxidoreductase (EC 1.3.99.17), to mention only two representatives of this family (Table S1). Covalent linkage is highly prevalent in the family FAD_binding_4: 11 of the 14 structures reported for this family show monocovalent or bicovalent flavin attachment, with UDP-N-acetylmuramate dehydrogenase (EC 1.1.1.158), D-lactate dehydrogenase (EC 1.1.1.28) and alkyldihydroxyacetone phosphate synthase (EC 2.5.1.26) being the only exceptions (Table S2).
Impact of structural genomics consortia
Several structural genomics projects on prokaryotic and eukaryotic species have been initiated, in order to define the structures of expressed proteins in the target organism. A total of 173 entries (86 for FMN-utilizing proteins and 87 for FAD-utilizing proteins) have been deposited by structural genomics consortia since 1999, amounting to ~10% of the total entries (~1800 entries; ~640 for FMN-utilizing proteins and ~1160 for FAD-utilizing proteins). Analysis of the structural classification for FMN-dependent proteins reveals a strong bias towards the clan Nitroreductases, with a total of 27 entries (~31%). As this clan has only a moderate frequency among FMN-dependent proteins (Fig. 3, top panel), this overrepresentation suggests that this type of structure is favored by the methodologies currently used in structural genomics pipelines. The aim of the consortia to elucidate the structures of as many different proteins as possible also leads to a serious lack of biochemical information, which renders some of the PDB entries difficult to interpret in terms of the biological function of the flavoprotein. On the other hand, several structures of new flavoproteins with unknown roles have been contributed by structural genomics initiatives. For example, a zinc-dependent protease from Bacteroides thetaiotaomicron (clan Glutaminase_I, family DJ-1/PfpI, PDB entry 3cne) and protein structures with a fold similar to the C-terminal domain of pyruvate kinase in the archaea Archaeoglobus fulgidus and Methanobacterium thermoautotrophicum were recently deposited in the PDB (clan PK_C, PDB entries 1vp8 and 1t57). However, the role of the FMN cofactor in these two proteins is unclear. In the putative protease, the flavin isoalloxazine ring is sandwiched by two tryptophans at the interface of the dimeric protein, with the edge of the pyrimidine ring moiety at a distance of 15 Å from the presumably catalytic mononuclear zinc center. Hence, the flavin does not appear to play a role in catalysis, but may instead be involved in dimerization of the protein or act as a gate for potential substrates to enter the active site. On the other hand, the flavin in the pyruvate kinase fold in archaea is located in a central cavity of the protein, and engages in hydrogen bond interactions with several amino acid side chains. In this case, it seems plausible that the flavin plays a catalytic role, albeit in a type of fold that has not previously been implicated in flavoenzyme catalysis. Furthermore, an FMN-dependent oxidoreductase from Thermotoga maritima was the first structure of a flavin-dependent tRNA dihydrouridine synthase (clan TIM_barrel, family Dus; PDB entry 1vhn), an enzyme that has recently been characterized biochemically [60].
In the case of FAD, the entries provided by structural genomics consortia reflect the predominance of the clan NADP_Rossmann, with 44 of 87 entries belonging to this clan. Interestingly, several new structural families for FAD-dependent proteins were defined in the course of structural genomics efforts, such as the BLUF domain of blue light sensors in cyanobacteria (1x0p), the glucose-inhibited division protein A (GidA) domain in the clan NADP_Rossmann, the HI0933-like proteins (first discovered in target 0933 from Haemophilus influenzae, PDB entry 2gqf), and a siderophore-interacting protein (family FAD_binding_9 in the clan FAD_Lum_binding). In addition, a novel covalent attachment between a side chain carboxylate group of an aspartate and the 8α-position of the isoalloxazine system was discovered in an FAD-dependent halogenase involved in chloramphenicol biosynthesis in Streptomyces venezuelae [47]. As noted before, this structural information provides interesting leads for biochemists to follow up and subject these proteins to thorough biochemical characterization in order to reveal their cellular role.
Flavogenomics – occurrence and distribution of flavoproteins in prokaryotes and eukaryotes
Despite the availability of genomic sequence information, it proved difficult to obtain reliable information on the occurrence of flavoproteins encoded in the genomes of various organisms. This is mostly because of the lack of information on whether a flavin (FMN and/or FAD) cofactor is present and on the precise biochemical reaction catalyzed by the enzyme. On the other hand, it is doubtful that all, or even most, of the proteins predicted by genomics will ever be subjected to a detailed characterization that would enable accurate functional assignment of a putative flavoenzyme. For most of the species analyzed, we used the annotations provided by the responsible sequencing facility, and included only those entries that gave a clear indication of flavin dependence (see Methods). This approach probably leads to an underestimation of the number of flavoproteins, as many ‘hypothetical’ or ‘putative’ proteins may be flavin-dependent but are not annotated as such. An interesting alternative to use of the existing annotations is the analysis of predicted protein families as provided by the Broad Institute for Neurospora crassa (http://www.broadinstitute.org/annotation/genome/neurospora/Pfam.html) and on the tuberculosis research platform for Mycobacterium tuberculosis and Streptomyces coelicolor (http://www.tbdb.org/). Therefore, we have also used our set of structural families (Table S3) to search for proteins predicted in the above-mentioned species. In the case of M. tuberculosis, a parallel analysis of the available genome annotation was conducted. The ‘structural family approach’ has generated a significantly higher number of predicted flavoproteins (141 versus 113), as many hypothetical proteins are found in protein families that are typical or even specific for flavoproteins (e.g. FAD_binding_4 or NPD) and hence were included as predicted flavin-dependent proteins. The disadvantage of this more ‘inclusive’ analysis is that some of the protein families, such as PAS_3, are not specific for flavin and may utilize other cofactors (e.g. heme). In any case, the task of eliminating the false positives and false negatives inherent in both approaches can only be performed by biochemical characterization of predicted and suspected flavoproteins. To this end, structural genomics may also play an important role; however, flavoenzymes that do not hold on tightly to the flavin cofactor (e.g. chorismate synthase) or use it only transiently during catalysis (e.g. hydroxypropylphosphonic acid epoxidase) may elude identification as flavin-dependent proteins.
Although it is presently not possible to determine the exact number of flavoproteins, our analysis has revealed striking differences in the utilization of flavin-dependent proteins in various prokaryotic and eukaryotic species, which are reflected both by the total number and the percentage of genes encoding flavoproteins (Fig. 5). Several species appear to have a minimum number of flavin-dependent proteins that are required to maintain basic metabolic functions, such as succinate dehydrogenase, which is necessary for primary energy metabolism, and chorismate synthase and acetolactate synthase, which are necessary for amino acid biosynthesis. Examples of species with a minimal set of enzymes are Pyrococcus abyssi, T. maritima, and Saccharomyces cerevisiae (with 12, 12 and 48 entries, respectively). On the other hand, organisms such as M. tuberculosis, Neurospora crassa, S. coelicolor and Arabidopsis thaliana contain a relatively large number of genes encoding flavin-dependent proteins. In these cases, flavoenzymes are apparently involved in a species-specific lifestyle that requires a much larger set of flavoenzymes than are needed by the ‘flavin minimalists’ mentioned before. Closer inspection of the set of flavoenzymes in these organisms reveals a multitude of one or several types of flavin-dependent proteins. In order to estimate this redundancy of a ‘flavogenome’, we have defined the quotient of the number of distinct flavin-dependent proteins (i.e. with different EC numbers) and the total number of flavin-dependent proteins as a ‘redundancy’ index (RI) (RI = 1 indicates a nonredundant flavogenome, whereas RI < 1 indicates increasing redundancy; Fig. 5C). In the case of M. tuberculosis, 34 genes encoding acyl-CoA dehydrogenases and 10–15 genes encoding flavin-containing monooxygenases and oxidoreductases give rise to high redundancy (RI = 0.55; Fig. 5C). The occurrence of this many acyl-CoA dehydrogenases is apparently related to the extensive and complex utilization of lipids from host cells by this pathogenic bacterium [61]. A large number of genes encoding acyl-CoA dehydrogenases is also found in S. coelicolor, and is only exceeded by putative flavin-dependent oxidoreductases, with 57 predicted genes. Again, the abundance of these flavoenzymes can be rationalized on the basis of the lifestyle: S. coelicolor is a rather immobile soil bacterium that can adapt to various carbon and nitrogen sources and produces a large number of biologically active compounds, such as antibiotics. In other words, the organism depends on metabolic power and versatility that are certainly conferred to some degree by flavin-dependent enzymes. In contrast to M. tuberculosis and S. coelicolor, N. crassa has apparently pursued a different metabolic strategy by using a broader array of flavoenzymes rather than a highly similar set, as indicated by the rather high RI (0.74 versus 0.5 and 0.55 for S. coelicolor and M. tuberculosis, respectively). As a result, N. crassa contains more than 100 different flavoproteins, more than any other species analyzed in our study. The large number of flavoenzymes in this filamentous fungus may be attributable to diverse biosynthetic routes leading to secondary metabolites, as well as the saprotrophic lifestyle, which requires the generation and secretion of oxidases and dehydrogenases to access organic matter in the environment. In this context, it is noteworthy that the protein family FAD_binding_4 constitutes the largest group among the predicted putative flavoenzymes in this species. Members of this family are typically oxidases that are capable of performing a wide range of substrate (e.g. sugars and alcohols) oxidation reactions [48].
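The redundancy index defined above is a simple quotient, and a small worked example may make the definition concrete. The following is a minimal sketch, assuming that every flavoprotein gene in a genome has already been assigned an EC number; how unclassified entries were treated in the actual analysis is not stated here, and the function name and the toy gene list are invented for illustration only.

```python
def redundancy_index(ec_numbers):
    """RI = number of distinct EC numbers / total number of flavoprotein genes.

    `ec_numbers` holds one EC number per flavoprotein gene (an assumption:
    genes without an EC assignment are not handled in this sketch).
    """
    if not ec_numbers:
        raise ValueError("at least one flavoprotein gene is required")
    return len(set(ec_numbers)) / len(ec_numbers)


# Invented toy flavogenome, not data from the study: four genes, two of
# which share the same EC number.
toy_genome = ["1.8.1.7", "1.14.13.2", "1.14.13.2", "4.1.99.3"]
print(redundancy_index(toy_genome))  # 3 distinct / 4 total = 0.75
```

With this definition, a genome in which every flavoprotein has its own EC number gives RI = 1, and each additional gene sharing an existing EC number lowers the value.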
The flavogenome of the model plant A. thaliana is the most prolific among the analyzed genomes. This is mostly because of the occurrence of two large groups of flavoproteins, monooxygenases and oxidases of the (S)-tetrahydroprotoberberine oxidase/berberine bridge enzyme family, with 31 and 26 members, respectively. As previously discussed for microbial genomes, the large number of enzymes in these two flavoprotein families is a reflection of the diversity of metabolic processes employed to synthesize a vast array of bioactive compounds. In the case of plants, natural products such as alkaloids and terpenes are among the compounds synthesized for signaling and defense purposes. Several members of the berberine bridge enzyme family are implicated in plant metabolism, such as (S)-tetrahydroprotoberberine oxidase, nectarin V [62], and pollen allergen proteins [63]. Therefore, it can be expected that most of the flavoproteins occurring in these two groups will catalyze distinct reactions on various different substrates.
Fig. 5. Occurrence and distribution of flavoproteins in 22 selected genomes. (A) The number of genes encoding flavin-dependent proteins in the genomes of My. genitalium, Ar. fulgidus, Me. jannaschii, P. abyssi, Pl. falciparum, To. gondii, Sa. cerevisiae, N. crassa, A. thaliana, D. melanogaster and Homo sapiens. (B) The numbers of predicted flavoproteins as percentages of the total proteins for the species in (A). (C) The RIs of flavoproteins in these genomes. Yellow bars indicate genomes with low redundancy, and brown bars indicate genomes with high redundancy.
The RI seems to be a useful tool with which to identify organisms that have a ‘flavin-dependent’ lifestyle because of their high demand for chemically complex biomolecules, and which are thus potentially vulnerable to inhibitors of riboflavin biosynthesis and/or uptake [64–66]. Although it is apparent that major species-specific differences exist, the currently estimated RIs are probably too low for several species, owing to the lack of biochemical knowledge of the enzymes in the most common flavoprotein families. Hence, future efforts to define the flavoprotein arsenal of an organism have to focus on three aspects: to capture all true flavin-dependent proteins, to eliminate false positives, and to characterize the flavoproteins biochemically in order to classify them accurately. As a significant first step, it would be useful to conduct an HMMER analysis [67] of the existing genomes to provide a list of potential flavoproteins, to enable scientists to target specifically these putative genes for biochemical and structural studies.
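Once protein-family assignments are available for a genome (for example from a profile search of the kind proposed above), the screening step itself can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the procedure used in this study: the input is assumed to be a list of (protein identifier, family name) pairs, the family sets contain only a few family names mentioned in the text, and the identifiers, labels and function name are hypothetical.

```python
# A few flavin-associated families named in the text (not the full Table S3 set).
FLAVIN_FAMILIES = {"FAD_binding_4", "NPD", "FMN_dh", "Oxidored_FMN", "Flavodoxin_1"}
# Families that are not specific for flavin and may bind other cofactors.
AMBIGUOUS_FAMILIES = {"PAS_3"}


def screen_hits(hits):
    """Classify proteins from (protein_id, family) pairs.

    A protein with at least one flavin-associated family is labelled
    'predicted'; one with only ambiguous families is labelled
    'needs_review'; proteins with neither are left out.
    """
    calls = {}
    for protein_id, family in hits:
        if family in FLAVIN_FAMILIES:
            calls[protein_id] = "predicted"
        elif family in AMBIGUOUS_FAMILIES:
            calls.setdefault(protein_id, "needs_review")
    return calls


# Hypothetical identifiers for illustration; 'Pkinase' stands in for a
# family with no flavin association.
example_hits = [
    ("geneA", "FAD_binding_4"),
    ("geneB", "PAS_3"),
    ("geneC", "Pkinase"),
    ("geneA", "PAS_3"),
]
print(screen_hits(example_hits))
```

Keeping the ambiguous families in a separate set makes the false-positive risk discussed for the ‘structural family approach’ explicit: such proteins are flagged for follow-up rather than counted as flavoproteins.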
Methods
Flavoproteins from different species were identified by screening pertinent databases. Microbial genomes were analyzed by screening the databases provided by the J. Craig Venter Institute (Ar. fulgidus DSM4304, Bacillus subtilis 168, Chlamydia trachomatis serovar D, Deinococcus radiodurans R1, Escherichia coli K-12, Helicobacter pylori 26695, Methanocaldococcus jannaschii, M. tuberculosis CDC1551, Mycoplasma genitalium G-37, Pseudomonas aeruginosa PAO, P. abyssi, Staphylococcus aureus MW2, T. maritima, and Vibrio fischeri ES114). Putative flavoproteins in N. crassa were retrieved by a web-based analysis of the known flavin-dependent protein families listed in Table S1 on http://www.broadinstitute.org/annotation/genome/neurospora/MultiHome.html. Flavoproteins in the yeast Sa. cerevisiae were identified with the annotations available on the yeast genome website at http://www.yeastgenome.org. A similar approach was used for M. tuberculosis and S. coelicolor A3(2) (http://www.tbdb.org/). Information on flavoproteins in the human parasites Plasmodium falciparum and Toxoplasma gondii was retrieved by inspection of http://plasmodb.org/plasmo/ and http://toxodb.org/toxo, respectively. Flavoproteins from A. thaliana were retrieved by a keyword and protein name search (flavin, FMN, FAD, dioxygenase, monooxygenase, hydroxylase, and the individual names of all flavoproteins listed in Table S1), with the ARabidopsis Gene EXpression Database (AREX) (http://www.arexdb.org/index.jsp). Analysis of flavoproteins in Drosophila melanogaster was based on a search in http://flybase.org/ and http://www.brenda-enzymes.org. Human flavoproteins were identified by a text search with the enzyme names from Table S1 in the Online Mendelian Inheritance in Man (OMIM) database (http://www.ncbi.nlm.nih.gov/omim).
References
1 Bacher A, Eberhardt S, Fischer M, Kis K & Richter G (2000) Biosynthesis of vitamin B2 (riboflavin). Annu Rev Nutr 20, 153–167.
2 Fischer M & Bacher A (2008) Biosynthesis of vitamin B2: structure and mechanism of riboflavin synthase. Arch Biochem Biophys 474, 252–265.
3 Efimov I, Kuusk V, Zhang X & McIntire WS (1998) Proposed steady-state kinetic mechanism for Corynebacterium ammoniagenes FAD synthetase produced by Escherichia coli. Biochemistry 37, 9716–9723.
4 Manstein DJ & Pai EF (1986) Purification and characterization of FAD synthetase from Brevibacterium ammoniagenes. J Biol Chem 261, 16169–16173.
5 Wu M, Repetto B, Glerum DM & Tzagoloff A (1995) Cloning and characterization of FAD1, the structural gene for flavin adenine dinucleotide synthetase of Saccharomyces cerevisiae. Mol Cell Biol 15, 264–271.
6 Massey V (1994) Activation of molecular oxygen by flavins and flavoproteins. J Biol Chem 269, 22459–22462.
7 Ghisla S & Thorpe C (2004) Acyl-CoA dehydrogenases. A mechanistic overview. Eur J Biochem 271, 494–508.
8 Fitzpatrick PF (2010) Oxidation of amines by flavoproteins. Arch Biochem Biophys 493, 13–25.
9 Fass D (2008) The Erv family of sulfhydryl oxidases. Biochim Biophys Acta 1783, 557–566.
10 Vrielink A & Ghisla S (2009) Cholesterol oxidase: biochemistry and structural features. FEBS J 276, 6826–6843.
11 Ellis HR (2010) The FMN-dependent two-component monooxygenase systems. Arch Biochem Biophys 497, 1–12.
12 Palfey BA & McDonald CA (2010) Control of catalysis in flavin-dependent monooxygenases. Arch Biochem Biophys 493, 26–36.
13 van Berkel WJ, Kamerbeek NM & Fraaije MW (2006) Flavoprotein monooxygenases, a diverse class of oxidative biocatalysts. J Biotechnol 124, 670–689.
14 Anderson JL & Chapman SK (2006) Molecular mechanisms of enzyme-catalysed halogenation. Mol BioSyst 2, 350–357.
15 Blasiak LC & Drennan CL (2009) Structural perspective on enzymatic halogenation. Acc Chem Res 42, 147–155.
16 van Pee KH, Dong C, Flecks S, Naismith J, Patallo EP & Wage T (2006) Biological halogenation has moved far beyond haloperoxidases. Adv Appl Microbiol 59, 127–157.
17 Argyrou A & Blanchard JS (2004) Flavoprotein disulfide reductases: advances in chemistry and function. Prog Nucleic Acid Res Mol Biol 78, 89–142.
18 Demarsy E & Fankhauser C (2009) Higher plants use LOV to perceive blue light. Curr Opin Plant Biol 12, 69–74.
19 Gomelsky M & Klug G (2002) BLUF: a novel FAD-binding domain involved in sensory transduction in microorganisms. Trends Biochem Sci 27, 497–500.
20 Kavakli IH & Sancar A (2002) Circadian photoreception in humans and mice. Mol Interv 2, 484–492.
21 Lin C & Todo T (2005) The cryptochromes. Genome Biol 6, 220.
22 Losi A & Gartner W (2011) Old chromophores, new photoactivation paradigms, trendy applications: flavins in blue light-sensing photoreceptors. Photochem Photobiol 87, 491–510.
23 Ozturk N, Song SH, Ozgur S, Selby CP, Morrison L, Partch C, Zhong D & Sancar A (2007) Structure and function of animal cryptochromes. Cold Spring Harbor Symp Quant Biol 72, 119–131.
24 Braatsch S, Gomelsky M, Kuphal S & Klug G (2002) A single flavoprotein, AppA, integrates both redox and light signals in Rhodobacter sphaeroides. Mol Microbiol 45, 827–836.
25 Macheroux P, Hill S, Austin S, Eydmann T, Jones T, Kim SO, Poole R & Dixon R (1998) Electron donation to the flavoprotein NifL, a redox-sensing transcriptional regulator. Biochem J 332, 413–419.
26 Ghisla S & Massey V (1989) Mechanisms of flavoprotein-catalyzed reactions. Eur J Biochem 181, 1–17.
27 Mansoorabadi SO, Thibodeaux CJ & Liu HW (2007) The diverse roles of flavin coenzymes – nature’s most versatile thespians. J Org Chem 72, 6329–6342.
28 Mathews FS, Cunane L & Durley RC (2000) Flavin electron transfer proteins. Subcell Biochem 35, 29–72.
29 Miura R (2001) Versatility and specificity in flavoenzymes: control mechanisms of flavin reactivity. Chem Rec 1, 183–194.
30 Vervoort J & Rietjens IM (1996) Unifying concepts in flavin-dependent catalysis. Biochem Soc Trans 24, 127–130.
31 Fagan RL & Palfey BA (2010) Flavin-dependent enzymes. In Comprehensive Natural Products II (Begley TP, ed.), pp 37–114. Elsevier, Amsterdam.
32 Joosten V & van Berkel WJ (2007) Flavoenzymes. Curr Opin Chem Biol 11, 195–202.
33 Johnson DA, Gassner GT, Bandarian V, Ruzicka FJ, Ballou DP, Reed GH & Liu HW (1996) Kinetic characterization of an organic radical in the ascarylose biosynthetic pathway. Biochemistry 35, 15846–15856.
34 Cecchini G, Schroder I, Gunsalus RP & Maklashina E (2002) Succinate dehydrogenase and fumarate reductase from Escherichia coli. Biochim Biophys Acta 1553, 140–157.
35 Vanoni MA, Dossena L, van den Heuvel RH & Curti B (2005) Structure–function studies on the complex iron–sulfur flavoprotein glutamate synthase: the key enzyme of ammonia assimilation. Photosynth Res 83, 219–238.
36 Mowat CG, Gazur B, Campbell LP & Chapman SK (2010) Flavin-containing heme enzymes. Arch Biochem Biophys 493, 37–52.
37 Garattini E, Mendel R, Romao MJ, Wright R & Terao M (2003) Mammalian molybdo-flavoenzymes, an expanding family of proteins: structure, genetics, regulation, function and pathophysiology. Biochem J 372, 15–32.
38 Tittmann K (2009) Reaction mechanisms of thiamine diphosphate enzymes: redox reactions. FEBS J 276, 2454–2468.
39 Warburg O & Christian W (1932) Ein zweites Sauerstoffübertragendes Ferment und sein Absorptionsspektrum. Naturwissenschaften 20, 688.
40 Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths-Jones S, Khanna A, Marshall M, Moxon S, Sonnhammer EL et al (2004) The Pfam protein families database. Nucleic Acids Res 32, D138–D141.
41 Grininger M, Staudt H, Johansson P, Wachtveitl J & Oesterhelt D (2009) Dodecin is the key player in flavin homeostasis of archaea. J Biol Chem 284, 13068–13076.
42 Monaco HL (1997) Crystal structure of chicken riboflavin-binding protein. EMBO J 16, 1475–1483.
43 Yamamoto S, Inoue K, Ohta KY, Fukatsu R, Maeda JY, Yoshida Y & Yuasa H (2009) Identification and functional characterization of rat riboflavin transporter 2. J Biochem 145, 437–443.
44 Mewies M, McIntire WS & Scrutton NS (1998) Covalent attachment of flavin adenine dinucleotide (FAD) and flavin mononucleotide (FMN) to enzymes: the current state of affairs. Protein Sci 7, 7–20.
45 Li YS, Ho JY, Huang CC, Lyu SY, Lee CY, Huang YT, Wu CJ, Chan HC, Huang CJ, Hsu NS et al (2007) A unique flavin mononucleotide-linked primary alcohol oxidase for glycopeptide A40926 maturation. J Am Chem Soc 129, 13384–13385.
46 Backiel J, Juarez O, Zagorevski DV, Wang Z, Nilges MJ & Barquera B (2008) Covalent binding of flavins to RnfG and RnfD in the Rnf complex from Vibrio cholerae. Biochemistry 47, 11273–11284.
47 Podzelinska K, Latimer R, Bhattacharya A, Vining LC, Zechel DL & Jia Z (2010) Chloramphenicol biosynthesis: the structure of CmlS, a flavin-dependent halogenase showing a covalent flavin–aspartate bond. J Mol Biol 397, 316–331.
48 Leferink NG, Heuts DP, Fraaije MW & van Berkel WJ (2008) The growing VAO flavoprotein family. Arch Biochem Biophys 474, 292–301.
49 Huang C-H, Lai W-L, Lee M-H, Chen C-J, Vasella A, Tsai Y-C & Liaw S-H (2005) Crystal structure of glucooligosaccharide oxidase from Acremonium strictum. A novel flavinylation of 6-S-cysteinyl, 8α-N1-histidyl FAD. J Biol Chem 280, 38831–38838.
50 Winkler A, Hartner F, Kutchan TM, Glieder A & Macheroux P (2006) Biochemical evidence that berberine bridge enzyme belongs to a novel family of flavoproteins containing a bi-covalently attached FAD cofactor. J Biol Chem 281, 21276–21285.
51 Andersen RD, Apgar PA, Burnett RM, Darling GD, Lequesne ME, Mayhew SG & Ludwig ML (1972) Structure of the radical form of clostridial flavodoxin: a new molecular model. Proc Natl Acad Sci USA 69, 3189–3191.
52 Watenpaugh KD, Sieker LC, Jensen LH, Legall J & Dubourdieu M (1972) Structure of the oxidized form of a flavodoxin at 2.5-Angstrom resolution: resolution of the phase ambiguity by anomalous scattering. Proc Natl Acad Sci USA 69, 3185–3188.
53 Schulz GE, Schirmer RH, Sachsenheimer W & Pai EF (1978) The structure of the flavoenzyme glutathione reductase. Nature 273, 120–124.
54 Wierenga RK, De Jong RJ, Kalk KH, Hol WGJ & Drenth J (1979) Crystal structure of p-hydroxybenzoate hydroxylase. J Mol Biol 131, 55–73.
55 De Colibus L & Mattevi A (2006) New frontiers in structural flavoenzymology. Curr Opin Struct Biol 16, 722–728.
56 Dym O & Eisenberg D (2001) Sequence-structure analysis of FAD-containing proteins. Protein Sci 10, 1712–1728.
57 Karplus PA, Fox KM & Massey V (1995) Flavoprotein structure and mechanism. 8. Structure–function relations for old yellow enzyme. FASEB J 9, 1518–1526.
58 Massey V (1995) Introduction: flavoprotein structure and mechanism. FASEB J 9, 473–475.
59 Percudani R & Peracchi A (2003) A genomic overview of pyridoxal-phosphate-dependent enzymes. EMBO Rep 4, 850–854.
60 Rider LW, Ottosen MB, Gattis SG & Palfey BA (2009) Mechanism of dihydrouridine synthase 2 from yeast and the importance of modifications for efficient tRNA reduction. J Biol Chem 284, 10324–10333.
61 Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, Gordon SV, Eiglmeier K, Gas S, Barry CE III et al (1998) Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393, 537–544.
62 Carter CJ & Thornburg RW (2004) Tobacco nectarin V is a flavin-containing berberine bridge enzyme-like protein with glucose oxidase activity. Plant Physiol 134, 460–469.
63 Liaw S, Lee DY, Chow LP, Lau GX & Su SN (2001) Structural characterization of the 60-kDa bermuda grass pollen isoallergens, a covalent flavoprotein. Biochem Biophys Res Commun 280, 738–743.
64 Dutta P (1991) Enhanced uptake and metabolism of riboflavin in erythrocytes infected with Plasmodium falciparum. J Protozool 38, 479–483.
65 Dutta P, Pinto J & Rivlin R (1985) Antimalarial effects of riboflavin deficiency. Lancet 2, 1040–1043.
66 Dutta P, Pinto J & Rivlin RS (1986) Malaria chemotherapy through interference of riboflavin metabolism. Lancet 1, 679–680.
67 Marchin M, Kelly PT & Fang J (2005) Tracker: continuous HMMER and BLAST searching. Bioinformatics 21, 388–389.
Supporting information
The following supplementary material is available:
Table S1. List of flavin-dependent proteins (FMN, FAD, riboflavin and derivatives).
Table S2. Covalent attachment of FMN and FAD.
Table S3. Protein families (PFAM) used for structural classification of flavin-dependent proteins.
This supplementary material can be found in the online version of this article.
Please note: As a service to our authors and readers, this journal provides supporting information supplied by the authors. Such materials are peer-reviewed and may be re-organized for online delivery, but are not copy-edited or typeset. Technical support issues arising from supporting information (other than missing files) should be addressed to the authors.