Adaptive expansion of the maize maternally expressed gene (Meg) family involves changes in expression patterns and protein secondary structures of its members



RESEARCH ARTICLE (Open Access)


Yuqing Xiong1, Wenbin Mei2, Eun-Deok Kim3, Krishanu Mukherjee1, Hatem Hassanein1, William Brad Barbazuk2, Sibum Sung3, Bryan Kolaczkowski1* and Byung-Ho Kang1*

Abstract

Background: The Maternally expressed gene (Meg) family is a locally-duplicated gene family of maize which encodes cysteine-rich proteins (CRPs). The founding member of the family, Meg1, is required for normal development of the basal endosperm transfer cell layer (BETL) and is involved in the allocation of maternal nutrients to growing seeds. Despite the important roles of Meg1 in maize seed development, the evolutionary history of the Meg cluster and the activities of the duplicate genes are not understood.

Results: In maize, the Meg gene cluster resides in a 2.3 Mb-long genomic region that exhibits many features of non-centromeric heterochromatin. Using phylogenetic reconstruction and syntenic alignments, we identified the pedigree of the Meg family, in which 11 of its 13 members arose in maize after allotetraploidization ~4.8 mya. Phylogenetic and population-genetic analyses identified possible signatures suggesting recent positive selection in Meg homologs. Structural analyses of the Meg proteins indicated potentially adaptive changes in secondary structure from α-helix to β-strand during the expansion. Transcriptomic analysis of the maize endosperm indicated that 6 Meg genes are selectively activated in the BETL, and younger Meg genes are more active than older ones. In endosperms from B73 by Mo17 reciprocal crosses, most Meg genes did not display parent-specific expression patterns.

Conclusions: Recently-duplicated Meg genes have different protein secondary structures, and their expression in the BETL dominates over that of older members. Together with the signs of positive selection in the young Meg genes, these results suggest that the expansion of the Meg family involves potentially adaptive transitions in which new members with novel functions prevailed over older members.

Background

Transfer cells in plants mediate solute transport between the apoplast and the symplast. One structural feature of plant transfer cells is the extensive secondary cell wall growth, which increases the plasma membrane surface area and is thought to facilitate rapid solute transport across the plasma membrane [1]. In agreement with their solute exchange activity, transfer cells are typically observed in sink or source tissues in the vicinity of vascular tissues. At the base of the maize endosperm, a layer of transfer cells faces the maternal placento-chalazal zone [2]. Seed development in maize is dependent on nutrient transfer through this cell layer, termed the basal endosperm transfer cell layer (BETL).

Cysteine rich proteins (CRPs) constitute a large superfamily of small, secreted proteins abundant in eukaryotes [3,4]. CRPs are involved in both cell-signaling [5,6] and antimicrobial processes [7]. In plants, cell-cell communications mediated by secreted CRPs contribute to stomata differentiation [8], to guiding pollen tube growth [9], to self-incompatibility [10], and to patterning embryo development [11]. The BETL in the maize endosperm also secretes multiple types of CRPs, including basal endosperm transfer layer 1 (BETL-1), 2 (BETL-2) and 4 (BETL-4) [12], BAP [13], and maternally expressed gene 1 (Meg1) [14]. It was shown that a MYB-like transcription factor that plays a key role in BETL development, ZmMRP-1, is involved in expression of BETL-1, BETL-2, and Meg1 [14-17]. Given that the BETL is at the maternal-filial interface, these CRPs may protect developing seeds from maternally-transmitted pathogens [18]. It is also possible that some BETL CRPs serve as extracellular signal molecules that coordinate the supply of maternal nutrients during seed development [3].

* Correspondence: bryank@ufl.edu; bkang@ufl.edu
1 Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
Full list of author information is available at the end of the article

© 2014 Xiong et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article,

Xiong et al. BMC Plant Biology 2014, 14:204
http://www.biomedcentral.com/1471-2229/14/204

The Meg1 gene is required for normal development of the BETL, and elevated expression of Meg1 increases BETL sizes and seed biomass. Interestingly, ectopic expression of Meg1 drives the expression of BETL-specific genes such as ZmMRP-1 and INCW2 in non-BETL endosperm cells. Because Meg1 is a maternally expressed imprinted gene, and the effects of Meg1 are dosage dependent, the promotion of nutrient uptake by Meg1 provides evidence that nutrient uptake during seed development is under maternal control [19,20]. The enhanced nutrient allocation resulting from Meg1 over-expression suggests that the Meg1 protein contributes to establishing the sink strength of developing seeds by controlling BETL. A group of CRPs, termed Embryo Surrounding Factor 1 (ESF1), play roles similar to Meg1 in Arabidopsis. The suspensor at the base of the embryo is involved in nutrient transport in Arabidopsis, and ESF1s produced from the central cells and endosperm cells promote suspensor development [11].

Homologs of Meg1 are also transcribed in the developing endosperm [14]. We have shown that these Meg1 homologs are among the most highly-expressed genes in the BETL [21]. The existence of active Meg1 homologs raises questions about how this family arose and whether various Meg1 homologs play similar or different functional roles. In this study, we identify the global complement of functional and non-functional Meg family genes in maize and in the closely-related sorghum outgroup; we use a combination of phylogenetic and population-genetic techniques to characterize selection pressures across these genes and link selection to changes in gene expression and protein structure. We find that the Meg gene family expanded rapidly in maize, with some evidence suggesting that positive selection may have driven changes in protein structure. Our analysis indicates that more recent duplicates exhibit higher expression levels, more extensive structural changes, and stronger evidence for adaptation than do older duplicates, suggesting that newer, functionally different Meg homologs may have prevailed over older homologs during recent adaptation.

Results and discussion

Identification of Meg genes in maize

The Meg1 gene in maize is a member of the large Meg/Ae1 supergroup of CRPs consisting of 17 subgroups sharing a simple CXCC motif but little detectable sequence similarity [4]. We focused our attention on the subgroup CRP5420, which includes Meg1 and other members that contain the cysteine motif CX(6)CX(4)CYCCX(14)CX(3)C and exhibit conserved amino acid sequence. Based on sequence conservation, we identified 13 loci in the B73 maize genome homologous to Meg1, including Meg2, Meg3, Meg4, and Meg6 that have been identified previously together with Meg1 [14]. The B73 genome does not contain any open reading frame that matches Meg5. We named 8 new members Meg7 to Meg14 according to their chromosome position. The seven loci upstream of Meg1 were named Meg7 to Meg13 from proximal to distal to the Meg1 gene, and the locus downstream of Meg1 was named Meg14 (Additional file 1: Table S1).

The Meg1 gene consists of two coding exons separated by a single intron and an upstream promoter required for specific expression in basal endosperm transfer cells (BETCs) [14]. We found that the complete Meg1 gene architecture is shared by 8 Meg homologs (Figure 1A). Exceptions were Meg7, Meg8, Meg3, Meg10 and Meg14. Meg14 has the two canonical exons but its promoter is distinct from that of Meg1. The first coding exon is missing from Meg10 and Meg8. Meg8 does not appear to have promoter elements, suggesting that it may not be transcribed. The flanking sequences of Meg8 and Meg10 suggest that disruption of the two genes has been caused by non-homologous end joining. Meg7 has the two coding exons, but its promoter is dislocated ~6.2 kb upstream from the first exon by a transposon insertion. The structure of Meg3 is abnormal in that it has multiple regulatory elements and extra exons that are disarranged.
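The CRP5420 cysteine motif quoted in this section, CX(6)CX(4)CYCCX(14)CX(3)C, translates directly into a regular expression if X(n) is read as "any n residues". The sketch below is only an illustration of that reading; it is not the search procedure used in the study, and the function name and the toy sequence are hypothetical.

```python
import re

# CX(6)CX(4)CYCCX(14)CX(3)C as a regular expression:
# X(n) ("any n residues") becomes ".{n}".
CRP5420_MOTIF = re.compile(r"C.{6}C.{4}CYCC.{14}C.{3}C")

def find_motif(protein):
    """Return (start, end) of the first motif match in a protein string, or None."""
    match = CRP5420_MOTIF.search(protein.upper())
    return (match.start(), match.end()) if match else None

# Synthetic sequence (NOT a real Meg protein): a 5-residue prefix,
# one motif instance built from filler residues, and a short suffix.
toy = ("MKAAA" + "C" + "A" * 6 + "C" + "G" * 4 + "CYCC"
       + "S" * 14 + "C" + "T" * 3 + "C" + "KK")

print(find_motif(toy))
```

A pattern match of this kind only checks cysteine spacing; the study additionally required conserved amino acid sequence to call a locus a Meg homolog.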

Clustering of maize Meg genes

All 13 Meg loci reside on maize chromosome 7S, between the molecular markers p-asg8 and p-asg34. When compared with chromosome regions where gene density is high or where local gene duplicates are concentrated, this Meg region exhibits several distinct features. First, rather than tightly clustering in a genic island like other maize gene clusters [22], the thirteen loci of the Meg family are spread over a genomic region of ~800 kb (Figure 1B). Also, gene density is lower in the Meg region than in other genic regions of the maize genome; the average distance between neighboring Meg genes is 62 kb, larger than the average interval between similar locally-duplicated genes such as p1, rp1, zein, kn1, pl1, a1-b, or rp3 (Additional file 2: Table S2). The density of genes in the Meg cluster is even lower than the average gene density of the entire maize genome (one gene/52 kb, based on the filtered gene content of the 2066 Mb RefGen_v2 whole genome assembly in http://maizesequence.org/). The overly-dispersed nature of the Meg gene cluster is striking, considering the general tendency of maize genes to concentrate in tightly-integrated gene islands [22].
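The density figures in this paragraph can be checked with simple arithmetic. The sketch below assumes that the 62 kb spacing corresponds to the ~800 kb region divided by the 13 loci; the text does not state how the value was computed, so this is one plausible reading rather than the authors' calculation. All variable names are illustrative.

```python
# Figures quoted in the text.
MEG_REGION_KB = 800        # approximate span of the Meg cluster
MEG_LOCI = 13              # number of Meg loci in B73
GENOME_MB = 2066           # RefGen_v2 assembly size
GENOME_KB_PER_GENE = 52    # genome-wide average spacing quoted in the text

def kb_per_gene(region_kb, n_genes):
    """Average amount of sequence per gene in a region, in kb."""
    return region_kb / n_genes

# Assumed derivation of the ~62 kb spacing (region length / number of loci).
meg_spacing_kb = kb_per_gene(MEG_REGION_KB, MEG_LOCI)

# Gene count implied by one gene per 52 kb across a 2066 Mb assembly.
implied_gene_count = GENOME_MB * 1000 / GENOME_KB_PER_GENE

print(f"Meg spacing: {meg_spacing_kb:.1f} kb per locus")
print(f"Implied genome-wide gene count: {implied_gene_count:.0f}")
```

Under this reading the Meg region carries roughly one locus per 62 kb against a genome-wide average of one gene per 52 kb, which is the comparison the paragraph draws.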

Approximately 85% of the maize genome consists of transposable elements, with gypsy transposons tending to predominate in gene-poor heterochromatic regions [23] and Mutator transposons tending to predominate in genic regions and in open chromatin [24]. In contrast to this general pattern across the maize genome, gypsy transposons account for 75% of the total length of all transposable elements in the 800-kb Meg region, and Mutator transposons are completely absent from this region (Figure 1B).

Figure 1 Gene structures and genomic arrangement of the 13 Meg genes in maize. (A) Meg genes and their flanking regions are aligned to illustrate their gene structures. Promoters and exons of Meg genes are depicted as red and blue rectangles, respectively. Note that Meg14 is missing the canonical Meg promoter. Each superfamily of transposons is shown as a rectangle with the following color codes: xilon-digus, yellow; prem1, orange; ji, brown. The transposon insertions within 10 kb upstream and 5 kb downstream of each gene model are shown. All of the Meg genes except Meg1, Meg13 and Meg14 have xilon-digus on their 5' side and CACTA sequences on their 3' side (asterisks). Two putative H-type thioredoxins downstream of Meg14 and SbMeg2 are colored light blue. All other regions are colored gray. All components of the region were drawn to scale according to their physical sizes. (B) The 800 kb region in chromosome 7S that contains the 13 Meg genes is detailed. Color codes for the 6 main elements in the region are provided under the diagram.

Chromosomal recombination tends to occur often in euchromatin but is suppressed in heterochromatin [25]. Consistent with the presence of the gypsy heterochromatic-marker transposons and highly-dispersed genes, the 2.3-Mb genomic region containing the Meg cluster (from 10.85 to 13.86 Mb of chromosome 7S) shows little recombination, spanning < 1 centimorgan (cM) of genetic distance (Liu et al. [24]; http://www.maizegdb.org). The 3.3-Mb region upstream (from 7.38 to 10.68 Mb) and the 3.7-Mb region downstream (from 13.92 to 17.05 Mb) flanking the low-recombining Meg region represent ~15.8 and ~8.5 cM of genetic distance, respectively, suggesting that the region surrounding the Meg gene cluster represents a localized region of reduced recombination. Taken together, these data suggest that the Meg gene region displays characteristics of maize non-pericentromeric heterochromatin.
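To make the contrast in recombination concrete, the genetic distances quoted above can be expressed as cM per Mb. This is a minimal sketch that uses the interval sizes exactly as stated in the text (2.3, 3.3 and 3.7 Mb) and treats the "< 1 cM" value for the Meg region as an upper bound of 1 cM; the dictionary keys and function name are illustrative.

```python
def cm_per_mb(genetic_cm, physical_mb):
    """Recombination rate as genetic distance per unit of physical distance."""
    return genetic_cm / physical_mb

# (genetic distance in cM, physical size in Mb), as quoted in the text.
regions = {
    "upstream flank": (15.8, 3.3),
    "Meg region (upper bound)": (1.0, 2.3),   # text reports "< 1 cM"
    "downstream flank": (8.5, 3.7),
}

rates = {name: cm_per_mb(cm, mb) for name, (cm, mb) in regions.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} cM/Mb")
```

Even with the upper bound, the Meg region recombines several-fold less per megabase than either flank.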

We found that all members of the Meg cluster, except Meg1 and Meg14, are surrounded by homologous 5' and 3' flanking sequences (Figure 1A). The lengths of the homologous flanking sequences vary from a few hundred base pairs to more than 5 kb. The 5' flanking sequences of nine genes (Meg2, 3, 4, 6, 8, 9, 10, 11, and 12) contain xilon-digus retrotransposons, which vary in length. In contrast, Meg13 and Meg1 have prem1 retrotransposon insertions at the beginning of their 5' flanking sequences. The 3' flanking sequences of all Meg genes, except Meg14, are homologous. Meg14 is peculiar in that the flanking sequences on both sides are not homologous to any of the other 12 Meg genes, suggesting that it may have a unique origin. The general homology of the sequences surrounding the Meg genes suggests that expansion of the Meg family can be primarily attributed to unequal crossover and insertion of transposable elements that left characteristic signatures up- and down-stream of duplicate genes.

Evolutionary history of Meg genes

The Meg gene cluster resides exclusively on chromosome 7S in maize. We searched the public databases to identify homologs of Meg genes in other grass species. Two open reading frames in sorghum (Sorghum bicolor) displayed strong sequence similarity with Meg1 and other members of the maize Meg gene cluster, and one gene in foxtail millet (Setaria italica) was identified as a potential homolog. We found no homologs in rice or other closely-related species, suggesting that Meg genes originated before the sorghum/maize split but after the Panicoideae group diverged from other grass species [PMID: 22580950]. Although Meg1-related peptides of Arabidopsis, ESF1s, have been identified and functionally characterized [11], there is no detectable sequence similarity between ESF1s and the genes identified in maize and other grass species, aside from their conserved patterns of cysteine residues. Short secreted peptides such as Meg typically evolve very rapidly, making the determination of precise phylogenetic relationships across large timescales difficult. We therefore restricted our analyses to those Meg homologs displaying reliable sequence similarity, although the actual evolutionary origin of this gene family is likely to have been much earlier.

Using sequence similarity to Meg genes and to other genes flanking the maize Meg cluster, we identified regions in the maize, sorghum, and rice genomes that are homologous or homeologous to the 800-kb Meg-containing region. The maize Meg genes and their sorghum homologs reside exclusively in a syntenic block conserved throughout grass genomes (Additional file 3: Figure S1). Gene colinearity is well-retained in the syntenic blocks of maize, sorghum and rice, although the 4-Mb region of maize chromosome 7S containing the Meg genes is five times larger than the corresponding region in rice, which lacks Meg homologs. The complete lack of Meg genes in the homeologous region of maize chromosome 2 suggests that the duplication events in the Meg family happened only in chromosome 7, primarily after allotetraploidization ~4.8 million years ago (mya) [26,27], while the Meg copies in chromosome 2 were lost.

In order to confirm that the expansion of the Meg gene family is not an anomaly of the B73 inbred line, we estimated copy numbers in six additional maize cultivars. All Meg loci were amplified from each cultivar, and amplicons were sequenced to determine whether the specific polymorphisms in each Meg gene were present in the amplicons (Additional file 3: Figure S2). With few exceptions, all six inbred lines share the complete complement of Meg genes, suggesting that Meg gene family expansion probably occurred before the establishment of modern maize cultivars. Further supporting this hypothesis, we were able to confirm all the Meg homologs from teosinte (Zea mays ssp. parviglumis), suggesting that the Meg gene cluster had fully expanded before maize was domesticated from its wild ancestor, ca. 4000–10,000 years ago (Additional file 3: Figure S2).

We reconstructed the phylogeny of Meg family genes using maximum likelihood, with the distantly-related foxtail millet Meg gene used as an outgroup. The resulting phylogeny identified a large clade consisting of the 12 B73 Meg genes and one of the sorghum Meg homologs (SbMeg1), separated from Meg14 and the other sorghum homolog (SbMeg2) with strong statistical support (Figure 2A). Maize Meg14 and sorghum SbMeg2 share homologous downstream flanking sequences and a nearby putative thioredoxin H gene (Figure 1A), further supporting their grouping. Together, these data suggest that maize Meg14/SbMeg2 may have diverged from the maize Meg1-13/SbMeg1 clade after the maize/sorghum group split from millet but prior to the maize/sorghum divergence.

In addition to outgroup rooting using the foxtail millet Meg sequence, we used gene-tree/species-tree reconciliation to estimate the rooted phylogeny by minimizing gene gain/loss events [31]. The most parsimonious rooting (Figure 2A) supports the view that two Meg genes were present in the common ancestor of maize and sorghum. One of these ancestral genes was retained as a single copy in both species (maize Meg14/SbMeg2), while the other ancestor underwent a series of at least two rapid expansions in the maize genome. Maize Meg1 falls at the base of the maize-specific expansion and is separated from the other Meg homologs with strong support. Meg1 is also located downstream from the other maize-specific Meg genes (Figure 1A), suggesting that the Meg1 gene was probably the original progenitor of the maize expansion that would have occurred through a series of “upstream” duplication events. The consistency between phylogenetic “age” and chromosome position supports this general model, with genes closer in physical location to Meg1 tending to fall toward the base of the Meg phylogeny (see Figures 1B and 2A).

To date the time of Meg gene duplications, we reconstructed the maximum likelihood phylogenetic tree using a molecular clock calibrated with a maize-sorghum divergence time of ~11.9 mya [26]. Consistent with the absence of Meg genes on maize chromosome 2, molecular-clock analysis suggested that Meg gene expansions occurred after maize allotetraploidization (Additional file 3: Figure S3). According to this analysis, the majority of Meg genes (Meg2-11) appeared very recently through a rapid series of duplication events that cannot be resolved phylogenetically (i.e., approximately 0.90–1.58 mya). Meg12 was inferred to have arisen ~1.77–2.77 mya, and the oldest duplicates following the maize-sorghum split, Meg1 and Meg13, arose ~3.07–4.80 mya, right after maize allotetraploidization. Although we are cautious in our assignment of concrete dates to these duplication events, as molecular-clock assumptions are likely to be violated, these results suggest a model in which the Meg gene cluster expanded rapidly in maize after allotetraploidization (~4.8 mya) but before domestication (~4000–10,000 years ago). These results are corroborated by examination of synteny and phylogenetic analyses (Figure 2A, Additional file 3: Figure S1), which do not rely on molecular-clock assumptions.
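The dating logic described above can be illustrated with the simplest form of clock calibration: under a strict clock, node ages scale linearly with node heights, so one dated node fixes the rate. The sketch below assumes a strict clock and uses hypothetical node heights; it is not the dating procedure or software used in the study, and only the 11.9 mya calibration comes from the text.

```python
CALIBRATION_MYA = 11.9  # maize-sorghum divergence time used in the text

def node_age(node_height, calibration_height, calibration_mya=CALIBRATION_MYA):
    """Age of a node under a strict molecular clock.

    node_height and calibration_height are distances from the tips
    (substitutions per site) on an ultrametric tree; the calibration
    node is the one whose age is known.
    """
    if calibration_height <= 0:
        raise ValueError("calibration height must be positive")
    rate = calibration_height / calibration_mya   # substitutions/site per My
    return node_height / rate

# Hypothetical heights, for illustration only (not values from the paper).
example_age = node_age(node_height=0.06, calibration_height=0.12)
print(f"{example_age:.2f} mya")
```

Because the ages depend entirely on the assumed rate constancy, the synteny and topology evidence cited in the paragraph is a useful independent check.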

Evidence for positive selection driving changes in Meg protein secondary structure

Functional divergence of cysteine rich proteins (CRPs) has often been linked to gene duplication followed by positive selection acting to alter protein function [32-34]. We used statistical analyses based on examining the ratio of nonsynonymous to synonymous substitutions in order to characterize the possible role of adaptive processes in shaping the protein functions of maize Meg homologs. These analyses identified a single branch on the phylogeny as exhibiting strong evidence for protein-coding adaptation, the branch uniting Meg3-9, which represents the most recent maize-specific expansion event (p < 0.05 after correcting for multiple tests; Figure 2A).

Branch-sites analysis further identified two amino-acid substitutions on the Meg3-9 branch that appear to have been driven by positive selection (Figure 2B). These substitutions replace a conserved AK motif next to the first conserved cysteine with a VV motif, altering the size, charge and hydrophobicity of this region. An additional unusual Arg to Trp substitution in Meg6 in front of the same cysteine residue suggests that this position may represent a “hotspot” of Meg protein functional differentiation.
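Codon-based tests of this kind are usually evaluated with a likelihood-ratio test between a null model and a model allowing positive selection on the branch of interest, followed by a multiple-testing correction when several branches are tested. The sketch below is a generic illustration under two assumptions that the text does not confirm: a χ² reference distribution with one degree of freedom, and a Bonferroni correction. The log-likelihood values are hypothetical.

```python
import math

def lrt_pvalue_df1(lnl_null, lnl_alt):
    """P-value of a likelihood-ratio test against chi-square with 1 d.f.

    For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return math.erfc(math.sqrt(stat / 2.0))

def bonferroni(pvalues):
    """Bonferroni-adjusted p-values (assumed correction, capped at 1)."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

# Hypothetical log-likelihoods for three tested branches: (null, alternative).
branches = [(-2310.4, -2303.1), (-2310.4, -2309.9), (-2310.4, -2310.4)]
raw = [lrt_pvalue_df1(null, alt) for null, alt in branches]
adjusted = bonferroni(raw)

for p_raw, p_adj in zip(raw, adjusted):
    print(f"raw p = {p_raw:.4g}, adjusted p = {p_adj:.4g}")
```

Only the first hypothetical branch would remain significant at 0.05 after adjustment, which mirrors the situation reported here of a single branch passing the corrected threshold.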

Figure 2 Phylogenetic analyses of maize Meg genes identify adaptive amino acid substitutions. (A) We reconstructed maximum likelihood phylogenies from protein and corresponding DNA sequence data. SH-like aLRT support [28] at key nodes is shown for protein sequence data with and without Gblocks [29] processing to remove unreliable alignment positions (top row) and DNA alignments with and without Gblocks processing (bottom row). Nodes having <0.8 SH-like aLRT support in any analysis are collapsed, and the tree is rooted using gene-species tree reconciliation to minimize duplication/loss events. A blue star indicates significant support for adaptive substitutions in that specific branch (p < 0.05 after correcting for multiple tests), inferred using codon-based analysis (see Methods). (B) We plot amino-acid substitutions inferred as adaptive by branch-sites analysis (Zhang et al.) [30] along the alignment of Meg protein sequences (green arrows). Biochemical properties of amino acids are marked as pink for hydrophilic polar, green for hydrophilic polar uncharged, red for hydrophilic polar basic, and blue for hydrophobic nonpolar amino acids. Conserved cysteine residues are highlighted in orange.

Although crystal structures to support homology modeling of Meg proteins are not available, we characterized secondary structures of Meg proteins to identify possible structural consequences of amino-acid substitutions. We found that there was a general reduction in the proportion of α-helices and a corresponding increase in β-strands during the maize-specific Meg family expansion (Table 1, Figure 3). For example, the oldest Meg proteins, Meg1 and Meg14, were predicted to contain 52.81% and 45.45% α-helices, respectively. In contrast, the youngest proteins, Meg9, Meg2 and Meg6, were 35.63%, 36.36% and 36.36% helix, respectively (Table 1). The α-helix content of the evolutionary intermediates, Meg13 and Meg4, fell between those of the oldest and youngest genes (i.e., 38.64% and 37.50%, respectively). Proportions of β-strand displayed the opposite trend, with β-strand proportion increasing from oldest to youngest (Table 1).
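Percentages like those quoted above are obtained by counting per-residue predicted states and dividing by protein length. The sketch below assumes a three-state string encoding (H for helix, E for strand, C for coil), which is a common convention but not necessarily the output format of the prediction server used in the study; the example string is synthetic.

```python
from collections import Counter

def ss_composition(states):
    """Percent of residues in each predicted state.

    states: one label per residue, using the assumed encoding
    H = alpha-helix, E = beta-strand, C = random coil.
    """
    if not states:
        raise ValueError("empty prediction string")
    counts = Counter(states)
    n = len(states)
    return {label: 100.0 * counts.get(label, 0) / n for label in "HEC"}

# Synthetic 8-residue prediction (not a Meg protein).
composition = ss_composition("HHHHEECC")
print(composition)
```

Comparing such compositions across paralogs of similar length, as done in Table 1, gives a coarse summary; the per-position view in Figure 3 is needed to see where along the protein the helix is lost.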

We are cautious in our interpretation of secondary-structure predictions, as modern methods only achieve ~80% accuracy [http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6217208]. However, it is interesting to note that localized changes in predicted protein secondary structure correlate strongly with the specific amino acids identified as being under positive selection (Figure 3). This protein region forms the first α-helix of the mature peptide in Meg1 and Meg14. The region surrounding the adaptive changes is predicted as disordered in the intermediate-aged Meg4 and Meg13, leading to an overall reduction in the length of this first α-helix. In the more recently derived Meg2, Meg6, and Meg9, the first α-helix is predicted as completely missing and is replaced by a conserved β-strand (Figure 3). Overall, these results suggest that the N-terminal region of maize Meg proteins has undergone a systematic and directional structural reorganization throughout the expansion of the Meg gene family. Although the absence of 3D structural data and the low accuracy of secondary structure prediction limit our ability to draw strong conclusions about how changes in Meg protein sequence may have changed protein function, the confluence of adaptive protein-coding changes and alteration of predicted secondary structures does suggest that these evolutionary changes have altered Meg protein function in some way.

Evidence for recent selective sweeps in the maize Meg gene cluster

To investigate the possible role of recent selective sweeps in maize Meg gene evolution, we analyzed maize polymorphism data [35,36] using a composite-likelihood method to identify population-level adaptation [37]. We found that the Meg region had the strongest signature of an adaptive sweep across the entire distal 30 Mb of maize chromosome 7S (Figure 4A). Although we are cautious about the ability of these methods to identify the precise locations of selective sweeps across the genome [37], we note that the strongest support for population-level adaptation localized to Meg9–10 and just upstream of Meg1 and Meg7 (Figure 4B). The functional consequences of these putative adaptive sweeps remain unknown, although these results do suggest that the maize Meg gene cluster may have experienced recent positive selection, further supporting a general model of maize adaptation through Meg gene family expansion and diversification.

It is impossible to draw definitive conclusions about adaptive changes in protein function from phylogenetic and population-genetic analyses alone, so we consider these conclusions speculative at this point. However, we note that the combination of statistical evidence for elevated nonsynonymous/synonymous substitution ratios, nonconservative amino-acid substitutions, localized changes in predicted secondary structure, and population-genetic evidence for possible selective sweeps all argue in favor of a model in which adaptation has played a role in the maize Meg gene expansion.

Expression profiles of Meg genes

To determine transcription profiles of Meg genes in the endosperm, we measured mRNA levels from basal endosperm transfer cells (BETCs), starchy endosperm cells (SECs) and peripheral endosperm (PE) containing aleurone cells at three developmental stages (Figure 5A). We found that the transcript levels of six Meg genes (Meg1, Meg2, Meg4, Meg6, Meg9, and Meg13) are significantly higher than those of other Meg genes (unpaired t test: two-tailed p < 0.0001) (Figure 5B). These genes are all highly expressed specifically in BETCs at 8, 12 and 16 days after pollination (DAP) (FPKM > 4800), with the three consecutive Meg genes, Meg2, Meg6 and Meg9, being the most highly transcribed (Figure 5B). In contrast to these highly-expressed Meg homologs, five Meg genes showed negligible transcription levels across all cell types and time points (Meg7, Meg8, Meg3, Meg10 and Meg14, FPKM < 365), and the two remaining Meg genes, Meg11 and Meg12, had intermediate levels of transcription, specifically in BETCs (FPKM = 1368 and 1910).

Table 1 Composition of secondary structures in Meg proteins
Types of secondary structure: α-helix, β-strand, random coils

These differences in the transcript levels of Meg genes correlate well with preservation of gene integrity in the Meg genes. The promoter and/or the two canonical exons are disrupted in the five Meg genes with low FPKM values (Figure 1). Meg11 and Meg12 exhibit intermediate transcript levels and appear to have the canonical Meg gene structure. However, Meg11 has a 22 bp deletion in its promoter, and Meg12 contains a frameshift mutation, which may affect the stability of its transcript. Meg12 has been annotated as a pseudogene (www.maizesequence.org).

Despite the large variation in transcript levels, all Meg genes displayed similar spatiotemporal expression patterns. Their transcripts were strictly confined to BETCs, and transcription levels were highest at 8 DAP, but decreased thereafter (Figure 5B). These results suggest that the expansion of the Meg gene family in maize does not include diversification of expression patterns but does include variation in expression level across homologs, with more recently-derived intact genes generally having higher expression levels.
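The expression levels in this section are reported as FPKM (fragments per kilobase of transcript per million mapped fragments). As a reminder of what the unit normalizes for, the standard definition is sketched below; the input numbers are hypothetical, and the function is not the quantification pipeline used in the study.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

# Hypothetical example: 1,000 fragments on a 1 kb transcript
# in a library of 1 million mapped fragments.
example = fpkm(1000, 1000, 1_000_000)
print(example)
```

Because the measure is scaled by transcript length, short transcripts such as those of the Meg genes reach high FPKM values with comparatively few fragments, which is worth keeping in mind when comparing against longer genes.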

To further examine expression of Meg genes at the pro-tein level, we searched the Atlas of Maize Proteotypes database (http://maizeproteome.ucsd.edu), where re-sults from proteomic analyses of maize seed tissues are cataloged Peptides were identified from six Meg genes, corresponding to the six genes with the highest transcript concentrations in the endosperm (Figure 5C) Peptides from the other 7 Meg genes were absent from the data-base Furthermore, the protein abundance of highly-expressed Meg genes peaked at 8–10 DAP and reduced thereafter, in agreement with their transcript levels Because Meg1 is a maternally expressed imprinted gene, we examined imprinting status of other Meg genes from publicly available transcriptome datasets generated

by reciprocal crosses of B73XMo17 [38-40] Meg1 expres-sion is maternally imprinted at 4 DAP but it becomes bial-lelic at 12 DAP [14] The transcriptome datasets were generated from endosperm samples at 7 DAP and 10 DAP, before Meg1’s imprinted expression disappears First,

we compared coding sequences of all Meg genes to deter-mine their single nucleotide polymorphisms (SNPs) in B73 and in Mo17 inbred lines We were able to identify SNPs in 8 Meg alleles of B73 and Mo17 (Additional file 3:

Figure 3 Meg protein secondary structure has changed over the maize-specific gene family expansion The secondary structures of Meg proteins were predicted using different algorithms on the Network sequence analysis server (NPS@, Network Protein Sequence Analysis, http:// npsa-pbil.ibcp.fr) The α-helix, β-strand and disordered loop regions are denoted by the longest, the second longest and the second shortest bars, respectively The shortest bars represent residues with ambiguous states The symbols of positively selected amino acids are shown above the corresponding bars Gaps were introduced according to the amino acid sequence alignment in order to align secondary structural elements for visualization The figure illustrates amino acid sequences of Meg genes whose coding sequences are intact.

Trang 8

Figure S4) and maternal to paternal expression ratios of the 8 genes were available in the dataset by Xin et al. [39]. Unlike Meg1, none of the 8 genes exhibited parent-of-origin-specific expression. Instead, Mo17 alleles of Meg2, Meg7, and Meg11 displayed strong dominance over those of B73, while B73 alleles of Meg3, Meg4, and Meg13 overwhelmed those of Mo17 (Figure 6A). Meg6 and Meg12 did not exhibit allele-specific expression patterns. No SNPs were identified in B73 and Mo17 alleles of Meg1, Meg3, Meg9 and Meg10, and we were not able to find information about their parent-of-origin-specific expression in the datasets. Expression data of Meg3, Meg4, and Meg13 were available from Waters et al. [38], and they were consistent with the results in Figure 6A. These results suggest that parent-of-origin-specific expression of Meg1 is not conserved in the 8 Meg duplicates that we examined in the B73XMo17 expression datasets.
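The distinction drawn above between parent-of-origin-specific expression and allele-specific dominance can be illustrated with a minimal sketch. It assumes allele-specific read counts from the two reciprocal crosses and uses the 3:1 boundary (a maternal fraction of 0.75) described for Figure 6A. The class, function names, category labels and read counts are hypothetical and are not taken from the datasets analyzed here.

```python
from dataclasses import dataclass


@dataclass
class ReciprocalCounts:
    """Allele-specific read counts for one gene (hypothetical values)."""
    b73_in_bxm: int   # B73 (maternal) reads in the B73 x Mo17 cross
    mo17_in_bxm: int  # Mo17 (paternal) reads in the B73 x Mo17 cross
    b73_in_mxb: int   # B73 (paternal) reads in the Mo17 x B73 cross
    mo17_in_mxb: int  # Mo17 (maternal) reads in the Mo17 x B73 cross


def maternal_fractions(c: ReciprocalCounts) -> tuple[float, float]:
    """Return the maternal share of reads in each reciprocal cross."""
    total_bxm = c.b73_in_bxm + c.mo17_in_bxm
    total_mxb = c.b73_in_mxb + c.mo17_in_mxb
    if total_bxm == 0 or total_mxb == 0:
        raise ValueError("no allele-specific reads in at least one cross")
    return c.b73_in_bxm / total_bxm, c.mo17_in_mxb / total_mxb


def classify(c: ReciprocalCounts, cutoff: float = 0.75) -> str:
    """Label a gene using a 3:1 cutoff (assumed here, adjustable)."""
    bxm, mxb = maternal_fractions(c)
    low = 1.0 - cutoff
    if bxm >= cutoff and mxb >= cutoff:
        return "maternal"        # maternal allele dominates in both crosses
    if bxm <= low and mxb <= low:
        return "paternal"        # paternal allele dominates in both crosses
    if bxm >= cutoff and mxb <= low:
        return "B73-dominant"    # B73 allele dominates whichever parent carries it
    if bxm <= low and mxb >= cutoff:
        return "Mo17-dominant"   # Mo17 allele dominates whichever parent carries it
    return "no bias"
```

Under this scheme, a gene imprinted like Meg1 would be labeled "maternal", whereas a gene whose Mo17 allele prevails in both crosses would be labeled "Mo17-dominant". Because the endosperm carries two maternal genome copies and one paternal copy, a maternal fraction near 2/3 is expected without imprinting, and such a gene falls into the "no bias" category under the 0.75 cutoff.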

The Meg gene region comprises 48 annotations in the B73 genome database (AGPv2, working gene set), including the 13 Meg genes. Among the 35 other annotations, 13 are transposable elements, 11 are pseudogenes or devoid of coding sequences, and 11 are predicted to be protein-coding genes with intact open reading frames. To determine whether the 11 putative protein-coding genes are transcriptionally active in the endosperm, we searched our endosperm transcriptome data using the BLAST program. Transcripts from three genes (GRMZM2G553132, GRMZM2G144653, GRMZM2G150091) were identified as transcribed in endosperm, but their levels ranged from 5% to 20% of the Meg6 transcript (Figures 2B, 6B). GRMZM2G144653 is expressed in all three cell types, while GRMZM2G553132 and GRMZM2G150091 are expressed specifically in BETCs. The high levels of Meg transcripts in BETCs suggest that the Meg region corresponds to a transcriptional “hotspot” in BETCs, even though the region exhibits features of pericentromeric heterochromatin.

Conclusions

The Meg gene family has expanded radically in maize since its divergence from sorghum. However, the functional consequences of this expansion remain unclear. Meg proteins are members of the CRP superfamily, other members of which play diverse roles in cell signaling and defense in eukaryotic cells [3]. Most maize Meg genes are expressed exclusively in the BETL, and it is evident that Meg1 is involved in the control of nutrient transport by promoting BETL formation [20]. Both sorghum and maize have BETLs [41,42], but Meg genes have expanded only in maize. This suggests that the cell-signaling networks controlling seed development and nutrient allocation through the BETL may have diversified in maize. Alternatively, Meg gene-family expansion could function to alter the molecular mechanisms responsible for isolating the developing seed from infections in the maternal tissue in maize. The loss of imprinting in Meg genes is in line with the notion that functional diversity in the Meg family expanded over the course of its evolution. Further examination of the functional roles played by Meg family genes is likely to enhance our understanding of how tandem gene duplication events contribute to species-specific adaptation in plants.

In this study, we examined the evolution of recently-duplicated genes to identify molecular selection by the combined use of phylogenetic and population-genetic analyses and to identify functional differences between duplicates by characterizing their expression, localization, imprinting, and protein structures. We observed changes in coding exons and promoter sequences throughout the Meg gene array in maize, consistent with a model in which mistakes introduced during the production of tandemly-duplicated gene arrays may be an important source of differences in both gene expression and protein function. We expect that a thorough understanding of gene duplication processes will illuminate the potential roles of

Figure 4 Selective sweeps in maize Meg gene region identified by composite-likelihood analysis. We used a spatially-explicit likelihood model to identify recent selective sweeps within the region of maize chromosome 7S containing the Meg gene array from polymorphism data (see Methods). We plot the log-likelihood support in favor of a selective sweep model along chromosome position. A dotted horizontal line indicates the empirically-derived 0.05 significance cutoff, with log-likelihood greater than the dotted line indicating significant support for a selective sweep. (A) We plot support for a selective sweep across the 30-Mb region of chromosome 7S containing the Meg gene region. (B) Close-up of the chromosomal region containing the Meg gene cluster, with each Meg gene's coding sequence indicated.

Figure 5 Specific Meg homologs are highly expressed in maize endosperm. (A) Bright-field micrograph of a maize endosperm at 8 days after pollination (DAP), showing the basal endosperm transfer cell (BETC), peripheral endosperm (PE) and starchy endosperm cell (SEC) layers. These three tissue types were isolated by cryo-microdissection, and gene-specific transcripts were evaluated by RNA-seq. Scale bar: 0.5 ○m. (B) Transcript levels of each Meg gene in the BETC, PE and SEC. The six highly-expressed genes are highlighted in green. Note that Meg transcripts are detected exclusively in BETC. (C) Abundances of Meg proteins in the maize endosperm at three developmental stages. The histogram is based on results from searching maizeproteome.ucsd.edu. Meg proteins not found in the proteome database are omitted from the histogram. The x-axis is scaled to the normalized arbitrary unit according to the maize proteome database.

Figure 6 Imprinting status of Meg genes and endosperm expression patterns of non-Meg genes in the Meg region. (A) Maternal expression ratios of Meg genes in 7 DAP (left panel) and 10 DAP (right panel) endosperms from B73XMo17 reciprocal crosses. The horizontal and vertical dotted lines mark boundaries of 3:1 maternal and paternal expression ratio in each cross. If the maternal allele of a gene is expressed 3 times more than its paternal allele, the gene should appear in the upper right corner (red square). The ratios were calculated from the endosperm transcriptome data by Xin et al. [39]. Expression of Meg genes was not detected in 15 DAP endosperm. (B) Heat map depicting the transcriptional activities in BETCs of genes within a ~9.4-Mb region spanning the Meg gene cluster. Normalized gene expression level (FPKM) was used to generate the graphic. Meg genes are marked with green arrows. The FPKM values of the 6 highly-expressed Meg genes are far larger (>3000) than those of any other genes in the 9.4-Mb interval. Genes with FPKM < 20 in any of the nine samples were omitted from the heat map.
