
Genome-wide characterization and expression analysis of MYB transcription factors in Gossypium hirsutum


DOCUMENT INFORMATION

Basic information

Title Genome-wide characterization and expression analysis of MYB transcription factors in Gossypium hirsutum
Authors Haron Salih, Wenfang Gong, Shoupu He, Gaofei Sun, Junling Sun, Xiongming Du
Institution Chinese Academy of Agricultural Sciences
Field Genetics
Type Research article
Year of publication 2016
City Anyang
Format
Pages 12
File size 1.62 MB


MYB family proteins are one of the most abundant transcription factors in the cotton plant and play diverse roles in cotton growth and evolution. Previously, few studies have been conducted in upland cotton, Gossypium hirsutum. The recent release of the G. hirsutum genome sequence provides a great opportunity to identify and characterize the entire upland cotton MYB protein family.


RESEARCH ARTICLE Open Access

Genome-wide characterization and expression analysis of MYB transcription factors in Gossypium hirsutum

Haron Salih1,2,4, Wenfang Gong1, Shoupu He1, Gaofei Sun3, Junling Sun1* and Xiongming Du1*

Abstract

Background: MYB proteins form one of the most abundant transcription factor families in the cotton plant and play diverse roles in cotton growth and evolution. Previously, few studies have been conducted in upland cotton, Gossypium hirsutum. The recent release of the G. hirsutum genome sequence provides a great opportunity to identify and characterize the entire upland cotton MYB protein family.

Results: In this study, we undertook a comprehensive genome-wide characterization and expression analysis of the MYB transcription factor family during cotton fiber development. A total of 524 non-redundant cotton MYB genes, among 1986 MYB and MYB-related putative proteins, were identified and classified into four subfamilies: 1R-MYB, 2R-MYB, 3R-MYB, and 4R-MYB. Based on phylogenetic tree analysis, MYB transcription factors were divided into 16 subgroups. The results showed that the majority (69.1 %) of GhMYB genes belong to the 2R-MYB subfamily in upland cotton.

Conclusion: Our comparative genomics analysis has provided novel insights into the roles of MYB transcription factors in cotton fiber development. These results provide the basis for a greater understanding of MYB regulatory networks and for developing new approaches to improve cotton fiber development.

Keywords: Comparative genomics analysis, Upland cotton, MYB genes, Fiber development

Background

Plant growth and development are controlled by multi-gene families. Transcription factors play a key role in the regulation of gene transcription and commonly comprise four distinct domains: a DNA-binding domain, a nuclear localization signal, a transcription activation domain, and an oligomerization site [1]. These four domains work together to control many aspects of plant growth and development by activating or suppressing the transcriptional process [2]. Additionally, transcription factors are regularly encoded by multigene families, which makes analyzing their individual roles more complex [1]. Compared with fungi and animals, the MYB transcription factors of higher plants are more broadly dispersed in the genome [3]. The MYB domain is highly conserved among plants, and proteins usually contain between one and four repeats (SANT domains) named R1, R2, R3, and R4. Each repeat comprises 50–53 amino acids that form three α-helices, the second and third of which form a helix–turn–helix fold that serves as the transcription factor DNA recognition site and interacts with the major groove of DNA [5]. Moreover, each repeat contains regularly spaced triplet tryptophan residues that group together to make a hydrophobic core [6]. In contrast, the C-terminal promoter domain of different MYB proteins is quite diverse, leading to the broad variety of regulatory roles of the MYB gene family [7, 8]. The MYB domain was first identified in the avian myeloblastosis virus (v-myb) [9], and three additional MYB genes (c-myb, A-myb, and B-myb) were identified in different organisms such as vertebrates, insects, fungi, and slime molds [7, 10, 11]. The corn C1 gene was the first MYB gene identified in plants, and encodes a c-myb-like transcription factor responsible for the regulation of anthocyanin biosynthesis [12]. Generally, R2R3-MYB domain proteins are the predominant form found in higher plants [8].

* Correspondence: sunjl000@163.com; dujeffrey8848@hotmail.com

1 State Key Laboratory of Cotton Biology/Institute of Cotton Research, Chinese Academy of Agricultural Science (ICR, CAAS), Anyang 455000, China

Full list of author information is available at the end of the article

© 2016 The Author(s). Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver


To date, the plant transcription factor database (http://planttfdb.cbi.edu.cn/) contains approximately 8746 MYB and 6410 MYB-related sequences [13]. These genes may be involved in various plant cell activities such as secondary metabolism, hormone signaling [8, 14], environmental stress, cell development [15], and organ growth [16, 17]. Recently, several genome-wide analyses of MYB transcription factors have been conducted in Arabidopsis, rice [18], maize [19], Salvia miltiorrhiza [20], soybean [21], apple [22], sugarcane [15], and Chinese cabbage [23]. These studies can be utilized to identify MYB transcription factors in other plants, including cotton. However, very little information about MYB gene diversity and abundance in upland cotton is available. Several studies have shown that the MYB transcription factor family has a role in regulating fiber development in cotton. The GhMYB109 and GhMYB2 genes, which belong to the R2R3-MYB subfamily, are associated with positive regulation of cotton fiber development [24, 25]. Moreover, silencing of GhMYB25 results in the production of short fibers in cotton [26], while suppression of GhMYB25-like produces fiber-less cotton [27]. GhCPC has been described as a negative regulator of cotton fiber elongation [28]. A recent report revealed that ten MYB (MIXTA-like) genes were highly expressed during early fiber development in G. hirsutum, whereas expression analysis in three naked-seed (fiber-less) cotton mutants revealed that only one group of MIXTA-like genes had decreased expression levels [29].

Upland cotton is one of the most important fiber crops in the world and provides raw material for the textile industry [30]. Transcriptome analyses showed that several pathways were regulated in developing cotton fibers [31]. The status of those up-regulated or down-regulated pathways, and the molecular mechanisms by which they are controlled, require further investigation [32, 33]. As MYB proteins are one of the largest transcription factor families in higher plants, they may play key roles in regulating diverse pathways in cotton during fiber development [30]. Comprehensive analysis of upland cotton MYB proteins and their evolutionary variations in tetraploid cotton might help to reveal critical molecular mechanisms of cotton development and growth. In addition, the recent release of the G. hirsutum genome sequence [34] provides a great tool to identify and characterize the entire MYB protein family in upland cotton.

Here, we conducted a comprehensive genome-wide analysis of the MYB transcription factor family in G. hirsutum. A total of 524 MYB transcription factor encoding genes were identified and subsequently subjected to systematic analyses including phylogenetic tree analysis, chromosomal location determination, gene structure analysis, RNA sequencing (RNA-seq) analysis, and qRT-PCR analysis of selected MYB genes. Our genome-wide analysis of the GhMYB gene family might contribute to future studies on the functional characterization of MYB proteins in G. hirsutum. These findings will provide information fundamental to determining the molecular and regulatory mechanisms of MYB transcription factors in cotton.

Methods

Identification of MYB gene family in upland cotton

Upland cotton protein sequences were downloaded from the Cotton Genome Project (http://cgp.genomics.org.cn/page/species/download.jsp?category=hirsutum) for computational analysis. Corresponding protein sequences were downloaded from the Arabidopsis database (TAIR; http://www.Arabidopsis.org/), and cacao, maize, Vitis vinifera, Populus trichocarpa and G. raimondii MYB protein sequences were downloaded from the plant transcription factor database (http://planttfdb.cbi.edu.cn/). A local BLASTP search was performed to identify candidate MYB members, using Arabidopsis, cacao, maize, Vitis vinifera, Populus trichocarpa and G. raimondii MYB protein sequences as queries. Hits meeting the e-value threshold of 1e-10 were deemed to be members of the MYB family. Furthermore, to confirm the candidate cotton MYB protein sequences, candidate genes were examined using the domain analysis programs Pfam (Protein family: http://pfam.sanger.ac.uk/) and SMART (Simple Modular Architecture Research Tool: http://smart.embl-heidelberg.de/). All redundant sequences were manually discarded, resulting in 524 MYB protein sequences. Additional analysis was based on Clustal W alignment results.
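The e-value filtering step can be illustrated with a minimal Python sketch. It assumes the BLASTP results were written in the standard tabular format (-outfmt 6), in which the second column holds the subject ID and the eleventh the e-value, and it treats the 1e-10 threshold as "at or below". The sequence IDs and the function name are hypothetical, and this is not the authors' actual script.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # threshold quoted in the text


def candidate_ids(blast_tabular, cutoff=E_VALUE_CUTOFF):
    """Collect subject IDs with at least one hit at or below the e-value cutoff.

    Assumes BLAST tabular output (-outfmt 6): column 2 is the subject ID
    and column 11 is the e-value.
    """
    keep = set()
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        subject_id, evalue = row[1], float(row[10])
        if evalue <= cutoff:
            keep.add(subject_id)
    return keep


# Hypothetical hits: MYB queries from other species vs. upland cotton proteins.
EXAMPLE_HITS = (
    "AtMYB_query1\tCotAD_example1\t62.5\t104\t39\t0\t12\t115\t10\t113\t3e-45\t152\n"
    "AtMYB_query1\tCotAD_example2\t31.0\t58\t40\t0\t20\t77\t5\t62\t2e-04\t38.1\n"
    "TcMYB_query7\tCotAD_example1\t58.3\t108\t45\t0\t9\t116\t8\t115\t1e-39\t140\n"
)

if __name__ == "__main__":
    print(sorted(candidate_ids(io.StringIO(EXAMPLE_HITS))))
```

Collecting IDs in a set removes repeated hits to the same cotton protein, which loosely mirrors the removal of redundant sequences, although the paper reports that this step was done manually.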

Mapping MYB genes on chromosomes

The chromosomal position of all GhMYB genes was determined through BLASTN searches against the G. hirsutum genome (Cotton Genome Project of the Institute of Cotton Research of Chinese Academy of Agricultural Sciences). GhMYB genes were mapped on the chromosomes using the MapChart software. Two types of gene duplication were identified: tandem and segmental duplication events. Gene pairs were considered duplicates provided that the aligned sequence covered >80 % of the longer gene and that the similarity of the aligned regions was >80 % [35]. In addition, to further characterize GhMYB gene duplication events, the synonymous (Ks) and non-synonymous (Ka) substitution rates were calculated using the DnaSP software (version 5.10) [36]. A Ka/Ks calculator was run on those GhMYB gene pairs to estimate their synonymous and non-synonymous rates of evolution. To estimate the evolutionary time of duplicated genes, Ks values were translated into duplication time in millions of years based on a rate of λ substitutions per synonymous site per year. The duplication event time (T) was calculated as $T = \frac{K_s}{2\lambda} \times 10^{-6}$ Mya, with an approximate clock-like rate of $\lambda = 1.5 \times 10^{-8}$ substitutions per synonymous site per year for cotton [37].
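The two numerical rules in this subsection can be written out as a short sketch: the >80 % coverage and similarity test for calling a duplicated pair, and the conversion of Ks into a duplication time with λ = 1.5 × 10⁻⁸. The function and parameter names are illustrative assumptions, not taken from the authors' pipeline.

```python
LAMBDA = 1.5e-8  # substitutions per synonymous site per year (value from the text)


def duplication_time_mya(ks, rate=LAMBDA):
    """T = Ks / (2 * lambda) * 1e-6, expressed in millions of years."""
    return ks / (2.0 * rate) * 1e-6


def is_duplicate_pair(aligned_len, len_a, len_b, similarity_pct):
    """Apply the two >80 % thresholds described in the text.

    aligned_len is compared against the longer of the two genes.
    """
    coverage_pct = 100.0 * aligned_len / max(len_a, len_b)
    return coverage_pct > 80.0 and similarity_pct > 80.0


if __name__ == "__main__":
    for ks in (0.0078, 0.0828, 1.3303, 3.7326):
        print(f"Ks = {ks}: {duplication_time_mya(ks):.3f} Mya")
```

With the Ks values reported later in the Results (0.0078 and 3.7326), this formula gives roughly 0.26 and 124.42 Mya, which agrees with the stated range of duplication times.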

Phylogenetic analysis

The phylogenetic tree of MYB transcription factor genes was generated from multiple sequence alignments of upland cotton, Arabidopsis and cacao MYB protein sequences produced with Clustal W (http://www.ebi.ac.uk/Tools/msa/clustalw2/). Phylogenetic and molecular evolutionary analyses were performed using MEGA 6.0 software (http://www.megasoftware.net) with pairwise distance and the Neighbor-Joining (NJ) method. The tree was constructed with the following parameters: substitution, Poisson model; data subset to use, the p-distance, complete deletion; replication, bootstrap analysis with 1,000 replicates. Moreover, maximum likelihood and minimum evolution methods were also used to validate the result from the NJ method. Additionally, a separate phylogenetic tree was constructed with all the GhMYB protein sequences in G. hirsutum for further analysis.
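To make the "p-distance, complete deletion" setting concrete, the sketch below computes pairwise p-distances on a toy alignment after discarding every column that contains a gap in any sequence. It only illustrates the distance definition; it is not MEGA's implementation, it does not build the NJ tree itself, and the toy sequences are invented.

```python
from itertools import combinations


def p_distances(aligned):
    """Pairwise p-distances after complete deletion of gapped columns.

    `aligned` maps sequence names to equal-length aligned sequences,
    with '-' marking gaps.
    """
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("sequences must be aligned to the same length")
    # Complete deletion: keep only columns with no gap in any sequence.
    kept = [i for i in range(length) if all(aligned[n][i] != "-" for n in names)]
    if not kept:
        raise ValueError("no gap-free columns")
    distances = {}
    for a, b in combinations(names, 2):
        diffs = sum(aligned[a][i] != aligned[b][i] for i in kept)
        distances[(a, b)] = diffs / len(kept)  # proportion of differing sites
    return distances


if __name__ == "__main__":
    toy = {"seqA": "MKR-GLW", "seqB": "MKRAGLW", "seqC": "MRR-GIW"}
    for pair, dist in p_distances(toy).items():
        print(pair, round(dist, 3))
```

A matrix of such distances is the input that a Neighbor-Joining procedure works from; bootstrap support is then obtained by resampling alignment columns and rebuilding the tree.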

Gene structure analysis and identified motifs

MYB genomic and cDNA sequences were obtained from the Cotton Genome Project of the Institute of Cotton Research of Chinese Academy of Agricultural Sciences (http://cgp.genomics.org.cn/page/species/download.jsp?category=hirsutum). The online Gene Structure Display Server (GSDS 2.0) (http://gsds.cbi.pku.edu.cn/index.php) was used to examine gene structure by comparing each cDNA sequence with the corresponding genomic sequence. Conserved protein motifs in G. hirsutum MYBs were identified using the MEME program (version 4.8.2) (http://meme.nbcr.net/meme/intro.html). The following parameters were used: any number of repetitions, a maximum of 20 motifs, and an optimum motif width from 6 to 250.

Plant materials, RNA extraction and qRT-PCR analysis

Seeds of the upland cotton (Gossypium hirsutum L.) Ligon-lintless 1 (Li1) mutant and wild type (TM-1) were provided by the Institute of Cotton Research, Chinese Academy of Agricultural Sciences (CAAS, Anyang, China) and planted in the experimental field at the Institute of Cotton Research under conventional field management conditions. Flowers were tagged for self-pollination one day before anthesis. To detect MYB gene expression, samples were collected from Ligon-lintless1 and wild-type cotton at different stages of cotton fiber development: 0, 3, 5, 8, 10, and 15 days post anthesis (DPA). RNA was extracted from cotton ovules and fibers using the RNAprep Pure Plant Kit (Tiangen). The quality and concentration of each RNA sample were determined using gel electrophoresis and a NanoDrop 2000 spectrophotometer; only samples that met the criteria (260/280 ratio of 1.8–2.1, 260/230 ratio ≥ 2.0) were used for further analysis. RNA samples were treated with DNase I (TaKaRa, Japan) to eliminate contaminating genomic DNA. The cDNA was synthesized using the ReverTra Ace qPCR RT kit (TOYOBO, Japan) according to the manufacturer's manual.

qRT-PCR experiments were conducted to measure expression levels of MYB transcription factor family genes during cotton fiber development. qRT-PCR analysis was performed using the Applied Biosystems 7500 Real-Time PCR system and the SYBR Premix Ex Taq kit (TaKaRa, Japan). Target gene amplification was checked by SYBR Green fluorescence signal. The cotton constitutive β-actin gene was used as a reference gene and specific MYB primers were used for qRT-PCR. The following thermal cycle conditions were used: 95 °C for 2 min, followed by 40 cycles of 95 °C for 5 s, with products collected at 60 °C for 34 s. All reactions were repeated three times with three biological replicates. Expression levels were calculated as the mean signal intensity across the three replicates. Following the PCR, a melting curve analysis was performed.

Ct, the threshold cycle, was used for relative quantification of the input target number. The relative fold difference (N) is the number of treated target gene transcript copies relative to that of the untreated gene transcript copies, and was calculated according to Schmittgen et al. 2001 [38] as follows:

$N = 2^{-\Delta\Delta C_t} = 2^{-(\Delta C_t\,\text{treated} - \Delta C_t\,\text{control})}$

where ΔCt is the difference in threshold cycles between the GhNAC18 target and the GhActin1 internal reference.
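As a worked illustration of the relative quantification step, the sketch below assumes the standard 2^(−ΔΔCt) form, in which ΔCt is the target Ct minus the reference Ct within each sample. The Ct values are invented for the example and the function name is hypothetical.

```python
def relative_fold_change(ct_target_treated, ct_ref_treated,
                         ct_target_control, ct_ref_control):
    """Relative fold difference N = 2 ** -(dCt_treated - dCt_control).

    dCt is the target Ct minus the reference-gene Ct in the same sample.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


if __name__ == "__main__":
    # Invented Ct values: the target amplifies two cycles later (relative to
    # the reference) in the treated sample than in the control.
    n = relative_fold_change(24.0, 18.0, 22.0, 18.0)
    print(f"relative fold difference N = {n}")
```

In this example ΔΔCt is 2, so the target is estimated at one quarter of the control level. Normalizing against the reference gene first is what lets samples with different cDNA input be compared.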

RNA-seq data analysis

To analyze upland cotton MYB expression patterns, we used Illumina RNA-seq data covering five stages of cotton fiber development (−1, 1, 3, 5, and 10 DPA). Expression levels were calculated as reads per kilobase of exon model per million mapped reads (RPKM) (Additional file 1: Table S1). Fold changes in gene expression and the related statistical computations for the two tested conditions were performed using the DESeq R package (1.10.1). The resulting P-values were adjusted using the Benjamini and Hochberg method to control the false discovery rate [39]. Only genes with an adjusted P-value < 0.05 found using DESeq were categorized as differentially expressed. Heat maps were generated and hierarchical clustering was performed using the Genesis software (genesis_v1.7.6.30.09.10-DIGERATI).
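Two of the quantities named above can be sketched in plain Python: the RPKM normalization and the Benjamini and Hochberg adjustment of P-values. This is an illustrative re-implementation under the usual textbook definitions, not the DESeq code path, and the read counts and P-values are invented.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(rpkm(500, 2000, 20_000_000))
    raw = [0.01, 0.04, 0.03, 0.20]
    adj = benjamini_hochberg(raw)
    significant = [i for i, p in enumerate(adj) if p < 0.05]
    print(adj, significant)
```

In the invented example, three raw P-values are below 0.05 but only one adjusted value is, which shows why the adjusted value rather than the raw value is used as the cutoff.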

Results and discussion

Genome-wide identification of upland cotton GhMYB transcription factors

MYB transcription factor encoding genes of G. hirsutum were identified using homologous MYB genes collected from Arabidopsis, cacao, maize, Vitis vinifera, Populus trichocarpa and G. raimondii. A total of 1986 MYB (the MYB DNA-binding domain is ~52 amino acid residues in length and forms a helix-turn-helix fold with three regularly spaced tryptophan residues) and MYB-related (transcription factors that have only one MYB domain; MYB domain proteins are more prevalent) [18] putative protein sequences were associated with the upland cotton genome. Of these, 582 non-redundant MYB sequences, which met the e-value threshold of 1e-10, were obtained. Furthermore, GhMYB candidate genes were examined using the domain analysis programs Pfam and SMART. A total of 524 non-redundant GhMYB genes were identified and considered for further analysis. These genes were classified into four distinct subfamilies, 1R-MYB, 2R-MYB (R2R3-MYB), 3R-MYB (R1R2R3-MYB), and 4R-MYB (Additional file 2: Table S2 and Additional file 3: Figure S1), based on the number and location of MYB repeats. Our results indicate that, consistent with results observed in rice, Arabidopsis [18], Chinese cabbage [23], and apple [22], the majority of GhMYBs in upland cotton belong to the 2R-MYB subfamily (69.1 %). The second largest group, 1R-MYB, accounted for 27.67 % of all GhMYB genes, while 3R-MYB and 4R-MYB accounted for 2.86 and 0.38 %, respectively (Table 1).

Chromosomal distribution and annotation of MYB genes

Analysis of the G. hirsutum genome sequence revealed 524 possible members of the GhMYB gene family. Of these genes, 114 had been annotated previously. Three hundred and seventy-three (373) GhMYB transcription factor genes were mapped onto upland cotton chromosomes and named according to their chromosomal order (from chromosome 1 to 26) as GhMYB1 to GhMYB373. One hundred and fifty-one (151) GhMYB genes could not be clearly mapped to any chromosome (they lie on scaffolds) and were named GhMYB374 to GhMYB524 (Additional file 4: Table S3). The distribution and density of MYB transcription factor genes on chromosomes was not uniform. Some chromosomes, and chromosomal regions, have a high density of MYB transcription factor genes while others do not (Fig. 1). The highest density of MYB genes was observed on chromosome At 9 and its homolog chromosome Dt 9 (23), with 58 genes, and the lowest density was observed on chromosome At 3 and its homolog chromosome Dt 3 (17), with 11 genes. The majority of MYB transcription factor genes were found at the upper and centromeric regions of the chromosomes. In addition, a greater number of MYB genes were located on Dt chromosomes (tetraploid D) than on At chromosomes (tetraploid A), with 201 and 172 genes, respectively (Table 1).

Tandem and segmental duplication events are the main causes of gene-family expansion in upland cotton. Based on the whole-genome analysis of gene duplications, 73 duplicated GhMYB gene pairs arose by segmental and tandem duplication, including 40 duplication events within the At and Dt chromosomes as well as 33 duplication events between the At and Dt chromosomes and scaffolds (Additional file 5: Table S4), indicating that segmental and tandem duplications contributed to the expansion of the GhMYB family. Duplicated genes may reside on the same chromosome or on different chromosomes: a tandem duplication event is one in which the duplication happens within the same chromosome, while in a segmental duplication the duplicated genes are located on different chromosomes. In this study, clusters formed by GhMYBs in the upland cotton (AD) genome were identified to explain the mechanism behind the expansion of the GhMYB family in cotton. We found that 5 gene pairs were duplicated tandemly on chromosomes At_chr5, At_chr7, At_chr12, Dt_chr5 and Dt_chr7, and 68 gene pairs were duplicated segmentally, which contributed strongly to the expansion of the GhMYB transcription factors in upland cotton. The results also indicated that, among the duplication events in the GhMYB transcription factor family, the gene pairs that appeared to be derived from segmental duplication events arose earlier than those from tandem duplication (Additional file 5: Table S4). A gene duplication event, occurring during the course of cotton evolution, has led to the creation of new gene functions [40]. The origin of multigene families has been attributed to a region-specific gene duplication that occurred in upland cotton [34].

Furthermore, to estimate the evolutionary time of the identified MYB genes in G. hirsutum, their synonymous (Ks) and non-synonymous (Ka) substitution rates were calculated using the DnaSP software. Nucleotide substitutions in protein-coding genes can be categorized as synonymous or non-synonymous substitutions, as detailed in Additional file 5: Table S4. The Ka/Ks ratio is a measure used to examine the mechanisms of gene duplication evolution after divergence from their ancestors. A Ka/Ks value of 1 suggests neutral selection, a Ka/Ks value of <1 suggests purifying selection, and a Ka/Ks value of >1 suggests positive selection (Hurst 2002). Here, we estimated the ratio of non-synonymous to synonymous substitutions (Ka/Ks) for the 73 pairs of segmentally and tandemly duplicated genes. Most GhMYB genes had Ka/Ks values of less than 1, implying that they have evolved under purifying selection, while 17 duplicated genes had Ka/Ks ratios greater than 1, implying that they have evolved under positive selection. We further used Ks to estimate the time of GhMYB gene duplication events during the evolution of the upland cotton genome. The tandem and segmental duplication events in upland cotton occurred between 0.26 mya (Ks = 0.0078) and 124.42 mya (Ks = 3.7326), with an average of 46.494 mya (million years ago). The tandem duplications of GhMYB genes occurred from 2.76 mya (Ks = 0.0828) to 44.343 mya (Ks = 1.3303), with an average of 26.0493 mya. The results suggest that the expansion of the GhMYB genes in upland cotton, which originated from the At and Dt genomes, mostly arose from duplication events during evolution.
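The Ka/Ks interpretation rule used in this section (1 neutral, below 1 purifying, above 1 positive) can be written as a small helper. Pairs with Ks = 0 have an undefined ratio and are reported separately here, which is an assumption of this sketch. The example values are invented and are not taken from Additional file 5.

```python
def selection_mode(ka, ks):
    """Classify a duplicated gene pair by its Ka/Ks ratio."""
    if ks == 0:
        return "undefined"  # ratio cannot be computed when Ks is zero
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Invented (Ka, Ks) pairs for illustration only.
    pairs = {
        "pair_1": (0.02, 0.10),
        "pair_2": (0.30, 0.20),
        "pair_3": (0.00, 0.00),
    }
    for name, (ka, ks) in pairs.items():
        print(name, selection_mode(ka, ks))
```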

Phylogenetic analysis of MYB transcription factor genes in upland cotton

To identify potential relationships between the various GhMYB genes, a phylogenetic tree was constructed. Examination of protein sequence similarity and phylogenetic tree analysis allowed us to divide the 524 upland cotton MYB genes into 16 subgroups, which ranged in size from 2 to 68 MYB genes (Additional file 6: Figure S2A). The bootstrap values for some subgroups of the NJ tree were low as a result of the relatively large number of gene sequences, as was also found in an earlier study [23]. Supporting the phylogenetic analysis of subgroups, most GhMYB protein domain repeats exhibited high similarities within subgroups. Nevertheless, we sought further evidence to check the reliability of our phylogenetic tree. The phylogenetic trees of MYB TFs were reconstructed with maximum likelihood and minimum evolution methods to validate the result from the NJ and pairwise distance method. The trees constructed by the three methods are almost the same, with only minimal differences in some subgroups (subgroups 7 and 16), implying that the three methods were largely consistent with each other (Additional file 7: Figure S3).

Table 1 The different subfamilies of GhMYB transcription factor types distributed on upland cotton chromosomes

Chromosome Transcription factor types

Fig. 1 Distribution of GhMYB genes on cotton chromosomes. The chromosomal position of each GhMYB was mapped to the upland cotton genome

MYB gene structure analysis and conserved motif identification

Gene structure analysis of the 524 GhMYB transcription factors was performed. To provide greater insight into their intron/exon structure, cDNA and corresponding genomic sequences were compared. Approximately 90.84 % of upland cotton GhMYB genes contained between 1 and 12 introns. Similar to what has been described in Arabidopsis, Vitis vinifera, and Eucalyptus grandis [41, 42], most GhMYB transcription factors contained 1 (17.4 %) or 2 introns (52.7 %). The remaining 22.51 % of GhMYB genes contained more than two introns. However, only 8.2 % of GhMYB genes contained no introns. Most of the intronless genes were clustered into the 13 and 21 subgroups. Furthermore, an unrooted phylogenetic tree was constructed using GhMYB protein sequences to assess the similarities in intron/exon structure within subgroups. Within subgroups, the majority of GhMYB genes contained similar exon/intron arrangements, particularly with respect to exon length and intron number. Most of the 2R-MYB transcription factor genes had a conserved gene structure with three exons and two introns. Additionally, the size of the third exon was more variable than that of the first and second exons (Additional file 6: Figure S2B). High levels of variation in the sequence of the third exon are reported to be associated with functional divergence among R2R3-MYB genes [43]. Although few R2R3-MYB genes contained no intron, they were clustered into 13 subgroups. It was noted that the duplication of 2R during the early development of MYB proteins containing two repeats gave rise to the 3R-MYB domains. Most of the 1R-MYB, 3R-MYB, and 4R-MYB genes were disrupted by more than four introns. This is consistent with previous reports which have suggested that most MYB-related genes in higher plants contain more than 2 introns [44]. Within each subgroup, most of the GhMYB genes showed similar intron/exon structures, as described previously by Jiang, Gu, and Peterson and by Matus [42, 45]. There was a strong connection between the phylogenetic tree analysis and the intron/exon structure of the GhMYB transcription factor family in upland cotton.

Further investigation of the variation within the conserved motifs of GhMYB proteins using the MEME program identified 20 conserved motifs, which we designated motifs 1 to 20. Most of the GhMYB proteins within the same subgroup showed similar motif compositions, while high variance was observed between the different subgroups. This is consistent with previous reports suggesting that MYB family members with similar protein arrangements were classified into the same subfamily [8]. For example, all GhMYB proteins in subgroup 1 possessed motifs 1, 2, 3, and 12, while all members in subgroup 15 contained motifs 2, 3, 4, 6, 7, and 15 (Additional file 8: Figure S4). In addition, some motifs were specific to a distinctive subgroup, indicative of a particular function of that subgroup. Though the functions of most of the conserved motifs remain to be identified, they are likely to play an important role in the transcriptional regulation of target genes, and may indicate further functional diversification in specific species. Our results suggest that these motifs are evolutionarily conserved and functionally important. This result was similar to that of Stracke and Dubos, who suggested that if MYB genes from the same subgroup share similar protein motifs they probably share similar functions [8, 46].

Upland cotton MYB family relationships with other plants

To understand the relationship between the members of the MYB gene family, we constructed an NJ phylogenetic tree of upland cotton, Arabidopsis, and cacao MYB proteins. Comparison of protein sequences and phylogenetic tree analysis enabled us to categorize the 524, 197, 141 and 256 MYB genes of upland cotton and the comparison species into 15 subgroups containing 4 to 195 MYB genes (Fig. 2). The bootstrap values for some subgroups of the NJ tree were low as a result of the relatively large number of gene sequences, as was also found in an earlier study [23]. The low bootstrap support for the internal subgroups of those trees was in agreement with phylogenetic analysis of MYBs in other plants [23]. This was likely because MYB domains are comparatively short and members within a subgroup are highly conserved, with relatively few informative character positions. Most MYB subgroups contained more upland cotton members than Arabidopsis, cacao and Gossypium raimondii members (Additional file 9: Table S5). Moreover, the classification and identification of the MYB protein sequences was consistent with the previous classification described by Stracke and Dubos [4, 8]. For example, subgroup 1 contained 32 upland cotton, 13 Arabidopsis, 9 cacao and 22 G. raimondii MYB genes, whereas subgroup 14 contained 1 Arabidopsis and 3 upland cotton MYB genes. Curiously, some homologs were clustered by species within a subgroup, forming species-specific clusters (Additional file 9: Table S5). Identification of putative orthologous MYB genes of upland cotton, Arabidopsis, cacao and Gossypium raimondii was relatively easy because they were clustered in pairs within a subgroup. One hundred forty-nine orthologous gene pairs (90 between D5/Dt and 59 between D5/At) were found between upland cotton and Gossypium raimondii, while only 1 orthologous pair was identified between upland cotton and cacao and 3 between upland cotton and Arabidopsis, which might be due to the closer relationship between upland cotton and G. raimondii.

In addition, to gain more insight into the divergence of the MYB genes after polyploidization, the non-synonymous (Ka) and synonymous (Ks) nucleotide substitutions and their ratio (Ka/Ks) were analyzed for the homologous gene pairs between G. raimondii (D5) and upland cotton. Among these homologous pairs, 45 were identical (Ka = Ks = 0 or Ka/Ks ratio = 0) and 41 had a Ka/Ks ratio of less than 1, suggesting that MYB genes have evolved mainly under purifying selection. However, only 3 MYB genes had a Ka/Ks ratio of more than 1, suggesting that these genes have evolved under positive selection (Additional file 10: Table S6). This result implies that most of the ancestral MYB genes have been retained in upland cotton (G. hirsutum) after polyploidization.

Expression profiles of MYB genes in G. hirsutum

MYB gene expression was analyzed using RNA-seq data from different stages of cotton fiber development. A total of 431/524 (82.3 %) MYB genes were expressed in at least one stage of cotton fiber development, and the expression of 93/524 (17.7 %) MYB genes was not detected by RNA-seq (Additional file 1: Table S1). In addition, a subset of the 431 expressed genes showed low expression levels during different stages of fiber development (Additional file 6: Figure S2C). According to the phylogenetic tree analysis, the expression of MYB transcription factors can also be divided into 16 subgroups. All genes within subgroups 2, 3 and 5 exhibited low or undetectable expression levels during the five early stages of cotton fiber development. In Arabidopsis, this subgroup has been shown to be involved in salt tolerance [47, 48]. GhMYBs showed elevated transcript levels in the five cotton fiber developmental periods, suggesting that they might be important for maintenance of metabolic processes and normal cotton fiber development. Most MYB genes in subgroups 1, 4 and 12 to 15 were highly expressed in the five analyzed stages of fiber development. Many MYB MIXTA-like transcription factors could be involved in the regulation of epidermal cell differentiation in different plant species, including specifying cell shape in petals, vegetative trichome initiation and branching, and seed fiber initiation [26, 27]. Recently, it has been reported that ten MYB (MIXTA-like) genes were highly expressed during early fiber development in G. hirsutum. In contrast, only one group of MIXTA-like genes had low expression levels in three natural fiber-less mutants [29]. These mutants provide a means to analyze the roles of MYB transcription factors in the control of fiber development in upland cotton. Interestingly, expression of the rice […] hormone treatment. Moreover, in Arabidopsis the MYB91 gene has been shown to integrate endogenous developmental signals with different environmental conditions [49]. In addition, MYB88 normally maintains fate and developmental progression throughout the stomatal cell lineage [50]. These results indicate that MYB genes can have multiple functions in plant growth and stress responses. Sixty-six genes in subgroup 7, including CotAD_42675 (MYB2 or GL1), CotAD_18666 (MYB109), CotAD_46807 (MYB109) and CotAD_02818, were highly expressed during the initiation and elongation stages of fiber development, implying that these genes may be involved in a complex network of fiber development. In fact, MYB2 stimulates cotton fiber development [25] and MYB109 is specifically expressed in fiber initiation and elongation stages [24]. Recently, it was found that MYB2 and MYB109 promote normal fiber development in cotton [29]. In this study, 15 MYB-3R genes were identified and clustered into subgroups 10 and 13. The expression of three of the 15 MYB-3R genes could not be detected during cotton fiber development. Therefore, the fact that the MYB-3R family is easily identifiable and characterized could make them suitable targets for genetic engineering approaches aimed at improving cotton fiber development.

Fig. 2 Phylogenetic tree of 492 upland cotton MYB proteins, 197 Arabidopsis MYB proteins, and 249 cacao MYB proteins. The phylogenetic tree was constructed with MEGA 6.0 using the Neighbor-Joining method. The bootstrap test was performed with 1,000 iterations. The 29 subgroups are shown with different colors.


Fig. 3 Expression levels of 20 GhMYB genes measured by qRT-PCR analysis of the Ligon-lintless1 mutant and wild-type at different stages of cotton fiber development. Black and grey represent the expression levels of the Li1 mutant and wild-type, respectively.


Previous reports suggest that GhCPC, belonging to the MYB-3R subfamily, negatively controls cotton fiber development during the initiation and early elongation stages in mutant cotton [28, 51]. These results are consistent with previous studies which have suggested that MYBs play a crucial role in a wide variety of biological processes, including cell growth, cell cycle control, signal transduction, metabolic and physiological stability, and response to environmental stimuli [37, 52]. Therefore, the MYB family might provide a means to regulate cotton fiber development and offer a path to understanding fiber cell development during the initiation and elongation stages.

Expression verification of GhMYB genes involved in cotton fiber development

MYB transcription factors play roles in many plant-specific processes, such as primary and secondary metabolism, cell shape, anther development, cellular proliferation, differentiation, and stress responses [53, 54]. We randomly selected 20 MYB genes for expression verification using qRT-PCR (Fig. 3). The GhMYB genes MYB25, MYB2, MYB109, MYB5, and MYB3 were highly expressed in wild-type G. hirsutum and exhibited lower expression levels in G. hirsutum Ligon-lintless1 (Li1) mutants after 5 DPA. Previously, it was found that 8 DPA was the critical point for the Ligon-lintless1 mutant [32]. In addition, MYB109 and CotAD_02818 (GL1) promoted cotton fiber development [29]. MYB25 was expressed in ovules (initiation) and during fiber development [26]. Our results indicate that these genes may play an essential role in maintaining normal cotton fiber development. In contrast, some selected GhMYB genes, such as CotAD_29631 (CPC-like), CotAD_47467 (MYB103), CotAD_11820 (CPC-3R-MYB), CotAD_64719 (MYB1), CotAD_42115 (MYB83), and CotAD_21852 (MYB69), were significantly expressed in the Ligon-lintless1 mutant but not in wild-type. A previous study reported that CPC-3R-MYB negatively controlled cotton fiber development [28]. Consistent with this, the genes that are up-regulated in the Ligon-lintless1 mutant could be responsible for the short fiber phenotype observed. Other genes, such as CotAD_04154 (MYB_255), CotAD_02811 (MYB52), CotAD_41041 (MYB_198), CotAD_64081 (MYBML5), CotAD_71681 (MYB42), CotAD_13600 (MYB46), and CotAD_27106 (MYB20), were expressed at different levels in the Li1 mutant and wild-type, which may indicate functional divergence of GhMYB genes during cotton fiber development. Previous reports mentioned that MYB genes showed significant expression differences between Ligon-lintless2 and wild-type during the later stage of cotton fiber development at 20 DPA [55]. The expression of several MYB transcription factors was altered in the Ligon-lintless1 mutant at 5 DPA [56], at 6 DPA [31], and in 1 DPA, 3 DPA, and 8 DPA ovules [33]. Overall, our RNA-seq data are consistent with the qRT-PCR results. In addition, a comparative expression profile analysis of MYBs in upland cotton revealed that GhMYBs might have diverse functions at different stages of fiber cell development. Taken together, the RNA-seq and qRT-PCR expression analyses in G. hirsutum support the hypothesis that GhMYBs are involved in fiber development during different developmental stages, and may have diverse functions in Arabidopsis and other species. The functions of most MYBs in higher plants remain unclear, and further investigation is required to elucidate their exact functions. Our results provide a comprehensive understanding of GhMYBs and provide the foundation for future functional analyses of MYB genes and their roles in cotton fiber development.

Conclusions

The MYB gene family is one of the largest transcription factor families in higher plants and plays an important role in plant growth and development. We undertook a comprehensive genome-wide characterization and expression analysis of the MYB transcription factor family in cotton fiber development. A total of 524 MYB genes were identified and classified into four subfamilies. Based on phylogenetic tree analysis, these MYB transcription factors were classified into 16 subgroups. Members of the same subgroup had very similar gene structures and protein motifs. Additionally, our results revealed that MYB genes were distributed across the entire upland cotton genome. Moreover, RNA-seq data showed that MYB genes play an important role in plants. The expression profiles of 20 genes during cotton fiber development, obtained by qRT-PCR, show that different MYB genes can positively or negatively regulate cotton fiber development. Additionally, other MYB genes are expressed in both mutant and wild-type fiber, further highlighting the diverse functions of MYB proteins in the development of the cotton fiber cell. This study provides strong evidence that GhMYB genes play a major role in cotton fiber development and provides a platform for the characterization of interesting MYB genes in the future.

Additional files

Additional file 1: Table S1. Expression patterns of MYB genes in different stages of cotton fiber development measured by RNA-seq. (XLSX 63 kb)

Additional file 2: Table S2. Classification of MYB genes according to MYB repeat domains and protein length. (XLSX 26 kb)

Additional file 3: Figure S1. Schematic representation of the general structure of upland cotton R1-MYB, 2R-MYB, 3R-MYB and 4R-MYB domain proteins. (TIF 400 kb)

Additional file 4: Table S3. Location of MYB genes in the upland cotton genome. The positive (+) and negative (−) symbols following


References
1. Riechmann JL, Heard J, Martin G, Reuber L, Jiang C, Keddie J, et al. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science. 2000;290:2105–10.
2. Ptashne M. How eukaryotic transcriptional activators work. Nature. 1988;335:683–9.
3. Zimmermann IM, Heim MA, Weisshaar B, Uhrig JF. Comprehensive identification of Arabidopsis thaliana MYB transcription factors interacting with R/B-like BHLH proteins. Plant J. 2004;40:22–34.
4. Stracke R, Werber M, Weisshaar B. The R2R3-MYB gene family in Arabidopsis thaliana. Curr Opin Plant Biol. 2001;4:447–56.
5. Kanei-Ishii C, Sarai A, Sawazaki T, Nakagoshi H, He DN, Ogata K, et al. The tryptophan cluster: a hypothetical structure of the DNA-binding domain of the myb protooncogene product. J Biol Chem. 1990;265:19990–5.
6. Saikumar P, Murali R, Reddy EP. Role of tryptophan repeats and flanking amino acids in Myb-DNA interactions. Proc Natl Acad Sci U S A. 1990;87:8452–6.
7. Martin C, Paz-Ares J. MYB transcription factors in plants. Trends Genet. 1997;13:67–73.
8. Dubos C, Stracke R, Grotewold E, Weisshaar B, Martin C, Lepiniec L. MYB transcription factors in Arabidopsis. Trends Plant Sci. 2010;15:573–81.
9. Klempnauer K-H, Gonda TJ, Michael BJ. Nucleotide sequence of the retroviral leukemia gene v-myb and its cellular progenitor c-myb: The architecture of a transduced oncogene. Cell. 1982;31:453–63.
11. Weston K. Myb proteins in life, death and differentiation. Curr Opin Genet Dev. 1998;8:76–81.
12. Paz-Ares J, Ghosal D, Wienand U, Peterson PA, Saedler H. The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto-oncogene products and with structural similarities to transcriptional activators. EMBO J. 1987;6:3553–8.
13. Jin J, Zhang H, Kong L, Gao G, Luo J. PlantTFDB 3.0: A portal for the functional and evolutionary study of plant transcription factors. Nucleic Acids Res. 2014;42:1182–7.
14. Allan AC, Hellens RP, Laing WA. MYB transcription factors that colour our fruit. Trends Plant Sci. 2008;13:99–102.
15. Geethalakshmi S, Barathkumar S, Prabu G. The MYB Transcription Factor Family Genes in Sugarcane (Saccharum sp.). Plant Mol Biol Report. 2014;512–31.
16. Pesch M, Hülskamp M. One, two, three … models for trichome patterning in Arabidopsis? Curr Opin Plant Biol. 2009;12:587–92.
17. Balkunde R, Pesch M, Hu M, Hülskamp M. Trichome patterning in Arabidopsis thaliana: from genetic to molecular models. Curr Top Dev Biol. 2010;91:299–321.
18. Katiyar A, Smita S, Lenka SK, Rajwanshi R, Chinnusamy V, Bansal KC. Genome-wide classification and expression analysis of MYB transcription factor families in rice and Arabidopsis. BMC Genomics. 2012;13:544.
