
Seed-based systematic discovery of specific transcription factor target genes


DOCUMENT INFORMATION

Title: Seed-based systematic discovery of specific transcription factor target genes
Authors: Ralf Mrowka, Nils Blüthgen, Michael Fähling
Institution: Charité - Universitätsmedizin Berlin
Field: Systems Biology
Type: scientific report
Year of publication: 2008
City: Berlin
Pages: 15
File size: 639.11 KB



Ralf Mrowka1,2,3, Nils Blüthgen4 and Michael Fähling1,3

1 Paul-Ehrlich-Zentrum für Experimentelle Medizin, Berlin, Germany

2 AG Systems Biology – Computational Physiology, Berlin, Germany

3 Johannes-Müller-Institut für Physiologie, Charité-Universitätsmedizin Berlin, Germany

4 School of Chemical Engineering and Analytical Sciences, Manchester Interdisciplinary Biocentre, University of Manchester, UK

Keywords

feedback; glaucoma; NF-κB; optineurin; transcription factor target prediction

Correspondence

R. Mrowka, Paul-Ehrlich-Zentrum für Experimentelle Medizin, AG Systems Biology – Computational Physiology, Tucholskystr. 2, D-10117 Berlin, Germany

Fax: +49 30 450528972

Tel: +49 30 450528218

E-mail: ralf.mrowka@charite.de

(Received 26 February 2008, revised 1 April 2008, accepted 16 April 2008)

doi:10.1111/j.1742-4658.2008.06471.x

Reliable prediction of specific transcription factor target genes is a major challenge in systems biology and functional genomics. Current sequence-based methods yield many false predictions, due to the short and degenerate DNA-binding motifs. Here, we describe a new systematic genome-wide approach, the seed-distribution-distance method, that searches large-scale genome-wide expression data for genes that are expressed similarly to known targets. This method is used to identify genes that are likely targets, allowing sequence-based methods to focus on a subset of genes, giving rise to fewer false-positive predictions. We show by cross-validation that this method is robust in recovering specific target genes. Furthermore, this method identifies genes with typical functions and binding motifs of the seed. The method is illustrated by predicting novel targets of the transcription factor nuclear factor kappaB (NF-κB). Among the new targets is optineurin, which plays a key role in the pathogenesis of acquired blindness caused by adult-onset primary open-angle glaucoma. We show experimentally that the optineurin gene and other predicted genes are targets of NF-κB. Thus, our data provide a missing link in the signalling of NF-κB and the damping function of optineurin in signalling feedback of NF-κB. We present a robust and reliable method to enhance the genome-wide prediction of specific transcription factor target genes that exploits the vast amount of expression information available in public databases today.

Abbreviations

CASP4, caspase 4; ChIP, chromatin immunoprecipitation; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; HEK, human embryonic kidney; HIF-1, hypoxia-inducible factor 1; HNF4, hepatocyte nuclear factor 4; IKK, IκB kinase; NEMO, nuclear factor kappaB essential modulator; NF-κB, nuclear factor kappaB; OPTN, optineurin; RGA, reporter gene analysis; STAT5A, signal transducer and activator of transcription 5A; TNF-α, tumor necrosis factor-α.

The prediction and analysis of the regulatory networks underlying gene expression is a central challenge in systems biology and functional genomics [1,2]. Regulation of transcription is the initial mechanism for controlling the expression of genes. Key regulators of transcription are transcription factors, which bind to DNA motifs in noncoding regions that control gene transcription. Therefore, the identification of transcription factor target genes is one major element in the understanding and reconstruction of the regulatory network. Although many DNA motifs for transcription factor binding are known and are contained as consensus sequences and binding matrices in databases such as TRANSFAC [3] and JASPAR [4], their direct use for genome-wide matching in promoter sequences of higher organisms is greatly limited [5]. Current methods that use sequence data give results that are dominated by false predictions [5]. The issue of a high proportion of false positives in pure sequence-based methods has been known for a long time [6], and also applies to the transcription factors analysed in this study. The major problem is the short length and high degeneracy of the DNA-binding motifs, which give rise to one predicted binding site per 1000–10 000 bp by sheer chance. Therefore, other resources, such as phylogenetic footprinting, have been explored to further restrict and 'purify' potential targets to more likely candidates [7,8]. Such methods decrease the number of false predictions by about one order of magnitude, which is still not good enough for genome-wide predictions. Because the potential list of targets is too large, further information needs to be exploited to concentrate the analysis on the genes that have a higher probability of being true target genes.

Gene ontology, as a controlled and computer-readable way to annotate genes, has been used extensively to characterize clusters of genes from microarray data [9,10] and also to validate microarray data [11]. Despite the enormous number of false-positive predictions for transcription factor targets with current methods, significant correlations with gene ontology terms have been found that can be used to enhance prediction quality [12,13]. In addition, statistical methods have been developed to associate genes with disease [14], and seed-based computational procedures have been applied to identify brain cancer-related genes [15].

Currently, experience and knowledge of pathways and an educated literature search may help us to focus on possible candidates. The inclusion of information from expression experiments conducted under different experimental conditions may hint at potential candidates for further evaluation, as these data reflect the relevant biological functions of transcription factors, which directly influence mRNA concentrations in the cell. Well-designed, small-scale expression profile experiments have been successfully used to identify transcription factors involved in certain pathways [16,17]. Especially when applied to time-series data, seed-based clustering methods have been very successful in identifying novel targets by comparing expression kinetics with known targets for p53 and for picking up genes regulated in different cell-cycle phases [18,19]. However, these approaches require dedicated microarray experiments. We addressed the question as to whether it is feasible to explore the large body of expression information that is already stored in public databases. These datasets might contain information about expression at different time points for different cell lines that might be only marginally related to the transcription factor under investigation, and we wondered whether these datasets would allow us to extract the relevant information about the action of transcription factors on their targets.

In recent years, several microarray techniques have been developed to measure mRNA concentration on a genome-wide scale [20]. In addition, efforts have been made to store individual microarray experiments in databases. Microarray expression data have been used in recent times to improve transcription factor target prediction [21]. In this work, we developed a method to exploit a dataset of approximately 1200 microarray experiments in conjunction with a seed group of known transcription factor target genes and show that the information available in the databases is sufficient to increase the accuracy of prediction drastically. We elucidate and exemplify our seed-distribution-distance method for predicting novel nuclear factor kappaB (NF-κB) targets.

NF-κB is involved in pathways important for both physiological processes and disease conditions. It plays an important role in the control of immune function, differentiation, inflammation, stress response, apoptosis, cell survival, processes of development, and progression of cancers [22]. Thus, NF-κB has become one of the most widely studied transcription factors. Five NF-κB genes (NFKB1, NFKB2, RELA, c-REL and RELB) belong to the NF-κB gene family, and the resulting proteins are able to form homodimers or heterodimers [23]. Prior to activation, NF-κB is localized in the cytoplasm and is tightly associated with its inhibitors (IκB proteins) and p100 proteins. Multiple stimuli, such as tumor necrosis factor-α (TNF-α), UV radiation and free radicals, activate NF-κB signalling through activation of IκB kinases (IKKs), which phosphorylate IκBs and p100 proteins, subsequently leading to their polyubiquitination and degradation [24].

Results

The seed-distribution-distance method

We started by defining a 'seed' group of known NF-κB targets by collecting the targets mentioned in an NF-κB review paper [25] that matched Ensembl entries, resulting in 91 genes. Intersecting the 91 target genes with the genes in the microarray set resulted in 81 genes, which were used as the seed. We obtained the large-scale microarray expression data [26] (detailed description of data in supplementary Doc S1) from the Stanford microarray database [27]. The set contains genome-wide data from 1202 hybridization experiments from human tissues and cell lines. Subsequently, we ranked each gene x according to its similarity L(x) of expression to the seed group (detailed results given in supplementary Doc S2). We defined the similarity L(x) for a gene x by taking the median correlation of gene x to the seed and subtracting its median correlation to all genes (typical distributions of correlations of genes to the seed group are shown in supplementary Fig. S1). Thus, if L(x) showed high values, the particular gene was regulated similarly to the seed gene group. In contrast, if the absolute value of the similarity measure was low, it indicated that the median of the distribution was close to that of the correlation distribution of the gene to a randomly selected group. Using the similarity measure L, we then sorted all remaining human genes and thereby obtained a ranking of the genes according to their similarity to the seed group. To avoid a circular argument, we would like to stress that for all statistical analyses and characterization of the rank, the seed group was excluded.

A schematic representation of this procedure is given in Fig. 1. The essence of the method is that if a gene's correlation to those in the seed set (represented by the median) is larger than the median of the correlation to all genes, then it is more likely to be related to the seed set, the members of which are then more likely to be targets of the transcription factor. This method requires that at least the initial seed set of true targets is known, and that other targets are correlated to several genes in the seed set. Furthermore, the method is based on the assumption that there is a relationship between gene coexpression and gene coregulation. The ranking can also be done with scores other than the median correlation. For instance, we have ranked the genes using a one-sided P-value derived from a computationally more expensive Mann–Whitney rank-sum test, and found similar performance as with L(x) (see supplementary Fig. S3).

Fig. 1. Schematic diagram of the workflow in this study. Expression profiles of a gene g are compared to the expression profiles of the seed genes and randomly selected genes. A distance score L(x) is calculated that quantifies specific expression similarity to the seed. The genes are then ranked on the basis of L(x), searched for putative binding sites in their promoter region, and subjected to a reporter gene assay.
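The similarity score described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the use of Pearson correlation, the handling of flat profiles, the function names and the toy expression profiles are all assumptions made for the example, and missing values are not handled.

```python
from math import sqrt
from statistics import median


def pearson(a, b):
    """Pearson correlation of two equally long expression profiles (assumed measure)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return cov / sqrt(var_a * var_b)


def similarity_L(gene, profiles, seed):
    """L(x): median correlation to the seed minus median correlation to all other genes."""
    x = profiles[gene]
    to_seed = [pearson(x, profiles[s]) for s in seed if s != gene]
    to_all = [pearson(x, profiles[g]) for g in profiles if g != gene]
    return median(to_seed) - median(to_all)


def rank_by_similarity(profiles, seed):
    """Rank all non-seed genes by decreasing L(x); the seed itself is excluded."""
    candidates = [g for g in profiles if g not in seed]
    return sorted(candidates, key=lambda g: similarity_L(g, profiles, seed), reverse=True)


# Hypothetical toy profiles (five experiments per gene), for illustration only.
profiles = {
    "seedA": [1, 2, 3, 4, 5],
    "seedB": [2, 4, 6, 8, 11],
    "coexpr": [1, 3, 5, 7, 10],  # follows the seed pattern
    "anti": [5, 4, 3, 2, 1],     # anticorrelated with the seed
    "noise": [3, 1, 4, 1, 5],    # only weakly related to the seed
}
seed = {"seedA", "seedB"}
ranking = rank_by_similarity(profiles, seed)
print(ranking[0])
```

In this toy example the gene that follows the seed pattern ends up at the top of the ranking. With real data the profiles would span the full set of hybridization experiments.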

Top members in the rank show typical NF-κB functions

We next analysed the top members of the obtained rank with regard to their gene ontology classification. For the top 600 genes, we examined whether any gene ontology classification is significantly enriched using rigorous statistics [12]. It turns out that the list of significant gene functions of the top 600 genes, as shown in supplementary Table S1, is congruent with the functions of NF-κB described in the literature.

We further analysed the occurrences of typical NF-κB functions within the rank. We found that there was a steep increase in the density of genes involved in 'immune response', starting at approximately rank 700 when moving from lowest to highest ranks. The probability of a gene being involved in the immune response is therefore greatly increased for the top members in the rank, as seen in Fig. 2.

High density of putative NF-κB DNA-binding sites in promoters in the top group of the rank

As the overrepresentation of typical NF-κB-related biological functions might be due to coexpression mediated by different transcription factors, we decided to analyse the sequences of putative promoter regions of the high-ranking genes.

We predicted binding sites for all vertebrate transcription factors contained in the TRANSFAC database in the 500 bp putative promoter region of all genes in the ranking. We derived the 500 bp sequences upstream of the transcriptional start site from the Ensembl database. We chose to limit our search to 500 bp, because we and others observed earlier that the majority of promoter sequences fall within this region [12,28].

To illustrate our method, we chose to search for consensus sequences from the TRANSFAC database in the putative promoter regions, as this method does not require an additional parameter, unlike more sophisticated weight-matrix methods, which typically require a cut-off score (see also supplementary Table S5). We analysed the distribution of occurrence of all predicted factor-binding sites in the promoters of genes along the rank. For each predicted binding motif, we calculated the ratio of the number of occurrences in the upper 5% of the rank divided by the expected occurrence in the top 5% (given by 0.05 times the total number of occurrences). A list of the motifs sorted by this ratio has NF-κB-binding motifs in the top ranks, namely NFKAPPAB65 (P = 0.0028) and NFKAPPAB50 (P = 0.0239) (P-values from the binomial test; see Experimental procedures). In addition, this list includes motifs of the transcription factors BACH2 (P = 0.0025), signal transducer and activator of transcription 5A (STAT5A) (P = 0.0036), and VBP (P = 0.0106), which are enriched on average in the top group. A graphical representation is given in Fig. 3 (see also supplementary Table S4).
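The enrichment ratio described above (occurrences in the top 5% divided by 0.05 times the total number of occurrences) can be sketched as follows. The exact form of the binomial test is given in the Experimental procedures, which are not part of this excerpt, so the one-sided upper-tail test used here is an assumption, as are the function names and the toy counts.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def motif_enrichment(ranked_genes, motif_counts, top_fraction=0.05):
    """Return (enrichment ratio, one-sided binomial P-value) for one motif.

    ranked_genes: genes sorted from highest to lowest rank.
    motif_counts: gene -> number of predicted sites in its promoter.
    """
    n_top = round(len(ranked_genes) * top_fraction)
    in_top = sum(motif_counts.get(g, 0) for g in ranked_genes[:n_top])
    total = sum(motif_counts.get(g, 0) for g in ranked_genes)
    expected = top_fraction * total  # expected occurrences in the top group
    ratio = in_top / expected
    # Assumption: each occurrence falls into the top group independently
    # with probability top_fraction under the null hypothesis.
    p_value = binomial_tail(in_top, total, top_fraction)
    return ratio, p_value


# Hypothetical toy ranking of 200 genes; 4 of the 8 motif occurrences are in the top 10.
ranked_genes = [f"g{i}" for i in range(200)]
motif_counts = {g: 1 for g in ("g0", "g1", "g2", "g3", "g50", "g100", "g150", "g199")}
ratio, p_value = motif_enrichment(ranked_genes, motif_counts)
print(f"enrichment ratio = {ratio:.1f}, P = {p_value:.2e}")
```

A ratio above 1 indicates that the motif occurs more often in the top group than expected from its overall frequency.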

Fig. 2. Density of occurrences of genes annotated with the term 'immune response' in the ranking after applying the seed-distribution-distance method. Immune response genes are highly enriched in the top members of the rank (P < 0.0001, two-sided Mann–Whitney rank-sum test). Red, individual occurrences of immune response genes; black line, density of genes that are annotated with the term. Inset: density for all genes in the rank. (x-axis: position of gene in the ranking, from 'high rank' to 'low rank'.)

Robustness of seed-distribution-distance method

The original seed group contained 81 known NF-κB targets (supplementary Table S2). As, for most transcription factors, fewer targets are known, we investigated whether the seed-distribution-distance method might also give reliable results if the seed was substantially smaller. We applied a cross-validation strategy by randomly dividing the original 81 targets into two groups, one group being the seed, and the remaining genes constituting the other group, named the test group, t. Several sizes of the seed were used (1, 10, 20 and 50 are shown in Fig. 4; cumulative representations of the distributions are provided in supplementary Fig. S2). After rank construction using the reduced seed, the test group was then analysed regarding its position in the rank. This procedure was repeated 100 times. It turned out that the test group members were strongly present in the top positions of the rank, and this was preserved even if a considerable part of the original targets was not used for the seed. Even if one used, for example, only 10 of 81 members of the seed, the remaining 71 genes in the test group were highly enriched in the top ranks, as shown in Fig. 4.
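The cross-validation procedure can be illustrated with the following sketch, which repeatedly splits a set of known targets into a reduced seed and a test group and records where the test genes land in the ranking. The synthetic data, the Pearson correlation, the function names and the seed size of 3 are assumptions chosen for the example; they are not taken from the study.

```python
import random
from math import sqrt
from statistics import median


def pearson(a, b):
    """Pearson correlation of two equally long profiles (assumed measure)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def similarity_L(gene, profiles, seed):
    """Median correlation to the seed minus median correlation to all other genes."""
    x = profiles[gene]
    to_seed = [pearson(x, profiles[s]) for s in seed if s != gene]
    to_all = [pearson(x, profiles[g]) for g in profiles if g != gene]
    return median(to_seed) - median(to_all)


def recovery_positions(profiles, known_targets, seed_size, rng):
    """One round: relative rank positions (0 = top) of the held-out test group."""
    targets = sorted(known_targets)
    seed = set(rng.sample(targets, seed_size))  # reduced seed
    test_group = set(targets) - seed            # held-out known targets
    candidates = [g for g in profiles if g not in seed]
    ranking = sorted(candidates, key=lambda g: similarity_L(g, profiles, seed), reverse=True)
    return [ranking.index(g) / len(ranking) for g in test_group]


# Synthetic toy data: 6 coexpressed "targets" and 30 unrelated background genes.
rng = random.Random(0)
n_samples = 20
base = [rng.gauss(0, 1) for _ in range(n_samples)]
profiles = {f"target{i}": [b + rng.gauss(0, 0.1) for b in base] for i in range(6)}
profiles.update({f"bg{i}": [rng.gauss(0, 1) for _ in range(n_samples)] for i in range(30)})
known_targets = [g for g in profiles if g.startswith("target")]

positions = []
for _ in range(100):  # the procedure is repeated 100 times, as in the text
    positions += recovery_positions(profiles, known_targets, seed_size=3, rng=rng)
print(f"median relative recovery position: {median(positions):.3f}")
```

A histogram of `positions` would correspond to one curve of the recovery test. In this synthetic setting the held-out targets concentrate near position 0, because they were constructed to be strongly coexpressed.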

Moreover, we addressed the question of whether the seed-distribution-distance method is also effective in enriching targets for other transcription factors. We chose E2F [29,30], ETS1 [31,32], hypoxia-inducible factor 1 (HIF-1) [33], hepatocyte nuclear factor 4 (HNF4), and c-Myc [34], and collected seed groups for these factors (supplementary Tables S2 and S3). We applied our method to these seed groups in a jack-knife manner (i.e. we iteratively left one seed member out and determined its position in the rank). For all of these additional transcription factors, the seed members left out were strongly enriched in the top of the rank (Fig. 5). Moreover, the top members of the rank were strongly enriched with typical gene ontology terms of the factors for E2F and HNF4. For ETS1, HIF-1 and c-Myc, this ontology enrichment is not as clear as for the other three tested factors. One reason could be the considerably lower number of gene ontology annotated genes for the specific terms and, in the case of c-Myc, the broad-spectrum ontologies [34].

The results of this jack-knife procedure also provide an estimate of how many of the true positives will lie in the upper 5%: about 18–39% of all targets would be in the upper 5% of genes of the rank (26% for NF-κB, 39% for E2F, 29% for ETS1, 18% for HIF-1, 36% for HNF4, and 20% for c-Myc). Thus, applying the seed-distribution-distance method will enrich the true targets in the top 5% of the rank by a factor of 4–8.
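The quoted enrichment factor follows from dividing the fraction of targets recovered in the top 5% by the 5% expected for a random ranking. The short calculation below uses only the percentages stated above; the variable names are illustrative.

```python
TOP_FRACTION = 0.05  # fraction of the ranking considered as the top group

# Fraction of left-out seed members recovered in the top 5% (values from the text).
recovered_in_top = {
    "NF-kappaB": 0.26,
    "E2F": 0.39,
    "ETS1": 0.29,
    "HIF-1": 0.18,
    "HNF4": 0.36,
    "c-Myc": 0.20,
}

# Enrichment relative to a random ranking, in which 5% of targets fall in the top 5%.
enrichment = {tf: frac / TOP_FRACTION for tf, frac in recovered_in_top.items()}
for tf, factor in sorted(enrichment.items(), key=lambda kv: kv[1]):
    print(f"{tf}: {factor:.1f}-fold")
```

The factors range from 3.6 (HIF-1) to 7.8 (E2F), which rounds to the stated range of 4–8.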

Fig. 3. Distribution of enrichment of putative transcription factor-binding motifs in the ranking after applying the seed-distribution-distance method. The seed-distribution-distance method enriches genes with putative NF-κB-binding sites in the respective promoter. The top gene group of the seed rank was analysed regarding transcription factor-binding motif enrichment within the −500 bp promoter region. The binding motifs for NF-κB 50 and NF-κB 65 are among the transcription factor-binding sites that are most strongly enriched. Note that the initial seed group was not contained in this analysis. (Axis: enrichment of putative transcription factor-binding sites in the top group versus occurrence; labelled motifs: NF-κB 65, STAT5a, VBP1, NF-κB 50, BACH2; remaining points: binding sites for 234 other vertebrate transcription factors; legend: enriched, P < 0.025; depleted, P < 0.025.)

Fig. 4. Recovery of target genes in a cross-validation test: the original seed was divided into two parts: (a) a group of members for rank construction; and (b) a test group with the remaining members of the original seed. Histograms of the recovery position of the test group are shown for the newly constructed ranks using the seed without the test group (medians are marked for each seed size). If, for example, 10 genes are used as a seed (71 in the test group), the relative occurrence of the recovered positions is still very high, i.e. the enrichment capability of the seed-distribution-distance method is still highly preserved. For comparison, the relative occurrence of members of the original seed in the corresponding rank is given. The error bars indicate the 5th and 95th percentiles of the distribution. Corresponding cumulative histograms are given in supplementary Fig. S2. (x-axis: recovered position in gradient; panel title: histogram of recovery test; series: original seed n = 81, seed n = 50, seed n = 20, seed n = 10, seed n = 1.)


Taken together, these results suggest that the seed-distribution-distance method is applicable to other transcription factors as well, and might be used for much smaller seed sizes than the 81 genes used in the NF-κB seed.

The list of predicted NF-κB targets and experimental verification

We assembled a list of predicted NF-κB target genes by selecting all genes that showed a putative NF-κB-binding site (a match of a TRANSFAC consensus motif of NF-κB) in the 500 bp upstream of the transcription start site and were members of the upper 5% in the rank. The resulting list is shown in Table 1. Eight of the 16 predicted targets have already been reported in the literature to be direct targets of NF-κB, but were not in the seed.
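The selection step (membership in the top 5% of the rank combined with a consensus match in the upstream region) might look like the following sketch. The consensus string GGGRNNYYCC is a commonly quoted κB-site pattern used here purely as a placeholder; it is not claimed to be the TRANSFAC entry used in the study. Searching both strands, the function names and the toy sequences are likewise assumptions.

```python
import re

# IUPAC nucleotide codes translated to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus sequence into a regular expression."""
    return re.compile("".join(IUPAC[c] for c in consensus.upper()))


def reverse_complement(seq):
    """Reverse complement of a DNA sequence over the alphabet ACGTN."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def has_site(promoter, pattern):
    """True if the consensus matches on either strand (assumption: both strands count)."""
    promoter = promoter.upper()
    return bool(pattern.search(promoter) or pattern.search(reverse_complement(promoter)))


def predict_targets(ranked_genes, promoters, consensus, top_fraction=0.05):
    """Genes in the top fraction of the rank whose upstream region matches the consensus."""
    pattern = consensus_to_regex(consensus)
    n_top = max(1, round(len(ranked_genes) * top_fraction))
    return [g for g in ranked_genes[:n_top]
            if g in promoters and has_site(promoters[g], pattern)]


# Hypothetical toy input: 40 ranked genes, so the top 5% contains 2 genes.
ranked_genes = ["geneA", "geneB"] + [f"bg{i}" for i in range(38)]
promoters = {
    "geneA": "TTAGGGACTTTCCATG",  # contains GGGACTTTCC, a match to the placeholder
    "geneB": "ATATATATATATAT",    # no match on either strand
    "bg0": "CCGGGACTTTCCGG",      # has a site but is outside the top 5%
}
predicted = predict_targets(ranked_genes, promoters, "GGGRNNYYCC")
print(predicted)
```

Only genes that satisfy both criteria are reported, which is why the background gene with a matching site is not selected.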

We decided to validate three of the novel predicted targets by performing luciferase reporter assays. We focused on optineurin (OPTN), SPI-B and caspase 4 (CASP4), and chose NFKBIA as a positive control and DARS from the bottom of our rank as a negative control. We cloned their human promoters in a luciferase reporter plasmid and generated identical plasmids in which the predicted consensus sequence of the NF-κB-binding site was deleted. A widely used method to induce NF-κB is stimulation by means of TNF-α. Human HEK293 cells were transiently transfected with the reporter plasmids, and TNF-α stimulation (1.25–20 ng·mL⁻¹) was applied. For all three unmodified promoters, luciferase activity was strongly induced in a concentration-dependent manner under TNF-α stimulation in the undeleted plasmid, very similar to our positive control NFKBIA. In contrast, in the experiment with the plasmids in which we had deleted the putative NF-κB sites, the concentration-dependent stimulation effect was not seen for the OPTN and CASP4 promoters, and was strongly reduced for the Spi-B promoter (Fig. 6), indicating that the NF-κB action was blocked in the deletion mutant. The negative control (DARS) did not show any significant dose-dependent change in expression.

Furthermore, we applied chromatin immunoprecipitation (ChIP) analysis in order to verify NF-κB interaction with the predicted NF-κB-binding sites. A positive ChIP signal was obtained for OPTN and SPI-B as well as for NFKBIA in stimulated cells (Fig. 6). NF-κB-dependent activation of the CASP4 promoter was not indicated by ChIP analysis in HEK293 cells (Fig. 6Be). This correlates well with a very low basal promoter activity, and therefore may be attributed to a silenced CASP4 promoter in the cellular model used.

Discussion

We have described the seed-distribution-distance method for the identification of specific transcription factor target genes. This strategy extracts relevant information about gene regulation from large-scale microarray experiments to generate a distribution-distance-derived target prediction based on a seed set of known target genes of a specific transcription factor. The target prediction is based on a combination of transcription factor-binding site information and the distribution distance. We took special care to keep our method simple and the number of free parameters as low as possible, so our results do not depend on any parameter fine-tuning. Despite the simplicity of the method, our predictions are very reliable, with 11 of the 16 predictions being true targets, corresponding to an upper bound of the false discovery rate of 33%. On the basis of a jack-knife method, we estimate that our seed-based method of ranking genes will enrich true target genes within the top 5% by a factor of 4–8. Thus, incorporating the vast amount of microarray data stored in databases can help to reduce the extraordinarily high number of false positives obtained with purely sequence-based methods [5,7,35]. More sophisticated clustering methods might even improve the prediction quality further. We provide both statistical and biological evidence that the seed-distribution-distance method is robust and applicable to other transcription factors and is hence very useful in predicting specific transcription factor target genes.

Table 1. Potential NF-κB targets identified by the seed-distribution-distance method that are in the top group of the rank and have predicted NF-κB-binding motifs within their −500 bp upstream promoter region. Interestingly, eight of the 16 identified new targets are known targets of NF-κB. Note that none of the potential new targets was in the initial seed group, so the otherwise known targets constitute a good validation of our method. The third column contains additional information about the results of the analysis of the ChIP assays and the reporter gene analysis (RGA), followed by a + or − in case of a positive or negative result, respectively.

Ensembl gene ID | Description | Additional information
… | … | RGA+ (positive control)
ENSG00000197635 | Dipeptidyl peptidase 4 (DPP4) |
ENSG00000081041 | Macrophage inflammatory protein 2α precursor (CXCL2) | Guitart et al. [61]
ENSG00000169245 | Small inducible cytokine B10 precursor (CXCL10) | O'Donnell et al. [60], suggested
ENSG00000117151 | Di-N-acetylchitobiase precursor (CTBS) |
ENSG00000023445 | Baculoviral IAP repeat-containing protein 3 (BIRC3) | Hosokawa et al. [62]
ENSG00000166718 | Hypothetical protein |
ENSG00000158714 | SLAM family member 8 precursor (SLAMF8) |

Fig. 5. Left column: cross-validation of the seed-distribution-distance method for six different transcription factors. By means of a jack-knife method, the recovery position of the gene left out in the rank was calculated for each transcription factor seed group. There is a clear and high enrichment in the top ranks for each transcription factor tested. Right column: we applied the seed-distribution-distance method to rank genes. We calculated the gene ontology density for typical ontologies of the corresponding factor. Enrichment corresponds to an increased density at the top ranks as compared with the density at the bottom ranks. (Panel labels: E2F, ETS1, HIF-1, HNF4, NF-κB; x-axis: position in rank; gene ontology terms shown: cell cycle, extracellular matrix, response to hypoxia, angiogenesis, liver development, blood coagulation, lipid metabolic process, immune response.)

Top rank members are involved in typical NF-κB-regulated functions and are enriched with putative NF-κB-binding sites

The distance criterion for generating the rank is a kind of expression profile similarity measure with respect to the seed group. It is not a priori clear that similarly regulated genes share the same gene function. The NF-κB analysis, however, reveals that the seed-distribution-distance method highly enriches genes in the top ranks that share typical NF-κB-regulated functions. For instance, the processes immune response, complement activation, regulation of T-cell differentiation and immune cell activation are significantly present in the top group (supplementary Table S1).

Moreover, we found specific enrichment of predicted binding motifs for NF-κB 50 and NF-κB 65 in the top 5% of the genes, along with three other motifs. We would expect the other factors to be functionally related to NF-κB. This is the case for STAT5A, which has been reported to be involved in severe combined immunodeficiency [36] and is involved in the immune response [37]. Please note that these statistics were obtained without the initial seed group. Therefore, it would have been possible in our example to determine with high certainty from the constructed rank which seed group was used to build up the rank, namely a group with NF-κB targets.

OPTN is a direct NF-κB target

We predict a list of new NF-κB targets that were not in the initial seed (Table 1). Eight of the 16 predicted novel targets have been previously confirmed. Three other predicted NF-κB targets were experimentally investigated in this study, and were identified as direct NF-κB targets. OPTN, Spi-B and CASP4 were in our predicted list of new targets. Deletions in the OPTN gene are causative for adult-onset primary open-angle glaucoma [38]. Glaucoma affects 67 million people worldwide [39], and is the second largest cause of bilateral blindness in the world [40]. It has been suggested that OPTN is involved in the TNF-α signalling pathway [41]; however, the molecular mode of action has been unknown up to now. It has been suggested that OPTN blocks the protective effect of E3-14.7K on TNF-α-mediated cell killing, and hence OPTN may be part of the TNF-α signalling pathway that can shift the equilibrium towards induction of apoptosis [38,41]. Recently, it has been shown that OPTN increases cell survival and translocates to the nucleus upon an apoptotic stimulus that is dependent upon the GTPase activity of Rab8, an interaction partner of OPTN [42]. Interestingly, this protective function of OPTN is lost when the OPTN protein is changed to the mutated form E50K, which is typical for patients with normal tension glaucoma [42]. We show that a deletion of a putative NF-κB-binding site in the promoter region of OPTN completely abolishes the enhancing action and modulatory effect of NF-κB on OPTN (Fig. 6). Our experiments show clearly that OPTN is a direct target of NF-κB.

Recent findings indicated that TNF-α potentiates glutamate neurotoxicity through the blockade of glutamate transporter activity [43,44]. Furthermore, it was shown that OPTN and NF-κB essential modulator (NEMO) are competitive inhibitors of one another [45]. NEMO represents the regulatory subunit of IKK, which is essential for NF-κB activation [46]. Together with our data, this makes it apparent that OPTN is part of a negative feedback system that is important for NF-κB action. Elevated OPTN expression reduces induced NF-κB activation [45], and is therefore protective against induced neuronal cell death, which depends on NF-κB activity. This is in line with findings indicating that the protective function of OPTN is lost upon truncation resulting from the insertion of a premature stop codon, and when the OPTN protein is changed to the mutated form E50K, which is markedly reduced in patients suffering from glaucoma [42]. Our data provide the missing link in the signalling of NF-κB and the damping function of OPTN in signalling feedback of NF-κB.

The knowledge about the direct action of NF-κB on OPTN will greatly enhance our understanding of the signalling pathways relevant for antiapoptosis, and will be helpful in designing possible new cell survival strategies in glaucoma patients.

The two other newly identified and verified target genes of the NF-κB transcription factor seem to be involved in important physiological processes related to typical known functions of NF-κB. It is known that the Spi-B transcription factor is expressed in adult pro-T cells, with Spi-B being maximal in the newly committed cells at the DN3 stage [47]. Furthermore, Spi-B can interfere with T-cell development [47]. CASP4 can function as an endoplasmic reticulum stress-specific caspase in humans, and may be involved in the pathogenesis of Alzheimer's disease [48].

Fig. 6. (A) Reporter gene activity: luciferase reporter constructs with the putative NF-κB site intact or deleted (panel labels include NFKBIA promoter, DARS promoter, SPI-B, SPI-B NF-κB del, CASP4, CASP4 NF-κB del), measured for control cells and TNF-α at 1.25, 2.5, 5, 10 and 20 ng·mL⁻¹. (B) ChIP analysis, panels (a)–(e) (panel labels include control DNA, SPIB, OPTN, CASP4): input, anti-rabbit antibody and anti-NF-κB antibody, each for control and TNF-α-treated cells.

When does the seed-distribution-distance method work?

The major assumption of our method is that genes regulated by the same factor show at least some coregulation. We use a genome-wide similarity measure L(x) based on the comparison of the median values of two correlation distributions. For each gene x in the genome, we calculate L(x), which is the median correlation of gene x with all the genes within the seed set minus the median correlation of gene x with all the rest of the genes in the genome. Our approach is able to 'add up' contributions from all the genes in the seed set and, by using the median rather than the mean, it can discard a reasonable number of outliers. Subtracting the median correlation with the rest of the genome corrects for the correlation structure of the expression dataset as a whole. We also tried a more sophisticated scoring scheme, ranking the genes on the basis of a Mann–Whitney rank-sum test, which did not improve the performance of the ranking procedure.
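
The definition of L(x) given above can be illustrated with a minimal sketch. It assumes Pearson correlation as the correlation measure and excludes gene x from both comparison sets; neither choice is stated in this section. The gene names, the expression values and the helper names `pearson` and `seed_score` are hypothetical and serve only as an illustration, not as the implementation used in this work.

```python
from statistics import median

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def seed_score(gene, expr, seed):
    """L(x): median correlation with the seed minus median correlation with the rest."""
    to_seed = [pearson(expr[gene], expr[g]) for g in seed if g != gene]
    to_rest = [pearson(expr[gene], expr[g])
               for g in expr if g not in seed and g != gene]
    return median(to_seed) - median(to_rest)

# Toy expression matrix (gene -> values across five experiments); made-up numbers.
expr = {
    "seed1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "seed2": [2.0, 4.1, 5.9, 8.2, 9.8],
    "seed3": [0.9, 2.2, 2.8, 4.1, 5.3],
    "cand":  [1.5, 2.4, 3.6, 4.4, 5.6],   # co-varies with the seed
    "bg1":   [5.0, 1.0, 4.0, 2.0, 3.0],   # unrelated background genes
    "bg2":   [3.0, 3.5, 1.0, 4.0, 2.0],
    "bg3":   [2.0, 5.0, 1.0, 3.0, 4.0],
}
seed = {"seed1", "seed2", "seed3"}

# Rank all non-seed genes by decreasing L(x).
ranking = sorted((g for g in expr if g not in seed),
                 key=lambda g: seed_score(g, expr, seed), reverse=True)
print(ranking)
```

In this toy example the gene that follows the seed profile receives the highest score, because its median correlation with the seed is close to 1 while its median correlation with the background is close to 0.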

The seed-distribution-distance method is robust: it produces high enrichment even if a considerable part of the seed is not present. This was shown by the cross-validation procedure and the subsequent recovery test.
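
The idea of testing recovery with part of the seed removed can be sketched as follows. This is a simplified hold-out check under stated assumptions (Pearson correlation, made-up expression values, hypothetical helper names such as `holdout_ranks`), and it does not reproduce the exact cross-validation protocol of this work: some seed genes are hidden, all remaining genes are ranked against the reduced seed, and the rank positions of the hidden genes are reported.

```python
from statistics import median

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def seed_score(gene, expr, seed):
    """L(x): median correlation with the seed minus median correlation with the rest."""
    to_seed = [pearson(expr[gene], expr[g]) for g in seed if g != gene]
    to_rest = [pearson(expr[gene], expr[g])
               for g in expr if g not in seed and g != gene]
    return median(to_seed) - median(to_rest)

def holdout_ranks(expr, seed, held_out):
    """Rank (1 = best) of each held-out seed gene when scored against the reduced seed."""
    reduced = set(seed) - set(held_out)
    candidates = [g for g in expr if g not in reduced]
    ranked = sorted(candidates,
                    key=lambda g: seed_score(g, expr, reduced), reverse=True)
    return {g: ranked.index(g) + 1 for g in held_out}

# Toy data: four co-regulated seed genes and three background genes (made-up numbers).
expr = {
    "s1":  [1.0, 2.0, 3.0, 4.0, 5.0],
    "s2":  [2.0, 4.1, 5.9, 8.2, 9.8],
    "s3":  [0.9, 2.2, 2.8, 4.1, 5.3],
    "s4":  [1.5, 2.4, 3.6, 4.4, 5.6],
    "bg1": [5.0, 1.0, 4.0, 2.0, 3.0],
    "bg2": [3.0, 3.5, 1.0, 4.0, 2.0],
    "bg3": [2.0, 5.0, 1.0, 3.0, 4.0],
}
ranks = holdout_ranks(expr, {"s1", "s2", "s3", "s4"}, ["s4"])
print(ranks)
```

A hidden seed gene that lands near the top of the ranking indicates that the remaining seed still carries enough information to recover it.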

The seed-distribution-distance method is expected to produce a biologically meaningful rank if the seed group is homogeneous with respect to its expression correlation. If, for instance, the seed group contains completely unrelated expression clusters that are located in the cluster space in a linearly independent way, the resulting distance measure might not be capable of building up a transcription factor-specific rank. In this case, one would need to cluster the seed group into subseeds and to build up individual cluster-specific ranks. For instance, this might be necessary in the case of transcription factors that target different genes depending on the splice form of the transcription factor. Interestingly, however, in our analysis, the performance of the method does not seem to depend crucially on the homogeneity of the expression of the seed group, as some seed groups that performed well in the cross-validation test had large intraseed variations (supplementary Fig. S4).
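
One simple way to split a heterogeneous seed into subseeds is sketched below. The grouping rule is an assumption made for illustration and was not part of the analysis described here: seed genes are linked when their Pearson correlation reaches a hypothetical threshold of 0.5, and each connected group of linked genes forms one subseed. The function name `split_seed` and all expression values are made up.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def split_seed(expr, seed, threshold=0.5):
    """Group seed genes into subseeds: connected components of the graph that
    links two seed genes when their correlation is at least `threshold`."""
    unassigned = sorted(seed)          # sorted for a deterministic result
    subseeds = []
    while unassigned:
        group = [unassigned.pop(0)]    # start a new subseed
        frontier = list(group)
        while frontier:
            current = frontier.pop()
            linked = [g for g in unassigned
                      if pearson(expr[current], expr[g]) >= threshold]
            for g in linked:
                unassigned.remove(g)
            group.extend(linked)
            frontier.extend(linked)
        subseeds.append(sorted(group))
    return subseeds

# Toy seed with two unrelated expression patterns (made-up numbers).
expr = {
    "a1": [1.0, 2.0, 3.0, 4.0, 5.0],   # steadily increasing
    "a2": [1.1, 2.0, 3.2, 3.9, 5.0],
    "b1": [1.0, 5.0, 1.0, 5.0, 1.0],   # alternating
    "b2": [2.0, 6.0, 2.5, 6.0, 2.0],
}
subseeds = split_seed(expr, {"a1", "a2", "b1", "b2"})
print(subseeds)
```

Each subseed could then be scored separately with L(x) to obtain a cluster-specific rank.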

A second consideration relates to the expression dataset. The seed-distribution-distance method relies on the assumption that the transcription factor of interest shows some biological activity in the data. If, for example, the transcription factor of interest is completely shut down in all experiments, one would not expect to be able to recover the regulation response of that factor. This issue might be of importance for genes that are only active during narrow periods of development. One solution to this problem would be to generate expression experiments with artificial expression of that transcription factor or to include native material from that developmental period in the microarray analysis.

The third consideration relates to the size of the seed. One would expect that if the seed is too small to define the target response adequately, the rank will be poorly defined. However, our bootstrapping test showed that 10 seed genes are capable of enriching

Fig. 6. Experimental validation of predicted NF-κB targets by functional analyses and physical NF-κB interaction with the predicted NF-κB-binding sites in the nuclear chromatin context. (A) RGA. HEK293 cells were transfected and treated for 24 h with TNF-α in a dose-dependent manner (n = 4). (a) Schematic illustration of experimental design. RGA was measured with unmodified native promoter constructs (left column) and in constructs where the putative NF-κB-binding sites were deleted (right column, NF-κB del). (b) Promoter activity for NFKBIA, which is known to be a target of NF-κB, and a negative control (DARS). Only the NFKBIA promoter responded in a dose-dependent manner under stimulation with TNF-α. (c, d, e) RGA for the (c) OPTN, (d) SPI-B and (e) CASP4 promoter. All experiments showed a dose-dependent increase in promoter activity under stimulation with TNF-α. Deletion of the putative NF-κB-binding site resulted in significantly attenuated dose-dependent responses. (B) ChIP analysis. HEK293 cells were cultured with TNF-α (10 ng·mL⁻¹) or without (control) for 24 h prior to crosslinking and ChIP using anti-rabbit serum (negative control) or an antibody to NF-κB. Relative values of immunoprecipitated DNA were assessed by real-time PCR (n = 3). (a) Amplification of a coding region part of the intron-less gene encoding GAPDH, which should show no promoter-like activity and contains no potential NF-κB-binding element, served as control DNA. (b–e) Verification of the predicted NF-κB-binding sites was obtained for the (b) positive control NFKBIA as well as (c) OPTN and (d) SPI-B. NF-κB-dependent activation of (e) the CASP4 promoter is not indicated by ChIP analysis in HEK293 cells.

Date posted: 18/02/2014, 18:20
