analysis of gene expression
Wang-Jie Xu1, Qiao-Li Li1, Chen-Jiang Yao1, Zhao-Xia Wang1, Yang-Xing Zhao1 and Zhong-Dong Qiao1,2
1 College of Life Science and Technology, Shanghai Jiao Tong University, Shanghai, China
2 Shanghai Institute of Medical Genetics, Shanghai Jiao Tong University, Shanghai, China
The serial analysis of gene expression (SAGE) technique allows the construction of a comprehensive expression profile, in which each mRNA is defined by a specific 14-mer [1–4]. By analyzing a short sequence tag for each transcript, SAGE significantly decreases the overall scale of sequencing analysis and makes it possible to analyze nearly all of the expressed transcripts from the genome, a capability matched by no other currently available method [5]. Application of the SAGE technique has provided valuable information in various biological systems [6,7]. Recently, millions of short cDNA sequences called SAGE tags have been collected from human tissues through the SAGE method [8,9]. Upon analysis of the SAGE data, it has frequently been observed that a large number of SAGE tags do not match existing expressed sequences
Keywords
modified lock-docking oligo(dT); mRNA;
RACE; serial analysis of gene expression
(SAGE); two-step analysis of unknown
SAGE tags (TSAT-PCR)
Correspondence
Z Qiao, Shanghai Institute of Medical
Genetics, Shanghai Jiao Tong University,
Shanghai, China
Fax: +86 21 54747330
Tel: +86 21 34204925
E-mail: zdqiao@sjtu.edu.cn
(Received 3 August 2008, revised 3
September 2008, accepted 5 September
2008)
doi:10.1111/j.1742-4658.2008.06671.x
Serial analysis of gene expression (SAGE) is a powerful technique for studying gene expression at the genome level. However, the short length of SAGE tags limits further study of the related data. In this study, in order to identify the genes behind unknown tags, we developed a semi-nested PCR-based method called the two-step analysis of unknown SAGE tags (TSAT-PCR) to generate longer 3′-end cDNA fragments from unknown SAGE tags. In the procedure, a modified lock-docking oligo(dT) with two degenerate nucleotide positions at the 3′-end was used as a reverse primer to synthesize cDNAs. Afterwards, the full-length cDNAs were amplified by PCR based on 5′-RACE and 3′-RACE. The amplified cDNAs were then used for the subsequent two-step PCR of the TSAT-PCR process. The first-step PCR was carried out at an appropriately low annealing temperature; a SAGE tag-specific primer was used as the sense primer, and an 18 bp sequence (universal primer I) located at the 5′-end of the reverse primer was used as the antisense primer. After 15–20 PCR cycles, the 3′-end cDNA fragments containing the tag were enriched, and the PCR products were used as templates for the second-step PCR to obtain the specific products. The second-step PCR was performed at a high annealing temperature with a SAGE tag-specific primer and a 22 bp sequence (universal primer II) upstream of universal primer I at the 5′-end of the reverse primer. With the TSAT-PCR method, we could easily obtain specific PCR products covering SAGE tags from those transcripts, especially low-abundance transcripts. It can be used as a method to identify genes expressed in different cell types.
Abbreviations
GLGI, generation of longer cDNA fragments from serial analysis of gene expression tags for gene identification; PLF, primary library forward primer; PLR, primary library reverse primer; RAST-PCR, rapid reverse transcription–PCR analysis of unknown serial analysis of gene expression tags; rSAGE, reverse serial analysis of gene expression; SAGE, serial analysis of gene expression; TSAT-PCR, two-step analysis
of unknown SAGE tags; UP-I, universal primer I; UP-II, universal primer II.
[10,11]. It is possible, then, that the unmatched SAGE tags originate from potentially novel transcripts or novel genes that are as yet unidentified in the human genome. We have constructed a SAGE library from human spermatozoa, in which we obtained more than 2500 unique tags. Of these, 54 were considered to be high-frequency tags for which no homology could be found in the GenBank database [12]. Therefore, those tags might represent unidentified genes. However, there was a major problem when the SAGE tag sequence was applied to the process of gene identification. Owing to the short length of SAGE tag sequences, it was difficult to produce longer 3′ cDNA fragments, let alone whole cDNA sequences, by PCR, which hampered further studies on SAGE data. Moreover, the short tag has hindered the application of SAGE to the vast majority of eukaryotes, for which sufficient expressed sequence tag and genome sequence resources are not available [13].
In order to solve this problem, we have developed a technique called the two-step analysis of unknown SAGE tags (TSAT-PCR) to generate longer 3′ cDNA ends. The three key points of our method are as follows: first, it uses a modified lock-docking oligo(dT) primer, with two degenerate nucleotide positions at the 3′-end, as a reverse primer to synthesize the first-strand cDNA; second, the primary cDNAs are enriched by PCR and then serve as templates for the subsequent TSAT-PCR experiment; and third, the two-step PCR method is designed on the semi-nested PCR principle in order to obtain tag-specific 3′-end cDNA fragments. So far, we have successfully used this procedure to test and analyze 11 of the 54 unmatched SAGE tags.
Results and Discussion
Enrichment of cDNA template
Using RACE technology, we could amplify full-length cDNAs to generate enough template for the subsequent PCR, especially for low-abundance cDNAs (Fig. 1A). In this study, the amplification of cDNAs was carried out as follows. First, in the reverse transcription step, the two degenerate nucleotide positions at the 3′-end of the modified oligo(dT) primer position the primer at the start of the poly(A)+ tail, thereby eliminating the 3′-heterogeneity inherent in conventional oligo(dT) priming [14]. As the PrimeScript Reverse Transcriptase exhibited terminal transferase activity upon reaching the end of an RNA template, it added three to five residues (predominantly dC) to the 3′-end of the first-strand cDNA. The 5′-cap oligonucleotide contained a terminal stretch of G residues that annealed to the dC-rich cDNA tail and served as an extended template for reverse transcription. In the subsequent PCR, the reverse transcription product was used as the template. The primary library forward primer (PLF) and primary library reverse primer (PLR) paired with the 5′-end and 3′-end of all cDNAs, respectively, and after 25 cycles the whole cDNA population was substantially amplified for the next experiment. Figure 2 shows the amplified cDNAs. As can be seen, the length of the smear is distributed from about
[Figure 1 schematic; labels recovered from the diagram: mRNA with poly(A) tail (NBAAAAAAA-3′); modified oligo(dT) primer (NVT16); 5′-cap oligo (GGG) paired with the dC-rich tail (CCC); anneal first-strand primer to mRNA; cDNA first-strand synthesis; cDNA synthesis; cDNA library; PLF; PLR; tag-specific primer (GGATCC); UP-I; UP-II; the 1st PCR; the 2nd PCR.]
Fig. 1. Detailed mechanism of the amplification of the whole cDNAs and the TSAT-PCR technique. (A) In this process, double-stranded cDNAs synthesized with the modified lock-docking oligo(dT) and the 5′-cap oligonucleotide were used for PCR. During the PCR, PLF and PLR were used as the sense primer and antisense primer, respectively, to amplify the cDNAs. (B) The procedure involved two PCR reactions. The first PCR was performed with a tag-specific primer containing a SAGE tag sequence and an 18 bp primer (UP-I) located at the 5′-end of the reverse primer. The first PCR product was then used as the template for the second PCR, in which the tag-specific primer and a 22 bp primer (UP-II), located near UP-I at the 5′-end of the reverse primer, were used as the sense primer and the antisense primer, respectively.
100 bp to over 2 kb, and is mostly concentrated in the 0.3–1 kb range. The results demonstrate that high-abundance genes are not very variable in length, as they mostly fall within a narrow span (0.3–1 kb). Outside this range, there are a few low-abundance genes that are either very long (50 kb) or short (50 bp). It seems that the smear of these genes was not obvious because of their low abundance, the short extension time in the PCR, or both.
TSAT-PCR general strategy
The amplified cDNAs served as primary templates for TSAT-PCR, as illustrated in Fig. 1B. The antisense primers [PLR, universal primer I (UP-I) and universal primer II (UP-II)] were all designed from the sequence of the modified oligo(dT) primer. The three primers overlapped with each other, and their lengths differed so as to be compatible with their respective sense primers (Fig. 3). Both UP-I and UP-II were used as nested primers in the TSAT-PCR reactions. The TSAT-PCR technique was developed from the principle of nested PCR, and the procedure comprised two PCR steps. The first-step PCR was run for 15–20 cycles at an appropriately low annealing temperature (about 55 °C) with a SAGE tag-specific primer and UP-I. As a result, the 3′-end cDNA fragments containing the tag were enriched, although some nonspecific products were generated at the same time; the PCR products were then used as templates for the second-step PCR to obtain the specific products. The second-step PCR was performed with a SAGE tag-specific primer and a nested primer (UP-II) at a high annealing temperature (≥ 60 °C). In this way, the specific products corresponding to the tags could be amplified.
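The relationship between the antisense primers can be checked directly from the sequences listed in Experimental procedures. The following Python sketch is an illustration only: it locates PLR, UP-I and UP-II within the adapter and T16 portion of the reverse transcription primer. The degenerate VN positions are left out, and the function and variable names are illustrative rather than part of the protocol.

```python
# Adapter and T16 tract of the modified oligo(dT) primer (VN anchor omitted).
ADAPTER = "CCAGACACTATGCTCATACGACGCAG"
RT_PRIMER = ADAPTER + "T" * 16

# Antisense primers as given in Experimental procedures.
ANTISENSE_PRIMERS = {
    "PLR": "CCAGACACTATGCTCATACGACG",
    "UP-I": "CCAGACACTATGCTCATA",
    "UP-II": "CACTATGCTCATACGACGCAGT",
}


def primer_spans(template, primers):
    """Return {name: (start, end)} for each primer (0-based, end exclusive)."""
    spans = {}
    for name, seq in primers.items():
        start = template.find(seq)
        if start < 0:
            raise ValueError(f"{name} is not contained in the template")
        spans[name] = (start, start + len(seq))
    return spans


spans = primer_spans(RT_PRIMER, ANTISENSE_PRIMERS)
for name, (start, end) in spans.items():
    print(f"{name}: positions {start}-{end} ({end - start} nt)")
```

With these sequences, PLR and UP-I both start at the 5′-end of the reverse transcription primer, whereas UP-II starts 5 nt further in and extends one base into the T16 tract.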
Amplification of longer sequences from SAGE tags
To test the TSAT-PCR procedure, we chose five tags corresponding to known genes, as well as 11 tags of different abundance corresponding to unknown genes, all identified in the SAGE analysis of human spermatozoa (Table 1). Among the 16 tags, tags 4, A and E were used as representatives of low-frequency genes in order to determine whether the process worked on low-frequency tags. Using the TSAT-PCR method, we obtained PCR products (Fig. 4) for all tags tested under the standard PCR conditions (first PCR, 94 °C for 30 s, 55 °C for 30 s and 72 °C for 30 s for 15 cycles; second PCR, 94 °C for 30 s, 60 °C for 30 s and 72 °C for 30 s for 25 cycles). The PCR products were electrophoresed through a 2.0% agarose gel, and cloned into a plasmid vector for sequencing analysis. Compared with the others, tags 1, 2, 3, 4 and A displayed very weak PCR bands in the agarose gel, especially the two low-frequency tags (Fig. 4). In addition, there were two clear bands in the PCR product of no. 10.
We further optimized the PCR annealing temperature, as well as the cycle number, for each of the weak-band tags. After optimization, these bands were obviously clearer than the previous ones (data not shown). We then verified whether each PCR product indeed represented a sequence downstream of the most 3′ NlaIII site in the full-length cDNA by analyzing the
[Figure 2 gel image; marker bands labelled 3530, 1584, 947 and 564 bp.]
Fig. 2. PCR amplification of the full-length cDNAs. The cDNAs were amplified with PLF and PLR. M: λDNA/HindIII + EcoRI marker. Lane 1: amplified full-length cDNAs. Lane 2: glyceraldehyde-3-phosphate dehydrogenase (GAPDH). GAPDH was used as a control.
Fig. 3. The sequences and relationships of the primers [modified oligo(dT), PLR, UP-I and UP-II] discussed in this article.
sequences of the products. If the tag sequence was present at the predicted location, no NlaIII site would be present in the sequence of the obtained PCR product, whereas the PCR product would include the oligo(dT16) sequence. All PCR products were cloned and sequenced successfully (Table S1). Through analysis of the sequencing results, we identified 16 of 17 PCR products (Figs 4 and 5) that met the criteria mentioned above. This indicates that these 16 PCR products represented sequences downstream of the most 3′ NlaIII restriction site. In contrast, the remaining PCR product was the larger band of the no. 10 product, in which the sequences of UP-II and oligo(dT16) were not found; only the tag-specific primer was found in its sequence. This meant that the product was amplified using only a single primer (the tag-specific primer no. 10). Such a single-primer product could only be identified by sequencing. The sequencing results (Table 1) were analyzed using the BLAST program on the NCBI server (http://www.ncbi.nlm.nih.gov/BLAST/). Among the five fragments containing known tags (Table 1), four sequences, corresponding to tags A, B, C and E, matched the 3′-cDNA of the genes predicted by Zhao et al. on the basis of the spermatozoa SAGE tags [12], whereas no. D did not match the predicted gene (Hs.436980). The reason for this was further investigated, and it was found that tag D could not represent this gene, because seven NlaIII (CATG) sites were found between the site of tag D and the poly(dA) tract in the cDNA of the gene (Hs.436980). The BLAST results for the other 11 sequences in the GenBank
Table 1. Overview of all tags analyzed with the TSAT-PCR technique. The sequences from nos 1 and 7 each matched a single sequence. No. 11 matched multiple clusters. The rest of the sequences did not match any clusters.
Column headings: PCR product size (bp) | Presence of NlaIII site | Presence of oligo(dT) | BLAST results
Accession numbers: BC021246, BC013387, AY211920, BC092442
a Single-primer PCR product.
[Figure 4 gel image; marker bands labelled 500, 400, 200 and 100 bp.]
Fig. 4. TSAT-PCR analysis of 16 tags. Lanes 1–11: unknown SAGE tags corresponding to tags 1–11 in Table 1. Lanes a, b, c, d and e: known SAGE tags corresponding to tags A, B, C, D and E in Table 1. TSAT-PCR was performed as described in Results and Discussion.
database (refseq_rna: reference mRNA sequences and expressed sequence tags) fell into several categories (Table 1): match, multimatch, unmatch and mismatch. The accession numbers of the matched and multimatched sequences are given in Table 1. No. 8 was defined as a mismatch, because the BLAST result showed that the site of the tag did not exactly match sequences in the GenBank database, owing to nonspecific amplification. The genes corresponding to two of the matched sequences (tags A and E) are Hs.431668 (COX6B1, cytochrome c oxidase subunit VIb polypeptide 1) and Hs.34114 [ATP1A2, ATPase, Na+/K+-transporting, α2(+) polypeptide], which are related to energy production for the motility of human spermatozoa. Hs.372658, corresponding to no. B, is a gene coding for spermatogenesis-related protein 7, which could take part in spermatogenesis. The genes corresponding to tags C, 1 and 7 are Hs.435464 (Homo sapiens neuritin 1-like), AK027322 (highly similar to signal recognition particle 68 kDa protein) and NR_003286 (Homo sapiens 18S ribosomal RNA), respectively. As little is currently known about the function of mRNAs in human spermatozoa, it is difficult to judge whether these genes are related to the function of human spermatozoa or are just retained from spermatogenesis. For the unmatched and multimatched sequences, 5′-RACE experiments should be carried out to obtain their full-length cDNA sequences and to determine whether the sequences represent new genes.
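The sequence check applied to each cloned product (no NlaIII site in the product, oligo(dT16) present) can be expressed as a small script. The sketch below is a hedged illustration rather than the procedure actually used: it assumes that the product is given as the sense strand, that the tag portion is supplied without its CATG, and that the primer tail appears as a run of 16 A residues. The example sequences and the function name are invented for demonstration.

```python
NLAIII_SITE = "CATG"


def downstream_check(product, tag, site=NLAIII_SITE, dt_len=16):
    """Count restriction sites downstream of the tag and test for a poly(A) run.

    Returns (number_of_sites, has_poly_a) for the region after the tag.
    """
    product = product.upper()
    tag = tag.upper()
    start = product.find(tag)
    if start < 0:
        raise ValueError("tag not found in product")
    downstream = product[start + len(tag):]
    # CATG cannot overlap itself, so str.count gives the true number of sites.
    return downstream.count(site), ("A" * dt_len) in downstream


# Hypothetical sequences, for illustration only.
tag = "ACGTTGCAAC"
good_product = "GGATCC" + tag + "GTTAGCCTAGGA" + "A" * 16
bad_product = "GGATCC" + tag + "GTCATGCC" + "A" * 16

good_result = downstream_check(good_product, tag)
bad_result = downstream_check(bad_product, tag)
print(good_result, bad_result)
```

A product passes when the first value is zero and the second is true. A tag such as no. D, with several CATG sites between the tag and the poly(dA) tract of the predicted gene, would fail the first condition for that gene.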
During the course of our research on the SAGE data of human spermatozoa, we found that other methods [rapid reverse transcription–PCR analysis of unknown SAGE tags (RAST-PCR) [15], generation of longer cDNA fragments from SAGE tags for gene identification (GLGI) [16] and reverse SAGE (rSAGE) [17]] could hardly generate the 3′-fragment sequences of these unmatched tags. Although GLGI is more effective than RAST-PCR [17], the antisense primer in GLGI is composed only of oligo(dT), so the PCR conditions, including the Mg2+ concentration, the number of PCR cycles and the annealing temperature, have to be rigorously optimized for each SAGE tag. In our experiments, we often encountered nonspecific amplification or multiple fragments, and had difficulty in amplifying the products of low-frequency tags, owing to the short antisense primer. The rSAGE method was derived from SAGE, and many steps and reagents are shared by these two protocols. However, step 4 (linker ligation) in the rSAGE protocol does not prevent self-ligation of the cDNA, and this self-ligation leads to smearing in the subsequent PCR amplification. In addition, the method requires more initial total RNA and poly(A)+ RNA than SAGE, because of the loss of RNA in each step. This demand for RNA restricts the application of the method in experiments where little total RNA is available, as each human spermatozoon is estimated to contain just 0.015 pg of total RNA [18], only about 1/600 of the amount in a somatic cell.
To avoid these problems, we used semi-nested PCR to improve the specificity of amplification, and developed the method called TSAT-PCR. Using the GLGI conditions described previously [16], we compared the two methods with six tags and obtained the expected results (Fig. 5). The bands obtained with TSAT-PCR are obviously clearer than those obtained with GLGI; moreover, the low-abundance tags (4, A and E; abundance < 6) were all obtained with TSAT-PCR.
In comparison with other methods, ours is able to amplify the target PCR products from low-abundance transcripts. Also, the method needs a lower initial amount of mRNA than the others. Furthermore, our method has the advantages of being simple, rapid, low in cost and highly efficient. We have demonstrated that the PCR amplification method described above yields a clear band of PCR product in each case, as well as enough full-length cDNA to serve as PCR template for subsequent experiments.
Although the improved versions of SAGE can generate tags with lengths of 21 bases [19] and 26 bases [13], which theoretically can be uniquely assigned to a single
[Figure 5 gel image; marker bands labelled 500, 300, 200 and 100 bp; lanes M, E, C, B, A, 10, 4 (TSAT-PCR) and E, C, B, A, 10, 4 (GLGI).]
Fig. 5. Comparison between GLGI and TSAT-PCR. A set of six SAGE tags was chosen for the analysis. Among the six tags, three low-abundance tags (4, A and E; abundance < 6) were examined. The same RNA from human spermatozoa and the same sense primers were used for both methods. The conditions used for GLGI followed the procedures described in [16].
genomic position [20], there still exist much earlier SAGE databases constructed with the conventional SAGE technique, which consist of shorter tags (14 bp). Converting short tags into longer 3′ cDNA fragments is a key step toward further studies on SAGE data. Our method would help SAGE to become a high-throughput technique that could be widely applied to gene expression analysis.
In summary, the method could be applied to further analyses of SAGE data gathered from humans and some other eukaryotic species. Our approach has several important advantages, such as the following: (a) it can provide enough full-length cDNA template for subsequent experiments, such as 5′-RACE, 3′-RACE and northern blotting, among others; (b) it can convert short SAGE tag sequences into longer 3′ cDNA sequences; (c) it can obtain full-length cDNA sequences containing specific tags from mRNA transcripts, especially low-abundance mRNA transcripts, through the combined application of TSAT-PCR and 5′-RACE; and (d) it can identify novel genes from SAGE data and confirm the existence of exons predicted by bioinformatic tools in genomic sequences.
Experimental procedures
Tag sequences
In our SAGE library generated from human spermatozoa, each tag was screened for homology against the UniGene database (http://www.ncbi.nlm.nih.gov/SAGE/SAGEtag.cgi?tag) to identify its respective match. We chose 16 SAGE tags: five tags corresponding to known genes, which served as positive controls for this experiment, and 11 tags of different abundance from the 54 unmatched tags corresponding to unknown genes.
RNA samples and cDNA synthesis
Total RNA of purified spermatozoa was extracted using Trizol RNA isolation reagent (Invitrogen, Carlsbad, CA, USA), according to the manufacturer's protocol (http://www.invitrogen.com/content/sfs/manuals/10296010.pdf). The quantity of extracted RNA was determined by UV absorption. cDNAs were then generated with a modified RACE method using PrimeScript Reverse Transcriptase (TaKaRa, Dalian, China), following the manufacturer's instructions. Briefly, two primers were added to the reverse transcription reaction: one was the modified oligo(dT) primer (5′-CCAGACACTATGCTCATACGACGCAG-T16-VN-3′; N = A, C, G or T; V = A, G or C), which was used as the reverse transcription primer to generate the first-strand cDNA; the other was the 5′-cap oligonucleotide primer (5′-AAGCAGTGGTATCAACGCAGAGTACGCGGG-3′), which annealed to the dC-rich cDNA tail and served as an extended template for reverse transcription. The resulting set of full-length cDNAs served as a primary library of spermatozoa cDNAs for further studies.
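Because of the two degenerate 3′ positions, the modified oligo(dT) primer is in practice a mixture of sequences. As an illustration only (the function name is hypothetical), the mixture can be enumerated in Python from the definitions V = A, G or C and N = A, C, G or T:

```python
from itertools import product

ADAPTER = "CCAGACACTATGCTCATACGACGCAG"
V_BASES = "ACG"   # V: any base except T, which anchors the primer at the poly(A) start
N_BASES = "ACGT"  # N: any base


def anchored_primers(adapter=ADAPTER, dt_len=16):
    """List every concrete sequence covered by adapter-T(dt_len)-VN."""
    return [adapter + "T" * dt_len + v + n for v, n in product(V_BASES, N_BASES)]


primers = anchored_primers()
print(len(primers), "primer species, each", len(primers[0]), "nt long")
```

Excluding T at the V position is what prevents the primer from annealing further inside the poly(A) tail, so each species can only sit at the junction between the transcript body and the tail.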
Amplification of primary library
The full-length cDNAs of spermatozoa were amplified by PCR using TaKaRa Ex Taq Hot Start Version (TaKaRa), with the primary library serving as the template. Briefly, PLF (5′-AAGCAGTGGTATCAACGCAGAGT-3′), which corresponds to the sequence introduced at the 5′-end of all cDNAs by the 5′-cap oligonucleotide primer, was used as the sense primer. PLR (5′-CCAGACACTATGCTCATACGACG-3′), which corresponds to the sequence incorporated at the 3′-ends of all cDNAs by the reverse transcription primer, was used as the antisense primer. The PCR program consisted of 25 cycles of 94 °C for 30 s, 66 °C for 30 s and 72 °C for 3 min, followed by a final extension at 72 °C for 5 min. Ten microliters of the PCR product was checked by 1.2% agarose gel electrophoresis.
TSAT-PCR
The amplified primary library was diluted 10³-fold with sterile H2O for TSAT-PCR analyses. A 1 µL aliquot was used directly as the template for the first PCR amplification with the tag-specific primer (5′-GGATCCXXXXXXXXXX; X represents the tag sequence) and UP-I (5′-CCAGACACTATGCTCATA-3′). The reaction was carried out for 15 cycles under the following conditions: 94 °C for 30 s, 53–55 °C for 30 s and 72 °C for 30 s, with TaKaRa Ex Taq (TaKaRa), using a Bio-Rad Cycler (Bio-Rad, Hercules, CA, USA). The resulting PCR product was diluted 10³-fold with sterile H2O, and a 1 µL aliquot was used as the template for the second, nested PCR amplification with the tag-specific primer and UP-II (5′-CACTATGCTCATACGACGCAGT-3′) under the following conditions: 25–30 cycles of 94 °C for 30 s, 60 °C for 30 s and 72 °C for 30 s, using TaKaRa Ex Taq (TaKaRa).
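The use of a lower annealing temperature in the first step and a higher one in the second is consistent with a rough melting-temperature estimate for the two universal primers. The sketch below uses the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) °C. This rule of thumb is not part of the original protocol and is only approximate for oligonucleotides of this length, so the numbers should be read as a relative comparison rather than as measured values.

```python
def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


UP_I = "CCAGACACTATGCTCATA"        # antisense primer, first PCR
UP_II = "CACTATGCTCATACGACGCAGT"   # antisense primer, second PCR

tm_up1 = wallace_tm(UP_I)
tm_up2 = wallace_tm(UP_II)
print(f"UP-I: {tm_up1} C, UP-II: {tm_up2} C")
```

Under this estimate UP-II has the higher Tm, in line with annealing at 53–55 °C in the first PCR and 60 °C in the second. The tag-specific primer would need to be checked separately for each tag.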
DNA cloning and sequencing
The PCR products were cloned into the pT19G-T vector (Generay Biotech, Shanghai, China). Positive clones were screened by PCR with the M13 reverse and M13 forward (220 bp) primers, which are located in the vector; sequencing reactions were performed by Sanny Bio-Tech (Shanghai, China).
Acknowledgements
This work was supported by the Shanghai Leading Academic Discipline Project (B205).
References
1 Velculescu VE, Zhang L, Vogelstein B & Kinzler KW (1995) Serial analysis of gene expression. Science 270, 484–487.
2 Madden SL, Galella EA, Zhu J, Bertelsen AH & Beaudry GA (1997) SAGE transcript profiles for p53-dependent growth regulation. Oncogene 15, 1079–1085.
3 Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai MA, Bassett DE Jr, Hieter P, Vogelstein B & Kinzler KW (1997) Characterization of the yeast transcriptome. Cell 88, 243–251.
4 Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH, Hamilton SR, Vogelstein B & Kinzler KW (1997) Gene expression profiles in normal and cancer cells. Science 276, 1268–1272.
5 Velculescu VE, Madden SL, Zhang L, Lash AE, Yu J, Rago C, Lal A, Wang CJ, Beaudry GA, Ciriello KM et al. (1999) Analysis of human transcriptomes. Nat Genet 23, 387–388.
6 Hashimoto S, Suzuki T, Dong HY, Nagai S, Yamazaki N & Matsushima K (1999) Serial analysis of gene expression in human monocyte-derived dendritic cells. Blood 94, 845–852.
7 Hibi K, Liu Q, Beaudry GA, Madden SL, Westra WH, Wehage SL, Yang SC, Heitmiller RF, Bertelsen AH, Sidransky D et al. (1998) Serial analysis of gene expression in non-small cell lung cancer. Cancer Res 58, 5690–5694.
8 Lal A, Lash AE, Altschul SF, Velculescu V, Zhang L, McLendon RE, Marra MA, Prange C, Morin PJ, Polyak K et al. (1999) A public database for gene expression in human cancers. Cancer Res 59, 5403–5407.
9 Boon K, Osorio EC, Greenhut SF, Schaefer CF, Shoemaker J, Polyak K, Morin PJ, Buetow KH, Strausberg RL, De Souza SJ et al. (2003) An anatomy of normal and malignant gene expression. Proc Natl Acad Sci USA 99, 11287–11292.
10 Lee S, Zhou G, Clark T, Chen J, Rowley JD & Wang SM (2001) The pattern of gene expression in human CD15+ myeloid progenitor cells. Proc Natl Acad Sci USA 98, 3340–3345.
11 Zhou G, Chen J, Lee S, Clark T, Rowley JD & Wang SM (2001) The pattern of gene expression in human CD34+ hematopoietic stem/progenitor cells. Proc Natl Acad Sci USA 98, 13966–13971.
12 Zhao YX, Li QL, Yao CJ, Wang ZX, Zhou Y, Wang YJ, Liu LM, Wang YF, Wang LY & Qiao ZD (2006) Characterization and quantification of mRNA transcripts in ejaculated spermatozoa of fertile men by serial analysis of gene expression. Hum Reprod 21, 1583–1590.
13 Matsumura H, Reuter M, Krüger DH, Winter P, Kahl G & Terauchi R (2008) SuperSAGE. Methods Mol Biol 387, 55–70.
14 Borson ND, Sato WL & Drewes LR (1992) A lock-docking oligo(dT) primer for 5′ and 3′ RACE PCR. PCR Methods Appl 2, 144–148.
15 van den Berg A, van der Leij J & Poppema S (1999) Serial analysis of gene expression: rapid RT-PCR analysis of unknown SAGE tags. Nucleic Acids Res 27, e17.
16 Chen JJ, Rowley JD & Wang SM (2000) Generation of longer cDNA fragments from serial analysis of gene expression tags for gene identification. Proc Natl Acad Sci USA 97, 349–353.
17 Richards M, Tan SP, Chan WK & Bongso A (2006) Reverse serial analysis of gene expression (SAGE) characterization of orphan SAGE tags from human embryonic stem cells identifies the presence of novel transcripts and antisense transcription of key pluripotency genes. Stem Cells 24, 1162–1173.
18 Miller D, Ostermeier GC & Krawetz SA (2005) The controversy, potential and roles of spermatozoal RNA. Trends Mol Med 11, 156–163.
19 Wahl MB, Heinzmann U & Imai K (2005) Long SAGE analysis significantly improves genome annotation: identifications of novel genes and alternative transcripts in the mouse. Bioinformatics 21, 1389–1392.
20 Saha S, Sparks AB, Rago C, Akmaev V, Wang CJ, Vogelstein B, Kinzler KW & Velculescu VE (2002) Using the transcriptome to annotate the genome. Nat Biotechnol 20, 508–512.
Supporting information
The following supplementary material is available:
Table S1. The amplified longer cDNA sequences.
This supplementary material can be found in the online version of this article.
Please note: Wiley-Blackwell is not responsible for the content or functionality of any supplementary materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.