PROSITE example SWISS-PROT example
Kyte & Doolittle hydropathy plot of protein sequences, displaying
properties of the sequence are determined by taking a short length of sequence known as a window and determining the properties of the sequence in that window. The window is incrementally moved along the sequence, and the properties are calculated at each new position.
-shiftincrement (integer)
This is the amount the window is moved at each increment in order to find the melting point and other properties along the sequence.
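The window-and-shift scheme described above can be sketched in a few lines of Python. This is only an illustration of the windowing idea, not the program's own calculation: the function names (`sliding_windows`, `gc_fraction`, `profile`) are hypothetical, and GC fraction is used as a stand-in property because the melting-point formula is not given in this text.

```python
from typing import Iterator, List, Tuple


def sliding_windows(sequence: str, window_size: int, shift: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (start, window) pairs, moving the window by `shift` each step."""
    if window_size <= 0 or shift <= 0:
        raise ValueError("window_size and shift must be positive")
    # Stop once a full-length window no longer fits in the sequence.
    for start in range(0, len(sequence) - window_size + 1, shift):
        yield start, sequence[start:start + window_size]


def gc_fraction(window: str) -> float:
    """Fraction of G and C bases in the window (a stand-in property)."""
    window = window.upper()
    return (window.count("G") + window.count("C")) / len(window)


def profile(sequence: str, window_size: int, shift: int = 1) -> List[Tuple[int, float]]:
    """Compute the stand-in property at each window position."""
    return [(start, gc_fraction(w)) for start, w in sliding_windows(sequence, window_size, shift)]


if __name__ == "__main__":
    for start, value in profile("GGCCAATT", window_size=4, shift=2):
        print(start, value)
```

Here `shift` plays the role of -shiftincrement: a larger value gives fewer, more widely spaced data points along the sequence.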
-temperature (float)
If -thermo has been specified, this specifies the temperature at which to calculate the DeltaG, DeltaH, and DeltaS values.
Advanced qualifiers:
-rna (boolean)
This specifies that the sequence is an RNA sequence, not a DNA sequence.
-product (boolean)
This prompts for percent formamide, percent of mismatches allowed, and product length.
-thermo (boolean)
Output the DeltaG, DeltaH, and DeltaS values of the
sequence windows to the output data file
-plot (boolean)
If this is not specified, the file of output data is produced; otherwise, a plot of the melting point along the sequence is produced.
B.6 Ciliate, Dasycladacean, and Hexamita Nuclear Code
GTC V Val GCC A Ala GAC D Asp GGC G Gly
The following table contains the differences in the Ciliate, Dasycladacean, and Hexamita Nuclear Code from the Standard Code.
Codon Ciliate, Dasycladacean, and Hexamita Nuclear Standard
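As a rough sketch of how a variant code can be expressed as a set of differences from the Standard Code, the Python below builds a translation table from a base table plus overrides. The two overrides shown (TAA and TAG read as Gln rather than as stops) are taken from NCBI translation table 6, not from the table in this text, and the base table is deliberately a small subset of the Standard Code; the names `STANDARD_SUBSET`, `CILIATE_OVERRIDES`, `make_table`, and `translate` are hypothetical.

```python
from typing import Dict, Optional

# Small subset of the Standard Code, enough for the demonstration.
STANDARD_SUBSET: Dict[str, str] = {
    "ATG": "M", "GTC": "V", "GCC": "A", "GAC": "D", "GGC": "G",
    "CAA": "Q", "CAG": "Q",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Differences for the Ciliate, Dasycladacean, and Hexamita Nuclear Code
# (assumption: NCBI translation table 6, where TAA and TAG code for Gln).
CILIATE_OVERRIDES: Dict[str, str] = {"TAA": "Q", "TAG": "Q"}


def make_table(overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the base table with the given codon overrides applied."""
    table = dict(STANDARD_SUBSET)
    table.update(overrides or {})
    return table


def translate(dna: str, table: Dict[str, str]) -> str:
    """Translate in frame 1; codons missing from the table become 'X'."""
    dna = dna.upper()
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for i in range(0, len(dna) - 2, 3):
        protein.append(table.get(dna[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    seq = "ATGTAAGGCTAG"
    print(translate(seq, make_table()))                   # Standard Code subset
    print(translate(seq, make_table(CILIATE_OVERRIDES)))  # variant code
```

Storing only the differences keeps each variant code short and makes it easy to compare against the Standard Code.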
newcpgreport is used in the production of the CpG Island database CPGISLE. It produces CPGISLE database entry format reports for a potential CpG island. See the FTP site:
pscan scans proteins using PRINTS. The home web page of the PRINTS database is
The output files, named after the PROSITE accession numbers, can also be seen in the prosite directory. These files are created automatically after prosextract is run.
Mandatory qualifiers:
[-infdat] (string)
Enter name of PROSITE directory.
HincII,hinfI,ppiI,hindiii. This command is not case-sensitive. You may also use the data from file
of enzymes to search for. A file containing enzyme names might look like this:
Show suppliers.
-[no]cleanup (boolean)
Clean up temporary files
GenBank is maintained by the National Center for Biotechnology Information (NCBI). It is joined by the DNA Data Bank of Japan (DDBJ, in Mishima, Japan) and the European Molecular Biology Laboratory (EMBL, in Heidelberg, Germany) nucleotide database from the European Bioinformatics Institute (EBI, in Hinxton, UK) to form the International Nucleotide Sequence Database Collaboration. Although the three repositories have separate sites for data submission, they share sequence data and allow daily downloads of sequence files by the public. We're using GenBank Release 132, EMBL Release 72, and DDBJ Release 51.
In February 1986, GenBank and EMBL (joined by DDBJ in 1987) started a collaborative effort to create a common feature table format. The overall objective of the feature table was to supply an in-depth vocabulary for describing nucleotide (and protein) features. We're using Version 4 of the feature table.
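To make the shape of a feature table entry concrete, here is a minimal Python sketch that reads feature lines in the GenBank flat-file style: a feature key and location on one line, followed by qualifier lines that begin with a slash. It is a simplification under stated assumptions: each location and each qualifier value fits on a single line, and the sample record, the `Feature` class, and `parse_features` are all hypothetical rather than part of any database toolkit.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Feature:
    key: str                      # feature key, e.g. "CDS"
    location: str                 # location string, kept unparsed
    qualifiers: Dict[str, str] = field(default_factory=dict)


def parse_features(lines: List[str]) -> List[Feature]:
    """Parse simplified feature table lines (one line per location/qualifier)."""
    features: List[Feature] = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        if text.startswith("/"):
            # Qualifier line: /name=value, or a bare flag such as /pseudo.
            if not features:
                continue  # qualifier with no preceding feature; skip it
            name, _, value = text[1:].partition("=")
            features[-1].qualifiers[name] = value.strip('"')
        else:
            # Feature line: key followed by its location.
            parts = text.split(None, 1)
            if len(parts) == 2:
                features.append(Feature(parts[0], parts[1]))
    return features


# Made-up sample lines for illustration only.
SAMPLE = [
    "     source          1..300",
    '                     /organism="Hypothetical organism"',
    "     CDS             10..99",
    '                     /gene="abcD"',
    "                     /codon_start=1",
    "                     /pseudo",
]

if __name__ == "__main__":
    for feature in parse_features(SAMPLE):
        print(feature.key, feature.location, feature.qualifiers)
```

Real records wrap long locations and qualifier values across several lines, so production code would need continuation handling; an established parser is usually the safer choice.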
2.7.1 Features
A feature is a single word or abbreviation indicating a functional role or region associated with a sequence. A list of
Definition column of the table, the appropriate qualifiers for each feature are in brackets. Mandatory qualifiers are
2) sequence segment located between the promoter and the first structural gene that causes partial termination of transcription.
[citation, db_xref, evidence, gene, label, map, note, phenotype, usedin]
C_region
Constant region of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains; includes one or more exons depending on the particular chain.
[citation, db_xref, evidence, gene, label, map, note, product, pseudo, standard_name, usedin]
CDS
Coding sequence; sequence of nucleotides that corresponds with the sequence of amino acids in a protein (location includes stop codon); feature includes amino acid conceptual translation.
[allele, citation, codon, codon_start, db_xref, EC_number, evidence, exception, function, gene, label, map, note, number, product, protein_id, pseudo, standard_name, translation, transl_except, transl_table, usedin]
conflict
Independent determinations of the "same" sequence differ at this site or region.
[citation, db_xref, evidence, label, map, note, gene, replace, usedin]
D-loop
Displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region. Also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein.
[citation, db_xref, evidence, gene, label, map, note, usedin]
D_segment
Diversity segment of immunoglobulin heavy chain, and T-cell receptor beta chain.
[citation, db_xref, evidence, gene, label, map, note, product, pseudo, standard_name, usedin]
enhancer
A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.
[citation, db_xref, evidence, gene, label, map, note, standard_name, usedin]
exon
Region of genome that codes for portion of spliced mRNA, rRNA and tRNA; may contain 5' UTR, all CDSs, and 3' UTR.
[allele, citation, db_xref, EC_number, evidence, function, gene, label, map, note, number, product, pseudo, standard_name, usedin]
GC_signal
GC box; a conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG.
[citation, db_xref, evidence, gene, label, map, note, usedin]
gene
Region of biological interest identified as a gene and for which a name has been assigned.
[allele, citation, db_xref, evidence, function, label, map, note, product, pseudo, phenotype, standard_name, usedin]
iDNA
Intervening DNA; DNA which is eliminated through any of several kinds of recombination.
[citation, db_xref, evidence, function, label, gene, map, note, number, standard_name, usedin]
intron
A segment of DNA that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it.
[allele, citation, cons_splice, db_xref, evidence, function, gene, label, map, note, number, standard_name, usedin]
J_segment
Joining segment of immunoglobulin light and heavy chains and T-cell receptor alpha, beta, and gamma chains.
[citation, db_xref, evidence, gene, map, note, product, pseudo, standard_name, usedin]
LTR
Long terminal repeat, a sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses.
[citation, db_xref, evidence, function, gene, label, map, note, standard_name, usedin]
mat_peptide
Mature peptide or protein coding sequence; coding sequence for the mature or final peptide or protein product following post-translational modification; the location does not include the stop codon (unlike the corresponding CDS).
[citation, db_xref, EC_number, evidence, function, gene, label, map, note, product, pseudo, standard_name, usedin]
misc_binding
Site in nucleic acid which covalently or non-covalently binds another moiety that cannot be described by any other binding key (primer_bind
[citation, clone, db_xref, evidence, gene, label, map, note, phenotype, replace, standard_name, usedin]
misc_feature
Region of biological interest which cannot be described by any other feature key; a new or rare feature.
[citation, db_xref, evidence, function, gene, label, map, note, number, phenotype, product, pseudo, standard_name, usedin]
misc_recomb
Site of any generalized, site-specific or replicative recombination event where there is a breakage and reunion of duplex DNA that cannot be described by other recombination keys (iDNA and virion) or qualifiers of source key (/insertion_seq, /transposon, /proviral).
[citation, db_xref, evidence, gene, label, map, note, organism, standard_name, usedin]
misc_RNA
Any transcript or RNA product that cannot be defined by other RNA keys (prim_transcript, precursor_RNA, mRNA, 5' clip, 3' clip, 5' UTR, 3' UTR, exon, CDS, sig_peptide, transit_peptide, mat_peptide, intron, polyA_site, rRNA, tRNA, scRNA, and snRNA).
[citation, db_xref, evidence, function, gene, label, map, note, product, standard_name, usedin]
misc_signal
Any region containing a signal controlling or altering gene function or expression that cannot be described by other signal keys (promoter, CAAT_signal, TATA_signal, -35_signal, -10_signal, GC_signal, RBS, polyA_signal, enhancer, attenuator, terminator, and rep_origin).
[citation, db_xref, evidence, function, gene, label, map, note, phenotype, standard_name, usedin]
misc_structure
Any secondary or tertiary nucleotide structure or conformation that cannot be described by other Structure keys (stem_loop and D-loop).
[citation, db_xref, evidence, function, gene, label, map, note, standard_name, usedin]
modified_base
The indicated nucleotide is a modified nucleotide and should be substituted for by the indicated molecule (given in the mod_base qualifier value).
[citation, db_xref, evidence, frequency, gene, label, map, mod_base, note, usedin]
mRNA
Messenger RNA; includes 5' untranslated region (5'UTR), coding sequences (CDS, exon) and 3' untranslated region (3'UTR);
[allele, citation, db_xref, evidence, function, gene, label, map, note,
N_region
Extra nucleotides inserted between rearranged immunoglobulin segments.
[citation, db_xref, evidence, gene, label, map, note, product, pseudo, standard_name, usedin]
old_sequence
The presented sequence revises a previous version of the sequence at this location.
[citation, db_xref, evidence, gene, label, map, note, replace, usedin]
polyA_signal
Recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA.
[citation, db_xref, evidence, gene, label, map, note, usedin]
polyA_site
Site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation.
[citation, db_xref, evidence, gene, label, map, note, usedin]
precursor_RNA
Any RNA species that is not yet the mature RNA product; may include 5' clipped region (5'clip), 5' untranslated region (5'UTR), coding sequences (CDS, exon), intervening sequences (intron), 3' untranslated region (3'UTR), and 3' clipped region (3'clip).
[allele, citation, db_xref, evidence, function, gene, label, map, note, product, standard_name, usedin]
prim_transcript
Primary (initial, unprocessed) transcript; includes 5' clipped region (5'clip), 5' untranslated region (5'UTR), coding sequences (CDS, exon), intervening sequences (intron), 3' untranslated region (3'UTR), and 3' clipped region (3'clip).
[allele, citation, db_xref, evidence, function, gene, label, map, note, standard_name, usedin]
primer_bind
Non-covalent primer binding site for initiation of replication, transcription, or reverse transcription; includes site(s) for synthetic elements, e.g., PCR primers.
[citation, db_xref, evidence, gene, label, map, note, standard_name, PCR_conditions, usedin]
promoter
Region on a DNA molecule involved in RNA polymerase binding to initiate transcription.
repeat_region
Region of genome containing repeating units.
[citation, db_xref, evidence, function, gene, insertion_seq, label, map, note, rpt_family, rpt_type, rpt_unit, standard_name, transposon, usedin]
repeat_unit
Single repeat element.
[citation, db_xref, evidence, function, gene, label, map, note, rpt_family, rpt_type, rpt_unit, usedin]
rep_origin
Origin of replication; starting site for duplication of nucleic acid to give two identical copies.
[citation, db_xref, direction, evidence, gene, label, map, note, standard_name, usedin]
rRNA
Mature ribosomal RNA; RNA component of the ribonucleoprotein particle (ribosome) which assembles amino acids into proteins.
[citation, db_xref, evidence, function, gene, label, map, note, product, pseudo, standard_name, usedin]
S_region
Switch region of immunoglobulin heavy chains; involved in the rearrangement of heavy chain DNA leading to the expression of a different immunoglobulin class from the same B-cell.
[citation, db_xref, evidence, gene, label, map, note, product, pseudo, standard_name, usedin]
satellite
Many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA.
[citation, db_xref, evidence, gene, label, map, note, rpt_type, rpt_family, rpt_unit, standard_name, usedin]
scRNA
Small cytoplasmic RNA; any one of several small cytoplasmic RNA molecules present in the cytoplasm and (sometimes) nucleus of a eukaryote.
[citation, db_xref, evidence, function, gene, label, map, note, product, pseudo, standard_name, usedin]
sig_peptide
Signal peptide coding sequence; coding sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane; leader sequence.
[citation, db_xref, evidence, function, gene, label, map, note, product, pseudo, standard_name, usedin]
snRNA
Small nuclear RNA molecules involved in pre-mRNA splicing and processing.
[citation, db_xref, evidence, function, gene, label, map, note, partial, product, pseudo, standard_name, usedin]
snoRNA
Small nucleolar RNA molecules mostly involved in rRNA modification and processing.
[citation, db_xref, evidence, function, gene, label, map, note, partial, product, pseudo, standard_name, usedin]
source
Identifies the biological source of the specified span of the sequence; this key is mandatory; more than one source key per sequence is permissible; every entry will have, as a minimum, a single source key spanning the entire sequence or multiple source keys together spanning the entire sequence.
[cell_line, cell_type, chromosome, citation, clone, clone_lib, country, cultivar, db_xref, dev_stage, environmental_sample, focus, frequency, germline, haplotype, lab_host, insertion_seq, isolate, isolation_source, label, macronuclear, map, note, organelle, organism, plasmid, pop_variant, proviral, rearranged, sequenced_mol, serotype, serovar, sex, specimen_voucher, specific_host, strain, sub_clone, sub_species, sub_strain, tissue_lib, tissue_type, transgenic, transposon, usedin, variety, virion]
stem_loop
Hairpin; a double-helical region formed by base-pairing between adjacent (inverted) complementary sequences in a single strand of RNA or DNA.
[citation, db_xref, evidence, function, gene, label, map, note, standard_name, usedin]
STS
Sequence tagged site; short, single-copy DNA sequence that characterizes a mapping landmark on the genome and can be detected by PCR; a region of the genome can be mapped by determining the order of a series of STSs.
[citation, db_xref, evidence, gene, label, note, map, standard_name, usedin]
TATA_signal
TATA box; Goldberg-Hogness box; a conserved AT-rich septamer found about 25 bp before the start point of each eukaryotic RNA polymerase II transcript unit which may be involved in positioning the enzyme for correct initiation; consensus=TATA(A or T)A(A or T).
[citation, db_xref, evidence, gene, label, map, note, usedin]
terminator
Sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription.
[citation, db_xref, evidence, gene, label, map, note, standard_name, usedin]
transit_peptide
Transit peptide coding sequence; coding sequence for an N-terminal domain of a nuclear-encoded organellar protein; this domain is involved in post-translational import of the protein into the organelle.
[citation, db_xref, evidence, function, gene, label, map, note, product, pseudo, standard_name, usedin]
tRNA
Mature transfer RNA, a small RNA molecule (75-85 bases long) that mediates the translation of a nucleic acid sequence into an amino acid sequence.
[anticodon, citation, db_xref, evidence, function, gene, label, map, note, product, pseudo, standard_name, usedin]
[citation, db_xref, evidence, gene, label, map, note, product, pseudo, standard_name, usedin]
Variable segment of immunoglobulin light and heavy chains, and T-cell