National Center for Biotechnology Information. A Field Guide part 2. UT-Health Science Center. February 14, 2006. Header. Feature Table. Sequence. GenBank Records. The Flatfile Format. LOCUS NM_019570 4279 bp mRNA linear INV 28-OCT-2004
GenBank Records: The Flatfile Format. Header, Feature Table, Sequence
A Typical GenBank Record
LOCUS       NM_019570   4279 bp   mRNA   linear   INV 28-OCT-2004
DEFINITION  Mus musculus REV1-like (S. cerevisiae) (Rev1l), mRNA
ACCESSION   NM_019570
VERSION     NM_019570.3  GI:50811869
KEYWORDS    .
(DEFINITION = Title)
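The header fields on this slide can be pulled apart with a few lines of Python. This is a minimal sketch, not NCBI's parser: the function name `parse_genbank_header` is hypothetical, and it assumes each header keyword sits on a single line (real flatfiles wrap long values onto continuation lines, which this sketch ignores).

```python
def parse_genbank_header(text):
    """Collect keyword/value pairs from single-line header fields."""
    keywords = ("LOCUS", "DEFINITION", "ACCESSION", "VERSION", "KEYWORDS")
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # keyword, then the rest of the line
        if parts and parts[0] in keywords:
            fields[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return fields


# The header shown on the slide, laid out one keyword per line.
record = """\
LOCUS       NM_019570   4279 bp   mRNA   linear   INV 28-OCT-2004
DEFINITION  Mus musculus REV1-like (S. cerevisiae) (Rev1l), mRNA
ACCESSION   NM_019570
VERSION     NM_019570.3  GI:50811869
KEYWORDS    .
"""

header = parse_genbank_header(record)

# VERSION is "accession.version" followed by the GI number.
accession, version_number = header["VERSION"].split()[0].split(".")
sequence_length = int(header["LOCUS"].split()[1])
```

Here `accession` matches the ACCESSION field, and `version_number` is the part after the dot that changes when the sequence is revised.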
GenBank Record: Feature Table, cont. GenPept identifier
GenBank Record: Sequence
Indexing for Nucleotide UID 59958365
Field                 Indexed Terms
[primary accession]   NM_001012399
[title]               Bos taurus hemochromatosis (hfe), mRNA.
[organism]            Bos taurus
[sequence length]     1168
[modification date]   2005/02/19
[properties]          biomol mrna, gbdiv mam, srcdb refseq
Field abbreviations shown: [accn] [orgn] [mdat] [prop]
Entrez Nucleotide: HFE. 137 records, including records that are not HFE. Field: [Title]
Smarter Query: hfe[title] AND human[orgn]. 42 records, including curated HFE splice variants (11 total)
hfe[title] AND human[orgn] (cont.): Primary data
Preview/Index: Properties, srcdb
Preview/Index: Properties, srcdb …AND srcdb refseq[Properties]
Preview/Index: Properties, srcdb …AND srcdb ddbj/embl/genbank[Properties]
Database Queries
Primate division: gbdiv pri[prop]
EST division: gbdiv est[prop]
#1  hfe                                     137
#2  hfe[title] AND human[orgn]              42
#3  #2 AND srcdb refseq[prop]               11
#4  #2 AND srcdb ddbj/embl/genbank[prop]    31
#5  #4 AND gbdiv pri[prop]                  29
#6  #4 AND gbdiv est[prop]                  2
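The numbered queries build on each other by reference. The sketch below expands each `#N` reference into the full query text and checks that the record counts from the slide add up. It is an illustration only: `expand` is a hypothetical helper, not an Entrez feature, and the last query is numbered 6 on the assumption that the repeated "#4" label on the slide is a typo.

```python
import re

# Query history as shown on the slide (query 6 is assumed; see note above).
history = {
    1: "hfe",
    2: "hfe[title] AND human[orgn]",
    3: "#2 AND srcdb refseq[prop]",
    4: "#2 AND srcdb ddbj/embl/genbank[prop]",
    5: "#4 AND gbdiv pri[prop]",
    6: "#4 AND gbdiv est[prop]",
}

# Record counts reported on the slide for each query.
counts = {1: 137, 2: 42, 3: 11, 4: 31, 5: 29, 6: 2}


def expand(query, history):
    """Recursively replace #N with the parenthesized text of query N."""
    return re.sub(
        r"#(\d+)",
        lambda m: "(" + expand(history[int(m.group(1))], history) + ")",
        query,
    )


full_query_5 = expand(history[5], history)

# The slide's counts are consistent with each refinement splitting its parent:
# RefSeq + primary databases = all human HFE titles, and
# primate division + EST division = all primary-database records.
refseq_plus_primary = counts[3] + counts[4]
pri_plus_est = counts[5] + counts[6]
```

Expanding query 5 yields `((hfe[title] AND human[orgn]) AND srcdb ddbj/embl/genbank[prop]) AND gbdiv pri[prop]`, which is the single-line form that could be typed directly into the search box.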
Molecule Queries
Genomic DNA: biomol genomic[prop]
cDNA: biomol mrna[prop]
#1  hfe                              116
#2  hfe[title] AND human[orgn]       42
#3  #2 AND biomol mrna[prop]         29
#4  #2 AND biomol genomic[prop]      13
More Queries… Fields are database-specific
Entrez Nucleotide
Reviewed RefSeqs with transcript variants: srcdb refseq reviewed[prop] AND transcript[title] AND variant[title]
Entrez Gene
Topoisomerase genes from Archaea: topoisomerase[gene name] AND archaea[organism]
Genes on human chromosome 2 with OMIM links: 2[chromosome] AND human[organism] AND "gene omim"[filter]
Membrane proteins linked to cancer: "integral to plasma membrane"[gene ontology] AND cancer[dis]
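Queries like these can also be sent programmatically through NCBI's E-utilities `esearch` endpoint. The sketch below only builds the request URLs and makes no network call. The helper name `esearch_url` is hypothetical, the database names `nuccore` and `gene` are the current E-utilities names for Entrez Nucleotide and Entrez Gene, and the field tags are copied from the slides (dated 2006), so some may no longer be indexed under the same names.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db, term, retmax=20):
    """Build an esearch URL; urlencode escapes brackets, quotes and spaces."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


# Queries from the slide, paired with the database whose fields they use.
queries = [
    ("nuccore", "srcdb refseq reviewed[prop] AND transcript[title] AND variant[title]"),
    ("gene", "topoisomerase[gene name] AND archaea[organism]"),
    ("gene", '2[chromosome] AND human[organism] AND "gene omim"[filter]'),
]

urls = [esearch_url(db, term) for db, term in queries]

# Decoding a URL again recovers the original term, which confirms that
# the escaping did not alter the query.
round_trip = parse_qs(urlparse(urls[2]).query)["term"][0]
```

Because fields are database-specific, each term is paired with its own `db` value; sending a Gene field such as `[chromosome]` to the nucleotide database would not be interpreted the same way.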
Genomic Biology: UniGene, E-PCR, Map Viewer, Trace Archive, Genome Resources
Genome Projects: microb. 13 Eukaryotic Genome Sequencing Projects Selected: Complete – 0, Assembly – 2, In Progress – 11
UniGene: Gene-oriented clusters of expressed sequences
• Automatic clustering using MegaBlast
• Each cluster represents a unique gene
• Informed by genome hits
• Information on tissue types and map locations
• Useful for gene discovery and selection of mapping reagents
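The idea of grouping expressed sequences into clusters from pairwise similarity hits can be illustrated with a toy union-find sketch. This is not UniGene's actual procedure, which the slides describe only as automatic clustering using MegaBlast; the sequence identifiers and hit pairs below are invented purely for illustration.

```python
def cluster(ids, hits):
    """Group ids into clusters, joining any two ids linked by a hit."""
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up identifiers and hit pairs (toy data, not real ESTs).
est_ids = ["est1", "est2", "est3", "est4", "est5"]
similarity_hits = [("est1", "est2"), ("est2", "est3"), ("est4", "est5")]

clusters = cluster(est_ids, similarity_hits)
```

With this toy input, `est1`, `est2` and `est3` end up in one cluster because the hits chain them together, even though `est1` and `est3` have no direct hit.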
A Cluster of ESTs (figure: query, 5’ EST hits, 3’ EST hits)
UniGene Collections Species UniGene
ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/ Get Sequences web page
E-PCR Genomic sequence here