Beyond PubMed and BLAST: Exploring NCBI tools and databases
Kate Bronstad
David Flynn
Alumni Medical Library
Alumni Medical Library
• Location
  - 12th Floor, Instructional Bldg
  - www.medlib.bu.edu
• Services
  - Electronic resources: full-text access through PubMed, Google Scholar, Web of Science
  - Reference: drop in or by reservation
  - Instruction: request class sessions or creation of a web tutorial
  - Learning resource center: lab space, hands-on instruction
NCBI
• National Center for Biotechnology Information
• Built on the Entrez system
• Original database was Nucleotide
• PubMed was built upon this original structure
• PubMed, GENE, and other molecular databases are interconnected
• Gene discovery and related-data options in PubMed
• MyNCBI works with multiple databases
GENE
• Gives sequence, expression, and information about protein structure and function
• Doesn't list all known and predicted genes
• Focuses on completely sequenced genomes or ones where research communities are actively contributing genetic information
• Information from RefSeq and collaborating model organism databases
• Mix of curated and automatically updated information
• Pulls in, and links out to, resources outside of NCBI
• 4.6 million records for 5,588 taxa
GENE Record
• Summary: official full name, gene type, lineage, summary, AKA
• Genomic regions, transcripts: structure, exon-intron boundaries
• Gene table for fuller display
• Bibliography: GeneRIF
  - Summary of gene functions with specific references to related PubMed articles about the function of the gene/proteins
  - Put together by people at NCBI
  - Not comprehensive, but will give you the most relevant papers regarding function
  - Authors can contact NCBI to submit their citations
RefSeq
• Reference Sequences
• Nucleotide sequences and protein translations
• Curated by NCBI or NCBI-approved programs
• Difference between GenBank and RefSeq
  - GenBank has raw data and duplicated records
  - Metadata in GenBank can be incomplete
  - RefSeq is annotated, curated, and non-redundant
  - NCBI takes the best sequences from GenBank and curates them for RefSeq records
RefSeq Record Numbers
• mRNAs and Proteins
  NM_123456   Curated mRNA
  NP_123456   Curated protein
  NR_123456   Curated non-coding RNA
  XM_123456   Predicted mRNA
  XP_123456   Predicted protein
  XR_123456   Predicted non-coding RNA
• Gene Records
  NG_123456   Reference genomic sequence
• Chromosome
  NC_123455   Microbial replicons, organelle genomes, human chromosomes
  AC_123455   Alternate assemblies
• Assemblies
  NT_123456   Contig
  NW_123456   WGS supercontig
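The prefixes on this slide can be turned into a small lookup. The Python sketch below is an illustration only: `describe_refseq` and `REFSEQ_PREFIXES` are hypothetical names, not part of any NCBI tool, and the table covers only the prefixes listed on the slide (RefSeq defines others). It assumes accessions use an underscore separator and may carry an optional `.version` suffix.

```python
import re

# Prefix meanings copied from the slide; this is not a complete RefSeq list.
REFSEQ_PREFIXES = {
    "NM": "Curated mRNA",
    "NP": "Curated protein",
    "NR": "Curated non-coding RNA",
    "XM": "Predicted mRNA",
    "XP": "Predicted protein",
    "XR": "Predicted non-coding RNA",
    "NG": "Reference genomic sequence",
    "NC": "Microbial replicons, organelle genomes, human chromosomes",
    "AC": "Alternate assemblies",
    "NT": "Contig",
    "NW": "WGS supercontig",
}

# Two letters, an underscore, digits, and an optional ".version" suffix.
_ACCESSION = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def describe_refseq(accession):
    """Return the slide's description for a RefSeq-style accession, or None."""
    match = _ACCESSION.match(accession.strip().upper())
    if match is None:
        return None  # not in PREFIX_digits form
    return REFSEQ_PREFIXES.get(match.group(1))  # None for unlisted prefixes


if __name__ == "__main__":
    for acc in ("NM_123456", "XP_123456.2", "NW_123456"):
        print(acc, "->", describe_refseq(acc))
```

A quick check of the first two letters is often enough to tell whether a record is curated (N*) or predicted (X*) before opening it.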
OMIM
• Online Mendelian Inheritance in Man
• Previously in print: 10 volumes, updated every 2 years
• Contains all the known genes in humans
• Gives referenced explanations of cloning, allelic variations, inheritance, mapping, and molecular genetics
• Links to clinical and testing information
• OMIA (Online Mendelian Inheritance in Animals) is a separate database for information on animals
Databases for Evidence
• GEO Profiles: public microarray data repository
  - Archives and freely distributes microarray, next-generation sequencing, and other high-throughput functional genomic data
  - Submitted by researchers
  - Offers data storage, web-based interfaces, and applications to query and download content
• Evidence Viewer: graphical display of evidence supporting a gene model
Genome
• Sequence and map data from the whole genomes of over 1000 organisms
  - Represent organisms that are completely sequenced and those that are in progress
• Graphical overviews of complete genomes/chromosomes
• Specialized genome BLAST search to see alignments in the context of the genome
• Good for microbial genomes
HomoloGene
• May want to use instead of BLAST if looking for a model organism with the same function or if looking at an evolutionary comparison
• Allows downloads of genomic information
  - Can capture regulatory regions by including bases upstream or downstream
• Multiple and pairwise alignment
• Protein alignment scores
  - Substitution rates: synonymous vs. nonsynonymous, conservative vs. radical
• Polymorphisms in GeneView (dbSNP link)
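To make the "synonymous vs. nonsynonymous" distinction on this slide concrete, here is a minimal Python sketch. It is not HomoloGene's scoring method; it only classifies a single codon change using the standard genetic code, and the function name `classify_substitution` is hypothetical. The conservative vs. radical distinction (how chemically similar the two amino acids are) is not modeled here.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each
# position (third position varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change by whether the encoded amino acid changes."""
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if ref == alt:
        return "identical"
    if CODON_TABLE[ref] == CODON_TABLE[alt]:
        return "synonymous"      # same amino acid, different codon
    return "nonsynonymous"       # amino acid (or stop) changes


if __name__ == "__main__":
    print(classify_substitution("GAA", "GAG"))  # both codons encode Glu
    print(classify_substitution("GAG", "GTG"))  # Glu replaced by Val
```

Counting these two classes across an alignment is the starting point for the substitution-rate comparisons the slide mentions.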
Structure and Models
• Structure, MMDB (Molecular Modeling Database)
  - Access from Protein link, Related Structure
• Cn3D: application to view structures at different angles and highlight sequence in structure
• VAST (Vector Alignment Search Tool) searches by geometric criteria
BLink
• BLAST Link
• Pre-run BLAST results
• NCBI runs weekly searches for every new protein sequence
• Can use instead of running a BLAST search
  - More information than in default BLAST: taxonomy report, view multiple alignments, search data against different
Links to Outside Databases
• MGI
• Ensembl
• KEGG: Kyoto Encyclopedia of Genes and Genomes
  - Integrated databases
  - Pathway, disease, drug
  - Good for quick pathway and protein graphics
• UCSC Genome Browser
  - Visualize tracks to compare information like gene predictions, ESTs, conserved regions
  - BLAT (BLAST-like alignment tool): quicker but not as sensitive as BLAST
Gene Information from GO
• Gene expression information from Gene Ontology (GO)
  - Lists what has been assigned to the gene in:
    Molecular Function
    Biological Process
    Cellular Component
• Level of evidence and references linked when available
• Links into the AmiGO browser for more ontology or evidence information
• Can search GENE for GO information by placing a suffix at the end of the search
  - Ex: “vasodilation [GO]”
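The same GO-tagged search can also be expressed as a URL for NCBI's E-utilities ESearch service. The Python sketch below only builds the URL and makes no network request. The helper name `gene_go_search_url` and the `retmax` default are illustrative choices, and it assumes the `[GO]` suffix from the slide is accepted in the ESearch `term` parameter the same way it is in the web search box.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_go_search_url(go_term, retmax=20):
    """Build an ESearch URL that looks for go_term in the Gene database's GO field."""
    params = {
        "db": "gene",                # search the Gene database
        "term": f"{go_term}[GO]",    # GO suffix as shown on the slide
        "retmax": retmax,            # maximum number of IDs to return
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(gene_go_search_url("vasodilation"))
```

Pasting the printed URL into a browser should return an XML list of matching Gene IDs, provided the assumptions above hold.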
BU Resources
• Biostatistics
  - Dr. Mayetri Gupta: created statistical software for discovering transcription factor binding sites (motifs) and regulatory modules, gene regulatory networks, and phylogenetic inference
  - Dr. Paola Sebastiani: created software for network modeling called Bayesware Discoverer, as well as CAGED and BAGED for analysis of gene expression data
Library Support
• Contact the library with any suggestions or recommendations that we can list or promote for the BU community
• Software and datasets can be archived in BU’s Digital Common
• If there are resources we don’t have, we may be able to procure them for you
• Hands-on BLAST workshop offered