Entrez Global Query Cross-Database Search System
Basics • Location: http://www.ncbi.nlm.nih.gov/ • Entrez is connected to 23 databases which are part of the National Center for Biotechnology Information (NCBI) • A single search run on the main page will search all of the databases simultaneously • Results are displayed separately for each individual database • Example
Why Entrez is powerful • Records within an Entrez-connected database, and records in different databases, are linked together • Links are referred to as “neighbors” • After executing a query, neighbors of a result can be examined
Databases • Umbrella Nucleotide Database • NCBI’s traditional Nucleotide database • Split into GSS, EST, and the remaining “Core Nucleotide” sequences • Protein Database • Contains sequence data from the translated coding regions of DNA sequences in GenBank, EMBL, and DDBJ • Also contains sequences from outside sources, such as those submitted to the Protein Information Resource (PIR)
Databases (cont) • Genome Database • Contains data about several genomes, complete chromosomes, and sequence maps • Structure Database (Molecular Modeling Database) • Contains experimental data concerning protein structure • Conserved Domains • A database of protein domains • UniSTS • Provides information concerning sequence-tagged sites
Databases (cont) • Gene • Provides a query environment for genes defined by sequence • PopSet • Contains aligned sequences submitted as a set resulting from a population, phylogenetic, or mutation study • Data pertains to evolution and population variation • Nucleotide and protein sequence data • Taxonomy Database • Contains the names of all organisms that are represented in the NCBI genetic databases
Databases (cont) • Cancer Chromosomes • Contains three cancer cytogenetic databases • PubChem Compound & Substance • Contains validated chemical depiction information • PubMed Central • U.S. National Library of Medicine’s digital archive of life science journal literature • Journals • Lists the journals referenced in the Entrez databases
Databases (cont) • Bookshelf • Contains a collection of biomedical books that are referenced by Entrez • OMIM Database • A database of human genes and disorders • OMIA • A database of animal genes and disorders • Probe Database • Public registry of nucleic acid reagents pertaining to biomedical research applications
Operators • Boolean operators may be used as part of a search string in Entrez • {T1} AND {T2} • Return all documents that contain both terms • {T1} OR {T2} • Return all documents that contain either term • {T1} NOT {T2} • Return all documents that contain the first term, but not the second • These operators are case sensitive: they must be in all uppercase letters
Operators (cont) • Complex statements are supported • Expressions are evaluated left to right • Parentheses can be used to force a particular order of evaluation, similar to mathematical statements • Example: gene AND (acid OR base) • If multiple terms are entered without an operator, they are automatically ANDed together • To force Entrez to search for an exact phrase, enclose it in quotes
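The Boolean rules on the two slides above can be sketched as a small query builder. This is only an illustration of how a search string might be assembled before pasting it into Entrez; the helper names `phrase` and `combine` and the sample terms "heat shock" and "protein" are hypothetical and not part of any NCBI tool.

```python
def phrase(text):
    """Quote multi-word text so it is searched as an exact phrase."""
    return f'"{text}"' if " " in text else text


def combine(operator, *terms):
    """Join terms with an uppercase Boolean operator inside parentheses."""
    op = operator.upper()  # Entrez expects AND / OR / NOT in uppercase
    if op not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator!r}")
    return "(" + f" {op} ".join(terms) + ")"


# Parentheses force the OR to be evaluated before the AND.
nested = f"gene AND {combine('or', 'acid', 'base')}"
# A quoted phrase combined with a single term.
quoted = combine("and", phrase("heat shock"), "protein")

print(nested)
print(quoted)
```

Uppercasing the operator inside `combine` keeps a lowercase "and" from being sent as an ordinary search term.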
Operators (cont) • To search for authors, names must be entered in a particular format: • {Last name} {initials} • No punctuation • Only author fields will be searched in the database • Searches can be further limited by adding [AUTHOR] to the query string • Accession numbers or sequence identification numbers can be searched, but specific formats are required; for more information see: http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/helpentrez/EntrezHelp.pdf (page 9)
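As a rough sketch of the author format described above, the following helper strips punctuation and appends the [AUTHOR] tag. The function name `author_term` and the sample name are made up for illustration; the slides do not describe any such helper.

```python
import re


def author_term(last_name, initials, restrict=True):
    """Format an author as '{Last name} {initials}' with no punctuation."""
    clean_last = re.sub(r"[.,]", "", last_name).strip()
    # Keep only the letters of the initials, e.g. "J. A." becomes "JA".
    clean_initials = re.sub(r"[^A-Za-z]", "", initials).upper()
    term = f"{clean_last} {clean_initials}"
    # Optionally append the [AUTHOR] tag mentioned on the slide.
    return f"{term}[AUTHOR]" if restrict else term


print(author_term("Smith", "J. A."))
print(author_term("Smith,", "j", restrict=False))
```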
Operators (cont) • Molecular weights can be searched in the following format: • {weight}[Molecular Weight] • {weight minimum}:{weight maximum}[Molecular Weight] • Other searches • Accession numbers [ACCN] • Sequence length [SLEN] • Both can be combined with the range operator “:”
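The field tags and the range operator above follow one pattern, which can be sketched as two string helpers. The names `tagged` and `ranged` and all numeric values are hypothetical examples, not values taken from the slides.

```python
def tagged(value, field):
    """Attach a field tag to a single value, e.g. 2002[Molecular Weight]."""
    return f"{value}[{field}]"


def ranged(low, high, field):
    """Build a 'minimum:maximum[Field]' range term."""
    if low > high:
        raise ValueError("range minimum must not exceed the maximum")
    return f"{low}:{high}[{field}]"


single_weight = tagged(2002, "Molecular Weight")
weight_range = ranged(2000, 3000, "Molecular Weight")
length_range = ranged(100, 500, "SLEN")

print(single_weight)
print(weight_range)
print(length_range)
```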
Operators (cont) • Truncating • Search terms can be truncated by adding an asterisk at the end of the string. All documents containing terms that begin with the provided string will be returned. • Limits • Individual databases may allow different limits to be set. Exactly how this is done varies between the databases that make up Entrez; see the help document. • Example
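To make the truncation rule concrete, here is a small local illustration of prefix matching. It only mimics the idea on a word list in memory and is not how Entrez implements the search; the helper names and the sample words are assumptions.

```python
def truncate(stem):
    """Append the trailing asterisk that marks a truncated search term."""
    if not stem or "*" in stem:
        raise ValueError("stem must be non-empty and contain no asterisk")
    return stem + "*"


def matches(pattern, word):
    """Local stand-in for truncation: does the word begin with the stem?"""
    return word.lower().startswith(pattern.rstrip("*").lower())


pattern = truncate("kinas")
words = ["kinase", "Kinases", "phosphatase"]
hits = [w for w in words if matches(pattern, w)]

print(pattern)
print(hits)
```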
NCBI accounts • Registration • If you create an account with the NCBI website you may use many advanced functions for searching • For example, you may save your search history for further use • You may also mark individual results and add them to a “collection” for future reference • Example http://www.ncbi.nlm.nih.gov/
Utilities • Several utilities exist for processing Entrez queries • These utilities may be used to provide advanced searching techniques or data parsing • See http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html for more information about what is available.
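As one possible example of using the utilities, the sketch below builds a search URL without sending any request. The `esearch.fcgi` endpoint and the `db`, `term`, and `retmax` parameters are taken from NCBI's E-utilities documentation rather than from these slides, so check them against the help page above before relying on them; the function name `esearch_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed base address of the ESearch utility (from the E-utilities docs).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(database, term, retmax=20):
    """Build a search URL; urlencode escapes spaces and parentheses."""
    params = {"db": database, "term": term, "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


url = esearch_url("protein", "gene AND (acid OR base)")
# Decode the query string again to confirm the term survives encoding.
decoded = parse_qs(urlsplit(url).query)

print(url)
print(decoded["term"][0])
```

Building the URL separately from sending it makes the query easy to inspect before any request goes to NCBI.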
Sources • Entrez Help • http://www.ncbi.nlm.nih.gov/books/bookres.fcgi/helpentrez/EntrezHelp.pdf • Entrez • http://www.ncbi.nlm.nih.gov/sites/gquery • Utility Information • http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.chapter.eutils