Getting Started: PCB3063 Term Project and NCBI’s OMIM, PubMed and Sequence Resources • Michele R. Tennant, Ph.D., M.L.I.S. • Health Science Center Libraries / U.F. Genetics Institute • PCB3063, General Genetics • tennantm@ufl.edu
Today’s Session • Your term project • Resources to help you with your project … • HSCL Website, Catalog, etc. • NCBI Resources: • OMIM – “review articles” • PubMed – journal articles • Nucleotides/RefSeq – gene sequences • Receive your term project topic
Your Term Project • Scientific poster on an assigned genetic disorder • Should cover all aspects of genetics – • Mode of inheritance • What gene normally does • What protein is encoded by gene • Map location and gene structure • Types of mutations and what they do to protein • Potential for gene therapy • Etc. (more info next time)
Your Term Project • Four assignments for your project: • Part A: • Identify disorder/gene in OMIM and MeSH • E-learning assessment by start of class Feb. 11 • Part B: • Literature and sequence searches • E-learning assessment by start of class Feb. 25 AND paper form and search print-outs in class Feb. 25 • Part C: • Structure, SNP, map and clinical db searches • E-learning assessment by start of class Mar. 30 AND paper form and search print-outs in class Mar. 30 • Poster Presentations – Apr. 15 • Note – keep Parts A, B, and C (and the corresponding search print-outs) once they are returned to you; you may need to resubmit them with your poster.
NCBI • National Center for Biotechnology Information • Located on the Bethesda National Institutes of Health campus • Part of the National Library of Medicine (NLM), which is part of the NIH • Created by Congress in 1988 • Home of GenBank since 1992
NCBI Mandates • Develop automated systems for the storage, retrieval, and analysis of molecular, genetic and biochemical information • Develop software for the study of molecular structure and function
NCBI Mandates • Facilitate the use of molecular databases and programs by both researchers and clinicians • Coordinate international cooperation in gathering molecular, genetic and biochemical data
Effective Searchers ... • know the content of the database • subjects, type of data, years of coverage, curated vs. non-curated • understand the structure of the database • record structure, searchable fields, controlled vs non-controlled vocabularies • understand searching options and tools • thesaurus, limits, AND/OR, etc.
Entrez • Search tool on the NCBI website • Contains a variety of databases: • Nucleotide sequence; Protein sequence; Molecular structure; SNPs; Expression data; Journal literature • Each “database” contains “records” • Each “record” in database contains “fields”
Entrez Search Options • Similar among the various databases • Entrez conventions: AND, OR, NOT, * • Three ways to search: • Basic: just enter your search terms • Advanced: more controlled search - uses limits, preview/index, history • Complex Boolean: command language with qualifiers in brackets • syntax: term [field] AND term [field], etc.
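The complex Boolean syntax can also be assembled programmatically. The following is a minimal sketch in Python; the helper names `fielded` and `combine` are hypothetical, and the `title` field is used only for illustration, since each Entrez database defines its own searchable fields.

```python
def fielded(term, field):
    """Return one Entrez-style clause of the form: term [field]."""
    return f"{term} [{field}]"


def combine(operator, *clauses):
    """Join clauses with AND, OR or NOT and wrap the result in parentheses."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(clauses) + ")"


# Illustrative search: both words must appear in the title field.
query = combine("AND", fielded("sipple syndrome", "title"), fielded("RET", "title"))
print(query)
```

The resulting string can be pasted into the Entrez search box; wrapping each combination in parentheses keeps the grouping explicit when clauses are nested.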
Entrez Differences • Differences among the various databases • Different search fields available • Different limits available • Some controlled, some non-controlled • Some archival, some curated
Two Ways to Get to NCBI • Directly at - http://www.ncbi.nlm.nih.gov/ • Through HSC Library’s webpage: • http://www.library.health.ufl.edu/ • Click on “Databases” icon • Click on “NCBI” icon
www.library.health.ufl.edu • Click on “Databases” from HSCL Website
OMIM - Online Mendelian Inheritance in Man • Catalog of human genes and genetic disorders • 19,854 records (as of 1/27/10) • Records are basically “review articles” • Records link to PubMed, sequences, structures, etc. • Built on Entrez architecture • Search tip – look for your disease or gene in “title” field on “Limits” page
Choose OMIM from the dropdown and then click on “search” to reach the OMIM page
We will search for information on “Sipple Syndrome”, but first limit the search so that your terms are matched only in the “title” field
Type in Sipple Syndrome, then click “Go” • Link to discussion of Sipple Syndrome • Link to OMIM Gene Map
Table of Contents for Sipple Syndrome record • Record was retrieved via these words in title • Link to record for the RET Oncogene
PubMed • Journal literature database • Pre-clinical and clinical information – best literature database to use for Dr. Miyamoto’s project • Approximately 5,200 journals covered; currently over 18,000,000 records • Most citations include abstract • Can search via keyword, but has been built to take advantage of controlled vocabulary search
Controlled vs Non-controlled Vocabularies • “Old People” Example
Controlled Vocabulary • Controlled terms act as “umbrella” to pick up all synonyms, spelling differences (hemoglobin/haemoglobin), singular vs plural, etc. • In PubMed, use MeSH Database to find and search controlled MeSH terms (Medical Subject Headings) • Once in MeSH Database, can use additional options to enhance search (major heading, subheadings, etc.)
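The “umbrella” idea can be pictured as a lookup table from entry terms to one preferred heading. This is a toy sketch in Python: the `THESAURUS` dictionary and `preferred_heading` function are hypothetical and hold only a few illustrative entries, not the actual contents of MeSH.

```python
# Toy thesaurus: each entry term (synonym, spelling variant, plural) points to
# one preferred heading. Illustrative entries only, not a copy of MeSH.
THESAURUS = {
    "hemoglobin": "Hemoglobins",
    "haemoglobin": "Hemoglobins",
    "hemoglobins": "Hemoglobins",
    "breast cancer": "Breast Neoplasms",
    "breast tumor": "Breast Neoplasms",
    "breast neoplasms": "Breast Neoplasms",
}


def preferred_heading(user_term):
    """Map whatever the searcher typed to the controlled heading, if any."""
    return THESAURUS.get(user_term.strip().lower())


print(preferred_heading("Haemoglobin"))
print(preferred_heading("breast cancer"))
```

Because every variant resolves to the same heading, a search on that heading retrieves records regardless of which wording the authors used.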
MeSH Example • Find journal articles on the “immunological aspects of breast cancer and vaccines”; but only those papers where “immunological aspects of breast cancer” is the main point of the articles you find. • Search PubMed
Enter PubMed through our direct link (rather than through NCBI) and you will be able to directly see if the HSCL owns the journal articles you find
The “ufhsclib” indicates that you have entered PubMed correctly, and that the journals the library owns will be apparent • Use the MeSH Database as a dictionary to find the appropriate MeSH term, and then to refine your search
Note that we have left PubMed and are in the MeSH “dictionary” • You typed “breast cancer” into MeSH database • Use “breast neoplasms” rather than breast cancer • Click on the link to refine the search
Topical subheadings help focus the search on one or more aspects of the subject • Check here and your topics will be the main point of the articles you find – you won’t get peripheral citations. Not recommended the first time you search a topic – if few papers exist on your topic, you may be left with no articles at all
Note that the term “Breast Neoplasms” will pick up all the more specific types of breast cancer
Send your search to the search box • MeSH automatically builds the search for you – in this example, you are looking for papers in which the immunological aspects of breast cancer are the main point of all the articles you retrieve • Click “Search PubMed”
Once you have sent the search to the search box and clicked on “Search PubMed”, you leave the MeSH Database, and the search is performed in PubMed • Note that this is the search the MeSH Database built for you – it used the MeSH term “breast neoplasms”, glued “immunology” directly to the term by using the slash, and picked up all the different types of breast neoplasms. MeSH also retrieved only the papers where these topics were the main points of the articles. You did not need to do any of this yourself – MeSH did it for you once you found the proper MeSH term and clicked on the subheading.
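The search string built by the MeSH Database can be sketched as follows. This Python snippet is an illustration only: the helper `mesh_clause` is hypothetical, and it assumes the PubMed field tags `[Mesh]` (MeSH term) and `[Majr]` (MeSH major topic); the exact string PubMed displays on screen may differ.

```python
def mesh_clause(heading, subheading=None, major_topic=False):
    """Build a quoted MeSH clause, optionally gluing on a subheading with a
    slash and restricting to records where the heading is a main point."""
    term = heading if subheading is None else f"{heading}/{subheading}"
    tag = "Majr" if major_topic else "Mesh"  # assumed PubMed field tags
    return f'"{term}"[{tag}]'


# Immunological aspects of breast cancer as the main point of the article.
print(mesh_clause("Breast Neoplasms", "immunology", major_topic=True))
```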
Now we need to complete the second half of the search – vaccines. Pull down the drop-down so you are in MeSH again, and search for the MeSH term. Look through the list to see if there is one that is most appropriate. Since we are looking for vaccines related to breast cancer, perhaps “cancer vaccines” would be useful. Read the “scope note” to be sure.
As in the breast cancer search, you can choose a subheading and limit to articles where this topic is the main point; I’ve chosen not to do so here (if you don’t choose subheadings or main point, remember to click on the check box next to “cancer vaccines”) • Send to search box; click “Search PubMed” • You’ve now found articles on cancer vaccines, but you need to combine the breast cancer and cancer vaccines concepts
Boolean Operators • Search statements may be combined using AND, OR, NOT • [Venn diagrams illustrating AND, OR and NOT]
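The three operators behave like set operations on the records each search retrieves. A small sketch in Python, using made-up record IDs purely for illustration:

```python
# Hypothetical record IDs returned by two separate searches.
breast_cancer_immunology = {101, 102, 103, 104}
cancer_vaccines = {103, 104, 105}

both = breast_cancer_immunology & cancer_vaccines        # AND: in both sets
either = breast_cancer_immunology | cancer_vaccines      # OR: in at least one set
only_first = breast_cancer_immunology - cancer_vaccines  # NOT: in first, not second

print(sorted(both))        # [103, 104]
print(sorted(either))      # [101, 102, 103, 104, 105]
print(sorted(only_first))  # [101, 102]
```

AND narrows a search, OR broadens it, and NOT excludes records, which is why NOT can silently discard relevant papers that happen to mention the excluded term.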
To combine searches, choose “Advanced Search” • The Advanced Search screen displays your PubMed history; from here you can combine your two searches using the appropriate Boolean operator • For Part B, print the PubMed history, which shows your searches.
You have now found papers in which the immunology of breast cancer is the main point of the article, and those papers are also about cancer vaccines
MeSH etc. • MeSH Database: • Found appropriate search terms • Automatically exploded “breast neoplasms”, so narrower terms (“breast neoplasms, male”, “carcinoma, ductal, breast”, etc.) were ORed together • Allowed the addition of subheadings (immunology) to narrow to a particular aspect • Allowed narrowing to “main point” • Use History to combine (AND)
MeSH Caveats • Performing a MeSH search is usually more precise and exhaustive than a keyword search, however: • The most recent papers are not yet indexed with MeSH terms, so a MeSH search misses them - you should therefore also run a keyword search limited to “in process” records • Very new concepts/scientific terms may not yet be represented by MeSH • Very specific or rare concepts may never be represented by MeSH • So sometimes you will need to do a keyword search as well
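“Exploding” a heading can be sketched as walking down a tree of narrower terms and ORing everything found. In this Python illustration, `NARROWER`, `explode` and `exploded_query` are hypothetical names, and the tree holds only the narrower terms named on the slide, not the full MeSH hierarchy.

```python
# Toy slice of a heading tree: parent heading -> list of narrower headings.
NARROWER = {
    "breast neoplasms": ["breast neoplasms, male", "carcinoma, ductal, breast"],
}


def explode(heading):
    """Return the heading plus every narrower term beneath it, depth-first."""
    terms = [heading]
    for child in NARROWER.get(heading, []):
        terms.extend(explode(child))
    return terms


def exploded_query(heading):
    """OR together the heading and all of its narrower terms."""
    return " OR ".join(f'"{term}"' for term in explode(heading))


print(exploded_query("breast neoplasms"))
```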
In Process • In our “breast cancer, immunology, cancer vaccine” example, perform the following keyword search, only in the newest records (in process) • ((vaccin*) AND (breast cancer* OR breast neoplasm* OR breast tumor*)) AND in process [sb] • Try as many synonyms as possible • [sb] must be included to tell computer to just search the “in process” part of the database • * truncates to word root • This search picks up the current articles that do not yet have MeSH terms
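The keyword search above follows a regular pattern: truncate each synonym, OR the synonyms within a concept, AND the concepts together, then restrict to the in-process subset. A minimal Python sketch of that pattern; the function name `in_process_query` is hypothetical.

```python
def in_process_query(concepts):
    """Build an in-process keyword search.

    Each concept is a list of synonym stems. Stems are truncated with *,
    synonyms are ORed, concepts are ANDed, and the whole search is limited
    to the in-process subset with [sb].
    """
    groups = [
        "(" + " OR ".join(f"{stem}*" for stem in stems) + ")"
        for stems in concepts
    ]
    return "(" + " AND ".join(groups) + ") AND in process [sb]"


query = in_process_query([
    ["vaccin"],
    ["breast cancer", "breast neoplasm", "breast tumor"],
])
print(query)
```

Adding another synonym is then a matter of appending a stem to the relevant list, for example a British spelling such as `breast tumour`.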
Link Out to E-journals • Remember, if you entered PubMed directly from the HSCL’s icon, you can see if the HSCL owns the journal articles you found • Choose the “abstract” or “citation” displays from the pulldown menu • Brown and blue icons tell if the HSCL owns that journal issue electronically or in print • Will NOT tell you what is available at Marston Science Library
What if PubMed does not indicate the article is owned at UF? • Use the “Catalog” to see if the paper is available in print at the HSCL, Marston Science Library or elsewhere on campus • The catalog may also be used to help locate books, government documents, videotapes, etc – items that are not indexed in PubMed
www.library.health.ufl.edu • Click on “Catalog” from HSCL Website
Entrez Nucleotides (GenBank) • Database of nucleotide sequences (ATGC) • Actually contains data from several databases - GenBank, EMBL, DDBJ, RefSeq • Hard to search because many submitting scientists send in redundant information and poorly annotated information
Nucleotide Data Domain • As of December 15, 2009 • Over 110,118,557,163 bases • Over 112,910,950 sequence records • Over 200,000 species represented • Some complete genomes and chromosomes
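As a rough sense of scale, the two totals above imply an average record length of just under 1,000 bases. Both figures are stated as lower bounds (“over”), so the ratio in this short Python calculation is only an approximation.

```python
bases = 110_118_557_163   # total bases, as of December 15, 2009
records = 112_910_950     # total sequence records, same date

average = bases / records
print(f"about {average:.0f} bases per record on average")
```

The average hides a very wide spread, since the database holds short fragments alongside complete genomes and chromosomes.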
Organisms Represented • Homo sapiens • Many model organisms, including: • Mus musculus • Caenorhabditis elegans • Oryza sativa • Drosophila melanogaster • Arabidopsis thaliana • Non-model organisms as well (trout, etc.)
International Collaboration • Contributors: • GenBank • European Molecular Biology Laboratory (EMBL) • DNA Data Bank of Japan (DDBJ) • Daily exchange of data among these groups
GenBank Sample Record • Before searching, we will look at the GenBank sample record • Retrieve the sample record from the main page – click on “DNA & RNA”, then “GenBank”, then choose the “record” link. • Note that the “Features” field provides useful biological information, and may be searched
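A GenBank flat-file record is organized into named fields, with each top-level keyword starting in the first column and continuation lines indented beneath it. The Python sketch below splits a record into those fields; the function `top_level_fields` is hypothetical, and `SAMPLE` is a made-up, heavily shortened record for illustration, not the actual NCBI sample record.

```python
# Made-up miniature record; real records carry many more fields.
SAMPLE = """\
LOCUS       EXAMPLE1     12 bp    DNA
DEFINITION  Made-up record for illustration only.
FEATURES             Location/Qualifiers
     gene            1..12
                     /gene="demo"
ORIGIN
        1 atgcatgcat gc
//
"""


def top_level_fields(flat_file_text):
    """Group the lines of one record under their top-level keyword."""
    fields = {}
    current = None
    for line in flat_file_text.splitlines():
        if not line.strip():
            continue
        if line.startswith("//"):      # end-of-record marker
            break
        if line[0] != " ":             # keyword begins in the first column
            current, _, rest = line.partition(" ")
            fields[current] = [rest.strip()]
        elif current is not None:      # indented continuation line
            fields[current].append(line.strip())
    return {key: "\n".join(value).strip() for key, value in fields.items()}


record = top_level_fields(SAMPLE)
print(list(record))
print(record["FEATURES"])
```

Seeing the Features block as its own field makes it clearer why its contents, such as gene names, can be targeted in a search.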