Searching Biochemistry SF-hbz meeting 2010
Biochemistry is a cornerstone of the CAS databases
• 19 dedicated sections for Biochemistry
• Almost 25,000 journal references in the biochemistry section from German universities from 2008 to now
• Published in over 2600 different journals
• Indexing of biological species, thesaurus of taxa, CAS Registry Numbers for biochemical compounds
SciFinder is a trademark of the American Chemical Society
Most common journals for German biochem research
Search example: Trisporic acids
• Trisporic acids are a group of related compounds that act as pheromones for fungi of the order Mucorales
• The terminology distinguishes trisporic acids A, B, C, D and E, sometimes as different enantiomers
• There are also chemically slightly different derivatives, esters, and hydrogenated compounds derived from trisporic acids
Let’s meet the team of Trisporic acids
[Structure diagrams: TSA-A, TSA-B (trans), TSA-C, TSA-D, TSA-E (cis)]
Some close derivatives of Trisporic acids
[Structure diagrams: Trisporol B, Trisporin B, Apotrisporin, Trisporone, Methyl trisporate B]
How can these variations be searched by name?
• SciFinder’s Substance Identifier Explore is designed to find one exact compound
• If the substance name is not found as an exact match of the full name, the name is searched as natural name fragments
  • TRISPORATE doesn’t retrieve TETRAHYDROTRISPORATE
• If that also doesn’t produce any results, the name is searched as smallest chemically relevant fragments
  • TRISPOR will retrieve DEOXYTRISPORONE
• When the search for name fragments retrieves more than 100 answers, SciFinder only indicates that there are too many answers
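The fallback order described on this slide can be sketched in a few lines of Python. This is only an illustration of the three tiers, not SciFinder's actual implementation: the function names, the toy index, the use of whole-word matching as a stand-in for "natural name fragments", and the use of plain substring matching as a stand-in for "smallest chemically relevant fragments" are all assumptions.

```python
import re

MAX_FRAGMENT_ANSWERS = 100  # threshold quoted on the slide


def tokens(name):
    """Split a name into lowercase word-like pieces (assumed stand-in for natural fragments)."""
    return [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]


def search_by_name(query, index, limit=MAX_FRAGMENT_ANSWERS):
    """Hypothetical tiered name search; returns (tier, answers)."""
    q = query.strip().lower()

    # Tier 1: exact match of the full name.
    exact = [n for n in index if n.lower() == q]
    if exact:
        return "exact", exact

    # Tier 2: every query word must equal a whole word of the indexed name.
    q_tokens = set(tokens(q))
    answers = [n for n in index if q_tokens and q_tokens <= set(tokens(n))]
    tier = "natural fragments"

    # Tier 3: substring match, tried only if tier 2 found nothing.
    if not answers:
        answers = [n for n in index if q and q in n.lower()]
        tier = "smallest fragments"

    if len(answers) > limit:
        return "too many answers", []
    if not answers:
        return "no answers", []
    return tier, answers


# Toy index built from names mentioned in this deck.
INDEX = ["Trisporic acid", "Methyl trisporate B",
         "Tetrahydrotrisporate", "Deoxytrisporone"]

print(search_by_name("trisporate", INDEX))
print(search_by_name("trispor", INDEX))
```

With this toy index, TRISPORATE stops at the second tier and does not reach TETRAHYDROTRISPORATE, while TRISPOR falls through to the third tier and also picks up DEOXYTRISPORONE, mirroring the two examples on the slide.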
Name searching example: exact match
• Only one answer is retrieved because there is a compound with that exact name
• This compound is indexed when the author doesn’t specify which variation is used
How could SciFinder search name fragments?
• Retrieves 24 answers that have “trisporic” as a natural name fragment
• We may miss compounds indexed only with a systematic name
The systematic approach by structure
• Retrieves 41 answers
• But not the unknown substance R1=H/O
Of the structure search answers, 24 are not retrieved in the name search
• Answers include several esters and stereoisomers without trivial names
Combining the results by name and structure
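Combining two answer sets is a set union, so substances found by both searches are counted once. The sketch below uses placeholder identifiers (not real CAS Registry Numbers) sized to match the counts quoted earlier: 24 name-fragment answers, 41 structure answers, 24 of which the name search missed. The deck does not state the size of the combined set; the figure computed here follows only under the assumption that those three counts refer to the same kind of substance record.

```python
# Placeholder identifiers; a real answer set would hold CAS Registry Numbers.
shared = {f"shared-{i}" for i in range(17)}              # found by both searches (41 - 24)
name_only = {f"name-{i}" for i in range(7)}              # name search only (24 - 17)
structure_only = {f"structure-{i}" for i in range(24)}   # structure search only

name_hits = shared | name_only
structure_hits = shared | structure_only

# Union keeps each substance once, however many searches found it.
combined = name_hits | structure_hits

print(len(name_hits), len(structure_hits), len(combined))  # 24 41 48
```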
A broad answer set is obtained
This set produces 87 de-duplicated literature references
The Categorize analysis on Organisms:
How would this compare to a similarity search?
A ranked list of similar compounds is found
• The 41 substructure search answers are all included in the 75-99% similarity search result
Some fairly common compounds can be found uniquely in the similarity search result
• Total number of substances for the search with at least 75% similarity is 368
• These lead to 650 de-duplicated references
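A ranked similarity search with a percentage cutoff can be illustrated with the Tanimoto coefficient, a common measure for comparing structure fingerprints. The sketch below assumes fingerprints are plain sets of feature numbers; the feature sets, compound labels, and function names are made up for illustration and do not reproduce the scoring used in the searches shown here.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: shared features divided by all features present in either set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_search(query, library, threshold=0.75):
    """Return (name, score) pairs at or above the threshold, best match first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each number stands for one structural feature.
query_fp = {1, 2, 3, 4}
library = {
    "identical": {1, 2, 3, 4},
    "close": {1, 2, 3, 4, 5},
    "far": {1, 9},
}

print(similarity_search(query_fp, library))
# [('identical', 1.0), ('close', 0.8)]
```

Because the cutoff is applied to an overall score rather than to the presence of a required substructure, a similarity search can return compounds that a substructure query would exclude.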
The Categorize analysis of the similar compounds shows details of the organisms or the therapeutic areas
Searching for organism names
• Has 534 de-duplicated answers
Searching for organism names in REGISTRY
Comments on searching biological names
• Narrower terms are currently not included in the smart searching of Research Topic Explore
• Full genus-species names are often indexed, but abbreviations also occur (B. trispora)
• Names of genes, clones, receptors, antagonists, etc. are often written as codes with special symbols (no standardization), e.g., 5-HT, 5-hydroxy-tryptophan, 5-HTP, serotonin receptor
• Sequenced biological entities should be searched in Registry with name fragments
• Patented sequences often have only SEQID codes and no biological descriptions
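Since both full and abbreviated genus-species names occur in the index, a searcher may want to try both forms. A minimal sketch, assuming a two-word binomial name as input; the helper name is hypothetical.

```python
def name_variants(binomial):
    """Return the full genus-species name and its abbreviated-genus form."""
    parts = binomial.split()
    if len(parts) < 2:
        raise ValueError("expected a genus and a species, e.g. 'Blakeslea trispora'")
    genus, species = parts[0], parts[1]
    return [f"{genus} {species}", f"{genus[0]}. {species}"]


print(name_variants("Blakeslea trispora"))
# ['Blakeslea trispora', 'B. trispora']
```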
CAS has a project to look at biological entity searching
• What is needed to attract biologists/biochemists?
• How often would a search by species and subspecies be needed?
• How much control should be given to the user?
• How could this be made most intuitive?
• Should specific sequences from these biological entities be automatically searched?
• How often would extremely broad families be searched (all vertebrates, all fungi)?
CAplus has a rich thesaurus for biological taxa, which is used in the Categorize function but not in searching
• Search all narrower taxa of Mucorales
  • Absidia
  • Absidia anomala
  • Actinomucor
  • Backusella
  • Blakeslea
  • Blakeslea trispora
  • Cunninghamella
  • Mucor
  • Rhizopus
  • Zygorhynchus
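Using a thesaurus in searching would mean expanding a broad term into all of its narrower terms before the query runs. The sketch below shows that expansion on the taxa listed above. The dictionary layout, the function name, and the placement of each species under its genus are assumptions made for illustration; they are not taken from the CAplus thesaurus itself.

```python
# Hypothetical broader-term -> narrower-terms mapping built from the list above.
NARROWER = {
    "Mucorales": ["Absidia", "Actinomucor", "Backusella", "Blakeslea",
                  "Cunninghamella", "Mucor", "Rhizopus", "Zygorhynchus"],
    "Absidia": ["Absidia anomala"],
    "Blakeslea": ["Blakeslea trispora"],
}


def expand(term, narrower=NARROWER):
    """Return the term followed by all of its narrower terms, depth first."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against repeated or cyclic entries
        seen.append(current)
        # Reverse so that children are visited in their listed order.
        stack.extend(reversed(narrower.get(current, [])))
    return seen


terms = expand("Mucorales")
print(len(terms))  # 11
print(" OR ".join(f'"{t}"' for t in terms))
```

The joined string is only one way to hand the expanded list to a search; the OR syntax is illustrative rather than a specific query language.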
Focus on Biochemistry Questions? Comments?