Analysis Environments For Scientific Communities: From Bases to Spaces. Bruce R. Schatz, Institute for Genomic Biology, University of Illinois at Urbana-Champaign. schatz@uiuc.edu, www.beespace.uiuc.edu. Baker Center for Bioinformatics, Iowa State University, October 6, 2006.
What are Analysis Environments • Functional Analysis • Find the underlying Mechanisms • Of Genes, Behaviors, Diseases • Comparative Analysis • Top-down data mining (vs Bottom-up) • Multiple Sources especially literature
Building Analysis Environments • Manual by Humans • Interaction: user navigation • Classification: collection indexing • Automatic by Computers • Federation: search bridges • Integration: results links
Trends in Analysis Environments Central versus Distributed Viewpoints • The 90s Pre-Genome • Entrez (NIH NCBI) versus • WCS (NSF Arizona) • The 00s Post-Genome • GO (NIH curators) versus • BeeSpace (NSF Illinois)
Pre-Genome Environments Focused on Syntax pre-Web • WCS (Worm Community System) • Search words across sources • Follow links across sources • Words automatic, Links manual Towards Integrated Searching
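The "search words across sources, follow links across sources" idea can be illustrated with a minimal sketch. It is not the WCS implementation: the source names, document ids, sample texts, and the `manual_links` table are all hypothetical. It only shows word indexes built automatically per source, with cross-source links entered by hand.

```python
from collections import defaultdict

def build_index(docs):
    """Automatically map each lowercase word to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def federated_search(word, indexes):
    """Look the word up in every source index and group the hits by source name."""
    hits = {}
    for source, index in indexes.items():
        found = index.get(word.lower(), set())
        if found:
            hits[source] = sorted(found)
    return hits

# Hypothetical toy sources (assumption: not real WCS records).
literature = {"lit1": "unc-22 mutants twitch", "lit2": "muscle structure in worms"}
genes = {"gene1": "unc-22 muscle protein"}

# Words are indexed automatically; links between sources are curated by hand.
indexes = {"literature": build_index(literature), "genes": build_index(genes)}
manual_links = {"lit1": ["gene1"]}

result = federated_search("unc-22", indexes)
linked = [target for doc in result.get("literature", []) for target in manual_links.get(doc, [])]
print(result, linked)
```

One query word reaches every source through its own index, while following a link depends entirely on what a curator has entered in the link table.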
Post-Genome Environments Focused on Semantics post-Web • BeeSpace (Honey Bee Inter Space) • Navigate concepts across sources • Integrate data across sources • Concepts automatic, Links automatic Towards Conceptual Navigation
Worm Community System • WCS Information • Literature: BIOSIS, MEDLINE, newsletters, meetings • Data: genes, maps, sequences, strains, cells • WCS Functionality • Browsing: search, navigation • Filtering: selection, analysis • Sharing: linking, publishing • WCS: 250 users at 50 labs across Internet (1991)
WCS Molecular
WCS Cellular
WCS invokes gm
WCS vis-à-vis acedb
Towards the Interspace • from Objects to Concepts • from Syntax to Semantics • Infrastructure is Interaction with Abstraction • Internet is packet transmission across computers • Interspace is concept navigation across repositories
THE THIRD WAVE OF NET EVOLUTION [Figure labels: CONCEPTS, OBJECTS, PACKETS]
LEVELS OF INDEXES [Figure labels: FORMAL (manual), Technology, Engineering, Electrical, IEEE; INFORMAL (automatic), communities, groups, individuals]
Post-Genome Informatics I Comparative Analysis within the Dry Lab of Biological Knowledge • Classical Organisms have Genetic Descriptions. • There will be NO more classical organisms beyond Mice and Men, Worms and Flies, Yeasts and Weeds. • Must use comparative genomics on classical organisms, via sequence homologies and literature analysis.
Post-Genome Informatics II Functional Analysis within the Dry Lab of Biological Knowledge • Automatic annotation of genes to standard classifications, e.g. Gene Ontology via homology on computed protein sequences. • Automatic analysis of functions to scientific literature, e.g. concept spaces via text extractions. Thus must use functions in literature descriptions.
Informatics: From Bases to Spaces • data Bases support genome data, e.g. FlyBase has sequences and maps; genes annotated by Gene Ontology and linked to biological literature • information Spaces support biological literature, e.g. BeeSpace uses automatically generated conceptual relationships to navigate functions
BeeSpace FIBR Project • BeeSpace project is an NSF FIBR flagship (Frontiers in Integrative Biological Research), $5M for 5 years at University of Illinois • Analyzing Nature and Nurture in Societal Roles using honey bee as model (Functional Analysis of Social Behavior) • Genomic technologies in wet lab and dry lab • Bee [Biology]: gene expressions • Space [Informatics]: concept navigations
System Architecture [Figure: BeeSpace architecture diagram; labels: Concepts, Expressions, Databases (Bees, Flies), Documents, SEQ, Community]
Concept Navigation in BeeSpace [Figure labels: Behavioral Biologist, Molecular Biologist, Neuroscientist; Molecular Biology Literature, Bee Literature, Neuroscience Literature; Brain Gene Expression Profiles, Bee Genome, FlyBase, WormBase, Brain Region Localization]
V1 BeeSpace Community Collections • Organism • Honey Bee / Fruit Fly • Song Bird / Soy Bean • Behavior • Social / Territorial • Foraging / Nesting • Development • Behavioral Maturation • Insect Development • Insect Communication • Structure • Fly Genetics / Fly Biochemistry • Fly Physiology / Insect Neurophysiology
CONCEPT SWITCHING [Figure labels: semantic region, term, Concept Space, Concept Space] • “Concept” versus “Term” • set of “semantically” equivalent terms • Concept switching • region to region (set to set) match
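The "region to region (set to set) match" can be illustrated with a small sketch. The slide does not say how regions are compared, so Jaccard overlap is used here purely as an assumption. The two toy concept spaces and the function names `jaccard` and `switch_concept` are hypothetical.

```python
def jaccard(a, b):
    """Overlap of two term sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def switch_concept(term, source_space, target_space):
    """Match the region around `term` in one space against every region in another.

    Each space maps a term to the set of terms related to it; a region is the
    term plus its related terms. Returns target terms ranked by region overlap.
    """
    region = source_space[term] | {term}
    scored = []
    for target_term, related in target_space.items():
        score = jaccard(region, related | {target_term})
        if score > 0:
            scored.append((score, target_term))
    # Highest overlap first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, round(score, 2)) for score, name in scored]

# Hypothetical toy concept spaces (assumption: not real BeeSpace data).
bee_space = {"foraging": {"nectar", "pollen", "behavior"}}
fly_space = {
    "feeding": {"nectar", "behavior", "larva"},
    "courtship": {"song", "behavior"},
    "wing": {"vein", "hinge"},
}

matches = switch_concept("foraging", bee_space, fly_space)
print(matches)
```

Because whole sets are compared, "foraging" can land on "feeding" in the second space even though the two spaces do not share that term.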
BeeSpace Analysis Environment • Build Concept Space of Biomedical Literature for Functional Analysis of Bee Genes • Partition Literature into Community Collections • Extract and Index Concepts within Collections • Navigate Concepts within Documents • Follow Links from Documents into Databases • Locate Candidate Genes in Related Literatures, then follow links into Genome Databases
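The steps above (extract and index concepts within a collection, navigate them, then follow links into databases) can be sketched roughly as follows. This is an illustration under stated assumptions, not the BeeSpace implementation: term co-occurrence within a document stands in for concept extraction, and the sample collection, vocabulary, gene name `gene_a`, and database identifier are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_concept_space(documents, vocabulary):
    """Count how often pairs of vocabulary terms co-occur in the same document."""
    space = defaultdict(Counter)
    for text in documents:
        present = sorted(set(text.lower().split()) & vocabulary)
        for a, b in combinations(present, 2):
            space[a][b] += 1
            space[b][a] += 1
    return space

def related_terms(space, term, top_n=3):
    """Navigate from a term to the terms that co-occur with it most often."""
    neighbors = space.get(term, Counter())
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]

# Hypothetical community collection and vocabulary (assumption: toy data).
collection = [
    "foraging behavior gene_a expression",
    "foraging gene_a brain",
    "nesting behavior brain",
]
vocabulary = {"foraging", "behavior", "gene_a", "brain", "nesting"}

# Hypothetical links from gene terms to database records.
gene_links = {"gene_a": "hypothetical-db:GENE_A"}

space = build_concept_space(collection, vocabulary)
neighbors = related_terms(space, "foraging", top_n=2)
candidates = [term for term in neighbors if term in gene_links]
print(neighbors, [gene_links[term] for term in candidates])
```

Building one space per community collection keeps the co-occurrence statistics specific to that community's literature, which is the reason for partitioning first.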
Well Characterized Gene Ling et al., PSB 2006
Poorly Characterized Gene Ling et al., PSB 2006