CTSAconnect. Melissa Haendel Jon Corson-Rikert Carlo Torniai NCBO CTSA workshop, Baltimore April 24th, 2012. Outline. Researcher expertise Ontology alignment Ontology driven applications CTSAconnect project. Expertise systems rely largely on publications.
CTSAconnect Melissa Haendel Jon Corson-Rikert Carlo Torniai NCBO CTSA workshop, Baltimore April 24th, 2012
Outline • Researcher expertise • Ontology alignment • Ontology driven applications • CTSAconnect project
Expertise systems rely largely on publications Created by: Asik Pradhan Gongaju, Chintan Tank (programming), Nianli Ma (data acquisition), Elisha F. Hardy Allgood (design), Micah Linnemeier and Katy Börner (concept).
What do scientists really do? How many publications? … and in which journals with what impact factors? How many grants did they get? What about all the other things that scientists do? What else do they produce?
Researchers are more than just their publications • Why not get “credit” for • Teaching a course? • Writing a blog? • Submitting a sequence to GenBank? • Sharing a Resource? • Why are these things not on our CVs and biosketches? Why are we not using them to connect people, share resources and expertise, and promote collaboration?
Context: eagle-i • Research resource discovery project (eagle-i.net) developed for past 2+ years using OBO Foundry ontologies • OBO Foundry principles have been helpful with individual ontologies and improved alignment e.g., orthogonality • But Foundry is missing the people and organizations piece • So started collaborating with VIVO ontology team to bring people and organizations together with resources
Context: VIVO • Primarily focused on people, activities, and outcomes typically associated with research networking (vivoweb.org) • Eager to represent more diverse components of expertise, across domains • e.g., exhibits, performances, specifics about research • Had worked with core facilities at Cornell to represent labs, equipment, and services • Started collaborating with eagle-i to go further with research resources
Finding common ontological ground • Stay out of the weeds • Avoid tit-for-tat comparisons, winner/loser • Align under a common upper ontology (BFO) • Provides basic principles for organization • Meet in neutral territory to discuss compatibility of concepts before specific label choices • Agree on what each ontology is trying to model • OK to be interested in different granularity [Diagram labels: People (VIVO), Resources (eagle-i), Semantic Coordination, Integrated Framework]
Outline • Researcher expertise • Ontology alignment • Ontology driven applications • CTSAconnect project
Ontology issues common to eagle-i and VIVO • Organization subtype definition is institution-specific and may have many axes of classification • People • Expertise • Research focus areas • Organization structure • Funding • Location • Lab space • Equipment and other resources Need shared URIs for organizations, and partonomy relations across them
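The need for shared organization URIs and part-of relations across them can be sketched as follows. This is a minimal illustration only: the `example.org` URIs and the `PART_OF` table are hypothetical and are not identifiers used by eagle-i or VIVO.

```python
# Hypothetical shared URIs; a real deployment would use agreed, permanent ones.
PART_OF = {
    "http://example.org/org/core-lab": "http://example.org/org/department",
    "http://example.org/org/department": "http://example.org/org/school",
    "http://example.org/org/school": "http://example.org/org/university",
}


def containing_orgs(org_uri, part_of=PART_OF):
    """Return every organization that org_uri is directly or indirectly part of."""
    chain = []
    seen = {org_uri}
    current = part_of.get(org_uri)
    while current is not None and current not in seen:  # guard against cycles
        chain.append(current)
        seen.add(current)
        current = part_of.get(current)
    return chain


print(containing_orgs("http://example.org/org/core-lab"))
```

Because every system refers to the same URI for an organization, a lab described in one application can be rolled up to its department or university in another without string matching on names.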
Ontology issues common to eagle-i and VIVO Services: what are they? • An ongoing service offering provided by a lab repeatedly with generic inputs and outputs and providers? • Or a specific service delivery – a process, with specific inputs and outputs?
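One way to keep both readings of "service" is to model them as two linked kinds of thing. The sketch below is an assumption for illustration, not the eagle-i or VIVO model; the class names, field names, and sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ServiceOffering:
    """An ongoing offering: generic inputs and outputs, no particular occurrence."""
    provider: str
    name: str
    input_type: str
    output_type: str


@dataclass(frozen=True)
class ServiceDelivery:
    """One occurrence of an offering: a process with specific inputs and outputs."""
    offering: ServiceOffering
    performed_on: date
    specific_input: str
    specific_output: str


# Hypothetical example data.
sequencing = ServiceOffering("Core Lab A", "DNA sequencing", "DNA sample", "sequence file")
run = ServiceDelivery(sequencing, date(2012, 4, 2), "sample-17", "sample-17.fastq")
print(run.offering.name, run.performed_on)
```

Resource discovery would typically search the offerings, while reporting on who did what and when would use the deliveries.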
Ontology issues common to eagle-i and VIVO Needed to reconcile human and person classes
Ontology issues common to eagle-i and VIVO People’s and organizations’ classifications change over time • Situation #1 • I am a faculty member now but I may also be a student if I take a course – but will stop being a student when the course is over • We may wish to use defined classes instead of making type assertions that may someday be false • Situation #2 • I was a faculty member at Duke before I moved to Caltech and I want that information to still be represented • We may wish to represent temporal bounds on some classifications
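Both situations can be sketched by storing roles with time bounds and deriving a person's classes from them instead of asserting types directly. This is a minimal sketch under stated assumptions: the `Role` structure and all dates are invented for illustration, and a missing end date is treated here as "still current", which is a simplifying closed-world shortcut.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Role:
    person: str
    kind: str                   # e.g. "faculty" or "student"
    organization: str
    start: date
    end: Optional[date] = None  # None: no end date recorded (assumed current here)


def classes_on(person, roles, day):
    """Derive a person's classifications on a given day from role intervals."""
    return {
        r.kind
        for r in roles
        if r.person == person
        and r.start <= day
        and (r.end is None or day <= r.end)
    }


# Hypothetical data for the two situations.
roles = [
    Role("me", "faculty", "Duke", date(2005, 9, 1), date(2010, 6, 30)),
    Role("me", "faculty", "Caltech", date(2010, 7, 1)),
    Role("me", "student", "Caltech", date(2012, 1, 9), date(2012, 3, 16)),
]
print(classes_on("me", roles, date(2012, 2, 1)))   # while the course is running
print(classes_on("me", roles, date(2012, 4, 24)))  # after the course has ended
```

The Duke role stays in the data after it ends, so past affiliations remain queryable, and no type assertion has to be retracted when the course finishes.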
Open World Assumption: if we have no assertion that the role has ended, we can assume neither that it still holds nor that it has ended
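The consequence of the open world assumption can be sketched as a three-valued check. The function name and the dates in the example calls are hypothetical; the point is only that a missing boundary yields "unknown" and not a yes or no.

```python
from datetime import date
from typing import Optional


def role_holds(start: Optional[date], end: Optional[date], day: date) -> Optional[bool]:
    """Return True, False, or None (unknown) for whether a role holds on `day`."""
    if start is not None and day < start:
        return False   # before the asserted start
    if end is not None and day > end:
        return False   # after the asserted end
    if start is not None and end is not None:
        return True    # inside a fully asserted interval
    return None        # a boundary is missing: cannot conclude either way


print(role_holds(date(2005, 9, 1), date(2010, 6, 30), date(2008, 1, 1)))
print(role_holds(date(2010, 7, 1), None, date(2012, 4, 24)))
```

An application that needs a definite answer has to decide explicitly how to treat the `None` case; the data alone does not settle it.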
Issues with time • How to temporally bound properties, processes – use OWL2 annotations, or create time interval classes? • The participation in the process may have different time interval than the process itself, and the participation may have its own properties • How to handle instance classification when one temporal boundary isn’t present
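The second point, that participation has its own interval and properties, can be sketched by making participation a first-class object. This is one possible shape, with hypothetical names and data, and it assumes that a participation must fall within the interval of its process.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Interval:
    start: date
    end: date

    def contains(self, other: "Interval") -> bool:
        return self.start <= other.start and other.end <= self.end


@dataclass
class Participation:
    person: str
    interval: Interval
    properties: dict = field(default_factory=dict)  # e.g. the role played


@dataclass
class Process:
    label: str
    interval: Interval
    participations: list = field(default_factory=list)

    def add(self, p: Participation) -> None:
        # Assumption of this sketch: participation lies inside the process interval.
        if not self.interval.contains(p.interval):
            raise ValueError("participation must fall within the process interval")
        self.participations.append(p)


# Hypothetical example: a three-year project joined for one year.
project = Process("example project", Interval(date(2010, 1, 1), date(2012, 12, 31)))
project.add(Participation("person-1",
                          Interval(date(2011, 1, 1), date(2011, 12, 31)),
                          {"role": "co-investigator"}))
print(len(project.participations))
```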
We need shared instances for common use • Establishing permanent URIs for key data types will be an essential step forward in achieving the goals of linked open data: • What directories exist? • Will they interoperate? • Shared instance types: Organizations, Persons, Geographic locations, Controlled vocabularies • Global instance data: Gazetteer (GAZ), AGROVOC, U.N. Geopolitical Ontology, LC Subject Headings
Ontology issues common to eagle-i and VIVO • Interoperability of OBO and non-OBO ontologies • Numeric class and property identifiers have drawbacks for communication and writing SPARQL queries • Not all relevant ontologies are orthogonal • Terminology vocabularies may not be ontologies • VIVO and eagle-i have a need to reference portions of non-ontological vocabularies • Referenced terms may be either classes or instances • Not all relevant vocabularies are public • Need to support other lookup services • Dedicated services (e.g., ORCID, Virtual International Authority File (VIAF), Library of Congress Subject Headings (LCSH)) • Shared instances for organizations, persons, journals, events
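The drawback of numeric identifiers for writing SPARQL can be eased by generating queries from labels. The sketch below assumes a small label-to-IRI table; the `EX_` identifiers and the `example.org` namespace are invented for illustration and do not correspond to real ontology terms.

```python
# Hypothetical label-to-IRI table; the EX_ identifiers are made up for illustration.
LABEL_TO_IRI = {
    "instrument": "http://example.org/obo/EX_0000004",
    "located in": "http://example.org/obo/EX_0000057",
}


def iri(label: str) -> str:
    """Look up the numeric IRI for a human-readable label, as a SPARQL IRI term."""
    return f"<{LABEL_TO_IRI[label]}>"


def build_query() -> str:
    """Build a query from labels, keeping each label as a trailing SPARQL comment."""
    return (
        "SELECT ?resource ?place WHERE {\n"
        f"  ?resource a {iri('instrument')} .  # instrument\n"
        f"  ?resource {iri('located in')} ?place .  # located in\n"
        "}"
    )


print(build_query())
```

The query author works with labels, the endpoint still receives the opaque numeric IRIs, and a misspelled label fails early with a `KeyError` instead of silently returning no results.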
Outline • Researcher expertise • Ontology alignment • Ontology driven applications • CTSAconnect project
Designing ontology-driven applications • Both eagle-i and VIVO are ontology driven using similar methodologies • The applications adapt when ontologies change • Both use separate ontologies to control application display and behavior • Potential for both applications to consume and present the same data, each its own way
Ontology roles at every level [Diagram labels: Search (NIF, PubMed, Entrez Gene); Applications; ontologies; Federated Network or Consortial Index; Repositories (RDF); Data Collection and Editing Application; Browsing and Local Search Application; Researcher & resource information collection; Glossary Application; Terminology Services]
Indexing linked data for search [Diagram labels: Linked Open Data; Solr search index; Alternate Solr index; vivosearch.org. Sources shown: Scripps VIVO, UF VIVO, WashU VIVO, IU VIVO, Ponce VIVO, Cornell Ithaca VIVO, Weill Cornell VIVO, Other VIVOs, eagle-i Research resources, Harvard Profiles RDF, Iowa Loki RDF, Digital Vita RDF]
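The general idea of indexing linked data from several sites can be sketched as grouping harvested triples by subject into flat, multi-valued documents. This is not the vivosearch.org implementation: Solr is not called, plain dictionaries stand in for index documents, and the triples, site names, and field names are hypothetical.

```python
# Hypothetical harvested statements: (subject, predicate, object, source site).
TRIPLES = [
    ("http://example.org/a/person1", "label", "Jane Doe", "Site A VIVO"),
    ("http://example.org/a/person1", "type", "FacultyMember", "Site A VIVO"),
    ("http://example.org/b/resource9", "label", "Confocal microscope", "Site B eagle-i"),
    ("http://example.org/b/resource9", "type", "Instrument", "Site B eagle-i"),
]


def build_documents(triples):
    """Group triples by subject into flat, multi-valued search documents."""
    docs = {}
    for subject, predicate, obj, source in triples:
        doc = docs.setdefault(subject, {"id": subject, "site": source})
        doc.setdefault(predicate, []).append(obj)
    return list(docs.values())


def search(docs, term):
    """Return ids of documents whose label contains the term (case-insensitive)."""
    term = term.lower()
    return [d["id"] for d in docs
            if any(term in value.lower() for value in d.get("label", []))]


docs = build_documents(TRIPLES)
print(search(docs, "microscope"))
```

Keeping the subject URI as the document id means a search hit can link straight back to the originating site's record.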
Layered ontology approach (eagle-i example) Goal: to decouple research resource representation from information used for application appearance and behavior • Application-specific module • Classes, annotation properties, and individuals required to drive the UIs • Research resource extended modules • Sets of “referenced taxonomies” • Research resource core module • Classes and properties used to represent information about biomedical research resources • External imported terms (MIREOT)
Layered ontology approach Messy Fish™
Need for a shared (display) ontology to drive applications Common application needs: • Providing narrower domains and ranges on reused properties • Determining whether classes or properties are displayed • Formatting of data properties • Numbers of property statements to display • Ordering of statements • Customized rendering of properties and/or related individuals
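Several of these needs (visibility, ordering, and the number of statements shown) can be sketched as display annotations kept apart from the data they describe. The annotation keys `rank`, `visible`, and `limit` and the sample resource are assumptions for illustration, not terms from the eagle-i or VIVO application ontologies.

```python
# Hypothetical display annotations, kept apart from the domain data they describe.
DISPLAY = {
    "label":        {"rank": 1, "visible": True,  "limit": 1},
    "manufacturer": {"rank": 2, "visible": True,  "limit": 1},
    "technique":    {"rank": 3, "visible": True,  "limit": 2},
    "internal id":  {"rank": 9, "visible": False, "limit": 1},
}


def render(resource: dict, display: dict = DISPLAY) -> list:
    """Return display lines: hidden properties dropped, ordered by rank, capped by limit."""
    lines = []
    # Properties with no display annotation are treated as hidden.
    shown = [p for p in resource if display.get(p, {}).get("visible", False)]
    for prop in sorted(shown, key=lambda p: display[p]["rank"]):
        for value in resource[prop][: display[prop]["limit"]]:
            lines.append(f"{prop}: {value}")
    return lines


# Hypothetical resource record.
resource = {
    "technique": ["PCR", "ELISA", "Western blot"],
    "internal id": ["x-17"],
    "label": ["Core Lab A"],
}
for line in render(resource):
    print(line)
```

Changing how a property is shown then means editing the annotations, not the application code or the underlying data.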
Outline • Researcher expertise • Ontology alignment • Ontology driven applications • CTSAconnect project
CTSAconnect: The Project 2012 to 2013 • Participating Institutions • Oregon Health & Science University • Cornell University • University of Florida • Stony Brook University • Harvard University • University at Buffalo • Funded by NCATS via Booz Allen Hamilton to the CTSAs
Analyzing Connections using the Integrated Semantic Framework [Diagram: Information About People (Clinicians, Researchers) + Information About What People Have and Do (Clinical Expertise, Research Resources, Publications) = Reveal Connections, Realize Potential. Additional labels: People + Clinical Expertise + Resources]
Stony Brook UMLS as linked data • Efforts centered at Stony Brook University • Includes attributes, relationships, and semantic types from the UMLS Janos Hajagos Erich Bremer
Project Team • OHSU: Melissa Haendel, Carlo Torniai, Nicole Vasilevsky, Shahim Essaid, Eric Orwoll • Cornell University: Jon Corson-Rikert, Dean Krafft, Brian Lowe, Stella Mitchell • University of Florida: Mike Conlon, Chris Barnes, Nicholas Rejack, Stephen Williams • Stony Brook University: Moises Eisenberg, Erich Bremer, Janos Hajagos, Margaret Morris • Harvard University: Daniela Bourges-Waldegg, Sophia Cheng • Share Center: Chris Kelleher, Will Corbett, Ranjit Das, Ben Sharma • University at Buffalo: Barry Smith, Dagobert Soergel CTSA 10-001: 100928SB23 PROJECT #: 00921-0001
Links • CTSAconnect project: ctsaconnect.org • eagle-i federated search: eagle-i.net • VIVO integrated search: vivosearch.org • CTSA ShareCenter: ctsasharecenter.org • CTSAconnect ontology source: http://code.google.com/p/connect-isf/
Eagle-i Data Collection Application • ‘eagle-i preferred definition’ is used for tooltips • ‘eagle-i preferred label’ is used for the display name • Classes annotated with ‘primary resource type’ • Property annotated as ‘primary property’ • Construct insert is an example of a resource annotated as an ‘embedded class’ • Technique is annotated as ‘referenced taxonomy’