VIVO: Collaboration and Connections for Research Discovery and Scholarship

Mike Conlon, University of Florida
Kristi Holmes, Washington University, St. Louis
Michele Tennant, University of Florida

Public, structured linked data about investigators' interests, activities, and accomplishments, and tools that use that data to advance science

What is VIVO?
• VIVO is open standards and linked open data about research and scholarship – people, papers/products, funding, events, resources, projects, data, concepts – and the relationships between them
• VIVO is open-source, community-maintained software tools for research discovery and networking
• VIVO is a worldwide community of collaborators – scientists, implementers, developers

How does VIVO store data?
• Information is stored using the Resource Description Framework (RDF) as subject-predicate-object “triples”
[Figure: example triples. Subject: Jane Smith. Predicates: "professor in", "has affiliation with", "author of". Objects: Dept. of Genetics, College of Medicine, Genetics Institute, journal article, book, book chapter.]
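The subject-predicate-object model can be sketched in a few lines of Python. This is an illustration only, not how VIVO stores data: the triples are plain tuples, the `match` helper is hypothetical, and the pairing of each predicate with its objects is assumed from the slide's example.

```python
# Illustrative triples as (subject, predicate, object) tuples.
# The pairings are assumed from the slide's example, not taken from VIVO data.
TRIPLES = [
    ("Jane Smith", "professor in", "Dept. of Genetics"),
    ("Jane Smith", "has affiliation with", "Genetics Institute"),
    ("Jane Smith", "author of", "Journal article"),
    ("Jane Smith", "author of", "Book"),
    ("Jane Smith", "author of", "Book chapter"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Everything Jane Smith is recorded as having authored.
authored = [o for (_, _, o) in match(TRIPLES, predicate="author of")]
print(authored)
```

Leaving a position as a wildcard is the same idea a SPARQL variable expresses: the query states the parts of the triple that are known and asks for the rest.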

Semantic Web
• Resource Description Framework (RDF) – W3C standard for representing triples
• RDF-Schema (RDFS) – descriptions for ontologies
• OWL – Web Ontology Language
• Semantic Web for the Working Ontologist, Second Edition: Effective Modeling in RDFS and OWL, Dean Allemang, James Hendler, 2011
• One or more ontologies for representing information
• Sets of triples
• SPARQL – a query language for RDF

VIVO and the Semantic Web
• VIVO is a semantic web application
• VIVO stores its data in triples
• All triples have URIs for each element
• All data is available via RDF
• The VIVO ontology is available at http://vivoweb.org/ontology/core
• VIVO contains an ontology editor
• VIVO provides a data harvester
• VIVO provides a SPARQL endpoint
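As a rough sketch of how a client might address a SPARQL endpoint, the snippet below builds a GET request URL that carries the query in the `query` parameter, as the SPARQL protocol specifies. The endpoint URL is hypothetical, since each VIVO site publishes its own, and nothing is sent over the network here.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint URL; a real VIVO installation publishes its own.
ENDPOINT = "http://vivo.example.org/sparql"

# SPARQL query: up to 10 resources together with their rdfs:label.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?label
WHERE { ?s rdfs:label ?label }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode the query text as the `query` parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)

# Decoding the URL again recovers the original query text unchanged.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(url)
```

The resulting URL could be passed to any HTTP client; the response format depends on the endpoint's configuration.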

… And the connections between them

Providing Data

Linked Open Data – September 2011

Providing Open Linked Data
• VIVO version 1.3 completed; includes spreadsheet upload, Google Refine, and the Harvester
• Fifty US schools adopting VIVO
• Harvard Profiles (30 sites) providing data using VIVO ontology and RDF
• SciVal Experts (20 sites) working to provide VIVO ontology-based RDF
• American Psychological Association adopts VIVO for its 154,000 members
• USDA adopts VIVO: 40,000 scientists, 80,000 staff, 50 land grant universities
• CTSA Consortium to propose VIVO ontology and RDF as a consortium-wide standard
• University of Rochester to provide CTSA-IP as VIVO data
• Eagle-I and VIVO working to produce common ontology via RDF
• ORCID, Community of Science, Federal Researcher Profile System plan interchange with VIVO
• Stony Brook producing UMLS concept linkages to VIVO profiles
• Indiana provides HubZero profiles (3,000) via VIVO; Iowa provides Loki profiles (1,000) via VIVO
• Adoptions in Mexico, Costa Rica, Puerto Rico, India, China, UK, Netherlands, Brazil
• Eight major Australian research universities and Australian federal research adopt VIVO
• Thomson Reuters and Elsevier providing data to VIVO
• Wellspring offering individual VIVO profiles
• Wellspring, Elsevier, Symplectics offering VIVO implementation services

Data, Tools and Community
Web-based applications provide services for discovery and scholarship
Linked Open Data in a common format, regardless of the system providing data

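Software reads RDF from VIVO and displays it

    library(XML)    # xmlParse, getNodeSet, xmlValue, xmlAttrs
    library(RCurl)  # getURI

    # Recursively read an organization's RDF/XML and collect its sub-organizations
    processOrg <- function(uri) {
      x <- xmlParse(uri)
      name <- xmlValue(getNodeSet(x, "//rdfs:label")[[1]])
      subs <- getNodeSet(x, "//j.1:hasSubOrganization")
      u <- list()
      for (sub in subs) {
        # getURI fetches the RDF/XML text of the sub-organization
        sub.rdf <- getURI(xmlAttrs(sub)["resource"])
        # wrap in list() so each sub-organization stays a separate element
        u <- c(u, list(processOrg(sub.rdf)))
      }
      list(name = name, subs = if (length(u) == 0) NULL else u)
    }

VIVO produces both HTML and RDF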
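The same recursive walk can be sketched in Python with the standard library. This is an assumption-laden illustration rather than VIVO client code: the URIs and sample documents are invented, the documents are served from an in-memory dictionary instead of over HTTP, and the namespace used for `hasSubOrganization` is assumed to be the VIVO core ontology.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
CORE = "http://vivoweb.org/ontology/core#"  # assumed namespace for hasSubOrganization


def rdf_doc(uri, label, sub_uris=()):
    """Build a tiny RDF/XML document for one organization (sample data only)."""
    subs = "".join(
        f'<core:hasSubOrganization rdf:resource="{s}"/>' for s in sub_uris
    )
    return (
        f'<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:core="{CORE}">'
        f'<rdf:Description rdf:about="{uri}">'
        f"<rdfs:label>{label}</rdfs:label>{subs}"
        "</rdf:Description></rdf:RDF>"
    )


# Hypothetical URIs standing in for documents a VIVO site would serve.
DOCS = {
    "http://example.org/org/1": rdf_doc(
        "http://example.org/org/1",
        "College of Medicine",
        ["http://example.org/org/2"],
    ),
    "http://example.org/org/2": rdf_doc(
        "http://example.org/org/2", "Dept. of Genetics"
    ),
}


def process_org(uri, fetch=DOCS.__getitem__):
    """Return {'name': ..., 'subs': [...]} for the organization at `uri`."""
    root = ET.fromstring(fetch(uri))
    name = root.find(f".//{{{RDFS}}}label").text
    subs = [
        process_org(node.attrib[f"{{{RDF}}}resource"], fetch)
        for node in root.iter(f"{{{CORE}}}hasSubOrganization")
    ]
    return {"name": name, "subs": subs}


tree = process_org("http://example.org/org/1")
print(tree)
```

Passing `fetch` as a parameter keeps the traversal separate from the transport, so an HTTP-based fetcher could be substituted without changing the recursion.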

Research Discovery and Scholarship Tools
• Duke – web site plug-ins – OpenSocial, Drupal, WordPress for using VIVO data
• Digital Enterprise Research Institute – analytics for VIVO data
• UCSF – find investigators “like me” across the network
• Harvard – visualize publication collaboration patterns
• Northwestern – C-IKnow Recommender for team building
• Pittsburgh – Digital Vita – produce vita and biosketches
• Weill – Google Refine for VIVO data
• Stony Brook – mapping people to UMLS concepts
• APA – identity management
• CTSA consortium – NIH reporting
• Community of Science – use VIVO data for faculty interests, route opportunities to faculty
• Federal Researcher Profile System – avoid duplication of entry, simplify administration
• OpenPHACTS – provide provenance for assertions regarding pharmaceutical compounds
• National Research Networking visualization – show data sources and inventory of data

The VIVO Community
http://vivoweb.org
http://vivo.sourceforge.net

China

Building Community
• Federal agencies – OSTP, NIH, NLM, NSF, USDA, FDP, EPA, FRPS, STAR Metrics, …
• Publishers and Aggregators – Elsevier, Thomson Reuters, ORCID, CiteSeer, arXiv, PLoS, DSpace, Symplectics, …
• Professional Societies – APA, AAAS, AIRI, AAMC, ABRF, …
• International collaborators – Ireland, Germany, Australia, China, Netherlands, UK, Costa Rica, Iceland, Brazil, Mexico, India, …
• Semantic Web community – DERI, Tim Berners-Lee, Jim Hendler, MyExperiment, ConceptWeb, Open Phacts (EU), Linked Data, …
• Ontology – OBO, NBIC, Eagle-I, BRO, eBIRT, RDS, …
• Open Source cooperatives – Kuali, Sakai, Duraspace, …
• Social Network Analysis Community – Northwestern, Davis, UCF, INSNA, …
• Schools and Consortia – CTSAs, Pitt, Stony Brook, Duke, Weill, Indiana, Emory, Iowa, Harvard, Rochester, UCSF, Stanford, MIT, Brown, Michigan, Nebraska, Colorado, Hunter, OHSU, Minnesota, …
• Four annual events – conference, workshop, hackathon, implementation fest
• Over 10,000 downloads, over 1,600 participants on distribution list

VIVO 2012, August 22-24, Hotel Intercontinental, Miami, Florida

Thank you! The VIVO Team 2011
