Build VIVO in the Cloud. NIH Workshop on Value Added Services for VIVO http://scimaps.org/flat/meeting/110325/ Brand Niemann Semantic Community March 25-26, 2011 http://semanticommunity.net and http://semanticommunity.info/Build_VIVO_in_the_Cloud. Wiki Page: Top. Wiki Page: Bottom.
Workshop Goals • This 1 ½ day workshop brings together programmers and users of VIVO specifically and of other national researcher networking (NRN) services in general. Demonstrations of existing approaches, tools, and techniques, as well as discussion of synergies, will provide a point of departure for developing value added services for massive amounts of interlinked (semantic web) scholarly data. • The goals of the meeting are: • Present and discuss current research, tools, and services for VIVO/NRN. • Identify synergistic collaboration opportunities and challenges. • Outline a course of activities over the next 5 years. • Given the diverse backgrounds of the participants and the goals of the workshop, we will use the first ½ day for brief self-introductions, followed by three 30-minute overview talks that set the stage for the workshop. The day concludes with a discussion of challenges and opportunities and a hosted dinner. • The second full day features brainstorming and discussion sessions in different team sizes and combinations. A particular focus is on application development: how VIVO and other NRN services can support application development, how application development tools can be shared, and what kinds of NRN applications and services are technologically feasible and most beneficial for supporting science.
Brief Self-Introductions • My Exhibit Entry • An Interface to the Digital Library of the Atlas of Science, January 27, 2011: http://semanticommunity.info/Atlas_of_Science. • Working with Katy Borner's Suggested Databases: • “This suggests it might be useful to try to find open data sources for at least some of the maps so others could reproduce and build on the original results: http://sdb.cns.iu.edu and http://sci2.cns.iu.edu are all about open data and open code.” • This Pilot (lots of valuable content in PDF files!) • Your great content should all be in linked open databases in the cloud along with the individual profiles. • Prepare for the 2011 VIVO Conference, August 24-26, 2011. • Continuation of this pilot.
The Challenges • Enabling collaboration and discovery between scientists across all disciplines by providing semantic web-compliant data to the network. • http://vivoweb.org/participate • The application provides linked data via RDF, making you part of the semantic web, or you can use any other application that provides linked data. You can also get involved with developing applications that provide enhanced search, new collaboration capabilities, and grouping, finding, and mapping of scientists and their work. For more information, see the participation page. • http://vivoweb.org/about/faq/about-project
Key Concepts • More broadly, VIVO is committed to the open publication, use and reuse of this researcher related data. The project plans to make data in VIVO available through embedded RDFa, through web APIs supporting the bulk download of RDF, and potentially through SPARQL endpoints. One of our biggest opportunities is in working with existing Semantic Web efforts to ensure that they can fully leverage VIVO data and integrate it into larger scientific and social systems. • http://vivoweb.org/files/websci10_submission_82.pdf • Web Science and “Little Semantic Web”: • Hendler, J., Shadbolt, N., Hall, W., Berners-Lee, T. and Weitzner, D. Web science: an interdisciplinary approach to understanding the web. Communications of the ACM, 51 (7), 60-69, http://doi.acm.org/10.1145/1364782.1364798; and Jim Hendler, 1997 or 1998, http://www.cs.rpi.edu/~hendler/LittleSemanticsWeb.html. • Sparklines by Edward Tufte: • Small, high-resolution graphics embedded in a context of words, numbers, and images; data-intense, design-simple, word-sized graphics; information graphics characterized by small size and data intensity: • E.g. OMB data shows the ebb and flow of the deficit from 1983 to 2003. • More information at the Wikipedia entry for sparkline and Tufte's article on sparklines.
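To make the sparkline idea concrete, here is a minimal sketch of a word-sized graphic built from Unicode block characters. It is not Tufte's own method or any VIVO feature; the function name `sparkline` and the sample numbers are assumptions made up for illustration, and the numbers are not OMB figures.

```python
# Eight block characters, from lowest to highest bar.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Return one block character per value, scaled to the min-max range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no ebb and flow to show; use the lowest bar.
        return BLOCKS[0] * len(values)
    chars = []
    for v in values:
        # Scale to 0..7 and round to the nearest block index.
        idx = int((v - lo) / span * (len(BLOCKS) - 1) + 0.5)
        chars.append(BLOCKS[idx])
    return "".join(chars)


# Made-up numbers for illustration only (hypothetical, not OMB data).
sample = [1, 5, 3, 8, 2, 8, 4]
print("trend " + sparkline(sample) + " over seven periods")
```

Because the result is an ordinary string, it can sit inside a sentence, a table cell, or a profile page next to the numbers it summarizes.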
The Realities • Open Linked Data (via RDF or N3): • Advantage: Accessible by anyone on the Web. • Disadvantage: Difficult to work with large amounts of data quickly/easily. • Examples: • N3: http://vivo-vis-test.slis.indiana.edu/vivo/individual/Person72/Person72.n3 • RDF: http://vivo-vis-test.slis.indiana.edu/vivo/individual/Person72/Person72.rdf • SPARQL Endpoints: • Advantage: Working with data is easier/faster (using SPARQL queries). • Disadvantage: May not be accessible to everyone. • IU Research Instance: • http://vivo-vis.slis.indiana.edu • Contains test data for visualization development. • Scopus publication data from May 2008 (grant data from NSF in the works). • Data was converted from CSV to RDF format.
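The two access routes above can be sketched as URL construction with the Python standard library. This is a sketch under stated assumptions, not the IU instance's documented API: the per-individual URL pattern is inferred from the two Person72 examples, the endpoint `http://example.org/vivo/sparql` is a placeholder (the slides give no SPARQL URL), and the query assumes people are typed as `foaf:Person`. The `query` parameter name follows the W3C SPARQL Protocol.

```python
from urllib.parse import urlencode

# Base taken from the Person72 example URLs on this slide.
LINKED_DATA_BASE = "http://vivo-vis-test.slis.indiana.edu/vivo/individual"

# Placeholder endpoint (assumption): the slides do not give the real one.
SPARQL_ENDPOINT = "http://example.org/vivo/sparql"

# Assumes individuals are typed foaf:Person (not confirmed by the slides).
QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person WHERE { ?person a foaf:Person } LIMIT 10"""


def linked_data_url(individual_id, fmt):
    """Build the per-individual document URL; fmt is 'n3' or 'rdf'."""
    if fmt not in ("n3", "rdf"):
        raise ValueError("fmt must be 'n3' or 'rdf'")
    return f"{LINKED_DATA_BASE}/{individual_id}/{individual_id}.{fmt}"


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL carrying the query text."""
    return endpoint + "?" + urlencode({"query": query})


# Open linked data: one document per individual.
print(linked_data_url("Person72", "n3"))
# SPARQL endpoint: one request can select across many individuals.
print(sparql_get_url(SPARQL_ENDPOINT, QUERY))
```

The contrast matches the slide: the linked data route needs one fetch per individual, while a single SPARQL request can filter across the whole store, provided the endpoint is open to the caller.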
My Suggestions • Need to use cloud computing which includes both linked open data (RDF) and any other application that provides linked data and that supports multiple visualizations, sparklines, statistics, etc. with: • Three types of content: • High-level organization and conferences; • Scholarly databases; • Individual profiles. • Need both links between databases and links to the Web.
My Suggestions • 1. Making Individuals In to Information Architects and Preservationists • OpEd submitted to FCW, January 12, 2011, and published January 31, 2011. • 2. Data Services - What Data.gov and Many Other Things Should Be • OpEd submitted to FCW, February 9, 2011. • 3. Federal Cloud Computing: It can really happen if we can do our own IT! • OpEd submitted to FCW, February 12, 2011. • 4. Gov 2.0 Platform Data Services with Cloud Computing: OMB Earmarks Database • OpEd submitted to FCW, February 19, 2011. • 5. Gov 2.0 Platform Data Services with Cloud Computing: HealthDataGov • OpEd in process. • Now I want to bring all of those together in a specific example.
1. Making Individuals In to Information Architects and Preservationists http://fcw.com/Articles/2011/01/31/COMMENT-Brand-Niemann-social-media-archives.aspx For links see: http://semanticommunity.info/A_Gov_2.0_spin_on_archiving_2.0_data
Next Steps: Scholarly Database Search results retrieved from the different databases can be downloaded as a data dump in CSV file format. Recall Slide 12 http://sdb.cns.iu.edu/about/
Next Steps: Scholarly Database http://sdb.cns.iu.edu/download/?q=year:[1865%20TO%202010]&db=uspto&search=b3be4735bf6e4be0b49848b93c018820
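The download URL above can be taken apart with the Python standard library to see what a request asks for. This is a minimal sketch that only parses the URL shown on the slide; the helper name `describe_download` is hypothetical, and reading `q` as the search query, `db` as the database, and `search` as a search identifier is an assumption based on the parameter names, not documented behavior of the Scholarly Database.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the slide.
DOWNLOAD_URL = (
    "http://sdb.cns.iu.edu/download/"
    "?q=year:[1865%20TO%202010]&db=uspto"
    "&search=b3be4735bf6e4be0b49848b93c018820"
)


def describe_download(url):
    """Return the query parameters of a download URL as a plain dict."""
    params = parse_qs(urlsplit(url).query)
    # Each parameter appears once here, so keep only the first value.
    return {name: values[0] for name, values in params.items()}


info = describe_download(DOWNLOAD_URL)
for name, value in info.items():
    print(f"{name} = {value}")
```

Decoding shows that `%20` is just an encoded space, so the `q` value is a year-range expression, `year:[1865 TO 2010]`, applied to the `uspto` database.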