Research Output Repository Platform (code name “Famulus”)
Alex Wade – Research Program Manager
Savas Parastatidis – Software Philosopher
eScience Workshop 2008
Tutorial Objectives And Takeaways
• Overview of Semantic Computing concepts
• Context of Research Repositories
• How to leverage our technologies to build a solution for the digital repositories community
• Takeaway: How to use MS technologies in order to build an extensible semantic computing platform
• Takeaway: The role of digital repositories in the future academic and research environments and what MS can offer
Outline
• Background
  • Semantic Computing
  • Research-output Repositories
• MSR’s Repository Platform
  • Features
  • Architecture
  • Demos
• Wrap-up
But we love to improvise! So let’s make the session interactive.
Semantics
• Term used to refer to the concept of “meaning”
• The linguistics, AI, Natural Language Processing, etc. communities have been working on “meaning” and “knowledge” related technologies for decades
Semantic Computing
• Emergence of a new breed of technologies to capture meaning (RDF, OWL, etc.)
• Combined with the pervasiveness of Web community technologies such as folksonomies …
What is Semantic Computing?
• Set of concepts and technologies
  • Data modeling
  • Relationships
  • Ontologies
  • Machine learning (entity extraction)
  • Inference, reasoning
• Data, information, knowledge…
Current technologies
Possibilities for innovation
Semantic Computing
Set of technologies to...
• Model data and their connections
  • e.g. RDF, Topic Maps, Unified Content Descriptors
• Capture concepts and their relationships
  • e.g. OWL
• Query data and produce information
  • e.g. SPARQL
• Reason about data, concepts, information
  • e.g. Pellet
• Extract structured information (machine learning)
  • e.g. Live Labs entity extraction (http://labs.live.com/Entity+Extraction.aspx)
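To make the "model data and their connections" and "query data" bullets concrete, here is a minimal sketch of subject/predicate/object statements held in memory and matched against a pattern. It is not an RDF or SPARQL implementation: the `Triple` class, the `TripleDemo.Match` helper, and the `paper:1` style identifiers are all hypothetical names chosen for illustration, and a `null` argument only loosely plays the role of a variable in a SPARQL triple pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One statement in the subject/predicate/object style used by RDF.
public class Triple
{
    public string Subject { get; }
    public string Predicate { get; }
    public string Object { get; }

    public Triple(string subject, string predicate, string obj)
    {
        Subject = subject;
        Predicate = predicate;
        Object = obj;
    }
}

public static class TripleDemo
{
    // Hypothetical sample data; the identifiers are made up for illustration.
    public static readonly List<Triple> Store = new List<Triple>
    {
        new Triple("paper:1", "hasTitle", "Title1"),
        new Triple("paper:1", "cites", "paper:2"),
        new Triple("paper:2", "hasTitle", "Title2"),
        new Triple("paper:1", "authoredBy", "person:tony"),
    };

    // Return every triple matching the pattern; a null argument matches anything.
    public static IEnumerable<Triple> Match(string s, string p, string o)
    {
        return Store.Where(t => (s == null || t.Subject == s)
                             && (p == null || t.Predicate == p)
                             && (o == null || t.Object == o));
    }

    public static void Main()
    {
        // Everything known about paper:1.
        foreach (var t in Match("paper:1", null, null))
            Console.WriteLine(t.Subject + " " + t.Predicate + " " + t.Object);
    }
}
```

New kinds of statements can be added without changing any class, which is the flexibility the triple model trades against query performance.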
Today…
Computers are great tools […] in huge amounts of data
For example, Google and Microsoft both have copies of the Web for indexing purposes
Tomorrow…
Computers are great tools […] in huge amounts of data
We would like computers to also help with the automatic […] of the world’s information
Background
• Traditional research output = Journal articles
  • Pros: peer-review, indexed, archived
  • Cons: timeliness, cost, access, format limits
• Response = Digital Repositories
  • Subject Repositories
    • arXiv.org (Physics, Math, CS)
    • PubMed Central (Biomedical)
  • Institutional Repositories
  • Data Type repositories
    • Data sets, presentations, workflows, etc.
Famulus
A platform for building services and tools for research output repositories
• Papers, Videos, Presentations, Lectures, References, Data, Code, etc.
• Relationships between stored entities
Goals
• Enable a tools and services ecosystem for “research output” repositories on MS technologies
Services – Interoperability as one of the primary goals
• Modeling
  • RDFS – RDF Schema
• Syndication and Re-Use
  • RSS/Atom
  • OAI-PMH – Protocol for Metadata Harvesting
  • OAI-ORE – Object Re-Use and Exchange
• Ingest & publishing protocols
  • SWORD – Simple Web-service Offering Repository Deposit
  • AtomPub
  • BibTeX
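As an illustration of what harvesting over OAI-PMH looks like from the client side, the sketch below assembles request URLs for the protocol's `Identify` and `ListRecords` verbs. The verbs and the `metadataPrefix=oai_dc` argument come from the OAI-PMH specification. The `OaiPmh.BuildRequest` helper and the `repository.example.org` endpoint are assumptions made for this example and say nothing about the URLs Famulus itself exposes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OaiPmh
{
    // Build an OAI-PMH request URL from a base URL, a verb, and optional arguments.
    public static string BuildRequest(string baseUrl, string verb,
                                      IDictionary<string, string> args = null)
    {
        var pairs = new List<string> { "verb=" + Uri.EscapeDataString(verb) };
        if (args != null)
        {
            pairs.AddRange(args.Select(kv =>
                Uri.EscapeDataString(kv.Key) + "=" + Uri.EscapeDataString(kv.Value)));
        }
        return baseUrl + "?" + string.Join("&", pairs);
    }

    public static void Main()
    {
        // Placeholder endpoint, not a real repository address.
        const string endpoint = "https://repository.example.org/oai";

        // Ask the repository to describe itself.
        Console.WriteLine(BuildRequest(endpoint, "Identify"));

        // Harvest records as unqualified Dublin Core.
        Console.WriteLine(BuildRequest(endpoint, "ListRecords",
            new Dictionary<string, string> { { "metadataPrefix", "oai_dc" } }));
    }
}
```

A harvester would issue these as HTTP GET requests and parse the XML responses; that part is left out to keep the sketch short.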
Famulus architecture goals
Architecture diagram layers:
• 3rd-party services, tools, applications
• Famulus services, Web, interoperability
• Famulus Platform (based on the Entity Framework + data model)
• SQL Server 2008, MS data storage technologies, Entity Framework runtime, .NET 3.5, LINQ
Goals
• Create a platform for building “research output” repositories
• Engage with the digital library and scholarly communications community
• Support an ecosystem of services and tools
• Available to the community for free (we are still considering the open source route)
• Build an easy-to-install collection of basic services and tools
Non-goals
• A generic platform for asset management
• Support the lifecycle of publications
• Compete with existing repository solutions
Research Output Repository Platform
• A Semantic Computing platform
• A hybrid between a relational database and a triple store
• Triple stores
  • Evolution friendly
  • Poor performance
  • No need to model everything in advance
  • Semantic interpretation at the application level
• Relational schema
  • Evolution not so easy
  • Great opportunities for optimization
  • Model everything in advance
• Famulus Platform
  • Maintain a balance
  • Try to model the frequently used entities in our app domain
  • Try to capture the frequently used relationships
  • Allow for extensibility (Relationships, Properties)
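The balance described above can be pictured with a small in-memory sketch: a typed entity carries the properties worth modeling in advance, while a property bag and free-form relationships cover whatever was not anticipated. This is only an illustration of the idea under stated assumptions. `Resource`, `Relationship`, `HybridStore`, the `reviewStatus` property, and the `supersedes` predicate are hypothetical names, and the sketch does not reflect how the platform actually stores data in SQL Server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// "Relational" side: a frequently used entity with properties modeled in advance.
public class Resource
{
    public string Title { get; set; } = "";

    // "Triple store" side: properties nobody modeled up front.
    public Dictionary<string, string> ExtraProperties { get; } =
        new Dictionary<string, string>();
}

// A named link between two resources; the predicate is free-form text.
public class Relationship
{
    public Resource Source { get; set; }
    public string Predicate { get; set; }
    public Resource Target { get; set; }
}

public class HybridStore
{
    private readonly List<Relationship> _relationships = new List<Relationship>();

    // Record a relationship without changing any class definition.
    public void Relate(Resource source, string predicate, Resource target)
    {
        _relationships.Add(new Relationship
        {
            Source = source, Predicate = predicate, Target = target
        });
    }

    // Find every resource reachable from `source` through `predicate`.
    public IEnumerable<Resource> Related(Resource source, string predicate)
    {
        return _relationships
            .Where(r => ReferenceEquals(r.Source, source) && r.Predicate == predicate)
            .Select(r => r.Target);
    }
}

public static class HybridDemo
{
    public static void Main()
    {
        var pub1 = new Resource { Title = "Title1" };
        var pub2 = new Resource { Title = "Title2" };

        // A property and a relationship that were not modeled in advance.
        pub1.ExtraProperties["reviewStatus"] = "draft";
        var store = new HybridStore();
        store.Relate(pub2, "supersedes", pub1);

        foreach (var r in store.Related(pub2, "supersedes"))
            Console.WriteLine(r.Title); // prints Title1
    }
}
```

The typed `Title` property is the part a relational schema can index and optimize; the dictionary and the predicate strings are the part that can evolve without a schema change.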
An intuitive programming experience
// Create an author and two publications.
Person tony = new Person();
Publication pub1 = new Publication();
pub1.Title = "Title1";
Publication pub2 = new Publication();
pub2.Title = "Title2";
// Relate the entities: pub1 cites pub2 and is authored by tony.
pub1.Cites.Add(pub2);
pub1.Authors.Add(tony);
// Tag pub1 with a keyword.
Tag tag = new Tag();
tag.Name = "keyword";
pub1.Tags.Add(tag);
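Since the platform builds on .NET 3.5 and LINQ, an object graph like the one on this slide can be queried with ordinary LINQ operators. The sketch below uses stand-in classes so that it runs without the platform. The class and member names follow the slide (`Publication.Cites`, `Authors`, `Tags`, `Tag.Name`), but `Person.Name`, the `QueryDemo` helpers, and the use of plain in-memory lists are assumptions, and the real platform types may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in classes mirroring the names on the slide; not the platform's own types.
public class Person { public string Name { get; set; } = ""; }
public class Tag { public string Name { get; set; } = ""; }

public class Publication
{
    public string Title { get; set; } = "";
    public List<Publication> Cites { get; } = new List<Publication>();
    public List<Person> Authors { get; } = new List<Person>();
    public List<Tag> Tags { get; } = new List<Tag>();
}

public static class QueryDemo
{
    // Titles of all publications that cite the given one.
    public static List<string> CitedBy(IEnumerable<Publication> all, Publication target)
    {
        return all.Where(p => p.Cites.Contains(target))
                  .Select(p => p.Title)
                  .ToList();
    }

    // Titles of all publications carrying a tag with the given name.
    public static List<string> WithTag(IEnumerable<Publication> all, string tagName)
    {
        return all.Where(p => p.Tags.Any(t => t.Name == tagName))
                  .Select(p => p.Title)
                  .ToList();
    }

    public static void Main()
    {
        var tony = new Person { Name = "tony" };
        var pub1 = new Publication { Title = "Title1" };
        var pub2 = new Publication { Title = "Title2" };
        pub1.Cites.Add(pub2);
        pub1.Authors.Add(tony);
        pub1.Tags.Add(new Tag { Name = "keyword" });

        var all = new List<Publication> { pub1, pub2 };
        Console.WriteLine(string.Join(", ", CitedBy(all, pub2)));      // prints Title1
        Console.WriteLine(string.Join(", ", WithTag(all, "keyword"))); // prints Title1
    }
}
```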
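Famulus Platform
[Diagram: example graph of stored entities and the relationships between them. Entities: PDF file; Lecture on 2/19/2008; PowerPoint presentation; tony; Elizabeth, Sebastien, Matthew, Norman, Brian, Sarah, George, Roy. Relationship labels: contains; is representation of; authored by; organized by; presented by.]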
Release Roadmap
• Customer Technology Preview has been released
  • Requires SQL Server 2008 (Express)
• Public beta in the January–February 2009 timeframe
• RTM ???
© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.