Report on Scholarly Communication Initiatives @ Microsoft. Lee Dirks Director, Scholarly Communications Technical Computing MSR External Research Microsoft Corporation. Agenda. Context Our Mission & Mandate Engagement Model / Methodology Some Project Examples Future Directions
Report on Scholarly Communication Initiatives @ Microsoft Lee Dirks Director, Scholarly Communications Technical Computing MSR External Research Microsoft Corporation
Agenda • Context • Our Mission & Mandate • Engagement Model / Methodology • Some Project Examples • Future Directions • Your Questions & Feedback
Technical Computing @ Microsoft Life Sciences Social Sciences Earth Sciences Accelerating Discovery New Materials, Technologies & Processes Multidisciplinary Research Computer & Information Sciences Math and Physical Science
Our Commitment to Science • Advancement of Science • Global Collaboration • Technology Excellence • Interoperability • Putting computing into science… • Applying Microsoft products and research technologies to advance the scientific research and engineering innovation process • Putting science into computing… • Investing in potentially breakthrough computer science research to address the Multicore challenges facing the IT industry
The Scholarly Communication Lifecycle Excel 2007 Windows Compute Cluster Server “Astoria” / “Pop Fly” Collaboration SharePoint LiveMeeting • Tablet PC/UMPC • Office 2007: • Word • PowerPoint • Excel OpenXML XPS SQL Server Rights Management Data Protection Manager Discoverability Live Search Academic & Books Libra 2.0 SharePoint Word 2007 + PowerPoint 2007 SharePoint WPF & Silverlight “Sea Dragon” / “PhotoSynth”
Why Scholarly Communication? • Science + computation are not the entire equation • Authoring, Analysis, Publishing, Discoverability, and Data Storage/Preservation are key components to scientists’ everyday work…and Microsoft’s core businesses • The scholarly community has made it clear to us: • Microsoft must improve its offerings throughout the scholarly communication lifecycle • MSR/TCI is uniquely positioned to drive this initiative within Microsoft • Our approach: Conduct prototyping projects and proofs-of-concept to evolve Microsoft’s scholarly communication offerings
Audiences We Focus On • Academics / Scholars (higher education setting) • Researchers / Scientists • Libraries / Archives • Academic, Research and National institutions • Scholarly Publishers & Societies • Both Open Access and For-Profit enterprises • Governments / Related Organizations • EU, NIH/NLM, NSF, NASA, etc. • JISC (UK), OCLC, CNI, DLF, NISO, etc.
Goal: Transform Scholarly Communication • Optimize for data-driven research & science (open data/access) • To both data (scientific) and to information (scholarly publications) • Reproducible research + computational science • Properly document / annotate scholarly output • Interoperability is paramount • Actively lobby and drive for consensus around technical standards and standardized protocols proactively adopted by the community; enable broad community engagement • Customers have told Microsoft that interoperability (and intellectual property) are OUR responsibility • Data preservation (and provenance) should be baseline • Documentation of the data’s provenance • Reliable and secure long-term storage – at a massive scale • Preservation needs to be like “accessibility” features – i.e., assumed as required • Social networking & semantic knowledge discovery • Harnessing collective intelligence must be a consideration – since accessing research is a core step in the life-cycle. Enable knowledge discovery • Optimize for Web 2.0 scenarios and allow end-users/experts to find things more easily • Metadata conventions / taxonomies / ontologies • This is a crucial strength for libraries – and a critical component in enabling Web 2.0
Our Engagement Model: “Dual Benefit” • Work with researchers around the world • Facilitate/advise on the application of technology • Link MSR researchers with (non-CS) researchers • Work with product groups • Provide feedback on the use of MS technologies • Identify research-driven requirements for products • Terms & Conditions • Microsoft typically shares IP (via BSD-type license) or makes source code available on http://www.codeplex.com • Microsoft will not develop on a Linux platform • Project Execution Models • Internal Development (FTE) • External Development (Vendor) • External Development (Institutional) • Mixed Model
Scholarly Communications: Current & Upcoming Projects • Current or Completed Projects • Cornell – arXiv.org + Word 2007 (and repository interoperability) • MIT / Broad Institute – Authoring (Word 2007) + data for research reproducibility • MSR – CMT++ interoperability with data + metadata transfer/exchange (conference management tool enhancements) • UC San Diego / PLoS – Semantic mark-up of scholarly articles (+ submission) • LiveLabs – eJournal publishing online service (community publishing tool) • Johns Hopkins University – Digital Archive for Astronomy/Astrophysics data (storage, preservation and access) • Planets Project / EU (with MSR – Cambridge) around OpenXML and file format preservation and interoperability • eChemistry Project (Cornell, Penn State, Indiana, Cambridge, Southampton) – ORE exemplar: access to compound chemical info objects (cross-repository access to open chemistry data) • Indiana University – Toolbox for Social Networking (SRT) • British Library – Researcher Information Centre (RIC) online workflow tool for scientists and researchers • University of Southampton (UK) – Port ePrints Repository Software for installation on the Windows platform • University of Manchester / “MyExperiment” Project – social networking for scientists • ORE Acceleration Project (OAI – Object Reuse & Exchange) • UK National Archives – Virtual PC / Emulation of legacy systems to facilitate preservation • National Library of Medicine / NCBI – “PubMed Int’l” UK version of PubMed + NLM DTD • Creative Commons Add-in for Office 2007 – evolving the Word 2003 effort • Pipeline • Chem4Word with Office & Cambridge University – Create add-in to Word 2007 to facilitate drawing of chemical compounds and equations • DRIVER 2 (EU) – Infrastructure integration across a network of European research repositories
Project Example: GenePattern for Word 2007 • Integrate data and images from GenePattern workflows into research papers. Allow for research reproducibility by combining data with the text. • Highlights OpenXML and Office 2007 technologies as well as breaking new research ground with the integration of data & workflows with research papers. • MIT Broad Institute • (http://www.broad.mit.edu/) • Contracted Work • Infusion for development work via SOW • Broad for GenePattern Development for integration
NIH National Library of Medicine • NLM’s PubMed Central repository contains the full text of research papers resulting from work funded by NIH • Working with NCBI using Word 2007 to author the NLM-DTD tag set • TCI assisted in deployment of PMC International in the UK, Japan, Italy, China and South Africa
Research Community Publishing • eJournal Project • Extending existing MSR ‘CMT’ Conference Management Tool to offer eJournal service • Developing a toolset for ‘self-publishing’ of workshop and conference proceedings and small journals • Research Repositories • Adapting ‘arXiv’ repository to accommodate Word 2007 and interoperable web services interfaces • Developed an open source (BSD) Windows version of ‘EPrints’ software at Southampton
The British Library’s “Researcher Information Center” Virtual Research Environment • Identify information sources, tools and services to support research in STM • Explore the application of new services • Collaborative filtering of literature, continual queries and more… • Intuitive to use and navigate, user configurable
International Virtual Observatory • Working with the Astronomy community to build the IVO • Goal is for all astronomy data and literature online and cross indexed • Tools to analyze it • OpenSkyQuery Federation of ~20 observatories • Works and is used every day • Spatial extensions in SQL 2005 • Good example of Data Grid • Good example of Web Services • TCI is facilitating a library project to link astronomy publications to the data
Other Efforts & Initiatives • “Global Research Library 2020” with University of Washington (Oct07) • Planning to participate in application(s) to the NSF “DataNet” solicitation (as an unfunded partner) • Sponsoring BioMed Central’s 2008 Research Awards (Mar08) • Aug07 Issue of CT Watch Quarterly (v. 3, no. 3) • “The Coming Revolution in Scholarly Communications & Cyberinfrastructure” • http://www.ctwatch.org/quarterly/articles/2007/08/ • New Scholarly Publishing website at: • http://www.microsoft.com/mscorp/tc/scholarly-publishing.mspx
Questions / Feedback? Lee Dirks ldirks@microsoft.com http://www.microsoft.com/science