oreChem: Linking Chemistry Scholarship into the Semantic Web and Web 2.0
Carl Lagoze (Cornell University), Prasenjit Mitra and William Brouwer (Penn State University), Mark Borkum (University of Southampton)
The Fourth Paradigm
• Machine-actionable Substrate
• Integration of Datasets
• Exposure of Process
A “data-aware document”: 2006 astrophysics paper
• XMM-Newton X-ray observation (Vilspa, Spain)
• Chandra X-ray observation (Cambridge, MA)
• Basic object information (Strasbourg, France)
• Hubble optical observation (Baltimore, MD)
• Text
Reuse, Aggregation, Reuse ... Identity? Description?
Object-Centered Sociality
Conversation, Collaboration, Relationships, Identity, Mashup, Actions, Reputation, Sharing, Description, Groups, Presence
Open Archives Initiative – Object Reuse and Exchange (OAI-ORE)
• Resource Map: a set of triples that describes an aggregation
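The Resource Map idea can be sketched in a few lines of Python. This is only an illustration, not the oreChem implementation: the `example.org` URIs and the helper names `build_resource_map` and `to_ntriples` are hypothetical, while the `ore:ResourceMap`, `ore:Aggregation`, `ore:describes`, and `ore:aggregates` terms come from the ORE vocabulary.

```python
# Minimal sketch: a Resource Map as a list of (subject, predicate, object) triples.
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_resource_map(rem_uri, aggregation_uri, resource_uris):
    """Return the triples of a Resource Map that describes one Aggregation."""
    triples = [
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
        (rem_uri, ORE + "describes", aggregation_uri),
    ]
    # Each aggregated resource is linked from the Aggregation, not from the map.
    for uri in resource_uris:
        triples.append((aggregation_uri, ORE + "aggregates", uri))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical URIs for a paper that bundles text, a molecule, and a spectrum.
resources = [
    "http://example.org/paper-1/text.pdf",
    "http://example.org/paper-1/molecule.cml",
    "http://example.org/paper-1/nmr-spectrum.xml",
]
resource_map = build_resource_map(
    "http://example.org/rem/paper-1",
    "http://example.org/aggregation/paper-1",
    resources,
)
print(to_ntriples(resource_map))
```

Keeping the Resource Map and the Aggregation as separate URIs mirrors the ORE distinction between the document that describes a compound object and the compound object itself.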
oreChem – The Chemical Semantic Web
• At-source capture of experiment data and research process (Electronic Lab Notebook)
• Compound object authoring
• Retrospective harvesting of chemistry data
• Representation/Reuse through common ORE data model and ontology
• Cloud-based triple store
• Chemical structure search
Chem4Word - Chemistry Drawing in Word
• Author/edit 1D and 2D chemistry. Change chemical layout styles.
• Intent: Recognizes chemical dictionary and ontology terms
• Relationships: Navigate and link referenced chemistry
• Data: Semantics stored in Chemistry Markup Language

<?xml version="1.0" ?>
<cml version="3" convention="org-synth-report" xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1">
    <atomArray>
      <atom id="a1" elementType="C" x2="-2.9149999618530273" y2="0.7699999809265137" />
      <atom id="a2" elementType="C" x2="-1.5813208400249916" y2="1.5399999809265137" />
      <atom id="a3" elementType="O" x2="-0.24764171819695613" y2="0.7699999809265134" />
      <atom id="a4" elementType="O" x2="-1.5813208400249912" y2="3.0799999809265137" />
      <atom id="a5" elementType="H" x2="-4.248679083681063" y2="1.5399999809265137" />
      <atom id="a6" elementType="H" x2="-2.914999961853028" y2="-0.7700000190734864" />
      <atom id="a7" elementType="H" x2="-4.248679083681063" y2="-1.907348645691087E-8" />
      <atom id="a8" elementType="H" x2="1.0860374036310796" y2="1.5399999809265132" />
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2" order="1" />
      <bond atomRefs2="a2 a3" order="1" />
      <bond atomRefs2="a2 a4" order="2" />
      <bond atomRefs2="a1 a5" order="1" />
      <bond atomRefs2="a1 a6" order="1" />
      <bond atomRefs2="a1 a7" order="1" />
      <bond atomRefs2="a3 a8" order="1" />
    </bondArray>
  </molecule>
</cml>

• Intelligence: Verifies validity of authored chemistry
Available soon: http://research.microsoft.com/chem4word/
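To illustrate why storing the semantics as CML makes the chemistry machine-actionable, the sketch below reads the same molecule with Python's standard library and derives its molecular formula and double bonds. It is a minimal example, not part of Chem4Word: the 2D coordinates are omitted for brevity, and the helper name `hill_formula` is an assumption introduced here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# CML elements live in this namespace; attributes are un-namespaced.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Same atoms and bonds as the slide's molecule, without x2/y2 coordinates.
CML_DOC = """<?xml version="1.0" ?>
<cml version="3" convention="org-synth-report" xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1">
    <atomArray>
      <atom id="a1" elementType="C" />
      <atom id="a2" elementType="C" />
      <atom id="a3" elementType="O" />
      <atom id="a4" elementType="O" />
      <atom id="a5" elementType="H" />
      <atom id="a6" elementType="H" />
      <atom id="a7" elementType="H" />
      <atom id="a8" elementType="H" />
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2" order="1" />
      <bond atomRefs2="a2 a3" order="1" />
      <bond atomRefs2="a2 a4" order="2" />
      <bond atomRefs2="a1 a5" order="1" />
      <bond atomRefs2="a1 a6" order="1" />
      <bond atomRefs2="a1 a7" order="1" />
      <bond atomRefs2="a3 a8" order="1" />
    </bondArray>
  </molecule>
</cml>"""


def hill_formula(counts):
    """Format element counts in Hill order (C, then H, then alphabetical)."""
    if "C" in counts:
        order = ["C"] + (["H"] if "H" in counts else [])
        order += sorted(e for e in counts if e not in ("C", "H"))
    else:
        order = sorted(counts)
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in order)


root = ET.fromstring(CML_DOC)

# Count atoms per element type.
counts = Counter(a.get("elementType") for a in root.findall(".//cml:atom", CML_NS))
formula = hill_formula(counts)

# Collect the atom-id pairs joined by a double bond.
double_bonds = [
    tuple(b.get("atomRefs2").split())
    for b in root.findall(".//cml:bond", CML_NS)
    if b.get("order") == "2"
]

print(formula)       # C2H4O2
print(double_bonds)  # [('a2', 'a4')]
```

The formula C2H4O2 with a single C=O double bond and one O-H bond corresponds to acetic acid, which is what the atom and bond arrays on the slide encode.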
data Triple store
Data contributed to the triple store (Southampton, PSU, Cambridge, Indiana)
• Bibliographic metadata
• Citations
• Figures
• Tables
• Chunks
• Reactions
• Molecular Compounds
• NMR Spectra and Structural Data
• Experiment data
• Computational Chemistry (Gaussian)
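How heterogeneous records from several sites can meet in one triple store is easiest to see with a toy example. The sketch below is a hypothetical in-memory store with wildcard matching; the identifiers (`paper:1`, `mol:acetic-acid`) and predicate names (`mentions`, `measures`, `computes`, `cites`) are invented for illustration and are not taken from the oreChem ontology. A real deployment would more likely use an RDF store queried with SPARQL.

```python
class TripleStore:
    """Toy in-memory triple store with wildcard pattern matching."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
# Hypothetical records of different kinds that refer to the same molecule.
store.add("paper:1", "mentions", "mol:acetic-acid")
store.add("paper:1", "cites", "paper:2")
store.add("spectrum:7", "measures", "mol:acetic-acid")
store.add("calc:3", "computes", "mol:acetic-acid")

# Everything linked to the molecule, regardless of where the record came from.
linked = [s for s, _, _ in store.match(obj="mol:acetic-acid")]
print(linked)  # ['calc:3', 'paper:1', 'spectrum:7']
```

Because every record is reduced to the same subject-predicate-object shape, a single pattern query can cross the boundaries between papers, spectra, and computations.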
Chemistry Ontology (Nico Adams – Cambridge)
Figure: three layers: Mash-up (reuse), Semantic Graph (storage), Data (capture)
Figure labels: lab notebook, experiment, datument; observations, measurements, molecules, scientists, documents; molecules, text, data
Scholarly communities behave very differently (example: preprint server)
• Physics: arXiv; success; 1991; Ginsparg @ LANL; high-energy physics, step-wise expansion; societies hands-off, cooperative
• Biomedicine: eBiomed / PubMed Central; modified; 1999; Director of NIH; all of biomedicine; societies take control
• Chemistry: CPS; failure; 2000; commercial publisher; all of chemistry; societies adverse
Chemistry is particularly challenging
• Commercial value of chemical information (pharmaceuticals)
• Nature of chemistry research culture
  • Predominance of synthesis (creation) overshadows the discovery mode typical of physics or biology
  • Autonomy: successful research with limited reliance on others
• Monopoly of scholarly societies qua publishers
  • ACS (CAS)
  • RSC
The Future
• Continue work on technical innovations and infrastructure
• Demonstrate through value-add applications
• Understand socio-technical barriers
  • International workshop/study
  • Chemistry as “canary in coal mine”
• Integrate with larger infrastructure effort
  • Data Conservancy