Semantic Web in .
Charter outline • One page graphic • Purpose: Mission & Vision • Perimeter: Core competencies & activities • Impact • Engagement priority principles • Operating principles with customer • Appendix • KM Definition • (SWOT) • TBD • Organizational Design
$897 million, including post-approval R&D costs, to develop a new prescription drug = 250% increase in a decade • Inflation-adjusted • Including failures Industry productivity vs. investment • NMEs • Rising clinical trial costs; difficulty in recruiting patients • Expanding development programs • More chronic & degenerative diseases • Longer development times [Chart: Total R&D investment ($ billions) vs. NMEs] Tufts Center, May 2003: $802 million excluding post-approval R&D costs Note: NMEs by year: '00: 27, '01: 24, '02: 17; 23% of NMEs obtain first approval (D. Kessler, H&Q) Source: PhRMA & FDA 2003
R&D Challenges in Drug Discovery • Increase productivity • Improve submissions and approvals • Reduce costs: Clinical and preclinical studies ~80% of total • Segmented patient populations • Complexity of the science and technologies • Capturing the innovation and value • Drug-hunting ability • Knowledge creation and transfer • Consortia & Alliances
Facing a Technology Gap in Drug Innovation • Need to utilize knowledge more effectively
Knowledge Networks within Pharma that need to be supported • Scientists and Researchers • Regulatory (FDA) • Industrial Operations • Research Alliances • Business Process and Management • Competitive and Market Information • Financial
Data Integration ? How Can Scientists Work Together Better? • Chemistry • Compound Library Chemists • HT Screening • Medicinal Chemists • Synthetic Chemists • Molecular Modelers • Rational Designers • Biology • Geneticists • Pathologists • Molecular Biologists • Cytologists • ADME • Toxicologists • Clinicians • Informatics • Genomicists • Functional Genomicists • BioStatisticians • Bioinformaticists • Cheminformaticists • Dynamics Modelers • DB admins
Information Interpretation “The data clearly shows that the compound series has hERG issues that are exacerbated by its side groups” / “Which side groups?” • Sharing data is not sufficient for sharing insights • Simply annotating findings with text does not solve the problem of locating such insights • Can researchers find different meanings in the same data? • Merge legacy data with newly generated data • Capture context! • It is therefore necessary to be able to describe and capture such value-added elements in a formal, searchable way
A Major Unmet Challenge: Recognizing Information Interpretation Seeing the data the same way… $\mathcal{I}_i\{I(x)\} \sim \mathcal{I}_j\{I(x)\}$ How can one guarantee that scientist $i$ interprets data $I(x)$ the same way as scientist $j$ does?
Social Participation • “[It] refers not just to local events of engagement in certain activities with certain people, but to a more encompassing process of being active participants in the practices of social communities and constructing identities in relation to these communities…. Such participation shapes not only what we do, but also who we are and how we interpret what we do.” • - Etienne Wenger, 1999
The Negotiation of Meaning • As described by Friesen: • The meaning of any set of terms, and the significance and utility of any taxonomy, according to Wenger, can be evaluated only in the context of a community whose members are involved in similar activities and share similar values. Wenger calls this process the "negotiation of meaning:" The production of meanings "that extend, redirect, dismiss, reinterpret, modify or confirm… the histories of meanings of which they are a part." (Wenger, 1999; p. 53) Example: Functional Genomics and Pathologists
Research Process and Knowledge Flow [Diagram labels: Knowledge Networks; Expt Design; Action; Data Analysis; Decision; Interpretation]
Communities and Interoperability • Semantic interoperability is tied directly to communities of practice: • “Within a community or domain, relative homogeneity reduces interoperability challenges. Heterogeneity increases as one moves outside of a focal community/domain, and interoperability is likely [to be] more costly and difficult to achieve” (Moen, 2001) • Meanings encoded with an (XML) schema, for use within one community, are defined only implicitly. • Databases can only be used by those who define them; group heterogeneity impedes practical schema definitions
Why a Semantic Web for Life Science Applications? • Improve Scientific Interactions and Exchanges • Data Integration AND Interpretation • Web-compatible strategies for information encoding and sharing • Sharing Best Practices – Knowledge discovery rules • Knowledge Agents –How can they accelerate science?
Knowledge Exchange within a Semantic Web: Structured Framework for the Next Generation of the Web [Diagram labels: Defines; OWL; RDF] • OWL (Web Ontology Language) • W3C Ontology Specification • Grounded in Frames & Description Logic (OWL DL is a decidable fragment of 1st-order logic) • Extensible by members of any community • Structurally based on RDF • RDF (Resource Description Framework) • Basic semantic data format, commonly serialized as XML, that OWL is based upon • Allows users to merge and aggregate any set of related data and relational components • Refers to Ontologies specified in OWL
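The merge-and-aggregate property of RDF can be illustrated with a minimal sketch. It assumes a graph is simply a set of (subject, predicate, object) triples, so that merging two graphs is a set union. All identifiers (`cmpd:XV-12`, `ex:hasSideChain`, `assay:hERG-001`, and so on) are hypothetical and are not taken from any real vocabulary; blank nodes and literals are ignored for simplicity.

```python
from typing import Iterable, List, Set, Tuple

# A triple is (subject, predicate, object); all names below are hypothetical.
Triple = Tuple[str, str, str]

# Statements recorded by a chemistry group.
chemistry_graph: Set[Triple] = {
    ("cmpd:XV-12", "rdf:type", "ex:Compound"),
    ("cmpd:XV-12", "ex:hasSideChain", "ex:sideChain2"),
}

# Statements recorded independently by a screening group.
assay_graph: Set[Triple] = {
    ("cmpd:XV-12", "ex:testedIn", "assay:hERG-001"),
    ("cmpd:XV-12", "rdf:type", "ex:Compound"),  # same statement as above
}


def merge_graphs(*graphs: Iterable[Triple]) -> Set[Triple]:
    """Merge graphs by set union; identical statements collapse into one."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def describe(graph: Iterable[Triple], subject: str) -> List[Tuple[str, str]]:
    """Return every (predicate, object) pair known about a subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge_graphs(chemistry_graph, assay_graph)
for predicate, obj in describe(merged, "cmpd:XV-12"):
    print(predicate, obj)
```

Because both groups use the same identifier for the compound, the merged graph answers questions that neither source could answer alone, without any schema negotiation beyond the shared names.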
Smarter, Searchable Annotations (Chemistry) • Free-Text: “The side chain on this compound improves GI transport significantly” Text found only if compound already selected • Free-Text with Link: “As evidenced (PKID:392384), the side chain on this compound improves GI transport significantly” Link can be used to find all compounds referencing it, but reason for link is unclear • RDF Statement: <side chain “#element=2”> <improves> <GI transport> Search feasible for any side chain improving “GI transport”, or semantically related impact
Smarter, Searchable Annotations (Proteins) • Free-Text: “The domain on this protein regulates catalytic activity significantly” Text found only if compound already selected • Free-Text with Link: “As evidenced (PKID:8832), our compound series interact with the catalytic site” Link can be used to find all proteins referencing this link, but reason for link is unclear • RDF Statement: <domain “#element=2”> <interacts> <Cmp Series XV> Search feasible for any protein domain interacting with “Compound Series XV”, or semantically related binding
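The difference between free-text and statement-based annotations can be sketched as a triple-pattern search. The example below is an illustration only: it assumes annotations are stored as (subject, predicate, object) triples and that `None` acts as a wildcard. The `match` helper and every identifier (`ex:sideChain2`, `ex:improves`, `ex:GITransport`, `ex:CompoundSeriesXV`) are hypothetical stand-ins for the statements on the two slides above.

```python
from typing import Iterable, List, Optional, Tuple

# A triple is (subject, predicate, object); all names are hypothetical.
Triple = Tuple[str, str, str]

annotations: List[Triple] = [
    ("ex:sideChain2", "ex:improves", "ex:GITransport"),
    ("ex:sideChain7", "ex:improves", "ex:GITransport"),
    ("ex:sideChain7", "ex:exacerbates", "ex:hERGBinding"),
    ("ex:domain2", "ex:interactsWith", "ex:CompoundSeriesXV"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None matches anything."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Any side chain improving GI transport": no compound needs to be
# selected first, unlike the free-text case.
improvers = [s for s, _, _ in match(annotations, p="ex:improves", o="ex:GITransport")]
print(improvers)

# "Anything interacting with Compound Series XV."
binders = match(annotations, p="ex:interactsWith", o="ex:CompoundSeriesXV")
print(binders)
```

The predicate carries the reason for the link, which is exactly what the free-text-with-link form leaves unclear.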
Aggregation through Semantics (OWL) [Diagram labels: PROTEIN GENE mRNA CASCADE PATHWAY LOCALIZATION MICROARRAY EXPERIMENT BIO-PROCESS Data Sources INTERVENTION POINT DISEASE TARGET MODEL DRUG TREATMENT]
New Data Paradigm for Research [Diagram labels: Query, Upload; Results; Search; Aggregate] • More than a collection of tables for set-selection • Data can evolve with additions of attributes and properties as well as through new inferences
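How data might "evolve through new inferences" can be sketched with a single entailment rule. The example assumes the RDFS sub-property rule (if `p rdfs:subPropertyOf q` and `s p o`, then `s q o`) applied repeatedly until nothing new is derived. Only `rdfs:subPropertyOf` comes from a real vocabulary; the properties `ex:improves`, `ex:affects`, and `ex:relatedTo` are hypothetical, and a real reasoner would apply many more rules than this one.

```python
from typing import Iterable, Set, Tuple

# A triple is (subject, predicate, object).
Triple = Tuple[str, str, str]

SUB_PROPERTY = "rdfs:subPropertyOf"

facts: Set[Triple] = {
    # Hypothetical mini-ontology: improves is a kind of affects,
    # which in turn is a kind of relatedTo.
    ("ex:improves", SUB_PROPERTY, "ex:affects"),
    ("ex:affects", SUB_PROPERTY, "ex:relatedTo"),
    # One annotation asserted by a scientist.
    ("ex:sideChain2", "ex:improves", "ex:GITransport"),
}


def infer_subproperties(triples: Iterable[Triple]) -> Set[Triple]:
    """Forward-chain the sub-property rule until a fixed point is reached."""
    inferred: Set[Triple] = set(triples)
    changed = True
    while changed:
        changed = False
        # (child, parent) pairs currently known.
        hierarchy = [(s, o) for s, p, o in inferred if p == SUB_PROPERTY]
        for s, p, o in list(inferred):
            for child, parent in hierarchy:
                if p == child and (s, parent, o) not in inferred:
                    inferred.add((s, parent, o))
                    changed = True
    return inferred


closure = infer_subproperties(facts)
new_statements = sorted(closure - facts)
for statement in new_statements:
    print(statement)
```

A search for anything that "affects" GI transport now finds the side chain even though no one asserted that statement directly, which is one way to read "semantically related impact".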
New Sharing Paradigm for Research • Sharing discoveries in a Context
Semantic Communities Vision [Diagram labels: Local Ontology; Ontology; Projects; FGx; rDB; Extended Ontology; Central Referenced DB; Annotated Literature; Chem; Local DBs; Disease Area; Platform Space]
Science January 24, 2003
Information vs. Knowledge • “Information is data that is endowed with relevance or purpose. • Converting data into information thus requires knowledge. • And knowledge, by definition, is specialized. (In fact, truly knowledgeable people tend toward overspecialization, whatever their field, precisely because there is always so much to know.)” - Peter Drucker, 1988 • The conversion of data to information or knowledge is an interpretive process that implies a sociological context: • “It entails personal involvement in and commitment to specific practices, and participation in a community of those with similar or complementary understandings.” - Norm Friesen, 2002
Communities… • Encoding is not an isolated activity, defined by mechanical conciseness: • “All of this seems to suggest that the significance of words and descriptions in metadata may not be so much a matter of clear and unambiguous definition ... Instead, it is more a matter of doing, acting, and belonging.” • - Norm Friesen, 2002
…and Meaning • Practice both defines and requires meaning: • “This focus on meaningfulness is… not primarily on the technicalities of ‘meaning.’ It is not on meaning as it sits locked up in dictionaries. It is not just on meaning as a relation between a sign and a reference…. Practice is about meaning as an experience of everyday life.” • - Etienne Wenger, 1999
Explicit vs. Tacit Knowledge • Negotiating Meaning helps take what is implicit and make it explicit, tangible, and codifiable. • Context is essential for framing implicit knowledge • No Knowledge is formally either tacit or explicit – when meaning is negotiated so that interpretations and insights can be effectively shared within a context, then what was tacit is now reified. • Communities, if defined appropriately through common semantics, can capture any knowledge that is viewed as relevant and timely, thereby making it functional
Semantic Web for Life Sciences • What SWLS is: • W3C Discussion Forum for Scientists and Informaticists • Identifying critical needs and defining them as use cases • Helping define the relation between information and (codified) knowledge • Effective formation and interaction of research communities • What SWLS isn’t: • Standards Group • SIG for Vendors • Closed Consortium for Industry
Semantic Web Life Science Activities • W3C Workshop, Oct 27-28: Formation of Work Groups • Mailing List: public-semweb-lifesci@w3.org • ISMB 2005 (Detroit): Semantic Web Track • Coordination with BioPAX, GeneOntology, UniProt, NCI, etc.
SWLS Resources • Semantic Web: http://www.w3.org/2001/sw/ • RDF: http://www.w3.org/rdf • SWLS: http://esw.w3.org/topic/SemanticWebForLifeSciences • DB wrapper: http://www.w3.org/2004/04/30-RDF-RDB-access/