Knowledge Management Issues in a Global Pharmaceutical R&D Environment. Ted Slater, Proteomics Center of Emphasis, Pfizer Global R&D Michigan. W3C Workshop on Semantic Web for Life Sciences, 27-28 October 2004, Cambridge, Massachusetts, USA.
About Pfizer Global R&D • The industry’s largest R&D organization • >12,500 employees worldwide • Estimated R&D budget in 2004: $7.9 billion • Hundreds of research projects across 18 therapeutic areas • (Not really using Semantic Web technologies just now)
Issues with Global R&D • Geographical (time & distance) • Language (even if the language is the same!) • Cultural • Increased reliance on electronic communications
(Figure: clocks showing different local times: 5:00, 10:00, 2:00, 18:00, 5:00, 4:00)
What’s in a Name? • “Releasing TaqMan® Data” use case from John Wilbanks (17 Aug 2004) • GO annotation from a particular gene • TaqMan® data from an exon proximal to that gene • Annotating the TaqMan® data with GO annotation is not quite right • Different perceptions of concept “gene”
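One way to picture the mismatch is to keep the GO annotation attached to the gene and record only a proximity link from the assay to that gene. The sketch below is illustrative, not taken from the use case itself: the class names, the `inferred_via_proximity` label, and the placeholder identifiers are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    symbol: str
    go_terms: frozenset  # GO annotation asserted for the gene itself


@dataclass(frozen=True)
class TaqManAssay:
    assay_id: str
    target_exon: str
    proximal_gene: Gene  # "near this gene", not "is this gene"


def go_terms_for(assay):
    """Return the gene's GO terms with explicit provenance.

    The terms are not copied onto the assay; each record says which
    entity the term was asserted for and how the assay relates to it.
    """
    return [
        {
            "go_term": term,
            "asserted_for": assay.proximal_gene.symbol,
            "relation_to_assay": "inferred_via_proximity",
        }
        for term in sorted(assay.proximal_gene.go_terms)
    ]


# Placeholder identifiers only; none of these refer to real records.
gene = Gene(symbol="GENE_X", go_terms=frozenset({"GO:placeholder-1", "GO:placeholder-2"}))
assay = TaqManAssay(assay_id="ASSAY_1", target_exon="EXON_NEAR_GENE_X", proximal_gene=gene)

for record in go_terms_for(assay):
    print(record)
```

Keeping the relation explicit lets two groups with different perceptions of "gene" see that the annotation was borrowed rather than measured.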
(Figure labels: Proteomics, Metabonomics, RNA Profiling)
Current Tools Fall Short • 100+ highly-specialized software tools in place for ’omics technologies • All query-centric • Single user • Low bandwidth • Ask a question, get a list
How to Drive a Biologist Crazy • gi|84939483 gi|39893845 gi|27394934 gi|18890092 gi|10192893 gi|11243007 gi|20119252 gi|19748300 gi|44308356 gi|50021874 gi|10003001 gi|27762947 gi|24537303 gi|27284958 gi|37373499 …
Metadata? • Experimental protocols • Model system descriptions • Statistical criteria for data analysis and acceptability • Others
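As a rough illustration of carrying this metadata with the data, the sketch below attaches protocol, model system, and a statistical acceptance criterion to each measurement. The field names and example values are hypothetical and do not describe any actual Pfizer schema.

```python
from dataclasses import dataclass


@dataclass
class ExperimentMetadata:
    protocol: str                  # experimental protocol identifier
    model_system: str              # description of the model system
    significance_threshold: float  # statistical criterion for acceptability


@dataclass
class Measurement:
    analyte: str
    p_value: float
    metadata: ExperimentMetadata


def is_acceptable(measurement):
    """Apply the acceptance criterion that travels with the measurement."""
    return measurement.p_value <= measurement.metadata.significance_threshold


# Hypothetical values for illustration only.
meta = ExperimentMetadata(
    protocol="PROTOCOL_A",
    model_system="MODEL_SYSTEM_A",
    significance_threshold=0.05,
)
accepted = Measurement(analyte="ANALYTE_1", p_value=0.01, metadata=meta)
rejected = Measurement(analyte="ANALYTE_2", p_value=0.20, metadata=meta)

print(is_acceptable(accepted), is_acceptable(rejected))
```

Because the criterion is stored with the result, a downstream tool does not have to guess which threshold the original analyst used.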
(Figure: the blind men and the elephant; the parts are labeled fan, wall, spear, tree, rope, snake, and the whole is labeled Physiology)
Hypothesis Generation • Our domain is too big and complex to fit in our heads • Browsing and correlation can’t get us there • We need our machines to generate testable hypotheses for us based on our experimental results • We need knowledge about causation
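A toy sketch of what machine-generated hypotheses from causal knowledge might look like: given signed cause-and-effect edges and an observed change, propose unobserved upstream changes that would explain it. The entities A to D, the edge list, and the function name are invented for illustration; a real system would need far richer knowledge than this.

```python
# Each edge is (cause, effect, sign): +1 means the cause increases the
# effect, -1 means the cause decreases it. Entities are abstract placeholders.
CAUSAL_EDGES = [
    ("A", "B", +1),
    ("B", "C", -1),
    ("D", "C", +1),
]


def generate_hypotheses(observed, edges):
    """Propose upstream changes that would explain each observed change.

    `observed` maps an entity to +1 (increased) or -1 (decreased).
    Only causes that were not themselves observed are proposed, since
    those are the ones a follow-up experiment could test.
    """
    hypotheses = []
    for cause, effect, sign in edges:
        if effect in observed and cause not in observed:
            predicted = observed[effect] * sign
            direction = "increased" if predicted > 0 else "decreased"
            hypotheses.append(f"{cause} {direction} (would explain change in {effect})")
    return hypotheses


# An experiment shows that C decreased.
hypotheses = generate_hypotheses({"C": -1}, CAUSAL_EDGES)
for h in hypotheses:
    print(h)
# prints:
# B increased (would explain change in C)
# D decreased (would explain change in C)
```

Each line of output is a testable statement, which is the difference from a browsing tool that only returns the list of things correlated with C.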
Clinical KM Needs • Aggregate and analyze: • Safety data • Efficacy data • Genomic data • Healthcare data • Performance data • Study metadata • Staff and vendor performance • Resource utilization
The Shape of Clinical Data • >2 GB per Phase 2, 3, or 4 protocol, split over >100 different datasets, each with 20-300 columns • Metadata is complex and hard to combine across studies • Sensitive data • Project teams can be reluctant to discuss with other groups (e.g. in discovery)
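To make the "hard to combine across studies" point concrete, here is a minimal sketch of harmonizing column names from two studies onto a shared vocabulary before pooling rows. The column names, the synonym table, and the sample rows are hypothetical, and the sketch assumes both dose columns are already in the same unit.

```python
# Hypothetical mapping from study-specific column names to shared names.
COLUMN_SYNONYMS = {
    "subj_id": "subject_id",
    "SUBJID": "subject_id",
    "dose_mg": "dose_mg",
    "DOSE": "dose_mg",  # assumes this study also records dose in mg
}


def harmonize(rows, synonyms):
    """Rename each row's columns to the shared vocabulary.

    Unmapped columns raise an error instead of being silently dropped,
    so gaps in the mapping are visible.
    """
    harmonized = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            if column not in synonyms:
                raise KeyError(f"unmapped column: {column}")
            new_row[synonyms[column]] = value
        harmonized.append(new_row)
    return harmonized


# Made-up rows standing in for two studies' datasets.
study_a = [{"subj_id": "A-001", "dose_mg": 10}]
study_b = [{"SUBJID": "B-001", "DOSE": 20}]

combined = harmonize(study_a, COLUMN_SYNONYMS) + harmonize(study_b, COLUMN_SYNONYMS)
print(combined)
```

With more than 100 datasets per protocol and up to 300 columns each, maintaining such a mapping by hand is exactly the kind of work a shared ontology is meant to reduce.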
Clinical Columns • Dosage and dose response data • Product differentiation • Patient demographics • Concurrent medications • Lab data • Subject experience & adverse events • How fast does it work? How long does it last?
Other Areas • Legal • “Patent searching is an art, not a science” • New cases, statutes, policies • HR • Finance • Strategic Alliances • PGRD has links with >250 partners in academia and industry • More
Summary • KM needs in discovery and clinical are complex, diverse, and sizeable • We need a knowledge architecture that can be used effectively by machines: • Ontologies • Software • Hardware
Acknowledgements • John Wilbanks (W3C) • Enoch Huang (Pfizer) • Eric Neumann (Aventis) • Stephen Dobson (Pfizer) • Mitch Brigell (Pfizer) • Dave Lowenschuss (Pfizer) • Ruth VanBogelen (Pfizer)