How Philosophy of Science Can Help Biomedical Research. Barry Smith http://ontology.buffalo.edu/smith. How to Do Biology across the Genome?
[Slide: wall of raw single-letter sequence text] sequence of X chromosome in baker’s yeast
network of gene interactions in E. coli http://moebio.com/santiago/gnom/ingles.html
what cellular component? what molecular function? what biological process?
The Idea of Common Controlled Vocabularies
Resources: GlyProt, MouseEcotope, DiabetInGene, GluChem
Shared term: sphingolipid transporter activity
The Idea of Common Controlled Vocabularies
Resources: GlyProt, MouseEcotope, DiabetInGene, GluChem
Shared term: Holliday junction helicase complex
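The idea on these two slides, that separate resources become linkable once they annotate their records with the same controlled term, can be sketched in a few lines. This is only an illustration: the resource names and the two terms come from the slides, while the record ids and the assignment of terms to records are invented for the example.

```python
from collections import defaultdict

# Hypothetical annotation records: (resource, record id, controlled term).
# Resource names and terms are from the slides; the record ids and the
# pairing of records with terms are made up for illustration.
annotations = [
    ("GlyProt", "gp-001", "sphingolipid transporter activity"),
    ("MouseEcotope", "me-042", "sphingolipid transporter activity"),
    ("DiabetInGene", "dg-007", "Holliday junction helicase complex"),
    ("GluChem", "gc-113", "sphingolipid transporter activity"),
    ("GlyProt", "gp-002", "Holliday junction helicase complex"),
]


def index_by_term(records):
    """Group (resource, record id) pairs under the term that annotates them."""
    index = defaultdict(list)
    for resource, record_id, term in records:
        index[term].append((resource, record_id))
    return dict(index)


term_index = index_by_term(annotations)

# Which resources hold data about the same molecular function?
linked = sorted({resource for resource, _ in
                 term_index["sphingolipid transporter activity"]})
print(linked)
```

Because every resource uses the identical term string, a plain dictionary lookup is enough to join them; with free-text labels the same join would need fuzzy matching.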
Gene Ontology: male courtship behavior, orientation prior to leg tapping and wing vibration
Benefits of GO
• based in biological science
• links data to biological reality
• links people to software
• links data together
• across species (human, mouse, yeast, fly ...)
• across granularities (molecule, cell, organ, organism, population)
The goal: all biological (biomedical) research data should cumulate to form a single, algorithmically processable whole. http://obofoundry.org
Ontologies already being applied to achieve this goal: Sjöblom T, et al. analyzed 13,023 genes in 11 breast and 11 colorectal cancers. GO tells you what is standard functional information for these genes. By tracking deviations from this standard, 189 genes could be identified as being mutated at significant frequency and thus as providing targets for diagnostic and therapeutic intervention. Science. 2006 Oct 13;314(5797):268-74.
Towards Empirical Philosophy
• processualist vs. 3-dimensionalist
• reductionist vs. non-reductionist
• realist vs. nominalist
If ontologies based on different philosophical principles are tested for their utility in support of scientific research, which types of ontologies will prove most useful?
Some sample ontologies
• Cell Ontology (CL)
• Foundational Model of Anatomy (FMA)
• Environment Ontology (EnvO)
• Gene Ontology (GO)
• Infectious Disease Ontology
• Phenotypic Quality Ontology (PATO)
• Protein Ontology (PRO)
• RNA Ontology (RNAO)
• Sequence Ontology (SO)
The problem: High-throughput experimentation data is meaningless unless the researcher is provided with detailed information concerning how it was obtained.
To make experimental data computationally accessible we need ontologies to describe the data (1) from the point of view of their relation to reality (2) from the point of view of their relation to experiments
Three solutions
• The MGED Ontology
• OBI: The Ontology for Biomedical Investigations
• EXPO: The Experiment Ontology
MGED Ontology
Individual =def. name of the individual organism from which the biomaterial was derived
Experiment =def. The complete set of bioassays and their descriptions performed as an experiment for a common purpose. ... An experiment will be often equivalent to a publication.
MGED Ontology
Chromosome =def. An abstraction used for annotation
Chromosome =def. A biological sequence that can be placed on an array
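The two kinds of problem these definitions illustrate, one term carrying two different definitions and a definition that describes a name rather than the thing named, lend themselves to simple mechanical checks. The sketch below is a rough illustration only: the definitions are quoted from the slides, while the "starts with 'name of'" test is a crude heuristic assumed for the example, not a rule taken from MGED or OBI.

```python
from collections import defaultdict

# (term, definition) pairs quoted from the MGED Ontology slides.
definitions = [
    ("Chromosome", "An abstraction used for annotation"),
    ("Chromosome", "A biological sequence that can be placed on an array"),
    ("Individual", "name of the individual organism from which the "
                   "biomaterial was derived"),
]


def terms_with_conflicting_definitions(pairs):
    """Return terms that have more than one distinct definition text."""
    by_term = defaultdict(set)
    for term, text in pairs:
        by_term[term].add(text.strip().lower())
    return sorted(term for term, texts in by_term.items() if len(texts) > 1)


def looks_like_use_mention_confusion(text):
    """Assumed heuristic: a definition beginning 'name of' defines a name,
    not the entity the term is supposed to refer to."""
    return text.strip().lower().startswith("name of")


conflicts = terms_with_conflicting_definitions(definitions)
flagged = sorted({term for term, text in definitions
                  if looks_like_use_mention_confusion(text)})
print(conflicts, flagged)
```

A string test like this catches only the most literal cases; it is meant to show that such principles can be turned into checks at all, not to replace review by a curator.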
OBI The Ontology for Biomedical Investigations with thanks to Trish Whetzel and Richard Scheuermann
Purpose of OBI
To provide a resource for the unambiguous description of the components of biomedical investigations such as the design, protocols and instrumentation, material, data and types of analysis and statistical tools applied to the data
• NOT designed to model biology
Hypothesis That it is possible to create ontology resources of genuine utility by drawing on logical and philosophical principles e.g. pertaining to consistency of definitions, avoidance of use-mention confusions.
OBI Collaborating Communities
• Crop sciences Generation Challenge Programme (GCP)
• Environmental genomics MGED RSBI Group, www.mged.org/Workgroups/rsbi
• Genomic Standards Consortium (GSC), www.genomics.ceh.ac.uk/genomecatalogue
• HUPO Proteomics Standards Initiative (PSI), psidev.sourceforge.net
• Immunology Database and Analysis Portal, www.immport.org
• Immune Epitope Database and Analysis Resource (IEDB), http://www.immuneepitope.org/home.do
• International Society for Analytical Cytology, http://www.isac-net.org/
• Metabolomics Standards Initiative (MSI)
• Neurogenetics, Biomedical Informatics Research Network (BIRN)
• Nutrigenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi
• Polymorphism
• Toxicogenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi
• Transcriptomics MGED Ontology Group
OBI – Tools and Documentation
• Open source, standards compliant, and under version management
• Web Ontology Language (OWL) using Protégé editor
• OBI.owl files are available from the OBI SVN Repository
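Since OWL files are commonly serialized as RDF/XML, their class declarations can be read with nothing more than a standard XML parser. The sketch below assumes that serialization and uses a tiny inline document with made-up `example.org` IRIs and labels; it is not the OBI.owl file, and a real ontology would normally be handled with a dedicated OWL library rather than raw XML parsing.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical stand-in for an OWL file in RDF/XML; IRIs and labels are
# invented for illustration and do not come from OBI.
SAMPLE_OWL = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/sketch#Investigation">
    <rdfs:label>investigation</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/sketch#Protocol">
    <rdfs:label>protocol</rdfs:label>
  </owl:Class>
</rdf:RDF>
"""


def list_classes(owl_text):
    """Map each named owl:Class IRI to its rdfs:label (None if unlabeled)."""
    root = ET.fromstring(owl_text)
    classes = {}
    for cls in root.iter(f"{{{OWL}}}Class"):
        iri = cls.get(f"{{{RDF}}}about")
        if iri is not None:  # skip anonymous class expressions
            classes[iri] = cls.findtext(f"{{{RDFS}}}label")
    return classes


classes = list_classes(SAMPLE_OWL)
for iri, label in classes.items():
    print(label, iri)
```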
The Problem of Clinical Investigations Regulatory bodies such as the FDA need to assess the evidentiary value of enormous volumes of data collected e.g. in trials on specific drug formulations For this, they need to impose standardization of terminologies used to express these data, e.g. as developed by the Clinical Data Interchange Standards Consortium (CDISC)
“Study Design”
Descriptive research
• Case study – description of one or more patients
• Developmental research – description of pattern of change over time
• Qualitative research – gathering data through interview or observation
Exploratory research
• Secondary analysis – exploring new relationships in old data
• Historical research – reconstructing the past through an assessment of archives or other records
Experimental research
• Randomized clinical trial
• Meta-analysis – statistically combining findings from several different studies to obtain a summary analysis
“Population”
Recruited population
• Randomized population
• Eligible population
• Screened population
• Premature termination population
Excluded population
• Excluded post-randomization population
• Not-eligible-population
Analyzed population
• Study arm population
• Crossover population
• Subgroup population
• Intent-to-treat population - based on randomization
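A term list like the one on this slide can be made computable by recording which group each term sits under. The sketch below transcribes the slide's grouping and treats each bullet as a child of the heading above it; that reading, and the use of "Population" as a common root, are assumptions made for the example rather than anything the source vocabularies state.

```python
# Grouping transcribed from the "Population" slide. Treating each bullet
# as a child of its heading is an assumption made for this sketch.
TAXONOMY = {
    "Recruited population": [
        "Randomized population",
        "Eligible population",
        "Screened population",
        "Premature termination population",
    ],
    "Excluded population": [
        "Excluded post-randomization population",
        "Not-eligible-population",
    ],
    "Analyzed population": [
        "Study arm population",
        "Crossover population",
        "Subgroup population",
        "Intent-to-treat population",
    ],
}


def parent_map(taxonomy, root="Population"):
    """Flatten the grouping into child -> parent links under a single root."""
    parents = {}
    for group, members in taxonomy.items():
        parents[group] = root
        for member in members:
            parents[member] = group
    return parents


def path_to_root(term, parents):
    """Follow parent links from a term up to the root."""
    path = [term]
    while path[-1] in parents:
        path.append(parents[path[-1]])
    return path


PARENTS = parent_map(TAXONOMY)
print(path_to_root("Crossover population", PARENTS))
```

Storing only child-to-parent links keeps every term in exactly one place, which is also a quick way to notice when a flat term list quietly files the same term under two headings.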
Study design
• Development plan (CDISC)
• Standard operating procedures (CDISC)
• Statistical analysis plan (CDISC)
• Meta-analysis (CDISC)
• Quality assurance (CDISC)
• Quality control (CDISC)
• Baseline assessment (CDISC)
• Validation (CDISC)
• Coding (MUSC)
• Permuted block randomization (MUSC)
• Secondary-study-protocol (RCT)
• Intervention-step (RCT)
• Blinding-method (RCT)
• Negative findings (MUSC)
• Positive findings (MUSC)
• Primary-outcome (RCT)
• Secondary-outcome (RCT)
EXPO The Ontology of Experiments L. Soldatova, R. King Department of Computer Science The University of Wales, Aberystwyth
experimental actions part_of experimental design
subject of experiment part_of experimental design
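Assertions of this form are subject, relation, object triples, so they can be stored and queried directly. The following is a minimal sketch: the two triples are the ones quoted from EXPO above, and the helper functions are hypothetical names introduced for the example, with `is_part_of` following chains of part_of links on the usual assumption that parthood is transitive.

```python
# The two assertions quoted from EXPO, as (subject, relation, object).
TRIPLES = [
    ("experimental actions", "part_of", "experimental design"),
    ("subject of experiment", "part_of", "experimental design"),
]


def parts_of(whole, triples):
    """Return everything asserted to be a direct part of `whole`."""
    return sorted(s for s, p, o in triples if p == "part_of" and o == whole)


def is_part_of(part, whole, triples):
    """True if `whole` is reachable from `part` via one or more part_of links."""
    direct = {}
    for s, p, o in triples:
        if p == "part_of":
            direct.setdefault(s, set()).add(o)
    seen, stack = set(), [part]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent == whole:
                return True
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False


design_parts = parts_of("experimental design", TRIPLES)
print(design_parts)
```

Listing the parts side by side makes it easy to ask whether each assertion is plausible, for instance whether the subject of an experiment is really a part of its design.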