An Ontology-Based Approach for Computational Phenomics: Application to Autism Spectrum Disorder Amar K. Das, MD, PhD Departments of Medicine and of Psychiatry and Behavioral Sciences
Outline • Motivations • NDAR project • Phenologue project • Future Directions
Motivation Psychiatric Genetics Phenotyping Terminology Ontology Logic
Hasler G, et al. Toward constructing an endophenotype strategy for bipolar disorders. Biological Psychiatry (2006) Represent findings and their links using structured knowledge
Phenomics “A primary task for the new field of phenomics will be to clarify what, in practical terms, constitutes a phenotype and then to delineate the different phenotypic components that compose the phenome.” Freimer & Sabatti, Nature Genetics (2003)
dbGaP Mailman, M.D. Nature Genetics (2007)
Current Approaches • Lack of standardization • Lack of organization • Lack of computability
Autism DSM-IV Diagnosis A total of six (or more) items from (1), (2), and (3), with at least two from (1), and one each from (2) and (3) (1) qualitative impairment in social interaction, as manifested by at least two of the following: a) marked impairments in the use of multiple nonverbal behaviors such as eye-to-eye gaze, facial expression, body posture, and gestures to regulate social interaction b) failure to develop peer relationships appropriate to developmental level c) a lack of spontaneous seeking to share enjoyment, interests, or achievements with other people, (e.g., by a lack of showing, bringing, or pointing out objects of interest to other people) d) lack of social or emotional reciprocity
Autism DSM-IV Diagnosis (2) qualitative impairments in communication as manifested by at least one of the following: a) delay in, or total lack of, the development of spoken language (not accompanied by an attempt to compensate through alternative modes of communication such as gesture or mime) b) in individuals with adequate speech, marked impairment in the ability to initiate or sustain a conversation with others c) stereotyped and repetitive use of language or idiosyncratic language d) lack of varied, spontaneous make-believe play or social imitative play appropriate to developmental level
Autism DSM-IV Diagnosis (3) restricted repetitive and stereotyped patterns of behavior, interests and activities, as manifested by at least one of the following: a) encompassing preoccupation with one or more stereotyped and restricted patterns of interest that is abnormal either in intensity or focus b) apparently inflexible adherence to specific, nonfunctional routines or rituals c) stereotyped and repetitive motor mannerisms (e.g., hand or finger flapping or twisting, or complex whole body movements) d) persistent preoccupation with parts of objects Delays or abnormal functioning in at least one of the following areas, with onset prior to age 3 years: (1) social interaction (2) language as used
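The item-count rule stated at the top of the criteria (six or more items in total, at least two from (1), and at least one each from (2) and (3)) is the kind of definition that can be made computable. A minimal sketch, assuming the number of endorsed items per domain has already been determined; the function and parameter names are hypothetical, and the onset-before-age-3 condition is not modeled:

```python
def meets_dsm_iv_item_count(social: int, communication: int, repetitive: int) -> bool:
    """Check only the DSM-IV item-count rule for autistic disorder.

    social        -- endorsed items in domain (1), social interaction
    communication -- endorsed items in domain (2), communication
    repetitive    -- endorsed items in domain (3), restricted/repetitive behavior
    """
    total = social + communication + repetitive
    return (
        total >= 6              # six or more items overall
        and social >= 2         # at least two from (1)
        and communication >= 1  # at least one from (2)
        and repetitive >= 1     # at least one from (3)
    )
```

For instance, counts of 4, 1, and 1 satisfy the rule, while 1, 3, and 3 do not because domain (1) has fewer than two items.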
Goals of NDAR • Develop standards to promote meta-analyses and cross site research data comparisons • Provide researchers access to useful software tools and infrastructure • Promote the sharing of research data relevant to ASD
NIH Research Support in Autism • $100 million/year in funding • Investigator-initiated grants (R01s) • Special initiatives, e.g., RFA for genetics • Centers and networks • Training grants (to institutions and individuals) • New initiatives • Intramural Research Program on Autism • Autism Centers of Excellence (ACE) • National Database for Autism Research (NDAR) • ARRA stimulus program
NDAR System [Architecture diagram; labels: BIRN Services & Resources, Clinical Assessments (OpenClinica), Neuroimaging, Genomics, Subject Tracking & Management, Image Analysis, Security, Image Processing, Common Measures, Genomics data access, Portal, Image data access, Study Management, Grid Computing, Collaboration, Data Integration, Query and Reporting, Data Storage Management, User Management, Data Integration Tools, Auditing]
Phenotypes in Psychiatry ‘The observable structural and functional characteristics of an organism determined by its genotype and modulated by its environment’ • Diagnostic component • Intermediate phenotype • Quantitative phenotype • Covariates
Example Query #1 Find all subjects who are verbal (ADI-R A14). Then look at their IQ (Cognitive Total IQ > 70) and whether or not they have seizures (Medical History Q10). Also find out if they have an abnormal MRI or any genetic abnormalities.
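Example Query #1 can be read as a filter followed by a per-subject report. A rough sketch over in-memory records; the record layout, the field names, and the way verbal status and seizure history are derived from ADI-R A14 and Medical History Q10 are all assumptions, and the sample subjects are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class SubjectRecord:
    subject_id: str
    is_verbal: bool            # assumed to be derived from ADI-R item A14
    total_iq: int              # Cognitive Total IQ
    has_seizures: bool         # assumed to be derived from Medical History Q10
    abnormal_mri: bool
    genetic_abnormality: bool


def example_query_1(records):
    """Return one summary row per verbal subject."""
    rows = []
    for r in records:
        if not r.is_verbal:
            continue  # the query starts from verbal subjects only
        rows.append({
            "subject_id": r.subject_id,
            "iq_above_70": r.total_iq > 70,
            "has_seizures": r.has_seizures,
            "abnormal_mri": r.abnormal_mri,
            "genetic_abnormality": r.genetic_abnormality,
        })
    return rows


# Made-up records, for illustration only.
SAMPLE = [
    SubjectRecord("S1", True, 85, False, False, True),
    SubjectRecord("S2", False, 60, True, True, False),
    SubjectRecord("S3", True, 65, True, False, False),
]
```

Calling `example_query_1(SAMPLE)` keeps S1 and S3 and drops the non-verbal S2. In practice each field comes from a different data source (clinical assessments, imaging, genomics), which is why an integrated, standardized representation is needed.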
Example Query #2 Use head circumference to categorize macroencephaly. Then see if the subjects differ in their ADOS, ADI-R, cognitive, and language profiles, and combine this with genetic data.
NDAR Project • Systematic Review • Ontology Development • Database Infrastructure
Systematic Review • “(ADI-R or ADOS or Vineland) and (genes or genetics) and autism” • 26/43 papers relevant • Mean # phenotypes 4.1, range 1-13 • Three basic types (1:1, sum, cutoff score) Tu, S. W. AMIA Annual Proceedings (2008)
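One possible reading of the three basic definition types (1:1, sum, cutoff score) is sketched below. The function names and the interpretation of each type are assumptions, not definitions taken from the review:

```python
def one_to_one(item_score):
    """1:1 -- the phenotype is the value of a single instrument item."""
    return item_score


def summed(item_scores):
    """Sum -- the phenotype is the total of several item scores."""
    return sum(item_scores)


def cutoff(score, threshold):
    """Cutoff score -- the phenotype is present when a score exceeds a threshold."""
    return score > threshold
```

The threshold is a parameter here because different studies may apply different cutoffs to the same measure.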
Systematic Review • Different terms e.g., ‘age of first phrases’ and ‘age of onset of phrase speech’ • Different cutoff scores e.g., ‘delayed word’ • Different definitions e.g., ‘regression’ e.g., use of different instruments
Ontology • A taxonomy with multiple link types, each with precise meaning [Diagram: example taxonomy with nodes Clinical Research Study, Case Study, Clinical Trial Study, Controlled Case Study, Study Arms]
Perspectives on ‘Ontology’ • Philosophy: The study of what entities and what types of entities exist in reality • Computer Science: A schema that represents a domain and is used to reason about the objects in that domain and the relations between them
Critical to the ‘Semantic Web’ • Shared research and development plan to • Provide explicit semantic meaning to data and knowledge shared on the Web • Bring structure to Web content • Advance the current state-of-the-art in Web information retrieval, which is keyword searching • Distributed applications will be able to process data and knowledge automatically through the use of ontologies
OWL: Web Ontology Language • Advances current Semantic Web standards by using ontologies to represent knowledge • OWL can be used to build ontologies of high-level descriptions, based on three concepts: • Classes (e.g., Subject, Phenotype, Genotype) • Properties (e.g., isBearerOf, hasResults) • Individuals (e.g., “Macroencephaly”)
OWL: Web Ontology Language [Diagram: individual 011451 (class Subject) linked by hasResult to mutInRELN (class Genotype) and by isBearerOf to Macroencephaly (class Phenotype)]
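The class, property, and individual structure in this example can be approximated with plain subject-predicate-object triples. This is a sketch in ordinary Python, not OWL tooling; the identifier `mutInRELN`, the direction of each link, and the `rdf:type` labels are a reading of the diagram and should be treated as assumptions:

```python
# Each triple is (subject, property, object).
TRIPLES = {
    ("011451", "rdf:type", "Subject"),
    ("mutInRELN", "rdf:type", "Genotype"),
    ("Macroencephaly", "rdf:type", "Phenotype"),
    ("011451", "hasResult", "mutInRELN"),
    ("011451", "isBearerOf", "Macroencephaly"),
}


def objects(subject, prop):
    """Return, in sorted order, every object linked to `subject` by `prop`."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == prop)


def instances_of(cls):
    """Return, in sorted order, every individual asserted to belong to `cls`."""
    return sorted(s for s, p, o in TRIPLES if p == "rdf:type" and o == cls)
```

With this representation, `objects("011451", "isBearerOf")` lists the phenotypes borne by the subject, and `instances_of("Subject")` lists the known subjects.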
BIRNLex • A controlled terminology for annotation of BIRN data sources, focusing on imaging data from human subjects and mouse models • Terms cover neuroanatomy, molecular species, behavioral and cognitive processes, subject information, experimental practice and design
Basic Formal Ontology • An upper ontology that supports the development of domain ontologies used in scientific research • All concepts are subclasses of • Continuants: entities that exist in full at any time at which they exist at all • Occurrents: entities that have temporal parts and that happen, unfold, or develop through time
OBO Foundry • Ontologies should be orthogonal • Minimize overlap • Each distinct entity type (universal) should only be represented once • Partition efforts in the OBO Foundry rationally to help organize and coordinate the ontology development
SWRL: Semantic Web Rule Language • W3C Member Submission for expressing logical rules that can be formulated in terms of OWL concepts • Rules in SWRL can be used to deduce new knowledge about an existing OWL ontology • Specification can be extended through the use of built-ins
Example SWRL Rule: hasUncle hasParent(?x, ?y) ^ hasBrother(?y, ?z) → hasUncle(?x, ?z)
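The hasUncle rule is a join on the shared variable ?y: wherever a hasParent fact and a hasBrother fact agree on ?y, a new hasUncle fact is deduced. A minimal sketch of that single inference step in plain Python, with property assertions modeled as sets of pairs (an assumption made for illustration, not how a SWRL engine stores them):

```python
def infer_has_uncle(has_parent, has_brother):
    """Apply hasParent(?x, ?y) ^ hasBrother(?y, ?z) -> hasUncle(?x, ?z).

    Both arguments are sets of (subject, object) pairs.
    """
    return {
        (x, z)
        for (x, y) in has_parent
        for (y2, z) in has_brother
        if y == y2  # ?y must bind to the same individual in both atoms
    }
```

Given `{("Ann", "Bob")}` as hasParent and `{("Bob", "Carl")}` as hasBrother (made-up individuals), the function deduces that Ann has Carl as an uncle.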
Example SWRL Rule: hasSister Person(Amar) ^ hasSibling(Amar, ?s) ^ Woman(?s) → hasSister(Amar, ?s)
Example SWRL Rule: Child Person(?p) ^ hasAge(?p, ?age) ^ swrlb:lessThan(?age, 17) → Child(?p)
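The Child rule adds a built-in comparison to the pattern match: swrlb:lessThan tests the bound ?age value. A small sketch of the same logic in plain Python; the data layout is an assumption:

```python
def infer_children(persons, ages):
    """Apply Person(?p) ^ hasAge(?p, ?age) ^ swrlb:lessThan(?age, 17) -> Child(?p).

    persons -- set of individuals asserted to be Person
    ages    -- iterable of (individual, age) pairs, i.e. hasAge assertions
    """
    return {p for (p, age) in ages if p in persons and age < 17}
```

An individual with no hasAge assertion is never classified as Child, because the rule body cannot be matched for it.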
Rule-Based Methods • Extensions to SWRL • Temporal • Library of temporal built-ins • Query • Extraction of results as a table • MakeSet • Support for set-based operations
Development Methods • Extensions to BIRNLex • Encoding of phenotypes • Querying of NDAR database
Autism Assessment Result Figure 1. The representation of data collected through the ADI-2003 autism assessment instrument as part of the autism ontology.
Phenotype Representation Figure 2. The representation of the Status of age of words phenotype group as an OWL class partitioned by the possible statuses.
Phenotype Rule ADI_2003_result(?assessment) ^ acqorlossoflang_aword(?assessment, ?wordage) ^ swrlb:greaterThan(?wordage, 24) ^ subject_id(?assessment, ?subjectId) ^ orgtax:Human(?subject) ^ subject_id(?subject, ?subjectId) → birn_obo_ubo:bearer_of(?subject, Delayed_word)
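Read procedurally, the phenotype rule selects ADI assessments whose age-of-first-words value exceeds 24, joins them to human subjects on the shared subject ID, and asserts that each matched subject bears Delayed_word. A sketch of that logic in plain Python; the data layout is hypothetical, and treating the value as an age in months is an assumption:

```python
def infer_delayed_word(assessments, humans):
    """Mirror the Delayed_word rule.

    assessments -- iterable of (subject_id, wordage) pairs taken from ADI results,
                   where wordage is the acqorlossoflang_aword value
                   (assumed here to be an age in months)
    humans      -- dict mapping each Human individual to its subject_id
    """
    # swrlb:greaterThan(?wordage, 24)
    delayed_ids = {sid for (sid, wordage) in assessments if wordage > 24}
    # Join assessment and subject on the shared ?subjectId.
    return {
        (subject, "Delayed_word")
        for subject, sid in humans.items()
        if sid in delayed_ids
    }
```

The result is a set of bearer_of pairs. A subject whose value is exactly 24 is not matched, since the rule uses a strict greater-than comparison.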
Ontology-Driven Querying Young, L. IEEE CBMS (2009)
Phenologue Project • Develop an ontology of endophenotypes that maps brain connectivity, neural deficits, and genetic markers into a subject domain theory • Develop logic-based methods to encode and classify endophenotypes based on multi-scale measurements • Create tools to acquire new endophenotypes and annotate phenotype-genotype findings in online resources such as published literature • Develop query-elicitation methods that can evaluate hypotheses about the subject domain theory of endophenotypes using deductive inference
Phenologue Project Query Database Phenotype Definitions New Associations Catalog Analysis
Rule Technologies • Rule paraphrasing • Rule elicitation • Rulebase visualization • Knowledge mining using rules