NIH WORKSHOP: INFORMATICS FOR DATA AND RESOURCE DISCOVERY IN ADDICTION RESEARCH July 8, 2010. Case Study 5 (NEMO) : Informatics tools to support theoretical and practical integration of human neuroscience data Gwen Frishkoff, Ph.D. Psychology & Neuroscience, Georgia State University
NIH WORKSHOP: INFORMATICS FOR DATA AND RESOURCE DISCOVERY IN ADDICTION RESEARCH July 8, 2010 Case Study 5 (NEMO): Informatics tools to support theoretical and practical integration of human neuroscience data Gwen Frishkoff, Ph.D. Psychology & Neuroscience, Georgia State University NeuroInformatics Center, University of Oregon http://nemo.nic.uoregon.edu
What the computer scientist says… Should we write out the data to XML or RDF triples? And do you plan to use ontology rules to do complex reasoning or just use SQL to query the data?
What the neuroscientist hears… Blah blah blah blah blah…data… blah blah blah? And blah blah blah…the data?
GOALS FOR THIS TUTORIAL • What is an ontology & what’s it for? • Why bother? (Case Study: Classification of EEG/ERP data) • What are some “best practices” in ontology design & implementation? • What is RDF & what’s it for? • How does RDF represent information? • How is it used to link data to ontologies? • How can ontology-based annotation be used to support classification of data?
Case Study 5 (NEMO): Neural ElectroMagnetic Ontologies • The problem (pattern classification) • The methods & tools • ontologies • RDF database • Proof of concept (a worked example)
Case Study 5 (NEMO): Neural ElectroMagnetic Ontologies • The challenge (pattern classification) • The methods & tools • ontologies • RDF database • Proof of concept (a worked example)
2-min Primer on EEG/ERP Methods EEGs (“brainwaves” or fluctuations in brain electrical potentials) are recorded by placing two or more electrodes on the scalp surface. ~5,000 ms 256-channel Geodesic Sensor Net
Event-related potentials (ERP) • ERPs (“event-related potentials”) are the result of averaging across multiple segments of EEG, time-locking to an event of interest.
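The averaging step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the NEMO processing pipeline: the single-channel toy signal, the event sample indices, and the function name `average_erp` are all hypothetical.

```python
from statistics import fmean


def average_erp(eeg, event_samples, n_samples):
    """Average fixed-length EEG segments time-locked to each event onset.

    eeg           -- single-channel recording as a list of voltages (toy data)
    event_samples -- sample indices at which the event of interest occurred
    n_samples     -- number of samples to take from each event onset
    """
    # Keep only events whose full segment fits inside the recording.
    segments = [eeg[s:s + n_samples] for s in event_samples
                if s + n_samples <= len(eeg)]
    if not segments:
        raise ValueError("no complete segments to average")
    # Average across segments at each time point relative to the event.
    return [fmean(point) for point in zip(*segments)]


# Toy recording with two events, at samples 1 and 5.
eeg = [0.0, 1.0, 2.0, 1.0, 0.0, 3.0, 2.0, 3.0, 0.0]
erp = average_erp(eeg, event_samples=[1, 5], n_samples=3)
print(erp)  # [2.0, 2.0, 2.0]
```

Activity that is not time-locked to the event tends to cancel in the average, which is why the ERP is cleaner than any single EEG segment.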
ERP Patterns (“Components”) 120 ms ERP Patterns are characterized by 3 dimensions: TIME — peak latency, duration (WHEN in time) SPACE— scalp “topography” (WHERE on scalp) FUNCTION — sensitivity to experiment factors Donchin & Duncan-Johnson, 1977
Brain Electrophysiology (EEG/ERP): The promise (Biomarkers of addiction?) • Tried and true method for noninvasive brain functional mapping • Millisecond temporal resolution • Direct measure of neuronal activity • Portable and inexpensive • Recent innovations give new windows into rich, multi-dimensional patterns • More spatial info (high-density EEG) • More temporal & spectral info (JTF, etc.) • Multimodal integration & joint recordings of EEG and fMRI • Specificity of different patterns beyond “reduction in P300” amplitude… 1 sec
Brain Electrophysiology (EEG/ERP): The challenge • An embarrassment of riches • A wealth of data • A plethora of methods • A lack of integration • How to compare patterns across studies, labs? • How to do valid meta-analyses in ERP research? • A need for robust pattern classification • Bottom-up (data-driven) methods • Top-down (knowledge-driven) methods
A lack of standardization Hypothetical Database Query: Show me all the N400 patterns in data set X. 450 ms Peak latency 410 ms Will the “real” N400 please step forward? 330 ms 410 ms
A Need for Integration Parietal N400 Putative “N400”-labeled patterns ≠ Frontal N400 ≠ Parietal P600
Neural ElectroMagnetic Ontologies (NEMO) • The driving goal is to develop methods and tools to support cross-lab, cross-experiment integration of EEG and MEG data • We bring a set of methods & tools to bear to address this: • A set of formal (OWL) ontologies for representation of EEG/MEG and ERP/ERF data • A suite of tools for ontology-based annotation and analysis of EEG and ERP data • An RDF database that stores annotated data from our NEMO ERP consortium and supports ERP pattern classification via SPARQL queries
Case Study 5 (NEMO): Neural ElectroMagnetic Ontologies • The challenge (EEG pattern classification) • The methods & tools • ontologies • RDF database • Proof of concept (a worked example)
What’s an ontology & what’s it for? “Highly semantically structured” What does this mean & what does it buy us?
*NOTE: We can record pattern definitions from literature in ontology without committing to the truth of these records now and forever Science evolves… So do ontologies!! Maryann: “Avoid ontology wars…” Ontologies for high-level, explicit representation of domain knowledge theoretical integration*
Ontology design principles (based on OBO Foundry recommendations) • Factor the domain to generate modular (“orthogonal”) ontologies that can be reused, integrated for other projects • Reuse existing ontologies (esp. foundational concepts) to define basic (low-level) concepts • Validate definitions of high-level concepts in bottom-up (data-driven) as well as top-down (knowledge-driven) methods • Collaborate with a community of experts in collaborative design, testing of ontology-based tools for data representation and analysis
Factoring the ERP domain TIME SPACE 1 sec FUNCTION Modulation of pattern features (time, space, amplitude) under different experiment conditions
Overview: NEMO Ontologies • NEMO core modules: • NEMO_spatial • NEMO_temporal • NEMO_functional • NEMO_ERP • NEMO_data • NEMO backend: • NEMO_relations • NEMO_imports • NEMO_deprecated • NEMO_annotation_properties
ERP spatial subdomain TIME SPACE 1 sec FUNCTION Modulation of ERP pattern features under different experiment conditions
International 10-10 EEG Electrode Locations ITT electrode location Fz (medial frontal)
Scalp surface “regions of interest” LEFT MEDIAL RIGHT FRONTAL TEMPORAL PARIETAL OCCIPITAL
Reuse in dev’t of NEMO Spatial BFO (Basic Formal Ontology) “UPPER ONTOLOGY” FMA (Foundational Model of Anatomy) “MIDLEVEL ONTOLOGY”
ERP temporal subdomain TIME SPACE 1 sec FUNCTION Modulation of ERP pattern features under different experiment conditions
Early (“exogenous”) vs. Late (“endogenous”) ERP patterns EARLY ~0-150 ms after event (e.g., stimulus onset) MID-LATENCY ~151-500 ms after event (e.g., stimulus onset) LATE 501 ms or more after event (e.g., stimulus onset)
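As a rough sketch, the three latency bands can be written as a small lookup function. The function name `latency_band` is hypothetical, and the slide's approximate ("~") cut points are treated here as hard thresholds purely for illustration.

```python
def latency_band(peak_latency_ms):
    """Map a peak latency (ms after the event) to one of the three bands."""
    if peak_latency_ms < 0:
        raise ValueError("latency must be measured after the event")
    if peak_latency_ms <= 150:
        return "early"        # ~0-150 ms
    if peak_latency_ms <= 500:
        return "mid-latency"  # ~151-500 ms
    return "late"             # 501 ms or more


print(latency_band(120))  # early
print(latency_band(410))  # mid-latency
print(latency_band(600))  # late
```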
Collaboration in dev’t of NEMO ERP TIME SPACE 1 sec FUNCTION Modulation of ERP pattern features under different experiment conditions
NEMO Functional Ontology Jessica Turner BIRN (now part of Neurolex) http://brainmap.org/scribe/index.html CogPO Angela Laird BrainMap
“Cognitive ontologies” Formalization of experiment metadata
CARMEN Project: Development of MINI Frank Gibson & colleagues
Reconstituting the ERP domain… TIME SPACE 1 sec FUNCTION Modulation of ERP pattern features under different experiment conditions
Validation through application of NEMO ontologies in modeling ERP data Frishkoff, Frank, et al., 2007
Case Study 5 (NEMO): Neural ElectroMagnetic Ontologies • The challenge (EEG pattern classification) • The methods & tools • ontologies • RDF database • Proof of concept (a worked example)
Ontologies for high-level, explicit representation of domain knowledge theoretical integration • RDF to support principled mark-up of data for meta-analysis • practical integration
NEMO International Language & Literacy Consortium John Connolly McMaster University Chuck Perfetti University of Pittsburgh Dennis Molfese University of Louisville Kerry Kilborn University of Glasgow Tim Curran University of Colorado Formed in 2007
What is RDF and what is it for? RDF graph (data model)
Annotating EEG/ERP data Pattern Labels Functional attributes Temporal attributes Spatial attributes = + + Concepts coded in OWL NEMO ontology Data coded in RDF NEMO database HOW? Robert M. Frank
Annotating Data in RDF • Data Annotation • The process of marking up or “tagging” data with meaningful symbols; tags may come from ontology linked to a URI • URI (Uniform Resource Identifier) • A compact sequence of characters that identifies an abstract or physical resource (typically located on the Web) • RDF (Resource Description Framework) • RDF is a directed, labeled graph (data model) for representing information (typically on the Web) *See Glossary (http://www.seiservices.com/nida/1014080/ReadingRoom.aspx)
Recall: The goal is to formulate pattern definitions, use them to classify data, and ultimately to revise them based on meta-analysis results Observed Pattern = “N400” iff • Event type is onset of meaningful stimulus (e.g., word) AND • Peak latency is between 300 and 500 ms AND • Scalp region of interest (ROI) is centroparietal AND • Polarity over ROI is negative (<0)
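The four criteria above can be read as a conjunction of simple tests on one observed pattern. The sketch below is an illustration only, not the OWL rule used in NEMO: the `ObservedPattern` class, its field names, the string codes for event type and ROI, and the inclusive 300-500 ms bounds are all assumptions. Negative polarity is taken to mean a mean amplitude below zero over the ROI.

```python
from dataclasses import dataclass


@dataclass
class ObservedPattern:
    event_type: str           # hypothetical code, e.g. "meaningful_stimulus_onset"
    peak_latency_ms: float    # time of maximal amplitude, ms after the event
    roi: str                  # hypothetical code for the scalp region of interest
    mean_amplitude_uv: float  # mean amplitude over the ROI, in microvolts


def is_n400(p):
    """Return True only if all four criteria of the N400 definition hold."""
    return (p.event_type == "meaningful_stimulus_onset"
            and 300 <= p.peak_latency_ms <= 500
            and p.roi == "centroparietal"
            and p.mean_amplitude_uv < 0)


candidate = ObservedPattern("meaningful_stimulus_onset", 410, "centroparietal", -2.5)
frontal = ObservedPattern("meaningful_stimulus_onset", 330, "frontal", -1.8)
print(is_n400(candidate))  # True
print(is_n400(frontal))    # False: fails the ROI criterion
```

Because each criterion is explicit, revising the definition after a meta-analysis (for example, widening the latency window) means changing one test rather than relabeling data by hand.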
The rule (just the temporal criterion) as it appears in Protégé Protégé rendering OWL/RDF rendering
Typical tabular representation of summary ERP data ERP observation (pattern extracted from “raw” ERP data) Peak latency measurement
The “RDF Triple” • In RDF form: <001> <type> <NEMO_0000093> • Subject – Predicate – Object • In natural language: • The data represented in row A is an instance of (“is a”) some ERP pattern. • That is, measurements (cells) are “about” ERP patterns (rows). • In graph form:
RDF Triple #2 • In RDF form: <002> <type> <NEMO_0745000> • Subject – Predicate – Object • In natural language = • The data represented in cell Z (row A, column 1) is an instance of (“is a”) a peak latency temporal measurement (i.e., the time at which the pattern is of maximal amplitude)
RDF Triple #3 • This graph represents an assertion, expressed in RDF = • <002> <is_peak_latency_measurement_of> <001> • The data represented in cell Z is a temporal property of the ERP pattern represented in row A
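The three triples can be held in a tiny in-memory structure to show how subject, predicate, and object combine into a queryable graph. This is a plain-Python stand-in, not the NEMO database: a real RDF store would use full URIs and be queried with SPARQL, and the `match` helper here is hypothetical. The third triple is written with the measurement (002) as subject, following the natural-language reading that cell Z is a property of the pattern in row A.

```python
# Each triple is (subject, predicate, object), using the slides' shorthand IDs.
triples = {
    ("001", "type", "NEMO_0000093"),                   # row A is an ERP pattern
    ("002", "type", "NEMO_0745000"),                   # cell Z is a peak latency measurement
    ("002", "is_peak_latency_measurement_of", "001"),  # cell Z describes row A
}


def match(s=None, p=None, o=None):
    """Return all triples matching the given parts; None acts as a wildcard."""
    return sorted(t for t in triples
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


# Which resources are peak latency measurements of pattern 001?
measurements = [s for s, _, _ in match(p="is_peak_latency_measurement_of", o="001")]
print(measurements)  # ['002']

# What class does each of those measurements belong to?
classes = [o for m in measurements for _, _, o in match(s=m, p="type")]
print(classes)  # ['NEMO_0745000']
```

Chaining the two lookups is the same idea as following edges in the RDF graph: from a pattern to its measurement, then from the measurement to its ontology class.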
Recall: Pattern definition is encoded in the ontology (not in RDF data rep!)