
Model-Based Information Integration in a Neuroscience Mediator System



Presentation Transcript


  1. Model-Based Information Integration in a Neuroscience Mediator System
     Bertram Ludaescher, Amarnath Gupta, Maryann E. Martone
     University of California San Diego

  2. A Standard Mediator Architecture (MIX -- Mediation of Information using XML)
     [Architecture diagram, top to bottom: USER query (XML Q/A); INTEGRATED VIEW; MIX MEDIATOR with its XML Integrated View Definition; XML Q/A to each of three Wrappers; Data Sources (WWW, DB, Files; Lab1, Lab2, Lab3)]
     VLDB2000, Cairo

  3. Integration Issues
     • SEMANTIC Integration ???
     • SYNTACTIC/STRUCTURAL Integration
       • Integrated Views (Src-XML => Intgr-XML)
       • Schema Integration (DTD => DTD)
       • Wrapping, Data Extraction (Text => XML)
     • SYSTEM Integration
       • TCP/IP, HTTP, CORBA
     [Diagram labels: MIX (Mediation of Information using XML); Distributed Query Processing; SRB/MCAT; storage, query capabilities, protocols & services]

  4. Integration Issues: Mediating across Multiple Worlds
     • Structural Integration
       => common semistructured data model (XML)
       => XML queries & transformations to resolve schema conflicts
     • Limited Query Capabilities
       => mediator is aware of the query capabilities (QCs) exported by wrappers
     • ...
     • Semantic Integration
       • most work deals with “one-world” scenarios (e.g., amazon.com vs. bn.com)
       • what if data comes from a “multiple-world” scenario (like neuroscience), where data objects from different sources are not even similar, and only the hidden semantics (known to the domain expert) provides the “semantic link”?

  5. A Neuroscience Question
     What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents?
     [Diagram: ??? Integrated View ???, ??? Integrated View Definition ???, and ??? Mediator ??? above four Wrappers for the sources: protein localization, morphometry, neurotransmission, and Web (CaBP, Expasy)]

  6. Hidden Semantics: Protein Localization
     [Image labels: Purkinje Cell layer of Cerebellar Cortex; Molecular layer of Cerebellar Cortex; Fragment of dendrite]
     <protein_localization>
       <neuron type="purkinje cell" />
       <protein channel="red">
         <name>RyR</> ….
       </protein>
       <region h_grid_pos="1" v_grid_pos="A">
         <density>
           <structure fraction="0.8">
             <name>spine</>
             <amount name="RyR">0</>
           </>
           <structure fraction="0.2">
             <name>branchlet</>
             <amount name="RyR">30</>
           </>

  7. Hidden Semantics: Morphometry
     Annotations on the data:
     • Branch level beyond 4 is a branchlet
     • Must be dendritic because Purkinje cells don’t have somatic spines
     <neuron name="purkinje cell">
       <branch level="10">
         <shaft> … </shaft>
         <spine number="1">
           <attachment x="5.3" y="-3.2" z="8.7" />
           <length>12.348</>
           <min_section>1.93</>
           <max_section>4.47</>
           <surface_area>9.884</>
           <volume>7.930</>
           <head>
             <width>4.47</>
             <length>1.79</>
           </head>
         </spine>
         …
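
The two annotations on this slide are rules that a domain expert applies implicitly when reading the morphometry data. The following is a minimal Python sketch of making them explicit. The `Spine` record, the function name `implied_structure`, and the fallback labels `"unknown"` and `"branch"` are assumptions made for illustration; they are not part of the source schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Spine:
    """Hypothetical flattened view of one <spine> element."""
    neuron: str        # value of <neuron name="...">
    branch_level: int  # value of the enclosing <branch level="...">


def implied_structure(spine: Spine) -> Tuple[str, str]:
    """Return (compartment, structure) implied by the hidden semantics."""
    # Purkinje cells have no somatic spines, so the spine must be dendritic.
    # "unknown" is an assumed fallback for other neuron types.
    compartment = "dendrite" if spine.neuron == "purkinje cell" else "unknown"
    # A branch level beyond 4 is a branchlet; "branch" is an assumed fallback.
    structure = "branchlet" if spine.branch_level > 4 else "branch"
    return compartment, structure


print(implied_structure(Spine("purkinje cell", 10)))
```

With these rules written down, the `branchlet` entries in the protein localization source and the `branch level="10"` entries in the morphometry source can be recognized as referring to the same kind of structure.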

  8. The Problem
     • Multiple Worlds Integration
       • compatible terms not directly joinable
       • complex, indirect associations among schema elements
       • unstated integrity constraints
     • Why not just use Ontologies?
       • typical ontologies associate terms along a limited number of dimensions
     • What’s needed?
       • a “theory” under which non-identical terms can be “semantically joined”
       => lift mediation to the level of conceptual models (CMs)
       => domain knowledge, ICs become rules over CMs
       => Model-Based Mediation

  9. XML-Based vs. Model-Based Mediation
     • XML-based: Integrated-DTD := XML-QL(Src1-DTD, ...)
       • Raw Data => XML Elements => XML Models
       • Structural Constraints (DTDs): Parent, Child, Sibling, ... (e.g., A = (B*|C),D; B = ...)
       • No Domain Constraints
     • Model-based: Integrated-CM := CM-QL(Src1-CM, ...)
       • Raw Data => (XML) Objects => Conceptual Models
       • Classes, Relations, is-a, has-a, ... (e.g., classes C1, C2, C3 and relation R)
       • Logical Domain Constraints (IF … THEN …)
       • DOMAIN MAP

  10. Extended Mediator Architecture
      => Wrappers export Conceptual Models (CMs), i.e., facts + rules for classes, relationships, ICs, ...
      => Mediator imports CMs from sources, auxiliary knowledge bases, and domain maps (DMs)
      => a generic conceptual model (GCM, a subset of F-logic), extensible via rules = common target CM language
      => new CMs can be plugged in by specifying them in GCM + F-logic rules
      => prototype implementation in FLORA:
         • global-as-view approach
         • compiler: F-logic => XSB-Prolog
         • top-down evaluation => virtual (demand-driven) views
         • external interfaces (XML, RDBs, DM visualization, ...)

  11. Model-Based Mediator Architecture
      [Architecture diagram labels, top to bottom:
       • USER/Client: CM Queries & Results (exchanged in XML) against the CM (Integrated View)
       • Mediator Engine: Domain Map (DM), Integrated View Definition (IVD), GCM; FL rule proc., LP rule proc., Graph proc.; XSB Engine
       • CM Plug-In; Logic API (capabilities)
       • CM S1, CM S2, CM S3, each expressed in GCM
       • one CM-Wrapper and one XML-Wrapper per source
       • sources S1, S2, S3]

  12. Definition of Integrated Views ...
      • XML-2-FL and CM-2-FL Translators
        <!ELEMENT Studies (Study)*>
        <!ELEMENT Study (study_id, … animal, experiments, experimenters)>
        <!ELEMENT experiments (experiment)*>
        <!ELEMENT experiment (description, instrument, parameters)>

        studyDB[studies =>> study].
        study[study_id => string; … animal => animal; experiments =>> experiment; experimenters =>> string]. …
      • Specification of Domain Knowledge
        • Subclasses
          mushroom_spine :: spine
        • Rules
          S:mushroom_spine IF S:spine[head -> _; neck -> _].
        • Integrity Constraints
          ic1(S):alert[type -> "invalid spine"; object -> S] IF S:spine[undef ->> {head, neck}].
      • Integrated View Definition
        protein_distribution(Protein, Organism, Brain_region, Feature_name, Anatom, Value) IF
          I:protein_label_image[proteins ->> {Protein}; organism -> Organism;
            anatomical_structures ->> {AS:anatomical_structure[name -> Anatom]}],
          NAE:neuro_anatomic_entity[name -> Anatom; located_in ->> {Brain_region}],
          AS..segments..features[name -> Feature_name; value -> Value].
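
To make the domain-knowledge part of this slide concrete, here is a small Python sketch of the subclass rule and the integrity constraint, applied to plain dictionaries rather than F-logic objects. The dictionary layout and the `id` key are assumptions made for illustration; the prototype described in the talk evaluates such rules in FLORA, not in Python.

```python
from typing import Optional


def is_mushroom_spine(spine: dict) -> bool:
    """Rule: a spine with both a head and a neck is a mushroom_spine."""
    return spine.get("head") is not None and spine.get("neck") is not None


def ic1(spine: dict) -> Optional[dict]:
    """Integrity constraint: alert when head and neck are both undefined."""
    undefined = {part for part in ("head", "neck") if spine.get(part) is None}
    if undefined == {"head", "neck"}:
        # "id" is a hypothetical key standing in for the object identity S.
        return {"type": "invalid spine", "object": spine["id"]}
    return None


spines = [
    {"id": "s1", "head": {"width": 4.47, "length": 1.79}, "neck": {"length": 0.9}},
    {"id": "s2", "head": None, "neck": None},
]
for s in spines:
    print(s["id"], is_mushroom_spine(s), ic1(s))
```

The constraint returns an alert object instead of raising an error, which mirrors how the slide derives `alert` facts that the mediator can then report.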

  13. Definition of Integrated Views (Multiple Sources)
      • Schema
        taxon[subspecies => string; species => string; genus => string; … phylum => string; kingdom => string; superkingdom => string].
      • Creating Mediated Classes
        • union over all classes:
          animal[M => R] IF S:source, S.animal[M => R].
        • association rule:
          X[taxon -> T] IF X:'PROLAB'.animal[name -> N], words(N,[W1,W2|_]), T:'TAXON'.taxon[genus -> W1; species -> W2].
      • Reasoning with Schema (at the mediator)
        subspecies::species::genus:: … kingdom::superkingdom
        • class creation by schema reasoning:
          T:TR, TR::TR1 IF T:'TAXON'.taxon[Taxon_Rank -> TR, Taxon_Rank1 -> TR1], Taxon_Rank::Taxon_Rank1.
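
The association rule on this slide links a PROLAB animal to a TAXON entry by taking the first two words of the animal's name as genus and species. A minimal Python sketch of that join follows; the record layouts and sample values are hypothetical, and exact string matching is an assumption, since the slide does not say how case or spelling variants are handled.

```python
def link_animals_to_taxa(animals, taxa):
    """Map each animal name to the taxon whose genus and species match
    the first two words of the name (cf. words(N, [W1, W2 | _]))."""
    by_binomial = {(t["genus"], t["species"]): t for t in taxa}
    links = {}
    for animal in animals:
        words = animal["name"].split()
        if len(words) < 2:
            continue  # the rule needs at least two words to match
        taxon = by_binomial.get((words[0], words[1]))
        if taxon is not None:
            links[animal["name"]] = taxon
    return links


# Hypothetical sample data for illustration only.
taxa = [{"genus": "Rattus", "species": "norvegicus", "kingdom": "Metazoa"}]
animals = [{"name": "Rattus norvegicus albino"}, {"name": "unlabelled"}]
print(link_animals_to_taxa(animals, taxa))
```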

  14. Model-Based Mediation with DOMAIN MAPS (DMs)
      • “Semantic Road Maps” for situating source data
        => navigational aid (browsing source classes at the conceptual level)
        => basis for integrated views across multiple worlds
        => link points (concepts) and labeled arcs (roles)
        => formal semantics (in FL and/or DLs)
      • Example: ANATOM DM
        = anatomical entities (concepts) + is_a, has_a, overlaps, ... (roles)
        => from syntactic equality to semantic joins
      LINK(X,Y):
        X.zip = Y.zip
        X.addr in Y.zip
        X.zip overlaps Y.county
        ...
      Integrated-CM(Z1,...) :=
        get X1,... from Src1;
        get X2,... from Src2;
        LINK(Xi, Yj);
        Zj = CM-QL(X1,...,Y1,...)
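
The step from syntactic equality to semantic joins can be sketched as a join that takes the LINK predicate as a parameter. The Python below is an illustrative sketch only: the function names, the record layouts, and the `ZIP_COUNTIES` lookup table (standing in for domain-map knowledge about which zip codes overlap which counties) are all assumptions with made-up values.

```python
from typing import Callable, Dict, List, Set, Tuple

Record = Dict[str, str]

# Hypothetical domain knowledge: counties that each zip code overlaps.
ZIP_COUNTIES: Dict[str, Set[str]] = {
    "00001": {"County A"},
    "00002": {"County A", "County B"},
}


def same_zip(x: Record, y: Record) -> bool:
    """Syntactic link: X.zip = Y.zip."""
    return x["zip"] == y["zip"]


def zip_overlaps_county(x: Record, y: Record) -> bool:
    """Semantic link: X.zip overlaps Y.county, decided via the lookup table."""
    return y["county"] in ZIP_COUNTIES.get(x["zip"], set())


def semantic_join(xs: List[Record], ys: List[Record],
                  link: Callable[[Record, Record], bool]) -> List[Tuple[Record, Record]]:
    """Pair up records from two sources whenever the LINK predicate holds."""
    return [(x, y) for x in xs for y in ys if link(x, y)]


src1 = [{"name": "lab", "zip": "00002"}]
src2 = [{"county": "County B"}, {"county": "County C"}]
print(semantic_join(src1, src2, zip_overlaps_county))
```

Passing the predicate in keeps the join itself generic, so a new kind of link only requires a new predicate backed by the domain map.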

  15. ANATOM Domain Map (figure: ANATOM)

  16. ANATOM Domain Map with Registered Data (figure: ANATOM DATA)

  17. Deductive Closure of “has_a” with “tc(is_a)” (YES -- Real Recursive Views!! ;-) (figure: ANATOM CLOSURE)
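
The transcript does not include the rules behind this figure, so the following Python sketch shows only one plausible reading of closing `has_a` under the transitive closure of `is_a`: a concept inherits the parts of everything it is a kind of, and parthood is then chained transitively. Both derivation rules, the function names, and the sample concepts are assumptions for illustration.

```python
def transitive_closure(edges):
    """Naive fixpoint computation of the transitive closure of a relation."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def has_a_star(has_a, is_a):
    """Close has_a under tc(is_a), then under transitivity (assumed rules)."""
    isa_star = transitive_closure(is_a)
    derived = set(has_a)
    # Assumed inheritance rule: X is_a* Z and Z has_a Y imply X has_a Y.
    derived |= {(x, y) for (x, z) in isa_star for (z2, y) in has_a if z == z2}
    return transitive_closure(derived)


# Hypothetical fragment of a domain map.
is_a = {("purkinje_cell", "neuron")}
has_a = {("cerebellar_cortex", "purkinje_cell"),
         ("neuron", "dendrite"),
         ("dendrite", "spine")}
print(("cerebellar_cortex", "spine") in has_a_star(has_a, is_a))
```

The fixpoint loop is the bottom-up counterpart of the recursive view; the prototype described in the talk evaluates such views top-down and on demand instead.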

  18. Example Query Evaluation (I)
      • Example: protein_distribution
        • given: organism, protein, brain_region
      • ANATOM DM:
        • recursively traverse the has_a_star paths under brain_region and collect all anatomical_entities
      • Source PROLAB:
        • join with anatomical structures and collect the value of attribute “image.segments.features.feature.protein_amount” where “image.segments.features.feature.protein_name” = protein and “study_db.study.animal.name” = organism
      • Mediator:
        • aggregate over all parents up to brain_region
        • report distribution
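
The three evaluation steps can be sketched in Python as a traversal, a filter-and-join, and a roll-up. This is an illustrative sketch, not the mediator's implementation: the flat record layout, the function names, the sample amounts, and the choice of summation as the aggregate are assumptions. It also assumes the `has_a` fragment is a tree (acyclic, one parent per entity); with shared children, amounts would be counted under each parent.

```python
def has_a_descendants(has_a, root):
    """Collect every entity reachable from root along has_a edges."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in has_a.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def protein_distribution(has_a, records, organism, protein, brain_region):
    """Return {entity: aggregated amount} for brain_region and its parts."""
    # Step 1 (domain map): entities under the requested brain region.
    entities = has_a_descendants(has_a, brain_region) | {brain_region}
    # Step 2 (source): keep matching records and sum amounts per entity.
    direct = {}
    for r in records:
        if (r["organism"] == organism and r["protein"] == protein
                and r["anatom"] in entities):
            direct[r["anatom"]] = direct.get(r["anatom"], 0) + r["amount"]
    # Step 3 (mediator): roll amounts up to each parent, ending at the region.
    totals = {}

    def rollup(node):
        if node not in totals:
            totals[node] = direct.get(node, 0) + sum(
                rollup(child) for child in has_a.get(node, []))
        return totals[node]

    rollup(brain_region)
    return totals


# Hypothetical domain-map fragment and source records.
has_a = {"cerebellum": ["cerebellar_cortex"],
         "cerebellar_cortex": ["purkinje_layer", "molecular_layer"]}
records = [
    {"organism": "rat", "protein": "RyR", "anatom": "purkinje_layer", "amount": 30},
    {"organism": "rat", "protein": "RyR", "anatom": "molecular_layer", "amount": 10},
    {"organism": "mouse", "protein": "RyR", "anatom": "molecular_layer", "amount": 5},
]
print(protein_distribution(has_a, records, "rat", "RyR", "cerebellum"))
```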

  19. Interactive Queries (I) (figure: KIND)

  20. Example Query Evaluation (II)
      "How does the parallel fiber output (Yale/SENSELAB) relate to the distribution of Ryanodine Receptors (UCSD/NCMIR)?"
      @SENSELAB: X1 := select output from parallel fiber;
      @MEDIATOR: X2 := “hang off” X1 from Domain Map;
      @MEDIATOR: X3 := subregion-closure(X2);
      @NCMIR: X4 := select PROT-data(X3, Ryanodine Receptors);
      @MEDIATOR: X5 := compute aggregate(X4);

  21. Interactive Queries (II) (figure: KIND01)

  22. Resulting Sub DOMAIN MAP “Browser” (figure: PROTLOC)

  23. Computed Protein Localization Data (figure: PROTLOC)

  24. Client-Side Result Visualization (using AxioMap Viewer: Ilya Zaslavsky) (figure: PROTLOC-AxioMap)

  25. Summary & Outlook: Federation of Brain Data
      [Diagram: MODEL-BASED Mediation over federated brain data sources:
       • Surface atlas, Van Essen Lab
       • stereotaxic atlas, LONI
       • MCell, CNL, Salk
       • CCB, Montana SU
       • NCMIR, UCSD
       Result labels: Result (XML/XSLT); PROTLOC; Result (VML); ANATOM]
