Model-Based Mediation with Domain Maps

Bertram Ludäscher*, Amarnath Gupta*, Maryann E. Martone+. *San Diego Supercomputer Center (SDSC); +National Center for Microscopy and Imaging Research (NCMIR); University of California, San Diego (UCSD).


Presentation Transcript


  1. Model-Based Mediation with Domain Maps Bertram Ludäscher* Amarnath Gupta* Maryann E. Martone+ *San Diego Supercomputer Center (SDSC) +National Center for Microscopy and Imaging Research (NCMIR) University of California, San Diego (UCSD)

  2. Overview
     • Motivation
     • Problem with current Mediator Architecture
     • Complex Scientific Multiple-World Scenarios
     • Model-Based Mediation Architecture
     • Lifting from XML to level of Conceptual Models (CMs)
     • Formal Framework
     • Domain Maps (DMs)
     • Generic Conceptual Model (GCM)
     • Integrated View Definition
     • Example Query Evaluation
     • Open Issues

  3. A Standard Mediator Architecture (MIX: Mediation of Information using XML, SDSC/UCSD)
     [Figure: a user query is posed in XML against the integrated view; the MIX mediator evaluates the integrated view definition (XMAS/XQuery) and exchanges XML queries and answers (XML Q/A) with wrappers over the data sources (WWW, DB, files; Lab1, Lab2, Lab3).]

  4. The Problem: Complex Multiple-World Scenarios
     • Current Integration Issues
       • Structural/Schema Conflicts
         • common semistructured data model (XML)
         • schema transformations/integration (XML queries & transforms)
       • Limited Query Capabilities
         • capability-based rewriting (e.g., TSIMMIS)
       • ...
       • BUT scenarios are “one-world” (amazon.com vs. bn.com) or simple multiple-world (home buyer)
     • Problem: No Support for Semantic Mediation
       • “complex multiple-world” scenarios (Neuroscience, Geoscience):
         • complex, disjoint, seemingly unrelated data
         • “hidden semantics” in complex, indirect relationships

  5. A Neuroscience Question
     What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents?
     [Figure: three wrapped sources, protein localization (NCMIR), morphometry (SYNAPSE), and neurotransmission (SENSELAB), sit below a mediator, an integrated view definition, and an integrated view, all three marked “???”.]

  6. Hidden Semantics: Protein Localization (NCMIR)
     [Image labels: Purkinje cell layer of cerebellar cortex; molecular layer of cerebellar cortex; fragment of dendrite]
     <protein_localization>
       <neuron type="purkinje cell" />
       <protein channel="red">
         <name>RyR</>
         …
       </protein>
       <region h_grid_pos="1" v_grid_pos="A">
         <density>
           <structure fraction="0.8">
             <name>spine</>
             <amount name="RyR">0</>
           </>
           <structure fraction="0.2">
             <name>branchlet</>
             <amount name="RyR">30</>
           </>

  7. Hidden Semantics: Morphometry (SYNAPSE)
     Annotations: a branch level beyond 4 is a branchlet; the spine must be dendritic because Purkinje cells don’t have somatic spines.
     <neuron name="purkinje cell">
       <branch level="10">
         <shaft> … </shaft>
         <spine number="1">
           <attachment x="5.3" y="-3.2" z="8.7" />
           <length>12.348</>
           <min_section>1.93</>
           <max_section>4.47</>
           <surface_area>9.884</>
           <volume>7.930</>
           <head>
             <width>4.47</>
             <length>1.79</>
           </head>
         </spine>
         …

  8. Approach: Model-Based Mediation
     • Complex Multiple Worlds Integration Problem
       • terms not directly joinable
       • complex, indirect associations
       • unstated, “hidden” semantics (not just schema conflicts)
     • Missing “Semantic Link”
       => how to define complex, indirect semantic links?
       => lift mediation to the level of conceptual models (CMs)
       => domain expert’s knowledge formalized as rules over CMs
       => Model-Based Mediation

  9. XML-Based vs. Model-Based Mediation
     [Figure: side-by-side comparison of the two approaches.]
     • XML-based: raw data is exported as XML elements described by XML models; structural constraints only (DTDs such as A = (B*|C),D with B = ...; parent, child, sibling, ...); no domain constraints; Integrated-DTD := XQuery(Src1-DTD,...)
     • Model-based: raw data is exported as (XML) objects described by conceptual models (classes C1, C2, C3, relations R, is-a, has-a, ...); logical domain constraints (IF/THEN rules) supplied by a domain map; Integrated-CM := CM-QL(Src1-CM,...)

  10. Extended Mediator Architecture
      • Wrappers export Conceptual Models (CMs)
        • facts & rules for classes, relationships, integrity constraints (ICs), ...
        • source data is “put into context” (“aboutness” index) by linking to domain maps (DMs)
      • Mediator employs CMs and DMs
        • ... to define complex semantic relationships on the formalized domain knowledge
      • Generic Conceptual Model (GCM)
        • as a common target CM
        • minimal requirements/core expressions:
          • instance(O,C), subclass(C1,C2)
          • method_type(C,M,C’), method_value(O,M,R)
          • relation_type(R,A1/C1,...,An/Cn)
          • relation_value(R,a1,...,an)
      • Expressiveness, Extensibility
        • allow inductive properties (inheritance, closures, ...)
        • employ a declarative rule language (e.g. F-Logic)
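The GCM core expressions on slide 10 can be pictured with a small Python sketch. It is only an illustration under assumptions: the fact sets, the example class and object names, and the helpers `superclasses` and `is_instance` are hypothetical and are not taken from the prototype, which uses a rule language such as F-logic rather than Python. The sketch shows how `instance(O,C)` can be closed under `subclass(C1,C2)`, one of the inductive properties (inheritance) the slide asks for.

```python
# Hypothetical toy facts in the spirit of the GCM core expressions.
SUBCLASS = {("purkinje_cell", "neuron"), ("neuron", "cell")}   # subclass(C1, C2)
INSTANCE = {("n1", "purkinje_cell")}                           # instance(O, C)


def superclasses(cls, subclass_facts):
    """Return every class reachable from cls via subclass facts."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_facts:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def is_instance(obj, cls, instance_facts, subclass_facts):
    """instance(obj, cls) holds directly or through inherited membership."""
    for o, c in instance_facts:
        if o == obj and (c == cls or cls in superclasses(c, subclass_facts)):
            return True
    return False
```

With these toy facts, `is_instance("n1", "cell", INSTANCE, SUBCLASS)` holds even though only membership in `purkinje_cell` was stated explicitly.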

  11. Model-Based Mediator Architecture
      [Figure: the user/client poses CM queries against the integrated view (itself a CM) and receives results, exchanged in XML. The mediator engine combines the domain map (DM), the integrated view definition (IVD), and the GCM, with CM plug-ins, FL and LP rule processing, graph processing, and the XSB engine. Each source S1, S2, S3 is wrapped first by an XML-wrapper and then by a CM-wrapper that exports its conceptual model (CM S1, CM S2, CM S3) through a logic API (capabilities).]

  12. Formalizing Domain Knowledge: Domain Map for SYNAPSE and NCMIR
      Domain expert knowledge: Purkinje cells and Pyramidal cells have dendrites that have higher-order branches that contain spines. Dendritic spines are ion (calcium) regulating components. Spines have ion binding proteins. Neurotransmission involves ionic activity (release). Ion-binding proteins control ion activity (propagation) in a cell. Ion-regulating components of cells affect ionic activity (release).
      [Figure: the domain expert knowledge, the corresponding domain map, and the equivalent Description Logic facts.]
      • A domain map comprises
        • Description Logic facts ...
          - concepts ("classes")
          - roles ("associations")
        • derived properties ...
          - ... expressed as logic rules (e.g. F-logic)

  13. Domain Map Refinement
      In addition to registering (“hanging off”) data, a source may also refine the mediator’s domain map: the source can register new concepts at the mediator.

  14. Definition of Integrated Views (Deja Vu?) ...
      • XML/CM-2-FL Translators (a DTD and the F-logic signatures it is translated to)
          <!ELEMENT Studies (Study)*>
          <!ELEMENT Study (study_id, … animal, experiments, experimenters)>
          <!ELEMENT experiments (experiment)*>
          <!ELEMENT experiment (description, instrument, parameters)>

          studyDB[studies =>> study].
          study[study_id => string; … animal => animal; experiments =>> experiment; experimenters =>> string].
          …
      • Specification of Domain Knowledge
        • Subclasses
            mushroom_spine :: spine
        • Data Classification
            DERIVE S:mushroom_spine FROM S:spine[head -> _; neck -> _].
        • Integrity Constraints
            ic1(S):ALERT[type -> “invalid spine”; object -> S] IF S:spine[undef ->> {head, neck}].
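The classification rule and the integrity constraint on slide 14 can be read as two small checks over spine records. The following Python sketch is an assumption-laden paraphrase, not the F-logic rules themselves: spines are represented as plain dictionaries, and the function names `classify_spine` and `check_spine` as well as the `id` field are hypothetical.

```python
def classify_spine(spine):
    """Derive the classes of a spine record.

    Mirrors the rule 'a spine with both a head and a neck is a
    mushroom_spine'; mushroom_spine is a subclass of spine.
    """
    classes = {"spine"}
    if spine.get("head") is not None and spine.get("neck") is not None:
        classes.add("mushroom_spine")
    return classes


def check_spine(spine):
    """Return an alert if both head and neck are undefined, else None.

    Mirrors integrity constraint ic1: such a record is an 'invalid spine'.
    """
    if spine.get("head") is None and spine.get("neck") is None:
        return {"type": "invalid spine", "object": spine["id"]}
    return None
```

A record with only a head is neither classified as a mushroom spine nor flagged: it stays a plain `spine`.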

  15. ... Definition of Integrated Views (Multiple Sources)
      • Integrated View Definition
          DERIVE protein_distribution(Protein, Organism, Brain_region, Feature_name, Anatom, Value)
          FROM I:protein_label_image[proteins ->> {Protein}; organism -> Organism;
                 anatomical_structures ->> {AS:anatomical_structure[name -> Anatom]}],   % from PROLAB
               AS..segments..features[name -> Feature_name; value -> Value],
               NAE:neuro_anatomic_entity[name -> Anatom; located_in ->> {Brain_region}].   % from ANATOM
      • Schema Reasoning & Dynamic Classes
        • TAXON DB Schema
            taxon[subspecies => string; species => string; genus => string; … phylum => string; kingdom => string; superkingdom => string].
        • TAXON Rank Hierarchy
            subspecies::species::genus:: … kingdom::superkingdom
        • Create Classes from TAXON data
            DERIVE T:TR, TR::TR1 FROM T:'TAXON'.taxon[Taxon_Rank -> TR, Taxon_Rank1 -> TR1], Taxon_Rank::Taxon_Rank1.
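The “Create Classes from TAXON data” rule on slide 15 turns data values into classes: the value stored under a rank becomes a class, and it is made a subclass of the value stored under the next higher rank. A minimal Python sketch of that reading follows. It rests on assumptions: the row layout, the `id` field, and the function `derive_classes` are hypothetical, and only the `species`/`genus` pair is used because the slide elides the middle of the rank chain.

```python
# Only two adjacent ranks are used here; the slide's full chain contains
# more ranks, several of which are elided there and not guessed here.
RANK_CHAIN = ["species", "genus"]

# Hypothetical toy rows standing in for tuples of the TAXON source.
TAXA = [
    {"id": "t1", "species": "Rattus norvegicus", "genus": "Rattus"},
    {"id": "t2", "species": "Rattus rattus", "genus": "Rattus"},
]


def derive_classes(rows, ranks):
    """Derive instance(T, TR) and subclass(TR, TR1) facts from taxon rows."""
    instance = set()
    subclass = set()
    for row in rows:
        # Walk adjacent rank pairs, e.g. (species, genus).
        for lower, upper in zip(ranks, ranks[1:]):
            if lower in row and upper in row:
                instance.add((row["id"], row[lower]))
                subclass.add((row[lower], row[upper]))
    return instance, subclass


instance_facts, subclass_facts = derive_classes(TAXA, RANK_CHAIN)
```

The derived `subclass_facts` form a class hierarchy that did not exist in the source schema, which is what the slide calls dynamic classes.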

  16. Query Evaluation Example
      "How does the parallel fiber output (Yale/SENSELAB) relate to the distribution of Ryanodine Receptors (UCSD/NCMIR)?"
      • push selection @SENSELAB: X1 := select output from parallel fiber;
      • determine source context @MEDIATOR: X2 := “hang off” X1 from Domain Map;
      • compute region of interest (here: downward closure) @MEDIATOR: X3 := subregion-closure(X2);
      • push selection @NCMIR: X4 := select PROT-data(X3, Ryanodine Receptors);
      • compute protein distribution @MEDIATOR: X5 := compute aggregate(X4);
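The five-step plan on slide 16 can be sketched as a small pipeline. Everything concrete below is an assumption made for illustration: the in-memory lists standing in for SENSELAB and NCMIR, the field names, the `HAS_A` fragment of a domain map, and the functions `dm_concepts`, `downward_closure`, and `evaluate` are hypothetical, and the amounts are toy values rather than measured data.

```python
# Hypothetical stand-ins for the two sources and a domain map fragment.
SENSELAB = [{"fiber": "parallel_fiber", "target": "purkinje_cell"}]
NCMIR = [
    {"protein": "RyR", "structure": "spine", "amount": 0},
    {"protein": "RyR", "structure": "branchlet", "amount": 30},
    {"protein": "RyR", "structure": "soma", "amount": 7},
    {"protein": "other", "structure": "spine", "amount": 5},
]
HAS_A = {"purkinje_cell": ["dendrite"], "dendrite": ["branchlet"], "branchlet": ["spine"]}


def dm_concepts(has_a):
    """All concepts mentioned in the has_a fragment of the domain map."""
    concepts = set(has_a)
    for parts in has_a.values():
        concepts.update(parts)
    return concepts


def downward_closure(concepts, has_a):
    """Concepts reachable from the given ones by following has_a downward."""
    closed = set(concepts)
    stack = list(concepts)
    while stack:
        current = stack.pop()
        for part in has_a.get(current, []):
            if part not in closed:
                closed.add(part)
                stack.append(part)
    return closed


def evaluate(senselab, ncmir, has_a, fiber, protein):
    x1 = [r["target"] for r in senselab if r["fiber"] == fiber]   # push selection to source 1
    x2 = {c for c in x1 if c in dm_concepts(has_a)}               # anchor results in the domain map
    x3 = downward_closure(x2, has_a)                              # region of interest
    x4 = [r for r in ncmir                                        # push selection to source 2
          if r["protein"] == protein and r["structure"] in x3]
    x5 = {}                                                       # aggregate per structure
    for r in x4:
        x5[r["structure"]] = x5.get(r["structure"], 0) + r["amount"]
    return x5
```

In this toy setup the `soma` row is dropped because `soma` is not in the downward closure of the concept that the first source's result hangs off.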

  17. ANATOM Domain Map with Registered Data ANATOM DATA

  18. Deductive Closure of “has_a” with “tc(is_a)”: (YES -- Real Recursive Views!! ;-) ANATOM CLOSURE
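One plausible reading of the closure named on slide 18 is: first take the transitive closure of `is_a`, let every concept inherit the parts of its ancestors, then close `has_a` transitively. The Python sketch below implements that reading as a naive fixpoint. The reading itself, the function names, and the toy edges are assumptions; the prototype expresses such views as recursive rules rather than Python loops.

```python
def transitive_closure(edges):
    """Naive fixpoint computation of the transitive closure of a relation."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def has_a_closure(is_a, has_a):
    """Close has_a under inheritance along tc(is_a), then transitively."""
    is_a_star = transitive_closure(is_a)
    derived = set(has_a)
    # Inheritance: X is_a* Y and Y has_a Z  =>  X has_a Z
    derived |= {(x, z) for (x, y) in is_a_star for (y2, z) in has_a if y == y2}
    return transitive_closure(derived)


# Hypothetical toy edges, not taken from the ANATOM domain map.
IS_A = {("purkinje_cell", "neuron")}
HAS_A = {("neuron", "dendrite"), ("dendrite", "spine")}
CLOSED = has_a_closure(IS_A, HAS_A)
```

Here `CLOSED` contains `("purkinje_cell", "spine")`, a pair that neither relation states directly.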

  19. Interactive Queries KIND01

  20. Resulting Sub DOMAIN MAP “Browser” PROTLOC

  21. Computed Protein Localization Data PROTLOC

  22. Client-Side Result Visualization (using AxioMap Viewer: Ilya Zaslavsky) PROTLOC-AxioMap

  23. Comparison & Summary: Model-Based Mediation

  24. Conclusions and Outlook
      • Model-based Mediation Architecture
        • for complex multiple worlds scenarios (Neuroscience, ...)
        • sources export CMs (data “lifted” to conceptual level)
        • mediator employs DMs (“semantic road map”)
      • Simple Prototype based on XSB/FLORA
        • source and result data situated in DM context
        • domain scientists are excited ...
      • Some Open Issues
        • striking the right balance between complexity and expressiveness of DMs (e.g. subsumption and satisfiability of DMs should be decidable)
        • query processing/optimization
        • modeling query capabilities
        • semantic annotation tools for “dumb” sources
        • re-implement ... *sigh* ...
        • ...

  25. ADDITIONAL MATERIAL STARTS HERE

  26. ANATOM Domain Map ANATOM

  27. Model-Based Mediation with DOMAIN MAPS (DMs)
      • “Semantic Road Maps” for situating source data
        => navigational aid (browsing source classes at the conceptual level)
        => basis for integrated views across multiple worlds
        => link points (concepts) and labeled arcs (roles)
        => formal semantics (in FL and/or DLs)
      • Example: ANATOM DM
        = anatomical entities (concepts) + is_a, has_a, overlaps, ... (roles)
        => from syntactic equality to semantic joins
      LINK(X,Y):
        X.zip = Y.zip
        X.addr in Y.zip
        X.zip overlaps Y.county
        ...
      Integrated-CM(Z1,...) := get X1,... from Src1; get X2,... from Src2; LINK(Xi, Yj); Zj = CM-QL(X1,...,Y1,...)
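The move from syntactic equality to semantic joins on slide 27 amounts to making the join condition `LINK(X,Y)` a pluggable predicate that may consult domain knowledge. A minimal Python sketch, assuming hypothetical record layouts, placeholder zip and county codes, and a made-up `ZIP_OVERLAPS` table standing in for the domain map:

```python
# Placeholder domain knowledge: which counties a zip code overlaps.
ZIP_OVERLAPS = {"Z1": {"C1"}}

SRC1 = [{"id": "x1", "zip": "Z1"}]
SRC2 = [{"id": "y1", "county": "C1"}, {"id": "y2", "county": "C2"}]


def semantic_join(src1, src2, link):
    """Pair up records from two sources whenever link(x, y) holds."""
    return [(x["id"], y["id"]) for x in src1 for y in src2 if link(x, y)]


def equal_zip(x, y):
    """Syntactic link: both records must carry the same zip value."""
    return x.get("zip") is not None and x.get("zip") == y.get("zip")


def zip_overlaps_county(x, y):
    """Semantic link: x's zip overlaps y's county according to ZIP_OVERLAPS."""
    return y.get("county") in ZIP_OVERLAPS.get(x.get("zip"), set())
```

With these toy sources, `equal_zip` finds no pairs because the second source has no zip attribute at all, while `zip_overlaps_county` links `x1` to `y1`.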

  28. Example Query Evaluation (I)
      • Example: protein_distribution
        • given: organism, protein, brain_region
      • ANATOM DM:
        • recursively traverse the has_a_star paths under brain_region and collect all anatomical_entities
      • Source PROLAB:
        • join with anatomical structures and collect the value of attribute “image.segments.features.feature.protein_amount” where “image.segments.features.feature.protein_name” = protein and “study_db.study.animal.name” = organism
      • Mediator:
        • aggregate over all parents up to brain_region
        • report distribution

  29. Interactive Queries KIND

  30. Summary & Outlook: Federation of Brain Data
      [Figure: model-based mediation federating brain data sources: surface atlas (Van Essen Lab); stereotaxic atlas (LONI); MCell (CNL, Salk); CCB (Montana SU); NCMIR (UCSD). Result panels are labeled Result (XML/XSLT), PROTLOC, Result (VML), and ANATOM.]
