
Towards Automating Complex Associative Access to Multiple Bioinformatics Data Sources

XWRAPComposer: Towards Automating Complex Associative Access to Multiple Bioinformatics Data Sources. Ling Liu, Calton Pu, David Buttler, Wei Han, Henrique Paques, Dan Rocco (Georgia Tech).


Presentation Transcript


  1. XWRAPComposer: Towards Automating Complex Associative Access to Multiple Bioinformatics Data Sources. Ling Liu, Calton Pu, David Buttler, Wei Han, Henrique Paques, Dan Rocco (Georgia Tech)

  2. Outline
     • State of the Art
        • Users’ Perspective
        • Technology Perspective
     • Why SDM Technology – XWRAP Composer
        • Users’ Perspective
        • Technology Perspective
     • Progress Report and Near Term Deliverables
     • Related Long Term Research

  3. Why Automating Complex Associative Access
     • Today: Simple Query-Based Searching
        • Web: Large & Unorganized Document Collections
        • Complex Associative Access requires experts
     • Tomorrow with SDM Technology
        • Semantic Web
        • Complex Associative Access is automated (one stop shopping)
     [Figure labels: Query 1, Query 2, Query 3, Query 4]

  4. Why Automating Complex Associative Access
     • Today: Simple Query-Based Searching
        • Web: Large & Unorganized Document Collections
     • Tomorrow with SDM Technology
        • Semantic Web
     [Figure labels: Query 1, Query 2, Query 3, Query 4; Characterize, Sort, Partition, Filter, Summarize]

  5. Automating Complex Associative Access: XWRAPComposer
     • Wrapper Technology
     • Workflow Technology
     • Semantic Web Technology
        • Service Discovery
        • Service Selection
        • Service Composition
     • Research Issues
        • Semantic Data Integration, Interoperability
        • Scalability, High Performance
        • Trusted Computing, Dependable, Survivable

  6. XWRAPComposer
     • What is it?
        • A wrapper generation system that semi-automatically generates wrappers (information extraction programs) capable of accessing multiple scientific Web pages in one shot.
     • What makes it different from other existing XWRAP tools?
        • It generates wrappers that extract information from multiple Web pages connected by URLs (page links) and compose the results into an integrated XML document.
        • Extremely useful for automating complex associative access to multiple scientific data sources.
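Following page links and composing the extracted pieces into one XML document can be sketched roughly as below. This is a minimal illustration in Python, not XWRAPComposer's actual code: the page contents, the file names, the second accession `AB000001`, the scores and lengths, the regular expressions, and the functions `extract_summary`, `extract_detail`, and `compose` are all hypothetical, and the pages live in a dictionary instead of being fetched from the Web.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the Web: URL -> page text (made-up content).
PAGES = {
    "summary.html": (
        '<a href="detail_1.html">AA045112</a> score=120\n'
        '<a href="detail_2.html">AB000001</a> score=95\n'
    ),
    "detail_1.html": "<pre>LENGTH: 163</pre>",
    "detail_2.html": "<pre>LENGTH: 210</pre>",
}

def extract_summary(text):
    """Wrapper for the summary page: returns (link, accession, score) tuples."""
    pattern = r'<a href="([^"]+)">(\w+)</a> score=(\d+)'
    return re.findall(pattern, text)

def extract_detail(text):
    """Wrapper for a detail page: returns the sequence length as a string."""
    match = re.search(r"LENGTH: (\d+)", text)
    return match.group(1) if match else None

def compose(start_url, pages):
    """Follow links from the summary page and merge results into one XML tree."""
    root = ET.Element("results")
    for link, accession, score in extract_summary(pages[start_url]):
        hit = ET.SubElement(root, "hit", accession=accession, score=score)
        length = extract_detail(pages[link])
        if length is not None:
            ET.SubElement(hit, "length").text = length
    return root

document = compose("summary.html", PAGES)
print(ET.tostring(document, encoding="unicode"))
```

Each page type gets its own small extraction function, and `compose` is the only place that knows how pages are linked, which loosely mirrors the split between per-page wrappers and a composition step.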

  7. SDM Enabling Technology: XWRAPComposer
     • Existing Wrapper Technology: Extracting Data from a single Web Document
     • Wrappers: Blast Detail Wrapper, Blast Sum Wrapper, Sequence Wrapper, Seq. Link Wrapper
     • Queries: Query 1, Query 2, Query 3, Query 4
     • Example sequence shown on the slide:
       CACCTGGAGAAACTTCTGCACTGGCACTGTGTTCCNAGAGCTCCTTCTATGCGTCCCTCC
       CAAGTGATTTAATTTCAGCTGATTGGACTACGAATTCACAAGGCAGAAAAGTCAAGGTCA
       TTTGGNATCTGGAGACAGGAGAACTCAAGGAACCNAAAGGACT
     • Other figure labels: AA045112, htgs

  8. SDM Enabling Technology: XWRAPComposer
     • Wrapper Composer Technology: Extracting Data from Multiple Web Documents
     • Wrappers: Blast Wrapper, Full Seq Wrapper
     • Queries: Query 1, Query 2
     • Example sequence shown on the slide:
       CACCTGGAGAAACTTCTGCACTGGCACTGTGTTCCNAGAGCTCCTTCTATGCGTCCCTCC
       CAAGTGATTTAATTTCAGCTGATTGGACTACGAATTCACAAGGCAGAAAAGTCAAGGTCA
       TTTGGNATCTGGAGACAGGAGAACTCAAGGAACCNAAAGGACT
     • Other figure labels: AA045112, htgs

  9. XWRAPComposer: Technical Perspective
     • Example task: Given a sequence, list all matching DNAs.
     • Blast Wrapper over the NCBI Blast Web site: Blast Query Page, Blast Format Page, Blast Delay Page, Blast Summary Page, Blast Detail Page
     • Interface/Outerface Specification
     • Composer Script
     • Multi-page Control Flow Modeling
     • Data Extraction Workflow
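The multi-page control flow named on this slide (query page, format page, delay page, summary page, detail pages) can be pictured with a small simulation. This is only a sketch under stated assumptions: `FakeBlastSite`, its methods, the request id `RID-1`, the link names, and `run_blast_wrapper` are invented for illustration and do not reflect the real NCBI interface or the Composer Script language.

```python
# Hypothetical page kinds, named after the pages listed on the slide.
QUERY, FORMAT, DELAY, SUMMARY, DETAIL = "query", "format", "delay", "summary", "detail"

class FakeBlastSite:
    """Simulated site (an assumption for illustration; no network access)."""

    def __init__(self, delays_before_ready=2):
        self.remaining_delays = delays_before_ready

    def submit(self, sequence):
        # Submitting the query page leads to the format page.
        return {"kind": FORMAT, "request_id": "RID-1"}

    def fetch_results(self, request_id):
        # The site answers with a delay page until the search has finished.
        if self.remaining_delays > 0:
            self.remaining_delays -= 1
            return {"kind": DELAY}
        return {"kind": SUMMARY, "detail_links": ["hit-1", "hit-2"]}

    def fetch_detail(self, link):
        return {"kind": DETAIL, "hit": link}

def run_blast_wrapper(site, sequence, max_polls=10):
    """Walk the page sequence and return (visited page kinds, detail records)."""
    visited = [QUERY]
    page = site.submit(sequence)
    visited.append(page["kind"])
    request_id = page["request_id"]

    page = site.fetch_results(request_id)
    polls = 0
    while page["kind"] == DELAY:
        visited.append(DELAY)
        polls += 1
        if polls >= max_polls:
            raise TimeoutError("results not ready after %d polls" % max_polls)
        # A real wrapper would wait before asking again.
        page = site.fetch_results(request_id)
    visited.append(page["kind"])

    # Follow every link on the summary page to its detail page.
    details = [site.fetch_detail(link) for link in page["detail_links"]]
    visited.extend(d["kind"] for d in details)
    return visited, details

visited, details = run_blast_wrapper(FakeBlastSite(), "CACCTGGAGAAACTTCTGCACTGG")
print(visited)
```

The delay page is the only loop in the flow; every other transition is a straight hand-off from one page type to the next, and the poll limit keeps the wrapper from waiting forever.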

  10. SDM Center Data Integration Infrastructure
      [Architecture diagram; recoverable labels, grouped by apparent component]
      • XWRAP: Human Knowledge, GUI, Extraction Rules, Code Generator; Data Source and XML Wrapper pairs
      • Workflow resolution: User Agent, Parameterized Workflow Specification (PWS), User constraints & parameters, Source Capabilities (SC), Binding Patterns, Domain Map/Ontology, Workflow Resolution Service (WRS), DB; outcomes "WF feasible" and "WF infeasible: report reason"
      • Workflow instantiation: Workflow Instantiation Service (WIS), Executable Workflow Plan: “Matt’s WF”
      • Registration and mediation: Data Registration, Services Registration, Service registry and brokering, Data Mediation, DB
      • Agents: Data Integration Agent(s), Workflow Agent, Wrapper based Agents, Other I/O Agents, Other Agents (e.g., VIPAR), Agent Communication Protocol Gateway
      • External interfaces: External Program, Database Access, Program Interfacing, External Interface
      • Endpoints: User (Matt), Data Sources
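The slide does not say how the Workflow Resolution Service decides between "WF feasible" and "WF infeasible: report reason". One common way to read "Binding Patterns" is that each source declares which attributes must be supplied before it can be queried. Under that assumption, a feasibility check could look like the sketch below; `resolve_workflow`, the `CAPABILITIES` table, and the source and attribute names are all hypothetical.

```python
def resolve_workflow(steps, capabilities, user_parameters):
    """Return (True, plan) if every step's inputs can be bound, else (False, reason).

    steps: source names in execution order.
    capabilities: source name -> {"inputs": set, "outputs": set}, a simple
        binding pattern where inputs must be bound before the source is called.
    user_parameters: attribute names the user supplies up front.
    """
    bound = set(user_parameters)
    plan = []
    for name in steps:
        if name not in capabilities:
            return False, "no registered source named %r" % name
        pattern = capabilities[name]
        missing = pattern["inputs"] - bound
        if missing:
            reason = "%s needs unbound input(s): %s" % (name, ", ".join(sorted(missing)))
            return False, reason
        plan.append(name)
        # Outputs of this step become available to later steps.
        bound |= pattern["outputs"]
    return True, plan

# Hypothetical source capabilities for a two-step workflow.
CAPABILITIES = {
    "blast_search": {"inputs": {"sequence"}, "outputs": {"accession"}},
    "sequence_lookup": {"inputs": {"accession"}, "outputs": {"full_sequence"}},
}

ok, plan = resolve_workflow(["blast_search", "sequence_lookup"], CAPABILITIES, {"sequence"})
bad, reason = resolve_workflow(["sequence_lookup"], CAPABILITIES, {"sequence"})
print(ok, plan)
print(bad, reason)
```

The first call succeeds because the search step produces the accession that the lookup step requires; the second fails and reports which input could not be bound.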

  11. Progress Report
      • Status
         • Produced Three Deliverables
            • Composer Interface/Outerface Specification
            • Five Java Wrappers for Pilot Scenario
            • Composer Script Examples for Pilot Scenario
         • XWRAPComposer design and development
      • Near Term Plan
         • Finish the design of XWRAP Composer scripting language (Nov. 2002)
         • Develop the first prototype of XWRAP Composer system (Jan. 2003)
         • Performance Evaluation (March 2003)

  12. Related Long Term Research
      • Semantic Web and Semantic Data Integration
      • Service Discovery
         • Dynamic content crawler
      • Service Selection
         • Adaptive query routing
      • Service Composition
         • Infopipe Technology
