Modeling Interactive Web Sources for Information Mediation
Bertram Ludäscher, Amarnath Gupta
San Diego Supercomputer Center, UCSD
• Information Mediation Framework/Motivation
• Modeling Interactive Sources with Interaction Diagrams
• Computing Derived Capabilities
• Conclusions & Future Work

Mediation of Information Using XML (MIX)
[Figure: MIX Mediator Architecture. Labels: BBQ UI, Mediator, XMAS View Def., XML-queries, XML-answers, Wrapper (three, over WWW, RDB, and OODB sources).]
Modeling Web Sources for Information Mediation

Motivation
• Wrappers export schema, capabilities, and data of sources
• Specifics of Web sources (e.g., vs. RDB/OODB sources):
  • limited query capabilities, i.e., restricted i/o binding patterns
  • need for user interaction
  • complex navigations
  • data extraction
  • accessibility
  • ...

Example: ATM/bank locator
• "Find some ATM/bank locations close to a location given by street, city, state, user interaction."
• involves
  • input and output attributes
  • forms, navigations, user interaction
  • data extraction

Modified MIX Mediator Architecture
[Figure: the MIX mediator architecture extended with user/source interaction for "Non-Wrappable" User Interactions. Labels: BBQ UI, Mediator, XMAS View Def., XML-queries, XML-answers, Wrapper (three, over WWW, RDB, and OODB sources).]

Modeling Interactive Sources ...
• Model input/output behavior of (HTML) elements as select-project queries:
  • Given values for input attributes $a$,
  • select tuples satisfying $\psi(a)$ from source relation $R$,
  • project on output attributes $b$:
  • $\pi_{b}(\sigma_{\psi(a)}(R))$

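The select-project model above can be illustrated with a minimal Python sketch. The relation `R`, its rows, and the helper names `select_project` and `link_condition` are assumptions made up for illustration; they are not taken from the slides or from the MIX system.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, str]


def select_project(relation: Sequence[Row],
                   psi: Callable[[Row], bool],
                   outputs: Sequence[str]) -> List[Tuple[str, ...]]:
    """Compute pi_outputs(sigma_psi(relation)), dropping duplicate tuples."""
    result: List[Tuple[str, ...]] = []
    for row in relation:
        if psi(row):                                      # selection sigma_psi
            projected = tuple(row[b] for b in outputs)    # projection pi_b
            if projected not in result:
                result.append(projected)
    return result


# Hypothetical source relation R (illustrative rows only).
R: List[Row] = [
    {"apc": "SAN", "zip": "92101", "city": "San Diego"},
    {"apc": "LAX", "zip": "90045", "city": "Los Angeles"},
]


def link_condition(apc_value: str) -> Callable[[Row], bool]:
    """psi(apc) := (apc = $apc), where $apc is the actual parameter."""
    return lambda row: row["apc"] == apc_value


# Clicking the link labeled "SAN" corresponds to this query.
zips = select_project(R, link_condition("SAN"), ["zip"])
print(zips)
```

Treating the condition as a plain predicate keeps links, forms, and menus uniform: each element only differs in how its predicate is built from the supplied input values.
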
Modeling Interactive Sources ...
• Example (HTML Link):
  • clicking on an airport code apc produces the airport address with zip code:
  • $\pi_{\mathit{zip}}(\sigma_{\psi(\mathit{apc})}(R))$
  • where $\psi(\mathit{apc}) := (\mathit{apc} = \$\mathit{apc})$ and $\$\mathit{apc}$ represents the actual parameter (here: link label)
  • (or: $\mathit{answer}(\mathit{ZIP}) \leftarrow \mathit{APC} = \$\mathit{APC},\ R(\mathit{APC}, \mathit{ZIP}, \ldots)$)

... Modeling Interactive Sources ...
• Example (HTML Forms):
  • Given a street and city name, extract the zip code:
  • $\pi_{\mathit{zip}}(\sigma_{\psi(\mathit{street},\mathit{city})}(R))$
  • where $\psi(\mathit{street},\mathit{city}) := (\mathit{street} = \$s \wedge \mathit{city} = \$c)$
• By default, forms are modeled as conjunctions of equalities.
• (similar for menus and other elements)

... Modeling Interactive Sources ...
• Example (Non-Wrappable Elements):
  • interactions that require explicit user interaction with the source, and whose result depends on $a_1, \ldots, a_n$, are denoted by
  • $!ui(a_1, \ldots, a_n)$
  • The user interaction is modeled by a new, internal attribute $x$:
  • $\pi_{y}(\sigma_{\psi(a_1, \ldots, a_n, x)}(R))$

... Modeling Interactive Sources ...
• Use select-project queries to model i/o behavior of individual user interaction elements (links, forms, menus, maps, ...)
• Missing: how to "glue" together ...
  • navigations through a Web site and
  • data extraction
• ... in order to model source capabilities of complex interactions
• model a Web source using interaction diagrams:
  • nodes: source page(s) and exported data
  • edges: transitions and required interactions

Interaction Diagrams
[Figure: a transition $t: x \rightarrow y$ with edge label $i[\psi, u \setminus v]$]
• diagram: labeled graph over given attributes
• node: source page(s) with same i/o behavior
• node label: exported attributes (tuple/set/list): (a_1,...,a_n), {(a_1,...,a_n)}, [(a_1,...,a_n)]
• edge: transition; edge label: required interaction, e.g., [a1, ...] → {b1, ...}
• interaction $i \in \{\mathit{href}, \mathit{form}, \mathit{menu}, !ui, \ldots\}$
• transition condition $\psi$
• t's result depends on attributes u but not v

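An interaction diagram, as described above, is a labeled graph, so it can be held in a small data structure. The sketch below is one possible encoding, not the authors' representation. The class names, the node and edge names, and the `!ui` alternative `t2` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Node:
    name: str
    kind: str                 # "tuple", "set", or "list"
    atts: Tuple[str, ...]     # exported attributes


@dataclass(frozen=True)
class Edge:
    name: str
    source: str
    target: str
    interaction: str          # e.g. "href", "form", "menu", "!ui"
    condition: str = ""       # transition condition, kept as text here


def label(node: Node) -> str:
    """Render a node label as (a,...), {(a,...)} or [(a,...)]."""
    inner = "(" + ",".join(node.atts) + ")"
    return {"tuple": inner, "set": "{" + inner + "}", "list": "[" + inner + "]"}[node.kind]


def paths_from(edges: List[Edge], start: str,
               prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every transition path starting at `start`, using each edge at most once."""
    for e in edges:
        if e.source == start and e.name not in prefix:
            path = prefix + (e.name,)
            yield path
            yield from paths_from(edges, e.target, path)


# Hypothetical diagram loosely following the ATM/bank locator example.
banks = Node("banks", "list", ("bankid",))
edges = [
    Edge("t1", "start", "loc", "form"),
    Edge("t2", "start", "loc", "!ui"),
    Edge("t3", "loc", "banks", "form", "radius>0"),
    Edge("t5", "banks", "bank", "href"),
]
all_paths = list(paths_from(edges, "start"))
print(label(banks), all_paths)
```

Enumerating paths is the starting point for deriving capabilities, since each path yields one candidate query template.
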
Example Interaction Diagram
• transition path t1.t3.t5:
  • fill in form with street, city, menu with state,
  • fill in form s.t. radius > 0,
  • follow all bankid links
• derivable query template q(t1.t3.t5) = (+street, +city, +state, +radius, -bname, -bstreet, -bcity)

Deriving Query Capabilities: Single Transitions
[Figure: a transition $t: x \rightarrow y$ with edge label $i[\psi, u \setminus v]$]
• input atts: $\mathit{in}(t) := (\mathit{atts}(x) \cup \mathit{atts}(i) \cup u) \setminus v$
• output atts: $\mathit{out}(t) := \mathit{atts}(y)$
• interaction requirements: $\mathit{act}(t) := i$

Deriving Query Capabilities: Transition Paths $t_1.t_2.t$ (with $t = t_3. \ldots .t_n$)
• $\mathit{in}(t_1.t_2.t) := \mathit{in}(t_1) \cup (\mathit{in}(t_2.t) \setminus \mathit{propagate}(\mathit{con}(t_1.t_2)))$
• $\mathit{out}(t_1.t_2.t) := \mathit{out}(t_2.t)$
• $\mathit{act}(t_1.t_2.t) := \mathit{act}(t_1) \otimes \mathit{con}(t_1.t_2) \otimes \mathit{act}(t_2.t)$
• where
  • $\mathit{con}(t_1.t_2)$ connects $\mathit{out}(t_1)$ with $\mathit{in}(t_2)$ (e.g., …)
  • $\mathit{propagate}(\ldots)$ are attributes whose values are passed along (e.g., $\mathit{out}(t_1) \cap \mathit{in}(t_2)$)
  • $\otimes$ denotes serial conjunction

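The derivation rules for single transitions and transition paths can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation. It assumes that propagate is exactly the slide's example (outputs of one transition intersected with inputs of the next), and the attributes `loc` and `bankid` on the intermediate nodes are hypothetical, chosen so that the path t1.t3.t5 reproduces the query template from the example diagram.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

fs = frozenset


@dataclass(frozen=True)
class Transition:
    name: str
    x: FrozenSet[str]               # atts(x): attributes of the source node
    y: FrozenSet[str]               # atts(y): attributes of the target node
    interaction: str                # i, e.g. "form", "href", "!ui"
    i_atts: FrozenSet[str] = fs()   # atts(i): attributes the interaction needs
    u: FrozenSet[str] = fs()        # result depends on these ...
    v: FrozenSet[str] = fs()        # ... but not on these


def in_single(t: Transition) -> FrozenSet[str]:
    return (t.x | t.i_atts | t.u) - t.v


def propagate(t1: Transition, t2: Transition) -> FrozenSet[str]:
    # Assumption: values passed along are out(t1) intersected with in(t2).
    return t1.y & in_single(t2)


def in_path(path: List[Transition]) -> FrozenSet[str]:
    first, rest = path[0], path[1:]
    if not rest:
        return in_single(first)
    return in_single(first) | (in_path(rest) - propagate(first, rest[0]))


def out_path(path: List[Transition]) -> FrozenSet[str]:
    return path[-1].y


def act_path(path: List[Transition]) -> List[str]:
    """Serial conjunction rendered as an ordered list of steps."""
    steps: List[str] = []
    for idx, t in enumerate(path):
        steps.append(t.interaction)
        if idx + 1 < len(path):
            steps.append("pass " + ",".join(sorted(propagate(t, path[idx + 1]))))
    return steps


# Hypothetical transitions for the ATM/bank locator path t1.t3.t5.
t1 = Transition("t1", fs(), fs({"loc"}), "form+menu", fs({"street", "city", "state"}))
t3 = Transition("t3", fs({"loc"}), fs({"bankid"}), "form", fs({"radius"}))
t5 = Transition("t5", fs({"bankid"}), fs({"bname", "bstreet", "bcity"}), "href")

path = [t1, t3, t5]
print(sorted(in_path(path)), sorted(out_path(path)), act_path(path))
```

Intermediate attributes such as `loc` and `bankid` drop out of the derived inputs because they are produced by one transition and consumed by the next.
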
Use of Interaction Diagrams
• modeling tool
• complex query capabilities (binding patterns) along paths $t = t_1. \ldots .t_n$ → $q_t(\mathit{in}(t), \mathit{out}(t))$
• sequence of interaction requirements → execution plans
• subsumption and equivalence of binding patterns → check for supported queries
• distinction between wrappable vs. unwrappable queries using the number of !ui nodes in paths

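The checks listed above can be sketched with plain set operations. The slides do not spell out the definitions, so the ones below are assumptions: a template supports a query when its inputs are all bound and it returns every wanted output, and one template subsumes another when it needs no more inputs and returns no fewer outputs. All function names are hypothetical.

```python
from typing import FrozenSet, Sequence, Tuple

Template = Tuple[FrozenSet[str], FrozenSet[str]]   # (in(t), out(t))

fs = frozenset


def supports(template: Template, bound: FrozenSet[str], wanted: FrozenSet[str]) -> bool:
    """Assumed test: all required inputs are bound, all wanted outputs are produced."""
    t_in, t_out = template
    return t_in <= bound and wanted <= t_out


def subsumes(general: Template, specific: Template) -> bool:
    """Assumed definition: `general` answers every query that `specific` answers."""
    g_in, g_out = general
    s_in, s_out = specific
    return g_in <= s_in and s_out <= g_out


def equivalent(p: Template, q: Template) -> bool:
    return subsumes(p, q) and subsumes(q, p)


def ui_count(interactions: Sequence[str]) -> int:
    """Number of !ui interactions along a path."""
    return sum(1 for i in interactions if i == "!ui")


def is_wrappable(interactions: Sequence[str]) -> bool:
    return ui_count(interactions) == 0


# Template derived for t1.t3.t5 in the ATM/bank locator example.
atm: Template = (fs({"street", "city", "state", "radius"}),
                 fs({"bname", "bstreet", "bcity"}))
names_only: Template = (atm[0], fs({"bname"}))

print(supports(atm, fs({"street", "city", "state", "radius"}), fs({"bname"})))
print(subsumes(atm, names_only), is_wrappable(["!ui", "form", "href"]))
```
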
Interacting with the Mediator
• Diagram-enabled wrapper can do local query optimizations
• Models of wrapper-mediator interaction:
  • Wrapper exports complete interaction diagram, mediator plans query
  • Mediator hands complete subquery to wrapper, wrapper optimizes query
  • Wrapper gives mediator partial plans, mediator computes overall query plan
• An interesting model
  • Touting wrapper: identifies missing element in a query that would make an infeasible query feasible

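The touting-wrapper idea can be illustrated with a short sketch under stated assumptions. It assumes the wrapper holds a list of derived templates, each a pair of required inputs and produced outputs, and that the "missing element" is the smallest set of additional input attributes that makes some template applicable. The function `tout` and the second template are hypothetical.

```python
from typing import FrozenSet, List, Optional, Tuple

Template = Tuple[FrozenSet[str], FrozenSet[str]]   # (required inputs, outputs)

fs = frozenset


def tout(templates: List[Template],
         bound: FrozenSet[str],
         wanted: FrozenSet[str]) -> Optional[FrozenSet[str]]:
    """Return the smallest set of extra inputs that makes the query feasible.

    An empty set means the query is already feasible; None means no template
    produces all wanted outputs, whatever inputs are added.
    """
    candidates = [t_in - bound for t_in, t_out in templates if wanted <= t_out]
    if not candidates:
        return None
    # Prefer fewer missing attributes; break ties alphabetically for determinism.
    return min(candidates, key=lambda s: (len(s), sorted(s)))


templates: List[Template] = [
    # Template of the ATM/bank locator path t1.t3.t5.
    (fs({"street", "city", "state", "radius"}), fs({"bname", "bstreet", "bcity"})),
    # Hypothetical second capability of the same source.
    (fs({"zip"}), fs({"bname"})),
]

missing = tout(templates, fs({"street", "city", "state"}), fs({"bname", "bcity"}))
print(missing)
```

In this example the wrapper would tell the mediator that supplying a radius turns the infeasible query into a feasible one.
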
Conclusions & Future Work
• Ongoing:
  • Stand-alone Web site modeling tool (SiteModel: P. Nguyen) which exports an interaction diagram
  • Definition & analysis of query capabilities of complex sources
• Future Work:
  • support for query evaluation at the MIX mediator and user/source interaction
  • use interaction diagrams for semi-automatic wrapper generation