Querying the deep Web By John Muntunemuine and Martha Kamkuemah Supervisor: Sonia Berman
Outline • Problem being tackled • Why it's important • Related work • Overview of the system • Scope • Design challenges • Main components of project • Key success factors • Risks • Conclusion
Problem being tackled • Query databases hidden behind query interfaces and retrieve results from them • Build a query-based system able to send a query to multiple deep Web databases simultaneously • Then investigate a generic solution
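The second bullet, sending one query to several deep Web databases at once, can be sketched roughly as below. This is a minimal illustration and not the project's implementation: `query_all`, `airline_a` and `airline_b` are hypothetical names, and the two airline functions are offline stand-ins for real form submissions.

```python
from concurrent.futures import ThreadPoolExecutor


def query_all(sources, query, timeout=10):
    """Send the same query to every source concurrently.

    `sources` maps a source name to a callable taking the query dict.
    Returns (results, errors): rows per source, and error text per failed source.
    """
    results, errors = {}, {}
    if not sources:
        return results, errors
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # Submit all queries first so they run in parallel.
        futures = {name: pool.submit(fn, query) for name, fn in sources.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout)
            except Exception as exc:  # one failing source must not block the others
                errors[name] = str(exc)
    return results, errors


# Hypothetical stand-ins for two airline query interfaces (no network access).
def airline_a(query):
    return [{"from": query["origin"], "to": query["destination"], "price": 120}]


def airline_b(query):
    raise RuntimeError("form not reachable")


results, errors = query_all(
    {"airline_a": airline_a, "airline_b": airline_b},
    {"origin": "CPT", "destination": "JNB"},
)
```

Collecting failures separately keeps one unreachable interface from hiding the answers of the others.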
Why it's important • Few tools exist for querying the deep Web • Can create Internet services such as "comparison shopping" by integrating data from competing service providers • Less effort for the user, who queries a single interface instead of many
Related work • S. Raghavan & H. Garcia-Molina: HiWE (Hidden Web Exposer), a crawler that automatically parses, processes & interacts with form-based search interfaces
Related work • Article: Web data management • Integration of query interfaces • Query processing • Result processing
Related work • Wu: identified the mismatch problem with one-to-one mapping • Developed a clustering-based schema integration technique that maps fields in query forms
Scope • Heterogeneous nature of data stored in hidden databases • Solution: look at airline domain • Start specific • Generalize solution
Design challenges • Locating the relevant sections • Semantically matching attributes
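One simple way to approach the attribute-matching challenge is to compare each form label against a list of known synonyms for a canonical attribute. The sketch below assumes a hand-written synonym table for the airline domain (`CANONICAL` is hypothetical) and uses plain string similarity from Python's `difflib`. It is an illustration only, not the clustering technique cited in the related work.

```python
import difflib
import re

# Hypothetical synonym table for the airline domain (an assumption for illustration).
CANONICAL = {
    "origin": ["from", "departure city", "leaving from", "origin"],
    "destination": ["to", "arrival city", "going to", "destination"],
    "depart_date": ["departure date", "depart", "outbound date"],
}


def normalize(label):
    """Lower-case a form label and drop punctuation such as a trailing colon."""
    return re.sub(r"[^a-z ]", "", label.lower()).strip()


def match_attribute(label, cutoff=0.8):
    """Return the canonical attribute whose synonym is most similar to the label.

    Returns None when no synonym reaches the similarity cutoff.
    """
    text = normalize(label)
    best_attr, best_score = None, 0.0
    for attr, synonyms in CANONICAL.items():
        for synonym in synonyms:
            score = difflib.SequenceMatcher(None, text, synonym).ratio()
            if score > best_score:
                best_attr, best_score = attr, score
    return best_attr if best_score >= cutoff else None
```

For example, `match_attribute("From:")` maps to `"origin"`, while a label with no close synonym, such as `"Promo code"`, is left unmatched.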
Main components of project • Two main parts • Query formulation • Result interpretation
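The two parts can be pictured with a small sketch, under the assumptions that a site's search form is submitted as a GET request and that results come back as an HTML table. The function names, the field mapping and the `example.com` URL are hypothetical, and real result pages would need more robust parsing.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


def formulate_query(action_url, field_map, query):
    """Query formulation: translate canonical attributes into one site's form fields."""
    params = {field_map[key]: value for key, value in query.items() if key in field_map}
    return action_url + "?" + urlencode(params)


class TableRows(HTMLParser):
    """Collects the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def interpret_results(html):
    """Result interpretation: turn a result table into a list of records."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows  # first row is assumed to hold the column names
    return [dict(zip(header, row)) for row in body]


url = formulate_query(
    "http://example.com/search",                # hypothetical form action
    {"origin": "dep", "destination": "arr"},    # hypothetical per-site field names
    {"origin": "CPT", "destination": "JNB"},
)
records = interpret_results(
    "<table><tr><th>Flight</th><th>Price</th></tr>"
    "<tr><td>XY123</td><td>120</td></tr></table>"
)
```

Keeping the per-site field mapping separate from the canonical query is what would let the same query be reused across several interfaces.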
Key success factors • Send a query to one deep Web database and display the results • Then expand the system toward a more general solution by implementing heuristics to deal with general pages
Risks • If implementing both parts of the system takes too long, we may concentrate on one part: query formulation
Conclusion • Query hidden databases • Investigate ways to make the system more general • Subsystems: query formulation and result interpretation