
Querying the deep Web

By John Muntunemuine and Martha Kamkuemah. Supervisor: Sonia Berman.


Presentation Transcript


  1. Querying the deep Web • By John Muntunemuine and Martha Kamkuemah • Supervisor: Sonia Berman

  2. Outline • Problem being tackled • Why it's important • Related work • Overview of the system • Scope • Design challenges • Main components of project • Key success factors • Risks • Conclusion

  3. Problem being tackled • Query databases hidden behind query interfaces and retrieve results from them • Build a query-based system that can send one query to multiple deep Web databases simultaneously • Then investigate a generic solution
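
The "one query, many databases" idea from slide 3 can be sketched as a concurrent fan-out. This is only an illustration under stated assumptions: `search_airline_a`, `search_airline_b` and the record fields are hypothetical stand-ins, not part of the project, and a real adapter would submit each source's search form over HTTP instead of returning canned rows.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two deep Web sources. A real adapter would
# submit the source's search form and parse the response page.
def search_airline_a(query):
    return [{"source": "airline_a", "origin": query["origin"],
             "dest": query["dest"], "price": 120}]

def search_airline_b(query):
    return [{"source": "airline_b", "origin": query["origin"],
             "dest": query["dest"], "price": 95}]

def query_all(sources, query):
    """Send the same query to every source in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        result_lists = list(pool.map(lambda search: search(query), sources))
    # Flatten the per-source lists and order the merged rows by price.
    merged = [row for rows in result_lists for row in rows]
    return sorted(merged, key=lambda row: row["price"])

results = query_all([search_airline_a, search_airline_b],
                    {"origin": "CPT", "dest": "JNB"})
```

Threads suit this case because each source call would mostly wait on the network; the merged list is what a "comparison shopping" view would display.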

  4. Why it's important • Few tools exist for querying the deep Web • Enables Internet services such as “comparison shopping” by integrating data from competing service providers • Querying a single interface takes less effort than querying each site separately

  5. Related work • S. Raghavan & H. Garcia-Molina: HiWE (Hidden Web Exposer), a crawler that automatically parses, processes and interacts with form-based search interfaces

  6. Related work • Article: Web data management • Integration of query interfaces • Query processing • Result processing

  7. Related work • Wu: Mismatch problem with one-to-one mapping • Developed a clustering-based schema integration technique that maps fields in query forms

  8. Overview of the system

  9. Scope • Problem: heterogeneous nature of data stored in hidden databases • Solution: focus on the airline domain • Start specific • Then generalize the solution

  10. Design challenges • Locating the relevant sections • Semantically matching attributes
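
To make the attribute-matching challenge concrete, here is a minimal sketch that maps form field labels onto a shared schema. It is not the method used in the project or in the related work: the `GLOBAL_SCHEMA` names, the synonym sets and the 0.8 threshold are all assumptions chosen for illustration, and string similarity from Python's `difflib` stands in for real semantic matching.

```python
from difflib import SequenceMatcher

# Hypothetical global schema for the airline domain, with hand-picked synonyms.
GLOBAL_SCHEMA = {
    "origin": {"from", "departure city", "leaving from", "origin"},
    "destination": {"to", "arrival city", "going to", "destination"},
    "depart_date": {"depart", "departure date", "outbound date"},
}

def normalize(label):
    """Lowercase a form label, drop colons and collapse whitespace."""
    return " ".join(label.lower().replace(":", " ").split())

def match_attribute(label, threshold=0.8):
    """Return the schema attribute whose synonym best matches the label, or None."""
    text = normalize(label)
    best_name, best_score = None, 0.0
    for name, synonyms in GLOBAL_SCHEMA.items():
        for synonym in synonyms:
            # Exact synonym hits score 1.0; otherwise fall back to string similarity.
            score = 1.0 if text == synonym else SequenceMatcher(None, text, synonym).ratio()
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

For example, `match_attribute("From:")` resolves to `"origin"`, while a label with no close synonym, such as `"Frequent flyer number"`, is left unmapped.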

  11. Main components of project • Two main parts • Query formulation • Result interpretation
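
The two parts can be sketched end to end as follows. Everything specific here is an assumption for illustration: the URL `https://airline.example/search`, the field names `fromCity` and `toCity`, the sample result page, and the choice of a GET form with results in an HTML table. Only standard-library calls are used.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

def formulate_query(action_url, field_map, user_query):
    """Translate a query in global-schema terms into a source-specific GET URL."""
    params = {field_map[name]: value
              for name, value in user_query.items() if name in field_map}
    return action_url + "?" + urlencode(params)

class TableRows(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""
    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self._in_cell = False
            if self._row:  # skip rows without <td> cells, e.g. header rows
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

def interpret_results(html):
    """Extract result rows from a response page that lists results in a table."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows

# Hypothetical source description and response page.
url = formulate_query("https://airline.example/search",
                      {"origin": "fromCity", "destination": "toCity"},
                      {"origin": "CPT", "destination": "JNB"})
sample_html = ("<table><tr><th>Flight</th><th>Price</th></tr>"
               "<tr><td>SA123</td><td>R950</td></tr>"
               "<tr><td>SA456</td><td>R1200</td></tr></table>")
rows = interpret_results(sample_html)
```

Keeping the per-source field map separate from the query logic means supporting another source only requires a new mapping, which fits the "start specific, then generalize" plan.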

  12. Success factors • First, send a query to one deep Web database and display the results • Then expand the system toward a more general solution by implementing heuristics to deal with general pages

  13. Risks • If implementing both parts of the system takes too long, we may concentrate on one part only: query formulation

  14. Conclusion • Query hidden databases • Investigate ways to make the system more general • Subsystems: query formulation and result interpretation

  15. Questions?
