Observations on DB & IR and Conclusions

This presentation challenges the myth of absolute facts in DB & IR and argues that uncertainty and ambiguity are the norm. It advocates a search paradigm built on similarity-based ranking as a syntactic approximation to semantic search, one that still leverages context information. It also calls for collaboration among the DB, IR, NLP, and ML communities to tackle the challenges of data search.

rcash

Presentation Transcript


  1. Observations on DB & IR and Conclusions
     ! business data is boring; the action is in e-science, e-culture, edutainment, etc.
     ! absolute facts are a myth created by accountants; uncertainty and ambiguity are fact
     ! hope for precise semantics based on universally agreed-upon ontologies and perfect metadata is wishful thinking
     => similarity search with ranking based on statistical properties of data is the best possible syntactic approximation to "semantic" search (IR)
     => and can still leverage context information (metadata where existing, ontological knowledge if beneficial, multivariate distributions, etc.) (DB)

  2. Killer Queries: Where Google & DBMS Fail
     • Find gene expression data and regulatory paths related to Barrett tissue in the esophagus.
     • What are the most important results in percolation theory?
     • Are there any theorems isomorphic to my new conjecture? Find related theorems.
     • Find information about public subsidies for plumbers.
     • Find new EU regulations that affect an electrician's business.
     • Where can I download an open source implementation of the ARIES recovery algorithm?
     • Which professors from D are teaching DBS and have research projects on XML?
     • Who was the French woman that I met at the PC meeting where Peter Gray was PC Chair?

  3. IR Strengths
     + methodologically rich (statistics & prob., logic, NLP, ...)
     + appreciation and experience with ML
     + awareness of cognitive models for end-user intention & behavior

  4. DB Strengths
     + "-ities" (integrity, scalability, availability, manageability, etc.)
     + system engineering
     + resource optimization (caching, memory mgt., query opt. & exec., physical design, scheduling, etc.)

  5. DB & IR Marriage
     • Separation was a historical accident (driven by application classes: business vs. libraries)
     • From a DB guy's perspective: team up with IR, NLP, ML folks (e.g., merge SIGMOD and SIGIR) or learn all this yourself
     • Hypothesis: similarity-based ranking will become the ubiquitous search paradigm (with traditional DB queries as a special case)

  6. DB & IR: Issues and Non-Issues
     • exploit collective human input
     • crawl structured data
     • simple IR on XML
     • use ML & ontologies for XML IR
     • add flexible ranking to XQuery
     • polish XQuery & implement efficiently
     • use ML to convert Web to XML
     • homepage.xml schema
     • extend Google to Deep Web (DB/DL federations)
     • break Google monopoly: collaborative P2P Web search
     • acquire broader skills (IR/DB/ML/NLP/...): team up, learn & teach, ...
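
The claims on slides 1 and 5 (ranking driven by statistical properties of the data, with traditional DB queries as a special case) can be illustrated with a minimal sketch. It is not taken from the presentation: TF-IDF with cosine similarity is just one common statistical ranking scheme, swapped in here for illustration, and the names `rank`, `boolean_rank`, `DOCS`, and `ROWS`, as well as the toy data, are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus, loosely inspired by the "killer queries" slide.
DOCS = [
    "open source implementation of the aries recovery algorithm",
    "important results in percolation theory",
    "research projects on xml ranking",
]

# Hypothetical toy relation for the exact-match case.
ROWS = [
    {"name": "A", "dept": "DB"},
    {"name": "B", "dept": "IR"},
    {"name": "C", "dept": "DB"},
]


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately naive)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return sparse TF-IDF vectors (term -> weight) and the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF, so every known term gets a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (score, doc_index) pairs, best match first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the corpus get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda p: (-p[0], p[1]))


def boolean_rank(predicate, rows):
    """Exact-match query expressed as ranking with scores in {0, 1}."""
    scored = [(1.0 if predicate(r) else 0.0, i) for i, r in enumerate(rows)]
    return sorted(scored, key=lambda p: (-p[0], p[1]))
```

Calling `rank("aries recovery", DOCS)` puts the first document on top, while `boolean_rank(lambda r: r["dept"] == "DB", ROWS)` gives the rows a selection predicate would return a score of 1 and all others a score of 0. In this sense a traditional filter is a ranking whose scores happen to be binary.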
