
TIJAH @ INEX 2003



Presentation Transcript


  1. TIJAH @ INEX 2003 The Cirquid Team CWI and University of Twente

  2. Overview • Introduction • Content-Only (CO) • (Pattern-Based) Structured Querying • Conclusions and Future Work • Questions/Discussion

  3. Content-Only (CO) • Same model as for INEX 2002 • Exhaustivity (content-based relevance): statistical language model [Hiemstra 2000] • Specificity: log-normal distribution over component size, mean at ~2500 words
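A minimal sketch of the size-based specificity component, assuming it is a log-normal density over component length in words. The slide only gives the mean (~2500 words); the spread `sigma` and the function name are hypothetical, and how this value is combined with the language-model score is not stated on the slide, so it is left out here.

```python
import math


def lognormal_size_prior(size_in_words, mean_size=2500.0, sigma=1.0):
    """Log-normal density over component size, measured in words.

    mean_size follows the slide (~2500 words). sigma is an assumed
    spread used only for illustration; the slides do not give it.
    """
    if size_in_words <= 0:
        return 0.0
    # Pick mu so that the distribution mean, exp(mu + sigma^2 / 2),
    # equals mean_size.
    mu = math.log(mean_size) - sigma ** 2 / 2.0
    z = (math.log(size_in_words) - mu) / sigma
    norm = size_in_words * sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * z * z) / norm
```

Under these assumptions, very small components (a few words) and very large ones (whole journal volumes) both receive a much lower prior than components of a few hundred to a few thousand words.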

  4. Structured Querying (SCAS/VCAS) • Pattern-Based Structured Querying • Collection of 3 patterns • Base pattern for determining a single subtree (pattern 1) • More complex combinations of pattern 1 instances (patterns 2 and 3)

  5. About Function • ALL of the topic's keywords are used to process EACH of the about clauses • Example query: //article[about(./,'IR') AND about(.//sec,'XML')]

  6. About Function • ALL of the topic's keywords are used to process EACH of the about clauses • Example query: //article[about(./,'IR') AND about(.//sec,'XML')] • Processed as: //article[about(./,'IR XML') AND about(.//sec,'IR XML')]
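The keyword rewriting on this slide can be sketched as a string transformation. This is an illustration only, not the TIJAH implementation: the function name and the regular expression are hypothetical, queries are assumed to use straight single quotes, and a leading `+` on a term is dropped, as in the slide's later example.

```python
import re

# Matches about(<path>, '<keywords>') and captures the path and the keywords.
ABOUT = re.compile(r"about\(\s*([^,]+?)\s*,\s*'([^']*)'\s*\)")


def expand_about_clauses(query):
    """Give every about clause the union of all keywords in the topic."""
    keywords = []
    for _path, terms in ABOUT.findall(query):
        for term in terms.split():
            term = term.lstrip("+")  # assumption: '+' markers are ignored
            if term and term not in keywords:
                keywords.append(term)
    joined = " ".join(keywords)
    # Rewrite each clause, keeping its path and replacing its keywords.
    return ABOUT.sub(lambda m: f"about({m.group(1)},'{joined}')", query)


expanded = expand_about_clauses(
    "//article[about(./,'IR') AND about(.//sec,'XML')]"
)
```

Here `expanded` is the rewritten query from the slide, with `'IR XML'` in both about clauses.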

  7. Pattern 1 • Simplest pattern instance • Topic 69 • VCAS and SCAS • Query: /article/bdy/sec[about(.//st, '…')]

  8. Pattern 1 – VCAS and SCAS • Nodeset selections • Containment • Relevance • Containment [Figure: document tree article → bdy → sec → st]

  9. Average ranked containing • Previous operation is ranked containing • Multiple subtrees within the target element are averaged [Figure: a sec element containing two st elements with scores 0.2 and 0.1]

  10. Average ranked containing • Previous operation is ranked containing • Multiple subtrees within the target element are averaged [Figure: the sec element receives score 0.15, the average of its two st elements with scores 0.2 and 0.1]
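The averaging step can be sketched as a group-by over the target element. The function name and the input layout (pairs of target id and subtree score) are assumptions made for illustration.

```python
from collections import defaultdict


def average_ranked_containing(subtree_scores):
    """Average the scores of all scored subtrees inside each target element.

    subtree_scores: iterable of (target_id, score) pairs, one pair per
    scored subtree (for example, one per st element inside a sec).
    """
    grouped = defaultdict(list)
    for target_id, score in subtree_scores:
        grouped[target_id].append(score)
    return {target: sum(s) / len(s) for target, s in grouped.items()}


# The slide's figure: one sec containing two st elements.
sec_scores = average_ranked_containing([("sec", 0.2), ("sec", 0.1)])
```

For the figure's values this yields a score of about 0.15 for the `sec` element, up to floating-point rounding.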

  11. Pattern 2 • Topic 73 • VCAS: the absence of a subtree does not render the target completely irrelevant • SCAS: all specified subtrees need to be present for relevance • Query: //article[about(.//st, '…') AND about(.//bib, '…')]

  12. Pattern 2 – VCAS • Split up into set of pattern 1 instances • Combine resultsets • OR -> max • AND -> min (non zero) [Figure: article tree with st and bib subtrees]

  13. Pattern 2 – SCAS • Split up into set of pattern 1 instances • Combine resultsets • AND -> min • OR -> max [Figure: article tree with st and bib subtrees]

  14. Pattern 2 - Example • Query: //article[about(.//st, '+comparison') AND about(.//bib, 'machine learning')] • Step 1, execution of 2 pattern 1 instances: //article[about(.//st, 'comparison machine learning')] and //article[about(.//bib, 'comparison machine learning')] • Results of the two instances: (Art 1 0.2, Art 2 0) and (Art 1 0.1, Art 2 0.3) • Step 2, combining results with AND • VCAS: Art 1 0.1, Art 2 0.3 • SCAS: Art 1 0.1, Art 2 0
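A minimal sketch of the combination step for this example, assuming each pattern 1 instance returns a mapping from article to score and that a missing article counts as score 0. The function name and the `strict` flag (SCAS when true, VCAS when false) are hypothetical.

```python
def combine(result_sets, operator, strict):
    """Combine pattern 1 result sets per target element.

    OR  -> max of the scores.
    AND -> min of the scores; in the vague (VCAS) case only non-zero
           scores are considered, so a missing subtree does not zero
           out the target.
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    targets = set()
    for results in result_sets:
        targets.update(results)
    combined = {}
    for target in sorted(targets):
        scores = [results.get(target, 0.0) for results in result_sets]
        if operator == "OR":
            combined[target] = max(scores)
        elif strict:
            combined[target] = min(scores)
        else:
            non_zero = [s for s in scores if s > 0]
            combined[target] = min(non_zero) if non_zero else 0.0
    return combined


# Scores of the two pattern 1 instances, as given on the slide.
first = {"Art 1": 0.2, "Art 2": 0.0}
second = {"Art 1": 0.1, "Art 2": 0.3}
vcas = combine([first, second], "AND", strict=False)
scas = combine([first, second], "AND", strict=True)
```

With the slide's numbers, `vcas` keeps Art 2 at 0.3 although one subtree scored 0, while `scas` assigns it 0; Art 1 gets 0.1 in both.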

  15. Pattern 3 • Topic 64 • CAS query: //article[about(., '…')]//sec[about(., '…')] • VCAS: What does the first about mean? Drop all about-calls, except those specified for the target element • SCAS: Split up into set of pattern 1 instances; top-down structural correlation to the correct nodeset

  16. Pattern 3 – VCAS • Query: //article//sec[about(., '…')] [Figure: article (about 1) above sec (about 2)]

  17. Pattern 3 – SCAS • Ranked by the scores of the target element [Figure: article above sec, annotated with the steps 1. about 1 (article), 2. about 2 (sec), 3. containment]

  18. Pattern 3 - Example • Query: //article[about(./, 'hollerith')]//sec[about(., 'DEHOMAG')] • Step 1, execution of 1 or 2 pattern 1 instances: //article[about(./, 'hollerith DEHOMAG')] and //article//sec[about(./, 'hollerith DEHOMAG')] • Scores: Art 1 0.2, Art 2 0; sec 1 0.1, sec 2 0.3 • Step 2, ranked containing • VCAS (only the second about): sec 1 0.1, sec 2 0.3 • SCAS: sec 1 0.1, in case sec 1 belongs to Art 1 and sec 2 does not
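The difference between the two interpretations in this example can be sketched as follows. The function name, the `strict` flag, and the parent mapping are hypothetical; the sketch also assumes that sec 2 belongs to Art 2 (the slide only says it does not belong to Art 1) and that sections with score 0 are not returned.

```python
def pattern3(article_scores, section_scores, section_parent, strict):
    """Rank target sections by their own about score.

    strict=False (VCAS): only the about on the target element is used.
    strict=True  (SCAS): a section is kept only if its containing
                         article also has a non-zero about score.
    """
    ranked = []
    for sec, score in section_scores.items():
        if score <= 0:
            continue  # assumption: zero-score sections are not returned
        parent = section_parent[sec]
        if strict and article_scores.get(parent, 0.0) <= 0:
            continue
        ranked.append((sec, score))
    # The ranking always uses the score of the target element itself.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


article_scores = {"Art 1": 0.2, "Art 2": 0.0}
section_scores = {"sec 1": 0.1, "sec 2": 0.3}
section_parent = {"sec 1": "Art 1", "sec 2": "Art 2"}  # assumed containment

vcas_ranking = pattern3(article_scores, section_scores, section_parent, strict=False)
scas_ranking = pattern3(article_scores, section_scores, section_parent, strict=True)
```

Under these assumptions the vague run returns both sections ranked by their own scores, and the strict run returns only sec 1, matching the slide.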

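  19. Physical Query Plan - Pattern 1 • Query: /article/bdy/sec[about(.//st, '…')] [Figure: operator tree with inputs //st, /article/bdy/sec, W and Q, an about operator, and a final avg-groupby producing /article/bdy/sec; the operator symbols were lost in extraction]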

  20. Conclusions • CO model works pretty well • Article run still • ‘Keep it simple’ approach

  21. Any questions?
