
Query Caching in Agent-based Distributed Information Retrieval

Query Caching in Agent-based Distributed Information Retrieval. Hemali Majithia. Problem Definition. DIR (IR) systems access their collections to perform searches and answer queries. Query resolution on large corpora is expensive in terms of time and resources.


Presentation Transcript


  1. Query Caching in Agent-based Distributed Information Retrieval
     Hemali Majithia - CADIP, UMBC

  2. Problem Definition
     • DIR (IR) systems access their collections to perform searches and answer queries
     • Query resolution on large corpora is expensive in terms of time and resources
     • Similar queries produce similar results
     • Repetitive and redundant searching of the collections
     • Resource wastage and inefficiency
     • Solution: “CACHING QUERIES”

  3. Solution
     • Caching Mechanism
        • Cache new queries along with their results
        • Answer future similar queries using the cached queries
     • New Query
        • A query that has not been answered before
     • Similar Query
        • A query that is identical or similar to a query already in the cache
     • Emphasis
        • If similar queries exist in the cache, their results can be reused for the new query; an exact match is not required
        • Retrieval → linear time → collection size

  4. Caching Mechanism
     • Two-level Caching Mechanism
        • First level → Exact Match
        • Second level → Inverted Index of the queries
     • Caching Algorithm
        • Least Recently Used (LRU)
        • Least Frequently Used (LFU)
        • Lowest Relative Value (LRV)
     • Similarity Metric
        • Cosine Similarity
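The two-level lookup on slide 4 can be sketched roughly as follows. This is a minimal illustration, not the CARROT-II implementation: the class name `TwoLevelQueryCache`, the whitespace tokenizer, the raw term-frequency vectors, the similarity `threshold`, and the choice of LRU (one of the three policies listed) are all assumptions made for the example.

```python
import math
from collections import Counter, OrderedDict, defaultdict


def tokenize(query):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return query.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class TwoLevelQueryCache:
    """Hypothetical sketch: exact match first, then an inverted index of queries."""

    def __init__(self, capacity=100, threshold=0.8):
        self.capacity = capacity       # assumed to be at least 1
        self.threshold = threshold     # assumed cutoff for "similar enough"
        self.entries = OrderedDict()   # normalized query -> (term vector, results)
        self.index = defaultdict(set)  # term -> normalized queries containing it

    def put(self, query, results):
        tokens = tokenize(query)
        key = " ".join(tokens)
        if key in self.entries:
            # Refresh an existing entry and mark it most recently used.
            self.entries.move_to_end(key)
            self.entries[key] = (self.entries[key][0], results)
            return
        if len(self.entries) >= self.capacity:
            self._evict_lru()
        vec = Counter(tokens)
        self.entries[key] = (vec, results)
        for term in vec:
            self.index[term].add(key)

    def _evict_lru(self):
        # The first item of the OrderedDict is the least recently used one.
        old_key, (old_vec, _) = self.entries.popitem(last=False)
        for term in old_vec:
            self.index[term].discard(old_key)
            if not self.index[term]:
                del self.index[term]

    def get(self, query):
        tokens = tokenize(query)
        key = " ".join(tokens)
        # First level: exact match on the normalized query string.
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key][1]
        # Second level: collect cached queries sharing at least one term,
        # then pick the one with the highest cosine similarity.
        vec = Counter(tokens)
        candidates = set()
        for term in vec:
            candidates |= self.index.get(term, set())
        best_key, best_sim = None, 0.0
        for cand in candidates:
            sim = cosine(vec, self.entries[cand][0])
            if sim > best_sim:
                best_key, best_sim = cand, sim
        if best_key is not None and best_sim >= self.threshold:
            self.entries.move_to_end(best_key)
            return self.entries[best_key][1]
        return None  # miss: the caller would resolve the query against the collection
```

With this sketch, a query such as "distributed retrieval" would be answered from a cached "distributed information retrieval" entry when the threshold is low enough, because the inverted index narrows the comparison to cached queries that share a term. Swapping LRU for LFU or LRV would only change `_evict_lru` and the bookkeeping it relies on.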

  5. Caching in CARROT–II (diagram)
     Components shown: Node I, Node II, a Query Agent, six C2 Agents, six primary caches, and two secondary caches.
     Labeled steps:
     1. User query
     2. Lookup
     3. Miss
     4. Query forwarded
     5. Miss
     6. Query forwarded to best C2
     7. Lookup
     8. Hit
     9. Update cache
     10. Results returned
     11. Response

  6. Metrics for Evaluation of Caching Mechanism
     • Efficiency
        • Round Trip Time (RTT) = total time to answer the queries fired at the system
        • Hit Rate = measured for each agent cache and for the system as a whole
        • Cost of caching = the overhead caused by caching (assuming that the hit rate is 0)
     • Effectiveness
        • Precision = fraction of retrieved documents that are relevant
        • Recall = fraction of relevant documents that are retrieved
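As an illustration of the metrics on slide 6, the definitions of precision, recall, and hit rate can be written out directly. The function names, the document IDs, and the per-agent counts below are hypothetical and chosen only for the example; the slides do not report any measurements.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Fraction of relevant documents that are retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def hit_rate(hits, lookups):
    """Fraction of cache lookups answered from the cache."""
    return hits / lookups if lookups else 0.0


def total_hit_rate(per_agent):
    """Overall hit rate from a mapping of agent name -> (hits, lookups)."""
    hits = sum(h for h, _ in per_agent.values())
    lookups = sum(n for _, n in per_agent.values())
    return hit_rate(hits, lookups)


# Made-up example: 4 documents retrieved, 3 relevant, 2 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
print(precision(retrieved, relevant))  # 0.5
print(recall(retrieved, relevant))     # 2 of 3 relevant documents found

# Made-up per-agent counters: (hits, lookups).
agents = {"c2_a": (3, 10), "c2_b": (1, 10)}
print(total_hit_rate(agents))          # 0.2
```

The total hit rate is computed from summed counts rather than by averaging per-agent rates, so agents that receive more lookups weigh more in the total.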
