
Semantic Data Caching and Replacement

Semantic Data Caching and Replacement. Based on the talk by Kunhao Zhou about the paper by Shaul Dar, Michael J. Franklin, Bjorn T. Jonsson, Divesh Srivastava, and Michael Tan. Proceedings of the 22nd VLDB Conference, Mumbai (Bombay), India, 1996.


Presentation Transcript


  1. Semantic Data Caching and Replacement • Based on the talk by Kunhao Zhou about the paper by Shaul Dar, Michael J. Franklin, Bjorn T. Jonsson, Divesh Srivastava, and Michael Tan. • Proceedings of the 22nd VLDB Conference, Mumbai (Bombay), India, 1996.

  2. Outline • Motivation • Client Caching Architecture • Model of Semantic Caching • Simulations and Results • Conclusion and Future Work

  3. Motivation • Distributed database • Clients are high-end workstations (fat client) • High computational power. • Big local storage

  4. Motivation (Contd.) • Effective use of a client is the key to achieving high performance. • Less network traffic. • Faster response time. • Higher server throughput. • Better scalability.

  5. Client Caching Architecture • Data-Shipping. • The client processes the query. • Data is brought on demand from the servers. • Navigational access. • Object ID (Tuple ID or Page ID). • Can be categorized as tuple-based or page-based. • Cache Replacement Policies: • LRU. • MRU.

  6. Client Caching Architecture (Contd.) • Data-Shipping. • Problem. • Applications require associative access to data, such as that provided by relational query languages.

  7. Client Caching Architecture (Contd.) • Query-Shipping. • Associative access to data. • Problems. • Implementations do not support client caching. (No caching).

  8. Client Caching Architecture (Contd.) • Semantic Caching. • A model that integrates support for associative access into an architecture based on data-shipping. • Advantage. • Exploit the semantic information to effectively manage client cache.

  9. Client Caching Architecture (Contd.) • Semantic Caching. • Uses a semantic description of the cached data rather than record IDs or page IDs. • The description can be used to generate a remainder query to send to the server if the requested tuples are not available locally. • Information for replacement is maintained per semantic region. • Low overhead, insensitive to bad clustering. • Cache replacement uses a value function based on the semantic description, not just LRU or MRU.

  10. Client Caching Architecture (Contd.)

  11. Model of Semantic Caching • Remainder Query • Semantic Regions • Replacement Issues

  12. Remainder Query • Relation Re, query Q, client cache V. • Probe query P(Q,V) = Q ∧ V can be answered locally. • Remainder query R(Q,V) = Q ∧ (¬V) should be sent to the server. • Example: • Select * from E where salary < 60,000 and salary > 30,000. • The client caches all tuples with salary < 50,000. • Q = (salary < 60,000) ∧ (salary > 30,000). • V = (salary < 50,000). • P = (salary < 50,000) ∧ (salary > 30,000). • R = (salary >= 50,000) ∧ (salary < 60,000). • Figure labels: P, R, Re, V, Q.
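The probe and remainder split on this slide can be sketched for the single-attribute case. This is a minimal illustration and not the paper's implementation: the `Interval` class and the `probe` and `remainder` helpers are hypothetical names, predicates are assumed to be half-open ranges over one attribute, and salaries are assumed to be integers so that `salary > 30,000` becomes a lower bound of 30001.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interval:
    """Half-open range lo <= x < hi over one attribute (hypothetical helper)."""
    lo: float
    hi: float

    def is_empty(self) -> bool:
        return self.lo >= self.hi


def probe(q: Interval, v: Interval) -> Optional[Interval]:
    """P(Q, V) = Q AND V: the part of the query the cache can answer."""
    p = Interval(max(q.lo, v.lo), min(q.hi, v.hi))
    return None if p.is_empty() else p


def remainder(q: Interval, v: Interval) -> List[Interval]:
    """R(Q, V) = Q AND NOT V: the parts that must be fetched from the server."""
    parts = [
        Interval(q.lo, min(q.hi, v.lo)),  # portion of Q below the cached range
        Interval(max(q.lo, v.hi), q.hi),  # portion of Q above the cached range
    ]
    return [p for p in parts if not p.is_empty()]


# Slide example, assuming integer salaries.
q = Interval(30001, 60000)               # 30,000 < salary < 60,000
v = Interval(float("-inf"), 50000)       # cached: salary < 50,000
print(probe(q, v))                       # answered locally
print(remainder(q, v))                   # sent to the server
```

With several attributes the remainder is in general a union of boxes rather than at most two ranges, which this one-dimensional sketch does not cover.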

  13. Semantic Regions • Cache management and replacement unit. • Grouped by semantic value. Each semantic region has a single replacement value. • Described by a constraint formula. • Consideration: • Semantic region merge. • Figure: (a) Original regions, (b) Regions after Q.

  14. Semantic Regions • Cache management and replacement unit. • Grouped by semantic value. Each semantic region has a single replacement value. • Described by a constraint formula. • Consideration: • Semantic region merge (always merge). • Figure: (a) Original regions, (b) Regions after Q.
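One way to picture the "always merge" policy is the one-dimensional sketch below. It rests on assumptions the slides do not spell out: regions are half-open integer ranges, the part of an old region that overlaps the query is carved out and merged into a single new region covering the query, and the replacement value is a logical clock tick. `Region` and `apply_query` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    lo: int     # inclusive lower bound of the constraint
    hi: int     # exclusive upper bound of the constraint
    value: int  # single replacement value shared by all tuples in the region


def apply_query(regions: List[Region], q_lo: int, q_hi: int, clock: int) -> List[Region]:
    """Update the region list after a query over [q_lo, q_hi).

    Assumed behaviour: parts of old regions outside the query keep their old
    replacement value; everything inside the query is merged into one new
    region that carries the current clock value.
    """
    updated: List[Region] = []
    for r in regions:
        if r.hi <= q_lo or r.lo >= q_hi:
            updated.append(r)  # no overlap with the query: unchanged
            continue
        if r.lo < q_lo:
            updated.append(Region(r.lo, q_lo, r.value))  # left leftover
        if r.hi > q_hi:
            updated.append(Region(q_hi, r.hi, r.value))  # right leftover
    updated.append(Region(q_lo, q_hi, clock))  # merged region for the query
    return updated


cache = [Region(0, 50, value=1)]
cache = apply_query(cache, 30, 60, clock=2)
print(cache)
```

Merging keeps the number of regions small, at the cost of giving the whole merged region one replacement value even though part of it was already cached.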

  15. Replacement Issues • Temporal locality • LRU, MRU

  16. Replacement Issues (Contd.) • Semantic locality • Manhattan distance • (Note) Manhattan distance definition: the distance between two points measured along axes at right angles. In a plane with p1 at (x1, y1) and p2 at (x2, y2), it is |x1 - x2| + |y1 - y2|. • Figure: |p1 p2| = |p1 O| + |p2 O|, with points p1, p2, and O.
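The definition above translates directly into code. The sketch below also shows one possible way to use it for replacement, under the assumption that each region is summarized by a center point and that the region farthest from the most recent query is evicted first; the slides do not specify these details, and `manhattan` and `pick_victim` are hypothetical names.

```python
from typing import Dict, Sequence


def manhattan(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Sum of absolute coordinate differences: |x1 - x2| + |y1 - y2| + ..."""
    return sum(abs(a - b) for a, b in zip(p1, p2))


def pick_victim(region_centers: Dict[str, Sequence[float]],
                query_center: Sequence[float]) -> str:
    """Return the name of the region whose center is farthest from the query.

    Assumption: a larger distance means a lower replacement value, so the
    farthest region is the first candidate for eviction.
    """
    return max(region_centers,
               key=lambda name: manhattan(region_centers[name], query_center))


centers = {"near": (0.0, 0.0), "far": (10.0, 10.0)}
print(manhattan((1, 2), (4, 6)))
print(pick_victim(centers, query_center=(1.0, 1.0)))
```

Unlike LRU, this value depends on where a region lies relative to the current query rather than on when it was last touched.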

  17. Simulation and Result • The relation has three candidate keys: Unique2 is indexed and clustered, Unique1 is indexed and unclustered, and Unique3 is unindexed and unclustered.

  18. Simulation and Result (Contd.) • Unique2 (Clustered Index). • Performance: • Almost the same. • Page-based is slightly better. • Reason: • Page-based overhead is smaller.

  19. Simulation and Result (Contd.) • Unique1 (Unclustered Index). • Performance: • Tuple-based and semantic-based are much better. • Reason: • Page-based is sensitive to clustering.

  20. Simulation and Result (Contd.) • Unique3 (Unindexed and Unclustered). • Performance: • Semantic-based is better. • Reason: • The remainder query enables the client and the server to process the query in parallel.

  21. Simulation and Result (Contd.) • Semantic locality / Manhattan distance on Unique1. • Performance: • Manhattan distance is better than LRU. • Reason: • “Cold regions” will be replaced faster.

  22. Conclusion and Future Work • Conclusion. • For a simple model with selection queries, semantic caching provides better performance. • Future work. • Implementation issues for complex queries, updates, deletions, and insertions: • Concurrency control. • Consistency. • Completeness. • A Predicate-based caching scheme for client-server database architecture. (Arthur M. Keller and Julie Basu)
