
Cache Tables: Paving the way for an Adaptive Database Cache

Cache Tables: Paving the way for an Adaptive Database Cache. Mehmet Altınel, Christof Bornhövd, C. Mohan, Hamid Pirahesh, Berthold Reinwald (IBM Almaden Research Center), Sailesh Krishnamurthy (Computer Science Division, UC Berkeley). Presented by: Umar Farooq Minhas, October 04, 2006.



Presentation Transcript


  1. Cache Tables: Paving the way for an Adaptive Database Cache • Mehmet Altınel, Christof Bornhövd, C. Mohan, Hamid Pirahesh, Berthold Reinwald (IBM Almaden Research Center) • Sailesh Krishnamurthy (Computer Science Division, UC Berkeley) • Presented by: Umar Farooq Minhas • October 04, 2006

  2. Motivation • Issues • Response time • Scalability • Wide-spread use of Transactional Web Applications (TWA) in enterprise applications • Broad range of components e.g. network load balancers, HTTP servers, application servers, … , databases etc. • Solutions • Caching of static HTML pages • Multiple level caches

  3. Motivation contd.. • Static Caching, Drawbacks • TWAs tend to be more & more dynamic • High volumes of data • Highly personalized contents • Run business logic in remote application servers close to end users • Reduced response time • Reduced load on in-house systems • Benefits are limited by the frequency with which remote server needs to access backend DB • Proposed Solution: DBCache • Allows DB caching at mid-tier nodes, remote data centers and edge servers

  4. DBCache: Overview • Built using a full-fledged DBMS, DB2 • Reduced development effort • Allows caching of related DB objects • Triggers, constraints, indices, stored procedures, … • Makes use of existing distributed query execution • Provides cache transparency • Supports both full-table and partial-table caching • On-demand caching • Adapts to dynamically changing loads • Exploits typical characteristics of TWA queries

  5. DBCache: Contributions • Database cache model • Introduces a new DB object ‘Cache Table’ • Dynamic/static caching support • Novel query re-write scheme • Cache load and maintenance mechanisms

  6. Outline • Motivation • DBCache: Overview • Cache Tables • Dynamic Cache Model • Query Compilation • Cache Table Population and Maintenance • Performance Evaluation • Conclusions & Future Work • Discussion

  7. Cache Tables • A Cache Table is a database object by which an end user can specify that a table (cache table) in a database (cache database) is a cache of a table (backend table) in another database (backend database) • [Figure: a backend table in the backend DB and its cache table in the cache DB] • Two types of cache tables supported: • Declarative/Static Cache tables • Dynamic Cache tables

  8. Declarative/Static Cache Tables • When table contents are static and known upfront • Use declarative cache tables • Similar to materialized views • Entire table is cached in the absence of a predicate definition • Exploits existing materialized view support in DB2

  9. Dynamic Cache Tables • Populated on-demand • Provides adaptability • Can choose to cache only “hot” items

  10. DBCache Schema Setup • Cache schema exact mirror of backend DB schema • Each backend DB table represented by • Cache Table or • Nickname (caching disabled) • Requires no change in existing queries • Allows caching of other relevant logical and physical objects

  11. Outline • Motivation • DBCache: Overview • Cache Tables • Dynamic Cache Model • Query Compilation • Cache Table Population and Maintenance • Performance Evaluation • Conclusions & Future Work • Discussion

  12. Dynamic Cache Model • Key concepts • Cache Keys • Defined on cache table column • Can be non-unique • Must be ‘domain-complete’ • Unique/Primary key columns complete by definition • Guarantees correctness of equality predicates
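
The slide names domain completeness without defining it. A minimal sketch follows, assuming the usual reading: a cache table column is domain-complete if, for every value of that column present in the cache, all backend rows carrying that value are also in the cache. The function name and the list-of-dicts row representation are illustrative assumptions, not part of DBCache.

```python
def _freeze(row):
    """Turn a dict row into a hashable, order-independent tuple."""
    return tuple(sorted(row.items()))


def is_domain_complete(cache_rows, backend_rows, column):
    """Check the assumed domain-completeness property for one column.

    For every value of `column` that appears in the cache, every backend
    row with the same value must also be cached.
    """
    cached_values = {row[column] for row in cache_rows}
    cached = {_freeze(row) for row in cache_rows}
    for row in backend_rows:
        if row[column] in cached_values and _freeze(row) not in cached:
            return False
    return True


# Toy backend table: "id" is unique, "city" is not.
backend = [
    {"id": 1, "city": "SJ"},
    {"id": 2, "city": "SJ"},
    {"id": 3, "city": "NY"},
]
partial_cache = [backend[0]]
print(is_domain_complete(partial_cache, backend, "id"))
print(is_domain_complete(partial_cache, backend, "city"))
```

Under this reading a unique column is trivially complete, since each value identifies at most one row, which matches the slide's remark about unique and primary key columns. It also shows why an equality predicate such as `city = 'SJ'` can only be answered locally once every backend row with that value has been cached.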

  13. Dynamic Cache Model • Key concepts contd.. • Referential Cache Constraints (RCCs) • Defined between any columns of two cache tables • Creates a cache-parent/cache-child relationship • Guarantees the correctness of equi-join predicates • Somewhat similar to referential integrity constraints
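
A hedged sketch of what an RCC could be taken to guarantee, assuming this reading: for every value of the parent column present in the cache-parent table, all backend rows of the child table with a matching child column value are present in the cache-child table. The function name, argument names, and the toy tables are hypothetical.

```python
def _freeze(row):
    """Turn a dict row into a hashable, order-independent tuple."""
    return tuple(sorted(row.items()))


def rcc_holds(parent_cache, child_cache, child_backend, parent_col, child_col):
    """Check the assumed RCC property between a cache-parent and a cache-child.

    Every backend child row whose `child_col` value occurs in the cached
    parent's `parent_col` must also be present in the child cache.
    """
    parent_values = {row[parent_col] for row in parent_cache}
    cached_children = {_freeze(row) for row in child_cache}
    return all(
        _freeze(row) in cached_children
        for row in child_backend
        if row[child_col] in parent_values
    )


# Toy example: customer 1 is cached, so both of its orders must be cached too.
customers_cache = [{"cust_id": 1}]
orders_backend = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 1},
    {"order_id": 12, "cust_id": 2},
]
print(rcc_holds(customers_cache, orders_backend[:2], orders_backend, "cust_id", "cust_id"))
```

With that property in place, an equi-join between the cached customer and its orders returns the same rows locally as it would at the backend, which is the correctness guarantee the slide refers to.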

  14. Dynamic Cache Model • Key concepts contd.. • Cache Groups • Set of related cache tables whose content is (directly or transitively) populated by the values of one or more cache keys of a single cache table, called the root table. • Tables reachable by RCCs from the root table are called member tables • Advantages • Application context recognized more easily • Helps avoid conflicting cache constraints

  15. Dynamic Cache Model • Key concepts contd.. • Cache Groups contd.. • Represented by a directed graph called cache group graph, nodes denote cache tables and edges denote RCCs • Direction of an edge for RCC is from a cache-parent to a cache-child • Bi-directional edges possible • Two or more groups can be overlapping • Captured in connectivity graphs
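
The cache group graph described above can be sketched as a plain directed graph in which member tables are those reachable from the root along RCC edges. The table names and the `(parent, child)` edge representation below are illustrative assumptions.

```python
from collections import deque


def member_tables(rccs, root):
    """Return the tables reachable from `root` via RCC edges.

    `rccs` is an iterable of (cache_parent, cache_child) pairs; a
    bi-directional RCC is simply listed in both directions.
    """
    graph = {}
    for parent, child in rccs:
        graph.setdefault(parent, set()).add(child)

    seen = {root}
    queue = deque([root])
    while queue:
        table = queue.popleft()
        for child in graph.get(table, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen - {root}


# Hypothetical schema: customer -> orders -> lineitem, plus part -> lineitem.
rccs = [("customer", "orders"), ("orders", "lineitem"), ("part", "lineitem")]
print(sorted(member_tables(rccs, "customer")))
print(sorted(member_tables(rccs, "part")))
```

In this toy schema the groups rooted at `customer` and `part` both contain `lineitem`, which is one way two cache groups can overlap.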

  16. Dynamic Cache Model • Issues with Cache Constraints • Can cause unexpected cache loads, resulting in a phenomenon called the recursive cache load problem • A cache group is called safe if it avoids this problem • How to ensure group safety?

  17. Dynamic Cache Model • Rules for cache group safety • Rule-1: A cache group graph must not include any heterogeneous cycles. • Rule-2: A cache table must not have more than one non-unique domain-complete column. • A new cache constraint is created only if it doesn’t violate Rule 1 and Rule 2.
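
Rule 2 is a simple counting check, sketched below. The `Column` record and its flags are hypothetical, and how a column comes to be marked domain-complete is left outside the sketch. Rule 1 is not covered, since the slide does not define what makes a cycle heterogeneous.

```python
from typing import NamedTuple


class Column(NamedTuple):
    """Illustrative description of one cache table column."""
    name: str
    unique: bool
    domain_complete: bool


def satisfies_rule_2(columns):
    """Rule 2: at most one non-unique domain-complete column per cache table."""
    offenders = [c.name for c in columns if c.domain_complete and not c.unique]
    return len(offenders) <= 1


orders = [
    Column("order_id", unique=True, domain_complete=True),
    Column("cust_id", unique=False, domain_complete=True),
    Column("status", unique=False, domain_complete=False),
]
print(satisfies_rule_2(orders))

# Marking a second non-unique column as domain-complete would break the rule.
orders_bad = orders[:2] + [Column("status", unique=False, domain_complete=True)]
print(satisfies_rule_2(orders_bad))
```

A constraint-creation routine could run a check like this on the would-be table definition and reject the new cache key or RCC when it fails, in line with the last bullet of the slide.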

  18. Outline • Motivation • DBCache: Overview • Cache Tables • Dynamic Cache Model • Query Compilation • Cache Table Population and Maintenance • Performance Evaluation • Conclusions & Future Work • Discussion

  19. Query Compilation • Declarative Cache Tables • Existing materialized view matching mechanism in DB2 is exploited • Name switching • Dynamic Cache Tables • Generate two plans: a local plan and a remote plan • Choose at run-time through a switch operator, which uses the probe query to decide which leg to execute • Janus (two-headed) plan: named after Janus in Roman mythology • God of gates, doors, doorways, beginnings and endings. Month of January? http://en.wikipedia.org/wiki/Janus_%28mythology%29

  20. Query Compilation • Constructing a Janus Plan: • Step 1: Derive the remote query plan from the initial query plan by replacing cache table names with nicknames • Step 2: Generate a probe query by checking all equality predicates that can potentially participate in the probe query condition; if none is found, abort (the remote query plan gets executed) • Step 3: Derive the local query plan from a cloned input query graph by replacing nicknames with the eligible cache table names from step 2 • Step 4: Insert a switch operator on top of the remote, local and probe query plans
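
The run-time behaviour of a Janus plan can be illustrated with a toy model: a probe checks whether the requested cache key value is already loaded, and a switch picks the local or remote leg accordingly. This is only a sketch of the idea in plain Python, not DB2's operator; `ToyCache`, `janus_execute`, and the in-memory row lists are assumptions made for illustration.

```python
class ToyCache:
    """Minimal stand-in for a dynamic cache table with one cache key column."""

    def __init__(self):
        self.loaded_keys = set()  # cache key values whose rows are fully loaded
        self.rows = []            # cached rows (dicts)


def janus_execute(cache, backend_rows, column, value):
    """Evaluate `SELECT * WHERE column = value` through a two-legged plan.

    Returns (leg, rows), where leg is "local" or "remote".
    """
    # Probe query: is this cache key value present in the cache?
    probe_hit = value in cache.loaded_keys

    # Switch operator: choose which leg to execute based on the probe result.
    if probe_hit:
        return "local", [r for r in cache.rows if r[column] == value]
    return "remote", [r for r in backend_rows if r[column] == value]


backend = [{"id": 1, "city": "SJ"}, {"id": 2, "city": "SJ"}, {"id": 3, "city": "NY"}]
cache = ToyCache()
print(janus_execute(cache, backend, "city", "SJ")[0])

# Simulate the value having been loaded into the cache.
cache.rows.extend(r for r in backend if r["city"] == "SJ")
cache.loaded_keys.add("SJ")
print(janus_execute(cache, backend, "city", "SJ")[0])
```

Both legs return the same rows for a given predicate, so the application sees no difference apart from latency, which is the transparency property claimed for DBCache.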

  21. Outline • Motivation • DBCache: Overview • Cache Tables • Dynamic Cache Model • Query Compilation • Cache Table Population and Maintenance • Performance Evaluation • Conclusions & Future Work • Discussion

  22. Cache Table Population & Maintenance • Declarative Cache Tables • Relies on DPropR utility: IBM’s asynchronous data replication tool • Dynamic Cache Tables • On-demand loading • Cache key values failing probe query are used to extract data • Extracted data populated asynchronously by a cache daemon • Cache invalidation • Generate invalidation messages and send to cache daemon • Cache daemon generates and executes deletes against cacheDB • Updated rows get loaded with new requests
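
The on-demand loading and invalidation flow for dynamic cache tables can be sketched as a small in-memory model. It assumes a single cache key column and treats the cache daemon as an object with a pending-miss queue; the class and method names are hypothetical, and "asynchronous" is modelled simply by deferring work until `process_pending` is called.

```python
from collections import deque


class ToyCacheDaemon:
    """Toy model of a cache daemon serving one dynamic cache table."""

    def __init__(self, backend_rows, column):
        self.backend_rows = backend_rows  # stand-in for the backend table
        self.column = column              # cache key column
        self.cache_rows = []
        self.loaded_keys = set()
        self.pending = deque()            # key values that failed the probe

    def report_miss(self, value):
        """Called when a probe query fails for `value`; loading is deferred."""
        self.pending.append(value)

    def process_pending(self):
        """Load all backend rows for each queued cache key value."""
        while self.pending:
            value = self.pending.popleft()
            if value in self.loaded_keys:
                continue
            self.cache_rows.extend(
                dict(r) for r in self.backend_rows if r[self.column] == value
            )
            self.loaded_keys.add(value)

    def invalidate(self, value):
        """Handle an invalidation message by deleting the affected rows."""
        self.cache_rows = [r for r in self.cache_rows if r[self.column] != value]
        self.loaded_keys.discard(value)


backend = [{"id": 1, "city": "SJ", "qty": 5}, {"id": 2, "city": "NY", "qty": 7}]
daemon = ToyCacheDaemon(backend, "city")
daemon.report_miss("SJ")      # a query missed; it is answered remotely meanwhile
daemon.process_pending()      # the daemon later populates the cache
backend[0]["qty"] = 6         # backend update ...
daemon.invalidate("SJ")       # ... triggers deletion of the stale rows
daemon.report_miss("SJ")      # the next request reloads the updated rows
daemon.process_pending()
print(daemon.cache_rows)
```

Deleting on invalidation and reloading on the next miss keeps the daemon simple, at the cost of an extra remote round trip after each backend update.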

  23. Outline • Motivation • DBCache: Overview • Cache Tables • Dynamic Cache Model • Query Compilation • Cache Table Population and Maintenance • Performance Evaluation • Conclusions & Future Work • Discussion

  24. Performance Evaluation • Focus: Evaluate overhead of Janus plans for dynamic tables • Overhead of probe query and switch operator • Overhead of on-demand loading • Experimental settings

  25. Performance Evaluation • Cache Hit Case • Janus plan vs. pure local queries • Difference gives the overhead for probe query and the switch operator • Cache table loaded with all the data from backend table

  26. Performance Evaluation • Cache Miss Case • Janus plan vs. pure remote queries • Difference gives the overhead • Cache table initially empty

  27. Outline • Motivation • DBCache: Overview • Cache Tables • Dynamic Cache Model • Query Compilation • Cache Table Population and Maintenance • Performance Evaluation • Conclusions & Future Work • Discussion

  28. Conclusions & Future Work • Significant contributions • Provides a new framework to implement DB caching for TWAs, which: • Integrates seamlessly with current applications • Supports static/dynamic cache tables • Adapts to the changing workloads in TWAs • Re-uses the functionality of a full-fledged DBMS, i.e. DB2 • What next? • Provide efficient, scalable, zero-admin DBCache • Development of new tools to ease deployment • Improve adaptability and maintenance

  29. Comparison • vs. amco05: • Relies on asynchronous data propagation utility • Not completely transparent • May not work for heterogeneous DBMSs • Allows stale data • vs. gula04: • Cache constraints against C&C constraints • Doesn’t provide any guarantees of freshness/consistency • Relatively more transparent • Maintenance-centric vs. query-centric • Both deployed as mid-tier level caches • Both use a full-fledged DBMS • Both use materialized views • Both use two-headed query plans

  30. Discussion • Is it really that good ? • Using full-fledged DBMS at each middle-tier node, drawbacks ? • How is data freshness specified/guaranteed ? • Is it adaptable ? Weakly ? Strongly ? • When can cache constraints become bottleneck ? • Size of dynamic cache tables ? • Cache replacement policies/cleansing mechanisms? • Caching of other physical & logical DB Objects ? • Updates to those objects in backend DB? • Message traffic between Cache Daemon & Backend DB ? • Very frequent updates in backend DB • Local updates ? • Flaws in performance evaluation ?
