LRU-K Page Replacement Algorithm CSCI 485 Lecture notes Instructor: Prof. Shahram Ghandeharizadeh.
Outline • History • Motivation for LRU-K • Alternatives to LRU-K • LRU-K • Design and implementation • Conclusion
History • LRU-K is attributed to Elizabeth J. O'Neil, Patrick E. O'Neil, and Gerhard Weikum: • The LRU-K Page Replacement Algorithm for Database Disk Buffering, ACM SIGMOD 1993, Washington D.C., pages 297-306.
Least Recently Used (LRU) • When a new buffer frame is needed, the buffer pool manager drops the page that has not been accessed for the longest time. • Originally developed for patterns of use in instruction logic (Denning 1968). • Limitation: it decides which page to drop using only the time of the most recent reference, which is too little information.
Pseudo-code for LRU
LRU(page p)
If p is in the buffer then
LAST(p) = current time
Else
i) min = current time + 1
ii) For each page q in the buffer do
If LAST(q) < min then
victim = q
min = LAST(q)
iii) If victim is dirty then flush it to disk
iv) Fetch p into the buffer frame held by victim
v) LAST(p) = current time
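The pseudo-code above can be sketched in runnable form as follows. This is a minimal illustration, not the paper's code: the class name, the `reference` method, and the hit/miss return value are assumptions made for the example.

```python
# A minimal LRU buffer-pool sketch following the pseudo-code above.
# Frame count, page naming, and the return convention are illustrative.

class LRUBuffer:
    def __init__(self, frames):
        self.frames = frames      # number of buffer frames
        self.last = {}            # LAST(p): time of most recent reference
        self.clock = 0            # logical time, advanced on every reference
        self.io = 0               # count of disk fetches (misses)

    def reference(self, p):
        """Reference page p; return True on a buffer hit, False on a miss."""
        self.clock += 1
        if p in self.last:        # hit: just update LAST(p)
            self.last[p] = self.clock
            return True
        self.io += 1              # miss: fetch p from disk
        if len(self.last) >= self.frames:
            # victim = resident page with the smallest LAST value
            victim = min(self.last, key=self.last.get)
            del self.last[victim] # (a dirty victim would be flushed here)
        self.last[p] = self.clock
        return False
```

For example, with two frames, referencing a, b, a, c evicts b (the least recently used page), so a subsequent reference to b misses while a still hits.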
Example 1: LRU Limitation • Consider a non-clustered, primary B-tree index on the SS# attribute of the Employee table. • t(Emp) = 20,000 • P(Emp) = 10,000 (2 records per disk page) • lp(I, Emp) = 100 • Workload: queries that retrieve Emp records using exact-match predicates on the SS# attribute, e.g., SS# = 940-98-7555. • If the B-tree is one level deep (root node followed by 100 leaf pages), the pattern of access is: Ir, I1, D1, Ir, I2, D2, Ir, I3, D3, …. • Assume the buffer pool consists of 101 frames; what is the ideal way to assign leaf pages and data pages to these frames?
What will LRU do? (Same pseudo-code as above.) • In our example, data pages compete with the leaf pages and swap them out of the buffer, causing more disk I/O than necessary.
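A quick simulation makes the effect concrete. This is an illustrative sketch with assumed helper names and an assumed uniform query mix over the data pages; it runs the Ir, Ii, Di pattern from Example 1 through a plain LRU buffer of 101 frames and counts misses.

```python
# Simulating Example 1 with plain LRU: 1 root page, 100 leaf pages, and
# 10,000 data pages competing for 101 frames. Names and mix are illustrative.
import random

random.seed(1)
FRAMES, LEAVES, DATA = 101, 100, 10_000

def run_lru(n_queries):
    last, clock, misses = {}, 0, 0

    def ref(p):
        nonlocal clock, misses
        clock += 1
        if p in last:
            last[p] = clock
            return
        misses += 1
        if len(last) >= FRAMES:
            victim = min(last, key=last.get)  # least recently used page
            del last[victim]
        last[p] = clock

    for _ in range(n_queries):
        d = random.randrange(DATA)             # random Emp data page
        ref("root")                            # Ir
        ref(f"leaf{d // (DATA // LEAVES)}")    # Ii: leaf covering that page
        ref(f"data{d}")                        # Di
    return misses

misses = run_lru(5_000)
print(misses)  # leaf references keep missing because data pages evict them
```

The root page survives (it is referenced on every query), but the 100 leaf pages are repeatedly pushed out by the stream of once-referenced data pages, so the miss count far exceeds what a buffer that pinned the index pages would incur.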
Example 2: LRU Limitation • A banking application with good locality of shared page references, e.g., 5,000 buffered pages out of one million disk pages receive 95% of the references. • Once a few batch processes begin sequential scans through all one million pages, the scanned pages evict the 5,000 hot buffered pages.
Possible Approaches • Page pool tuning • Query execution plan analysis • LRU-K
Page pool tuning • The DBA constructs separate buffer pools, assigning different reference patterns to different pools. • Disadvantages: • requires human effort; • what happens when new reference patterns are introduced, or existing reference patterns disappear?
Query execution plan analysis • The query optimizer provides hints about the access pattern of each query plan: • the buffer pool manager employs FIFO for pages retrieved by a sequential scan; • the buffer pool manager employs LRU for index pages. • In multi-user settings, query plans may overlap in complicated ways. • What happens with Example 1?
LRU-K • The victim page (the page to be dropped) is the one whose backward K-distance is the maximum over all pages in the buffer. • Definition of backward K-distance bt(p,K): given a reference string known up to time t (r1, r2, …, rt), bt(p,K) is the distance backward from t to the Kth most recent reference to page p; if p has been referenced fewer than K times, bt(p,K) is infinite.
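Under this definition, victim selection can be sketched directly. The function names below are illustrative; a real implementation would maintain incremental history rather than scanning the whole reference string.

```python
# Backward K-distance: bt(p, K) = t minus the time of the K-th most recent
# reference to p, or infinity if p has fewer than K references so far.

def backward_k_distance(refs, p, K):
    """refs is the reference string r_1 .. r_t (refs[i] occurs at time i+1)."""
    times = [i + 1 for i, q in enumerate(refs) if q == p]
    if len(times) < K:
        return float("inf")       # fewer than K references: infinite distance
    t = len(refs)
    return t - times[-K]          # distance back to the K-th most recent ref

def lru_k_victim(refs, buffer_pages, K):
    # Drop the resident page whose backward K-distance is the maximum.
    return max(buffer_pages, key=lambda p: backward_k_distance(refs, p, K))
```

For refs = a, b, a, c at time t = 4: bt(a,2) = 3, while bt(b,2) is infinite because b has only one reference, so with K = 2 the page b is the victim. (The paper breaks ties among infinite-distance pages with a subsidiary policy such as LRU; this sketch just takes the first maximum.)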
LRU-K (Cont…) • Design limitations: • Early page replacement: a page may receive correlated references shortly after being referenced for the first time, and once the burst ends it is dropped too early. • Extra memory: LRU-K retains the reference history of pages even after they leave the buffer. LRU does not have this limitation; its memory requirement is well defined.
Key observation • Two correlated references are insufficient evidence that independent references will follow. • One solution: the system should not drop a page immediately after its first reference. Instead, it should keep the page around for a short period, until the likelihood of a correlated follow-up reference is minimal; then the page can be dropped. Moreover, correlated references should not distort the interarrival times between requests as observed by LRU-K. • This correlated reference period acts as a timeout.
Memory required by LRU-K • Why not keep the last K references in the header of each disk page (instead of in main memory)? • After all, when the page is memory resident, its last K references are available. • The catch: LRU-K needs the history of pages that are not currently in the buffer, and a non-resident page's header sits out on disk.
Memory required by LRU-K • Forget the history of pages using the five-minute rule: pages that have not been referenced during the last five minutes lose their history.
Pseudo-code of LRU-K • HIST and LAST are main-memory data structures. • Optimization: use a tree search to find the page with the maximum backward K-distance.
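Since the pseudo-code slide itself is not reproduced in these notes, here is a sketch of the HIST/LAST bookkeeping in Python. The frame count, K, and the correlated-reference period are illustrative values, and the class layout is an assumption; the paper's algorithm additionally bounds retained history, which this sketch omits.

```python
# Sketch of LRU-K bookkeeping: HIST(p) holds up to K reference times
# (correlated references collapsed into one), LAST(p) the most recent one.
FRAMES = 3                # illustrative buffer size
K = 2
CORRELATED_PERIOD = 1     # "timeout": references this close are correlated

class LRUK:
    def __init__(self):
        self.hist = {}        # HIST(p): up to K reference times, newest first
        self.last = {}        # LAST(p): time of most recent reference
        self.resident = []    # pages currently holding buffer frames
        self.t = 0

    def reference(self, p):
        self.t += 1
        if p in self.last and self.t - self.last[p] <= CORRELATED_PERIOD:
            # Correlated reference: collapse it into the previous one by
            # sliding HIST(p) forward instead of recording a new entry.
            gap = self.t - self.last[p]
            self.hist[p] = [h + gap for h in self.hist[p]]
        else:
            # Uncorrelated reference: push a new entry into HIST(p).
            self.hist[p] = ([self.t] + self.hist.get(p, []))[:K]
        self.last[p] = self.t
        if p not in self.resident:
            if len(self.resident) >= FRAMES:
                self.resident.remove(self._victim())
            self.resident.append(p)

    def _victim(self):
        # Pages still inside their correlated period are protected victims.
        eligible = [q for q in self.resident
                    if self.t - self.last[q] > CORRELATED_PERIOD]

        def bkd(q):           # backward K-distance (infinite if < K refs)
            h = self.hist[q]
            return self.t - h[K - 1] if len(h) >= K else float("inf")

        return max(eligible or self.resident, key=bkd)
```

Referencing a, b, a, b, c, d with these settings evicts a: its second-most-recent reference is the oldest, while the just-arrived c is protected by the correlated period despite having an infinite backward K-distance.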
Performance Analysis • Compare LRU (LRU-1) with LRU-2 and LRU-3. • Three different workloads. • Measured metrics: • cache hit ratio for a given buffer pool size; • how much larger the buffer pool with LRU-1 must be in order to perform the same as LRU-2, represented as B(1)/B(2).
Workload 1: Two Pool Experiments • Designed to resemble Example 1 shown earlier. • Two pools of disk pages: N1 and N2. • Alternate references to each pool; a page within pool Ni is referenced uniformly at random. • What is the probability of reference to a page in pool N1?
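With alternating pool references and uniform choice within a pool, each reference goes to pool i with probability 1/2 and then to one of that pool's Ni pages, so a given page in pool Ni is referenced with probability 1/(2·Ni). The pool sizes below are illustrative, not the paper's exact values:

```python
# Per-page reference probability in the two-pool workload: each reference
# goes to pool 1 or pool 2 alternately (probability 1/2 per pool), then
# uniformly to one of that pool's pages. Pool sizes are illustrative.
N1, N2 = 100, 10_000

p1 = 1 / (2 * N1)   # probability a given pool-1 page is the next reference
p2 = 1 / (2 * N2)   # probability a given pool-2 page is the next reference
print(p1, p2, p1 / p2)
```

With these sizes each pool-1 page is referenced 100 times as often as each pool-2 page, which is exactly the frequency gap LRU-2 can detect (via the second-most-recent reference) and plain LRU cannot.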
Key Observations • LRU-3 is identical, or very close, to optimal. • Why would one not choose K = 3? • For evolving access patterns, LRU-3 is less adaptive than LRU-2 because it needs more references to track changes in reference frequency. • LRU-3 requires a larger number of requests to forget the past. • Recommendation: LRU-2 as a generally efficient policy.
Workload 2: Zipfian random access • 1000 pages accessed using a Zipfian distribution of access.
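A Zipfian reference generator for such a workload can be sketched as follows; the skew exponent and the way it is parameterized here are assumptions for illustration and may differ from the paper's exact setup.

```python
# Draw page references from a Zipf-like distribution over 1,000 pages:
# page i is referenced with probability proportional to 1 / i**a.
import random

random.seed(0)
PAGES, A = 1000, 0.8                 # illustrative skew parameter

weights = [1 / (i ** A) for i in range(1, PAGES + 1)]

def zipf_ref():
    # random.choices samples with replacement according to the weights.
    return random.choices(range(1, PAGES + 1), weights=weights)[0]

refs = [zipf_ref() for _ in range(10_000)]
hot = sum(1 for r in refs if r <= PAGES // 10)
print(hot / len(refs))               # share of references to the top 10% of pages
```

With this skew, roughly half of all references land on the hottest 10% of pages, which is the locality a replacement policy must exploit. (For long runs, precomputing cumulative weights once would be faster than recomputing them inside `random.choices` on every call.)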
Workload 3: Trace driven • Traces gathered for one hour from an OLTP system used by a large bank. • Number of unique page references: 470,000. • Key observation: LRU-2 is superior to both LRU and LFU.
Workload 3 • LRU-2 is superior to both LFU and LRU. • With small buffer sizes (< 600 frames), LRU-2 improved the buffer hit ratio by more than a factor of 2. • LFU is surprisingly good. Why not LFU? • LFU never forgets previous references when comparing the priorities of pages; hence, it cannot adapt to evolving access patterns.
LRU-K • Advantages: • Discriminates well between page sets with different levels of reference frequency, e.g., index versus data pages (Example 1). • Detects locality of reference within query executions, across multiple queries in the same transaction, and across multiple transactions executing simultaneously. • Does not require external hints. • Fairly simple and incurs little bookkeeping overhead.