
Memory Resource Allocation for File System Prefetching -- From a Supply Chain Management Perspective


Presentation Transcript


  1. Memory Resource Allocation for File System Prefetching -- From a Supply Chain Management Perspective
  Zhe Zhang (NCSU), Amit Kulkarni (NCSU), Xiaosong Ma (NCSU/ORNL), Yuanyuan Zhou (UIUC)

  2. Aggressive prefetching: an idea whose time has come*
  • Widening processor-I/O gap
    • Processing power doubling every 18 to 24 months
    • Disparity between the growth of disk latency and throughput: latency improving 10% per year while throughput improves 40% per year [Hennessy 03]
  • Large memory cache sizes
    • Usually 0.05% ~ 0.2% of storage capacity [Hsu 04]
  * [Papathanasiou 05]

  3. … and whose challenges follow
  • Systems facing large numbers of concurrent requests (e.g., top-ranked websites such as Facebook)
  • Servers handling large numbers of clients (e.g., Jaguar @ Oak Ridge National Lab: 11,000 compute nodes, 72 Lustre I/O nodes, 18 DDN S2A9500 couplets)
  • Key question: how to manage file systems' memory resources for aggressive prefetching?

  4. All streams are not created equal
  • Allocating memory resources according to access rate? (e.g., MP3: 128 kbps, YouTube: 200 kbps, YouTube HQ: 900 kbps)
  • Related work
    • Access pattern detection: rate not detected [Lee 87, Li 04, Soundararajan 08]
    • Aggressiveness control: based on sequentiality [Patterson 95, Kaplan 02, Li 05]
    • Multi-stream prefetching: rate not sufficiently utilized [Cao 96, Tomkins 97, Gill 07]

  5. Similar story in grocery stores!
  • Allocating storage resources according to consumption rate? (e.g., Milk: 200 per day, Beer: 80 per day, $300 Wine: 1 per year)
  • Studied in Supply Chain Management (SCM)
    • Demand rate measurement/analysis/prediction
    • Dates back to the earliest wars, yet still active
      • Wal-Mart: $24M on a satellite network for instant inventory control
      • Dell: aiming at "zero inventory"

  6. Our contributions
  • A mapping between data prefetching and SCM problems
  • Novel rate-aware multi-stream prefetching techniques based on SCM heuristics
  • Implementation and performance evaluation
    • Modified Linux 2.6.18 kernel
    • Extensive experiments with a modern server and multiple workloads
  • Coordinated multi-level prefetching
    • Based on multi-echelon inventory control
    • Extending application access patterns to the lower level
    • Evaluation with combinations of state-of-the-art single-level algorithms

  7. Outline
  • Motivation
  • Background and problem mapping
  • Algorithms
  • Performance evaluation
  • Conclusions

  8. Background – Inventory cycles
  • Inventory theory (see the sketch below)
    • Task: manage inventory for goods
    • Goal: satisfy customer demands
  [Figure: inventory level over time; labels: order quantity, cycle inventory, safety inventory, reorder point, lead time, and fast/average/slow demand slopes]
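To make the reorder-point idea concrete, here is a minimal C++ sketch using the standard relationship between lead time, demand rate, and safety stock; all numeric values are illustrative assumptions, not from the talk:

    // Basic inventory-cycle quantities; all values are illustrative.
    #include <cstdio>

    int main() {
        double demand_rate  = 200.0;  // average units consumed per day
        double lead_time    = 2.0;    // days between placing and receiving an order
        double safety_stock = 100.0;  // buffer against demand uncertainty

        // Reorder once inventory falls to what will be consumed during the
        // lead time, plus the safety buffer.
        double reorder_point = demand_rate * lead_time + safety_stock;
        std::printf("reorder point = %.0f units\n", reorder_point);  // 500 units
        return 0;
    }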

  9. Background – Prefetching basics
  [Figure: blocks flowing from disk into the memory cache, annotated with trigger distance and prefetch degree]

  10. Background – Prefetching cycles
  • Prefetching techniques
    • Task: manage the cache for data blocks
    • Goal: satisfy application requests
  [Figure: the inventory-cycle diagram relabeled for prefetching: order quantity becomes prefetch degree, inventory becomes prefetched blocks, the reorder point becomes the trigger distance Tc, safety inventory becomes Ts, and lead time becomes disk access time]

  11. Challenges in mapping
  • Data requests ↔ Customer demands
    • Data blocks are unique: GroceryStore::getMilk(); vs. FileSystem::getBlock(Position p);
    • A "linear sequence of blocks" in detected streams restores interchangeability: FileSystem::getNextBlock(); (see the sketch below)
  • Prefetched data blocks ↔ Inventory
    • Accessed data blocks remain in the cache, but as "second class citizens" [Gill 05, Li 05]
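A small sketch of how stream detection restores the analogy: once a sequential stream is detected, positioned reads can be treated as interchangeable "next unit" demands. The Stream type and consume() method are hypothetical names for illustration, not the paper's API:

    // Hypothetical sketch: a detected sequential stream turns positioned
    // block reads into interchangeable "next unit" demands.
    #include <cstdint>

    struct Stream {
        uint64_t next_block;  // next expected block in the linear sequence

        // Returns true if the access continues the stream (a demand for the
        // next unit, like buying one more carton of milk); false otherwise.
        bool consume(uint64_t block) {
            if (block != next_block) return false;
            ++next_block;
            return true;
        }
    };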

  12. Outline
  • Motivation
  • Background and problem mapping
  • Algorithms
  • Performance evaluation
  • Conclusions

  13. Performance metrics and objectives
  • SCM optimization objective: improve fill rate
    • Fraction of demand satisfied from inventory: fill rate = 1 - ESC / Q (sketched below)
    • ESC: Expected Shortage per Cycle; Q: order quantity
  • Prefetching optimization objective: improve cache hit rate
    • Dynamically adjust trigger distance and prefetch degree
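A minimal sketch of the fill-rate metric read as a cache hit rate; the numbers below are made up for illustration:

    // Fill rate from inventory theory, mapped to cache hit rate: the
    // fraction of demand satisfied from stock (prefetched blocks).
    #include <cstdio>

    int main() {
        double Q   = 48.0;  // order quantity, i.e. prefetch degree (blocks)
        double ESC = 2.5;   // expected shortage (cache misses) per cycle

        double fill_rate = 1.0 - ESC / Q;
        std::printf("fill rate = %.3f\n", fill_rate);  // ~0.948
        return 0;
    }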

  14. Rate-aware prefetching algorithms
  • Task: calculating Tc and Ts (sketched below)
    • Tc: lead time × average consumption rate
    • Ts: based on an estimation of uncertainty
  [Figure: prefetched blocks over time, with prefetch degree as order quantity, cycle inventory Tc (trigger distance), safety inventory Ts, the reorder point, and fast/slow/average demand slopes]
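A sketch of the Tc/Ts calculation for one stream; the lead time, uncertainty estimate, and safety factor below are placeholder assumptions:

    // Per-stream cycle (Tc) and safety (Ts) inventory; values illustrative.
    #include <cstdio>

    int main() {
        double lead_time = 0.005;   // seconds to fetch from disk ("lead time")
        double avg_rate  = 1000.0;  // pages consumed per second by the stream

        // Cycle inventory: what the stream consumes while a fetch is in flight.
        double Tc = lead_time * avg_rate;

        // Safety inventory: a buffer sized from demand uncertainty; the
        // values here are stand-ins for the estimators of slides 15-16.
        double sigma = 3.0;         // assumed demand uncertainty (pages)
        double Ts = 2.0 * sigma;    // safety factor of 2, an assumption

        std::printf("Tc = %.1f pages, Ts = %.1f pages\n", Tc, Ts);
        return 0;
    }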

  15. Algorithm 1: Equal Time Supplies (ETS)
  • Safety inventory for all goods set to the same time supply (e.g., the amount of goods consumed in 5 days)
  • With "standard" distribution shapes, uncertainty is proportional to the mean value
  • Ts: set proportional to the average data access rate (sketched below):
    T_i = T_total × r_i / Σ_j r_j, where T_i is the trigger distance of stream i, r_i its average rate, and T_total the total allowed trigger distance
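A minimal sketch of the ETS split, assuming two streams and a budget of 96 pages (both values illustrative):

    // ETS sketch: split a fixed trigger-distance budget across streams in
    // proportion to each stream's average access rate.
    #include <cstdio>
    #include <vector>

    int main() {
        double T_total = 96.0;                    // total allowed trigger distance (pages)
        std::vector<double> rate = {1000, 3000};  // per-stream average rates (pages/s)

        double sum = 0.0;
        for (double r : rate) sum += r;

        for (std::size_t i = 0; i < rate.size(); ++i) {
            double Ti = T_total * rate[i] / sum;  // T_i = T_total * r_i / sum_j r_j
            std::printf("stream %zu: trigger distance = %.0f pages\n", i, Ti);
        }
        return 0;
    }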

  16. Algorithm 2: Equal Safety Factors (ESF)
  • Safety inventory set to maintain the same safety factor across all goods
  • Ts: set proportional to the standard deviation of the access rate (sketched below):
    T_i = T_total × σ_i / Σ_j σ_j, where σ_i is the standard deviation of stream i's rate
  • Implementation challenges
    • Measurement and calculation overhead
    • Limited floating-point calculation in the kernel
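A sketch of the ESF split using integer arithmetic only, in the spirit of the kernel's floating-point restriction; the deviations and budget are assumed values:

    // ESF sketch: the same budget split, weighted by the standard deviation
    // of each stream's rate, in integer math only (no FPU in the kernel).
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    int main() {
        uint64_t T_total = 96;                    // trigger-distance budget (pages)
        std::vector<uint64_t> sigma = {30, 210};  // per-stream rate std. deviations

        uint64_t sum = 0;
        for (uint64_t s : sigma) sum += s;

        for (std::size_t i = 0; i < sigma.size(); ++i) {
            // T_i = T_total * sigma_i / sum_j sigma_j, rounded to nearest.
            uint64_t Ti = (T_total * sigma[i] + sum / 2) / sum;
            std::printf("stream %zu: trigger distance = %llu pages\n",
                        i, (unsigned long long)Ti);
        }
        return 0;
    }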

  17. Outline
  • Motivation
  • Background and problem mapping
  • Algorithms
  • Performance evaluation
  • Conclusions

  18. Comparing with Linux native prefetching
  • Linux prefetching algorithm (kernel 2.6.18), denoted "32-32":
    • Trigger distance (T) = Prefetch degree (P)
    • T and P double on each sequential hit (see the sketch below)
    • Upper bounds: T = P = 32 (pages)
  • Implementation of SCM-based algorithms, denoted "24-48":
    • Principle: maintain the same memory consumption as the original algorithm
    • Default parameters: Tdefault = 24, Pdefault = 48
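A sketch of the doubling ramp-up described above; the initial window size is an assumption, only the doubling rule and the 32-page cap come from the slide:

    // Linux 2.6.18-style ramp-up: trigger distance (T) and prefetch degree
    // (P) double on each sequential hit, capped at 32 pages.
    #include <algorithm>
    #include <cstdio>

    int main() {
        int T = 4, P = 4;  // assumed initial window
        for (int hit = 1; hit <= 5; ++hit) {
            T = std::min(T * 2, 32);
            P = std::min(P * 2, 32);
            std::printf("after sequential hit %d: T=%d P=%d\n", hit, T, P);
        }
        return 0;
    }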

  19. Experimental setup
  • Platform
    • Linux server: 2.33GHz quad-core CPU, 16GB memory
    • Comparing the 32-32, 24-48, ETS, and ESF algorithms
  • Workloads
    • Synthetic benchmarks
    • Linux file transfer applications
    • HTTP web server workload
    • Server benchmarks: SPC2-VOD-like (sequential), TPC-H (random)

  20. Two streams with different rates
  • Rate of stream 1 fixed at 1000 pages/second
  • Rate of stream 2 varied between 3000 and 7000 pages/second
  • Average response time vs. rate of the fast stream (pages/second): ETS shows a 19%~25% improvement over 32-32
  • Cache misses per prefetch cycle (ESC): ETS runs the same number of cycles as 24-48 with ESC similar to 32-32

  21. Two streams with different deviations
  • SD of stream 1 fixed at the square root of its rate
  • SD of stream 2 varied between 3 and 7 times the average rate
  • Average response time vs. SD of the unstable stream: ESF shows a 20%~35% improvement over ETS
  • Response time of individual streams: ESF yields a large improvement for the unstable stream and a small degradation for the stable stream

  22. Throughput of server benchmarks
  • SPC2-VOD-like (sequential streams) and TPC-H (random accesses)
  • Random application throughput: ETS is never worse than 32-32, with a 2.5% average improvement
  • Sequential+random application throughput: ETS shows a 6%~53% improvement over 32-32
  • [Also shown: sequential+random application memory consumption]

  23. Conclusions and future work
  • Observations
    • File blocks can be managed like apples!
    • Simple approaches such as ETS seem to perform well
  • Future work
    • Awareness of both access rate and delivery time
    • Adjusting the prefetch degree
  • Acknowledgements
    • Anonymous reviewers
    • Our shepherd: George Candea
    • Our sponsors: NSF and DOE Office of Science
