Memory Resource Allocation for File System Prefetching -- From a Supply Chain Management Perspective Zhe Zhang (NCSU), Amit Kulkarni (NCSU), Xiaosong Ma (NCSU/ORNL), Yuanyuan Zhou (UIUC)
Aggressive prefetching: an idea whose time has come* • Enlarging processor-I/O gap • Processing power doubling every 18 to 24 months • Disparity between growth of disk latency and throughput • Latency improving 10% per year while throughput improving 40% per year [Hennessy 03] • Large memory cache sizes • Usually 0.05% ~ 0.2% of storage capacity [Hsu 04] * [Papathanasiou 05]
… and whose challenges follow • Systems facing large numbers of concurrent requests (e.g., #1-ranked site Facebook) • Servers handling large numbers of clients (e.g., Jaguar @ Oak Ridge National Lab: Lustre, 11,000 compute nodes, 72 I/O nodes, 18 DDN S2A9500 couplets) • How to manage file systems' memory resource for aggressive prefetching?
All streams are not created equal • Allocating memory resource according to access rate? (e.g., MP3: 128 kbps, YouTube: 200 kbps, YouTube HQ: 900 kbps) • Related work • Access pattern detection: rate not detected [Lee 87, Li 04, Soundararajan 08] • Aggressiveness control: based on sequentiality [Patterson 95, Kaplan 02, Li 05] • Multi-stream prefetching: rate not sufficiently utilized [Cao 96, Tomkins 97, Gill 07]
Similar story in grocery stores! • Allocating storage resource according to consumption rate? (e.g., milk: 200 per day, beer: 80 per day, $300 wine: 1 per year) • Studied in Supply Chain Management (SCM) • Demand rate measurement/analysis/prediction • Dates back to the earliest wars, yet still active • Wal-Mart: $24M on a satellite network for instant inventory control • Dell: aiming at "zero inventory"
Our contributions • A mapping between data prefetching and SCM problems • Novel rate-aware multi-stream prefetching techniques based on SCM heuristics • Implementation and performance evaluation • Modified Linux 2.6.18 kernel • Extensive experiments with modern server and multiple workloads • Coordinated multi-level prefetching • Based on multi-echelon inventory control • Extending application access pattern to lower level • Evaluation with combinations of state-of-the-art single level algorithms
Outline • Motivation • Background and problem mapping • Algorithms • Performance evaluation • Conclusions
Background – Inventory cycles • Inventory theory • Task: manage inventory for goods • Goal: satisfy customer demands [Figure: inventory level over time; each cycle the level falls from the order quantity through the cycle inventory to the reorder point, faster or slower depending on demand rate; an order placed at the reorder point arrives after the lead time, with the safety inventory as a buffer against shortage]
Background – Prefetching basics [Figure: memory cache backed by disk; a new prefetch is issued when the remaining prefetched pages fall to the trigger distance, and each prefetch reads prefetch-degree pages from disk]
Background – Prefetching cycles • Prefetching techniques: • Task: manage the cache for data blocks • Goal: satisfy application requests [Figure: the same cycle diagram relabeled for prefetching: order quantity becomes the prefetch degree, the reorder point becomes the trigger distance (cycle part Tc plus safety part Ts), and the lead time becomes the disk access time]
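The cycle analogy above can be sketched numerically: the reorder point (trigger distance) is the stock expected to be consumed during the lead time plus a safety buffer. A minimal sketch; the function name and all numbers are illustrative assumptions, not from the talk.

```python
def reorder_point(lead_time, avg_demand_rate, safety_inventory):
    # Stock consumed while the order is in flight, plus the safety buffer.
    return lead_time * avg_demand_rate + safety_inventory

# Prefetching reading of the same formula (assumed numbers):
disk_access_time = 0.01   # "lead time": seconds per prefetch request
avg_rate = 1000           # pages/second consumed by the stream
Ts = 50                   # safety inventory in pages, chosen ad hoc

Tc = disk_access_time * avg_rate              # cycle part of the trigger distance
trigger_distance = reorder_point(disk_access_time, avg_rate, Ts)
print(Tc, trigger_distance)                   # 10.0 60.0
```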
Challenges in mapping • Data requests ⇔ customer demands • Data blocks are unique • "Linear sequence of blocks" in detected streams • GroceryStore::getMilk(); vs. FileSystem::getNextBlock(); / FileSystem::getBlock(Position p); • Prefetched data blocks ⇔ inventory • Accessed data blocks remain in the cache • But as "second-class citizens" [Gill 05, Li 05]
Outline • Motivation • Background and problem mapping • Algorithms • Performance evaluation • Conclusions
Performance metrics and objectives • SCM optimization objective: improve fill rate • Fraction of demand satisfied from inventory: fill rate = 1 − ESC / Q • ESC: expected shortage per cycle • Q: order quantity • Prefetching optimization objective: improve cache hit rate • Dynamically adjust • Trigger distance • Prefetch degree
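The fill-rate relation above comes from standard inventory theory: the fraction of demand missed per cycle is ESC/Q. A minimal sketch in our own notation:

```python
def fill_rate(esc, q):
    # Fraction of demand served from stock: 1 - ESC/Q, where ESC is the
    # expected shortage per cycle and Q the order quantity (the prefetch
    # degree in the cache analogy).
    return 1.0 - esc / q

# e.g. 2 pages missed per cycle with a prefetch degree of 48 (assumed numbers)
print(fill_rate(2, 48))
```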
Rate-aware prefetching algorithms • Task: calculating Tc and Ts • Tc: lead time × average consumption rate • Ts: based on estimation of uncertainty [Figure: prefetching cycle diagram showing the prefetch degree, the trigger distance split into Tc and the safety inventory Ts, and fast/slow/average demand slopes]
Algorithm 1: Equal Time Supplies (ETS) • Safety inventory for all goods set to the same time supply (e.g., the amount of goods consumed in 5 days) • With "standard" distribution shapes, uncertainty is proportional to the mean value • Ts: set to be proportional to the average data access rate • Trigger distance of stream i = (average rate of stream i ÷ sum of all streams' average rates) × total allowed trigger distance
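The ETS allocation above can be sketched as rate-proportional sharing of a fixed trigger-distance budget; function name and budget are assumptions for illustration, not the paper's kernel code.

```python
def ets_trigger_distances(rates, total_trigger_distance):
    # Equal Time Supplies: each stream's trigger distance is proportional
    # to its average access rate, so every stream holds the same "time
    # supply" of prefetched pages.
    total_rate = sum(rates)
    return [total_trigger_distance * r / total_rate for r in rates]

# Two streams in pages/second sharing a 96-page budget (assumed numbers)
print(ets_trigger_distances([1000, 3000], 96))  # [24.0, 72.0]
```

The faster stream drains its prefetched pages sooner, so it earns the larger share of the budget.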
Algorithm 2: Equal Safety Factors (ESF) • Safety inventory set to maintain the same safety factor across all goods • Ts: set to be proportional to the standard deviation of the access rate • Implementation challenges • Measurement and calculation overhead • Limited floating-point calculation in kernel
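By analogy with ETS, the ESF idea above can be sketched as deviation-proportional sharing of the safety budget. This is a hypothetical user-space illustration; the actual kernel implementation must avoid floating point, as the slide notes.

```python
import math

def esf_trigger_distances(rate_samples_per_stream, total_trigger_distance):
    # Equal Safety Factors: allocate safety inventory in proportion to each
    # stream's rate standard deviation, so all streams share one safety factor.
    stds = []
    for samples in rate_samples_per_stream:
        mean = sum(samples) / len(samples)
        var = sum((x - mean) ** 2 for x in samples) / len(samples)
        stds.append(math.sqrt(var))
    total = sum(stds)  # sketch assumes at least one stream varies
    return [total_trigger_distance * s / total for s in stds]

# A perfectly steady stream vs. a bursty one, sharing 48 pages (assumed numbers)
print(esf_trigger_distances([[5, 5, 5, 5], [3, 7, 3, 7]], 48))  # [0.0, 48.0]
```

The steady stream needs no safety buffer at all, while the bursty stream gets the entire budget; this is what drives ESF's gains on unstable streams in the evaluation.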
Outline • Motivation • Background and problem mapping • Algorithms • Performance evaluation • Conclusions
Comparing with Linux native prefetching • Linux prefetching algorithm (kernel 2.6.18), denoted "32-32" • Trigger distance (T) = prefetch degree (P) • T and P doubled on each sequential hit • Upper bounds: T = P = 32 (pages) • Implementation of SCM-based algorithms, denoted "24-48" • Principle: maintain the same memory consumption as the original algorithm • Default parameters: Tdefault = 24, Pdefault = 48
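The 32-32 ramp-up described above can be sketched as follows; the initial 4-page window is an assumption for illustration, and the real kernel readahead logic has more cases.

```python
def linux_readahead_sizes(sequential_hits, cap=32):
    # Sketch of the 2.6.18-era baseline: trigger distance T equals prefetch
    # degree P, and both double on each sequential hit up to the 32-page cap.
    size = 4  # assumed initial window
    sizes = []
    for _ in range(sequential_hits):
        size = min(size * 2, cap)
        sizes.append(size)
    return sizes

print(linux_readahead_sizes(5))  # [8, 16, 32, 32, 32]
```

The SCM-based variants replace this fixed doubling with the rate-aware Tc/Ts computation while holding total memory use at the same level.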
Experimental setup • Platform • Linux server • 2.33GHz quad-core CPU, 16GB memory • Comparing 32-32, 24-48, ETS and ESF algorithms • Workloads • Synthetic benchmarks • Linux file transfer applications • HTTP web server workload • Server benchmarks • SPC2-VOD-like (sequential) • TPC-H (random)
Two streams with different rates • Rate of stream 1 fixed at 1000 pages/second • Rate of stream 2 varied between 3000 and 7000 pages/second • Average response time: ETS achieves a 19%~25% improvement over 32-32 • Cache misses per prefetch cycle (ESC): ETS uses the same number of cycles as 24-48 with ESC similar to 32-32
Two streams with different deviations • SD of stream 1 fixed at the square root of its rate • SD of stream 2 varied between 3 and 7 times the average rate • Average response time: ESF achieves a 20%~35% improvement over ETS • Response time of individual streams: ESF gives a large improvement for the unstable stream with only a small degradation for the stable stream
Throughput of server benchmarks • SPC2-VOD-like (sequential streams) • TPC-H (random accesses) • Random application throughput: ETS never worse than 32-32; 2.5% average improvement • Sequential+random application throughput: ETS achieves a 6%~53% improvement over 32-32 • Memory consumption of the sequential+random applications also compared
Conclusions and future work • Observations • File blocks can be managed like apples! • Simple approaches such as ETS seem to perform well • Future work • Awareness of both access rate and delivery time • Adjusting the prefetch degree • Acknowledgements • Anonymous reviewers • Our shepherd: George Candea • Our sponsors: NSF and DOE Office of Science