
Differentiated Storage Services



Presentation Transcript


  1. Differentiated Storage Services Tian Luo The Ohio State University Michael Mesnier, Jason Akers, Feng Chen Intel Corporation 23rd ACM Symposium on Operating Systems Principles (SOSP) October 23-26, 2011, Cascais, Portugal

  2. Technology overview An analogy: moving & shipping • Classification • Policy assignment • Policy enforcement Why should computer storage be any different?

  3. Differentiated Storage Services Technology overview • Classification (classify each I/O in-band) • Policy assignment (offline) • Policy enforcement [Architecture diagram: applications or DB, the file system, and the operating system attach I/O classifications in the computer system; the storage controller and management firmware map QoS policies to QoS mechanisms across Storage Pool A, B, and C; highlighted blocks mark current & future research]

  4. The SCSI CDB • 5 bits → 32 classes

  5. Technology overview Motivation: disk caching with SSDs • Universal challenges in the industry • Keeping the right data cached • Avoiding thrash under cache pressure • Conventional approaches • Cache bypass for large/sequential requests • Evict cold data (LRU commonly used) • How I/O classification can help • Identify cacheable I/O classes • Assign relative caching priorities

  6. Filesystem prototypes (Ext3 & NTFS) Technology overview • FS classification (classify each I/O in-band) • FS policy assignment • FS policy enforcement [Architecture diagram: as on slide 3, with the storage system now a disk plus SSD cache; highlighted blocks mark current & future research]

  7. Database prototype (PostgreSQL) Technology overview • DB classification (classify each I/O in-band) • DB policy assignment • DB policy enforcement [Architecture diagram: as on slide 3, with the storage system now a disk plus SSD cache; highlighted blocks mark current & future research]

  8. Technology overview Selective cache algorithms • Selective allocation • Always allocate high-priority classes • E.g., FS metadata and DB system tables are always allocated • Conditionally allocate low-priority classes • Depends on cache pressure, cache contents, etc. • High/low cutoff is a tunable parameter • Selective eviction • Evict in priority order (lowest priority first) • E.g., temporary DB tables are evicted before system tables • Trivially implemented by managing one LRU per class

  9. Technology development

  10. Technology development Ext3 prototype • OS changes (block layer) • Add classifier to I/O requests • Only coalesce like-class requests • Copy classifier into SCSI CDB • Ext3 changes • 18 classes identified • Optimized for a file server • Small files & metadata • A small kernel patch • A one-time change to the FS

  11. Technology development Ext3 classification illustrated • echo 'Hello, world!' >> foo; sync • READ_10(lba 231495 len 8 grp 9) <=4KB • WRITE_10(lba 231495 len 8 grp 9) <=4KB • WRITE_10(lba 16519223 len 8 grp 8) Journal • WRITE_10(lba 16519231 len 8 grp 8) Journal • WRITE_10(lba 16519239 len 8 grp 8) Journal • WRITE_10(lba 16519247 len 8 grp 8) Journal • WRITE_10(lba 8279 len 8 grp 5) Inode • 7 I/Os (28KB) to write 13 bytes • Metadata accounts for most of the overhead • I/O classification reveals the read-modify-write and the metadata updates • NTFS classification is implemented with Windows filter drivers

  12. Technology development PostgreSQL prototype • Classification API: scatter/gather I/O • OS changes (block layer) • Add O_CLASSIFIED file flag • Extract classifier from SG I/O • A small OS & DB patch • A one-time change to the OS & DB
  fd = open("foo", O_RDWR|O_CLASSIFIED, 0666);
  char class = 19;
  myiov[0].iov_base = &class; myiov[0].iov_len = 1;
  myiov[1].iov_base = "Hello, world!"; myiov[1].iov_len = 13;
  writev(fd, myiov, 2);
  Preliminary DB classes

  13. Technology development Cache implementations • Fully associative read/write LRU cache • Insert(), Lookup(), Delete(), etc. • Hash table maps disk LBA to SSD LBA • Syncer daemon asynchronously cleans cache • Monitors cache pressure for selective allocation • Maintains multiple LRU lists for selective eviction • Front-ends: iSCSI (OS independent) and Linux MD • MD cache module (RAID-9)
  Striping: mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdd /dev/sde
  Mirroring: mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdd /dev/sde
  RAID-9: mdadm --create /dev/md0 --level=9 --raid-devices=2 <cache> <base>

  14. Evaluation

  15. Evaluation Experimental setup • Host OS (Xeon, 2-way, quad-core, 12GB RAM) • Linux 2.6.34 (patched as described) • Target storage system • HW RAID array + X25-E cache • Workloads and cache sizes • SPECsfs: 18GB (10% of 184GB working set) • TPC-H: 8GB (28% of 29GB working set) • Comparison • LRU versus LRU-S (LRU with selective caching)

  16. SPECsfs I/O breakdown [Charts: cache I/O breakdown, LRU vs. LRU-S] • Large files pollute the LRU cache (metadata and small files evicted) • LRU-S fences off large-file I/O

  17. SPECsfs performance metrics [Charts: hit rate, running time, syncer overhead, and I/O throughput for HDD, LRU, and LRU-S] • Running time: 1.8x speedup

  18. SPECsfs file latencies [Charts: reduction in read and write latency over HDD, LRU vs. LRU-S] • LRU suffers from write outliers (from eviction overheads) • LRU-S reduces read latency (most small files are cached)

  19. TPC-H I/O breakdown [Charts: cache I/O breakdown, LRU vs. LRU-S] • Indexes pollute the LRU cache (user tables evicted) • LRU-S fences off index files

  20. TPC-H performance metrics [Charts: hit rate, running time, syncer overhead, and I/O throughput for HDD, LRU, and LRU-S] • Running time: 1.2x speedup

  21. Conclusion & future work • Intelligent caching is just the beginning • Other types of performance differentiation • Security, reliability, retention, … • Other applications we’re looking at • Databases • Hypervisors • Cloud storage • Big Data (NoSQL DB) • Work already underway in T10 • Open source coming soon… Thank you! Questions?
