
Flashing Up the Storage Layer I. Koltsidas, S. D. Viglas (U of Edinburgh), VLDB 2008






Presentation Transcript


  1. Flashing Up the Storage Layer. I. Koltsidas, S. D. Viglas (U of Edinburgh), VLDB 2008. Presented by Shimin Chen, Big Data Reading Group

  2. Motivation: • Flash disks: 64 GB – 128 GB SSDs available as of Feb '08 • Intel announced 80 GB SSDs • Flash disks vs. magnetic disks • Same I/O interface: logical 512 B sectors • No mechanical latency, but asymmetric I/O due to erase-before-write: • Random reads 10X faster than on magnetic disks • Random writes 10X slower than on magnetic disks, especially on MLC devices • Can flash disks be exploited for the storage layer?

  3. Architecture • Flash disk as a cache for the magnetic disk? • Suboptimal for database workloads because of write inefficiency • Flash disk and magnetic disk at the same level of the storage hierarchy (this paper's approach)

  4. Problem Statement • Page migrations (Storage Manager) • Workload prediction • Self-tuning • Page replacement (Buffer Manager)

  5. Outline • Introduction • Page placement • Page replacement • Experimental study • Conclusion

  6. Model • Random read/write costs for the flash and magnetic disks • Page migration decisions are always made while a page is in the buffer pool • Migration cost == write cost on the target disk • The cost-based migration ideas are not new; the novelty is that logical I/Os are served by the buffer pool, so only a fraction of them are seen as physical I/Os

  7. r, w: the read/write costs of the page's current disk; r', w': the read/write costs of the other disk; pg.C: a per-page counter holding the accumulated cost difference between the two disks

  8. Conservativeness • A page is migrated only after the accumulated benefit covers the cost of migrating it to the other disk and back • Only physical operations on pages are counted • 3-competitive with the optimal offline algorithm
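A minimal sketch of how such a per-page counter could drive migrations under the conservative rule above. The class and function names (DiskCosts, Page, note_physical_io, maybe_migrate) and the use of the round-trip write cost as the threshold are illustrative assumptions based on the slide text, not the paper's code.

from dataclasses import dataclass

@dataclass
class DiskCosts:
    r: float  # random-read cost on this disk
    w: float  # random-write cost on this disk

@dataclass
class Page:
    on_flash: bool
    C: float = 0.0  # pg.C: accumulated cost difference

def note_physical_io(page: Page, is_write: bool, cur: DiskCosts, other: DiskCosts) -> None:
    # On every physical read/write, accumulate how much cheaper the
    # operation would have been on the other disk.
    page.C += (cur.w - other.w) if is_write else (cur.r - other.r)

def maybe_migrate(page: Page, cur: DiskCosts, other: DiskCosts) -> bool:
    # Conservative rule (assumed): migrate only once the accumulated
    # benefit covers the cost of moving the page there and back.
    if page.C >= other.w + cur.w:
        page.on_flash = not page.on_flash
        page.C = 0.0
        return True
    return False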

  9. Properties • Not conservative about migrations • Based on logical (rather than only physical) operations

  10. Hybrid Algorithm • Idea: • Consider both physical and logical operations • More weight on physical ones • If a file has n pages, and b pages are cached in the buffer pool, then • Prob_miss = 1 – b/n
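A hedged sketch of the hybrid counter update, reusing the DiskCosts and Page types from the sketch above. The assumption is that a logical operation contributes the cost difference scaled by Prob_miss = 1 – b/n, while physical operations contribute in full; function and parameter names are illustrative.

def miss_probability(n_pages: int, b_cached: int) -> float:
    # Prob_miss = 1 - b/n for a file of n pages with b pages cached.
    return 1.0 - (b_cached / n_pages)

def note_logical_io(page: Page, is_write: bool, cur: DiskCosts, other: DiskCosts,
                    n_pages: int, b_cached: int) -> None:
    # Logical I/O: apply the read/write cost difference, discounted by
    # the probability that it would actually have hit the disk.
    delta = (cur.w - other.w) if is_write else (cur.r - other.r)
    page.C += miss_probability(n_pages, b_cached) * delta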

  11. Outline • Introduction • Page placement • Page replacement • Experimental study • Conclusion

  12. Eviction Cost • Evicting a page: • A dirty page incurs a write cost • Fetching the page back in the future incurs a read cost • Cost: write-back cost (if dirty) plus the cost of the future fetch; a sketch follows
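The eviction-cost formula itself is not reproduced on this slide; the sketch below is a hedged reading of the two bullets above, taking both costs on the disk that holds the page. The paper's exact expression may differ.

def eviction_cost(dirty: bool, costs: DiskCosts) -> float:
    # Write the page out now if it is dirty, plus pay one read later if
    # it has to be fetched back; costs.r / costs.w are the read/write
    # costs of the disk the page resides on.
    return (costs.w if dirty else 0.0) + costs.r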

  13. Buffer Pool Organization • Time segment: sorted on timestamp (LRU) • Cost segment: sorted on cost of eviction
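A rough sketch of the two-segment organization, assuming λ sets the fraction of frames managed by eviction cost and that victims are taken from the time segment first (per the next slide). The class and method names are illustrative, not from the paper.

from collections import OrderedDict
import heapq

class TwoSegmentPool:
    # Time segment: plain LRU, ordered by last-access timestamp.
    # Cost segment: ordered by eviction cost (cheapest page evicted first).
    def __init__(self, capacity: int, lam: float):
        self.cost_cap = int(capacity * lam)      # frames managed by eviction cost
        self.time_cap = capacity - self.cost_cap  # frames managed by recency
        self.time_seg = OrderedDict()  # page_id -> page, kept in LRU order
        self.cost_seg = []             # min-heap of (eviction_cost, page_id)

    def victim(self):
        # Assumed policy: evict from the time segment first (where flash
        # pages tend to live); otherwise take the cheapest page in the
        # cost segment.
        if self.time_seg:
            page_id, _ = self.time_seg.popitem(last=False)
            return page_id
        _, page_id = heapq.heappop(self.cost_seg)
        return page_id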

  14. Impact of λ • As λ increases: • The time segment shrinks, the cost segment grows • Disk pages in the pool increase, flash pages decrease • Flash pages are evicted first, and are typically found only in the time segment • Let Hm be the increase in the disk hit rate and Mf the increase in the flash miss rate • We want the savings from the extra disk hits (Hm) to outweigh the cost of the extra flash misses (Mf); see the sketch below
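The inequality on the slide is not reproduced here; assuming the trade-off is extra disk hits saving disk reads versus extra flash misses costing flash reads, the direction of the condition can be sketched as follows.

def lambda_increase_pays_off(H_m: float, M_f: float,
                             disk_read_cost: float, flash_read_cost: float) -> bool:
    # Hedged: increasing lambda is worthwhile roughly when the read cost
    # saved by the extra disk hits exceeds the read cost added by the
    # extra flash misses. The paper's exact condition may be stated differently.
    return H_m * disk_read_cost > M_f * flash_read_cost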

  15. Outline • Introduction • Page placement • Page replacement • Experimental study • Conclusion

  16. Experimental Setup • Implementation: • Buffer manager, storage manager, B+-trees for storing data • Machine: • 2.26 GHz Pentium 4, 1.5 GB RAM • Debian Linux, kernel 2.6.21 • Two magnetic disks (300 GB Maxtor DiamondMax) • One SSD (Samsung MLC, 32 GB) • Data is stored on one disk + one SSD (both used as raw devices)

  17. Experimental Setup Cont'd • The capacity of either disk is enough to hold all the data • Metadata for files, pages, page mappings, and free space is not modeled • The B+-tree is 140 MB in size, scattered across a 1.4 GB address space • The buffer pool is 20 MB

  18. Raw Performance: 1 million 4KB random accesses

  19. Impact of Using Both Disks • Conservative + LRU • Query mix: read-only, write-only, read/write • Each set of queries executed 15 times

  20. Read-Only

  21. Write-Only

  22. Mixed

  23. Page Placement Algorithms: infrequently changing workload

  24. Frequently changing workload

  25. Buffer Pool Replacement

  26. Conclusion • Flash disk vs. magnetic disk • Page migration and placement • Page replacement • Can be applied to databases and file systems (?)

  27. Outline • Introduction • Page placement • Page replacement • Experimental study • Conclusion
