
BPLRU: A Buffer Management Scheme for Improving Random Writes in Flash Storage

Explore BPLRU, a buffer management scheme that improves random-write performance in flash storage and addresses its performance limitations. The original work, by Hyojun Kim and Seongjun Ahn of Samsung Electronics, Korea, was presented at FAST'08, March 2008.



Presentation Transcript


  1. BPLRU: A Buffer Management Scheme for Improving Random Writes in Flash Storage. Original work of Hyojun Kim and Seongjun Ahn, Software Laboratory of Samsung Electronics, Korea. Presented at FAST'08, March 2008. Neha Sahay and Sreeram Potluri

  2. Flash!! Flash!! • The speed of traditional hard disks is bounded by the speed of their mechanical parts. • The decreasing cost of flash (by 50% per year) presents us with an alternative. • Advantages of flash: high random read performance; very low power consumption; smaller and portable; shock resistant; robust. • Disadvantages of flash: very poor random write performance; limited lifetime (100,000 erases for SLC NAND and 10,000 for MLC NAND).

  3. Outline • Characteristics of Flash • Flash Translation Layer • Existing Techniques and Related Work • BPLRU • Implementation Details • Evaluation • Conclusion

  4. Characteristics of Flash • Organized into planes, blocks, and pages. • A page must be erased before it is programmed; in-place random rewrites are not allowed. • Reads and writes are done in pages, but erases are done in blocks. • Effectively, pages are written sequentially within a block. • The erase operation takes much longer than a read or write. • Requires wear-leveling. • An FTL masks these properties and emulates a normal hard disk. • Flash memory performs poorly on random writes but well on reads and sequential writes.

  5. Flash Translation Layer • Emulates a hard disk and provides logical sector updates. • Types: • Page Mapping • Maintains mapping information at the page level. • Requires a large amount of memory for mapping information. • Block Mapping • Maintains mapping information at the block level. • A page update requires a whole block update. • Hybrid Mapping • Maintains block-level mapping, but the page position is not fixed inside a block. • Requires additional offset-level information. • Other Mapping Techniques • Exploit write locality using some reserved locations. • Effective algorithms can be applied to these reserved locations, with simple block mapping used for the rest.
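
The difference between page mapping and block mapping can be sketched as two address-translation tables. This is only an illustration: the class names, the table contents, and the 4-page block size are assumptions made for the example, not details from the presentation.

```python
PAGES_PER_BLOCK = 4  # assumed for illustration; real NAND blocks hold far more pages


class PageMapFTL:
    """Page mapping: one table entry per logical page."""

    def __init__(self):
        self.table = {}  # logical page number -> physical page number

    def translate(self, lpn):
        return self.table[lpn]


class BlockMapFTL:
    """Block mapping: one table entry per logical block; the page offset is fixed."""

    def __init__(self):
        self.table = {}  # logical block number -> physical block number

    def translate(self, lpn):
        lbn, offset = divmod(lpn, PAGES_PER_BLOCK)
        return self.table[lbn] * PAGES_PER_BLOCK + offset


page_ftl = PageMapFTL()
page_ftl.table = {0: 9, 1: 2, 2: 14, 3: 7}  # four entries cover four pages

block_ftl = BlockMapFTL()
block_ftl.table = {0: 3}                    # one entry covers the same four pages

print(page_ftl.translate(2))   # 14
print(block_ftl.translate(2))  # 3 * 4 + 2 = 14
```

The page-level table needs one entry per page, which is where the memory cost comes from. The block-level table is smaller, but because the offset inside a block is fixed, updating one page means relocating the whole block.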

  6. Flash Translation Layer • Log-Block FTL • Writes go to a log block that uses a fine-grained mapping policy. • Once full, the log block is merged with the old data block and the result is written to a new block. • The old data block and the log block then become free blocks. • Two kinds of merge: full merge and switch merge. [Figure: valid and invalid pages in the data block and log block being merged into a new block.]
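
A minimal sketch of the two merge kinds, assuming a 4-page block and a log block represented as a list of (offset, content) writes. The function name and the representation are hypothetical. The switch-merge condition used here (the log block was filled completely and in order) is a common formulation rather than something the slide spells out.

```python
PAGES_PER_BLOCK = 4  # assumed for illustration


def merge(data_block, log_block):
    """Return (new_block, kind, pages_copied).

    data_block: list of page contents indexed by page offset.
    log_block:  list of (offset, content) pairs in the order they were written.
    """
    offsets = [off for off, _ in log_block]
    if offsets == list(range(PAGES_PER_BLOCK)):
        # Switch merge: the log block was written fully and in order,
        # so it simply becomes the data block and nothing is copied.
        return [content for _, content in log_block], "switch", 0

    # Full merge: every page of the block is copied into a new block,
    # taking the newest version of each page.
    new_block = list(data_block)
    for off, content in log_block:  # later writes override earlier ones
        new_block[off] = content
    return new_block, "full", PAGES_PER_BLOCK


old = ["a0", "b0", "c0", "d0"]
full = merge(old, [(2, "c1"), (0, "a1")])
switch = merge(old, [(0, "a1"), (1, "b1"), (2, "c1"), (3, "d1")])
print(full)    # (['a1', 'b0', 'c1', 'd0'], 'full', 4)
print(switch)  # (['a1', 'b1', 'c1', 'd1'], 'switch', 0)
```

The copy count is what makes the full merge expensive: it pays for a whole block of page copies even when only a couple of pages changed.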

  7. Flash Aware Caches • Use of a RAM buffer inside SSDs. • Clean First LRU (CFLRU) • Chooses a clean page as the victim rather than a dirty page. • Flash Aware Buffer Policy (FAB) • Buffers that belong to the same erasable block are grouped together. • The block with the maximum number of buffers is evicted. • Works well for sequential writes and is more effective than LRU. • Related work: DULO, proposed by Jiang et al. • Exploits both temporal and spatial locality (dual locality caching). [Figure: buffered pages grouped by erasable block.]
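
The FAB victim choice described above can be sketched as a group-and-count step. The function name and the 4-page block size are assumptions, and the tie-breaking here (first group encountered wins) is a simplification made for the example.

```python
from collections import defaultdict

PAGES_PER_BLOCK = 4  # assumed for illustration


def fab_victim(buffered_pages):
    """Return (block_number, pages) for the block holding the most buffered pages."""
    groups = defaultdict(list)
    for page in buffered_pages:
        groups[page // PAGES_PER_BLOCK].append(page)
    # Ties go to the first group encountered (a simplification).
    victim = max(groups, key=lambda b: len(groups[b]))
    return victim, sorted(groups[victim])


print(fab_victim([0, 9, 1, 5, 2, 13]))  # (0, [0, 1, 2])
```

Block 0 holds three of the six buffered pages, so it is evicted as a unit. Recency plays no part in this sketch.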

  8. BPLRU – Block Padding LRU • Applied to the write buffer inside SSDs. • Reads are simply redirected to the FTL. • Converts random writes to sequential writes. • Three-pronged: • Block-level LRU • Page Padding • LRU Compensation

  9. Block-Level LRU • RAM buffers are grouped into blocks of the same size as the erasable block in NAND. • All pages in the same erasable block range are grouped into one buffer block. • The least recently used block, rather than a page, is selected as the victim. [Figure: LRU list of buffer blocks from MRU block to LRU block, before and after a sector is referenced.]

  10. Block-Level LRU • Example reference sequence: 0, 4, 8, 12, 16, 1, 5, 9, 13, 17, 2, 6, 10. • 2 log blocks and 2 pages can reside in the write buffer. • 12 merges occur in the FTL, versus only 7 with block-level LRU.
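
Block-level LRU can be sketched with an ordered dictionary keyed by block number. The class name, the 4-sector block size, and the 8-sector buffer capacity are assumptions for illustration. Because these parameters are assumed, the eviction trace is not meant to reproduce the merge counts quoted on the slide.

```python
from collections import OrderedDict

SECTORS_PER_BLOCK = 4  # assumed for illustration


class BlockLRUBuffer:
    def __init__(self, capacity_sectors):
        self.capacity = capacity_sectors
        self.blocks = OrderedDict()  # block number -> set of sectors; last = MRU
        self.flushed = []            # (block number, sectors) in eviction order

    def _size(self):
        return sum(len(s) for s in self.blocks.values())

    def write(self, sector):
        lbn = sector // SECTORS_PER_BLOCK
        self.blocks.setdefault(lbn, set()).add(sector)
        self.blocks.move_to_end(lbn)  # a write to any sector makes its block MRU
        while self._size() > self.capacity:
            victim, sectors = self.blocks.popitem(last=False)  # evict the LRU block
            self.flushed.append((victim, sorted(sectors)))


buf = BlockLRUBuffer(capacity_sectors=8)  # capacity is an assumption
for s in [0, 4, 8, 12, 16, 1, 5, 9, 13, 17, 2, 6, 10]:
    buf.write(s)

print(buf.flushed)       # [(4, [16]), (0, [0, 1]), (2, [8, 9])]
print(list(buf.blocks))  # [3, 4, 0, 1, 2], from LRU to MRU
```

Each eviction hands the FTL all buffered sectors of one erasable block at once, so one merge can absorb several sector writes.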

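  11. Page Padding • Replaces an expensive full merge with a switch merge. • Before a victim block is flushed, the sectors missing from the buffer are read from flash so that the whole block can be written sequentially.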
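
A minimal sketch of page padding, assuming a 4-sector block and a `read_from_flash` callable that stands in for an FTL read. Both the function name and the callable are hypothetical.

```python
SECTORS_PER_BLOCK = 4  # assumed for illustration


def pad_block(lbn, buffered, read_from_flash):
    """Return (contents, padding_reads) for a full sequential write of block lbn.

    buffered:        dict of sector -> data held in the write buffer.
    read_from_flash: callable(sector) -> data, standing in for an FTL read.
    """
    first = lbn * SECTORS_PER_BLOCK
    padded, reads = [], 0
    for sector in range(first, first + SECTORS_PER_BLOCK):
        if sector in buffered:
            padded.append(buffered[sector])
        else:
            padded.append(read_from_flash(sector))  # fill the gap from flash
            reads += 1
    return padded, reads


flash = {s: f"old{s}" for s in range(8)}
padded, reads = pad_block(1, {5: "new5", 7: "new7"}, flash.__getitem__)
print(padded, reads)  # ['old4', 'new5', 'old6', 'new7'] 2
```

The padded block is written out in order from its first sector to its last. That in-order write is what lets the log-block FTL finish with a switch merge, at the price of the extra reads.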

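  12. LRU Compensation • Compensates for sequential writes: a block that has been written sequentially is unlikely to be written again soon, so it is moved to the LRU end of the list.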
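
One way to sketch LRU compensation is to record the write order inside each block and demote a block once it has been filled strictly in order. The class name, the 4-sector block size, and the exact test for "written sequentially" are assumptions for illustration.

```python
from collections import OrderedDict

SECTORS_PER_BLOCK = 4  # assumed for illustration


class CompensatedLRU:
    def __init__(self):
        self.order = OrderedDict()  # block number -> offsets in write order; first = LRU

    def write(self, sector):
        lbn, offset = divmod(sector, SECTORS_PER_BLOCK)
        self.order.setdefault(lbn, []).append(offset)
        self.order.move_to_end(lbn)  # normal LRU: the written block becomes MRU
        if self.order[lbn] == list(range(SECTORS_PER_BLOCK)):
            # The block was filled strictly in order: make it the next victim.
            self.order.move_to_end(lbn, last=False)


lru = CompensatedLRU()
for s in [8, 0, 1, 2, 3, 5]:
    lru.write(s)

print(list(lru.order))  # [0, 2, 1], from LRU to MRU
```

Without the compensation step the order would be [2, 0, 1]. The sequentially written block 0 would then stay in the buffer longer than the randomly written block 2.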

  13. Implementation • Two-level indexing using two sets of nodes: Block Header Nodes and Sector Nodes. • Each node has two link points for the LRU list (nPrev, nNext), a block number (nLbn), the number of sectors in the block (nNumOfSct), and a sector buffer (aBuffer). • For Sector Nodes, aBuffer[] contains the contents of the sector being written. • For Block Header Nodes, aBuffer[] contains a secondary index table pointing to the child nodes. • Sector nodes can be searched faster, at the cost of memory overhead.
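
The node layout can be sketched as follows, reusing the field names from the slide. The `WriteBuffer` class, its dictionary (a stand-in for the first index level), and the 4-sector block size are assumptions. The LRU links are declared but left unlinked to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Any, Optional

SECTORS_PER_BLOCK = 4  # assumed for illustration


@dataclass
class Node:
    nPrev: Optional["Node"] = None  # LRU list links (unused in this sketch)
    nNext: Optional["Node"] = None
    nLbn: int = -1                  # logical block number
    nNumOfSct: int = 0              # number of buffered sectors in the block
    aBuffer: Any = None             # sector data, or the child table for a header


class WriteBuffer:
    def __init__(self):
        self.headers = {}  # nLbn -> Block Header Node (stand-in for the first level)

    def put(self, sector, data):
        lbn, offset = divmod(sector, SECTORS_PER_BLOCK)
        header = self.headers.get(lbn)
        if header is None:
            # The header's aBuffer is the secondary index: one slot per sector.
            header = Node(nLbn=lbn, aBuffer=[None] * SECTORS_PER_BLOCK)
            self.headers[lbn] = header
        if header.aBuffer[offset] is None:
            header.nNumOfSct += 1
        header.aBuffer[offset] = Node(nLbn=lbn, aBuffer=data)  # Sector Node

    def get(self, sector):
        lbn, offset = divmod(sector, SECTORS_PER_BLOCK)
        header = self.headers.get(lbn)
        if header is None or header.aBuffer[offset] is None:
            return None
        return header.aBuffer[offset].aBuffer


wb = WriteBuffer()
wb.put(5, b"x")
wb.put(6, b"y")
wb.put(5, b"z")                    # overwrite: the sector count stays the same
print(wb.get(5))                   # b'z'
print(wb.headers[1].nNumOfSct)     # 2
```

A lookup takes two steps, block header first and then the slot for the sector. The unused slots in each header's table are the memory overhead the slide mentions.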

  14. Evaluation MS Office Installation task (NTFS) • 43% faster throughput than FAB for 16-MB buffer. • 41% lower erase count than FAB for 16-MB buffer.

  15. Evaluation Temporary Internet files of Internet Explorer (NTFS) • Performance slightly worse than FAB for buffers of size less than 8 MB. • For buffer size greater than 8MB, performance improves. • Erase count always less than FAB.

  16. Evaluation HDD test of PCMark 05 (NTFS) • Performance and erase count very similar to the previous Temporary Internet Files test.

  17. Evaluation Random writes by Iometer (NTFS) • No locality exists in the Iometer workload. • FAB shows better write performance, which improves further with bigger buffer sizes. • BPLRU shows better erase counts due to page padding.

  18. Evaluation Copying MP3 Files (FAT16) • 90 MP3 files with an average size of 4.8 MB. • Sequential write pattern.

  19. Evaluation P2P File Download, a 634-MB file (FAT16) • The peer-to-peer program writes small parts of the file in random order, because different parts are downloaded concurrently from numerous peers. • This graph illustrates the poor performance of flash storage for random writes. • FAB requires more RAM for better performance. • BPLRU improves performance significantly.

  20. Evaluation Untar Linux Source Files • From linux-2.6.21.tar.gz (EXT3). • BPLRU shows 39% better throughput than FAB.

  21. Evaluation Kernel Compile • With Linux-2.6.21 sources (EXT3). • BPLRU shows 23% better performance than FAB.

  22. Evaluation Postmark • Evaluates the performance of I/O subsystems. • One of file creation, deletion, read, or write is executed at random. [Figure panels: NTFS, FAT16, EXT3.]

  23. Evaluation Buffer Flushing Effect • File systems use the buffer flush command to ensure data integrity. • Flushing reduces the effect of write buffering. • With a 16-MB buffer, it reduces throughput by approximately 23%.

  24. Conclusion • The proposed BPLRU scheme is more effective than the two previous methods, LRU and FAB. • Two important issues remain: • When a RAM buffer is used, the integrity of the file system may be damaged by a sudden power failure. • Frequent buffer flush commands from the host computer degrade BPLRU performance. • Future research: • Hardware such as a small battery or capacitor, or non-volatile magnetoresistive RAM or ferroelectric RAM. • A host-side buffer cache policy similar to the one used in the storage device. • Handling read requests with a much bigger RAM capacity and an asymmetrically weighted buffer management policy.
