uFLIP: Understanding Flash IO Patterns • Luc Bouganim, Björn Þór Jónsson, Philippe Bonnet • Assistant Professor: Kyumars Sheykh Esmaili • Danesh Zandi, Afshin Rahmany & Mohamad Kavosi • SRBIAU, Kurdistan Campus
Overview • 1. Introduction: Motivation, Contributions, How Flash Devices Work, Why the State Matters • 2. Content: Definitions, The Benchmark, Benchmarking Methodology, Device Evaluations • 3. Review: Problems, Evaluation, Conclusion
Introduction: Motivation • Flash devices (vs. HDDs) • Faster • More robust • Soon as large (capacity) • Lower latency • Higher throughput • More complex to handle • Read • Write → Program/Erase • New/adapted algorithms? • We need to understand the devices!
Introduction: Contributions • The uFLIP Benchmark • Consisting of 9 micro-benchmarks • Benchmarking Methodology • How to apply the benchmark • Device Evaluations • Example evaluations of a set of devices
Introduction: How Flash Devices Work • Units • Page: ~2KB • Block: 64 * 2KB ≈ 128 KB • Read • Program • Default state: 1 • Program → 0 • Erase • Back to default • Only possible 10⁵ to 10⁶ times per block • Block granularity → slow
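To make the page/block asymmetry concrete, here is a minimal, hypothetical sketch (not from the paper) of NAND flash block semantics: pages are read individually, programming can only clear bits from the erased all-1s state to 0, and erasing works only on whole blocks and is counted because it wears the block out.

```python
# Minimal sketch of NAND flash block semantics (illustrative only; sizes are
# taken from the slides: ~2 KB pages, 64 pages per block ≈ 128 KB).
PAGE_SIZE = 2 * 1024          # bytes per page
PAGES_PER_BLOCK = 64          # pages per erase block

ERASED = 0xFF                 # erased/default state: all bits set to 1

class FlashBlock:
    def __init__(self):
        # Every page starts in the erased state.
        self.pages = [bytearray([ERASED] * PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]
        self.erase_count = 0  # erases are limited (~10^5 to 10^6 cycles per block)

    def read(self, page_no):
        return bytes(self.pages[page_no])

    def program(self, page_no, data):
        # Programming can only clear bits (1 -> 0); it cannot set them back to 1.
        page = self.pages[page_no]
        for i, byte in enumerate(data):
            page[i] &= byte

    def erase(self):
        # Erase works on the whole block and is slow compared to read/program.
        for page in self.pages:
            page[:] = bytes([ERASED] * PAGE_SIZE)
        self.erase_count += 1
```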
Introduction: How Flash Devices Work • Flash Chips • Block Manager • Wear leveling • Maps LBAs to flash pages • Possibly trades in-place updates for writes into free pages • Possibly asynchronous page reclamation (see the sketch below)
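The block manager's out-of-place write strategy could be sketched as follows (a hypothetical simplification, not the paper's or any vendor's actual algorithm): an overwrite is redirected to a free physical page, the old page is only invalidated, and reclamation (garbage collection plus erases) runs lazily.

```python
# Hypothetical sketch of an FTL-style block manager: writes go to free pages,
# the old mapping is invalidated, and reclamation happens lazily.
class BlockManager:
    def __init__(self, total_pages):
        self.mapping = {}                        # LBA -> physical page number
        self.free_pages = list(range(total_pages))
        self.invalid_pages = []                  # stale pages awaiting reclamation
        self.storage = {}                        # physical page -> data

    def write(self, lba, data):
        if not self.free_pages:
            self.reclaim()                       # may trigger slow block erases
        new_page = self.free_pages.pop(0)
        old_page = self.mapping.get(lba)
        if old_page is not None:
            self.invalid_pages.append(old_page)  # old copy is now stale
        self.mapping[lba] = new_page
        self.storage[new_page] = data

    def read(self, lba):
        return self.storage.get(self.mapping.get(lba))

    def reclaim(self):
        # Placeholder for garbage collection: erase blocks that hold only
        # invalid pages and return their pages to the free list.
        self.free_pages.extend(self.invalid_pages)
        self.invalid_pages.clear()
```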
Introduction: Why the State Matters • General principles are well known • Details are not: flash devices are black boxes • # free pages unknown • Time of next erase unknown • Cost of an I/O operation is non-uniform over time • Depends on the state of the device
Content: Definitions • I/O operation • Time, size, LBA, read/write • Baseline patterns • Sequential/random read, sequential/random write • Time • Consecutive, pause, burst • Logical Block Address • Sequential, random, ordered, partitioned • Target offset/size, shift • Others • IOIgnore, IOCount
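As a rough illustration of these definitions (the function and parameter names below are assumptions for readability, not uFLIP's exact notation), an IO pattern can be described per IO by a read/write flag, a size, and an LBA; the sequential and random baselines differ only in how the LBA sequence is generated over the target area.

```python
import random

# Rough sketch of the baseline patterns: each IO is (is_read, size, LBA).
# Names and units are illustrative assumptions, not uFLIP's exact notation.
def sequential_pattern(is_read, io_size_kb, start_lba_kb, io_count):
    return [(is_read, io_size_kb, start_lba_kb + i * io_size_kb)
            for i in range(io_count)]

def random_pattern(is_read, io_size_kb, target_offset_kb, target_size_kb, io_count):
    # Random IOs are confined to a target area [target_offset, target_offset + target_size).
    slots = target_size_kb // io_size_kb
    return [(is_read, io_size_kb,
             target_offset_kb + random.randrange(slots) * io_size_kb)
            for _ in range(io_count)]
```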
Content: The Benchmark(s) • Granularity • I/O size • Locality • Target size
Content: The Benchmark(s) • Partitioning (sketched below) • Target space divided into partitions • Operations within a partition are sequential • Order • Linear increase/decrease, in-place • Parallelism • Target space divided into subsets • Each accessed by a different process
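A minimal sketch of how a partitioned pattern could be generated (the round-robin order across partitions is my assumption, not taken verbatim from the paper): the target space is split into partitions, IOs cycle over the partitions, and addresses advance sequentially within each partition.

```python
# Sketch of a partitioned pattern: round-robin across partitions,
# sequential within each partition (uFLIP's exact ordering may differ).
def partitioned_pattern(io_size_kb, target_size_kb, partitions, io_count):
    part_size = target_size_kb // partitions
    cursors = [p * part_size for p in range(partitions)]  # next LBA per partition
    lbas = []
    for i in range(io_count):
        p = i % partitions
        lbas.append(cursors[p])
        cursors[p] += io_size_kb
    return lbas
```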
Content: Benchmarking Methodology • Device state • Out of the box, 16KB write: 1 msec • After writing the whole device: 8 msec • Well-defined initial state • „Writing the whole flash device completely yields a well-defined state.“ • Start-up Phase • Defined by IOIgnore • Running Phase • Defined by IOCount – IOIgnore
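The start-up/running split can be illustrated with a small measurement loop (hypothetical code; `run_io` stands in for whatever routine actually issues one IO): the first IOIgnore response times are discarded, so only the running phase contributes to the reported cost.

```python
import time

def measure_pattern(run_io, pattern, io_ignore, io_count):
    """Issue io_count IOs and discard the first io_ignore response times.

    run_io is a placeholder for the routine that performs one IO from the
    pattern; only the running phase (IOs io_ignore..io_count-1) is used
    for the reported mean response time.
    """
    times = []
    for i in range(io_count):
        start = time.perf_counter()
        run_io(pattern[i % len(pattern)])
        times.append(time.perf_counter() - start)
    running = times[io_ignore:]
    return sum(running) / len(running)
```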
Content: Device Evaluations • Devices are from 2009 • Range from USB sticks and IDE modules to SSDs • From $12 to $943 and 2GB to 32GB • More expensive → faster • Parallelism has no effect
Review: Problems • They vary only one parameter at a time • Interactions between parameters are not captured • Multidimensional graphs can be analyzed • A full factorial design is not feasible • e.g. what if locality and partitioning work well together? • Why not a 2^k factorial design? (see sketch below) • „Writing the whole flash device completely yields a well-defined state.“ • Next paragraph: • „...by performing random IOs of random size (ranging from 0.5KB to the flash block size, 128KB) on the whole device.“ • „writes“ or „random IOs“? • What does „the whole device“ mean? • All LBAs? All flash pages (not possible)? Total size?
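The suggested 2^k factorial design could look like the following sketch (the factors and levels here are invented for illustration, not taken from the paper): each benchmark parameter gets a low and a high level, and every combination is run, so interactions such as locality × partitioning become visible at a cost of 2^k runs instead of a full multi-level factorial.

```python
from itertools import product

# Illustrative 2^k factorial design over benchmark parameters; the factors
# and levels below are made-up examples, not uFLIP's actual parameters.
factors = {
    "io_size_kb":     (4, 128),                  # low, high
    "target_size":    ("small", "whole_device"),
    "partitions":     (1, 16),
    "parallel_procs": (1, 8),
}

def factorial_design(factors):
    names = list(factors)
    for levels in product(*(factors[n] for n in names)):
        yield dict(zip(names, levels))

for run in factorial_design(factors):
    print(run)   # each of the 2^4 = 16 combinations would be one benchmark run
```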
Review: Evaluation • The paper was interesting to read. • Of the 3 contributions: • The results are obsolete (but interesting). • The methodology is (mostly) well-known benchmarking best practices. • The benchmark is still valid and useful. • More explanations for the results would have been helpful.
Review: Conclusion • Many areas for improvement • Automation • Capturing interaction • SSDs are getting more and more important • Evaluation with today's devices • Parallelism? • No alternative offers as much information