Explore flash memory characteristics, in-page logging, IPL algorithms, and transaction support in database management. Learn the impact on software design and experimentation results using IPL. Discover innovative solutions for updating and reading data efficiently.
Design of Flash-Based DBMS: An In-Page Logging Approach
Tsung, Computer Science, RUC
Outline
• Characteristics of Flash Memory
• Main Idea of In-Page Logging
• Design Manifesto and Data Structure of IPL
• Read, Update and Merge Algorithms of IPL Without Transactions
• Support for Transactions
• Experiments of IPL
• Conclusion
Characteristics of Flash Memory (NAND)
• Structure of NAND flash memory: Page (512k) → Erase Unit (16 pages) → Flash Chip (*G units) → Flash Memory (**G is available)
• Key hardware limits of flash memory
  - Electronic device (uniform access speed)
  - Different granularities (read/write per page, erase per unit)
  - Erase before write (a page cannot be overwritten in place)
  - Sequential write within a unit (from page 0, 1, 2, ... to page 15)
  - Finite number of erase cycles (typically up to 100,000)
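The hardware limits above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `EraseUnit` class and the `update_page` helper are hypothetical names, pages hold arbitrary Python values, and the 16-page unit and 100,000-cycle limit are taken from the slide. It shows why a naive in-place update of one page costs a whole-unit erase plus a rewrite of every programmed page.

```python
class EraseUnit:
    """Toy model of one NAND erase unit (hypothetical, for illustration)."""

    PAGES_PER_UNIT = 16        # from the slide
    MAX_ERASE_CYCLES = 100_000  # typical limit quoted on the slide

    def __init__(self):
        self.pages = [None] * self.PAGES_PER_UNIT  # None means "erased"
        self.next_page = 0   # pages must be programmed in order 0, 1, 2, ...
        self.erase_count = 0

    def write_page(self, page_no, data):
        if not 0 <= page_no < self.PAGES_PER_UNIT:
            raise IndexError("no such page in this unit")
        if page_no < self.next_page:
            # Erase-before-write: a programmed page cannot be overwritten.
            raise ValueError("page already programmed; erase the unit first")
        if page_no > self.next_page:
            # Sequential-write constraint within the unit.
            raise ValueError("pages must be programmed in order")
        self.pages[page_no] = data
        self.next_page += 1

    def read_page(self, page_no):
        return self.pages[page_no]

    def erase(self):
        if self.erase_count >= self.MAX_ERASE_CYCLES:
            raise RuntimeError("erase unit is worn out")
        self.pages = [None] * self.PAGES_PER_UNIT
        self.next_page = 0
        self.erase_count += 1


def update_page(unit, page_no, data):
    """Naive update of an already programmed page.

    Saves every programmed page, erases the unit, and rewrites them all.
    Returns the number of page writes this single update caused.
    """
    saved = unit.pages[:unit.next_page]
    saved[page_no] = data
    unit.erase()
    for i, value in enumerate(saved):
        unit.write_page(i, value)
    return len(saved)
```

With a fully programmed unit, changing one page this way costs one erase and 16 page writes, which is the overhead in-page logging tries to avoid.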
Characteristics of Flash Memory (NAND)
• Impact on software design
  - No in-place update: how can updates be avoided and buffered? How can an index be maintained without high cost?
  - No mechanical latency: clustered storage no longer matters; information can be scattered around the memory without substantial penalty.
  - Asymmetric speed of read/write/erase: traditional I/O cost estimates no longer apply; moreover, writes (and therefore erases) can be avoided at the expense of extra reads.
Main Idea of In-Page Logging
• Use a log to buffer writes, so that writes can be combined and erases reduced
• Take advantage of the uniform speed of sequential and random access to co-locate a page and its log in the same erase unit
Design Manifesto and Data Structure of IPL
• Design manifesto (memory hierarchy = RAM + NAND)
  - Take advantage of flash memory's access characteristics (uniform sequential/random access and faster reads)
  - Overcome the erase-before-write limitation by avoiding writes and erases
  - Minimize the changes made to the DBMS architecture: changes are limited to the buffer manager and the storage manager
Data Structure of IPL
1. Erase unit = 15 page-bases + 1 log area
2. Structure of buffer and flash memory (figure)
Read, Update and Merge Algorithms of IPL Without Transactions
• Update logic
1. Update the page-basis and append a log record to its log sector directly (both in RAM).
2. When the log sector is full or the dirty page is evicted from the buffer pool, write the log sector to flash. (The page-basis is not written to flash.)
3. When no free log sector is available, trigger a Merge operation.
• Read logic
1. When the page is in the buffer pool, return it.
2. On a page fault, read the page and its log from flash, apply the logged changes to the page-basis, and return the newly computed page.
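The update and read logic above can be sketched as a small simulation. Everything here is an illustrative assumption rather than the authors' implementation: the names `FlashUnit` and `IPLBuffer` are hypothetical, a page is modeled as a key/value dict, the capacities are tiny so that a merge triggers quickly, and a merge is modeled as applying all logs to the page-bases and counting one erase.

```python
from dataclasses import dataclass, field

LOG_SECTORS_PER_UNIT = 2   # assumed tiny log area, for demonstration only
RECORDS_PER_SECTOR = 2     # assumed log sector capacity


@dataclass
class FlashUnit:
    """One erase unit on flash: page-bases plus flushed log sectors."""
    pages: dict = field(default_factory=dict)        # page_id -> {key: value}
    log_sectors: list = field(default_factory=list)  # lists of (page_id, key, value)
    erases: int = 0


class IPLBuffer:
    """In-memory buffer pool that logs updates instead of rewriting pages."""

    def __init__(self, unit):
        self.unit = unit
        self.cache = {}  # page_id -> (current page copy, in-memory log sector)

    def read(self, page_id):
        if page_id in self.cache:                      # read logic, step 1
            return self.cache[page_id][0]
        # Read logic, step 2: page-basis plus its flushed log records.
        page = dict(self.unit.pages.get(page_id, {}))
        for sector in self.unit.log_sectors:
            for pid, key, value in sector:
                if pid == page_id:
                    page[key] = value
        self.cache[page_id] = (page, [])
        return page

    def update(self, page_id, key, value):
        page = self.read(page_id)
        log = self.cache[page_id][1]
        page[key] = value                              # update logic, step 1
        log.append((page_id, key, value))
        if len(log) == RECORDS_PER_SECTOR:             # step 2: sector is full
            self._flush_log(page_id)

    def evict(self, page_id):
        if self.cache[page_id][1]:                     # step 2: dirty page evicted
            self._flush_log(page_id)
        del self.cache[page_id]                        # page-basis is NOT written

    def _flush_log(self, page_id):
        log = self.cache[page_id][1]
        if len(self.unit.log_sectors) == LOG_SECTORS_PER_UNIT:
            self._merge()                              # step 3: no free log sector
        self.unit.log_sectors.append(list(log))
        log.clear()

    def _merge(self):
        # Apply every flushed log record to its page-basis, then count one erase.
        for sector in self.unit.log_sectors:
            for pid, key, value in sector:
                self.unit.pages.setdefault(pid, {})[key] = value
        self.unit.log_sectors = []
        self.unit.erases += 1
```

In this sketch, five updates to one page cause a single erase, while the page-basis on flash is rewritten only at merge time.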
Support for Transactions
• Some concepts
1. Two buffer management policies: No-force (requires REDO) / Force
2. Three transaction statuses: Committed / Aborted / Active
3. IPL adopts No-force and avoids REDO through a redefined read logic
• Additional log and data structures
1. Global transaction log: records the status of each transaction
2. Dirty page list: finds a transaction's pages quickly
• Transaction handling idea
  - T committed: T's log is applied (simple to handle)
  - T aborted: T's log is ignored (simple to handle)
  - T active: T's log isn't applied yet (new problems)
  - New problems: the erase unit's log sectors overflow and merge efficiency is low; these are handled with an append unit for logs and selective merge
Support for Transactions from an Operational Perspective
• Update
1. Append a log record to the page's log sector directly (in RAM); only the log sector changes.
2. When the log sector is full, the dirty page is evicted from the buffer pool, or the transaction commits, write the log sector to flash. (The page-basis is not written to flash.)
3. When no free log sector is available, trigger a Merge operation.
• Read
Read the page-basis and its log sector from flash and apply the log to the page-basis using the following policy:
  - Committed: applied
  - Aborted / Active: ignored
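The read policy above can be sketched as a filter over log records. This is a minimal illustration, not the authors' code: `TxStatus`, `read_page` and `selective_merge` are hypothetical names, a page is a key/value dict, log records are `(tx_id, key, value)` tuples, and the `tx_status` dict stands in for the global transaction log. The `selective_merge` function reflects one plausible reading of "selective merge" (assumed here): committed records are applied, aborted ones are dropped, and records of active transactions are carried over.

```python
from enum import Enum


class TxStatus(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"


def read_page(page_basis, log_records, tx_status):
    """Rebuild a page from its page-basis and its log records.

    Only records of committed transactions are applied; records of
    aborted, active, or unknown transactions are ignored.
    """
    page = dict(page_basis)
    for tx_id, key, value in log_records:
        if tx_status.get(tx_id) is TxStatus.COMMITTED:
            page[key] = value
    return page


def selective_merge(page_basis, log_records, tx_status):
    """Assumed merge policy: apply committed, drop aborted, carry over active.

    Returns the merged page and the log records that must be kept.
    """
    merged = read_page(page_basis, log_records, tx_status)
    carried = [r for r in log_records if tx_status.get(r[0]) is TxStatus.ACTIVE]
    return merged, carried


status = {1: TxStatus.COMMITTED, 2: TxStatus.ABORTED, 3: TxStatus.ACTIVE}
log = [(1, "a", 10), (2, "a", 20), (3, "b", 30)]
page = read_page({"a": 0, "b": 0}, log, status)   # {"a": 10, "b": 0}
```

Because uncommitted changes never reach the page-basis, an abort needs no undo of flash pages, and a commit needs no REDO beyond having the log sector on flash.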
Experiments of IPL: Flash/Disk Access Characteristics in Sequential/Random Patterns
• Flash is an electronic device
• Write/erase operations are time-consuming
• Experimental results show:
  - Read: disk is more sensitive to the access pattern than flash
  - Write: flash is more sensitive to the access pattern than disk
TPC-C Locality Analysis Setup of TPC-C
Experimental results show:
• The distribution of update frequencies was highly skewed.
• The temporal locality of page updates is poor.
• Using a log to buffer writes may be efficient.
• (d) Temporal locality of data page updates: the probability that 16 consecutive physical writes go to 16 distinct data pages is 99.99%, and the probability of a unit erase is 93.1%.
• Using a buffer pool alone isn't enough.
Impact of the Log Region Size per Erase Unit (IPL)
Experimental results show: size of log region per unit vs. write time
Impact of the Buffer Pool Size of the Database Server (IPL)
tConv = (α × page writes) × 20 ms
• α denotes the probability that a page write will cause a unit erase
• α is 93.1% according to the previous analysis of TPC-C temporal locality
Experimental results show: buffer pool size vs. write/merge times
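As a worked example of the formula above, the estimate can be computed directly. The function name `t_conv_ms` and the figure of 1,000 page writes are hypothetical and chosen only for illustration; α = 0.931 and the 20 ms cost come from the slide.

```python
ALPHA = 0.931      # probability that a page write causes a unit erase (from the slide)
COST_MS = 20       # cost charged per erase-causing page write, in ms (from the slide)


def t_conv_ms(page_writes, alpha=ALPHA, cost_ms=COST_MS):
    """Estimated write time of a conventional server: (alpha * page writes) * cost."""
    return alpha * page_writes * cost_ms


# Hypothetical workload of 1,000 page writes: roughly 18.6 seconds.
estimate_seconds = t_conv_ms(1_000) / 1000
```

Since α is close to 1 for this workload, nearly every page write pays the full erase cost, so the estimate grows almost linearly with the number of page writes.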
Conclusion
• This paper is simple but timely
  - Main contribution: in-page logging + support for transactions
• What is flash and what is a DBMS
• Q&A