File Processing : Storage Media 2018, Spring Pusan National University Ki-Joune Li
Major Functions of Computer
• Computation
• Storage
• Communication
• Presentation
Storage of Data
• Major Challenges
  • How to store and manage a large amount of data
    • Example: more than 100 petabytes for the EOS Project
  • How to represent sophisticated data
Modeling and Representation of Real World
• Example
  • Building a DB about Korean history
  • Very complicated and dependent on viewpoint
• Database Course: 2017 Fall semester
• [Figure: Real World and Computer World]
Managing Large Volume of Data
• Large Volume of Data
  • Cost for Storage Media
    • Not very important and negligible
  • Processing Time
    • Time is the most valuable resource
• Comparison between main memory and disk access time
  • RAM (Random Access Memory): several $10^{-9}$ sec
  • SSD (Solid State Drive): under $10^{-4}$ sec
  • HDD (Hard Disk Drive): several $10^{-3}$ sec
  • HDD is $10^{6}$ times slower than RAM
• Difference between handling data in RAM and HDD
  • Handling data in HDD: same way that we handle data in RAM
  • How to handle this gap between RAM and disk memory
Managing Large Volume of Data
• Management of Data
  • Secure Management
    • From hacking
    • From any kind of disaster
  • Consistency of Data
    • Example
      • Failure during a flight reservation transaction
      • Concurrent transactions
Goals of File Systems
• To provide
  1. Efficient data structures for storing large and complex data
  2. Access methods for rapid search
  3. Query processing methods
  4. Robust management of transactions
Memory Hierarchy
• Large Data Volume
  • Cannot be stored in main memory
  • But in secondary memory
• Memory Hierarchy (faster toward the top, cheaper toward the bottom)
  • Cache Memory: 8 M bytes (Core i7, L3 Cache)
  • Main Memory: 16 G bytes
  • Secondary Memory: 1 T bytes
  • Tertiary Memory: 10 Peta bytes
SSD (Flash Memory)
• Solid State Drive
  • Only electronic operations, unlike HDD
• Characteristics
  • Aging Problem: only a limited number of write/erase cycles (e.g. 1 M)
  • Asymmetric Read/Write Speed
    • Read: fast, and a byte (or word) can be read at a time
    • Write: a little slower than reading
    • Erase: slower, and has to be done to an entire bank of memory
• NAND vs. NOR Flash Memory
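The erase-before-write behaviour and the aging problem can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `FlashBank`, `flash_erase`, `flash_read_byte`, `flash_write_byte` and the 4096-byte bank size are hypothetical names and values chosen for illustration, not a real flash driver API. The cycle limit reuses the slide's "e.g. 1 M".

```c
#include <string.h>

#define BANK_SIZE 4096              /* bytes per erase unit (illustrative size) */
#define MAX_ERASE_CYCLES 1000000L   /* the slide's "e.g. 1 M" limit */

/* Hypothetical model of one flash bank held in ordinary memory. */
typedef struct {
    unsigned char data[BANK_SIZE];
    long erase_count;               /* how many times this bank was erased */
} FlashBank;

/* Erase the whole bank (every bit set to 1). Returns -1 once the bank is worn out. */
int flash_erase(FlashBank *bank)
{
    if (bank->erase_count >= MAX_ERASE_CYCLES)
        return -1;
    memset(bank->data, 0xFF, BANK_SIZE);
    bank->erase_count++;
    return 0;
}

/* Reading touches only the requested byte. */
unsigned char flash_read_byte(const FlashBank *bank, int offset)
{
    return bank->data[offset];
}

/* Overwriting one byte: save the bank, erase all of it, then write it back. */
int flash_write_byte(FlashBank *bank, int offset, unsigned char value)
{
    unsigned char copy[BANK_SIZE];

    memcpy(copy, bank->data, BANK_SIZE);
    copy[offset] = value;
    if (flash_erase(bank) != 0)
        return -1;
    memcpy(bank->data, copy, BANK_SIZE);
    return 0;
}
```

In this model every one-byte overwrite costs a full bank erase, which is why writes are more expensive than reads and why each write also uses up one of the limited erase cycles.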
Optical Storage
• Non-volatile
• CD, DVD
• Speed
  • Slower than HDD
• Juke-box systems
  • Large numbers of removable disks,
  • Few drives, and
  • Mechanism for automatic loading/unloading of disks
  • For storing large volumes of data
Tape
• Non-volatile and large volume (e.g. 15 TB per cartridge)
• Primarily used for backup
• Sequential access: much slower than disk
• But data transfer rate: up to 750 MB/sec for some tape drives
Data Access with Secondary Memory
• Access Request
  • If in main memory: get data from main memory
  • If not in main memory: access to disk, load on main memory, then get data
• Hit Ratio $r_h = n_h / n_a$ ($n_h$: number of hits, $n_a$: number of accesses)
• How to increase hit ratio?
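One common way to raise the hit ratio is to keep several recently used blocks in main memory. The sketch below measures the hit ratio of a small buffer with least-recently-used (LRU) replacement. LRU is an assumption here, since the slide does not name a policy, and `BufferPool`, `pool_init`, `pool_access`, `pool_hit_ratio` and `POOL_SIZE` are hypothetical names. Block numbers are assumed to be non-negative.

```c
#define POOL_SIZE 4   /* number of blocks the buffer can hold (illustrative) */

typedef struct {
    long block_no[POOL_SIZE];   /* block held in each slot, -1 if empty */
    long last_used[POOL_SIZE];  /* logical time of the most recent use */
    long clock;                 /* counts access requests (n_a) */
    long hits;                  /* counts requests found in memory (n_h) */
} BufferPool;

void pool_init(BufferPool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        p->block_no[i] = -1;
        p->last_used[i] = 0;
    }
    p->clock = 0;
    p->hits = 0;
}

/* Request one block. Returns 1 on a hit, 0 when the block has to come from disk. */
int pool_access(BufferPool *p, long block)
{
    int victim = 0;

    p->clock++;
    for (int i = 0; i < POOL_SIZE; i++) {
        if (p->block_no[i] == block) {
            p->last_used[i] = p->clock;
            p->hits++;
            return 1;
        }
        if (p->last_used[i] < p->last_used[victim])
            victim = i;
    }
    /* Miss: reuse the least recently used slot (empty slots have last_used 0). */
    p->block_no[victim] = block;
    p->last_used[victim] = p->clock;
    return 0;
}

/* r_h = n_h / n_a, or 0 before any request has been made. */
double pool_hit_ratio(const BufferPool *p)
{
    return p->clock == 0 ? 0.0 : (double)p->hits / (double)p->clock;
}
```

Calling `pool_access` for blocks 1, 2, 1, 1 produces two misses followed by two hits, so `pool_hit_ratio` reports 0.5 for that sequence.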
Why is Hit Ratio so important?
• Example
    for (int i = 0; i < 1000; i++) Nbytes = read(fd, buf, 100);
• 1000 disk accesses?
  • When $r_h = 0$: $1000 \times 10^{-2}$ sec $= 10$ sec
  • When $r_h = 1$: $1000 \times 10^{-8}$ sec $= 10^{-5}$ sec
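The two extreme cases generalize to any hit ratio by weighting the two per-access costs. The function below is a minimal sketch: the costs ($10^{-2}$ sec per disk access, $10^{-8}$ sec per memory access) are the slide's example values, and `total_access_time` is a hypothetical helper name.

```c
/* Per-access costs taken from the slide's example (assumed, not measured). */
#define T_DISK_SEC   1e-2   /* one disk access */
#define T_MEMORY_SEC 1e-8   /* one main-memory access */

/* Total time for n_accesses requests when a fraction hit_ratio
   of them is served from main memory and the rest from disk. */
double total_access_time(long n_accesses, double hit_ratio)
{
    double per_access = hit_ratio * T_MEMORY_SEC
                      + (1.0 - hit_ratio) * T_DISK_SEC;
    return (double)n_accesses * per_access;
}
```

With these values, `total_access_time(1000, 0.0)` is about 10 sec and `total_access_time(1000, 1.0)` is about $10^{-5}$ sec. Even at a hit ratio of 0.9 the total is still about 1 sec, because the remaining 100 disk accesses dominate.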
Physical Structure of Disk
• [Figure: physical structure of a disk; labels: 200~400 sectors, 512 bytes, 2 * nDF]
Disk Access Time
• Disk Access Time $t = t_S + t_R + t_T$, where
  • $t_S$: Seek Time
    • Time to reposition the head over the correct track
    • Average seek time is 1/2 the worst case seek time
    • 4 to 10 milliseconds on typical disks
  • $t_R$: Rotational Latency
    • Time to reposition the head over the correct sector
    • Average rotational latency: $\frac{1}{2}r$ (to find index point) $+ \frac{1}{2}r = r$, where $r$ is the time for one rotation
    • In case of 15000 rpm: $r = 60\ \text{sec} / 15000 = 4$ msec
  • $t_T$: Transfer Time
    • Time to transfer data from disk to main memory via channel
    • Proportional to the number of sectors to read
    • Real transfer time is negligible
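The formula can be turned into a small calculator. This sketch follows the slide in taking the average rotational latency to be one full rotation $r$. The function names `rotation_time_ms` and `disk_access_time_ms` are hypothetical, and all times are in milliseconds.

```c
/* Time for one full rotation in milliseconds, given the spindle speed in rpm. */
double rotation_time_ms(double rpm)
{
    return 60.0 * 1000.0 / rpm;
}

/* Access time t = tS + tR + tT, all in milliseconds.
   As on the slide, the average rotational latency tR is taken as one rotation r. */
double disk_access_time_ms(double seek_ms, double rpm, double transfer_ms)
{
    double rotational_latency_ms = rotation_time_ms(rpm);
    return seek_ms + rotational_latency_ms + transfer_ms;
}
```

For a 15000 rpm disk, `rotation_time_ms(15000)` gives 4 msec. With a 4 msec seek and a negligible transfer time, `disk_access_time_ms(4.0, 15000, 0.0)` gives 8 msec.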
Block-Oriented Disk Access
• Example
    for (int i = 0; i < 1000; i++) Nbytes = read(fd, buf, 10);
• [Figure: reads of 10 bytes served from a buffer in main memory holding 1 block (e.g. 1024 bytes), filled from disk 1024 bytes at a time; labels for Number of Disk Accesses: 1000 times, 10 times, 100 times]
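The number of disk accesses for this loop can be counted with a small simulation. It assumes sequential reads and a buffer that holds exactly one block. `count_disk_accesses` is a hypothetical helper, and it assumes `read_size` and `block_size` are positive.

```c
/* Count the block loads needed to serve n_reads sequential reads of
   read_size bytes each, when one block of block_size bytes is kept
   in a main-memory buffer. */
long count_disk_accesses(long n_reads, long read_size, long block_size)
{
    long buffered_block = -1;   /* no block loaded yet */
    long disk_accesses = 0;
    long offset = 0;

    for (long i = 0; i < n_reads; i++) {
        long first_block = offset / block_size;
        long last_block = (offset + read_size - 1) / block_size;

        /* A read may straddle a block boundary, so visit every block it touches. */
        for (long b = first_block; b <= last_block; b++) {
            if (b != buffered_block) {
                disk_accesses++;
                buffered_block = b;
            }
        }
        offset += read_size;
    }
    return disk_accesses;
}
```

Under these assumptions, `count_disk_accesses(1000, 10, 1024)` returns 10: the 10,000 bytes requested span ten 1024-byte blocks, and each block is loaded once. If the block were only 10 bytes, the same loop would need 1000 disk accesses.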
Disk Block
• Unit of Disk Access
• Block Size
  • Normally a multiple of sectors
  • 1K, 4K, 16K or 64K bytes depending on configuration
• Why not a large block?
  • Limited by the size of available main memory
  • Too large: unnecessary accesses of sectors
    • e.g. only 100 bytes needed, when block size is given as 64K
    • 1 block: 128 sectors (about ½ track, ½ rotation, 2 msec)
    • Too wasteful
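The 64K example can be checked with two short helpers. This is a sketch that assumes the 512-byte sector size from the disk structure slide. `sectors_per_block` and `useful_fraction` are hypothetical names.

```c
#define SECTOR_SIZE 512L   /* bytes per sector, as on the disk structure slide */

/* Number of sectors that must be read to fetch one block (rounded up). */
long sectors_per_block(long block_size)
{
    return (block_size + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Fraction of a fetched block that the program actually needs. */
double useful_fraction(long bytes_needed, long block_size)
{
    return (double)bytes_needed / (double)block_size;
}
```

For a 64K block, `sectors_per_block(65536)` is 128, and `useful_fraction(100, 65536)` is roughly 0.0015, so about 99.8% of the transferred data goes unused when only 100 bytes are needed.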