Paper by: Chris Ruemmler and John Wilkes Presentation by: Timothy Goldberg, Daniel Sink, Erin Collins, and Tony Luaders
Introduction • Disk drive performance improves by only 7-10% per year • By comparison, microprocessor performance improves by 40-60% and disk storage capacity by 60-80% per year • Simulation models are used to compare alternative approaches • These simulations need a high-quality disk drive model • A detailed model can make the error 14 times smaller
Outline • Introduction • Characteristics of Modern Disk Drives • Recording Components • Positioning Components • Disk Controller • Modeling Disk Drives
Characteristics of Modern Disk Drives • Non-removable magnetic disk drives • Contain a mechanism and a controller • Recording components: rotating disks and the heads that access them • Positioning components: an arm assembly that moves the heads into the correct position, with a track-following system that keeps them there • Emphasis on features that could be important when creating a disk drive model
Recording Components • Smaller disks: • Less surface area for data • Less power consumption • Can spin faster • Smaller seek distances • Increased storage density comes from: • Better linear recording density, set by the maximum rate of flux changes that can be recorded and read back • Packing separate tracks of data more closely together • A drive may contain from 1 to 12 platters • The stack rotates in lockstep
Recording Components • Spindle rotation speed: • Higher spin speed increases transfer rates, shortens rotation latencies • Higher power consumption, requires better bearings • Each platter surface has a disk head • Responsible for recording (writing) • And sensing (reading) magnetic flux variation • Single Read-Write data channel • Can be switched between the heads • Responsible for encoding and decoding data stream into or from a series of magnetic phase changes stored on the disk
Positioning Components • Data surfaces are set up to store data in tracks • A modern 3.5-inch disk has about 2,000 cylinders • A cylinder is a single stack of tracks at a common distance from the spindle • To access the data stored on a track, the disk arm must move the disk head over that track; the platters themselves only spin • The positioning system must keep the head on the track despite disturbances • External vibrations, shocks, and disk flaws (non-circular tracks)
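Seeking • Seeking is the movement of the head to the desired track; its speed is limited by the power available to move the arm • Faster seeking requires more power • Halving the seek time requires 4x the power • A seek is composed of: • Speedup (arm accelerates until it reaches half the seek distance or its maximum velocity) • Coast (for long seeks, arm moves at maximum velocity) • Slowdown (arm is brought to rest close to the desired track) • Settle (controller adjusts the disk head onto the desired location)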
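The four seek phases can be illustrated with a small constant-acceleration sketch. This is not the model from the paper: the function name and the values for `accel`, `v_max`, and `settle_ms` are hypothetical, chosen only so that the speedup/slowdown-only case and the coast case meet at 800 cylinders.

```python
import math


def seek_time_ms(distance, accel=32.0, v_max=160.0, settle_ms=1.0):
    """Seek time for `distance` cylinders (all parameters are illustrative).

    accel is in cylinders/ms^2, v_max in cylinders/ms.
    """
    if distance <= 0:
        return 0.0
    # Distance covered by a full speedup to v_max plus a full slowdown.
    coast_threshold = v_max ** 2 / accel
    if distance <= coast_threshold:
        # Short seek: speed up over half the distance, slow down over the rest.
        move = 2.0 * math.sqrt(distance / accel)
    else:
        # Long seek: speedup, coast at v_max, then slowdown.
        move = 2.0 * v_max / accel + (distance - coast_threshold) / v_max
    return move + settle_ms
```

Under these assumptions short seeks grow with the square root of the distance and long seeks grow linearly, and the two branches agree at the threshold distance.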
Track Following • Fine-tunes the head position at the end of a seek and keeps the head on the desired track • Determines whether the head is correctly aligned by using positioning information recorded on the disk at manufacturing time • Also used to perform head switches • A head switch occurs when the controller switches its data channel from one surface to the next in the same cylinder
Data layout • A disk appears to its client computer as a linear vector of addressable blocks, which are mapped to physical sectors on the disk. • Using this mapping, the disk can hide bad sectors and do low-level performance optimizations. • Zoning: tracks are longer at the outside of a platter than at the inside, so outer tracks can hold more sectors • Maximizes storage capacity • Track skewing: offsets the start of each track for faster sequential access across track boundaries • Allows data to be read or written at nearly full media speed • Sparing: the drive stores a list of flaws in the disk surface so that they can be skipped
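The mapping from the linear block vector to physical sectors can be sketched as follows for a zoned disk. The zone table, the head count, and the function name are hypothetical, and the sketch ignores track skewing and sparing.

```python
from typing import NamedTuple


class Zone(NamedTuple):
    first_cylinder: int
    cylinders: int
    sectors_per_track: int


# Hypothetical geometry: the outer zone holds more sectors per track.
ZONES = [Zone(0, 100, 72), Zone(100, 100, 56)]
HEADS = 4


def lba_to_physical(lba, zones=ZONES, heads=HEADS):
    """Map a logical block address to (cylinder, head, sector)."""
    if lba < 0:
        raise ValueError("block address must be non-negative")
    for zone in zones:
        zone_blocks = zone.cylinders * heads * zone.sectors_per_track
        if lba < zone_blocks:
            blocks_per_cylinder = heads * zone.sectors_per_track
            cylinder, rest = divmod(lba, blocks_per_cylinder)
            head, sector = divmod(rest, zone.sectors_per_track)
            return zone.first_cylinder + cylinder, head, sector
        lba -= zone_blocks  # skip this zone and continue into the next
    raise ValueError("block address beyond end of disk")
```

For example, block 72 lands on the second surface of cylinder 0, and block 28800 is the first block of the inner zone.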
The Disk Controller • Mediates access to the mechanism • Runs the track-following system • Transfers data between the disk drive and the client • Manages an embedded cache
Caching of requests • The speed-matching buffer can be extended to include some form of caching for both reads and writes. • Caches in disk drives are relatively small because of space limitations. • Read-ahead: the drive keeps reading past the requested data into its cache; a later request that hits the cache avoids a seek • Write caching: the drive holds write data in its cache and can report the write as complete before the data reaches the disk (immediate reporting) • The cache is volatile, losing its contents if power to the drive is lost • Command queuing: allows multiple outstanding requests at the same time • The disk controller determines the best execution order, subject to additional host constraints.
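The slides do not say how a controller picks the execution order for queued commands. As one possible illustration, the sketch below uses a greedy shortest-seek-first policy; the policy and the function name are assumptions, not something the paper prescribes.

```python
def shortest_seek_first(current_cylinder, pending):
    """Order queued requests (given as cylinder numbers) greedily by seek distance.

    Illustrative policy only; real controllers may also weigh rotational
    position and host-imposed ordering constraints.
    """
    remaining = list(pending)
    order = []
    position = current_cylinder
    while remaining:
        # Pick the request closest to where the head currently is.
        nearest = min(remaining, key=lambda cyl: abs(cyl - position))
        remaining.remove(nearest)
        order.append(nearest)
        position = nearest
    return order
```

With the head at cylinder 50 and requests for cylinders 10, 60, 55, and 90 queued, this policy services 55 and 60 first, then 90, and finally travels back to 10.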
The Simulator • Written in C++ using a version of the AT&T tasking library • The basic ideas are readily applicable to other simulation environments • The disk drive is modeled as two tasks and some additional control structures • Task one models the mechanism, including the head and platter (rotation) positions. • Task two, the direct memory access (DMA) engine, models the SCSI bus interface and its transfer engine.
The Simulator • The cache object buffers requests between the two tasks and is used to manage the asynchronous interactions between the bus interface and the disk mechanism tasks. • The simulator can process about 2,000 I/Os per second on an HP9000 Series 800 Model H50 system • This allows 1 million requests to be serviced in approximately 10 minutes
Evaluation • Took week-long samples from a longer trace series of HP-UX (Unix) computer systems. • To evaluate the models, plot the I/O time distribution curve for the real drive and for the model output, then take the root mean square of the horizontal distance between these two curves; this value is the demerit.
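The metric can be sketched as below, reading "horizontal distance" as the difference between the two distributions' I/O times at the same cumulative fraction. The function name, the number of sample points, and the simple nearest-rank quantile lookup are assumptions made for illustration.

```python
import math


def demerit(real_times, model_times, points=100):
    """RMS horizontal distance between two cumulative time distributions.

    At each cumulative fraction p, the horizontal distance is the gap
    between the real and modeled I/O times at that fraction.
    """
    real_sorted = sorted(real_times)
    model_sorted = sorted(model_times)
    total = 0.0
    for i in range(points):
        p = (i + 0.5) / points  # mid-point fractions, always below 1.0
        real_q = real_sorted[int(p * len(real_sorted))]
        model_q = model_sorted[int(p * len(model_sorted))]
        total += (real_q - model_q) ** 2
    return math.sqrt(total / points)


def relative_demerit(real_times, model_times):
    """Demerit expressed as a fraction of the mean real I/O time."""
    mean_real = sum(real_times) / len(real_times)
    return demerit(real_times, model_times) / mean_real
```

A model that reproduces the real distribution exactly scores 0, and a model whose times are all off by a constant scores that constant.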
No modeling • Uses a constant fixed time for each I/O • Demerit of 35% of the mean I/O time • This is a poor model
A simple model • A better model uses: • A seek time linear with the distance • No head-settle effects or head-switching costs • A rotational delay • A fixed controller overhead • A transfer time linear with the length of the request • Demerit of 15% of the mean I/O time
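The components of the simple model add up as in the sketch below. All constants are hypothetical placeholders, and drawing the rotational delay uniformly from one revolution is an assumption; the slide only says the model includes a rotational delay.

```python
import random


def simple_io_time_ms(seek_distance, sectors, rng=random,
                      overhead_ms=2.0, seek_ms_per_cylinder=0.01,
                      rotation_ms=15.0, transfer_ms_per_sector=0.2):
    """I/O time under the simple model (all constants are illustrative)."""
    seek = seek_ms_per_cylinder * seek_distance        # linear in distance
    rotational_delay = rng.uniform(0.0, rotation_ms)   # assumed uniform
    transfer = transfer_ms_per_sector * sectors        # linear in length
    return overhead_ms + seek + rotational_delay + transfer
```

Passing a seeded `random.Random` instance as `rng` makes runs repeatable, which helps when comparing model output against a trace.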
Modeling head-positioning effects • Determine which track and cylinder each request starts on and where it ends • Add a fixed cost of 2.5 ms for each head and track switch • Demerit of 6.2% of the mean I/O time
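The fixed switch cost can be sketched by counting how many track boundaries a request crosses. The 2.5 ms figure comes from the slide; the function name and the uniform `sectors_per_track` (that is, no zoning) are simplifying assumptions.

```python
SWITCH_COST_MS = 2.5  # fixed cost per head or track switch, from the slide


def switch_penalty_ms(start_sector, sector_count, sectors_per_track):
    """Extra time for the track boundaries crossed by one request.

    start_sector is a linear sector index; a uniform track size is
    assumed for simplicity.
    """
    if sector_count <= 0:
        return 0.0
    first_track = start_sector // sectors_per_track
    last_track = (start_sector + sector_count - 1) // sectors_per_track
    return SWITCH_COST_MS * (last_track - first_track)
```

A request that stays within one track pays nothing, while one that starts near the end of a track and spills into the next pays 2.5 ms.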
Modeling rotation position • Calculate rotational latency by keeping track of rotational position of the disk • Account for spare sectors • A demerit of 2.6% of mean I/O time
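Tracking rotational position can be sketched as below: given the simulated time, compute which sector is under the head and how long until the target sector arrives. The function name and the assumption that sector 0 passes under the head at time 0 are illustrative, and spare sectors are not handled.

```python
def rotational_latency_ms(now_ms, target_sector, sectors_per_track, rotation_ms):
    """Time until target_sector rotates under the head.

    Assumes sector 0 is under the head at time 0 and that the platter
    spins at a constant rate of one revolution per rotation_ms.
    """
    # Current angular position, measured in sectors.
    current = (now_ms % rotation_ms) / rotation_ms * sectors_per_track
    # Sectors still to pass before the target arrives (wraps around).
    wait_sectors = (target_sector - current) % sectors_per_track
    return wait_sectors / sectors_per_track * rotation_ms
```

With a 15 ms revolution and 60 sectors per track, a request for sector 30 issued at time 0 waits half a revolution, 7.5 ms.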
Modeling data caching • The drive uses both read-ahead and immediate reporting • Large disparity between model and drive when caching is not modeled • 50% of requests are completed in 3 ms or less • Demerit of 112% is not acceptable!
Modeling data caching • Added aggressive read-ahead and immediate reporting to the model • Demerit is now only 5.7% of the mean I/O time
Model summary • Careful modeling is neither too difficult nor too costly • A good model needs careful calibration and tuning • These features and others may become particularly important when a workload has large data transfers