Chapter 8 Multimedia Storage
Magnetic Media Magnetic disks are: • Suitable for dynamic data that requires frequent changes. • Good access time and high transfer rate. • Used for data that must be kept online during data capturing and processing. • Suitable for video-on-demand applications where large amounts of time dependent information must be transferred at high bit rates.
RAID • Redundant Arrays of Inexpensive Disks. (developed at UC Berkeley in 1987) • Use parallelism between multiple disks to improve aggregate I/O performance • Something like parallelism from multiple CPUs • Data is distributed across several physical disks • As an alternative to single large expensive disk (SLED) in traditional mainframe systems • Several levels of RAID, seeking to optimize among • Performance • Availability • Cost
RAID (2) • Advantages: • high data transfer rate for large data accesses. • high I/O rates (short queuing time) on small data accesses. • uniform load balancing across all of the disks. • Disadvantages: • Large disk arrays are highly vulnerable to disk failures => Need to add redundancy for better availability => write overhead!
Data Striping • distribute data transparently over multiple disks to make them appear as a single, fast, large disk. • multiple I/Os can be served in parallel => better performance • parallel independent requests => shorter queuing time • parallel accesses of a single request => higher transfer rate
Data Striping (2) • Granularity of data interleaving: • Fine-grained: distributes the data so that all of the array’s disks cooperate in servicing every request => high transfer rate. Typical stripe size = 1 bit / 1 byte / 512 bytes. • A 1-bit stripe size speeds up every single disk access. • A 512-byte stripe size may sometimes give no speedup at all, e.g. for a small disk access of 100 bytes that fits on one disk. • When several disks serve one request, the access time is still bounded by the slowest disk in the group.
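The effect of stripe size on a single request can be sketched in a few lines. This is a minimal illustration, assuming stripe units are placed round-robin across the disks; the function names `locate` and `disks_touched` are hypothetical and not part of any RAID product.

```python
def locate(offset: int, num_disks: int, stripe_size: int) -> tuple:
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumption: stripe units are assigned to disks round-robin.
    """
    unit = offset // stripe_size      # index of the stripe unit holding the byte
    disk = unit % num_disks           # units rotate over the disks
    row = unit // num_disks           # complete rows of units stored before it
    return disk, row * stripe_size + offset % stripe_size


def disks_touched(offset: int, length: int, num_disks: int, stripe_size: int) -> set:
    """Return the set of disks that take part in one request."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return {unit % num_disks for unit in range(first, last + 1)}


# A 100-byte read on a 4-disk array:
print(disks_touched(0, 100, num_disks=4, stripe_size=512))  # one disk only
print(disks_touched(0, 100, num_disks=4, stripe_size=1))    # all four disks
```

With 512-byte units the 100-byte request stays on one disk, so there is no parallelism within the request; with 1-byte units every disk contributes.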
Redundancy • Redundancy is needed to tolerate disk failures. • Data and parity distribution: • How is the redundant information computed? • Where should the redundant information reside? • Hamming error correction • XOR
RAID: Level 0 (Nonredundant) • Data striping only. • Stripe size: segment e.g. 512 bytes • No data protection redundancy. • No need to write redundant information => best write performance. • Read performance ok. • Any disk failure => data loss. • Used in supercomputing environments where performance and capacity, rather than reliability, are the primary concerns.
RAID: Level 1 (Disk Mirroring) • Use twice as many disks as level 0. • Data is duplicated, called mirroring, or shadowing. • Read is faster, but write is slightly slower (why?) • If a disk fails, its mirror copy can still serve. • Used in database applications where availability and transaction rate are more important than storage efficiency. [Figure: a controller with two channels; the second channel holds the redundant data (mirror).]
RAID 1 (2) • Compare the following 3 RAID-1 configurations. [Figure: placement of blocks 0 to 3 and their copies under simple shadowing, declustering, and chained declustering.]
RAID: Level 2 (Error Correction) • Uses Hamming code. • Bit interleaving (bit-level data striping). • For n disks, about log(n) of them store redundant data. (More space efficient than mirroring.) • If a disk fails, multiple redundant disks need to be read to identify the bad one. However, only one redundant disk needs to be read to recover the lost data. • No practical use.
A Hamming Code Example • Suppose we want to encode 4 bits of information. • We place the data bits at four of the seven locations x1 x2 x3 x4 x5 x6 x7, namely x3, x5, x6, x7 (red on the slide). • The remaining bits x1, x2, x4 (blue on the slide) are redundant bits for error detection and correction. • Next, we calculate the following 3 equations to find x1, x2, x4: x1 = x3 ⊕ x5 ⊕ x7, x2 = x3 ⊕ x6 ⊕ x7, x4 = x5 ⊕ x6 ⊕ x7. • This code can detect and correct a 1-bit error.
A Hamming Code Example (2) • Example encoding:
A Hamming Code Example (3) • Suppose there is a 1-bit error at x6, e.g. original: 0 1 0 0 1 0 1, received: 0 1 0 0 1 1 1. • To detect and correct the error we calculate b0 = r1 ⊕ r3 ⊕ r5 ⊕ r7, b1 = r2 ⊕ r3 ⊕ r6 ⊕ r7, b2 = r4 ⊕ r5 ⊕ r6 ⊕ r7. • The notation r indicates the received data, which may contain an error. • If all b are 0, there is no error; otherwise there is an error. • Believe it or not, the error must be located at position b2b1b0 read as a binary number, e.g. here b2b1b0 = 110, i.e. position 6.
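The encoding and correction steps of the example can be sketched as follows. This is a minimal sketch assuming the standard (7,4) Hamming parity equations, in which each check bit covers the positions whose binary index contains that bit; these equations reproduce the codeword 0 1 0 0 1 0 1 used in the slides. The function names are hypothetical.

```python
def encode(d3: int, d5: int, d6: int, d7: int) -> list:
    """Build the 7-bit codeword x1..x7 from the four data bits."""
    x1 = d3 ^ d5 ^ d7   # covers positions 1, 3, 5, 7
    x2 = d3 ^ d6 ^ d7   # covers positions 2, 3, 6, 7
    x4 = d5 ^ d6 ^ d7   # covers positions 4, 5, 6, 7
    return [x1, x2, d3, x4, d5, d6, d7]


def syndrome(r: list) -> int:
    """Return b2b1b0 as an integer: 0 means no error, else the bad position."""
    r1, r2, r3, r4, r5, r6, r7 = r
    b0 = r1 ^ r3 ^ r5 ^ r7
    b1 = r2 ^ r3 ^ r6 ^ r7
    b2 = r4 ^ r5 ^ r6 ^ r7
    return b2 * 4 + b1 * 2 + b0


def correct(r: list) -> list:
    """Flip the single bit named by the syndrome, if any."""
    fixed = list(r)
    position = syndrome(r)
    if position:
        fixed[position - 1] ^= 1   # positions are 1-based
    return fixed


original = encode(0, 1, 0, 1)
received = [0, 1, 0, 0, 1, 1, 1]   # bit x6 flipped
print(original, syndrome(received), correct(received))
```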
RAID: Level 3 (Bit-Interleaved Parity) • Hamming code can correct a 1-bit error, but requires 3 redundant bits to tell which bit is wrong. • However, locating the error is unnecessary in disk applications, because we always know which disk has failed. • If only 1-bit recovery is needed, simple XOR parity is enough. • Bit interleaving.
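Why a single XOR parity suffices once the failed disk is known can be shown with a short sketch. The block contents and the helper name `xor_blocks` are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks and one parity disk (illustrative contents).
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x01"]
parity = xor_blocks(data)

# Disk 1 fails. Because we know which disk is gone, XOR-ing the
# surviving data blocks with the parity block restores its contents.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```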
RAID: Level 4 (Block-Interleaved Parity) • Note that a single parity disk is enough to recover data lost due to a single disk failure. • Block-level interleaving. • Small read => access one data disk; large read => access many data disks; small write => 4 I/Os (read the old data, read the old parity, compute the difference between the old and new images, then update the data disk and the parity disk). • Read is fast. How about write? • If one disk is dedicated to parity, the parity disk becomes a bottleneck for writes. • Easy to implement, high transfer rate.
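The small-write procedure can be sketched as a parity update that never touches the other data disks. The helper names and block contents below are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity = old parity XOR (old data XOR new data)."""
    difference = xor_bytes(old_data, new_data)
    return xor_bytes(old_parity, difference)


blocks = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
old_parity = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])

# Overwrite block 1: read old data and old parity, write new data and new parity.
new_block = b"\xff\x00"
new_parity = small_write_parity(blocks[1], new_block, old_parity)

# Same result as recomputing the parity from all data disks.
full_recompute = xor_bytes(xor_bytes(blocks[0], new_block), blocks[2])
print(new_parity == full_recompute)
```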
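Enhancing RAID-4 • Problems with RAID-4? [Figure: three parity placements. Parity disk: stripes A to D each keep their parity block (Ap, Bp, Cp, Dp) on the same last disk. Striped parity: the parity block moves one disk to the left on each successive stripe. Declustered parity: stripes A to F and their parity blocks Ap to Fp are spread over a larger array.]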
Distributing Parity • Parity disk • simplify the mapping of logical addresses to disk addresses. • every write must update the associated bits on the single parity disk. (Fine for fine-grained data striping, bad for coarse.) • Striped parity • can perform parallel parity update • Declustered parity • logically equivalent to combining several smaller arrays protected by striped parity into a large one.
RAID: Level 5 (Block-Interleaved Distributed Parity) • Eliminates the parity disk bottleneck by distributing the parity uniformly over all of the disks. • Improves read performance by allowing all disks to be used to serve read requests. • Best for small reads, large reads, large writes.
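One way to distribute parity uniformly is to rotate the parity block by one disk per stripe. The sketch below assumes a rotation that starts on the last disk and moves left; real arrays use several different layouts, and `parity_disk` is a hypothetical helper.

```python
from collections import Counter


def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk holding the parity block of a stripe (assumed left rotation)."""
    return (num_disks - 1 - stripe) % num_disks


# Five disks: parity sits on disk 4, then 3, 2, 1, 0, then wraps around.
placement = [parity_disk(s, 5) for s in range(10)]
print(placement)

# Every disk carries the same share of parity, so no single disk
# absorbs all parity writes.
print(Counter(placement))
```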
RAID: Level 6 (P+Q Redundancy) • Uses 2 redundant disks to protect against up to two disk failures. • Computes 2 different parities instead of 1. • Read performance is similar to Level 5, but write is slightly worse.
Optical Media • Well accepted because: • High storage capacity • Random access to data • Life span of more than 30 years (c.f. << 20 years for magnetic media) • Removable and portable
History of Optical Media • Optical videodisk was invented by Friebus in 1929. A prototype using a laser to record and read was demonstrated by Philips and MCA in 1972. • Videodisks developed by Philips have been commercially available since 1978. • Then compact disk technology for digital audio (CD-DA) came out in the early 1980s. • The use of optical disks for digital data storage came with the introduction and improvement of CD-ROM during the 1980s.
Optical Disk Technology • Optical storage media use the intensity of reflected laser light as an information source.
Optical Disk Technology (2) • An optical disk consists of 3 layers: • Protective layer (only 0.002mm thick on the label side). • Reflective layer (aluminum coating). • Substrate layer (transparent). • In the factory, depressions are cut on the disk surface, forming “lands” and “pits” (0.12um difference in height).
Optical Disk Technology (3) • Simple thresholding yields the H and L readback. • Do you know: • that data are read from the disk “inside-out”? • that a CD should be cleaned radially?
Advantages of Optical Media • Continuous data stream. Data stored in spiral or concentric tracks. For the spiral track storage, data can be easily played back in a continuous data stream. • High density. Distance between tracks is 1.6um, each track is 0.6um wide, i.e., 1 bit per sq.um or 1Mb per sq.mm. Floppy disk has 96 tracks per inch, optical disk has 16000 tracks per inch. • Long life. Magnetization can decrease over time. ‘Lands’ and ‘pits’ not changed unless physically damaged. • Low wearing. Laser source in head can be positioned at 1mm from disk surface. Does not have to be as close to the surface as with magnetic disks. It reduces friction and increases life span.
Digital Optical Disks • Audio CD was developed by Philips and Sony in 1982. • Basic technology extended to 550 MB CD-ROM in 1985. • When used for multimedia, storage capacity is inadequate for motion video, and the data rate is limited to 1.5Mbps. • CD-ROM/XA and CD-I were announced in 1986 and 1987 to support applications of text, images, audio and full-screen, full-motion (FSFM) video. • Recent developments include WORM (write once read many), MO (magneto optical), CD Recordable disks, and DVD.
Digital Optical Disk (2) Why is a CD slower than a hard disk? • The CD was originally designed to squeeze as much music data onto the disk as possible. The density of data is the same on inner and outer tracks => the disk has to rotate more slowly when reading the outer tracks => adjusting the rotation speed is slow for random access (as in computer-based multimedia applications). • The optical disk head is heavier than a magnetic head. Its greater inertia means longer seek times for head movements.
CD-DA(Compact Disk Digital Audio) • 1982 by Philips and Sony. • 12cm diameter, 1.2 mm thick optical disk, stores/plays in CLV. Spiral tracks of about 20,000 windings in total. • Data are recorded such that pit-to-land and land-to-pit transitions are coding ‘1’s. ‘0’s are coded as no transition. • Pits and lands are not directly used to represent digital information. How can you represent “11”? • Redundancy added to break up consecutive ‘1’s and ‘0’s.
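The transition coding can be sketched as follows: a ‘1’ flips the surface level and a ‘0’ keeps it, which is why two adjacent ‘1’s would need pits or lands only one bit long. The sketch assumes the flip happens in the bit cell of the ‘1’ and that the track starts on a land; both are illustrative choices, and the function names are hypothetical.

```python
def to_surface(channel_bits: list, start: str = "land") -> list:
    """Turn channel bits into a pit/land sequence: a 1 toggles the level."""
    level = start
    surface = []
    for bit in channel_bits:
        if bit == 1:
            level = "pit" if level == "land" else "land"
        surface.append(level)
    return surface


def from_surface(surface: list, start: str = "land") -> list:
    """Recover channel bits: a change of level reads as 1, no change as 0."""
    bits = []
    previous = start
    for level in surface:
        bits.append(1 if level != previous else 0)
        previous = level
    return bits


bits = [0, 1, 0, 0, 1, 0, 0, 0]
surface = to_surface(bits)
print(surface)
print(from_surface(surface) == bits)
```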
CD-DA (2) • Data rate: 44.1KHz sampling, 16-bit quantization, 2 channels => about 176KBytes/sec. • Capacity: 747MB, up to 74 min of high-quality sound. • Capability of random access to tracks and index points. • Error rate: as low as 10^(-8). However, still not low enough for computer data.
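The data rate and capacity figures can be checked with simple arithmetic. The sketch assumes stereo audio (2 channels), which the slide does not state explicitly, and reports the capacity in units of 2^20 bytes.

```python
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2      # 16-bit quantization
CHANNELS = 2              # assumption: stereo
PLAY_TIME_MIN = 74

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
capacity_bytes = bytes_per_second * PLAY_TIME_MIN * 60
capacity_mb = capacity_bytes / 2**20   # "MB" taken as 2^20 bytes

print(bytes_per_second)        # bytes of audio per second
print(round(capacity_mb))      # capacity for 74 minutes
```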
Eight-to-Fourteen Modulation (EFM) • Pits and lands may not follow too closely one after another on a CD-DA. Rule 1: between any 2 ‘1’s, there are at least 2 ‘0’s. • For synchronization, pit or land sequences are not allowed to be too long. Rule 2: at most 10 ‘0’s can follow one after another. • Solution: Map every 8-bit pattern into a 14-bit pattern that satisfies the 2 rules. Among the 2^14 patterns, 267 satisfy both rules, just enough for the 256 byte values. • Also, between consecutive 14-bit sequences, 3 merging bits are added to enforce the rules across the boundary.
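The count of valid patterns can be reproduced by brute force over all 14-bit words. This sketch only checks the two rules inside a single word; it does not build the actual EFM lookup table or the merging bits, and `is_valid` is a hypothetical helper.

```python
def is_valid(word: int, width: int = 14) -> bool:
    """Check the two run-length rules within one word."""
    bits = format(word, f"0{width}b")
    # Rule 1: at least two 0s between any two 1s.
    if "11" in bits or "101" in bits:
        return False
    # Rule 2: at most ten 0s in a row.
    if "0" * 11 in bits:
        return False
    return True


valid_words = [w for w in range(2**14) if is_valid(w)]
print(len(valid_words))          # number of usable 14-bit patterns
print(len(valid_words) >= 256)   # enough for every 8-bit value
```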
Low Level Data Encoding • Thus, an eight-bit byte of actual data is encoded into a total of 17 channel bits. • For synchronization and error correction, every 24 bytes of data is packaged into a frame: • sync pattern (24 + 3 bits) • control byte (17 bits) • 12 data bytes (12 * 17 bits) • 4 error correction bytes (4 * 17 bits) • 12 data bytes (12 * 17 bits) • 4 error correction bytes (4 * 17 bits) Total: 588 channel bits for 192 actual data bits.
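The frame total can be verified by adding up the parts listed above. The variable names are illustrative.

```python
CHANNEL_BITS_PER_BYTE = 14 + 3   # EFM pattern plus merging bits

sync_bits = 24 + 3
control_bits = 1 * CHANNEL_BITS_PER_BYTE
data_bits_on_disk = (12 + 12) * CHANNEL_BITS_PER_BYTE
ecc_bits_on_disk = (4 + 4) * CHANNEL_BITS_PER_BYTE

frame_channel_bits = sync_bits + control_bits + data_bits_on_disk + ecc_bits_on_disk
user_bits = 24 * 8

print(frame_channel_bits, user_bits)
print(f"{user_bits / frame_channel_bits:.1%} of the channel bits carry user data")
```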
First Level Error Correction • Cross Interleave Reed-Solomon Coding. • Recall that each frame contains 24 data bytes and 8 error correction bytes. • The first 4 correction bytes cover the frame’s data. The other 4 correction bytes cover data over 7 frames. • When a frame is read, the first 4 correction bytes are checked. If the check fails, the decoder decodes the data bytes after the subsequent correction codes are read. • 7 frames = 7.7 mm track length. Try scratching your CD radially with a cutter and see if it still works.
CD-ROM (Compact Disk Read-Only Memory) • 1985 by Philips and Sony. • Tracks are divided into audio and data types. A disk containing both types is called a Mixed Mode Disk. • It operates in 2 modes: mode 1 is for computer data, and mode 2 is for media data. • Mode 1 • Computer data requires an error rate better than 10^(-8). Mode 1 achieves a 10^(-12) error rate by using a second level of error correction. • Mode 1 block layout (2352 bytes): Sync 12 | Header 4 | User Data 2048 | EDC 4 | Blanks 8 | ECC 276.
CD-ROM (2) • Random access to subtrack units called blocks (2352 bytes). (For CD-DA, random access is on track level only.) • Mode 1 for computer data. A capacity of 333,000 blocks to be played in 74 min, i.e. about 650MB of storage with a data rate of 150KBps. Each block consists of 98 frames (24 data bytes, or 588 channel bits, each). • Mode 2 • Mode 2 holds data of any media. • Additional error correction is not crucial, so it is not used. • Disk has a capacity of about 740MB and a data rate of about 171KBps. • Mode 2 block layout (2352 bytes): Sync 12 | Header 4 | User Data 2336.
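Capacity and data rate for both modes follow from the block count, the playing time, and the user bytes per block. The sketch below reports sizes in units of 2^20 and 2^10 bytes, which is an assumption about how the slide's MB and KBps figures were rounded.

```python
BLOCKS = 333_000
PLAY_TIME_S = 74 * 60
USER_BYTES = {"mode 1": 2048, "mode 2": 2336}

blocks_per_second = BLOCKS // PLAY_TIME_S

results = {}
for mode, user_bytes in USER_BYTES.items():
    rate = blocks_per_second * user_bytes    # user bytes per second
    capacity = BLOCKS * user_bytes           # user bytes per disk
    results[mode] = (rate, capacity)
    print(mode, f"{rate / 2**10:.0f} KB/s", f"{capacity / 2**20:.0f} MB")
```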
CD-ROM (3) • CD-ROM is a very economical medium for publication and distribution. • Limitations of CD-ROM: • Random access to a CD track takes anywhere from 200ms up to 1 sec. • Continuous media are stored sequentially in CD-ROM tracks, so simultaneous playback of audio and other data, although important for multimedia applications, is not possible.
CD-ROM/XA (Extended Architecture) • 1989, established by Microsoft, Philips and Sony. • Based on CD-ROM and CD-I. • Goal: concurrent output of several media. Within 1 track, blocks of different media can be stored. It allows interleaved storage and retrieval of multimedia data. • A sub-header is added to each block to describe the block. • CD-ROM/XA uses CD-ROM mode 2 to define actual blocks. Two forms:
CD-ROM/XA (2) • Form 1 block layout (2352 bytes): Sync 12 | Header 4 | Subheader 8 | User Data 2048 | EDC 4 | ECC 276. • Form 2 block layout (2352 bytes): Sync 12 | Header 4 | Subheader 8 | User Data 2324 | EDC 4. • Form 1 provides more error detection/correction at the expense of redundancy. 2048 bytes (of 2352) are for user data. • Form 2 allows 13% more storage for user data, but at the expense of the error correction.
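The four block layouts can be cross-checked by summing their fields, and the 13% figure follows from the two user data sizes. The dictionary below simply restates the field sizes given on the CD-ROM and CD-ROM/XA slides.

```python
BLOCK_BYTES = 2352

LAYOUTS = {
    "CD-ROM mode 1": {"sync": 12, "header": 4, "user": 2048, "edc": 4, "blanks": 8, "ecc": 276},
    "CD-ROM mode 2": {"sync": 12, "header": 4, "user": 2336},
    "XA form 1": {"sync": 12, "header": 4, "subheader": 8, "user": 2048, "edc": 4, "ecc": 276},
    "XA form 2": {"sync": 12, "header": 4, "subheader": 8, "user": 2324, "edc": 4},
}

for name, fields in LAYOUTS.items():
    total = sum(fields.values())
    print(f"{name}: {total} bytes, {fields['user'] / BLOCK_BYTES:.1%} user data")

gain = LAYOUTS["XA form 2"]["user"] / LAYOUTS["XA form 1"]["user"] - 1
print(f"Form 2 holds {gain:.0%} more user data than form 1")
```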
CD-R (Compact Disk Recordable) • CD-R allows tracks to be recorded once. • 4 layers: protective, reflective, absorption, and substrate. • Traditional CD-ROM layers: lacquer, aluminum, polycarbonate (“molded” by stamper). • CD-R media layers: lacquer, gold, dye, polycarbonate (burned by high power laser beam; don’t leave out in sunlight).
CD-R(2) • Land and pit reflections realized by irreversible thermal effect (above 250C) on the absorption layer. • Playable on CD players.
CD-R (3) • Recording sessions • A CD has 3 areas: lead-in, actual data, lead-out. • Lead-in includes the table of contents: directory, indices to individual tracks. • Data area includes all tracks where actual data is stored. • Lead-out marks the end of the data area. • Multiple sessions of lead-in, data, lead-out can be written separately over time. • During 1 write activity, all data for a session are written with their table of contents, after which the session can be played on any CD player. [Figure: Session 1, Session 2, ..., each laid out as lead-in, information, lead-out.]
CD-MO(Compact Disk Magneto Optical) • Specification published by Philips and Sony in 1991. • Working principle is different from other CD technologies. (Incompatible with other CD formats.) • Based on the polarization of light by magnetic field. • Disk surface is light reflecting magnetic substrate. • During writing, surface is heated to above 150C, and magnetic field polarizes individual dipoles. • During reading, surface is irradiated with a laser beam, polarization of laser light changed according to the magnetization.
Digital Versatile Disk (DVD) • Also called Digital Video Disk. • Capacity: 4.7 to 17 GB (25 CDs). • Q: Is it a good idea to replace VHS tapes by DVD disks in video rental stores? • Digital video can be stored and distributed more cheaply, also it allows interactivity. • Can be used to store up to 133 minutes (8-9 hrs for high capacity ones) of studio quality video and multi-channel surround-sound audio, or 30 hours of CD-quality audio.
DVD (2) • DVD achieves a greater capacity by: • reducing the minimum pit length from 0.834 micron (CD) to 0.4 micron (DVD). • reducing the inter-track spacing from 1.6 micron (CD) to 0.74 micron (DVD).
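A rough estimate of the density gain from geometry alone is given below. It assumes that the number of bits per unit area scales inversely with pit length times track spacing, and it ignores every other difference between the two formats, so it is only a back-of-the-envelope check.

```python
CD_PIT_UM, DVD_PIT_UM = 0.834, 0.4
CD_TRACK_UM, DVD_TRACK_UM = 1.6, 0.74

linear_gain = CD_PIT_UM / DVD_PIT_UM        # more pits along the track
radial_gain = CD_TRACK_UM / DVD_TRACK_UM    # more tracks per unit radius
area_gain = linear_gain * radial_gain

print(f"along track: {linear_gain:.2f}x")
print(f"across tracks: {radial_gain:.2f}x")
print(f"per unit area: {area_gain:.2f}x")
```

Under these assumptions the smaller geometry accounts for roughly a 4.5-fold gain in density.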
DVD (3) • To read the smaller pits, DVD uses a laser of shorter wavelength (635-650 nm; for CD it is 780 nm). • Reducing the pit size and track distance increases the disc’s capacity to 4.7GB. • Dual layering. A semireflective layer (3.8GB) on top of a fully reflective layer (4.7GB) => 8.5GB total. • Double-sided. Two substrates bonded back-to-back. Each side can have one layer or two layers => capacity ranges from 9.4GB to 17GB.
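The capacity range can be reproduced by combining the per-layer figures quoted above. The configuration names are illustrative labels, not official format names.

```python
FULL_LAYER_GB = 4.7     # fully reflective layer
SEMI_LAYER_GB = 3.8     # semireflective layer on top

single_layer_side = FULL_LAYER_GB
dual_layer_side = FULL_LAYER_GB + SEMI_LAYER_GB

capacities = {
    "single side, single layer": single_layer_side,
    "single side, dual layer": dual_layer_side,
    "double side, single layer": 2 * single_layer_side,
    "double side, dual layer": 2 * dual_layer_side,
}

for name, gigabytes in capacities.items():
    print(f"{name}: {gigabytes:.1f} GB")
```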