Retrieving Multimedia Data from Disks
Presented by Yuni Xia
Fundamental characteristics:
• Real-time storage and retrieval
• Large data-transfer-rate and storage-space requirements
Why choose magnetic disks?
• Storage capacity
• Speed
• Moderate cost / random access / writable
Figure: disk geometry. Side view: platters, read/write heads, spindle. Top view: tracks and sectors.
Symbols:
• tnum: total # of tracks
• snum: total # of sectors per track
• itd: intertrack distance
• ss: spin speed (revolutions per unit time)
• rv: radial velocity of the read head
• dtr: data transfer rate
Suppose we wish to read data starting at sector si on track ti, and the read head is currently over sector sj on track tj. Then:
readtime = seek(ti, tj) + rotation(si, sj) + data / dtr
seek(ti, tj) = |ti - tj| × itd / rv
rotation(si, sj) = (|si - sj| / snum) / ss
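The read-time model above can be sketched directly in Python. The default parameter values below are illustrative assumptions, not figures from the slides:

```python
def read_time(ti, si, tj, sj, data_size,
              itd=1.0,     # intertrack distance (track widths) - assumed
              rv=2000.0,   # radial head velocity (track widths/second) - assumed
              snum=200,    # sectors per track - assumed
              ss=120.0,    # spin speed (revolutions/second, i.e. 7200 RPM) - assumed
              dtr=50e6):   # data transfer rate (bytes/second) - assumed
    """Time to read data_size bytes at sector si on track ti,
    with the head currently over sector sj on track tj."""
    seek = abs(ti - tj) * itd / rv          # radial move between tracks
    rotation = (abs(si - sj) / snum) / ss   # fraction of one revolution
    transfer = data_size / dtr              # sequential transfer time
    return seek + rotation + transfer
```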
RAID arrays and placement methods
Spreading data across several hard disks gives:
• faster performance
• greater storage capacity
• higher data security
Six standard levels: 0-5 (plus cross-type variations such as 0/1 and 3/5), implemented in software or hardware.
RAID 0: Striped Disk Array without Fault Tolerance (requires a minimum of 2 drives). Blocks are striped round-robin across the drives, e.g. with five drives:

Drive1  Drive2  Drive3  Drive4  Drive5
A       B       C       D       E
F       G       H       I       J
K       L       M       N       O
...
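Round-robin striping is a simple mapping from a logical block number to a (drive, stripe row) position. The function below is an illustrative sketch, not from the slides:

```python
def raid0_location(block, num_drives):
    """Map logical block number 0, 1, 2, ... to (drive index, stripe row)
    under round-robin striping across num_drives disks."""
    return block % num_drives, block // num_drives

# With 5 drives, blocks 0..4 (A..E) fill stripe row 0, 5..9 (F..J) row 1, ...
```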
RAID 1: Mirroring and Duplexing (requires a minimum of 2 drives). Every drive has an exact duplicate, e.g. with four mirrored pairs:

Pair 1: A B C D = A B C D
Pair 2: E F G H = E F G H
Pair 3: I J K L = I J K L
Pair 4: M N O P = M N O P
RAID 5: Independent Data Disks with Distributed Parity Blocks (requires a minimum of 3 drives). The parity block rotates across the drives:

Drive1    Drive2    Drive3    Drive4    Drive5
A0        B0        C0        D0        0-parity
A1        B1        C1        1-parity  E1
A2        B2        2-parity  D2        E2
A3        3-parity  C3        D3        E3
4-parity  B4        C4        D4        E4
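RAID levels that use parity compute it as the bytewise XOR of the data blocks in a stripe, so any single lost block can be rebuilt from the surviving blocks plus parity. A minimal sketch with made-up block contents:

```python
def xor_parity(blocks):
    """Bytewise XOR of equal-length blocks; the result is the parity block."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = bytes(p ^ b for p, b in zip(parity, block))
    return parity

stripe = [b'\x01\x02', b'\x03\x04', b'\x05\x06']   # data blocks A0, B0, C0
parity = xor_parity(stripe)
# If one data block is lost, XOR of the survivors and parity recovers it:
recovered = xor_parity([stripe[0], stripe[2], parity])
assert recovered == stripe[1]
```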
Figure: a model of heterogeneous disk servers. A router connects clients to servers Server1 … Server n; each server manages its own set of disks (d1, d2, …).
What needs to be modeled?
• The intrinsic characteristics of each disk server
• The intrinsic characteristics/capabilities of each client
• The relationship between the disk servers and clients
• The distribution of data across the disk servers
Disk Server Characteristics
1. dtr(i): total disk bandwidth of disk server si
2. buf(i): total buffer space associated with server si
3. switchtime(i, t): time required for si to switch between clients at time t
4. cyctime(i, t): length of one cycle of read operations executed by si at time t
Client Characteristics
1. cons(i, t): the consumption rate of client Ci at time t
2. data(i, t): the set of (movie, block) pairs (m, b) that Ci reads at time t
   Play:  data(i, t) = {(m, b), (m, b+1), …}
   FF:    data(i, t) = {(m, b), (m, b+ffs), (m, b+2·ffs), …}
   RW:    data(i, t) = {(m, b), (m, b-rws), (m, b-2·rws), …}
   Pause: data(i, t) = {(m, b)}
Client Characteristics
data(i, t) = (m, b, len, step): the blocks b, b+step, b+2·step, …, b+(len-1)·step
1. Play: step = 1
2. FF: step = ffs
3. RW: step = -rws
4. Pause: step = 0
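The (m, b, len, step) access pattern can be enumerated directly; the ffs and rws values in the examples below are illustrative:

```python
def data_blocks(m, b, length, step):
    """Blocks read by a client: b, b+step, ..., b+(length-1)*step."""
    return [(m, b + i * step) for i in range(length)]

# Play, fast-forward (ffs = 4), rewind (rws = 2), and pause:
play  = data_blocks('m1', 10, 3, 1)    # [('m1',10), ('m1',11), ('m1',12)]
ff    = data_blocks('m1', 10, 3, 4)    # [('m1',10), ('m1',14), ('m1',18)]
rw    = data_blocks('m1', 10, 3, -2)   # [('m1',10), ('m1',8), ('m1',6)]
pause = data_blocks('m1', 10, 1, 0)    # [('m1',10)]
```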
Client-Server Characteristics
1. timealloc(i, j, t): in any given cycle of disk server si, each client cj has a time-slice timealloc(i, j, t), subject to
   cyctime(i, t) ≥ Σj timealloc(i, j, t) + n(i,t) × switchtime(i, t)
   where n(i,t) is the number of clients si is serving.
2. active(t): the set of all clients that are active at time t.
3. d_active(i, t): the set of clients being served by si at time t, so that
   active(t) = ∪i d_active(i, t)
Client-Server Characteristics
4. Ut(i): the set of servers handling the requests of client Ci:
   Ut(i) = { s | Ci ∈ d_active(s, t) }
5. bufreq(j, i, t): the amount of buffer required at server si so that data client Cj still needs to read is not overwritten:
   buf(i) ≥ Σj bufreq(j, i, t)
Distribution of Data
• M(m, b): placement mapping, the set of all servers that contain block b of movie m
  Example: M("Sound of Music", 20) = {2, 4, 5}
• Placement constraint: if data(C, t) = (m, b, len, step), then for every i with 0 ≤ i < len there exists a server j such that
  j ∈ Ut(C) and j ∈ M(m, b + i × step)
Suppose data(C, t) = (m, 5, 5, 3) and Ut(C) = {1, 3, 4}. Then each of the blocks {5, 8, 11, 14, 17} must be stored on some server in {S1, S3, S4}.

Definition: the state of an MOD system, S(t), consists of
1. active(t)
2. cyctime(i, t)
3. cons(i, t)
4. timealloc(i, j, t)
5. data(i, t)
6. Ut
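The placement constraint and this example can be checked mechanically. The dictionary-based representation of the mapping M below is an assumed encoding, not from the slides:

```python
def placement_ok(data_req, ut, M):
    """True iff every block the request touches lies on at least one
    server that is both in Ut(C) and in M(m, block)."""
    m, b, length, step = data_req
    return all(ut & M.get((m, b + i * step), set()) for i in range(length))

# Blocks 5, 8, 11, 14, 17 of movie 'M', each stored on servers {1, 3, 4}:
M = {('M', blk): {1, 3, 4} for blk in (5, 8, 11, 14, 17)}
assert placement_ok(('M', 5, 5, 3), {1, 3, 4}, M)   # the slide's example
assert not placement_ok(('M', 5, 5, 3), {2}, M)     # wrong server set
```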
Disk availability constraints
1. Consumption-rate constraint:
   Σj cons(j, t) + n(i,t) × switchtime(i, t) × dtr(i) / cyctime(i, t) ≤ dtr(i)
2. Buffer-requirement constraint:
   Σj bufreq(j, i, t) ≤ buf(i)
where
   timealloc(i, j, t) = cyctime(i, t) × cons(j, t) / dtr(i)
   bufreq(j, i, t) = (dtr(i) - cons(j, t)) × timealloc(i, j, t)
Example (figure): a router in front of three servers, where Server 1 stores blocks 1-150, Server 2 stores blocks 151-250, and Server 3 stores blocks 200-300. A request (mi, 140, 2, 5) reads block pair (140, 145); the pair (150, 155) spans Servers 1 and 2. A request (mi, 199, 2, 1) reads block pair (199, 200); the pair (201, 202) lies in the range held by both Servers 2 and 3.
Trans  Transaction type                     Priority
tr1    exiting client                       5
tr2    continuing client - normal           4
tr3    continuing client - needs switching  3
tr4    continuing client - needs splitting  3
tr5    new client                           2
tr6    new client - needs splitting         1
An event-based algorithm: QuickSOL
• FindSOL
• OptimizeSOL
FindSOL phase:
1. Split EV(t) into six sets: new(t), exit(t), cont(t), pause(t), ff(t), rew(t).
2. (Handle exiting clients) For each client Ci in exit(t):
   1) free its resources
   2) delete Ci from the state table
3. (Handle continuing clients) For each client Ci in cont(t), ff(t), or rew(t):
   If the servers currently assigned to Ci satisfy …, then modify the state table;
   else
   1) reset Ci's priority to 3
   2) move it into new(t)
   3) update the resource table
4. (Handle new clients) For each client Ci in new(t):
   1) Identify the servers that have the data required by Ci
   2) Determine which servers have enough bandwidth …
      If no such server is available, split the event into two sub-events:
      data(C, t) = (m, s, l/2, 2·step) and
      data(C, t) = (m, s+step, l/2, 2·step)
      Keep splitting until for both sub-events … Update the state table.
   3) Do the same as 2) for the buffer requirement.
OptimizeSOL phase:
1. Switching
2. Splitting
Goals: balancing the load, maximizing the # of clients …
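The splitting step halves the request length and doubles the stride, so (for even l) the two sub-events together cover exactly the original blocks. A minimal sketch:

```python
def split_event(data_req):
    """Split (m, s, l, step) into the two FindSOL sub-events
    (m, s, l/2, 2*step) and (m, s+step, l/2, 2*step)."""
    m, s, l, step = data_req
    return (m, s, l // 2, 2 * step), (m, s + step, l // 2, 2 * step)

sub1, sub2 = split_event(('m1', 0, 4, 1))
# sub1 reads blocks 0 and 2, sub2 reads blocks 1 and 3: together 0..3,
# the same blocks as the original request, now servable by two servers.
```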