Efficient Data Dissemination and Survivable Data Storage. Lihao Xu http://www.cs.wayne.edu/~lihao/. Ubiquitous Information Access.
Key Building Blocks • Storage • Retrieval • Dissemination • Consumption
Error Correcting Codes Message 1 2 3 … k Codeword 1 2 3 … n - 1 n m Message 1 2 3 … k
(n,k) MDS Codes Reed-Solomon (RS) Code
(n,k) MDS Codes • (4,2) B-Code • Columns: (a, d+c), (b, d+a), (c, a+b), (d, b+c)
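The MDS property of the (4,2) B-Code can be checked by brute force. The sketch below assumes the slide's symbols form four columns, each holding one data bit and one parity bit, and that "+" means XOR on bits. The function names are illustrative and not from the talk.

```python
from itertools import combinations, product

def encode(a, b, c, d):
    """Return the four columns of the assumed (4,2) layout; '+' is XOR."""
    return [(a, d ^ c), (b, d ^ a), (c, a ^ b), (d, b ^ c)]

def any_two_columns_suffice():
    """Check that every pair of columns determines (a, b, c, d) uniquely."""
    for i, j in combinations(range(4), 2):
        seen = {}
        for data in product((0, 1), repeat=4):
            cols = encode(*data)
            key = (cols[i], cols[j])
            # Two different data words must never look alike on columns i, j.
            if key in seen and seen[key] != data:
                return False
            seen[key] = data
    return True

print(any_two_columns_suffice())
```

If the check passes, losing any two of the four columns still leaves enough information to rebuild all four data bits, which is what "(n,k) MDS" means for n = 4, k = 2.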
Data Dissemination (Figure: a wireless server holding items 1 2 3; wireless clients want 1, want 1, want 2, want 3)
Broadcast in a Cell (Figure: the same wireless server and wireless clients)
Broadcast Model • Model clients as random processes • Desired item is random, with probability p_i for item i of length l_i
Scheduling Problem • 2 items, l1=l2 • Each item consists of k packets, k large • Challenge: choose packet broadcast schedule to minimize wait for clients S = 1 2 1 2
Prior Work • Complexity of optimal schedules • Bar-Noy, Bhatia, Naor, Schieber, Foltz • Complexity of computing optimal schedules • Kenyon, Schabanel • Error correction/detection • Bestavros
Metric: Delivery Time Delivery Time for item 1 S = 1 2 1 2
Delivery Time Instant in time when client starts waiting for item. Total amount of time spent waiting for item i when starting at time in schedule S. S = 1 2 1 2
Expected Delivery Time (EDT) • Average delivery time when the client's starting instant is uniformly distributed over schedule S.
EDT Calculation • p1 = p2 = 1/2 • S = 1 2 1 2 • Delivery times shown for the two parts of the schedule: 2 and 3/2 • Average delivery time per item: 7/4 • EDT = 7/4
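The value 7/4 can be reproduced with a small simulation. This is a minimal sketch under assumptions not stated on the slides: time is slotted with one packet per slot, a client starts at a uniformly random slot, wants item 1 or 2 with probability 1/2 each, and may collect an item's k packets in any order. The function name is illustrative.

```python
def expected_delivery_time(k):
    """EDT, in units of one item length, for the schedule 1 2 1 2 ..."""
    period = 2 * k          # item 1 fills slots 0..k-1, item 2 fills k..2k-1
    total = 0.0
    for item in (0, 1):     # 0 stands for item 1, 1 for item 2
        for start in range(period):
            have = set()
            slot = start
            # Listen until all k distinct packets of the wanted item arrive.
            while len(have) < k:
                pos = slot % period
                if pos // k == item:
                    have.add(pos % k)
                slot += 1
            total += (slot - start) / k
    # Average over both items (equal demand) and all starting slots.
    return total / (2 * period)

print(expected_delivery_time(100))
```

A client that starts in the middle of its own item needs a full cycle of length 2 to pick up the packets it missed, while a client that starts during the other item waits 3/2 on average. The result approaches 7/4 as k grows.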
Performance with Errors • Data items consist of k packets • What happens if a packet is lost? • EDT = 3! (Figure: original, transmitted, and received packet sequences 1 2 3 4 5 … k; one packet is missing from the received sequence)
Solution – Coding • Use k of n MDS code, n = 2k • Now only need to wait for 1 additional packet • EDT = 9/4 (Figure: original, transmitted, and received packet sequences for the coded items)
Solution – Coding • Use k of n MDS code, m = 2(k+1) • Now only need to wait for 1 additional packet • EDT = 7/4 + ε (Figure: original, transmitted, and received packet sequences 1 2 3 4 5 … k n)
General Solution • Given loss probability p, what is the optimal n? (Figure: original, transmitted, and received packet sequences 1 2 3 4 5 … k n)
General Solution k = 100 and p = 0.1
General Solution k = 100
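One quantity that feeds into the choice of n is the chance that a client decodes an item in a single pass. The sketch below assumes each packet is lost independently with probability p, which the slides do not state, and relies on the MDS property that any k of the n packets suffice. It does not compute the EDT-optimal n from the talk. The function names and the `target` parameter are illustrative.

```python
from math import comb

def decode_probability(n, k, p):
    """P(at least k of n packets arrive) with independent loss probability p."""
    return sum(comb(n, r) * (1 - p) ** r * p ** (n - r)
               for r in range(k, n + 1))

def smallest_n(k, p, target):
    """Smallest n whose one-pass decode probability reaches the target."""
    n = k
    while decode_probability(n, k, p) < target:
        n += 1
    return n

# Example with the slide's k = 100 and p = 0.1; the 0.99 target is arbitrary.
print(smallest_n(100, 0.1, 0.99))
```

A larger n raises the decode probability but also lengthens the broadcast cycle, so the optimal n trades the two effects against each other.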
Two-Channel Broadcasting (Figure: two wireless servers, each broadcasting items 1 2; wireless clients want 1, want 1, want 2, want 3)
Coordinating Schedule Data • Use (2k, k) MDS code to eliminate data overlap • Channel 1 sends packets 1 through k (raw data) • Channel 2 sends packets k+1 through 2k • Features • Each channel is self-sufficient • No overlap between channels • S1 = 1 2, S2 = 1 2 (same schedule, different data)
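The effect of the (2k, k) split can be explored with a small two-channel simulation. This sketch assumes slotted time, one packet per channel per slot, equal demand for the items, and a client that listens to both channels and decodes from any k distinct packets. The function name and the reversed second schedule are illustrative choices, not schedules taken from the talk.

```python
def two_channel_edt(k, s1, s2):
    """EDT in units of one item length for two synchronized channels.

    s1 and s2 are block-level schedules (lists of item ids, equal length).
    Channel 1 sends coded packets 0..k-1 of the scheduled item, channel 2
    sends packets k..2k-1, so the channels never duplicate each other.
    """
    items = sorted(set(s1) | set(s2))
    period = len(s1) * k
    total = 0.0
    for item in items:
        for start in range(period):
            have = set()
            slot = start
            # Collect packets from both channels until k distinct ones arrive.
            while len(have) < k:
                block, offset = divmod(slot % period, k)
                if s1[block] == item:
                    have.add(offset)
                if s2[block] == item:
                    have.add(k + offset)
                slot += 1
            total += (slot - start) / k
    return total / (len(items) * period)

same = two_channel_edt(100, [1, 2], [1, 2])            # same schedule, different data
reversed_order = two_channel_edt(100, [1, 2], [2, 1])  # hypothetical variant
print(same, reversed_order)
```

With the same schedule on both channels the result approaches 1 as k grows, compared with 7/4 for one uncoded channel. The reversed variant gives exactly 1 in this model, because a client always receives its item at a rate of one new packet per slot.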
Two Broadcast Channels • Scheduling for two channels • Two items with equal length and demand • Two synchronized channels of equal bandwidth • First channel’s schedule fixed at 1 2 • What is the optimal schedule for channel 2? S1 = 1 2 S2 = ?
Some Schedules • Candidate schedules for channel 2: Reshuffle, Repeat, Unequal Portions, Swap, Shift, Arbitrary • EDT values shown: EDT = 1 (four schedules), EDT = 63/64 (one schedule), EDT < 63/64? (one schedule) (Figure: the packet layout of each candidate schedule against S1 = 1 2)
Schedule Performance • Symmetric Problem • Equal lengths • Equal demands • Equal bandwidth channels • Symmetric “fixed” schedule for 1st channel • Asymmetric Solution • Asymmetric schedules can beat any symmetric schedule for the 2nd channel • How is this possible?
More to Explore … • More servers/Channels • Differing levels of synchronization • Transmission Errors • Streaming Data • Bounds (Figure: several wireless servers, each broadcasting items 1 2 3; wireless clients want 1, want 1, want 2, want 3)
Secure and Survivable Storage • Availability • Recoverability • Persistence • Confidentiality • Integrity • Scalability • Efficiency
Secure and Survivable Storage • Yahoo • Ebay • Amazon • Google • Banks • Your Labs • More …