Network Flow Watermarking Attack on Low-Latency Anonymous Communication Systems Xinyuan Wang, Shiping Chen, Sushil Jajodia Presented by Eun Kyoung Kim
Content • Introduction • Network Flow Identification and Anonymous Communication • Interval Centroid Based Watermarking Scheme • Properties of the Interval Centroid Based Watermarking Scheme • Experiments • Conclusions • Discussions
Introduction • To address privacy concerns, anonymous communication systems have been designed to provide anonymity • Traditional methods of achieving anonymity include using proxies, MIXes, and various other flow transformations • We investigate the fundamental limitations of flow transformations by developing a novel flow watermarking technique
Network Flow Identification and Anonymous Communication(1/5) • Network information flow : the transmission path of some information along the network • Network flow identification problem : how to determine which network flows belong to a particular network information flow • Network flow identification is inherently related to anonymous communication, whose goal is to conceal the true identities of, and relationships among, the communicating parties
Network Flow Identification and Anonymous Communication(2/5) • Anonymous communication systems usually mix multiple network information flows among multiple communicating parties and transform each network flow substantially • Existing network flow transformations can be divided into intra-flow transformations and inter-flow transformations
Network Flow Identification and Anonymous Communication(3/5)
Network Flow Identification and Anonymous Communication(4/5)
Network Flow Identification and Anonymous Communication(5/5) • Existing low-latency anonymous communication systems have used variations of the flow transformations in addition to any cryptographic operations they may use • Whether or not we could uniquely identify a network flow despite these flow transformations is a key problem that has a direct impact on some of the very foundations of existing anonymizing techniques
Interval Centroid Based Watermarking Scheme(1/6) • Goal : to make a sufficiently long flow uniquely identifiable even after significant transformations have occurred • Method : given a packet flow of duration Tf, embed an l-bit watermark with redundancy r
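Interval Centroid Based Watermarking Scheme(2/6) • Random grouping and assignment of intervals • 2n intervals of length T are randomly split into group A and group B (n intervals each), and r intervals from each group are randomly assigned to each watermark bit, where n = l × r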
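The grouping and assignment step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `assign_intervals` and the use of Python's `random.Random` as the shared RNG are assumptions made for the example.

```python
import random


def assign_intervals(l, r, seed):
    """Randomly split 2n interval indices into groups A and B (n = l * r),
    then give each of the l watermark bits r intervals from each group."""
    n = l * r
    rng = random.Random(seed)      # seeded so encoder and decoder can agree
    indices = list(range(2 * n))
    rng.shuffle(indices)           # random grouping
    group_a, group_b = indices[:n], indices[n:]
    # Both halves are already in random order, so consecutive slices of
    # length r amount to a random assignment of intervals to bits.
    a_for_bit = [sorted(group_a[i * r:(i + 1) * r]) for i in range(l)]
    b_for_bit = [sorted(group_b[i * r:(i + 1) * r]) for i in range(l)]
    return a_for_bit, b_for_bit


# Hypothetical parameters: a 4-bit watermark with redundancy 3 (24 intervals).
a_for_bit, b_for_bit = assign_intervals(l=4, r=3, seed=2007)
```

Because the shuffle is driven only by the seed, anyone who knows the seed can regenerate the same grouping, which is what the decoder relies on.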
Interval Centroid Based Watermarking Scheme(3/6) • Finding aggregated centroids • For each watermark bit i, aggregate the time stamps (taken as offsets from the start of their intervals) of all packets in the r group A intervals I^A_{i,j} and in the r group B intervals I^B_{i,j}, and calculate the centroids A_i and B_i of the group A and group B packets, respectively • Before watermark encoding • E(A_i) = E(B_i) = T/2 • E(Y_i) = 0, where Y_i = A_i - B_i
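A minimal sketch of the aggregated centroid computation, assuming time stamps are plain numbers (milliseconds here), intervals are numbered from 0 starting at offset `o`, and each has length `T`. The function name `aggregated_centroid` is hypothetical.

```python
def aggregated_centroid(timestamps, intervals, o, T):
    """Mean offset-within-interval of all packets that fall in `intervals`.

    Returns None when no packet falls in any of the given intervals.
    """
    wanted = set(intervals)
    offsets = []
    for t in timestamps:
        if t < o:
            continue                     # packet arrived before the first interval
        index, x = divmod(t - o, T)      # interval number, offset inside it
        if index in wanted:
            offsets.append(x)
    if not offsets:
        return None
    return sum(offsets) / len(offsets)


# Toy flow in milliseconds with T = 1000 ms.
flow = [100, 300, 1500, 2900]
A = aggregated_centroid(flow, [0, 2], o=0, T=1000)   # offsets 100, 300, 900
B = aggregated_centroid(flow, [1], o=0, T=1000)      # offset 500
Y = A - B
```

With only a handful of packets the centroids are far from T/2; the expectation E(Y_i) = 0 only shows up when many packets are aggregated.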
Interval Centroid Based Watermarking Scheme(4/6) • Encoding scheme • To encode bit ‘1’ or ‘0’, make Y_i positive or negative by increasing A_i or B_i, respectively • To increase A_i or B_i, delay each packet within each interval I^A_{i,j} or I^B_{i,j}, respectively • Delay strategy • After watermark encoding • E(A'_i) = (T+a)/2 when bit ‘1’ is encoded, E(B'_i) = (T+a)/2 when bit ‘0’ is encoded, where a is the maximum delay • E(Y_i^1) = a/2, E(Y_i^0) = -a/2
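The delay strategy itself did not survive on the slide. The sketch below assumes one mapping that is consistent with the stated expectation (T+a)/2: a packet at offset x within its interval is moved to offset a + (T - a)·x/T, so the added delay is at most a and shrinks to zero toward the end of the interval. The function name `embed_bit` and this specific mapping are assumptions for illustration.

```python
def embed_bit(timestamps, bit, a_intervals, b_intervals, o, T, a):
    """Delay packets in the group A intervals (bit 1) or group B intervals (bit 0)."""
    target = set(a_intervals if bit == 1 else b_intervals)
    marked = []
    for t in timestamps:
        # Packets before offset o belong to no interval.
        index, x = divmod(t - o, T) if t >= o else (None, None)
        if index in target:
            # Assumed mapping: squeeze offsets from [0, T) into [a, T).
            # The added delay a * (1 - x / T) lies between 0 and a.
            x_new = a + (T - a) * x / T
            marked.append(o + index * T + x_new)
        else:
            marked.append(t)             # packet left untouched
    return marked


# Toy flow in milliseconds: T = 1000 ms, maximum delay a = 200 ms.
flow = [0, 500, 1500, 2250]
marked_one = embed_bit(flow, 1, a_intervals=[0], b_intervals=[2], o=0, T=1000, a=200)
marked_zero = embed_bit(flow, 0, a_intervals=[0], b_intervals=[2], o=0, T=1000, a=200)
```

The mapping is monotone within an interval, so packet order is preserved and no packet is moved into a later interval.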
Interval Centroid Based Watermarking Scheme(5/6) • Decoding scheme • Calculate each Y_i (i = 0, …, l-1) given the exact interval grouping and assignment information <o, T, RNG, s> (offset o, interval length T, random number generator RNG, and seed s) • If Y_i is positive, watermark bit i is decoded as 1; if negative, as 0
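Decoding can be sketched as below, assuming the interval assignment has already been regenerated from the shared secret and is passed in as two lists of interval indices per bit. Two choices here are assumptions of the sketch, not part of the slides: an empty group falls back to the expected centroid T/2, and a tie (Y_i = 0) is decoded as 0.

```python
def centroid(timestamps, intervals, o, T):
    """Aggregated centroid of the packets that fall in `intervals`."""
    wanted = set(intervals)
    offsets = [(t - o) % T for t in timestamps
               if t >= o and (t - o) // T in wanted]
    # Assumption: an empty group falls back to the expected value T / 2.
    return sum(offsets) / len(offsets) if offsets else T / 2


def decode(timestamps, a_for_bit, b_for_bit, o, T):
    """Recover one bit per (group A, group B) pair from the sign of Y_i."""
    bits = []
    for a_intervals, b_intervals in zip(a_for_bit, b_for_bit):
        y = (centroid(timestamps, a_intervals, o, T)
             - centroid(timestamps, b_intervals, o, T))
        bits.append(1 if y > 0 else 0)   # a tie (y == 0) is decoded as 0 here
    return bits


# Toy flow (ms): bit 0 uses intervals 0 (A) and 1 (B), bit 1 uses 2 (A) and 3 (B).
flow = [800, 900, 1100, 1200, 2100, 3900]
bits = decode(flow, a_for_bit=[[0], [2]], b_for_bit=[[1], [3]], o=0, T=1000)
```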
Interval Centroid Based Watermarking Scheme(6/6) • An upper bound on the decoding error probability follows from the Chebyshev inequality • Given any T and a, we can reduce the error bound by increasing N_i, which can be achieved by increasing r, provided that the flow is long enough and contains sufficient packets
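The bound itself is missing from the slide. As a rough sketch of how such a bound can be obtained, assume (these are assumptions of the sketch, not statements from the slides) that groups A and B each contain N_i packets for bit i, that packet offsets are independent and uniform on [0, T] before encoding, and that offsets in the delayed group are uniform on [a, T] after encoding. The constants in the original paper may differ.

```latex
\begin{align*}
\operatorname{Var}(Y_i^1)
  &= \operatorname{Var}(A'_i) + \operatorname{Var}(B_i)
   = \frac{(T-a)^2}{12 N_i} + \frac{T^2}{12 N_i}
   \le \frac{T^2}{6 N_i}, \\
\Pr\left[Y_i^1 \le 0\right]
  &\le \Pr\left[\left|Y_i^1 - \tfrac{a}{2}\right| \ge \tfrac{a}{2}\right]
   \le \frac{\operatorname{Var}(Y_i^1)}{(a/2)^2}
   \le \frac{2T^2}{3 N_i a^2}.
\end{align*}
```

The same argument applies to bit 0 by symmetry. Under these assumptions the bound decays as 1/N_i, which matches the slide's point that a longer flow (larger r) drives the error toward zero for any fixed T and a.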
Properties of the Interval Centroid Based Watermarking Scheme(1/3) • Self-synchronization • Try a range of different offsets and find the offset that results in the closest match with the watermark • Problem : trying multiple offsets increases the false-positive rate • Solution : lower the false-positive rate of the single-offset decoding, which is possible if we have enough packets
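The offset search can be sketched as follows, assuming "closest match" means the smallest Hamming distance between the decoded bits and the expected watermark. The helper names, the T/2 fallback for empty groups, and the tie-breaking toward the smallest offset are assumptions of this sketch.

```python
def centroid(timestamps, intervals, o, T):
    wanted = set(intervals)
    offsets = [(t - o) % T for t in timestamps
               if t >= o and (t - o) // T in wanted]
    return sum(offsets) / len(offsets) if offsets else T / 2   # assumed fallback


def decode(timestamps, a_for_bit, b_for_bit, o, T):
    return [1 if centroid(timestamps, a, o, T) > centroid(timestamps, b, o, T) else 0
            for a, b in zip(a_for_bit, b_for_bit)]


def hamming(u, v):
    return sum(x != y for x, y in zip(u, v))


def best_offset(timestamps, watermark, a_for_bit, b_for_bit, T, candidates):
    """Try each candidate offset and keep the one whose decoding is closest."""
    scored = [(hamming(decode(timestamps, a_for_bit, b_for_bit, o, T), watermark), o)
              for o in candidates]
    distance, offset = min(scored)       # ties go to the smallest offset
    return offset, distance


# Toy flow (ms) that was marked with offset 500 and watermark [1, 0].
flow = [1400, 1600, 2600, 4400]
offset, distance = best_offset(flow, [1, 0], [[0], [2]], [[1], [3]],
                               T=1000, candidates=[0, 500])
```

Every extra candidate offset is another chance for an unmarked flow to match by accident, which is the false-positive problem the slide refers to.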
Properties of the Interval Centroid Based Watermarking Scheme(2/3) • Robustness Against Chaff and Flow Mixing • The chaff added to a watermarked flow tends to shift the centroid within each interval toward the center of the interval • How large is the impact of the chaff packets on the watermark detection error probability? • The upper bounds on the decoding error probabilities say that no matter how large RA, RB, and R are, we can always make the decoding error probabilities arbitrarily close to zero by having a sufficiently large Ni, which can be achieved by having a sufficiently large number of packets
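The definitions of RA, RB, and R are not on the slide. As a hedged illustration of why chaff weakens but does not erase the watermark, assume that R_A denotes the ratio of chaff packets to original packets in the group A intervals of bit i, and that chaff offsets are uniform on [0, T]. Both are assumptions of this sketch. For an encoded bit 1:

```latex
\begin{align*}
E(A''_i) &= \frac{\tfrac{T+a}{2} + R_A \cdot \tfrac{T}{2}}{1 + R_A}
          = \frac{T}{2} + \frac{a}{2(1 + R_A)}, \\
E(B''_i) &= \frac{T}{2}, \\
E(Y_i^1) &= \frac{a}{2(1 + R_A)} > 0.
\end{align*}
```

Under these assumptions the expected centroid difference shrinks by a factor of 1 + R_A but keeps its sign, so averaging over more packets can still separate the two bit values.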
Properties of the Interval Centroid Based Watermarking Scheme(3/3) • Robustness against packet dropping, repacketization, and flow splitting • When there are enough packets left in the flow, the centroids of all the intervals tend to remain the same
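A small simulation can illustrate the centroid stability claim. It assumes packets are dropped independently at random (30% here) and that offsets are uniform, which is a simplification chosen for this sketch rather than a model taken from the slides.

```python
import random


def centroid_offset(timestamps, T):
    """Mean offset of the packets within their intervals of length T."""
    return sum(t % T for t in timestamps) / len(timestamps)


rng = random.Random(1)                   # fixed seed for repeatability
T = 1.0
flow = [rng.uniform(0, 200 * T) for _ in range(10000)]

# Assumed drop model: each packet is lost independently with probability 0.3.
kept = [t for t in flow if rng.random() > 0.3]

before = centroid_offset(flow, T)
after = centroid_offset(kept, T)
```

With thousands of packets remaining, `before` and `after` differ by only a small fraction of T, which is small compared with a centroid shift of a/2 for any reasonably sized a.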
Experiments(1/2) • Real-time experiments on live anonymized web traffic
Experiments(2/2) • Offline experiments
Conclusions • We demonstrate that existing flow transformations do not necessarily make a long network flow indistinguishable from others • By developing a novel flow watermarking technique, we can uniquely identify a long flow even after drastic flow transformations • Our flow watermarking attack is applicable to all practical low-latency anonymous communication systems
Discussions • Potential research topics • How to preserve privacy against this attack • Make the flow “sufficiently” short • What is the capability of low-latency anonymous communication systems in the presence of an active adversary