Information Theory and the Security of Binary Data Perturbation

Poorvi Vora, Dept. of Computer Science, George Washington University

Statistical Database
• Database A:
  • $Q = \{q_1, q_2, \ldots, q_i, \ldots\}$ (queryable bits) and
  • $S = \{s_1, s_2, \ldots, s_i, \ldots\}$ (sensitive bits).
• Data collector B can ask for: $f_i(q_1, q_2, q_3, \ldots) = X_i$, where each $q_j \in Q$
Poorvi Vora/CS/GWU

The statistical database security problem
• B can query multiple $f_i(q_1, q_2, q_3, \ldots) = X_i$, $q_j \in Q$, and solve them simultaneously
• (Perfect zero-knowledge protocols do not leak additional information about $x_i$, but the $A_i$ are revealed; thus this is not a traditional cryptographic problem)

Random Data Perturbation (RDP)
Used in the public health community for twenty-odd years; can be used together with cryptographic techniques
• If $x_i$ is perturbed each time, the simultaneous equations are inconsistent: $f_i(q_1 + \delta_{1i}, q_2 + \delta_{2i}, q_3 + \delta_{3i}, \ldots) = X_i + \Delta_i$
• Security and attack characterization has been an open problem for 20+ years, despite many attempts (Denning, Adams, Duncan, … Landers).

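RDP
[Figure: two examples of perturbation. Numeric: a salary of 25,000 reported as 40,000, with a perturbation range marked from -25,000 to 25,000 and distributions F(x) and G(x). Binary: the answer to "HIV?" (Yes) passed through a transition diagram between bits 0 and 1 whose branches are labelled q and p = 1 - q.]
Statistics over many records are accurate.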
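A minimal sketch of the binary case, assuming each sensitive bit is reported truthfully with a fixed probability and flipped otherwise. The names `perturb_bit`, `estimate_fraction`, and `P_TRUTH`, and the values 0.7 and 0.3, are illustrative choices, not taken from the slides. It shows why aggregate statistics stay recoverable even though each individual answer is unreliable.

```python
import random

def perturb_bit(bit, p_truth, rng):
    """Report the true bit with probability p_truth, otherwise its complement."""
    return bit if rng.random() < p_truth else 1 - bit

def estimate_fraction(reported, p_truth):
    """Estimate the true fraction of 1s from perturbed reports.

    E[observed] = f * p_truth + (1 - f) * (1 - p_truth); solve for f.
    Requires p_truth != 0.5.
    """
    observed = sum(reported) / len(reported)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
P_TRUTH = 0.7            # hypothetical probability of a truthful report

# Hypothetical population: about 30% of records have the sensitive bit set.
true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(100_000)]
reported = [perturb_bit(b, P_TRUTH, rng) for b in true_bits]

true_fraction = sum(true_bits) / len(true_bits)
estimate = estimate_fraction(reported, P_TRUTH)
print(f"true fraction {true_fraction:.3f}, estimate {estimate:.3f}")
```

The estimator only inverts the expected value of the observed fraction, so it is accurate for large populations and says little about any single record.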

Known Security Property of RDP
• $m$ repeated queries
• $\epsilon_m$: probability of error
• $\epsilon_m \to 0 \Rightarrow m \to \infty$
• Chernoff bound: $m = \ln(2/\epsilon) / (0.38\,\rho^2) \Rightarrow \epsilon_m < \epsilon$
• Probability of lie $= 0.5 - \rho$
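A rough numerical sketch of the repeated-query bound, writing `eps` for the target probability of error and `rho` for the bias (probability of a lie = 0.5 - rho). The majority-vote decoder and the specific values below are assumptions made for illustration; the slide only states the bound itself.

```python
import math
import random

def chernoff_queries(eps, rho):
    """Number of repeated queries from m = ln(2/eps) / (0.38 * rho^2), rounded up."""
    return math.ceil(math.log(2 / eps) / (0.38 * rho ** 2))

def majority_error_rate(m, rho, trials, rng):
    """Empirical error of a majority vote over m answers.

    Each answer is a lie independently with probability 0.5 - rho.
    Ties are counted as errors.
    """
    errors = 0
    for _ in range(trials):
        lies = sum(rng.random() < 0.5 - rho for _ in range(m))
        if 2 * lies >= m:
            errors += 1
    return errors / trials

EPS, RHO = 0.05, 0.1            # hypothetical target error and bias
m = chernoff_queries(EPS, RHO)
error = majority_error_rate(m, RHO, 300, random.Random(1))
print(f"m = {m}, empirical error = {error:.4f}")
```

Lowering `eps` raises `m` without limit, which is the cost of the repetition strategy that the later slides compare against.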

A simple inference attack
• Query 1: Female?
• Query 2: Over 40?
• Query 3: Losing Calcium?
Query 3 is really asking about age and gender.
How does one characterize all such attacks? What can one say about security with respect to such attacks?

Our definitions
Definition: An inference attack is a set of queries $x$ not independent of the set of sensitive bits $S$, i.e. $I(S; x) \neq 0$.
Definition: A small error inference attack is one in which $\lim_{m \to \infty} \epsilon_m = 0$, where $\epsilon_m$ is the probability of error after $m$ queries.
Definition: The query complexity per bit of a query sequence $x$ of length $m$, as a means of distinguishing among $M$ possible values of $x$, is $\eta_m = m / \log_2 M$.
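The two quantities in these definitions can be computed directly for small examples. The sketch below assumes the joint distribution of the sensitive bits and the query answers is given as a dictionary; `mutual_information` and `query_complexity_per_bit` are hypothetical helper names, and the two joint distributions are made-up illustrations.

```python
import math

def mutual_information(joint):
    """I(S; x) in bits, from a dict mapping (s, x) pairs to probabilities."""
    p_s, p_x = {}, {}
    for (s, x), p in joint.items():
        p_s[s] = p_s.get(s, 0.0) + p
        p_x[x] = p_x.get(x, 0.0) + p
    return sum(p * math.log2(p / (p_s[s] * p_x[x]))
               for (s, x), p in joint.items() if p > 0)

def query_complexity_per_bit(m, M):
    """m queries used to distinguish among M values: m / log2(M)."""
    return m / math.log2(M)

# Queries independent of the sensitive bit: not an inference attack.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
# Queries that fully determine the sensitive bit: an inference attack.
determined = {(0, 0): 0.5, (1, 1): 0.5}

print(mutual_information(independent))
print(mutual_information(determined))
print(query_complexity_per_bit(3, 4))
```

The last call corresponds to three queries used to distinguish four equally likely values, that is, two bits.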

Recall attack example
• Query 1: Female?
• Query 2: Over 40?
• Query 3: Losing Calcium?
Query 3 checks the answers to Queries 1 and 2; it is a parity-check bit of sorts, but not quite.
If 1 and 2 are independent, $\eta = 3/2$.
$\epsilon_m \to 0 \Rightarrow \eta_m \to \infty$?

Our analogy (ISIT '03)
• All attacks are communication over a channel
• When attacks are codes: $x = f(S)$
• What B queries is a codeword bit
• What B receives is the transmitted codeword, which he decodes

Shannon's theorems apply when $x = f(S)$ and $\eta$ is constant (ISIT '03)
Assuming:
• $x = f(S)$ (including adaptive, related queries): queries are channel codes
• $\eta$ constant: reliable transmission
Result: $\epsilon_m \to 0 \Rightarrow \eta \geq 1/C$, where $C$ is the channel capacity
Above this bound, $\epsilon_m \to 0$ exponentially; below it, $\epsilon_m$ increases exponentially (toward 1).
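To make the bound concrete, the sketch below assumes the perturbation behaves as a binary symmetric channel whose crossover probability is the probability of a lie. That modelling choice and the function names are assumptions for illustration. Under it, the capacity has a closed form and $1/C$ gives the minimum number of queries per bit.

```python
import math

def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p_lie):
    """Capacity of a binary symmetric channel with crossover probability p_lie."""
    return 1.0 - binary_entropy(p_lie)

def min_queries_per_bit(p_lie):
    """Lower bound 1/C on the query complexity per bit; infinite when C = 0."""
    capacity = bsc_capacity(p_lie)
    return math.inf if capacity == 0 else 1.0 / capacity

for p_lie in (0.0, 0.11, 0.4, 0.5):
    print(p_lie, bsc_capacity(p_lie), min_queries_per_bit(p_lie))
```

A lie probability of exactly 0.5 makes the answers independent of the data, so no number of queries per bit suffices.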

What about the general zero-error inference attack?
Not all inference attacks are codes, i.e. $x \neq f(S)$ in general.
$\eta$ is not necessarily kept constant as $m \to \infty$, i.e. transmission is not necessarily reliable.

Thm. 1
$\lim_{m \to \infty} \epsilon_m = 0 \Rightarrow \exists \{\beta_m\}_{m=1}^{\infty}$ s.t. $\eta_i \geq \beta_m \ \forall i \geq m$; $\lim_{m \to \infty} \beta_m = 1/C$
The proof modifies the proof of the converse of Shannon's channel coding theorem.

The Proof
$\log_2 M = H(s_m) = H(s_m \mid y_m) + I(s_m; y_m)$
$\leq 1 + \epsilon_m \log_2 M + I(s_m; y_m)$
$\leq 1 + \epsilon_m \log_2 M + mC$
$\Rightarrow \eta_m = m / \log_2 M \geq (1 - \epsilon_m) / (1/m + C) = \beta_m$
$\lim_{m \to \infty} \beta_m = 1/C$
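A quick numerical check of the last step, under stated assumptions: the lower-bounding sequence is taken to be (1 - error) / (1/m + C), and both the capacity value 0.5 and the error schedule 1/m are hypothetical choices made only to show the convergence to 1/C.

```python
def lower_bound(m, eps_m, capacity):
    """Lower bound on queries per bit: (1 - eps_m) / (1/m + capacity)."""
    return (1.0 - eps_m) / (1.0 / m + capacity)

CAPACITY = 0.5   # hypothetical channel capacity in bits per query

# Hypothetical small-error attack whose error probability decays as 1/m.
for m in (10, 100, 10_000, 1_000_000):
    print(m, lower_bound(m, 1.0 / m, CAPACITY))

limit = 1.0 / CAPACITY
```

Any error schedule that tends to zero gives the same limit, since both the numerator correction and the 1/m term vanish.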

Thm. 2
Small error attacks with constant $\eta$ above and arbitrarily close to $1/C$ exist.
Proof: Follows from the channel coding theorem.

Thm. 3
For data of entropy $H$, a stationary record sequence, $N_r$ records, and $m$ the number of queries per record,
$\lim_{m \to \infty} \epsilon_m = 0 \Rightarrow \exists \{\beta_m\}_{m=1}^{\infty}$ s.t. $\eta_i \geq \beta_m \ \forall i \geq m$; $\lim_{m \to \infty} \beta_m = H/C$
Proof: Modification of the source-channel coding theorem.

Proof Given Theorem 1, smaller lengths can be shown to violate Shannon’s source coding theorem when the data is stationary.

Corollary
$\eta_m \approx \ln 2 / (2\rho^2)$ when $p = 0.5 - \rho$, for any probability of error.
• Different from the Chernoff bound: it does not increase with a smaller probability of error.
• This is the improvement bought over the repetition code.
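The closed form in the corollary can be compared numerically with the exact value 1/C and with the Chernoff query count. This sketch assumes a binary symmetric channel with crossover probability 0.5 - rho, for which C = 1 - H(0.5 - rho) is approximately 2 rho^2 / ln 2 when rho is small. The function names and the value rho = 0.05 are illustrative.

```python
import math

def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def exact_queries_per_bit(rho):
    """1/C for a binary symmetric channel with crossover 0.5 - rho."""
    return 1.0 / (1.0 - binary_entropy(0.5 - rho))

def approx_queries_per_bit(rho):
    """Small-rho approximation ln 2 / (2 * rho^2)."""
    return math.log(2) / (2 * rho ** 2)

def chernoff_queries(eps, rho):
    """Repetition-code query count ln(2/eps) / (0.38 * rho^2)."""
    return math.log(2 / eps) / (0.38 * rho ** 2)

RHO = 0.05   # hypothetical bias
print("exact 1/C:", exact_queries_per_bit(RHO))
print("approximation:", approx_queries_per_bit(RHO))
for eps in (1e-2, 1e-4, 1e-6):
    print("Chernoff, eps =", eps, ":", chernoff_queries(eps, RHO))
```

The per-bit figure does not depend on the target error, while the repetition count keeps growing as the target error shrinks.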

Where to?
• Block ciphers as channels for properties of the key (Filiol, ePrint 2003)
• Attacks on stream ciphers as codes over key bits (Johansson et al., Golic et al., Filiol et al.)
• It appears there is a framework (Vora, working documents):
  • all statistical attacks as channel communication
  • efficient attacks as codes
  • related-input (key, message) attacks as concatenated codes [1]
  • Wagner’s Cryptanalytic Model (FSE ‘03) to determine inner codes
Do related-key attacks provide an improvement in efficiency over repeated-key attacks?
[1] Filiol shows the repeated-key attack on block ciphers as a concatenated code with the outer code as the repetition code.

Also traffic analysis, e.g. Crowds (Reiter and Rubin, Lucent and AT&T)
• $N$ nodes; $C$ colluding
• $p_f$: probability of forwarding
• At node $i+1$:
  • Probability that node $i$ originated the message (probability of truth): $1 - p_f (N - C - 1)/N$
  • Probability of any other non-collaborating node originating the message: $p_f / N$
Observable information changes the pdf on the data of interest: the originator of the message.
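The two probabilities on this slide can be checked for consistency. The sketch below uses the slide's expressions as given; the function name and the parameter values (20 nodes, 3 colluding, forwarding probability 0.75) are hypothetical.

```python
def predecessor_probabilities(n, c, p_f):
    """Return (p_true, p_other) for a colluding node's observed predecessor.

    p_true:  probability that the predecessor is the originator.
    p_other: probability assigned to each other non-colluding node.
    """
    p_true = 1 - p_f * (n - c - 1) / n
    p_other = p_f / n
    return p_true, p_other

N_NODES, N_COLLUDING, P_F = 20, 3, 0.75   # hypothetical parameters
p_true, p_other = predecessor_probabilities(N_NODES, N_COLLUDING, P_F)

# The originator plus the remaining non-colluding nodes should account
# for all of the probability mass.
total = p_true + (N_NODES - N_COLLUDING - 1) * p_other
print(p_true, p_other, total)
```

The gap between `p_true` and `p_other` is what a single observation leaks about the originator.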

The Crowds protocol as a simplex channel
$\Phi$: $X$ = set of originator nodes $\{0, \ldots, N-3\}$ → $Y$ = set of predecessor nodes $\{0, \ldots, N-3\}$, with $\Phi(X) = Y$
Assumption: all senders equally likely
$P(Y = j \mid X = i) = p_{ij} = p_f / N$ for $i \neq j$; $= 1 - p_f (N-2)/N$ for $i = j$

The Crowds protocol
$C = 1 + (N-2) p_f / N \, \log[1 - (N-2) p_f / N] + p_f / N \, \log[p_f / N]$
$= 2 \log 2 / N$ if $p_f = 1$
$\approx 2 \log 2 / N + (N-1)\delta^2$ if $p_f = 1 - \delta$
Average path length $= (1 - \delta)/\delta = O(1/\delta)$
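The path-length expression can be checked by simulation. This sketch assumes that path length counts only the forwards decided by the coin flip, each taken independently with the forwarding probability 1 - delta. That counting convention, the function name, and delta = 0.25 are assumptions for illustration.

```python
import random

def forwards_before_delivery(p_f, rng):
    """Count consecutive forwards, each taken with probability p_f."""
    hops = 0
    while rng.random() < p_f:
        hops += 1
    return hops

DELTA = 0.25                 # hypothetical: forwarding probability is 1 - DELTA
rng = random.Random(3)       # fixed seed so the sketch is repeatable
trials = 50_000
mean_length = sum(forwards_before_delivery(1 - DELTA, rng)
                  for _ in range(trials)) / trials
expected = (1 - DELTA) / DELTA
print(f"simulated {mean_length:.3f}, formula {expected:.3f}")
```

As delta shrinks the paths grow like 1/delta, which is the latency cost of the stronger mixing.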

The replay attack on Crowds
Repetition code $\Leftrightarrow$ resending the message along a different (randomly chosen) route
How about attacks corresponding to other codes?
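A minimal simulation of the replay attack viewed as a repetition code, under stated assumptions: every resend is observed by a colluding node, the observed predecessor follows the probabilities given earlier for Crowds (originator with probability 1 - p_f (N - C - 1)/N, otherwise uniform over the other non-colluding nodes), and the attacker decodes by picking the most frequent predecessor. Function names and parameter values are hypothetical.

```python
import random
from collections import Counter

def observe_predecessor(originator, honest_nodes, n, c, p_f, rng):
    """One noisy observation of the originator by a colluding node."""
    p_true = 1 - p_f * (n - c - 1) / n
    if rng.random() < p_true:
        return originator
    others = [node for node in honest_nodes if node != originator]
    return rng.choice(others)

def replay_attack(originator, honest_nodes, n, c, p_f, resends, rng):
    """Decode the repetition code: return the most frequently seen predecessor."""
    counts = Counter(
        observe_predecessor(originator, honest_nodes, n, c, p_f, rng)
        for _ in range(resends)
    )
    return counts.most_common(1)[0][0]

N_NODES, N_COLLUDING, P_F = 20, 3, 0.75          # hypothetical parameters
honest = list(range(N_NODES - N_COLLUDING))      # labels of non-colluding nodes
rng = random.Random(7)                           # fixed seed for repeatability
guess = replay_attack(5, honest, N_NODES, N_COLLUDING, P_F, 200, rng)
print("guessed originator:", guess)
```

Majority decoding is the simplest choice here; the slide's closing question is whether codes other than repetition would need fewer observations.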
