Adversaries and Information Leaks Geoffrey Smith Florida International University November 7, 2007 TGC 2007 Workshop on the Interplay of Programming Languages and Cryptography
Motivation • Suppose that a program c processes some sensitive information. • How do we know that c will not leak the information, either accidentally or maliciously? • How can we ensure that c is trustworthy? • Secure information flow analysis: do a static analysis (e.g. type checking) on c prior to executing it.
Two Adversaries • The program c: • Has direct access to the sensitive information (H variables) • Behavior is constrained by the static analysis • The observer O of c’s public output: • Has direct access only to c’s public output (final values of L variables, etc.) • Behavior is unconstrained (except for computational resource bounds)
The Denning Restrictions • Classification: An expression is H if it contains any H variables; otherwise it is L. • Explicit flows: An H expression cannot be assigned to an L variable. • Implicit flows: A guarded command with an H guard cannot assign to L variables.

    if ((secret % 2) == 0)   // H guard
      leak = 0;              // assignment to L variable
    else
      leak = 1;
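As a rough illustration, the two restrictions can be enforced by a single recursive pass over commands. The Python sketch below assumes a hypothetical dictionary-based AST and invented helper names; it only shows the shape of the check, not the actual analysis from the talk.

```python
# Hypothetical sketch of a checker for the Denning Restrictions.
# The AST encoding (dicts with "kind", "vars", ...) is invented for illustration.

H, L = "H", "L"

def expr_level(expr, env):
    """An expression is H if it mentions any H variable, otherwise L."""
    return H if any(env[v] == H for v in expr["vars"]) else L

def check(cmd, env, pc=L):
    """pc is H inside commands guarded by an H expression (implicit flows)."""
    if cmd["kind"] == "assign":
        if env[cmd["lhs"]] == L and (expr_level(cmd["rhs"], env) == H or pc == H):
            raise TypeError("flow into L variable " + cmd["lhs"])
    elif cmd["kind"] in ("if", "while"):
        new_pc = H if (pc == H or expr_level(cmd["guard"], env) == H) else L
        for sub in cmd["body"]:
            check(sub, env, new_pc)
    elif cmd["kind"] == "seq":
        for sub in cmd["body"]:
            check(sub, env, pc)

# The example above is rejected: an H guard controls an assignment to the L variable leak.
prog = {"kind": "if", "guard": {"vars": {"secret"}},
        "body": [{"kind": "assign", "lhs": "leak", "rhs": {"vars": set()}}]}
# check(prog, {"secret": H, "leak": L})   # raises TypeError
```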
Noninterference • If c satisfies the Denning Restrictions, then (assuming c always terminates) the final values of L variables are independent of the initial values of H variables. • So observer O can deduce nothing about the initial values of H variables. • Major practical challenge: How can we relax noninterference to allow “small” information leaks, while still preserving security?
Talk Outline • Secure information flow for a language with encryption [FMSE’06] • Termination leaks in probabilistic programs [PLAS’07] • Foundations for quantitative information flow Joint work with Rafael Alpízar.
I. Secure information flow for a language with encryption • Suppose that E and D denote encryption and decryption with a suitably-chosen shared key K. • Programs can call E and D but cannot access K directly. • Intuitively, we want the following rules: • If e is H, then E(e) is L. • If e is either L or H, then D(e) is H. • But is this sound? Note that it violates noninterference, since E(e) depends on e.
It is unsound if encryption is deterministic! Assume secret : H, mask : L, leak : L.

    leak := 0;
    mask := 2^(n-1);
    while mask ≠ 0 do (
      if E(secret | mask) = E(secret) then
        leak := leak | mask;
      mask := mask >> 1
    )
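To see concretely why determinism is fatal, the sketch below simulates the attack in Python, using a keyed hash as a stand-in for a deterministic E under a fixed hidden key (an assumption made only for this demo): ciphertext equality reveals plaintext equality, so the loop recovers every bit of secret.

```python
# Illustrative only: a keyed hash plays the role of deterministic encryption E.
import hashlib

def E_det(m: int) -> bytes:
    key = b"fixed hidden key"                         # assumption for the demo
    return hashlib.sha256(key + m.to_bytes(16, "big")).digest()

def attack(secret: int, n: int) -> int:
    leak, mask = 0, 1 << (n - 1)
    while mask != 0:
        if E_det(secret | mask) == E_det(secret):     # true iff this bit of secret is 1
            leak |= mask
        mask >>= 1
    return leak

secret = 0b1011001110
assert attack(secret, 10) == secret                   # every bit of secret is recovered
```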
Symmetric Encryption Schemes • SE with security parameter k is a triple (K, E, D) where • K is a randomized key-generation algorithm • We write K <- K. • E is a randomized encryption algorithm • We write C <- E_K(M). • D is a deterministic decryption algorithm • We write M := D_K(C).
IND-CPA Security • The adversary A submits pairs of equal-length messages (M1, M2) to an oracle E_K(LR(·,·,b)) and receives ciphertexts C. • The box contains key K and selection bit b. • If b = 0, the left strings are encrypted. • If b = 1, the right strings are encrypted. • The adversary A wants to guess b.
IND-CPA advantage • Experiment Exp^ind-cpa-b(A): K <- K; d <- A^{E_K(LR(·,·,b))}; return d • Adv^ind-cpa(A) = Pr[Exp^ind-cpa-1(A) = 1] – Pr[Exp^ind-cpa-0(A) = 1]. • SE is IND-CPA secure if no polynomial-time adversary A can achieve a non-negligible IND-CPA advantage.
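The experiment can be phrased as a short sketch; the helper names (keygen, lr_oracle, exp_ind_cpa) and the adversary/encryption interfaces are assumptions made for illustration, not a real cryptographic library.

```python
# Illustrative sketch of the IND-CPA experiment; names and interfaces are assumptions.
import os
from typing import Callable

def keygen() -> bytes:                                # K <- K
    return os.urandom(16)

def lr_oracle(enc: Callable[[bytes, bytes], bytes], key: bytes, b: int):
    """E_K(LR(.,.,b)): encrypts the left message if b = 0, the right one if b = 1."""
    def oracle(m0: bytes, m1: bytes) -> bytes:
        assert len(m0) == len(m1)                     # queries must have equal length
        return enc(key, m1 if b == 1 else m0)
    return oracle

def exp_ind_cpa(adversary, enc, b: int) -> int:
    """Exp^ind-cpa-b(A): run A with the oracle and return its guess d."""
    key = keygen()
    return adversary(lr_oracle(enc, key, b))

# Adv^ind-cpa(A) = Pr[exp_ind_cpa(A, enc, 1) = 1] - Pr[exp_ind_cpa(A, enc, 0) = 1],
# estimated in practice by running the experiment many times.
```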
Our programming language • e ::= x | n | e1 + e2 | … | D(e1, e2) • c ::= x := e | x <- R | (x,y) <- E(e) | skip | if e then c1 else c2 | while e do c | c1; c2 • Here x <- R is random assignment according to distribution R. • Note: values are n bits, ciphertexts 2n bits.
Our type system • Each variable is classified as H or L. • We just enforce the Denning restrictions, with modifications for the new constructs. • E(e) is L, even if e is H. • D(e1,e2) is H, even if e1 and e2 are L. • R (a random value) is L.
Leaking adversary B • B has an H variable h and an L variable l, and other variables typed arbitrarily. • h is initialized to 0 or 1, each with probability ½. • B can call E() and D(). • B tries to copy the initial value of h into l.
Leaking advantage of B • Experiment Exp^leak(B): K <- K; h0 <- {0,1}; h := h0; initialize other variables to 0; run B^{E_K(·),D_K(·)}; if l = h0 then return 1 else return 0 • Adv^leak(B) = 2 · Pr[Exp^leak(B) = 1] - 1
Soundness via Reduction • For now, drop D() from the language. • Theorem: Given a well-typed leaking adversary B that runs in time p(k), there exists an IND-CPA adversary A that runs in time O(p(k)) and such that Adv^ind-cpa(A) ≥ ½·Adv^leak(B). • Corollary: If SE is IND-CPA secure, then no polynomial-time, well-typed leaking adversary B achieves non-negligible advantage.
Proof of Theorem • Given B, construct A that runs B with a randomly-chosen 1-bit value of h. • Whenever B calls E(e), A passes (0^n, e) to its oracle E_K(LR(·,·,b)). • So if b = 1, E(e) returns E_K(e). • And if b = 0, E(e) returns E_K(0^n), a random value that has nothing to do with e! • If B terminates within p(k) steps and succeeds in leaking h to l, then A guesses 1. • Otherwise A guesses 0.
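A rough sketch of the wrapper A, assuming a hypothetical run_B driver that executes B for a bounded number of steps with a given secret bit and encryption oracle:

```python
# Sketch of the reduction: A runs B, answering each call E(e) with the pair (0^|e|, e).
import secrets

def adversary_A(lr, run_B, step_bound):
    """lr is A's oracle E_K(LR(.,.,b)); run_B(h, enc, steps) is a hypothetical driver
    that runs B with initial secret bit h and encryption oracle enc, returning B's
    final value of l (or None if B does not finish within the step bound)."""
    h = secrets.randbelow(2)                          # h <- {0,1}
    enc = lambda e: lr(bytes(len(e)), e)              # b = 1: E_K(e);  b = 0: E_K(0^n)
    l = run_B(h, enc, step_bound)
    return 1 if l == h else 0                         # guess b = 1 iff B leaked h
```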
What is A’s IND-CPA advantage? • If b = 1, B is run faithfully. • Hence Pr[Exp^ind-cpa-1(A) = 1] = Pr[Exp^leak(B) = 1] = ½·Adv^leak(B) + ½ • If b = 0, B is not run faithfully. • Here B is just a well-typed program with random assignment but no encryption.
More on the b = 0 case • In this case, we expect the type system to prevent B from leaking h to l. • However, when B is run unfaithfully, it might fail to terminate! • Some careful analysis is needed to deal with this possibility — Part II of talk! • But in the end we get Pr[Exp^ind-cpa-0(A) = 1] ≤ ½ • So Adv^ind-cpa(A) ≥ ½·Adv^leak(B), as claimed. □
Can we get a result more like noninterference? • Let c be a well-typed, polynomial-time program. • Let memories μ and ν agree on L variables. • Run c under either μ or ν, each with probability ½. • A noninterference adversary O is given the final values of the L variables of c. • O tries to guess which initial memory was used.
A computational noninterference result • Theorem: No polynomial-time adversary O (for c, μ, and ν) can achieve a non-negligible noninterference advantage. • Proof idea: Given O, we can construct a well-typed leaking adversary B. • Note that O cannot be assumed to be well typed! • But because O sees only the L variables of c, it is automatically well typed under our typing rules.
Construction of B

    initialize L variables of c according to μ and ν;
    if h = 0 then
      initialize H variables of c according to μ
    else
      initialize H variables of c according to ν;
    c;
    O;        // O puts its guess into g
    l := g
Related work • Laud and Vene [FCT 2005]. • Work on cryptographic protocols: Backes and Pfitzmann [Oakland 2005], Laud [CCS 2005], … • Askarov, Hedin, Sabelfeld [SAS’06] • Laud [POPL’08] • Vaughan and Zdancewic [Oakland’07]
II. Termination leaks in probabilistic programs • In Part I, we assumed that all adversaries run in time polynomial in k, the key size. • This might seem to be “without loss of generality” (practically speaking) since otherwise the adversary takes too long. • But what if program c either terminates quickly or else goes into an infinite loop? • In that case, observer O might quickly be able to observe whether c terminates.
The Denning Restrictions and Nontermination • The Denning Restrictions allow H variables to affect nontermination, e.g.:

    while secret = 0 do skip;
    leak := 1

• “If c satisfies the Denning Restrictions, then (assuming c always terminates) the final values of L variables are independent of the initial values of H variables.” • Can we quantify such termination leaks?
Probabilistic Noninterference • Consider probabilistic programs with random assignment but no encryption. • Such programs are modeled as Markov chains of configurations (c,μ). • And noninterference becomes • The final probability distribution on L variables is independent of the initial values of H variables.
A program that violates probabilistic noninterference • If h = 0, terminates with l = 0 with probability ½ and loops with probability ½. • If h = 1, terminates with l = 1 with probability ½ and loops with probability ½.

    t <- {0,1};              // randomly assign 0 or 1 to t
    if t = 0 then (
      while h = 1 do skip;
      l := 0
    ) else (
      while h = 0 do skip;
      l := 1
    )

Here t : L, h : H, l : L.
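A quick simulation of this program (illustrative only), treating the would-be infinite loops as a "loop" outcome, shows the two output distributions:

```python
# Illustrative simulation: tally the outcome of the program for h = 0 and h = 1,
# reporting "loop" where the program would diverge.
import random
from collections import Counter

def run(h: int) -> str:
    t = random.randrange(2)                    # t <- {0,1}
    if t == 0:
        return "loop" if h == 1 else "l = 0"   # while h = 1 do skip; l := 0
    else:
        return "loop" if h == 0 else "l = 1"   # while h = 0 do skip; l := 1

for h in (0, 1):
    print(h, Counter(run(h) for _ in range(10_000)))
# h = 0: roughly half "l = 0", half "loop";  h = 1: roughly half "l = 1", half "loop"
```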
Approximate probabilistic noninterference • Theorem: If c satisfies the Denning restrictions and loops with probability at most p, then c’s deviation from probabilistic noninterference is at most 2p. • In our example, p = ½, and the deviation is |½ - 0| + |0 – ½| = 1 = 2p: the first term compares the probability that l = 0 when h = 0 and when h = 1, and the second compares the probability that l = 1 in the two cases.
Stripped program c′ • Replace all subcommands of c that don’t assign to L variables with skip. • Note: c′ contains no H variables.

Original program c:

    t <- {0,1};
    if t = 0 then ( while h = 1 do skip; l := 0 )
    else ( while h = 0 do skip; l := 1 )

Stripped program c′:

    t <- {0,1};
    if t = 0 then ( skip; l := 0 )
    else ( skip; l := 1 )
The Bucket Property • [Figure: two rows of buckets labeled loop, l = 0, l = 1, l = 2: the top row holds the outcome probabilities of c, the bottom row those of the stripped program c′.] • The stripped program’s buckets are obtained from c’s by pouring water from the loop bucket into the other buckets.
Proof technique • In prior work on secure information flow, probabilistic bisimulation has often been useful. • Here we use a probabilistic simulation [Jonsson and Larsen 1991] instead. • We define a modification of the weak simulation considered by [Baier, Katoen, Hermanns, Wolf 2005].
Fast Simulation on a Markov chain (S, P) • A fast simulation R is a binary relation on S such that if s1 R s2 then the states reachable in one step from s1 can be partitioned into U and V such that • v R s2 for every v ∈ V • letting K = Σ_{u∈U} P(s1,u), if K > 0 then there exists a function Δ : S × S → [0,1] with • Δ(u,w) > 0 implies that u R w • P(s1,u)/K = Σ_{w∈S} Δ(u,w) for all u ∈ U • P(s2,w) = Σ_{u∈U} Δ(u,w) for all w ∈ S.
Proving the Bucket Property • Given R, a set T is upwards closed if s ∈ T and s R s′ implies s′ ∈ T. • Pr(s,n,T) is the probability of going from s to a state in T in at most n steps. • Theorem: If R is a fast simulation, T is upwards closed, and s1 R s2, then Pr(s1,n,T) ≤ Pr(s2,n,T) for every n. • We can define a fast simulation R_L such that (c,μ) R_L (c′,μ), for any well-typed c.
Approximate noninterference • [Figure: three rows of buckets labeled loop, l = 0, l = 1, l = 2: the top row is (c,μ) and the bottom row is (c,ν), each with at most p in the loop bucket; the middle row is (c′,μ) = (c′,ν), since μ and ν agree on L variables.]
Remarks • Observer O’s ability to distinguish μ and ν by statistical hypothesis testing could be bounded as in [Di Pierro, Hankin, Wiklicky 2002]. • The Bucket Property is also crucial to the soundness proof of the type system considered in Part I of this talk.
III. Foundations for quantitative information flow • To allow “small” information leaks, we need a quantitative theory of information. • Quite a lot of recent work: • Clark, Hunt, Malacaria [2002, 2005, 2007] • Köpf and Basin [CCS’07] • Clarkson, Myers, Schneider [CSFW’05] • Lowe [CSFW’02] • Di Pierro, Hankin, Wiklicky [CSFW’02]
Research Steps • Define a quantitative notion of information flow. • Show that the notion gives appropriate security guarantees. • Devise static analyses to enforce a given quantitative flow policy. • Prove the soundness of the analyses. Here we’ll discuss only steps 1 and 2.
Our Conceptual Framework • Rather than trying to tackle the general problem, let’s consider important special cases to better see what’s going on. • Assume that secret h is chosen from some space S with some a priori distribution. • Assume that c is a program that has only h as input and (maybe) leaks information about h to its unique public output l. • Assume that c is deterministic and total.
Then… [Köpf and Basin 07] • There exists a function f such that l = f(h). • f induces an equivalence relation on S: • h1 ~ h2 iff f(h1) = f(h2) • So c partitions S into equivalence classes f⁻¹(l1), f⁻¹(l2), f⁻¹(l3), …
What is leaked? • The observer O sees the final value of l. • This tells O which equivalence class h belonged to. • How bad is that? • Extreme 1: If f is a constant function, then there is just one equivalence class, and noninterference holds. • Extreme 2: If f is one-to-one, then the equivalence classes are singletons, and we have total leakage of h (in principle…).
Quantitative Measures • Consider a discrete random variable X whose values have probabilities p_1, p_2, p_3, … • Assume p_i ≥ p_{i+1} • Shannon Entropy H(X) = Σ_i p_i log(1/p_i) • “uncertainty” about X • expected number of bits to transmit X • Guessing Entropy G(X) = Σ_i i·p_i • expected number of guesses to guess X
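Both measures are straightforward to compute for a given distribution; a small illustrative helper:

```python
# Illustrative helpers for the two measures on a finite distribution.
from math import log2

def shannon_entropy(probs):
    """H(X) = sum_i p_i * log2(1/p_i), over nonzero probabilities."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

def guessing_entropy(probs):
    """G(X) = sum_i i * p_i, with probabilities sorted into decreasing order."""
    return sum(i * p for i, p in enumerate(sorted(probs, reverse=True), start=1))

print(shannon_entropy([0.5, 0.25, 0.25]))   # 1.5 bits
print(guessing_entropy([0.5, 0.25, 0.25]))  # 1.75 expected guesses
```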
Shannon Entropy applied to the partitions induced by c • Let’s assume that • |S| = n and h is uniformly distributed • The partition has r equivalence classes C_1, C_2, …, C_r and |C_i| = n_i • H(h) = log n • “initial uncertainty about h” • H(l) = Σ_i (n_i/n) log(n/n_i) • Plausibly, “amount of information leaked” • Extreme 1: H(l) = 0 • Extreme 2: H(l) = log n
How much uncertainty about h remains after the attack? • This can be calculated as a conditional Shannon entropy: • H(h|l) = Σ_i (n_i/n) H(C_i) = Σ_i (n_i/n) log n_i • Extreme 1: H(h|l) = log n • Extreme 2: H(h|l) = 0 • There is a pretty equation relating these! • H(h) = H(l) + H(h|l) (initial uncertainty = information leaked + remaining uncertainty)
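For a uniform h and class sizes n_i, these formulas and the identity H(h) = H(l) + H(h|l) can be checked numerically (an illustrative sketch, not from the talk):

```python
# Illustrative check of H(h) = H(l) + H(h|l) for uniform h and class sizes n_i.
from math import log2

def leaked_and_remaining(class_sizes):
    n = sum(class_sizes)
    H_l   = sum((ni / n) * log2(n / ni) for ni in class_sizes)   # information leaked
    H_h_l = sum((ni / n) * log2(ni) for ni in class_sizes)       # remaining uncertainty
    return H_l, H_h_l

H_l, H_h_l = leaked_and_remaining([4, 4, 8])          # n = 16, so H(h) = 4 bits
assert abs(H_l + H_h_l - log2(16)) < 1e-9
print(H_l, H_h_l)                                     # 1.5 and 2.5
```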
So is Step 1 finished? • “1. Define a quantitative notion of information flow” • In the special case that we are considering, it seems promising to define the amount of information leaked to be H(l), and the remaining uncertainty to be H(h|l). • This seems to be the literature consensus: • Clark, Hunt, Malacaria • Köpf and Basin — also use G(l) and G(h|l) • Clarkson, Myers, Schneider (?) — Sec. 4.4, when the attacker’s belief matches the a priori distribution
What about Step 2? • “2. Show that the notion gives appropriate security guarantees.” • “Leakage = 0 iff noninterference holds” • Good, but this establishes only that the zero/nonzero distinction is meaningful! • More interesting is the Fano Inequality. • But this gives extremely weak bounds. • Does the value of H(h|l) accurately reflect the threat to h?
An Attack • Copy 1/10 of the bits of h into l: • l = h & 017777…7; • Gives 2^(.1 log n) = n^.1 equivalence classes, each of size 2^(.9 log n) = n^.9. • H(l) = .1 log n • H(h|l) = .9 log n • After this attack, 9/10 of the bits are completely unknown.
Another Attack • Put 90% of the possible values of h into one big equivalence class, and put each of the remaining 10% into singleton classes: • if (h < n/10) l = h; else l = -1; • H(l) = .9 log (1/.9) + .1 log n ≈ .1 log n + .14 • H(h|l) = .9 log (.9n) ≈ .9 log n – .14 • Essentially the same as the previous attack! • But now O can guess h with probability 1/10.
A New Measure • With H(h|l), we can’t do Step 2 well! • Why not use Step 2 to guide Step 1? • Define V(h|l), the vulnerability of h given l, to be the probability that O can guess h correctly in one try, given l.
Calculating V(h|l) when h is uniformly distributed • As before, assume that • |S| = n and h is uniformly distributed • The partition has r equivalence classes C_1, C_2, …, C_r and |C_i| = n_i • V(h|l) = Σ_{i=1}^{r} (n_i/n)(1/n_i) = r/n • So all that matters is the number of equivalence classes, not their sizes!
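A quick numerical check of V(h|l) = r/n for the two attacks above, taking n = 2^20 so that copying 1/10 of the bits means copying 2 of 20 bits (illustrative only):

```python
# Illustrative computation of V(h|l) = r/n from a partition's class sizes.
def vulnerability(class_sizes):
    n = sum(class_sizes)
    return sum((ni / n) * (1 / ni) for ni in class_sizes)   # each class contributes 1/n

n = 2 ** 20
copy_tenth = [2 ** 18] * (2 ** 2)                # copy 2 of 20 bits: n^.1 classes of size n^.9
singletons = [n - n // 10] + [1] * (n // 10)     # ~90% in one class, ~10% in singletons

print(vulnerability(copy_tenth))   # 4 / 2**20  ~ 3.8e-6 : h is still hard to guess
print(vulnerability(singletons))   # ~ 0.1               : h is guessed one time in ten
```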
Examples • Noninterference case: r = 1, V(h|l) = 1/n • Total leakage case: r = n, V(h|l) = 1 • Copy 1/10 of bits: r = n^.1, V(h|l) = 1/n^.9 • Put 1/10 of h’s values into singleton classes: r = 1 + n/10, V(h|l) ≈ 1/10 • Put h’s values into classes, each of size 10: r = n/10, V(h|l) = 1/10 • Password checker: r = 2, V(h|l) = 2/n