400 likes | 486 Views
Dynamic Enforcement of Knowledge-based Security Policies. Piotr (Peter) Mardziel. Joint work with Stephen Magill, Michael Hicks (UMD), and Mudhakar Srivatsa (IBM TJ Watson). Where is your information?. Bad + Good. Take back control. Photography. o ut = 24 ≤ Age ≤ 30 Æ Female?
E N D
Dynamic Enforcement of Knowledge-based Security Policies Piotr (Peter) Mardziel Joint work with Stephen Magill, Michael Hicks (UMD), and MudhakarSrivatsa (IBM TJ Watson)
Take back control Photography
out = 24 ≤ Age ≤ 30 Æ Female? ÆEngaged? Photography true
out = (ssn, ccn, fav-color) reject
The big problem: how to decide to run a query or not? Q1 Q2 Q3 out = (ssn, ccn, fav-color) out = 24 ≤ age ≤ 30 Æ female? Æengaged? out = age ?
Outline • Clarkson’s work on quantified information flow • (adversary) belief/knowledge and belief revision (Bayesian revision) • probability of adversary guessing secret • Suitable knowledge-based policy formulation • maintain bound while revealing query outputs or rejecting queries • Clarkson’s probabilistic semantics • Computational feasibility • approximation of knowledge • abstract interpretation (of probabilistic semantics) • soundness in terms of policy • Experimental results • compare to prob-scheme, enumeration-based probabilistic evaluation
Clarkson’s Quantitative Information Flow • Given belief, query, determine revised belief given the output of the query (Bayesian revision) • Pr[gender = male | output = o] • Pr[Bad] < t = • adversary holding belief has bounded probability of guessing my secret(s) in one try • sufficient: Pr[gender = male | output = o] < t • Assumption • initial belief is correct • our approach might help here (later) • adversary guesses by sampling Ex: Pr[gender = male] = 0.5 Pr[gender = female] = 0.5 = belief (probability distribution) over secret data
More about Clarkson’s semantics later • Let us consider an example • Flawed initial policy • Pr[my secret] < t • demonstrate problem with rejection • revise policy to address problem
Meet Bob Bob (born September 27, 1980) bday = 270 byear = 1980 Secret 0 ·bday· 364 1956 ·byear· 1992 = byear Policy Pr[bday] < 0.2 Pr[bday,byear] < 0.05 1992 Currently Pr[bday] = 1/365 Pr[bday,byear] = 1/(365*37) 1956 bday 0 364
1992 bday-query1 today := 260; if bday¸ today Æbday < (today + 7) then out := true else out := false 1956 0 364 1992 | (out = false) = 1956 Bob bday = 270 byear = 1980 0 364 267 259
bday-query1 today := 260; if bday¸ today Æbday < (today + 7) then out := true else out := false Potentially Pr[bday] = 1/358 < 0.2 Pr[bday,byear] = 1/(358*37) < 0.05 1992 | (out = false) = = 1956 Bob bday = 270 byear = 1980 0 364 267 259
Next day … 1992 bday-query2 today := 261; if bday¸ today Æbday < (today + 7) then out := true else out := false 1956 0 364 267 259 1992 | (out = false) = 1956 Bob bday = 270 byear = 1980 0 364 268 259
bday-query2 today := 261; if bday¸ today Æbday < (today + 7) then out := true else out := false Potentially Pr[bday] = 1/357 < 0.2 Pr[bday,byear] = 1/(357*37) < 0.05 1992 | (out = false) = 1956 Bob bday = 270 byear = 1980 0 364 268 259
Meet Bob’ bday-query2 today := 261; if bday¸ today Æbday < (today + 7) then out := true else out := false Bob’ bday = 267 byear = 1980 after first query (potentially) after second 1992 1992 1956 1956 0 364 Pr[bday] = 1 267 So reject? 259 0 267
Querier’s perspective Assume querier knows policy if bday != 267 if bday = 267 1992 1992 1956 will get answer will get reject 1956 0 364 268 259 0 267
Querier’s perspective | reject = • Solution? • Decide policy independently of secret • Revised policy • For every possible output o, for every possible bday, Pr[bday | out = o] < t • So the real bday in particular • Therefore Pr[bad] < t
bday-query2 today := 261; if bday¸ today Æbday < (today + 7) then out := true else out := false reject (regardless of what bday actually is)
Clarkson’s Probabilistic Interpretation • Given ± : States !R • probability distribution on program states (including secrets) • ±(¾) = probability of state ¾ • Given program S • Compute • «S¬± • probability distribution on resulting program states • ±Æ B • (sub)probability distribution of ± only on states consistent with B • ± | (out = true) • Bayesian revision, post belief
Probabilistic Interpretation • Semantics • «skip¬± = ± • «S1;S2¬± = «S2¬(«S1¬±) • «if B then S1 else S2¬± = «S1¬(±Æ B) + «S2¬(±Æ: B) • «pif p then S1 else S2¬± = «S1¬(p*±) + «S2¬((1-p)*±) • «x := E¬± = ±[x ! E] • «while B do S¬ = lfp(¸± . F(«S¬(±Æ B)) + (±Æ: B)) • Operations • p*± – scale probabilities by p • ±Æ B – remove mass inconsistent with B • ±1 + ±2 – combine mass from both • ±[x ! E] – transform mass
Subdistribution operations ±1 + ±2 – combine mass from both ±1 + ±2 = ¸¾. ±1(¾) + ±2(¾) ±Æ B – remove mass inconsistent with B ±Æ B = ¸¾. if «B¬¾ = true then ±(¾) else 0 «S¬ ± «y := y + 3¬ (± Æx · 5) «y := y - 3¬ (± Æx > 5) ±Æ x · 5 ±Æx > 5 ±1+±2 ± ±1 ± B = x ¸ y ± Æ B ±2 • «if x · 5 then y := y + 3 else y := y - 3¬±= «y := y + 3¬(±Æx · 5) + «y := y - 3¬(±Æx > 5)
Infeasibility • Computational trouble • ±[x ! E] = ¸¾ . ¿ |¿[x ! [[E]]¿] = ¾±(¿) • max¾±(¾) = ? (for policy check) • enumeration? • Sampling (prob-scheme, IBAL, …) • evaluate statement for some set of input states • poor probability bounds if evaluated on small subset of possible states (later) • prohibitive (time, memory) for large state space • Let’s try an approximation 15:00
Abstraction ± P • Approximate representation P • P abstracts a set of distributions, °(P) • sound probability bound • if ±2°(P) then max¾±(¾) ·maxprob(P) • ((S)) P – abstract interpretation • if±2°(P) then [[S]]±2°( ((S))P ) • P | (out = X) – abstract conditioning • if±2°(P) then (± | (out = X)) 2°(P | (out = X)) • (more) computationally feasible • let us revisit Bob to motivate a suitable abstraction
Representation 1992 Æ out = false P1 P2 P1 1956 0 364 1992 P1: 0 ·bday· 259, 1956 ·byear· 1992 p = 0.000074 s = 9620 m = 0.712 P2: 267 ·bday· 364, 1956 ·byear· 1992 p = 0.000074 s = 3626 m = 0.268 P1: 0 ·bday· 364, 1956 ·byear· 1992 p = 0.000074 s = 13505 (# of points) m = 1 (total mass) 1956 0 364 267 259 Can determine ( | out = false)
Æ out = false 1992 1992 1991 1981 1981 1971 1971 spec-byear-query age := 2011 – byear; if age = 20 Ç … Ç age = 60 then out := true else out := false; pif 0.1 then out := true 1961 1961 1956 1956 0 0 364 364 267 267 259 259 Æ out = true 1991 • pif p then S1 • evaluate S1 with probability p
Approximation 1992 1992 1981 1981 P1 1971 1971 P2 1961 1961 1956 1956 0 0 364 364 267 267 259 259 Æ out = true approximate P1: 0 ·bday· 259, 1992 ·byear· 1992 p = 0.0000074 s = 260 m = 0.0019 P2: 0 ·bday· 259, 1991 ·byear· 1991 p = 0.000074 s = 260 m = 0.019 … P18 P1: 0 ·bday· 259, 1956 ·byear· 1992 p · 0.000074 s = 9620 m = 0.141 P2: 267 ·bday· 364, 1956 ·byear· 1992 p · 0.000074 s = 3626 m = 0.053 1991 P1 P2 soundness
Approximation 1992 1992 1981 1981 P1 1971 1971 1961 1961 P2 1956 1956 0 0 364 364 267 267 259 259 Æ out = false approximate P1: 0 ·bday· 259, 1992 ·byear· 1992 p = 0.000067 s = 260 m = 0.019 P2: 0 ·bday· 259, 1982 ·byear· 1990 p = 0.000067 s = 2340 m = 0.173 … P10 P1: 0 ·bday· 259, 1956 ·byear· 1992 p = 0.000067 s = 8580 != size of region m = 0.572 P2: 267 ·bday· 364, 1956 ·byear· 1992 p = 0.000067 s = 3234 != size of region m = 0.215 1991 P1 P2 p and s only refer to possible (non-zero probability) points in region
Approximation 1992 1981 1971 1961 1956 0 364 267 259 approximate For each Pi, store region (polyhedron) upper bound on probability of each possible point upper bound on the number of points upper bound on the total probability mass (useful) 1991 Probabilistic Polyhedron P1 P2 Pr[A | B] = Pr[A Æ B] / Pr[B] Also store lower bounds on the above
Approximation ±1 ±2 2 ° ( ) pimin, pimax simin, simax mimin, mimax ±3 … ±2°(P) iff support(±) µ°(C) pmin·±(¾) ·pmax for every ¾2 support(±) smin· | support(±) | ·smax mmin· mass(±) ·mmax support(±) = {¾ | ±(¾) > 0} C policy max¾±(¾) ·pmax
Abstract Interpretation • Need • ((S)) P • define identically to [[S]] P but using abstract operations • if ±2°(P) then [[S]]±2°( ((S))P ) • Need Abstract operations • P1 + P2 • if ±i2°(Pi) for i = 1,2 then ±1 + ±22°(P1 + P2) • P Æ B • if ±2°(P) then ±Æ B 2°(P Æ B) • p*P • if ±2°(P) then p * ±2°(p*P) • …
Abstract operation example ±1 + ±2 – combine mass from both ±1 + ±2= ¸¾. ±1(¾) + ±2(¾) | C1| = 100 | C2 | = 40 C2 ±1+±2 ±1 P1 ±2 P2 C1 s2max = 30 s1max = 100 P3 = P1 + P2 s3max· s1max + s2max = 130 • What is the maximum number of possible points in the sum? • determine minimum overlap (10) 20 80 20 | C1ÅC2 | = 20 s3max = 120
± P Operations • P3 = P1 + P2 • C3 – convex hull of C1, C2 • s3max – what is the smallest overlap? • s3min – what is the largest overlap? • p3max – is overlap possible? • p3min – is overlap impossible? • m3max – simple sum m1max + m2max • m3min – simple sum m1min+ m2min • Other operations, similar, complicated formulas abound • Need to • count number of integer points in a convex polyhedra • Latte • maximize a linear function over integer points in a polyhedron • Latte • convex hull, intersection, affine transform • Parma PPL
Precision 1992 1992 • Extend abstraction to a set of Probabilistic Polyhedrons • ±2°({P1,P2}) iff± = ±1 + ±2 with ±12°(P1) and ±22°(P2) • similarly for more than two • max¾±(¾) ·ipimax • can do better with a bit more work • performance / precision tradeoff 1981 1981 1971 1971 1961 1961 1956 1956 0 0 364 364 267 259 1991 1991 P1 P1 P2
Implementation and Results 0 ·bday· 364 1956 ·byear· 1992 = byear 1992 1956 bday 0 364
bday-query1 today := 260; if bday¸ today Æbday < (today + 7) then out := true else out := false Policy Pr[bday,byear] · 0.05 • prob (our implementation) • prob-scheme (sampling/enumeration) • provides sound estimation after partial enumeration • measure time and bound on max¾±(¾) produced
Implementation and Results 1 pp 0 ·bday· 364 1956 ·byear· 1992 0 ·bday· 364 1910 ·byear· 2010 = = > 1 pp
Precision? byear-query1 then spec-byear-query age := 2011 – byear; if age = 20 Ç … Ç age = 60 then out := true else out := false; pif 0.1 then out := true 0 ·bday· 364 1956 ·byear· 1992 = ? • Limiting number of prob. polyhedra requires merging two into one at various points • Deciding which ones to merge is troublesome • likely reason for the strangeness above
Conclusions • Knowledge-based policies • quantitative information flow, probabilistic semantics • bound on probability of specific bad events (guess of secret) • Dynamic enforcement, via abstract interpretation • policy formulation safe for rejection • resistant to state space explosion • can be sound in respect to many initial distributions • alleviates the problem of determining what is the correct initial distribution • Drawback • restricted language, integer linear expressions • Potential (future work) • simpler domains (Octagons) can replace Polyhedra • increased performance?
Thank you! Go back.