Constraint-Based Methods: Adding Algebraic Properties to Symbolic Models. Vitaly Shmatikov SRI International. One-Slide Summary. “Constraint solving” is a symbolic analysis method for cryptographic protocols Decidable without finite bounds on the attacker
Constraint-Based Methods: Adding Algebraic Properties to Symbolic Models. Vitaly Shmatikov, SRI International
One-Slide Summary • “Constraint solving” is a symbolic analysis method for cryptographic protocols • Decidable without finite bounds on the attacker • Big win over finite-state checking (FDR, Murφ, etc.) • Only need to specify behavior of honest participants • Can be extended with algebraic theories for XOR, modular multiplication, Diffie-Hellman • Push-button procedure for finding both Dolev-Yao and algebraic attacks (e.g., Pereira-Quisquater) • Works only for a finite number of sessions • “Attack template” must be expressed as a symbolic execution trace
Protocol Analysis Techniques
• Formal Models (no probabilities)
  • Modal Logics
  • Decidable
    • Finite-state
    • Infinite message space, finite sessions
      • Free attacker algebra
      • Attacker algebra with equational theory
  • Process Calculi
  • Inductive Proofs
  • …
• Computational Models
  • Probabilistic poly-time
  • Random oracle
  • …
Protocol Analysis Meets Algebra • Dolev-Yao model uses “black-box” cryptography • Many crypto primitives are not black boxes • XOR: $a \oplus b = b \oplus a$; $a \oplus a = 0$ • Modular exponentiation: $\alpha^{xy} = \alpha^{yx}$; $(\alpha^{xy})^{x^{-1}} = \alpha^{y}$ • Attacker can and will exploit algebraic properties • Ryan-Schneider attack on Bull’s recursive authentication protocol • Pereira-Quisquater attack on A-GDH.2 protocol • Goal: fully automated analysis of protocols with relevant algebraic theories • GDOI, group key management protocols, …
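A-GDH.2 Protocol [Ateniese, Steiner, Tsudik ’00]
• Parties start with pairwise keys $K_{az}, K_{bz}, K_{cz}$
• The goal is to establish a common session key $\alpha^{r_a r_b r_c r_z}$
• $p$ is prime, $q$ is a prime divisor of $p-1$, and $\alpha$ is a generator of the cyclic subgroup of $\mathbb{Z}_p^*$ of order $q$
Message flow:
  A → B: $\alpha,\ \alpha^{r_a}$
  B → C: $\alpha^{r_a},\ \alpha^{r_b},\ \alpha^{r_a r_b}$
  C → Z: $\alpha^{r_b r_c},\ \alpha^{r_a r_c},\ \alpha^{r_a r_b},\ \alpha^{r_a r_b r_c}$
  Z → A, B, C: $\alpha^{r_b r_c r_z K_{az}},\ \alpha^{r_a r_c r_z K_{bz}},\ \alpha^{r_a r_b r_z K_{cz}}$
B computes the session key $\alpha^{r_a r_b r_c r_z}$ as $(\alpha^{r_a r_c r_z K_{bz}})^{K_{bz}^{-1} r_b}$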
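The message flow of A-GDH.2 can be replayed numerically. The following is a minimal sketch, not part of the original slides: the toy parameters ($p = 23$, $q = 11$, $\alpha = 2$), the helper names `exp`, `inv` and `run_agdh2`, and the sample exponents are all assumptions chosen for illustration. It only checks that every party ends up with the same key $\alpha^{r_a r_b r_c r_z}$.

```python
# Toy group (assumed for illustration): 2 generates the order-11 subgroup of Z_23*.
P, Q, ALPHA = 23, 11, 2

def exp(base, *exponents):
    """Raise base to the product of the exponents (reduced mod Q) in Z_P*."""
    e = 1
    for x in exponents:
        e = (e * x) % Q
    return pow(base, e, P)

def inv(x):
    """Multiplicative inverse of an exponent modulo Q (Python 3.8+)."""
    return pow(x, -1, Q)

def run_agdh2(ra, rb, rc, rz, kaz, kbz, kcz):
    """Replay one honest run; return the keys computed by A, B, C and Z."""
    a_to_b = (ALPHA, exp(ALPHA, ra))
    b_to_c = (a_to_b[1], exp(ALPHA, rb), exp(a_to_b[1], rb))
    # C raises the values it received to rc and forwards alpha^(ra*rb) unchanged.
    c_to_z = (exp(b_to_c[1], rc), exp(b_to_c[0], rc),
              b_to_c[2], exp(b_to_c[2], rc))
    # Z blinds each value with rz and the pairwise key of the intended recipient.
    bcast = (exp(c_to_z[0], rz, kaz),
             exp(c_to_z[1], rz, kbz),
             exp(c_to_z[2], rz, kcz))
    key_z = exp(c_to_z[3], rz)
    key_a = exp(bcast[0], inv(kaz), ra)
    key_b = exp(bcast[1], inv(kbz), rb)   # (alpha^(ra rc rz Kbz))^(Kbz^-1 rb)
    key_c = exp(bcast[2], inv(kcz), rc)
    return key_a, key_b, key_c, key_z

keys = run_agdh2(ra=3, rb=5, rc=7, rz=2, kaz=4, kbz=6, kcz=9)
print(keys)
```

Exponents are reduced modulo $q$ because every base in the run is a power of $\alpha$, which has order $q$.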
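Is This Protocol Secure?
Suppose two sessions are run concurrently, and malicious C wants to learn the session key of the session from which he is excluded.
Session 1 (A, B, C, Z):
  A → B: $\alpha,\ \alpha^{r_a}$
  B → C: $\alpha^{r_a},\ \alpha^{r_b},\ \alpha^{r_a r_b}$
  C → Z: $\alpha^{r_b r_c},\ \alpha^{r_a r_c},\ \alpha^{r_a r_b},\ \alpha^{r_a r_b r_c}$
  Z → A, B, C: $\alpha^{r_b r_c r_z K_{az}},\ \alpha^{r_a r_c r_z K_{bz}},\ \alpha^{r_a r_b r_z K_{cz}}$
Session 2 (A, B, Z; C is excluded):
  A → B: $\alpha,\ \alpha^{q_a}$
  B → Z: $\alpha^{q_a},\ \alpha^{q_b},\ \alpha^{q_a q_b}$
  Z → A, B: $\alpha^{q_b q_z K_{az}},\ \alpha^{q_a q_z K_{bz}}$
Can the attacker who controls the network and participates in the 1st session learn the session key of the 2nd session?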
Model Checking Approach • Two sources of infinite behavior • Multiple protocol sessions, multiple participant roles • Message space or data space may be infinite • Finite approximation • Assume finite number of participants (this restriction is necessary, or the problem is undecidable) • Example: 2 clients, 2 servers • Assume finite message space (this restriction is not necessary for fully automated analysis!) • Represent random numbers by r1, r2, r3, … • Do not allow encrypt(encrypt(encrypt(…)))
Infinite-State Protocol Model [Amadio and Lugiez ‘00] [Rusinowitch and Turuani ‘01] • Finite number of processes • Each process models a protocol role • Messages modeled as terms with variables • Variables represent data under attacker’s control • Attacker capabilities modeled by a term algebra • No artificial bounds on attacker computations • Generates an infinite space of possible attacker messages • Protocol analysis problem reduces to a decidable symbolic constraint solving problem • Easy-to-use, practical software for protocol analysis [Boreale ‘01] [Millen and Shmatikov ‘01]
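Roles in A-GDH.2 Protocol
Message flow:
  A → B: $\alpha,\ \alpha^{r_a}$
  B → C: $\alpha^{r_a},\ \alpha^{r_b},\ \alpha^{r_a r_b}$
  C → Z: $\alpha^{r_b r_c},\ \alpha^{r_a r_c},\ \alpha^{r_a r_b},\ \alpha^{r_a r_b r_c}$
  Z → A, B, C: $\alpha^{r_b r_c r_z K_{az}},\ \alpha^{r_a r_c r_z K_{bz}},\ \alpha^{r_a r_b r_z K_{cz}}$
• Variables represent terms unknown to the party who plays the role
• Attacker can instantiate a variable with any value, but the instantiation must be consistent in all terms where it occurs
B role:
  B receives $\alpha,\ X_1$
  B sends $X_1,\ \alpha^{r_b},\ X_1^{r_b}$
  B receives $Y_1,\ Y_2^{K_{bz}},\ Y_3$
Z role:
  Z receives $Z_1,\ Z_2,\ Z_3,\ Z_4$
  Z sends $Z_1^{r_z K_{az}},\ Z_2^{r_z K_{bz}},\ Z_3^{r_z K_{cz}}$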
Symbolic Execution Trace
Suppose two sessions are run concurrently, and malicious C wants to learn the session key of the session from which he is excluded. (Message flows of both sessions as on the slide “Is This Protocol Secure?”)
Partial execution trace (there are finitely many):
• B and Z from 1st session
  B receives $\alpha,\ X_1$
  B sends $X_1,\ \alpha^{r_b},\ X_1^{r_b}$
  Z receives $Z_1,\ Z_2,\ Z_3,\ Z_4$
  Z sends $Z_1^{r_z K_{az}},\ Z_2^{r_z K_{bz}},\ Z_3^{r_z K_{cz}}$
• B from 2nd session
  B receives $\alpha,\ V_1$
  B sends $V_1,\ \alpha^{q_b},\ V_1^{q_b}$
  B receives $W_1,\ W_2^{K_{bz}},\ W_3$
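Is There A Feasible Attack?
• This attack is feasible if and only if the attacker can consistently instantiate all variables in the trace so that he can produce every message received by B and Z
Trace:
  B receives $\alpha,\ X_1$
  B sends $X_1,\ \alpha^{r_b},\ X_1^{r_b}$
  Z receives $Z_1,\ Z_2,\ Z_3,\ Z_4$
  Z sends $Z_1^{r_z K_{az}},\ Z_2^{r_z K_{bz}},\ Z_3^{r_z K_{cz}}$
  B receives $\alpha,\ V_1$
  B sends $V_1,\ \alpha^{q_b},\ V_1^{q_b}$
  B receives $W_1,\ W_2^{K_{bz}},\ W_3$
Goal: $W_2^{q_b}$. B will use this value as session key. If the attacker can learn (and announce) it, the protocol is broken.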
Symbolic Attack Traces • Attack is modeled as a symbolic execution trace • A trace is a sequence of message send and receive events • Attack trace ends in a violation (e.g., attacker learns the secret) • Messages contain variables, modeling data controlled by attacker • Adequate for trace-based security properties • Secrecy, authentication, some forms of fairness… • A symbolic trace may or may not have a feasible concrete instantiation • Finding whether such an instantiation exists is the main goal of symbolic (infinite-state) protocol analysis
From Attack Traces to Constraints • For each message sent by the attacker in the attack trace, create a symbolic constraint $m_i$ from $t_1, \dots, t_n$ • $m_i$ is the message the attacker needs to send • $t_1, \dots, t_n$ are the messages observed by the attacker up to this point • Attack is feasible if and only if all constraints are satisfiable simultaneously • There exists an instantiation $\sigma$ such that $m_i\sigma$ can be derived from $t_1\sigma, \dots, t_n\sigma$ in attacker’s term algebra
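Constraint Generation for A-GDH.2
Trace:
  B receives $\alpha,\ X_1$
  B sends $X_1,\ \alpha^{r_b},\ X_1^{r_b}$
  Z receives $Z_1,\ Z_2,\ Z_3,\ Z_4$
  Z sends $Z_1^{r_z K_{az}},\ Z_2^{r_z K_{bz}},\ Z_3^{r_z K_{cz}}$
  B receives $\alpha,\ V_1$
  B sends $V_1,\ \alpha^{q_b},\ V_1^{q_b}$
  B receives $W_1,\ W_2^{K_{bz}},\ W_3$
  Goal: $W_2^{q_b}$
Constraints:
  $\alpha,\ X_1$ from $\alpha,\ K_{cz}$ (attacker’s initial knowledge)
  $Z_1,\ Z_2,\ Z_3,\ Z_4$ from $\alpha,\ K_{cz},\ X_1,\ \alpha^{r_b},\ X_1^{r_b}$
  $\alpha,\ V_1$ from $\alpha,\ K_{cz},\ X_1,\ \alpha^{r_b},\ X_1^{r_b},\ Z_1^{r_z K_{az}},\ Z_2^{r_z K_{bz}},\ Z_3^{r_z K_{cz}}$
  $W_1,\ W_2^{K_{bz}},\ W_3$ from $\alpha,\ K_{cz},\ X_1,\ \alpha^{r_b},\ X_1^{r_b},\ Z_1^{r_z K_{az}},\ Z_2^{r_z K_{bz}},\ Z_3^{r_z K_{cz}},\ V_1,\ \alpha^{q_b},\ V_1^{q_b}$
  $W_2^{q_b}$ from $\alpha,\ K_{cz},\ X_1,\ \alpha^{r_b},\ X_1^{r_b},\ Z_1^{r_z K_{az}},\ Z_2^{r_z K_{bz}},\ Z_3^{r_z K_{cz}},\ V_1,\ \alpha^{q_b},\ V_1^{q_b}$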
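Generating constraints from a trace is mechanical: every message an honest party receives must be produced by the attacker from what he has observed so far. The sketch below illustrates this under simplifying assumptions: messages are opaque strings, events are hypothetical `("recv", msg)` / `("send", msg)` pairs seen from the honest parties' side, and the function name `constraints_from_trace` is invented for this example.

```python
def constraints_from_trace(initial_knowledge, trace):
    """Turn a symbolic trace into a list of (m_i, T_i) constraints.

    A "recv" event is a message the attacker must construct, so it yields a
    constraint against the current knowledge. A "send" event is a message
    the attacker observes, so it extends the knowledge.
    """
    knowledge = list(initial_knowledge)
    constraints = []
    for kind, msg in trace:
        if kind == "recv":
            constraints.append((msg, tuple(knowledge)))
        elif kind == "send":
            knowledge.append(msg)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return constraints

# First three events of the A-GDH.2 trace, written as plain strings.
trace = [
    ("recv", "alpha, X1"),
    ("send", "X1, alpha^rb, X1^rb"),
    ("recv", "Z1, Z2, Z3, Z4"),
]
for m, t in constraints_from_trace(["alpha", "Kcz"], trace):
    print(m, "from", ", ".join(t))
```

Because knowledge only grows, the right-hand sides produced this way are monotonic: each $T_i$ contains the previous one.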
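Dolev-Yao Term Algebra
Attacker’s term algebra is a set of derivation rules:
• $T \vdash u$ if $u = v$ for some $v \in T$
• $\dfrac{T \vdash u \quad T \vdash v}{T \vdash [u,v]}$
• $\dfrac{T \vdash u \quad T \vdash v}{T \vdash \mathrm{crypt}_u[v]}$
• $\dfrac{T \vdash [u,v]}{T \vdash u}$ and $\dfrac{T \vdash [u,v]}{T \vdash v}$
• $\dfrac{T \vdash \mathrm{crypt}_u[v] \quad T \vdash u}{T \vdash v}$
Symbolic constraint $m$ from $t_1, \dots, t_n$ is satisfiable if and only if there is a substitution $\sigma$ such that $t_1\sigma, \dots, t_n\sigma \vdash m\sigma$ is derivable using these rules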
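For ground terms (no variables), derivability under these rules can be checked directly. The following is a minimal sketch under assumed conventions: atoms are strings, `("pair", u, v)` stands for $[u,v]$, `("crypt", u, v)` stands for $\mathrm{crypt}_u[v]$, and the function names are hypothetical. It first closes the knowledge under projection and decryption, then checks whether the goal can be built by pairing and encryption.

```python
def synth(known, term):
    """Can term be built from known using only pairing and encryption?"""
    if term in known:
        return True
    if isinstance(term, tuple):
        _tag, left, right = term
        return synth(known, left) and synth(known, right)
    return False

def analyze(terms):
    """Close a set of ground terms under projection and decryption."""
    known = set(terms)
    changed = True
    while changed:
        changed = False
        for t in list(known):
            if not isinstance(t, tuple):
                continue
            tag, left, right = t
            new = set()
            if tag == "pair":
                new = {left, right}
            elif tag == "crypt" and synth(known, left):
                new = {right}          # key (possibly non-atomic) is derivable
            if not new <= known:
                known |= new
                changed = True
    return known

def derivable(terms, goal):
    return synth(analyze(terms), goal)

T = [("crypt", "k", "s"), ("pair", "k", "n")]
print(derivable(T, "s"))
print(derivable([("crypt", "k", "s")], "s"))
```

This covers only the ground case. The symbolic problem on the slide additionally asks for a substitution $\sigma$, which is what constraint solving provides.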
Properties of Term Algebra • No restriction on structural size of terms • The closure of any term set under derivation rules is infinite • There is no a priori bound on attacker computations • Untyped • Attacker doesn’t have to comply with the protocol specification • Attacker may substitute a ciphertext for a random number, a key for an output of a hash function, etc. • Symmetric encryption with non-atomic keys • Can add an equational theory to model algebraic properties of cryptographic functions • XOR, modular exponentiation, blinded signatures, …
Solving Symbolic Constraints [Millen and Shmatikov CCS ’01] • Constraint reduction rules • Replace each $m_i$ from $T_i$ with one or more simpler constraints • Preserve essential properties of the constraint sequence • Nondeterministic reduction procedure • Structure-driven, but several rules may apply in any state • Exponential in the worst case (the problem is NP-complete) • The procedure is terminating and complete • If $T\sigma \vdash m\sigma$ is derivable in attacker’s term algebra, there exists a reduction rule $r = r(\sigma)$ which is applicable to $m$ from $T$ and produces some $m'$ from $T'$ such that $T'\sigma \vdash m'\sigma$ is derivable in attacker’s term algebra
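Reduction Rules
• (pair) $[m_1, m_2]$ from $T$ ⟹ $m_1$ from $T$; $m_2$ from $T$
• (enc) $\mathrm{crypt}_k[m]$ from $T$ ⟹ $m$ from $T$; $k$ from $T$
• (un) $m$ from $t, T$ ⟹ constraint removed; add $\mathrm{mgu}(t, m)$ to $\sigma$
• (elim) $m$ from $T, v$ ⟹ $m$ from $T$
• (dec) $m$ from $\mathrm{crypt}_u[v], T$ ⟹ $u$ from $\mathrm{crypt}_u v, T$; $m$ from $\mathrm{crypt}_u[v], v, T$
• (split) $m$ from $[u,v], T$ ⟹ $m$ from $u, v, T$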
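The (un) rule relies on a most general unifier. The following is a small sketch of syntactic unification in the free (Dolev-Yao) algebra, with assumptions that are not in the slides: variables are strings starting with `?`, compound terms are tuples such as `("crypt", key, body)`, and substitutions are stored in triangular form. It does not handle unification modulo an equational theory.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, subst):
    """Follow variable bindings until reaching a non-variable or unbound variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(var, t, subst):
    """Occurs check: does var appear inside t under the current bindings?"""
    t = walk(t, subst)
    if t == var:
        return True
    if isinstance(t, tuple):
        return any(occurs(var, arg, subst) for arg in t[1:])
    return False

def mgu(a, b, subst=None):
    """Return a most general unifier of a and b as a dict, or None if none exists."""
    subst = dict(subst or {})
    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            if occurs(x, y, subst):
                return None
            subst[x] = y
        elif is_var(y):
            if occurs(y, x, subst):
                return None
            subst[y] = x
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            stack.extend(zip(x[1:], y[1:]))
        else:
            return None
    return subst

print(mgu(("crypt", "k", "?X"), ("crypt", "?K", ("pair", "a", "b"))))
```

In a reduction procedure along the lines of the slide, each unifier found this way would be added to $\sigma$ and applied to the remaining constraints.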
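Reduction Procedure
• Start from the initial constraint sequence
• Apply every possible reduction rule to the first $m$ from $T$ where $m$ is not a variable; each applicable rule yields a branch of the reduction tree
• A branch ends when either no rule is applicable, or the sequence has the form $v_1$ from $T_1$, …, $v_N$ from $T_N$
• If the reduction tree has at least one such sequence as a leaf, there is a solution, and the attack trace is feasible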
Symbolic Analysis Summary
Specified by the analyst:
• Formal specification of protocol roles (attacker is implicit! variables model attacker’s input)
• Attack (violating execution trace), which may not have a feasible instantiation
Fully automated:
• Sequence of symbolic constraints, satisfiable if and only if there exists a feasible instantiation of the attack trace
• Decidable constraint solving procedure
Let’s Add Algebraic Properties Verification of trace-based security properties … is decidable for protocols with XOR • Comon-Lundh and Shmatikov (LICS ’03) • Chevalier, Küsters, Rusinowitch, Turuani (LICS ’03) … reduces to a system of quadratic Diophantine equations for protocols with Abelian groups • Millen and Shmatikov (CSFW ’03) … is decidable for a restricted class of protocols with modular exponentiation • Chevalier, Küsters, Rusinowitch, Turuani (FST/TCS ’03) … is decidable for any well-defined protocol with products and modular exponentiation • Shmatikov (ESOP ’04)
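Attacker Term Algebra
Dolev-Yao rules:
• $T \vdash u$ if $u = v$ for some $v \in T$
• $\dfrac{T \vdash [u,v]}{T \vdash u}$ and $\dfrac{T \vdash [u,v]}{T \vdash v}$
• $\dfrac{T \vdash \mathrm{crypt}_u[v] \quad T \vdash u}{T \vdash v}$
• $\dfrac{T \vdash u \quad T \vdash v}{T \vdash \mathrm{crypt}_u[v]}$
• $\dfrac{T \vdash u \quad T \vdash v}{T \vdash [u,v]}$
Algebraic rules:
• $\dfrac{T \vdash u \quad T \vdash v}{T \vdash u \cdot v}$
• $\dfrac{T \vdash u}{T \vdash u^{-1}}$
• $\dfrac{T \vdash u \quad T \vdash v}{T \vdash u^{v}}$
Not allowed (attacker can’t take discrete logs or solve the Diffie-Hellman problem):
• $\dfrac{T \vdash u^{v}}{T \vdash v}$
• $\dfrac{T \vdash u^{v} \quad T \vdash u^{w}}{T \vdash u^{vw}}$
Associative: $(x \cdot y) \cdot z = x \cdot (y \cdot z)$
Commutative: $x \cdot y = y \cdot x$
Normalization rules: $x \cdot x^{-1} \to 1$; $x \cdot 1 \to x$; $(x^{-1})^{-1} \to x$; $(x \cdot y)^{-1} \to y^{-1} \cdot x^{-1}$; $x^{1} \to x$; $(x^{y})^{z} \to x^{y \cdot z}$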
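The product part of this theory (associativity, commutativity, inverses, unit) has a simple canonical form: a product of atoms is determined by the integer power of each atom. The sketch below uses that representation. It is an illustrative assumption rather than the data structure of any tool mentioned in the slides, and it leaves out the exponentiation rules $x^1 \to x$ and $(x^y)^z \to x^{y \cdot z}$.

```python
from collections import Counter

def normalize(factors):
    """Canonical form of a product given as (atom, integer power) pairs.

    Powers of the same atom are summed, and atoms with power 0 are dropped,
    so x * x^-1 normalizes to the empty dict, which stands for 1.
    """
    acc = Counter()
    for atom, power in factors:
        acc[atom] += power
    return {atom: n for atom, n in sorted(acc.items()) if n != 0}

def mul(x, y):
    """Product of two normalized terms."""
    return normalize(list(x.items()) + list(y.items()))

def inverse(x):
    """Inverse of a normalized term: negate every power."""
    return {atom: -n for atom, n in x.items()}

x, y = {"x": 1}, {"y": 1}
print(mul(x, inverse(x)))          # x * x^-1
print(inverse(mul(x, y)))          # (x * y)^-1
print(mul(inverse(y), inverse(x))) # y^-1 * x^-1
```

Two products are equal modulo the Abelian group axioms exactly when their normalized dictionaries are equal, which is why comparisons reduce to plain dictionary equality here.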
Key Insights For Decidability • In a well-defined protocol, honest participants don’t need to guess values of attacker inputs • Leads to a syntactic condition on usage of variables • If attacker can derive u from T, then there is a derivation which uses only subterms of T and u • If constraints are satisfiable, then there is an attack in which every variable is instantiated by a product of subterms drawn from a finite set
Origination Stability • Variable origination condition • If $C$ is a constraint sequence generated from an execution trace, then there exists a linear ordering $<$ on $\mathrm{Vars}(C)$ such that if $x$ appears for the first time in $m_i$ from $T_i \in C$, then $x \in \mathrm{Vars}(m_i)$ and $\forall y \in \mathrm{Vars}(T_i)$: $y < x$ • This condition must be satisfied by $C$ after any partial substitution • Rules out only ill-defined protocols • Example: A → B: $X \cdot Y$; B → A: $X$ (requires B to split a product of two unknown values)
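Normal Derivations
• Leaves: $t_1 \in T$ gives $T \vdash t_1$; $t_2 \in T$ gives $T \vdash t_2$; …; $t_n \in T$ gives $T \vdash t_n$
• ANALYSIS stage: derives $T \vdash v_1$, …, $T \vdash v_k$; all intermediate terms are products of subterms of $T$
• SYNTHESIS stage: derives $T \vdash u$ from $T \vdash v_1$, …, $T \vdash v_k$; only pairing, encryption, multiplication, inverse & exponentiation used
Lemma: if $T \vdash u$ is derivable, then there is a normal derivation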
Conservative Solutions • Conservative solution $\sigma$ only uses subterms from the original, uninstantiated constraint sequence • $\forall x$: $\mathrm{Subterms}(x\sigma) \subseteq \mathrm{Subterms}(C)$ closed under $\cdot$, inverse and exponentiation • All subterms used in the conservative solution are drawn from a finite set which is known before any variables are instantiated • Lemma: if $C$ has a solution, then $C$ has a conservative solution • This lemma allows us to derive a bound on the size of the attack
Symbolic Decision Procedure
Input: $\{\, u_1 \text{ from } T_1,\ \dots,\ u_n \text{ from } T_n \,\}$, the symbolic constraints generated from the protocol
• Monotonic: $T_1 \subseteq \dots \subseteq T_n$
• Satisfy the variable stability condition
Steps:
• Guess all equalities between subterms (finite number of possible unifiers modulo AG)
• Guess the order in which subterms are derived
• Replace exponentiation by $\cdot$ and inverse
• Reduce to a decidable system of quadratic Diophantine equations (solvable iff a linear subsystem is solvable)
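Back to A-GDH.2
Each constraint (left) is rewritten so that only $\cdot$ and inverse are used in the derivation (right). This reduces to a system of Diophantine equations.
  $X_1$ from $\alpha, K_{cz}$ ⟹ $X_1$ from $K_{cz}$
  $Z_1$ from (as above), $X_1, \alpha^{r_b}, X_1^{r_b}$ ⟹ $r_b^{-1} Z_1$ from $K_{cz}$
  $Z_2$ from (as above) ⟹ $r_b^{-1} Z_2$ from $K_{cz}$
  $Z_3$ from (as above) ⟹ $r_b^{-1} Z_3$ from $K_{cz}$
  $Z_4$ from (as above) ⟹ $r_b^{-1} Z_4$ from $K_{cz}$
  $V_1$ from (as above), $Z_1^{r_z K_{az}}, Z_2^{r_z K_{bz}}, Z_3^{r_z K_{cz}}$ ⟹ $Z_3^{-1} r_z^{-1} K_{cz}^{-1} V_1$ from $K_{cz}$
  $W_2^{K_{bz}}$ from (as above), $V_1, \alpha^{q_b}, V_1^{q_b}$ ⟹ $Z_2^{-1} r_z^{-1} W_2$ from $K_{cz}$
  $W_2^{q_b}$ from (as above) ⟹ $V_1^{-1} W_2$ from $K_{cz}$
Key insight: under the Diffie-Hellman assumption, attacker can produce $\alpha^{x}$ from $\alpha^{y}$ if and only if he can produce $y^{-1}x$, since $\alpha^{x} = (\alpha^{y})^{y^{-1}x}$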
Decidable Quadratic Equations
Only $\cdot$ and inverse are used in each derivation:
  $u_1 X_{11} \cdots X_{1k_1}$ from $t_{11}, \dots, t_{1m_1}$
  $u_2 X_{21} \cdots X_{2k_2}$ from $t_{21}, \dots, t_{2m_2}$
  …
  $u_n X_{n1} \cdots X_{nk_n}$ from $t_{n1}, \dots, t_{nm_n}$
• Convert each constraint into a Diophantine equation
  • $u_i X_{i1} \cdots X_{ik}$ from $t_{i1}, \dots, t_{im}$ becomes $u_i X_{i1} \cdots X_{ik} = t_{i1}^{z_1} \cdots t_{im}^{z_m}$ for integer $z_j$
  • If some $t_{ij}$ is a variable, the equation becomes quadratic, for example $a^{2}X = (ab)^{z_1}$, $a^{6} = (ab)^{z_2}(bX)^{z_3}$
• Equations associated with execution traces have special structure
  • If a variable occurs on the right, it must previously occur on the left
  • All terms used to construct the variable where it first occurred are available in every subsequent constraint
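Intuition Behind Decidability
  $a^{2}X = (ab)^{z_1}$
  $a^{6} = (ab)^{z_2}(bX)^{z_3}$
Substitute $X = a^{-2}(ab)^{z_1}$:
  $a^{6} = (ab)^{z_2}\,(b\,a^{-2}(ab)^{z_1})^{z_3}$
Group the $(ab)$ terms together:
  $a^{6} = (ab)^{z'}\,(b\,a^{-2})^{z_3}$, where $z' = z_2 + z_1 z_3$
The quadratic part always has a solution because $z_2$ is unconstrained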
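The example can be checked by brute force. The sketch below is an illustration added here, with an assumed encoding: a group element $a^{i} b^{j}$ is the pair of integers `(i, j)`, and the search bound is arbitrary. It enumerates small integer triples $(z_1, z_2, z_3)$ and keeps those satisfying both equations.

```python
from itertools import product

def solutions(bound=4):
    """Integer triples (z1, z2, z3) within [-bound, bound] solving
    a^2 X = (ab)^z1 and a^6 = (ab)^z2 (bX)^z3, with elements as (a-exp, b-exp)."""
    found = []
    for z1, z2, z3 in product(range(-bound, bound + 1), repeat=3):
        x = (z1 - 2, z1)                 # X = a^-2 (ab)^z1, from the first equation
        bx = (x[0], x[1] + 1)            # b * X
        rhs = (z2 + z3 * bx[0], z2 + z3 * bx[1])
        if rhs == (6, 0):                # left side a^6 has exponents (6, 0)
            found.append((z1, z2, z3))
    return found

for z1, z2, z3 in solutions():
    print(z1, z2, z3, "z' =", z2 + z1 * z3)
```

Comparing exponents of $a$ and $b$ in $a^{6} = (ab)^{z'}(b\,a^{-2})^{z_3}$ gives the linear system $6 = z' - 2z_3$, $0 = z' + z_3$, so $z' = 2$ and $z_3 = -2$. Any $z_1$ then works with $z_2 = 2 + 2z_1$, which matches the slide's remark that $z_2$ absorbs the quadratic term.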
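Is There A Feasible Attack? Yes!
Trace:
  B receives $\alpha,\ X_1$
  B sends $X_1,\ \alpha^{r_b},\ X_1^{r_b}$
  Z receives $Z_1,\ Z_2,\ Z_3,\ Z_4$
  Z sends $Z_1^{r_z K_{az}},\ Z_2^{r_z K_{bz}},\ Z_3^{r_z K_{cz}}$
  B receives $\alpha,\ V_1$
  B sends $V_1,\ \alpha^{q_b},\ V_1^{q_b}$
  B receives $W_1,\ W_2^{K_{bz}},\ W_3$
Attacker can learn the value $W_2^{q_b}$ by clever variable instantiation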
Attack on A-GDH.2
Suppose two sessions are run concurrently, and malicious C wants to learn the session key of the session from which he is excluded.
1. Session 1, message to B: replace $\alpha^{r_a}$ with 1
2. Session 1, message to Z: replace $\alpha^{r_b r_c},\ \alpha^{r_a r_c},\ \alpha^{r_a r_b},\ \alpha^{r_a r_b r_c}$ with $\alpha^{r_b},\ \alpha^{r_b},\ \alpha^{r_b},\ \alpha^{r_b}$
3. Session 2, message to B: replace $\alpha^{q_a}$ with $\alpha^{r_b r_z K_{cz}}$
4. Session 2, message from Z to B: replace $\alpha^{q_a q_z K_{bz}}$ with $\alpha^{r_b r_z K_{bz}}$
Attack: B will use $\alpha^{r_b r_z q_b}$ as session key, which the attacker can compute as $(\alpha^{r_b r_z K_{cz} q_b})^{K_{cz}^{-1}}$
Attacks of this type can be found automatically from the protocol specification
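The attack can be replayed numerically. This is a minimal sketch with assumed toy parameters ($p = 23$, $q = 11$, $\alpha = 2$), hypothetical helper names, and sample exponents; it is not the output of any analysis tool. Honest B and Z follow their roles, while the attacker C, who knows $K_{cz}$, supplies the replaced messages.

```python
# Toy group (assumed for illustration): 2 generates the order-11 subgroup of Z_23*.
P, Q, ALPHA = 23, 11, 2

def exp(base, *exponents):
    """Raise base to the product of the exponents (reduced mod Q) in Z_P*."""
    e = 1
    for x in exponents:
        e = (e * x) % Q
    return pow(base, e, P)

def inv(x):
    """Multiplicative inverse of an exponent modulo Q (Python 3.8+)."""
    return pow(x, -1, Q)

def attack(rb, rz, qb, kaz, kbz, kcz):
    """Return (key accepted by B in session 2, key computed by the attacker)."""
    # Step 1: B (session 1) receives (alpha, X1) with X1 = 1 and replies.
    x1 = 1
    b1_out = (x1, exp(ALPHA, rb), exp(x1, rb))
    alpha_rb = b1_out[1]                       # attacker observes alpha^rb
    # Step 2: Z (session 1) receives alpha^rb four times and replies.
    z_in = (alpha_rb,) * 4
    z_out = (exp(z_in[0], rz, kaz), exp(z_in[1], rz, kbz), exp(z_in[2], rz, kcz))
    # Step 3: B (session 2) receives (alpha, V1) with V1 = alpha^(rb rz Kcz).
    v1 = z_out[2]
    b2_out = (v1, exp(ALPHA, qb), exp(v1, qb))
    # Step 4: B (session 2) receives alpha^(rb rz Kbz) as its share from Z.
    key_b = exp(z_out[1], inv(kbz), qb)        # B's session key alpha^(rb rz qb)
    # Attacker strips Kcz from V1^qb, which B sent in the clear.
    key_attacker = exp(b2_out[2], inv(kcz))
    return key_b, key_attacker

print(attack(rb=5, rz=2, qb=8, kaz=4, kbz=6, kcz=9))
```

The two returned values coincide, which is the content of the slide's final line: the attacker recovers $\alpha^{r_b r_z q_b}$ without knowing $r_b$, $r_z$, $q_b$ or $K_{bz}$.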
Decision Procedures • Free (“black-box”) algebra: decidable • Implemented as an easy-to-use analysis tool • XOR: decidable • All integer variables are equal to 0 or 1 • (Group) Diffie-Hellman: decidable • System of quadratic Diophantine equations, which is solvable if and only if a linear subsystem is solvable • Some restrictions (no products in exponentiation base) • Current research: blind signatures, super-exponentiation, ... • Current research: axiomatic models of various cryptographic primitives