Secret agents leave big footprints: how to plant a trapdoor in a cryptographic function and why you might not get away with it. GECCO 2003. John A Clark, Jeremy L Jacob and Susan Stepney Dept of Computer Science University of York, York YO10 5DD, England 16 July 2003.
A Research Exercise in Pure Evil • Do you feel frustrated and annoyed by people’s ability to use modern day crypto-algorithms when they have no intention whatsoever of supplying you with a secret key so you can listen in? • Don’t worry – there is a solution. • Get them to use an algorithm that looks secure but which only you know how to break. • Technical: it’s in the cost function! Different cost functions give different results. • Moral: Optimisation may be used and abused.
Conspiracy theory as motivation: Data Encryption Standard (DES) • The Data Encryption Standard is the most controversial cipher in history. • Developed on behalf of the US Government. • Based on previous IBM work. • Issued in 1976 as FIPS 46. • The 56-bit key (64 bits in fact, but 8 are check bits) is controversial: • the key length was originally to be 128 bits; • suspicion over NSA motives. • The criteria for the design were not revealed.
Conspiracy theory as motivation: Data Encryption Standard (DES) [Figure: the DES Feistel structure. A 64-bit input passes through the initial permutation IP, then sixteen cycles in which the 32-bit right half R is expanded to 48 bits, XORed with a 48-bit subkey, substituted through S-boxes S1–S8 and permuted by the P-box; finally the inverse IP. The subkeys are derived from the 56-bit key via shifts and a compression permutation.]
Conspiracy theory as motivation: Data Encryption Standard (DES) • Matters became amusing in 1994. • A theoretically promising method of attack had emerged in the late 1980s and early 1990s: differential cryptanalysis. • DES was surprisingly resilient to differential cryptanalysis. • Don Coppersmith wrote a paper (1994) that revealed some of the design criteria and stated that DES was resistant to differential cryptanalysis because it had been specifically designed to be so. • IBM (presumably via the NSA) had known about the method of attack 16 or more years before it was discovered and published by leading academic cryptographers. • DES is more vulnerable to a later method, linear cryptanalysis. • In fact, specialised FPGA hardware can now break DES in a few hours.
Conspiracy theory as motivation: Data Encryption Standard (DES) • Does DES have a trapdoor in it – a special property that can be exploited by people in the know? • We do not know. • It actually seems to be a rather good algorithm. • But the idea of having a secret trapdoor – now I like that. • How can we design a cryptosystem that looks good but which only I know how to break? • How can we prevent the wrangling about honesty in design? • Let's try heuristic search. We will illustrate the principle on the simplest component – a single Boolean function used in stream ciphers.
Classical Stream Cipher Model [Figure: n LFSRs (say 32-bit registers) each supply one bit L1j … Lnj at step j to a combining Boolean function f, which outputs the keystream bit Zj; the cipherstream bit is Cj = Pj ⊕ Zj for plaintext stream Pj. The receiver, holding the same key, can generate the keystream and recover the plaintext.]
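The model above can be sketched in a few lines of Python. The register lengths, tap positions and the majority combiner below are toy choices for illustration only, not parameters from the talk:

```python
# Toy sketch of the classical stream cipher model: several LFSRs
# feed one bit each into a combining Boolean function f, whose
# output is XORed with the plaintext stream.

def lfsr(state, taps):
    """Fibonacci-style LFSR: yields one output bit per step."""
    state = list(state)
    while True:
        out = state[-1]
        fb = 0
        for t in taps:           # feedback = XOR of tapped bits
            fb ^= state[t]
        yield out
        state = [fb] + state[:-1]

def keystream(regs, f):
    """Combine one bit from each register through f to get Zj."""
    while True:
        yield f([next(r) for r in regs])

majority = lambda bits: int(sum(bits) >= 2)   # toy combining function

# Key = the initial register states (toy values)
regs = [lfsr([1, 0, 1], [0, 2]),
        lfsr([1, 1, 0, 1], [0, 3]),
        lfsr([1, 0, 0, 1, 1], [0, 4])]
plaintext = [1, 0, 1, 1, 0, 0, 1, 0]
ks = keystream(regs, majority)
cipher = [p ^ next(ks) for p in plaintext]   # Cj = Pj XOR Zj
```

A receiver holding the same key simply regenerates the keystream and XORs again to recover the plaintext.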
Boolean Function Design • A Boolean function f:{0,1}^n -> {0,1}; for n = 3, with the polar representation (-1)^f(x):

x      f(x)  polar
0 0 0    1     -1
0 0 1    0      1
0 1 0    0      1
0 1 1    0      1
1 0 0    1     -1
1 0 1    0      1
1 1 0    1     -1
1 1 1    1     -1

• We will talk only about balanced functions, where there are equal numbers of 1s and -1s. • A move simply swaps a 1 and a -1. • Functions are essentially represented as binary vectors.
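The representation and the move can be sketched directly (function names are mine, not from the talk):

```python
# Sketch: a balanced Boolean function as a binary vector, its polar
# form, and the balance-preserving move that swaps a 1 and a -1.
import random

def polar(f):
    """Polar representation: 0 -> +1, 1 -> -1, i.e. (-1)**f(x)."""
    return [1 - 2 * b for b in f]

def is_balanced(f):
    """Balanced: equal numbers of 0s and 1s (i.e. of +1s and -1s)."""
    return 2 * sum(f) == len(f)

def swap_move(f, rng=random):
    """Swap one 1 with one 0 (a -1 and a +1 in polar form)."""
    ones = [i for i, b in enumerate(f) if b == 1]
    zeros = [i for i, b in enumerate(f) if b == 0]
    i, j = rng.choice(ones), rng.choice(zeros)
    g = list(f)
    g[i], g[j] = 0, 1
    return g

# Truth table from the slide: f = (1,0,0,0,1,0,1,1) on 3 inputs
f = [1, 0, 0, 0, 1, 0, 1, 1]
print(is_balanced(f))   # True
print(polar(f))         # [-1, 1, 1, 1, -1, 1, -1, -1]
```

Because the move only exchanges a 1 and a 0, every function visited by the search stays balanced.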
Planting Trapdoors [Figure: within the design space, the set of functions with the public goodness property P (e.g. high non-linearity, low autocorrelation) overlaps the set with the trapdoor property T.]
Optimisation • Suppose you have an effective optimisation-based approach to getting functions with public property P. Let the cost function used be • cost = honest(f) • Suppose you have an effective optimisation-based approach to getting functions with trapdoor property T. Let the cost function used be • cost = trapdoor(f) • We can combine the two: • sneakyCost(f) = (1 - λ) honest(f) + λ trapdoor(f) • λ is the malice factor: λ = 0 => truly honest; λ = 1 => wicked. • Will you get caught out?
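A minimal sketch of the combined cost, as it might sit inside an annealing loop; `honest` and `trapdoor` are stand-ins for whatever public-goodness and trapdoor measures are in use:

```python
# sneakyCost(f) = (1 - lambda) * honest(f) + lambda * trapdoor(f)
# where lambda is the malice factor in [0, 1].
def sneaky_cost(f, honest, trapdoor, malice):
    assert 0.0 <= malice <= 1.0
    return (1.0 - malice) * honest(f) + malice * trapdoor(f)

# Usage: malice = 0 optimises only the public property,
# malice = 1 optimises only the trapdoor bias.
h = lambda f: 2.0   # toy honest cost for some fixed f
t = lambda f: 10.0  # toy trapdoor cost for the same f
print(sneaky_cost(None, h, t, 0.0))   # 2.0
print(sneaky_cost(None, h, t, 0.5))   # 6.0
print(sneaky_cost(None, h, t, 1.0))   # 10.0
```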
Example Trapdoor Function • We want to be able to tell whether an unknown trapdoor has been inserted. • Experiments have used a randomly generated vector as the trapdoor. Closeness to this vector (measured by Hamming distance) represents the trapdoor bias. • We want to investigate what happens when different malice factors are used. • We shall consider high non-linearity and low autocorrelation as the public goodness measures.
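The two ingredients above can be sketched as follows. The Hamming-distance trapdoor cost is straightforward; non-linearity uses the standard definition via the fast Walsh–Hadamard transform, NL(f) = (2^n - max|W(a)|) / 2. This is textbook material, not the authors' code:

```python
# Trapdoor bias: Hamming distance to a secret random target vector.
def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def trapdoor_cost(f, secret):
    """Lower cost = closer to the secret vector = stronger bias."""
    return hamming(f, secret)

def nonlinearity(f):
    """NL(f) = (2^n - max_a |W(a)|) / 2, where W is the
    Walsh-Hadamard transform of the polar form of f."""
    w = [1 - 2 * b for b in f]          # polar form
    step = 1
    while step < len(w):                # in-place fast WHT
        for i in range(0, len(w), 2 * step):
            for j in range(i, i + step):
                w[j], w[j + step] = w[j] + w[j + step], w[j] - w[j + step]
        step *= 2
    return (len(f) - max(abs(x) for x in w)) // 2

# The balanced 3-input function from the earlier truth table
# achieves non-linearity 2, the maximum for balanced n = 3.
print(nonlinearity([1, 0, 0, 0, 1, 0, 1, 1]))   # 2
```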
You say you did, I say you didn't [Figure: among publicly good solutions – e.g. Boolean functions with the same very high non-linearity – sit both solutions found by annealing with the honest cost function and solutions with high trapdoor bias found by annealing with the combined honest-and-trapdoor cost function.] • There appears to be nothing to distinguish the sets of solutions obtained – unless you know what form the trapdoor takes! • Or is there…
n = 8: examples of non-linearity vs autocorrelation [Figure: scatter plots of autocorrelation against non-linearity for malice factors λ = 0.0, 0.2, 0.4, 0.6, 0.8 and 1.0.]
Vector Representations [Figure: a function as a polar vector, e.g. +1 -1 +1 +1 -1 +1 -1 -1.] • Different cost functions may give similar goodness results, but may do so in radically different ways. • Results using honest and dishonest cost functions cluster in different parts of the design space. • Basically, we distinguish the clusters using discriminant analysis. • If you don't have an alternative hypothesis, you can generate a family of honest results and ask how probable the offered one is.
Vector Representations • For two groups G1 and G2, calculate the mean vectors m1 and m2. • Project m2 onto m1 and obtain the residual r. • Now project each vector in each group onto the residual and take the absolute value.
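The steps above can be sketched in plain Python (names are mine, not from the talk); groups of design vectors drawn from differently biased cost functions separate in these one-dimensional scores:

```python
# Sketch of the projection test: mean vectors, residual of m2 after
# projecting onto m1, then absolute projections onto the residual.
def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual(m1, m2):
    """m2 minus its projection onto m1 (component orthogonal to m1)."""
    c = dot(m2, m1) / dot(m1, m1)
    return [b - c * a for a, b in zip(m1, m2)]

def scores(group, r):
    """Absolute (normalised) projection of each vector onto r."""
    norm = dot(r, r) ** 0.5
    return [abs(dot(v, r)) / norm for v in group]

# Toy groups clustered around different directions:
g1 = [[1.0, 0.0], [1.0, 0.2]]
g2 = [[0.0, 1.0], [0.2, 1.0]]
r = residual(mean(g1), mean(g2))
print(scores(g1, r))   # small values
print(scores(g2, r))   # large values: the groups separate
```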
Games People Play • It seems possible to tell that something has been going on. • And we don't need to know precisely what has been going on. • Since any design has a binary vector representation, the technique is general. • Meta-games: • Some variations on a theme can be attempted. If you know the means of detection, you may be able to add a cost function component concerned with detectability: • sneakierCost(f) = (1 - λ) honest(f) + λ malice(f) + α detectability(f)
Conclusions • An optimisation-based design process can be open and reproducible. • Optimisation can be abused. • Optimisation allows a family of representative designs to be obtained. • Designs developed against different criteria simply look different. • The games just do not stop.
Coda • Search-based approaches are not just for toy problems. • For several major criteria of interest, search-based approaches have equalled or bettered the combined best efforts of theoreticians for n <= 8. • We have recently produced hitherto unattained results for n = 9. • We have disproved cryptological conjectures in the literature. • CEC special strand on computer security. • Web page at www.cs.york.ac.uk/security (part of the virtual library).
Bonus Track • You can even tell which technique people have used. • Simulated annealing and GAs may also give different types of solution. • Experiment: • Evolved a pseudo-random number generator as an FPGA netlist. • Randomness criteria served as cost function components. • Cost function components can act as classifiers too! View evolved programs as bit strings and feed them through the cost function components used to evolve them.