Heuristic Optimisation in Design and Analysis John A Clark University of York, UK jac@cs.york.ac.uk
Overview
• Basic idea
• Brief introduction to heuristic optimisation techniques
• Examples in the design and analysis of cryptosystems and protocols
• Further work
Use of Optimisation Techniques
• Combinatorial optimisation techniques can be used beneficially in a variety of design and analysis tasks:
  • Real cryptosystem design
  • Real cryptanalysis
  • Real protocol synthesis
Design and Analysis as Optimisation
• Let DS be the design space (or search space).
• Let f(y) be a function over the design space that signifies how good (or bad) a candidate y is.
  • When measuring goodness we talk in terms of a fitness function.
  • When measuring badness we talk in terms of a cost function.
• Find z in DS such that f(z) = sup{f(y) : y in DS}.
• Traditional techniques such as hill-climbing tend to get stuck in local optima. We need the ability to escape from these to reach the global optimum.
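The local-optimum problem can be seen on a toy example. The sketch below is only an illustration: the ten-point design space, the `FITNESS` table and the helper names are all invented here and do not come from the slides.

```python
def hill_climb(f, x, neighbours):
    """Greedy ascent: move to the best neighbour until none improves f."""
    while True:
        best = max(neighbours(x), key=f)
        if f(best) <= f(x):
            return x  # no neighbour is better: a (possibly local) optimum
        x = best

# Hypothetical design space DS = {0, ..., 9} with a local peak at 2
# and the global peak at 7.
FITNESS = [0, 3, 5, 4, 1, 2, 6, 9, 7, 0]

def f(y):
    return FITNESS[y]

def neighbours(y):
    return [z for z in (y - 1, y + 1) if 0 <= z < len(FITNESS)]

stuck = hill_climb(f, 0, neighbours)          # climbs 0 -> 1 -> 2 and stops
optimum = max(range(len(FITNESS)), key=f)     # exhaustive search over DS
print(stuck, optimum)
```

Started from 0, the climber halts on the local peak and never sees the better candidate further along, which is the behaviour the last bullet describes.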
Simulated Annealing
• A local search technique that maintains a current candidate x.
• Temperature cycle: at each temperature consider 1000 moves.
• Always accept improving moves.
• Accept worsening moves probabilistically. Acceptance gets harder the worse the move, and harder as the temperature decreases.
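A minimal sketch of such an annealing loop follows. The slide does not give the acceptance rule or the cooling schedule, so the Metropolis rule exp(-delta/T) and the geometric cooling factor used here are assumptions, as are the function and parameter names.

```python
import math
import random

def anneal(cost, neighbour, x, temp=1.0, cooling=0.95,
           cycles=100, moves=1000, rng=None):
    """Minimise cost(x). Returns the best candidate seen and its cost."""
    rng = rng or random.Random(0)
    cur_cost = cost(x)
    best, best_cost = x, cur_cost
    for _ in range(cycles):              # one temperature cycle per iteration
        for _ in range(moves):           # e.g. 1000 moves at each temperature
            y = neighbour(x, rng)
            new_cost = cost(y)
            delta = new_cost - cur_cost
            # Improving moves are always accepted. Worsening moves are
            # accepted with probability exp(-delta / temp) (assumed rule),
            # which shrinks as delta grows and as temp falls.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                x, cur_cost = y, new_cost
                if cur_cost < best_cost:
                    best, best_cost = x, cur_cost
        temp *= cooling                  # assumed geometric cooling
    return best, best_cost

# Toy usage: minimise (x - 7)^2 over the integers with +/-1 moves.
result, result_cost = anneal(lambda x: (x - 7) ** 2,
                             lambda x, rng: x + rng.choice((-1, 1)),
                             0, cycles=20, moves=200)
```

Tracking the best candidate separately from the current one is a common convenience rather than something the slide requires.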
Boolean Function Design
• Boolean functions are used as components of cryptosystems.
• We want to generate functions with ‘nice properties’, e.g. one that cannot be approximated well by any linear Boolean function g (since such an approximation can help cryptanalysis).
Boolean Function Design
• Can calculate the non-linearity (i.e. fitness) of a given function b by assessing how well it is approximated by each of the 2^n linear functions.
• Random generation does not perform well.
• Random generation plus hill-climbing gives improvements (e.g. with neighbouring functions obtained by altering the result of one b(x) value).
• Genetic algorithms have been tried too.
• Simulated annealing works very well and seems able to achieve other nice properties too (given suitable cost functions).
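One standard way to compare a function against all 2^n linear functions at once is the fast Walsh-Hadamard transform. The sketch below assumes the function is given as a truth table of 0/1 values of length 2^n, and takes the usual definition of non-linearity as the distance to the nearest linear function or its complement. The function names are illustrative.

```python
def walsh_hadamard(truth_table):
    """Walsh-Hadamard spectrum of a Boolean function given as 0/1 outputs."""
    w = [1 - 2 * bit for bit in truth_table]   # map 0 -> +1, 1 -> -1
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                a, b = w[j], w[j + h]
                w[j], w[j + h] = a + b, a - b   # butterfly step
        h *= 2
    return w

def nonlinearity(truth_table):
    """Hamming distance to the closest linear function or its complement."""
    spectrum = walsh_hadamard(truth_table)
    return (len(truth_table) - max(abs(v) for v in spectrum)) // 2

xor2 = [0, 1, 1, 0]                   # x1 XOR x2: itself linear
maj3 = [0, 0, 0, 1, 0, 1, 1, 1]       # majority of three variables
print(nonlinearity(xor2), nonlinearity(maj3))   # prints "0 2"
```

Each spectrum entry measures agreement with one linear function, so a large absolute value means a good linear approximation exists and the non-linearity is low.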
Boolean Function Design
• Consider a Boolean function b of three variables.
• A neighbour can be obtained by flipping one of the result values.
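As a sketch of that move, assuming the function is stored as a list of its 2^n output bits (the name `flip_neighbour` and the example table are invented here):

```python
import random

def flip_neighbour(truth_table, rng):
    """Return a copy of the truth table with one randomly chosen output flipped."""
    i = rng.randrange(len(truth_table))
    neighbour = list(truth_table)
    neighbour[i] = 1 - neighbour[i]
    return neighbour

rng = random.Random(1)
b = [0, 1, 1, 0, 1, 0, 0, 1]          # a function of three variables
nb = flip_neighbour(b, rng)           # differs from b in exactly one position
```

A single flip changes the number of 1s in the table, so a search that must keep the function balanced would need a different move, for example swapping a 0 output with a 1 output.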
Maliciously…..
• Can use these techniques to generate cryptographic elements (e.g. S-Boxes) with good public properties using an honest fitness function: honestFit(x)
• But can also try to hide useful (but privately known) properties using a malicious fitness function: trapFit(x)
• Now take a combination and do both at the same time.
• Want λ as low as you can get away with for the next N years! The resulting good properties must still be obvious.
Maliciously…..
• [Figure: regions of the search space. Labels: publicly good solutions, e.g. Boolean functions with the same very high non-linearity; publicly good solutions found by annealing with an honest cost function; publicly good solutions with high trapdoor bias found by annealing with combined honest and trapdoor cost functions.]
• Using different cost functions results in solutions being found in different areas of the search space.
• You can actually tell whether someone has used the cost function they say they have.
Cryptanalysis: Pointcheval’s Scheme
• Zero-knowledge protocol based on an NP-hard problem.
• A and the histogram are public.
• If you can recover the secret s then the system is broken.
Pointcheval’s Scheme
• Need a cost function to indicate how good a candidate vector y is. Examples of factors we might like to consider:
  • Non-negativity of the elements of Ay, and histogram agreement
  • Could give a negativity punishment of costNeg(y) = |-3| + |-1| = 4
  • Could give a histogram punishment of costHist(y) = |3-2| + |1-0| = 2
• Now take a weighted sum of these costs: cost(y) = w1 costNeg(y) + w2 costHist(y)
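The two penalties and their weighted sum can be sketched as follows. This assumes A is a matrix with entries of +1 or -1, candidates are +1/-1 vectors, and the public histogram counts how often each value occurs in As for the secret s. Comparing histograms over every value that appears in either one is a choice made here, not something the slide specifies, and the small matrix is invented for illustration.

```python
from collections import Counter

def image(A, y):
    """Entries of the product Ay."""
    return [sum(a * v for a, v in zip(row, y)) for row in A]

def cost_neg(A, y):
    """Sum of the magnitudes of the negative entries of Ay."""
    return sum(-v for v in image(A, y) if v < 0)

def cost_hist(A, y, target):
    """Total difference between the histogram of Ay and the public histogram."""
    got = Counter(image(A, y))
    return sum(abs(target[k] - got[k]) for k in set(target) | set(got))

def cost(A, y, target, w1=1, w2=1):
    return w1 * cost_neg(A, y) + w2 * cost_hist(A, y, target)

# Hypothetical toy instance (far smaller than any real parameter set).
A = [[1, 1, 1], [1, -1, 1], [-1, 1, 1]]
s = [1, 1, 1]                          # secret: As = [3, 1, 1]
target = Counter(image(A, s))          # public histogram {3: 1, 1: 2}
y = [1, 1, -1]                         # candidate: Ay = [1, -1, -1]
print(cost_neg(A, y), cost_hist(A, y, target), cost(A, s, target))  # 2 4 0
```

The secret itself has cost 0 under any positive weights, which is what lets different weightings be used as alternative views of the same problem.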
Profiling Annealing
• Simulated annealing can make progress with this scheme, typically getting solutions with around 80% of the vector entries correct (but we don’t know which 80%!).
• But this throws away a lot of information. It is better to monitor the search process as it cools down.
• Observing the process shows that, within a temperature cycle, the proportion of time a variable spends at a particular value (-1, say) may be very high, e.g. 95%.
• The search process is clearly intent on setting that variable to -1. Accept this and “fix the value”: don’t attempt to move it again.
• 95% is a reasonable threshold. Can use 98% etc.
• Allows efficiency gains, since we now consider only non-fixed variables. Also seems to give better results than standard annealing.
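The fixing rule can be sketched as a function applied to the candidates visited during one temperature cycle. It assumes each candidate is a +1/-1 vector and that one sample is recorded per move. The name `variables_to_fix` and its parameters are hypothetical.

```python
def variables_to_fix(samples, threshold=0.95, already_fixed=()):
    """Map variable index -> value for variables that held one value
    for at least `threshold` of the sampled moves in this cycle."""
    fixed = {}
    for i in range(len(samples[0])):
        if i in already_fixed:
            continue                    # never revisit a fixed variable
        plus = sum(1 for s in samples if s[i] == 1) / len(samples)
        if plus >= threshold:
            fixed[i] = 1
        elif 1 - plus >= threshold:
            fixed[i] = -1
    return fixed

# Toy cycle of 20 samples: variable 0 is always -1, variable 1 wanders,
# variable 2 is always +1.
samples = [[-1, 1 if k % 2 else -1, 1] for k in range(20)]
print(variables_to_fix(samples))        # {0: -1, 2: 1}
```

After each cycle the annealer would restrict its moves to the variables not yet in the returned mapping.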
Thermo-statistical Trajectories
• But there is a much stronger observation to make….
• Some variables are fixed by this process before others. Why?
• Because it is difficult for those variables not to take their fixed values: the search process just doesn’t want to allow it.
• There is something about the problem instance that encourages this…
• The search process wants to take those values because THEY ARE THE CORRECT ONES.
• With certain cost functions and problems the FIRST 50% OF VARIABLE VALUES FIXED IN THIS WAY ARE CORRECT.
• Thus, within a few minutes you have half the key. It is not always this successful, but most cost functions and problems give 25%+ initial correctness.
Radical Viewpoint Analysis
• Take different viewpoints on the same problem, i.e. different cost functions:
  • cost1(y) = 5 costNeg(y) + 1 costHist(y)
  • cost2(y) = 3 costNeg(y) + 3 costHist(y)
  • cost3(y) = 1 costNeg(y) + 5 costHist(y)
• The cost surface is now different in each case but we still have cost = 0 => problem solved.
• Now use these to converge on candidate solutions.
• For suitably chosen functions, results typically have between 75% and 92% correct values.
• Now consider those values on which they agree. By taking a large number of different cost functions you can reduce the number of values on which they agree wrongly almost to 0. The rest on which they agree are correct.
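The agreement step can be sketched as an intersection over the final candidates produced under each cost function. The three vectors below are made-up stand-ins for annealing results, and `agreed_values` is an illustrative name.

```python
def agreed_values(solutions):
    """Map index -> value for positions where every solution agrees."""
    first = solutions[0]
    return {i: v for i, v in enumerate(first)
            if all(s[i] == v for s in solutions)}

# Hypothetical results from annealing under cost1, cost2 and cost3.
runs = [
    [1, -1, 1, 1, -1],
    [1, -1, -1, 1, -1],
    [1, 1, 1, 1, -1],
]
print(agreed_values(runs))              # {0: 1, 3: 1, 4: -1}
```

Adding more runs can only shrink the agreed set, which is how a large number of cost functions drives down the count of positions that agree wrongly.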
Evolving Protocols
• Recent IEEE S&P (Oakland) paper using genetic algorithms to evolve abstract protocols (with proofs!).
• The fitness function is based on the number of stated goals met at each message.
• Random bit strings can be decoded as protocols expressed in the BAN-logic formalism and executed.
• When a receiver gets a message he uses BAN inference rules to update his belief state according to what he already knows and what is in the message.
  • This is a form of abstract execution.
Future Work
• Genetic Quantum Programming
  • Applications of quantum search seem to be based on known algorithms, e.g. Grover’s search.
  • We are currently investigating the evolution of quantum programs (essentially sequences of unitary transformations/matrices) to solve particular problems or evolve new algorithms.
  • Applications to the evolution of new quantum cryptanalysis techniques?
• Integrating quantum search and traditional optimisation:
  • At its simplest, let QS find a good starting point and then use traditional techniques to hill-climb. Others are possible.
• Statistical profiling of traditional optimisation techniques: potentially a very rich seam to mine (both in analysis and design).