
Markov chain model of machine code program execution and halting

Riccardo Poli and Bill Langdon, Department of Computer Science, University of Essex.


Presentation Transcript


  1. Markov chain model of machine code program execution and halting Riccardo Poli and Bill Langdon Department of Computer Science University of Essex

  2. R. Poli - University of Essex Halting problem • Logic states that whether or not programs halt is an undecidable problem (Turing) • Probability gives answer: with probability 1, all programs without halt instruction do not terminate (Langdon & Poli)

  3. Overview • Memory and loops make linear GP Turing complete, but what is the effect on the search space and fitness? • T7 computer • Experiments • Markov chain model • Implications

  4. Introduction • Without memory and loops, the distribution of functionality for GP programs tends to a limit as programs get bigger • Is this also true for Turing complete programs?

  5. T7 Minimal Turing Complete CPU • 7 instructions • Arithmetic unit is ADD. From + all other operations can be obtained. E.g. • Boolean logic • SUB, by adding complement • Multiply, by repeated addition (subroutines) • Conditional (Branch if oVerflow flag Set) • Move data in memory • Save and restore Program Counter (i.e. Jump) • Stop if reach end of program

  6. T7 Architecture

  7. Experiments • There are too many programs to test them all. Instead we gather statistics on random samples. • Choose a set of program lengths from 30 to 16777215 • Generate 1000 programs of each length • Run them from a random start point with random input • A program terminates if it executes the last instruction (which must not be a jump) • How many stop?

  8. Almost all T7 Programs Loop

  9. Model of Random Programs • Before any instructions are repeated: • random sequence of instructions and • random contents of memory. • 1 in 7 instructions is a jump to a random location

  10. Model of Random Programs • T7 instruction set chosen to have little bias. • I.e. every state is ≈ equally likely. • Overflow flag set half the time. • So 50% of conditional jumps (BVS) are active. • (1+0.5)/7 of instructions take the program counter to a random location. • Implies that, for long programs, the lengths of runs of continuous instructions (i.e. without jumps) follow a geometric distribution with mean 7/1.5 = 4.67
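The mean run length quoted on this slide can be checked numerically. The sketch below assumes only what the slide states: each instruction independently sends the program counter to a random location with probability (1+0.5)/7. The function name, trial count and seed are illustrative choices, not part of the original talk.

```python
import random

# Probability that one instruction moves the program counter to a random
# location: 1 unconditional jump plus half of the conditional jumps, out of 7.
P_JUMP = (1 + 0.5) / 7


def mean_run_length(p_jump, trials=100_000, seed=0):
    """Estimate the mean number of instructions up to and including a jump."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        length = 1
        # Keep executing sequential instructions until one of them jumps.
        while rng.random() >= p_jump:
            length += 1
        total += length
    return total / trials


analytic_mean = 1 / P_JUMP            # mean of a geometric distribution, 7/1.5
simulated_mean = mean_run_length(P_JUMP)
print(f"analytic {analytic_mean:.2f}, simulated {simulated_mean:.2f}")
```

The run length here counts the jump instruction itself, which is the convention under which the geometric mean equals 1/p.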

  11. Program segment = random code ending with a random jump

  12. Forming Loops: Segments model • Segments model assumes the whole program is broken into N = L/4.67 segments of equal length of continuous instructions. • Last instruction of each is a random jump. • By the end of each segment, memory is re-randomised. • Jump to any part of a segment, part of which has already been run, will form a loop. • Jump to any part of the last segment will halt the program.

  13. Probability of Halting • i segments run so far. Chance next segment will • Form first loop = i/N • Halt program = 1/N • (so 1-(i+1)/N continues) • Chance of halting immediately after segment i • 1/N × (1-2/N) (1-3/N) (1-4/N) … (1-i/N) • Total halting probability, given by adding these, is ≈ sqrt(π/(2N)) = O(N^(-½))
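The sum on this slide is easy to evaluate directly and compare with the sqrt(π/(2N)) approximation. The following is a minimal sketch of that check; the function name and the chosen values of N are illustrative assumptions.

```python
import math


def segments_halting_probability(n_segments):
    """Sum over i of 1/N * (1-2/N)(1-3/N)...(1-i/N), as on the slide."""
    total = 0.0
    survive = 1.0  # probability that i segments ran with no loop and no halt
    for i in range(1, n_segments):
        total += survive / n_segments            # halt right after segment i
        survive *= 1.0 - (i + 1) / n_segments    # otherwise continue to i+1
    return total


for n in (100, 10_000, 1_000_000):
    exact = segments_halting_probability(n)
    approx = math.sqrt(math.pi / (2 * n))
    print(f"N={n}: sum={exact:.6f}  sqrt(pi/(2N))={approx:.6f}  ratio={exact / approx:.4f}")
```

The ratio of the two values approaches 1 as N grows, which is the O(N^(-½)) scaling stated on the slide.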

  14. Proportion of programs without loops falls as 1/sqrt(length). Segments model overestimates, but gives the 1/√x scaling.

  15. Number of halting programs rises exponentially with length (plot label: 10^100 000 000)

  16. Average run time (non-looping) • Segments model allows us to compute a bound for run time • Expected run time grows as O(N^½)

  17. Run time on terminating programs • Run time of non-looping programs fits Markov prediction. • Mean run time of all terminating programs scales as length^(3/4). • Max run time limited by the small (12 bytes) memory becoming non-random

  18. Markov chain model

  19. States • State 0 = no instructions executed yet • State i = i instructions have been executed, but no loops • Sink state = at least one loop was executed • Halt state = the last instruction has been successfully executed and the program counter has gone beyond it.

  20. Event diagram for program execution 1/2

  21. Event diagram for program execution 2/2

  22. p1 = probability of being the last instruction • Program execution starts from a random position • Memory is randomly initialised and so any jumps land at random locations • Then, the probability of being at the last instruction in a program is independent of how many (new) instructions have been executed so far. • So, [equation not preserved]

  23. p2 = probability of instruction causing a jump • We assume that we have two types of jumps • unconditional jumps (prob. puj), where PC is given a value retrieved from memory or from a register • conditional jumps (prob. pcj) • Flag bit (which causes conditional jumps) is set with probability pf • The total probability that the current instruction will cause a jump is [equation not preserved]
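The expression itself is missing from this slide in the transcript. A reconstruction consistent with slide 10, where (1+0.5)/7 of T7 instructions send the program counter to a random location, would be the following. The subscripted symbols stand for the slide's puj, pcj and pf, and the T7 values are inferred from slides 9 and 10 rather than stated here.

```latex
\[
  p_2 = p_{uj} + p_{cj}\, p_f ,
  \qquad
  \text{e.g. for T7: }\;
  p_2 = \tfrac{1}{7} + \tfrac{1}{7}\cdot\tfrac{1}{2} = \tfrac{1.5}{7} \approx 0.214 .
\]
```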

  24. p3 = probability of new instruction after jump • Program counter after a jump is a random number between 1 and L • So, the probability of finding a new instruction is [equation not preserved]
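The formula on this slide is missing from the transcript. Assuming the jump target is uniform over the L program locations and that state i means exactly i distinct instructions have been visited so far, one plausible form is:

```latex
\[
  p_3 = \frac{L - i}{L} .
\]
```

Under these assumptions the complementary probability i/L is the chance that a jump lands on an already visited instruction and so starts a loop.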

  25. p4 = probability of new instruction after non-jump • The more jumps we have executed, the more fragmented the map of visited instructions will look. • So, we should expect p4 to decrease as a function of the number of jumps/fragments. • Expected number of fragments (jumps) in a program having reached state i: [equation not preserved]

  26. • Each block will be preceded by at least one unvisited instruction • So, the probability of a previously executed instruction after a non-jump is [equation not preserved] and [equation not preserved]

  27. • A more precise model considers the probability of blocks being contiguous. • Expected number of actual blocks [equation not preserved], hence [equation not preserved]

  28. State transition probabilities • These are obtained by adding up “paths” in the program execution event diagram • E.g. looping probability [equation not preserved]

  29. Less than L-1 instructions visited

  30. L-1 instructions visited

  31. Transition matrix • For example, for T7 and L = 7 we obtain [matrix not preserved; rows and columns are the states 0 instructions, 1 instructions, …, 6 instructions, loop, halt]

  32. Computing future state probabilities • All that is required is to take appropriate powers of the Markov matrix M
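Since the T7 transition matrix is not preserved in this transcript, the idea of taking powers of M can only be illustrated on a stand-in chain. The sketch below builds a Markov matrix from the simpler segments model of slide 13 (loop with probability i/N, halt with 1/N, continue otherwise) and propagates the start state, which is equivalent to reading one row of a power of M. The state layout and all function names are assumptions made for this example; this is not the matrix from the talk.

```python
def segments_markov_matrix(n):
    """Row-stochastic matrix for the segments model with n >= 2 segments.

    States 0..n-2 mean "i = state+1 segments run, no loop yet";
    state n-1 is the absorbing LOOP state and state n the absorbing HALT state.
    """
    loop, halt = n - 1, n
    size = n + 1
    m = [[0.0] * size for _ in range(size)]
    for s in range(n - 1):
        i = s + 1                        # segments run so far
        m[s][halt] = 1.0 / n             # jump lands in the last segment
        m[s][loop] = i / n               # jump lands in an already run segment
        if s + 1 < n - 1:
            m[s][s + 1] = 1.0 - (i + 1) / n   # jump lands in a new segment
    m[loop][loop] = 1.0
    m[halt][halt] = 1.0
    return m


def state_probabilities(m, steps):
    """Distribution after `steps` transitions from state 0 (first row of M^steps)."""
    size = len(m)
    dist = [0.0] * size
    dist[0] = 1.0
    for _ in range(steps):
        dist = [sum(dist[r] * m[r][c] for r in range(size)) for c in range(size)]
    return dist


N = 50
final = state_probabilities(segments_markov_matrix(N), N)
p_loop, p_halt = final[N - 1], final[N]
print(f"p(loop) = {p_loop:.4f}, p(halt) = {p_halt:.4f}")
```

After N transitions all probability mass sits in the two absorbing states, so p_halt here equals the total halting probability obtained by adding the per-segment terms of slide 13.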

  33. Examples • For T7, L=7 and i=3: prob. looping in 3 instructions, prob. halting in 3 instructions • For T7, L=7 and i=L: total halting probability [values not preserved]

  34. Efficiency • Computing halting probabilities naively requires a potentially explosive computation (M^L) • We reordered calculations to obtain very efficient models which allow us to compute • halting probabilities and • expected number of instructions executed by halting programs for L = 10,000,000 or more (see paper for details)

  35. A good model? Halting probability

  36. Instructions executed by halting programs

  37. Improved model accounting for memory correlation

  38. Search space characterisation • From earlier work we know that for halting programs, as the number of instructions executed grows, functionality approaches a limiting distribution. • The expected number of instructions actually executed by halting Turing complete programs indicates how close the distribution is to the limit. • E.g. for T7, very long programs have a tiny subset of their instructions executed (e.g., 1,000 instructions in programs of L = 1,000,000).

  39. Effective population size • Often programs that do not terminate are wasted fitness evaluations and are given zero fitness • The initial population is composed of random programs, of which only a fraction p(halt) are expected to halt and so have fitness > 0. • We can use the Markov model to predict the effective population size as Popsize × p(halt)
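As a rough illustration of the Popsize × p(halt) estimate, the sketch below plugs in the segments-model approximation sqrt(π/(2N)) with N = L/4.67 from slides 12 and 13. That approximation is used here only as a stand-in for the Markov model's p(halt), which it does not reproduce exactly; the function names and the population size of 1000 are illustrative assumptions.

```python
import math

MEAN_SEGMENT_LENGTH = 7 / 1.5  # mean run of instructions between jumps (slide 10)


def segments_p_halt(program_length):
    """Segments-model approximation of p(halt); a stand-in for the Markov model."""
    n_segments = program_length / MEAN_SEGMENT_LENGTH
    return min(1.0, math.sqrt(math.pi / (2 * n_segments)))


def effective_population_size(pop_size, p_halt):
    """Expected number of initial programs that halt and can score fitness > 0."""
    return pop_size * p_halt


for length in (100, 1_000, 10_000):
    p = segments_p_halt(length)
    print(length, round(p, 4), round(effective_population_size(1000, p), 1))
```

Because p(halt) falls roughly as 1/sqrt(length), every hundredfold increase in program length cuts the effective population by about a factor of ten under this approximation.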

  40. Controlling p(halt) by varying jump probability or program length (plots for L=10, L=100, L=1000, L=10000)

  41. Aborting non-terminating programs • The model can also be used to decide after how many instructions to abort evaluation: time limit = m × expected instructions in halting programs • The GP runtime (at generation 0) [equation not preserved]
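The abort rule on this slide amounts to a small amount of bookkeeping around the interpreter loop. The sketch below is one way to express it, assuming a hypothetical `step` callable that executes one instruction and returns True once the program has halted. Neither that callable nor the example values of m and the expected instruction count come from the talk.

```python
import math


def instruction_limit(m, expected_halting_instructions):
    """Time limit = m x expected instructions in halting programs, rounded up."""
    return math.ceil(m * expected_halting_instructions)


def run_with_abort(step, limit):
    """Call step() until it reports a halt or `limit` instructions have run.

    Returns (halted, instructions_executed).
    """
    for executed in range(1, limit + 1):
        if step():
            return True, executed
    return False, limit


# Hypothetical usage: a stand-in "program" that halts on its third instruction.
remaining = [3]


def toy_step():
    remaining[0] -= 1
    return remaining[0] == 0


limit = instruction_limit(m=10, expected_halting_instructions=12.5)
print(limit, run_with_abort(toy_step, limit))
```

A larger m aborts fewer programs that would eventually halt, at the cost of spending more instructions on programs that never do.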

  42. Conclusions • Experiments show that halting probability scales as 1/sqrt(length) • Markov chain model of program execution (and halting) is practical and accurate • The halting probability → 0 with length, so with probability 1 a program does not halt • However, Turing complete GP is possible if appropriate parameter settings and/or fitness functions are used.
