MAT 7003 : Mathematical Foundations (for Software Engineering) J Paul Gibson, A207

paul.gibson@it-sudparis.eu
http://www-public.it-sudparis.eu/~gibson/Teaching/MAT7003/
Computability
http://www-public.it-sudparis.eu/~gibson/Teaching/MAT7003/L7-Computability.pdf
TSP: MathematicalFoundations

Computable Functions and Computing Machines (Computers)
Alan Turing:
• In 1937 he published a theory of computable functions in "On Computable Numbers, with an Application to the Entscheidungsproblem".
• He reformulated Kurt Gödel's 1931 results on the limits of proof and computation, replacing Gödel's universal arithmetic-based formal language with Turing machines.
• He went on to prove that there is no solution to the Entscheidungsproblem by first showing that the halting problem for Turing machines is undecidable: it is not possible to decide algorithmically, in general, whether a given Turing machine will ever halt. (This proof depends on the notion of a universal machine.)
• His proof was published shortly after Alonzo Church's equivalent proof, which used his lambda calculus (1936).
• In 1999, Time Magazine named Turing as one of the 100 Most Important People of the 20th Century for his role in the creation of the modern computer.

Computable Functions and Computing Machines (Computers)
• A Turing machine that is able to simulate any other Turing machine is called a Universal Turing machine (UTM, or simply a universal machine).
• A more mathematically-oriented definition with a similar "universal" nature was introduced by Alonzo Church (and his student Stephen Kleene) at roughly the same time.
• Since then, many computational models, including some very simple ones, have been shown to be computationally equivalent to the Turing machine; such models are said to be Turing complete.
• Computability is the study of the limits of these machines.

Computable Functions and Computing Machines (Computers)
• Informally, the Church–Turing thesis states that if an algorithm (a procedure that terminates) exists, then there is an equivalent Turing machine, recursive function, or λ-definable function for that algorithm.
• Or: every effectively calculable function is a computable function, where a function is effectively calculable if its values can be found by some purely mechanical process.
• The thesis cannot be proved formally, because "effectively calculable" is an informal notion. However, because all the different attempts at formalizing the concept of "effective calculability/computability" have yielded equivalent results, the Church–Turing thesis is now almost universally accepted.

Computable Functions and Computing Machines – Typographical/Term Rewrite Systems (TRSs)
• A TRS is a formal system based on the ability to generate a set of strings following a simple set of syntactic rules.
• Each rule is calculable: the generation of a new string from an old string by application of a rule always terminates.
• A TRS may produce an infinite number of strings.
• TRSs can be as powerful as any computing machine (Turing equivalent).

Typographical Re-write Systems (TRS)
• TRSs are simple to implement (simulate) using other computational models
• Using TRSs we introduce the following concepts:
   • proof,
   • theorem,
   • decision procedure,
   • meta-analysis,
   • structural induction,
   • necessary and sufficient,
   • isomorphism,
   • meaning and consistency
Don’t worry … they are very simple to understand.

Case Study 1 --- The MUI TRS (thanks to Douglas Hofstadter)
• Alphabet = {M,I,U}
• Strings: any sequence of characters found in the alphabet
• Axiom: MI
• Generation Rules: for all x and y that are strings over the alphabet (possibly the empty string ''):
   1) xI can generate xIU
   2) Mx can generate Mxx
   3) xIIIy can generate xUy
   4) xUUy can generate xy
A theorem of a TRS is any string which can be generated from the axioms (or from any other theorem).
A proof of a theorem corresponds to the sequence of rule applications which have been followed to generate that theorem.

Case Study 1 --- The MUI TRS (proof procedure)
• Alphabet = {M,I,U}
• Strings: any sequence of characters found in the alphabet
• Axiom: MI
• Generation Rules: for all x and y that are strings over the alphabet (possibly the empty string ''):
   1) xI can generate xIU
   2) Mx can generate Mxx
   3) xIIIy can generate xUy
   4) xUUy can generate xy
Question: can you prove the theorem MUIIU?
Question: can we automate the process of testing for theoremhood of a given string in a finite period of time?
[Diagram: input string → machine → True or False]
Such a machine would be a decision procedure for MUI.

Case Study 1 --- The MUI TRS (decision tree)
• Alphabet = {M,I,U}
• Strings: any sequence of characters found in the alphabet
• Axiom: MI
• Generation Rules: for all x and y that are strings over the alphabet (possibly the empty string ''):
   1) xI can generate xIU
   2) Mx can generate Mxx
   3) xIIIy can generate xUy
   4) xUUy can generate xy
• Is this a decision procedure for the MUI machine? …
• Construct a tree of strings, starting with the axiom at the root. Each rule application constitutes a branch of the tree. To decide if a given string is a theorem it is sufficient to keep extending the tree until the string is found.
Task: construct the top (first 3 layers) of such a tree
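The tree construction described above can be sketched in Python as a breadth-first search. This is only an illustrative sketch: the names `successors` and `find_proof`, and the bounds `max_len` and `max_nodes`, are assumptions introduced here and are not part of the original slides. The bounds are needed because the unbounded tree never tells us when to stop looking for a non-theorem, which is exactly why the question on the slide is worth asking.

```python
from collections import deque

AXIOM = "MI"

def successors(s):
    """All strings generated from s by one application of rules 1-4."""
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                        # rule 1: xI -> xIU
    if s.startswith("M"):
        out.add(s + s[1:])                      # rule 2: Mx -> Mxx
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])    # rule 3: xIIIy -> xUy
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])          # rule 4: xUUy -> xy
    return out

def find_proof(target, max_len=12, max_nodes=100_000):
    """Breadth-first search of the theorem tree.

    Returns the list of strings from the axiom to target, or None if the
    target was not found within the (hypothetical) bounds. None does NOT
    prove that the target is a non-theorem.
    """
    parent = {AXIOM: None}
    queue = deque([AXIOM])
    while queue and len(parent) <= max_nodes:
        s = queue.popleft()
        if s == target:
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for t in successors(s):
            if len(t) <= max_len and t not in parent:
                parent[t] = s
                queue.append(t)
    return None

if __name__ == "__main__":
    print(find_proof("MUIIU"))   # a derivation starting at MI
    print(find_proof("MU"))      # None within the chosen bounds
```

Because the search is bounded, it behaves as a semi-decision procedure: a returned derivation is a proof, but a `None` result only says that nothing was found within the bounds.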

Case Study 1 --- The MUI TRS (meta-reasoning)
• Alphabet = {M,I,U}
• Strings: any sequence of characters found in the alphabet
• Axiom: MI
• Generation Rules: for all x and y that are strings over the alphabet (possibly the empty string ''):
   1) xI can generate xIU
   2) Mx can generate Mxx
   3) xIIIy can generate xUy
   4) xUUy can generate xy
Question: is IIIIUUUIIIUUUI a theorem of the system?
Question: can you prove your answer is correct?
Note: only through meta-reasoning can we do this!

Case Study 1 --- The MUI TRS (more meta-reasoning)
• Alphabet = {M,I,U}
• Strings: any sequence of characters found in the alphabet
• Axiom: MI
• Generation Rules: for all x and y that are strings over the alphabet (possibly the empty string ''):
   1) xI can generate xIU
   2) Mx can generate Mxx
   3) xIIIy can generate xUy
   4) xUUy can generate xy
The meta-property that all theorems start with an M is a necessary but not sufficient condition for theoremhood.
Question: before we move on … is MU a theorem of MUI?
Now we move on to a more practical TRS ...
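One well-known piece of meta-reasoning about this system (due to Hofstadter) is that the number of I's in a theorem is never a multiple of 3: the axiom MI has one I, rules 1 and 4 leave the count unchanged, rule 2 doubles it and rule 3 subtracts 3. The sketch below checks that argument numerically and turns it into a necessary (not sufficient) test. The function names are hypothetical and chosen only for illustration.

```python
def i_count_after(rule, n):
    """Number of I's after applying `rule` to a string containing n I's."""
    return {1: n, 2: 2 * n, 3: n - 3, 4: n}[rule]

def invariant_preserved(limit=100):
    """Check, for counts below `limit`, that no rule turns a count that is
    not a multiple of 3 into a count that is a multiple of 3."""
    for n in range(1, limit):
        if n % 3 == 0:
            continue
        for rule in (1, 2, 3, 4):
            if rule == 3 and n < 3:
                continue  # rule 3 needs at least three I's
            if i_count_after(rule, n) % 3 == 0:
                return False
    return True

def passes_necessary_tests(s):
    """Necessary (but not sufficient) conditions for theoremhood in MUI."""
    return (
        s.startswith("M")
        and "M" not in s[1:]            # rules never create a second M
        and set(s) <= set("MIU")
        and s.count("I") % 3 != 0       # the I-count invariant
    )

if __name__ == "__main__":
    print(invariant_preserved())
    for candidate in ("MU", "MUIIU", "IIIIUUUIIIUUUI"):
        print(candidate, passes_necessary_tests(candidate))
```

A string that fails these tests cannot be a theorem; a string that passes them still needs a derivation before it can be called one.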

Case Study 2 --- The pq- TRS
• Alphabet = {p,q,-}
• Axiom: for any x that is a possibly empty sequence of ‘-’s, xp-qx- is an axiom
• Generation Rules: for any x,y,z which are possibly empty sequences of ‘-’s, if xpyqz is a theorem then xpy-qz- is a theorem
Question: is there a decision procedure for this formal system?
Hint: all re-write rules lengthen the string so …?

Case Study 2 --- The pq- TRS
• Alphabet = {p,q,-}
• Axiom: for any x that is a possibly empty sequence of ‘-’s, xp-qx- is an axiom
• Generation Rules: for any x,y,z which are possibly empty sequences of ‘-’s, if xpyqz is a theorem then xpy-qz- is a theorem
• Why is the pq- TRS practical?
• Because it provides us with a formal model of a mathematical property, the addition of integers:
   • --p---q----- is a theorem and “2+3=5” is true
   • --p-q-- is a non-theorem and “2+1=2” is false

Case Study 2 --- The pq- TRS interpretation
• If we interpret
   • p as plus
   • q as equals
   • and a sequence of n ‘-’s as the integer n
• then we have a means of checking x+y=z for non-negative integers x, y and z (with y at least 1, since every axiom has at least one ‘-’ between p and q)
• We say that pq- is consistent (under the given interpretation) because all theorems are true after interpretation
• We say that pq- is complete if all true statements (in the domain of interpretation) can be generated as theorems in the system.
• We say that the interpretation is isomorphic to the system if the system is both complete and consistent
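A decision procedure for pq- can be sketched by running the generation rule backwards: since every rule application lengthens the string, undoing it shortens the string, so the search must stop. The Python below is a minimal sketch under the axiom and rule as stated on the slides (x may be empty); the names `parse` and `is_theorem` are assumptions introduced for illustration.

```python
import re

# A well-formed pq- string: hyphens, p, hyphens, q, hyphens.
_PQ = re.compile(r"(-*)p(-*)q(-*)")

def parse(s):
    """Return the hyphen counts (x, y, z) of s, or None if s is malformed."""
    m = _PQ.fullmatch(s)
    if m is None:
        return None
    return tuple(len(g) for g in m.groups())

def is_theorem(s):
    """Decide theoremhood by undoing the rule xpyqz -> xpy-qz-."""
    counts = parse(s)
    if counts is None:
        return False
    x, y, z = counts
    while True:
        if y == 1:
            return z == x + 1        # axiom schema: xp-qx-
        if y < 1 or z < 1:
            return False             # nothing left to undo
        y, z = y - 1, z - 1          # undo one rule application

if __name__ == "__main__":
    print(is_theorem("--p---q-----"))   # "2+3=5"
    print(is_theorem("--p-q--"))        # "2+1=2"
```

Under the interpretation on the slide, `is_theorem` agrees with the check x + y == z for all x ≥ 0 and y ≥ 1, which is a small empirical illustration of consistency and completeness on that domain.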

Case Study 2 --- The pq- TRS extension
• The pq- system is isomorphic to a very limited domain of interpretation (but maybe that is all that is required!)
• Normally, to widen a domain we can
   • add an axiom
   • add a generating rule
• For example, what happens if we add the axiom xp-qx?
• Using this, we can generate many new theorems!
• Question: with this new axiom what about completeness and consistency?
• Answer: the new, extended system is not consistent with our interpretation.

Case Study 2 --- The extended pq- TRS reinterpreted
• After extension, --p-q-- is now a theorem (it is an instance of the new axiom) but “2+1=2” is not true
• To solve this problem we can re-interpret for consistency: interpret q as “>=”
• However, we have now lost completeness:
   • “2+5 >= 4” is true (in our domain of interpretation) but
   • --p-----q---- is a non-theorem
• Note: this is a big problem of mathematics (cf. Gödel and Church):
   • for a system expressive enough to describe arithmetic, it is not possible to be complete, consistent and decidable at the same time
   • if all the theorems that can be checked are consistent then there are some things which we would like to be able to prove as theorems but which the system is not strong enough to prove

Case Study 3 --- A tq- TRS
• Question:
   • can you define a TRS for modelling the multiplication of two integers?
   • can you show that it is complete and consistent?
• Interpretation:
   • t as times
   • q as equals
   • sequences of ‘-’s as integers

BACK TO THE SOFTWARE PROCESS …
Imagine you were asked to implement a function, f say, to calculate the ith prime number.
Thus, given the primes to be 2,3,5,7,11,13,17,19,…, f(1) = 2, f(2) = 3, f(3) = 5, …
I assume you could all code this directly in C++ (Java, Prolog …)
How many of you could prove your code was correct? Where would you even start?
• First: formalise requirements ‘automagically’
• Second: transform requirements into design and prove transformation to be correct
• Third: keep correctly transforming design until it is directly implementable
• Fourth: implement it ‘automagically’

BACK TO THE SOFTWARE PROCESS …
• First: formalise requirements ‘automagically’
   1) formally define primes
   2) formally define lists (and lengths and orderings)
   3) formally define the list of ordered primes
• Second: transform requirements into design and prove transformation to be correct
   4) design an algorithm to check that the length of the list (l, say) up to your result (r, say) is such that f(l) = r.
• Third: nothing to do??
• Fourth: implement it ‘automagically’
Where/how you do this is part of the decision making process.
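To make steps 1) to 4) concrete, here is a minimal Python sketch. It is not a proof of correctness, only an executable version of the requirement: the result r = f(i) is checked against the ordered list of primes up to r, whose length must be i. The helper names `is_prime`, `ordered_primes_up_to` and `check` are assumptions introduced for illustration.

```python
def is_prime(n):
    """Step 1: n is prime if n >= 2 and no d with 2 <= d, d*d <= n divides n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def ordered_primes_up_to(r):
    """Steps 2-3: the ordered list of all primes p with 2 <= p <= r."""
    return [n for n in range(2, r + 1) if is_prime(n)]

def f(i):
    """The function under test: the i-th prime, with f(1) = 2."""
    if i < 1:
        raise ValueError("i must be at least 1")
    count, n = 0, 1
    while count < i:
        n += 1
        if is_prime(n):
            count += 1
    return n

def check(i):
    """Step 4: the list of ordered primes up to r = f(i) has length i
    and ends with r."""
    r = f(i)
    primes = ordered_primes_up_to(r)
    return len(primes) == i and primes[-1] == r and primes == sorted(primes)

if __name__ == "__main__":
    print([f(i) for i in range(1, 9)])
    print(all(check(i) for i in range(1, 50)))
```

Testing `check` for finitely many values of i is of course weaker than a proof; the point of the formal route on the slides is to replace such testing with verified transformations.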

A TRS for formally defining if a number is prime
Note: this is easier to do in other formal languages/methods because the necessary concepts (like integers and lists) are part of the language.
But, with the TRS we define just what we need and use it only where needed. In the software process it is this targeting (with the minimum force necessary) which is best …
Question: can you write a TRS for deciding if a given number is prime?
Hint: if not, try to break the problem down into bits.
For the lists model/properties we should (but don’t have to) move up a level of abstraction!
We introduce Abstract Data Types: IMHO the most powerful and universally applicable software process formal methods tool.

From TRSs to Abstract Data Types (ADTs)
• ADTs are a very powerful specification technique which exists in many forms (languages).
• These languages are often given operational semantics in a way similar to TRSs (in fact, they are pretty much equivalent)
• Most ADTs have the following parts:
   • A type which is made up of sorts
   • Sorts which are made up of equivalence sets
   • Equivalence sets which are made up of expressions
• For example, the integer type could be made up of
   • sorts integer and boolean
   • one equivalence set of the integer sort could be {3, 1+2, 2+1, 1+1+1}
   • one equivalence set of the boolean sort could be {3=3, 1=1, not(false)}

Case Study 4: A simple ADT specification
TYPE integer
SORTS integer, boolean
OPNS
   0: -> integer
   succ: integer -> integer
   eq: integer, integer -> boolean
   +: integer, integer -> integer
EQNS forall x,y: integer
   0 eq 0 = true;
   succ(x) eq succ(y) = x eq y;
   0 eq succ(x) = false;
   succ(x) eq 0 = false;
   0 + x = x;
   succ(x) + y = x + (succ(y));
ENDTYPE

Case Study 4: A simple ADT specification
TYPE integer
SORTS integer, boolean
OPNS
   0: -> integer
   succ: integer -> integer
   eq: integer, integer -> boolean
   +: integer, integer -> integer
EQNS forall x,y: integer
   0 eq 0 = true;
   succ(x) eq succ(y) = x eq y;
   0 eq succ(x) = false;
   succ(x) eq 0 = false;
   0 + x = x;
   succ(x) + y = x + (succ(y));
ENDTYPE
• Question: how do we show, for example:
   • 1+2 = 3,
   • 3+2 = 4+1,
   • 2+2 != 3+2
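One way to answer such questions is to evaluate both sides by applying the equations as left-to-right rewrite rules. The Python below is a minimal sketch of that idea, assuming that integers are represented as nested tuples built from `0` and `succ`; the representation and the helper `num` are assumptions made for illustration and are not part of the ADT language itself.

```python
ZERO = ("0",)

def succ(t):
    """Constructor succ: integer -> integer."""
    return ("succ", t)

def num(n):
    """Hypothetical helper: the term succ(succ(...0...)) with n succ's."""
    t = ZERO
    for _ in range(n):
        t = succ(t)
    return t

def plus(a, b):
    """Evaluate a + b using only the two '+' equations."""
    while a != ZERO:
        a, b = a[1], succ(b)    # succ(x) + y = x + succ(y)
    return b                    # 0 + x = x

def eq(a, b):
    """Evaluate a eq b using only the four 'eq' equations."""
    while True:
        if a == ZERO and b == ZERO:
            return True         # 0 eq 0 = true
        if a == ZERO or b == ZERO:
            return False        # 0 eq succ(x) = false; succ(x) eq 0 = false
        a, b = a[1], b[1]       # succ(x) eq succ(y) = x eq y

if __name__ == "__main__":
    print(eq(plus(num(1), num(2)), num(3)))                     # 1+2 = 3
    print(eq(plus(num(3), num(2)), plus(num(4), num(1))))       # 3+2 = 4+1
    print(not eq(plus(num(2), num(2)), plus(num(3), num(2))))   # 2+2 != 3+2
```

Each loop iteration corresponds to one application of an equation, and each application strictly shrinks the first argument, which is why evaluation terminates for this specification.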

Case Study 4: A simple ADT specification
TYPE integer
SORTS integer, boolean
OPNS
   0: -> integer
   succ: integer -> integer
   eq: integer, integer -> boolean
   +: integer, integer -> integer
EQNS forall x,y: integer
   0 eq 0 = true;
   succ(x) eq succ(y) = x eq y;
   0 eq succ(x) = false;
   succ(x) eq 0 = false;
   0 + x = x;
   succ(x) + y = x + (succ(y));
ENDTYPE
Note: this model is complete and consistent with respect to the modelling of the addition of integers (like the TRS pq-)
Question: extend this model to include multiplication

Case Study 4: An equivalent ADT specification
Consider changing the original specification to make explicit the fact that x+y = y+x, for all integer values of x and y:
TYPE integer
SORTS integer, boolean
OPNS
   0: -> integer
   succ: integer -> integer
   eq: integer, integer -> boolean
   +: integer, integer -> integer
EQNS forall x,y: integer
   0 eq 0 = true;
   succ(x) eq succ(y) = x eq y;
   0 eq succ(x) = false;
   succ(x) eq 0 = false;
   0 + x = x;
   succ(x) + y = x + (succ(y));
   x+y = y+x;
ENDTYPE
Note: this does not change the meaning of the specification but it may affect the implementation of the evaluation of expressions

Case Study 4: Evaluation termination
• If expressions are evaluated as left to right re-writes (as they often are) then evaluation may not terminate:
   • 3+4 = 4+3 may be re-written as
   • 4+3 = 3+4 which may be re-written as
   • 3+4 = 4+3 …
• Consequently, there are 3 important properties of ADT specifications:
   • completeness
   • consistency
   • evaluation termination/convergence

Case Study 4: Incompleteness, inconsistency and termination
• Not having enough equations can make a specification incomplete. For example, the integer ADT specification would be incomplete without the equation:
   • 0 eq 0 = true
• Having too many equations can make a specification inconsistent. For example, the integer ADT specification is inconsistent if we add the equation:
   • x + succ(0) = x
• but adding the equation:
   • x + succ(0) = succ(x)
• would not introduce inconsistency (just redundancy)
• Changing the equations may affect termination/convergence: replacing
   • 0 + x = x by x + 0 = x
• would prevent some evaluations in the original ADT specification from reaching a result (for example, 0 + succ(0) could no longer be rewritten)

Case Study 5 --- A Set ADT specification
TYPE Set
SORTS Set, Int, Bool
OPNS
   empty: -> Set
   str: Set, Int -> Set
   add: Set, Int -> Set
   contains: Set, Int -> Bool
EQNS forall s: Set, x,y: Int
   contains(empty, x) = false;
   x eq y => contains(str(s,x), y) = true;
   not (x eq y) => contains(str(s,x), y) = contains(s,y);
   contains(s,x) => add(s,x) = s;
   not(contains(s,x)) => add(s,x) = str(s,x);
ENDTYPE
• Notes:
   • use of str and add
   • preconditions
   • completeness?
   • consistency?
• Question: add operations for
   • remove
   • union
   • equality
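The Set specification can be prototyped directly, which is a cheap way to explore the notes on completeness and consistency. The Python below is only a sketch: it assumes that the equation for the case x eq y yields true, it models the constructor `str` as a nested tuple, and it renames `str` to `store` to avoid shadowing Python's built-in `str`. The helper `elements` is hypothetical and exists only to inspect the structure.

```python
EMPTY = ("empty",)

def store(s, x):
    """Constructor str: Set, Int -> Set (renamed to avoid Python's str)."""
    return ("str", s, x)

def contains(s, y):
    """contains(empty, y) = false; otherwise compare the stored element,
    then continue with the rest of the set."""
    while s != EMPTY:
        _, rest, x = s
        if x == y:
            return True     # x eq y => contains(str(s,x), y) = true (assumed)
        s = rest            # not (x eq y) => contains(s, y)
    return False

def add(s, x):
    """add only stores x when the precondition not(contains(s, x)) holds."""
    return s if contains(s, x) else store(s, x)

def elements(s):
    """Hypothetical helper: list the stored elements, most recent first."""
    out = []
    while s != EMPTY:
        _, rest, x = s
        out.append(x)
        s = rest
    return out

if __name__ == "__main__":
    s = add(add(add(EMPTY, 1), 2), 1)
    print(elements(s))
    print(contains(s, 1), contains(s, 3))
```

Building sets only through `add` keeps the structure free of repeated elements, whereas calling `store` directly can break that invariant; this is one reading of the note about the use of str and add.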

Case Study 6: Set verification
• We would like to verify the following properties:
   • e ∉ (S−e) = true
   • e ∈ S1 ∪ S2 ⇒ e ∈ S1 ∨ e ∈ S2
Proof technique: structural induction on the ADT specification
Question: try it yourselves to see how it goes ...
Invariant Property: verify also that a set never contains any repeated elements

Back to the Primes Proof
• Question:
   • write an ADT specification of a list of integers
   • include a means of verifying that it is ordered
   • include a function for returning the length
• All that is left to do is plug the two parts together and we have a formal specification (and implementation) of our prime problem requirements.
We can look at this problem within another model of computation: automata

Computable Functions and Computing Machines: Finite Automata
Finite automata are computing devices that accept/recognize regular languages and are used to model operations of many systems we find in practice.
A classic example is a vending machine. For example, consider a simple vending machine that accepts only nickels and dimes and requires a payment of 15 cents.
QUESTION: how many different paths are there to a terminating/accepting state?

Computable Functions and Computing Machines: Finite Automata
Definition: deterministic finite automaton (DFA)
A DFA is a 5-tuple (Q, Σ, q0, δ, A), where Q is a finite set of states, Σ is a finite input alphabet, q0 ∈ Q is the initial state, δ: Q × Σ → Q is the transition function and A ⊆ Q is the set of accepting states.
• The set Q in the above definition is simply a set with a finite number of elements. Its elements can, however, be interpreted as the states that the system (automaton) can be in.
• The transition function is also called a next state function, meaning that the automaton moves into the state δ(q, a) if it receives the input symbol a while in state q.
• Note that δ is a function. Thus for each state q of Q and for each symbol a of Σ, δ(q, a) must be specified.
• If the finite automaton is in an accepting state when the input ceases to come, the sequence of input symbols given to the finite automaton is "accepted"

Computable Functions and Computing Machines: Finite Automata
DFAs are often represented by digraphs called (state) transition diagrams.
• The vertices (denoted by single circles) of a transition diagram represent the states of the DFA
• The arcs labeled with an input symbol correspond to the transitions.
• An arc (p, q) from vertex p to vertex q with label σ represents the transition δ(p, σ) = q.
• The accepting states are indicated by double circles.
• Transition functions can also be represented by transition tables.
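A transition table maps directly onto a dictionary, so a DFA is easy to simulate. The Python below sketches the nickel-and-dime vending machine under stated assumptions: states are the amount paid so far, exactly 15 cents is the only accepting state, and any overpayment leads to a non-accepting trap state called "over". The original diagram is not reproduced here, so these choices (and the symbols "N" and "D") are illustrative guesses rather than the machine from the lecture.

```python
COINS = {"N": 5, "D": 10}            # input alphabet: nickel, dime
STATES = [0, 5, 10, 15, "over"]      # Q: amount paid so far, or overpaid
START = 0                            # initial state
ACCEPTING = {15}                     # accepting states

def step(q, coin):
    """Next-state function for one coin (assumed overpayment behaviour)."""
    if q == "over":
        return "over"
    total = q + COINS[coin]
    return total if total <= 15 else "over"

# The transition table: one entry for every (state, symbol) pair,
# because the transition function must be total.
DELTA = {(q, c): step(q, c) for q in STATES for c in COINS}

def run_dfa(delta, start, accepting, word):
    """Read word symbol by symbol; accept iff the final state is accepting."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting

def accepts(word):
    return run_dfa(DELTA, START, ACCEPTING, word)

if __name__ == "__main__":
    for word in ("NNN", "ND", "DN", "DD", "NN", ""):
        print(repr(word), accepts(word))
```

Under these assumptions the machine accepts exactly the coin sequences that add up to 15 cents; a different treatment of overpayment would give a different transition table and a different answer to the path-counting question.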

Computable Functions and Computing Machines: Finite Automata
Examples of finite automata

Computable Functions and Computing Machines: Finite Automata
Examples of finite automata
TO DO: Draw the digraph corresponding to this transition table

Computable Functions and Computing Machines: Finite Automata
Examples of finite automata
TO DO: Draw the transition table corresponding to this digraph

Computable Functions and Computing Machines: Finite Automata
• A finite automaton as a machine
A finite automaton can also be thought of as a device consisting of a tape and a control circuit which satisfy the following conditions:
• The tape has a left end and extends to the right without an end.
• The tape is divided into squares, in each of which a symbol can be written prior to the start of the operation of the automaton.
• The tape has a read-only head.
• The head is always at the leftmost square at the beginning of the operation.
• The head moves to the right one square every time it reads a symbol. It never moves to the left. When it sees no symbol, it stops and the automaton terminates its operation.
• There is a finite control which determines the state of the automaton and also controls the movement of the head.

Computable Functions and Computing Machines: Finite Automata
A finite automaton as a machine

Computable Functions and Computing Machines: Finite Automata
The Chomsky hierarchy: linking machines to languages
• Type-0 grammars (unrestricted grammars) include all formal grammars. They generate exactly all languages that can be recognized by a Turing machine.
• Type-1 grammars (context-sensitive grammars) generate the context-sensitive languages. The languages described by these grammars are exactly all languages that can be recognized by a linear bounded automaton (a nondeterministic Turing machine whose tape is bounded by a constant times the length of the input).
• Type-2 grammars (context-free grammars) generate the context-free languages. These languages are exactly all languages that can be recognized by a non-deterministic pushdown automaton. Context-free languages are the theoretical basis for the syntax of most programming languages.
• Type-3 grammars (regular grammars) generate the regular languages. These languages are exactly all languages that can be decided by a finite state automaton. Regular languages are commonly used to define search patterns and the lexical structure of programming languages.

Computable Functions and Computing Machines: Finite Automata
The differences arise out of the different types of production rules that are allowed:

Computable Functions and Computing Machines: Finite Automata
Finite Automata and Regular Languages - some definitions

Computable Functions and Computing Machines: Finite Automata
Finite Automata and Regular Languages

Computable Functions and Computing Machines: Finite Automata
Finite Automata, Regular Languages and Regular Expressions
• We use the following operations to construct regular expressions:
• Boolean or: a vertical bar (or +) separates alternatives. For example, gray|grey can match "gray" or "grey".
• Grouping: parentheses are used to define the scope and precedence of the operators. For example, gray|grey and gr(a|e)y are equivalent patterns which both describe the set of "gray" and "grey".
• Quantification: a quantifier after a token (such as a character) or group specifies how often that preceding element is allowed to occur. The most common quantifiers are the question mark ?, the asterisk * (derived from the Kleene star), and the plus sign +.
   • ? The question mark indicates there is zero or one of the preceding element. For example, colou?r matches both "color" and "colour".
   • * The asterisk indicates there are zero or more of the preceding element. For example, ab*c matches "ac", "abc", "abbc", "abbbc", and so on.
   • + The plus sign indicates that there is one or more of the preceding element. For example, ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".
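The examples above can be tried out with Python's standard `re` module. This is a small sketch: the helper name `matches` is an assumption, and `re.fullmatch` is used so that the whole string must belong to the language of the pattern (rather than merely containing a match). Note that Python's regex syntax uses | for alternation, not +.

```python
import re

def matches(pattern, text):
    """True iff the whole of text is described by the regular expression."""
    return re.fullmatch(pattern, text) is not None

if __name__ == "__main__":
    # Boolean or, and grouping: two equivalent patterns
    print(matches("gray|grey", "grey"), matches("gr(a|e)y", "grey"))
    # ? : zero or one of the preceding element
    print(matches("colou?r", "color"), matches("colou?r", "colour"))
    # * : zero or more of the preceding element
    print(matches("ab*c", "ac"), matches("ab*c", "abbbc"))
    # + : one or more of the preceding element
    print(matches("ab+c", "abc"), matches("ab+c", "ac"))
```

Python's `re` engine also supports features (such as backreferences) that go beyond regular languages, so only the operators listed above should be used when experimenting with the theory.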

Computable Functions and Computing Machines: Finite Automata
Finite Automata, Regular Languages and Regular Expressions
Kleene’s Theorem

Computable Functions and Computing Machines: Finite Automata
Finite Automata, Regular Languages and Regular Expressions
• QUESTIONS: TO DO
• 1 Find the shortest string that is not in the language represented by the regular expression a*(ab)*b*.
• 2 For the two regular expressions given below, (a) find a string corresponding to r2 but not to r1 and (b) find a string corresponding to both r1 and r2.
   r1 = a* + b*
   r2 = ab* + ba* + b*a + (a*b)*
• 3 Let r1 and r2 be arbitrary regular expressions over some alphabet. Find a simple (the shortest and with the smallest nesting of * and +) regular expression which is equal to each of the following regular expressions:
   (a) (r1 + r2 + r1r2 + r2r1)*
   (b) (r1(r1 + r2)*)+

Computable Functions and Computing Machines: Finite Automata
Finite Automata, Regular Languages and Regular Expressions
QUESTION: Can we build a regular expression to check that a string is a palindrome?
ANSWER: NO … can you think about why this is the case?
