::ICS 804:: Theory of Computation - Ibrahim Otieno - iotieno@uonbi.ac.ke +254-0722-429297 SCI/ICT Building Rm. G15
Course Outline • Mathematical Preliminaries • Turing Machines • Recursion Theory • Markov Algorithms • Register Machines • Regular Languages and finite-state automata • Aspects of Computability
Last week: Register Machines • Register machines • Register machines and formal languages • Model-independent characterization of computational feasibility
Course Outline • Mathematical Preliminaries • Turing Machines • Additional Varieties of Turing Machines • Recursion Theory • Markov Algorithms • Register Machines • Regular Languages and finite-state automata • Aspects of Computability
Regular Languages and finite-state automata • Regular Expressions and regular languages • Deterministic FSA • Non-deterministic FSA • Finite-state automata with epsilon moves • Generative grammars • Context-free, Context-sensitive languages • Chomsky Hierarchy
Characterizing formal languages • Plain words: the language of all and only those words over Σ = {a,b} of length 2 (aa, ab, ba, bb) • Set abstraction: {w | w ∈ Σ* and |w| = 2} • New way: regular expressions denote languages
Regular Expressions Denote Languages • a* denotes language {a^n | n ≥ 0} • (a).(b) or just ab denotes unit language {ab} • a*b* denotes {a^n b^m | n, m ≥ 0} • a^2b^3 denotes {aabbb} • a^n is not a regular expression • a+bb denotes {a^n bb | n ≥ 1} • a?bb denotes {a^n bb | 0 ≤ n ≤ 1} • a|b denotes {a,b}
Exercise • What does ((a.b)*)|((b.a)*) mean? Give a few examples
Exercise • What does ((a.b)*)|((b.a)*) mean? Give a few examples • The language containing all and only even-length words consisting of alternating a's and b's: {ε, ab, ba, abab, baba, …}
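The exercise answer can be checked mechanically. The sketch below is only an illustration: it assumes Python's `re` syntax as a stand-in for the slide notation (concatenation is written by juxtaposition, so `((a.b)*)|((b.a)*)` becomes `(ab)*|(ba)*`), and the helper name `words_up_to` is made up for this example.

```python
import re
from itertools import product

# Slide notation ((a.b)*)|((b.a)*) written in Python's regex syntax.
PATTERN = re.compile(r"(ab)*|(ba)*")

def words_up_to(max_len, alphabet="ab"):
    """Yield every word over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# fullmatch requires the whole word to be denoted by the expression.
accepted = [w for w in words_up_to(4) if PATTERN.fullmatch(w)]
print(accepted)  # the empty string stands for ε
```

Up to length 4 this lists ε, ab, ba, abab and baba, in line with the answer on the slide.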
Kleene-Closure Operator (recap) • Symbol *: a unary operation on languages • Given language L, L* =def {w | for some n ≥ 0, w is the concatenation of n words of L} • L* is the result of concatenating 0 or more words of L
Language forming operations (recap) • Binary concatenation operation: . • L1.L2 =def {w1w2 | w1 ∈ L1 & w2 ∈ L2} • The language that results from taking a word from L1 and appending to it a word from L2
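Both operations can be tried out on small finite languages. This is a minimal sketch, assuming languages are represented as Python sets of strings with "" standing for ε. Because L* is usually infinite, the hypothetical `kleene_star` helper only collects words up to a chosen length bound.

```python
def concat(l1, l2):
    """L1.L2: every word of L1 followed by every word of L2."""
    return {w1 + w2 for w1 in l1 for w2 in l2}

def kleene_star(lang, max_len):
    """Words of L* whose length is at most max_len (a finite slice of L*)."""
    result = {""}      # n = 0: the empty concatenation gives ε
    frontier = {""}
    while frontier:
        # Append one more word of L to each newly found word, keep only new short ones.
        frontier = {w for w in concat(frontier, lang) if len(w) <= max_len} - result
        result |= frontier
    return result

print(sorted(concat({"a", "ab"}, {"", "b"})))
print(sorted(kleene_star({"ab"}, 4)))
```

The length bound is a design choice made purely so that the loop terminates; it is not part of the definition of L*.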
Definition - Regular Expression • (i) ∅ is a regular expression (over Σ) and denotes language ∅ • (ii) ε is a regular expression (over Σ) and denotes language {ε} • (iii) If s is in Σ* then s itself is a regular expression and denotes language {s} • (iv) Suppose r and s are regular expressions that denote languages Lr and Ls, then (a) (r|s) is a regular expression that denotes Lr ∪ Ls (b) (r.s) is a regular expression that denotes Lr.Ls (c) (r*) is a regular expression that denotes (Lr)* • (v) No expression is a regular expression unless it is obtainable from (i) – (iv)
Definition • What about (r+)? • What about (r?)?
Definition • (r+) = ((r*).r) • (r?) = (ε|r)
Notation • Usually forget about parentheses: (ab) = ab • (a|b|c): 3-word language {a,b,c} • Precedence: parentheses > superscript > concatenation > alternation • ab* = a(b*) ≠ (ab)* • a|ba = (a|(b.a)) ≠ ((a|b).a)
Regular Languages • Let L be a language over alphabet Σ, i.e., L ⊆ Σ*. Then L is said to be a regular language if L is denoted by some regular expression over Σ • Let Σ be a finite alphabet and L1 and L2 regular languages over Σ. Then L1 ∪ L2, L1.L2, and L1* are also regular languages
Remarks • If Σ is a finite alphabet and w is any word over Σ, then the unit language {w} is regular • If Σ is a finite alphabet, then any finite language over Σ is regular
Finite State Automata • New model of computation: analysis of the kind of computation that requires a fixed (finite) amount of memory for arbitrary input • Also called finite-state machines
Deterministic Finite-State Automata • Σ = {a,b} • Vertices and arcs • Labels of arcs are members of Σ • No tapes, but input • Input: (possibly empty) word over Σ • e.g. abb
Deterministic Finite-State Automata • Accepting configuration: FSA halts in state q1 • The FSA accepts word abb • e.g. aba is not accepted: it ends in q2 • q2: trap state • L = {ab^n | n ≥ 0}
Determinism • For each state/symbol pair, FSA M has exactly one instruction • In particular, FSA M has at least one instruction for each pair. This makes M fully defined • Determinism means that, within any state diagram for M, the path labeled by a given word w is unique: for each word w ∈ Σ*, there is exactly one path starting at q0 and labeled by w
Exercise • Which regular language is accepted by this FSA? • What are the accepting states? • Is ε an accepted word? • What is the trap state? • Is the trap state a sink? • Is the language finite?
Exercise • Which regular language is accepted by this FSA? The one denoted by a(a|b)a? • What are the accepting states? q2 and q3 • Is ε an accepted word? no • What is the trap state? q4 • Is the language finite? yes
Exercise • Which regular language is accepted by this FSA? • What are the accepting states? • Is ε an accepted word? • What is the trap state? • Is the language finite?
Exercise • Which regular language is accepted by this FSA? The one denoted by (aba)* • What are the accepting states? q0 • Is ε an accepted word? yes • What is the trap state? q3 • Is the language finite? no
Alternate description • δM(q0,a) = q1 • δM(q0,b) = q2 • δM(q1,b) = q1 • δM(q1,a) = q2 • δM(q2,a) = q2 • δM(q2,b) = q2
Formal Definition • A deterministic FSA is a quintuple ⟨Σ, Q, qinit, F, δM⟩ • Σ is the input alphabet • Q is a finite, nonempty set of states • qinit ∈ Q is the initial state or start state • F ⊆ Q is a (possibly empty) set of accepting or terminal states • δM: Q × Σ → Q is the transition function (total and single valued)
Word Acceptance • A deterministic finite-state automaton M accepts word w ∈ Σ* if the unique path starting at qinit and labeled by w leads to some member of F
Language Acceptance • The language accepted by M is the set of all and only those words over Σ that are accepted by M • We write L(M) for the language accepted by M • FSAs are language acceptors only
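Word acceptance for a deterministic FSA amounts to following one arc per input symbol. The sketch below encodes the transition function from the "Alternate description" slide as a Python dictionary. It assumes that this table belongs to the machine whose accepting state is q1 and whose language is a followed by any number of b's. The names `DELTA`, `START`, `ACCEPTING` and `accepts` are illustrative.

```python
# Transition function: one entry per state/symbol pair (total, single valued).
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q2", ("q1", "b"): "q1",
    ("q2", "a"): "q2", ("q2", "b"): "q2",   # q2 is the trap state
}
START = "q0"
ACCEPTING = {"q1"}   # assumption: q1 is the only accepting state

def accepts(word):
    """Follow the unique path labeled by word and test the final state."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING

for w in ["abb", "aba", "a", ""]:
    print(repr(w), accepts(w))
```

Since the function is total, the dictionary lookup never fails for words over {a,b}; a symbol outside the alphabet would raise a `KeyError`.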
Non-determinism • Cf. Turing Machines • Existence of alternative instructions for a given state/symbol pair
A Nondeterministic Machine • [State diagrams: states q0, q1, q2 with arcs labeled a and b] • L = (ab)* ∪ {a} = (ab)*|a • L = (ab)*
Non-determinism • Nondeterministic FSA are usually easier to design but run the risk of accepting unintended words • δM: Q × Σ → Q is a transition mapping • Assumed to be total but permitted to be multi-valued • Cf. difference between function and mapping!
Formal Definition • A nondeterministic FSA is a quintuple ⟨Σ, Q, qinit, F, δM⟩ • Σ is the input alphabet • Q is a finite, nonempty set of states • qinit ∈ Q is the initial state or start state • F ⊆ Q is a (possibly empty) set of accepting or terminal states • δM: Q × Σ → Q is the transition mapping (total and possibly multi-valued)
Word Acceptance • Word w ∈ Σ* is accepted by FSA M provided there exists some path, labeled by w, in the state diagram of M leading from qinit to a terminal state • Cf. deterministic definition of word acceptance: unique path
Language Acceptance • The language accepted by a nondeterministic FSA M is the set of words accepted by M
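"There exists some path" can be tested without backtracking by tracking the set of all states reachable so far. The sketch below uses a hypothetical four-state machine for (ab)*|a; it is not a transcription of the diagram on the slides. As a simplification, a missing dictionary entry is read as "no successor" rather than as an arc into a trap state.

```python
# Hypothetical nondeterministic FSA for (ab)*|a.
# q0 has two a-arcs: one starts an ab-block, the other accepts the single word a.
NFA_DELTA = {
    ("q0", "a"): {"q1", "q3"},
    ("q1", "b"): {"q2"},
    ("q2", "a"): {"q1"},
}
START = "q0"
ACCEPTING = {"q0", "q2", "q3"}

def nfa_accepts(word):
    """Accept if at least one path labeled by word ends in an accepting state."""
    current = {START}
    for symbol in word:
        # Union of all successors of all states we might currently be in.
        current = {t for s in current for t in NFA_DELTA.get((s, symbol), set())}
    return bool(current & ACCEPTING)

for w in ["", "a", "ab", "aba", "abab", "b"]:
    print(repr(w), nfa_accepts(w))
```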
Nondeterminism → determinism • Nondeterministic FSA are easier to design • For every nondeterministic FSA, there exists an equivalent deterministic FSA • We can automatically convert the nondeterministic FSA to an equivalent deterministic FSA through subset construction
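The subset construction can be sketched in a few lines: each state of the deterministic machine is a set of states of the nondeterministic one. The code below is a minimal illustration under stated assumptions. The example machine for (ab)*|a is hypothetical, the function and variable names are made up, and only subsets reachable from the start state are built. The empty subset plays the role of the trap state.

```python
from collections import deque

def subset_construction(nfa_delta, start, accepting, alphabet):
    """Build a deterministic FSA whose states are frozensets of NFA states."""
    dfa_start = frozenset({start})
    dfa_delta, dfa_accepting = {}, set()
    seen, queue = {dfa_start}, deque([dfa_start])
    while queue:
        subset = queue.popleft()
        if subset & accepting:            # contains some accepting NFA state
            dfa_accepting.add(subset)
        for symbol in alphabet:
            target = frozenset(t for s in subset
                               for t in nfa_delta.get((s, symbol), ()))
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_delta, dfa_start, dfa_accepting

# Hypothetical nondeterministic FSA for (ab)*|a.
NFA_DELTA = {("q0", "a"): {"q1", "q3"}, ("q1", "b"): {"q2"}, ("q2", "a"): {"q1"}}
dfa_delta, dfa_start, dfa_accepting = subset_construction(
    NFA_DELTA, "q0", {"q0", "q2", "q3"}, "ab")

def dfa_accepts(word):
    """Run the constructed deterministic FSA on word."""
    state = dfa_start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting

print(len({state for state, _ in dfa_delta}), "deterministic states")
```

In the worst case the deterministic machine has exponentially many states, which is why only reachable subsets are generated here.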
Epsilon moves • Executing an arc labeled ε does not advance the input • ε-arcs may or may not introduce nondeterminism
Example • [State diagram: states 0, 1, 2, 3 with arcs labeled a, b, c] • (a*b*c*)
Equivalence Result • Let M be an FSA with ε-moves. Then there exists an FSA M´ with no ε-moves such that L(M) = L(M´)
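One way to see why ε-moves add no power is to compute ε-closures: the set of states reachable through ε-arcs alone. The sketch below simulates a hypothetical three-state machine for a*b*c* with ε-arcs; it is not the diagram from the slides. The names and the use of "" as the ε label are assumptions made for this example.

```python
EPS = ""  # label used in this sketch for ε-arcs

# Hypothetical FSA with ε-moves for a*b*c*: loop on a, slide to b's, slide to c's.
DELTA = {
    (0, "a"): {0}, (0, EPS): {1},
    (1, "b"): {1}, (1, EPS): {2},
    (2, "c"): {2},
}
START, ACCEPTING = 0, {2}

def eps_closure(states):
    """All states reachable from the given ones using ε-arcs only."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in DELTA.get((s, EPS), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def accepts(word):
    """Simulate the machine, taking ε-closures before and after each symbol."""
    current = eps_closure({START})
    for symbol in word:
        moved = {t for s in current for t in DELTA.get((s, symbol), set())}
        current = eps_closure(moved)
    return bool(current & ACCEPTING)

for w in ["", "aabcc", "ac", "ba"]:
    print(repr(w), accepts(w))
```

Replacing every state by its ε-closure in this way is the idea usually used to build an ε-free machine; the code only simulates and does not output that machine.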
Non-determinism • ε-moves do not necessarily imply nondeterminism • [State diagram: states 0, 1, 2; arc label a]
Regular languages • The family of regular languages is identical to the family of FSA-acceptable languages
Generative Grammars • Alternative characterization of the family of (regular) languages • Example with just 2 productions: (1) S → aSb (2) S → ε • Generates all words of form a^n b^n for n ≥ 0 • e.g. aaabbb: S ⇒ aSb (1) ⇒ aaSbb (1) ⇒ aaaSbbb (1) ⇒ aaabbb (2)
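The derivation of aaabbb can be replayed step by step. This is a small sketch that assumes the two productions S → aSb and S → ε; the function name `derive_anbn` is made up. It applies production (1) n times and then production (2) once, recording every sentential form.

```python
def derive_anbn(n):
    """Return the sentential forms of the derivation of a^n b^n."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb", 1)   # production (1): S -> aSb
        steps.append(form)
    form = form.replace("S", "", 1)          # production (2): S -> ε
    steps.append(form)
    return steps

print(" => ".join(derive_anbn(3)))
```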
Definition • empty productions • grammar terminals (usually lowercase) • terminal alphabet Σ • grammar non-terminals (usually uppercase) • non-terminal alphabet Γ • start symbol S in Γ • production set
Second Example • (1) S → aaXcc (2) X → aXc (3) X → b • Generates all words of form a^n b c^n for n ≥ 2 • e.g. aaaabcccc: S ⇒ aaXcc (1) ⇒ aaaXccc (2) ⇒ aaaaXcccc (2) ⇒ aaaabcccc (3)
Third Example • (1) S → aS′bc (2) S → ε (3) S′ → aS′bC (4) S′ → ε (5) Cb → bC (6) Cc → cc • Generates language {a^n b^n c^n | n ≥ 0} • e.g. aaabbbccc: S ⇒ aS′bc (1) ⇒ aaS′bCbc (3) ⇒ aaaS′bCbCbc (3) ⇒ aaabCbCbc (4) ⇒ aaabbCCbc (5) ⇒ aaabbCbCc (5) ⇒ aaabbbCCc (5) ⇒ aaabbbCcc (6) ⇒ aaabbbccc (6)
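The derivation of aaabbbccc can be replayed as plain string rewriting. The sketch below makes two assumptions: the non-terminal S′ is written as the single letter T so that every symbol is one character, and each step rewrites the leftmost occurrence of the left-hand side, which happens to match the order used in the example. The names `RULES` and `derive` are illustrative.

```python
# Productions of the third example, with T standing in for S′.
RULES = {
    1: ("S", "aTbc"),
    2: ("S", ""),
    3: ("T", "aTbC"),
    4: ("T", ""),
    5: ("Cb", "bC"),
    6: ("Cc", "cc"),
}

def derive(rule_numbers, start="S"):
    """Apply the numbered productions in order, each at the leftmost match."""
    form = start
    steps = [form]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        if lhs not in form:
            raise ValueError(f"production {number} does not apply to {form}")
        form = form.replace(lhs, rhs, 1)
        steps.append(form)
    return steps

steps = derive([1, 3, 3, 4, 5, 5, 5, 6, 6])
print(" => ".join(steps))
```

Productions (5) and (6) have more than one symbol on the left-hand side, which is why a plain substring replacement is used instead of a per-symbol expansion.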