870 likes | 965 Views
Learning Restricted Restarting Automata. Presentation for the ABCD workshop in Prague, March 27 – 29, 2009 Peter Černo. About the presentation.
E N D
Learning Restricted Restarting Automata Presentation for the ABCD workshop in Prague, March 27 – 29, 2009 Peter Černo
About the presentation • PART I: We present a specialized program which allows an easy design and testing of restarting automata and provides specialized tools for learning finite automata and defining languages.
About the presentation • PART I: We present a specialized program which allows an easy design and testing of restarting automata and provides specialized tools for learning finite automata and defining languages. • PART II: We introduce two restricted models of restarting automata: RACL and RA∆CL together with their properties and limitations.
About the presentation • PART I: We present a specialized program which allows an easy design and testing of restarting automata and provides specialized tools for learning finite automata and defining languages. • PART II: We introduce two restricted models of restarting automata: RACL and RA∆CL together with their properties and limitations. • PART III: We demonstrate learning of these restricted models by another specialized program.
About the presentation • PART I: We present a specialized program which allows an easy design and testing of restarting automata and provides specialized tools for learning finite automata and defining languages. • PART II: We introduce two restricted models of restarting automata: RACL and RA∆CL together with their properties and limitations. • PART III: We demonstrate learning of these restricted models by another specialized program. • PART IV: We give a list of some open problems and topics for future investigations.
Restarting Automaton • Is a system M = (Σ, Γ, I) where: • Σ is an input alphabet, Γ is a working alphabet • I is a finiteset of meta-instructions: • Rewriting meta-instruction(EL, x → y, ER), where x, y ∊ Γ* such that |x| > |y|, and EL, ER⊆ Γ*are regular languages called left and right constraints. • Accepting meta-instruction(E, Accept), where E ⊆ Γ* is a regular language.
Language of Restarting Automaton • Rewriting meta-instructions of Minduce a reducing relation ⊢M⊆ Γ* x Γ*such that: for each u, v ∊ Γ*, u ⊢M vif and only if there exist an instruction i = (EL, x → y, ER)in Iand words u1, u2∊ Γ* such that u = u1xu2,v = u1yu2, u1∊ ELandu2 ∊ ER.
Language of Restarting Automaton • Rewriting meta-instructions of Minduce a reducing relation ⊢M⊆ Γ* x Γ*such that: for each u, v ∊ Γ*, u ⊢M vif and only if there exist an instruction i = (EL, x → y, ER)in Iand words u1, u2∊ Γ* such that u = u1xu2,v = u1yu2, u1∊ ELandu2 ∊ ER. • Accepting meta-instructions of Mdefine simple sentential forms SM= set of words u ∊ Γ*, for which there exist an instruction i = (E, Accept)in Isuch that u ∊ E.
Language of Restarting Automaton • Rewriting meta-instructions of Minduce a reducing relation ⊢M⊆ Γ* x Γ*such that: for each u, v ∊ Γ*, u ⊢M vif and only if there exist an instruction i = (EL, x → y, ER)in Iand words u1, u2∊ Γ* such that u = u1xu2,v = u1yu2, u1∊ ELandu2 ∊ ER. • Accepting meta-instructions of Mdefine simple sentential forms SM= set of words u ∊ Γ*, for which there exist an instruction i = (E, Accept)in Isuch that u ∊ E. • The input language of M is defined as: L(M) = {u ∊ Σ*| ∃v ∊ SM: u ⊢M* v}, where⊢M*is a reflexive and transitive closure of ⊢M.
Example • How to create a restarting automaton that recognizes the language L = {aibicjdj | i, j > 0}. • Accepting meta-instructions: • Reducing (or rewriting) meta-instructions:
Example • Suppose that we have a word: aaabbbccdd
Example • aaabbbccdd‒R0→ aabbccdd
Example • aaabbbccdd‒R0→ aabbccdd‒R1→ aabbcd
Example • aaabbbccdd‒R0→ aabbccdd‒R1→ aabbcd‒R0→ abcd • abcdis accepted by A0, so the whole word aaabbbccdd is accepted. • Note that abcdis a simple sentential form.
Capabilities and features • Design a restarting automaton. The design of restarting automaton consists of stepwise design of accepting and reducing meta-instructions. You can save (load) restarting automaton to (from) an XML file. • Test correctly defined restarting automaton: • The system is able to give you a list of all words that can be obtained by reductions from a given word w. • The system is able to give you a list of all reduction paths from one given word to another given word. • Start a server mode, in which the client applications can use services provided by the server application. • You can use specialized toolsto define formal languages. • You can also save, load, copy, paste and view an XML representation of the actual state of every tool.
Learning Languages • There are several tools that are used to define languages: • DFA Modeler: allows you to enter a regular language by specifying its underlying deterministic finite automaton. • LStar Algorithm: encapsulates Dana Angluin’s L* algorithm that is a machine learning algorithm which learns deterministic finite automaton using membership and equivalence queries. • RPNI Algorithm: encapsulates a machine learning algorithm which learns deterministic finite automaton based on a given set of labeled examples. • Regular Expression: allows you to enter a regular language by specifying the regular expression. • SLT Language: allows you to design a regular language by specifying a positive integer k and positive examples using the algorithm for learning k-SLT languages.
Pros and Cons • Pros: • The application is written in C# using .NET Framework 2.0. It works both on Win32 and UNIX platforms. • The application demonstrates that it is easy to design and work with restarting automata. • Any component of the application can be easily reused in another projects. It safes your work. • Cons: • The application is a toy that allows you only to design simple restarting automata recognizing only simple formal languages with small alphabets consisting of few letters. On large inputs the computation can take a long time and it can produce a huge output.
PART II: RACL • k-local Restarting Automaton CLEARING(k-RACL) M = (Σ, I) • Σis a finite nonempty alphabet, ¢, $ ∉ Σ • Iis a finite set of instructions (x, z, y), x∊ LCk, y ∊ RCk, z ∊ Σ+ • left context LCk = Σk∪¢.Σ≤k-1 • right context RCk = Σk∪Σ≤k-1.$
PART II: RACL • k-local Restarting Automaton CLEARING(k-RACL) M = (Σ, I) • Σis a finite nonempty alphabet, ¢, $ ∉ Σ • Iis a finite set of instructions (x, z, y), x∊ LCk, y ∊ RCk, z ∊ Σ+ • left context LCk = Σk∪¢.Σ≤k-1 • right context RCk = Σk∪Σ≤k-1.$ • A word w = uzvcan be rewritten to uv (uzv→ uv) if and only if there exist an instruction i = (x, z, y) ∊ I such that: • x⊒ ¢.u(xis a suffix of ¢.u) • y⊑ v.$ (y is a prefix of v.$)
PART II: RACL • k-local Restarting Automaton CLEARING(k-RACL) M = (Σ, I) • Σis a finite nonempty alphabet, ¢, $ ∉ Σ • Iis a finite set of instructions (x, z, y), x∊ LCk, y ∊ RCk, z ∊ Σ+ • left context LCk = Σk∪¢.Σ≤k-1 • right context RCk = Σk∪Σ≤k-1.$ • A word w = uzvcan be rewritten to uv (uzv→ uv) if and only if there exist an instruction i = (x, z, y) ∊ I such that: • x⊒ ¢.u(xis a suffix of ¢.u) • y⊑ v.$ (y is a prefix of v.$) • A word wis accepted if and only if w →* λwhere →*is reflexive and transitive closure of →.
PART II: RACL • k-local Restarting Automaton CLEARING(k-RACL) M = (Σ, I) • Σis a finite nonempty alphabet, ¢, $ ∉ Σ • Iis a finite set of instructions (x, z, y), x∊ LCk, y ∊ RCk, z ∊ Σ+ • left context LCk = Σk∪¢.Σ≤k-1 • right context RCk = Σk∪Σ≤k-1.$ • A word w = uzvcan be rewritten to uv (uzv→ uv) if and only if there exist an instruction i = (x, z, y) ∊ I such that: • x⊒ ¢.u(xis a suffix of ¢.u) • y⊑ v.$ (y is a prefix of v.$) • A word wis accepted if and only if w →* λwhere →*is reflexive and transitive closure of →. • We define the class RACLas ⋃k≥1k-RACL.
Why RACL? • This model was inspired by the Associative Language Descriptions (ALD) model • By Alessandra Cherubini, Stefano Crespi-Reghizzi, Matteo Pradella, Pierluigi San Pietro • See: http://home.dei.polimi.it/sanpietr/ALD/ALD.html • The more restricted model we have the easier is the investigation of its properties. Moreover, the learning methods are much more simple and straightforward.
Example • Language L = {anbn | n ≥ 0}.
Example • Language L = {anbn | n ≥ 0}. • 1-RACL M = ({a, b}, I)where the instructions Iare: • R1 = (a, ab, b) • R2 = (¢, ab, $)
Example • Language L = {anbn | n ≥ 0}. • 1-RACL M = ({a, b}, I) where the instructions Iare: • R1 = (a, ab, b) • R2 = (¢, ab, $) • For instance: • aaaabbbb‒R1→ aaabbb
Example • Language L = {anbn | n ≥ 0}. • 1-RACL M = ({a, b}, I) where the instructions Iare: • R1 = (a, ab, b) • R2 = (¢, ab, $) • For instance: • aaaabbbb‒R1→ aaabbb ‒R1→ aabb
Example • Language L = {anbn | n ≥ 0}. • 1-RACL M = ({a, b}, I) where the instructions Iare: • R1 = (a, ab, b) • R2 = (¢, ab, $) • For instance: • aaaabbbb‒R1→ aaabbb ‒R1→ aabb ‒R1→ ab
Example • Language L = {anbn | n ≥ 0}. • 1-RACL M = ({a, b}, I) where the instructions Iare: • R1 = (a, ab, b) • R2 = (¢, ab, $) • For instance: • aaaabbbb‒R1→ aaabbb ‒R1→ aabb ‒R1→ ab ‒R2→ λ • Now we see that the word aaaabbbb is accepted because aaaabbbb→* λ.
Example • Language L = {anbn | n ≥ 0}. • 1-RACL M = ({a, b}, I) where the instructions Iare: • R1 = (a, ab, b) • R2 = (¢, ab, $) • For instance: • aaaabbbb‒R1→ aaabbb ‒R1→ aabb ‒R1→ ab ‒R2→ λ • Now we see that the word aaaabbbb is accepted because aaaabbbb→* λ. • Note that λis always accepted because λ→* λ.
Some Theorems • Theorem: For every finiteL ⊆ Σ*there exist 1-RACL M such that L(M) = L. • Proof. Suppose L = {w1, …, wn}. Consider I = {(¢, w1, $), …, (¢, wn, $)}. ∎
Some Theorems • Theorem: For every finiteL ⊆ Σ*there exist 1-RACL M such that L(M) = L. • Proof. Suppose L = {w1, …, wn}. Consider I = {(¢, w1, $), …, (¢, wn, $)}. ∎ • Theorem: For all k≥1ℒ(k-RACL) ⊆ ℒ((k+1)-RACL).
Some Theorems • Theorem: For every finiteL ⊆ Σ*there exist 1-RACL M such that L(M) = L. • Proof. Suppose L = {w1, …, wn}. Consider I = {(¢, w1, $), …, (¢, wn, $)}. ∎ • Theorem: For all k≥1ℒ(k-RACL) ⊆ ℒ((k+1)-RACL). • Theorem: For each regular language L there exist a k-RACL M such that L(M) = L∪{λ}.
Some Theorems • Theorem: For every finiteL ⊆ Σ*there exist 1-RACL M such that L(M) = L. • Proof. Suppose L = {w1, …, wn}. Consider I = {(¢, w1, $), …, (¢, wn, $)}. ∎ • Theorem: For all k≥1ℒ(k-RACL) ⊆ ℒ((k+1)-RACL). • Theorem: For each regular language L there exist a k-RACL M such that L(M) = L∪{λ}. • Proof. Based on pumping lemma for regular languages. • For each z ∊ Σ*, |z|=n there exist u, v, wsuch that |v|≥1, δ(q0, uv) = δ(q0, u); the word v can be crossed out. • We add corresponding instruction iz = (¢u, v, w). • For each accepted z ∊ Σ<nwe add instruction iz = (¢, z, $). ∎
Some Theorems • Lemma: Let M be RACL, i = (x, z, y) its instruction and w = uv such that x⊒ ¢.uand y⊑ v.$. Then uv∊ L(M) ⇒ uzv∊ L(M). • Proof. uzv―i→ uv →* λ. ∎
Some Theorems • Lemma: Let M be RACL, i = (x, z, y) its instruction and w = uv such that x⊒ ¢.uand y⊑ v.$. Then uv∊ L(M) ⇒ uzv∊ L(M). • Proof. uzv―i→ uv →* λ. ∎ • Theorem: Languages L1∪ L2 and L1.L2 • where L1 = {anbn | n≥0}and L2 = {anb2n | n≥0} are not accepted by any RACL. • Proof by contradiction, based on the previous lemma.
Some Theorems • Lemma: Let M be RACL, i = (x, z, y) its instruction and w = uv such that x⊒ ¢.uand y⊑ v.$. Then uv∊ L(M) ⇒ uzv∊ L(M). • Proof. uzv―i→ uv →* λ. ∎ • Theorem: Languages L1∪ L2 and L1.L2 • where L1 = {anbn | n≥0}and L2 = {anb2n | n≥0} are not accepted by any RACL. • Proof by contradiction, based on the previous lemma. • Corollary: RACL is not closed underunion and concatenation.
Some Theorems • Lemma: Let M be RACL, i = (x, z, y) its instruction and w = uv such that x⊒ ¢.uand y⊑ v.$. Then uv∊ L(M) ⇒ uzv∊ L(M). • Proof. uzv―i→ uv →* λ. ∎ • Theorem: Languages L1∪ L2 and L1.L2 • where L1 = {anbn | n≥0}and L2 = {anb2n | n≥0} are not accepted by any RACL. • Proof by contradiction, based on the previous lemma. • Corollary: RACL is not closed underunion and concatenation. • Corollary: RACL is not closed under homomorphism. • Consider {anbn | n≥0} ∪ {cnd2n | n≥0}and homomorphism defined as: a ↦ a, b ↦ b, c ↦ a, d ↦ b. ∎
Some Theorems • Theorem: The language L1 = {ancbn | n ≥ 0} ∪ {λ}is not accepted by any RACL. • Theorem: The languages: • L2 = {ancbn | n ≥ 0} ∪ {ambm | m ≥ 0} • L3 = {ancbm | n, m ≥ 0} ∪ {λ} • L4 = {ambm | m ≥ 0} are recognized by 1-RACL.
Some Theorems • Theorem: The language L1 = {ancbn | n ≥ 0} ∪ {λ}is not accepted by any RACL. • Theorem: The languages: • L2 = {ancbn | n ≥ 0} ∪ {ambm | m ≥ 0} • L3 = {ancbm | n, m ≥ 0} ∪ {λ} • L4 = {ambm | m ≥ 0} are recognized by 1-RACL. • Corollary: RACL is not closed under intersection. • Proof. L1 = L2∩ L3. ∎
Some Theorems • Theorem: The language L1 = {ancbn | n ≥ 0} ∪ {λ}is not accepted by any RACL. • Theorem: The languages: • L2 = {ancbn | n ≥ 0} ∪ {ambm | m ≥ 0} • L3 = {ancbm | n, m ≥ 0} ∪ {λ} • L4 = {ambm | m ≥ 0} are recognized by 1-RACL. • Corollary: RACL is not closed under intersection. • Proof. L1 = L2∩ L3. ∎ • Corollary: RACL is not closed under intersection with a regular language. • Proof. L3is a regular language. ∎
Some Theorems • Theorem: The language L1 = {ancbn | n ≥ 0} ∪ {λ}is not accepted by any RACL. • Theorem: The languages: • L2 = {ancbn | n ≥ 0} ∪ {ambm | m ≥ 0} • L3 = {ancbm | n, m ≥ 0} ∪ {λ} • L4 = {ambm | m ≥ 0} are recognized by 1-RACL. • Corollary: RACL is not closed under intersection. • Proof. L1 = L2∩ L3. ∎ • Corollary: RACL is not closed under intersection with a regular language. • Proof. L3is a regular language. ∎ • Corollary: RACL is not closed under difference. • Proof. L1 = (L2– L4) ∪ {λ}. ∎
Parentheses • The following instruction of 1-RACL M is enough for recognizing the language of correct parentheses: • (λ, ( ), λ)
Parentheses • The following instruction of 1-RACL M is enough for recognizing the language of correct parentheses: • (λ, ( ), λ) • Note: This instruction represents a set of instructions: • ({¢}∪Σ, ( ), Σ∪{$}), where Σ = {(, )} and • (A, w, B) = {(a, w, b) | a∊A, b∊B}.
Parentheses • The following instruction of 1-RACL M is enough for recognizing the language of correct parentheses: • (λ, ( ), λ) • Note: This instruction represents a set of instructions: • ({¢}∪Σ, ( ), Σ∪{$}), where Σ = {(, )} and • (A, w, B) = {(a, w, b) | a∊A, b∊B}. • Note: We use the following notation for the (A, w, B): A B w
Arithmetic expressions • Suppose that we want to check correctness of arithmetic expressions over the alphabet Σ = {α, +, *, (, )}. • For example α+(α*α+α)is correct, α*+αis not. • The priority of the operations is considered.
Arithmetic expressions • Suppose that we want to check correctness of arithmetic expressions over the alphabet Σ = {α, +, *, (, )}. • For example α+(α*α+α) is correct, α*+αis not. • The priority of the operations is considered. • The following 1-RACL M is sufficient: ¢ + ( α ( ¢ + * ( α ( ¢ $ α+ ()+ α* ()* α () α ) $ + ) α ) $ + * ) ( ) +α +() *α *() α ()
Nondeterminism • Assume the following instructions: • R1 = (bb, a, bbbb) • R2 = (bb, bb, $) • R3 = (¢, cbb, $) and the word: cbbabbbb.
Nondeterminism • Assume the following instructions: • R1 = (bb, a, bbbb) • R2 = (bb, bb, $) • R3 = (¢, cbb, $) and the word: cbbabbbb. Then: • cbbabbbb―R1→ cbbbbbb ―R2→ cbbbb―R2→ cbb ―R3→ λ.