New Approaches to DFA learning Cristina Bibire Research Group on Mathematical Linguistics, Rovira i Virgili University Pl. Imperial Tarraco 1, 43005, Tarragona, Spain E-mail: cristina.bibire@estudiants.urv.es
Introduction • Learning from queries • Learning from corrections • Learning from examples • Incremental algorithm • Further Research
Introduction • This research focuses on learning DFA within two important frameworks, learning from queries and learning from examples: • Angluin's query learning algorithm • a learning-from-corrections algorithm in which correction queries replace membership queries (it is possible to learn DFA from corrections, and the number of queries is reduced considerably) • a comprehensive study of the most important state-merging strategies developed so far • a new incremental learning algorithm which allows us to learn new information from additional data that may later become available (incremental learning is possible in the presence of a characteristic sample)
Learning from queries • Learning from queries was introduced by Dana Angluin in 1987; she was the first to prove the learnability of DFA via queries. • In query learning, there is a teacher that knows the language and has to answer correctly specific kinds of queries asked by the learner. In Angluin's algorithm, the learner asks two kinds of queries: • membership query • - consists of a string s; the answer is YES or NO depending on whether s is a member of the unknown language or not. • equivalence query • - is a conjecture, consisting of a description of a regular set U. The answer is YES if U is equal to the unknown language; otherwise it is a string s in the symmetric difference of U and the unknown language.
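As an illustration of the two query types (not part of the original slides), here is a minimal Python sketch of a teacher that knows the target DFA; the DFA representation and all names are assumptions, and the equivalence query is only approximated by testing every word up to a fixed length.

from itertools import product

class DFA:
    """Complete DFA: delta maps (state, letter) to the next state."""
    def __init__(self, alphabet, delta, start, accepting):
        self.alphabet, self.delta = alphabet, delta
        self.start, self.accepting = start, accepting

    def accepts(self, word):
        q = self.start
        for a in word:
            q = self.delta[(q, a)]
        return q in self.accepting

class Teacher:
    """Answers the two kinds of queries asked by Angluin's learner."""
    def __init__(self, target, max_len=8):
        self.target = target
        self.max_len = max_len  # length bound for the approximate equivalence test

    def membership_query(self, s):
        # YES/NO: is s a member of the unknown language?
        return self.target.accepts(s)

    def equivalence_query(self, hypothesis):
        # Returns (True, None) if no disagreement is found up to max_len,
        # otherwise (False, s) with s in the symmetric difference.
        for n in range(self.max_len + 1):
            for letters in product(self.target.alphabet, repeat=n):
                s = "".join(letters)
                if self.target.accepts(s) != hypothesis.accepts(s):
                    return False, s
        return True, None

In a full run of L*, the counterexample returned by the equivalence query is what forces the learner to extend its observation table and refine the hypothesis.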
Learning from corrections • In Angluin's algorithm, when the learner asks about a word in the language, the teacher's answer is very simple, YES or NO. • Our idea was to introduce a new type of query: • correction query • - it consists of a string s; the teacher has to return the smallest string s' such that s.s' belongs to the target language. • Formally, for a string α ∈ Σ*, C(α) = min(α⁻¹L) (in the lex-length order) if α⁻¹L ≠ ∅, and C(α) = φ otherwise, • where α⁻¹L = {β ∈ Σ* | αβ ∈ L} is the left quotient of L by α, L being the language accepted by the target automaton.
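A hedged sketch (not from the slides) of how a teacher could compute C(α) when the target language is given by a complete DFA: a breadth-first search from the state reached by α, expanding letters in sorted order, returns the length-lexicographically smallest continuation, or φ when no accepting state is reachable. The function and parameter names are assumptions.

from collections import deque

def correction_query(delta, start, accepting, alphabet, prefix, phi="φ"):
    """C(prefix): the smallest string s' (shortest, then lexicographic)
    with prefix.s' in the target language, or phi if none exists."""
    q = start
    for a in prefix:                      # state reached after reading the prefix
        q = delta[(q, a)]
    seen, queue = {q}, deque([(q, "")])
    while queue:
        state, suffix = queue.popleft()
        if state in accepting:            # first accepting state found gives
            return suffix                 # the minimal correction
        for a in sorted(alphabet):
            nxt = delta[(state, a)]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, suffix + a))
    return phi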
Learning from corrections Closed, consistent observation tables An observation table is called closed if for every t ∈ S·Σ there exists s ∈ S such that row(t) = row(s). An observation table is called consistent if whenever s1, s2 ∈ S satisfy row(s1) = row(s2), then row(s1·a) = row(s2·a) for every a ∈ Σ. For any s ∈ S ∪ S·Σ, row(s) in L* denotes the finite function from E to {0,1} defined by row(s)(e) = 1 iff s·e ∈ L. For any s ∈ S ∪ S·Σ, row(s) in LCA denotes the finite function from E to Σ* ∪ {φ} defined by row(s)(e) = C(s·e). Remark 1 C(α)=βγ implies C(αβ)=γ Remark 2 C(α)=φ implies C(αβ)=φ
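As a hedged illustration (not from the slides), the following sketch checks these two properties for an observation table stored as a dictionary from strings to observations, with 0/1 entries for L* and correction strings for LCA; the names table, S, E and alphabet are assumptions rather than the slides' notation.

def row(table, s, E):
    # row(s): the tuple of observations for s.e, e in E
    return tuple(table[s + e] for e in E)

def is_closed(table, S, E, alphabet):
    # closed: every row of S.Sigma already appears as the row of some s in S
    s_rows = {row(table, s, E) for s in S}
    return all(row(table, s + a, E) in s_rows for s in S for a in alphabet)

def is_consistent(table, S, E, alphabet):
    # consistent: strings of S with equal rows keep equal rows
    # after appending any single letter
    for s1 in S:
        for s2 in S:
            if row(table, s1, E) == row(table, s2, E):
                if any(row(table, s1 + a, E) != row(table, s2 + a, E)
                       for a in alphabet):
                    return False
    return True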
Learning from corrections Comparative results for different languages using L* and LCA
Learning from examples • TB algorithm • Gold’s algorithm • RPNI • Traxbar • EDSM • W-EDSM • Blue-fringe • SAGE
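Most of these state-merging algorithms start from a prefix tree acceptor (PTA) built from the labelled sample and then merge compatible states; a minimal sketch of the PTA construction is given below (the helper name and data layout are assumptions, not any one algorithm's exact implementation).

def build_pta(positive, negative):
    """Prefix tree acceptor: states are prefixes of the sample words;
    each state is labelled +1 (accepting), -1 (rejecting) or 0 (unknown)."""
    states = {""}                 # the empty prefix is the initial state
    label = {"": 0}
    delta = {}
    for word, sign in [(w, +1) for w in positive] + [(w, -1) for w in negative]:
        prefix = ""
        for a in word:
            nxt = prefix + a
            if nxt not in states:
                states.add(nxt)
                label[nxt] = 0
            delta[(prefix, a)] = nxt
            prefix = nxt
        label[prefix] = sign
    return states, delta, label

For example, build_pta({"a", "abb"}, {"ab"}) produces the chain of prefixes "", "a", "ab", "abb" with "a" and "abb" accepting and "ab" rejecting; algorithms such as RPNI or EDSM then repeatedly merge pairs of states whose merge does not force an accepting and a rejecting state to collapse.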
Incremental learning algorithm An incremental learning algorithm is introduced for learning new information from additional data that may later become available. The proposed algorithm is capable of incrementally learning new information without forgetting previously acquired knowledge and without requiring access to the original database.
Incremental learning algorithm [Figure: example run of the IA on a small DFA with states q0-q3 over the alphabet {0, 1}]
Incremental learning algorithm There are a lot of questions to be answered: • does it produce the target DFA? • does it improve on the running time? • is it useful in real-life applications? • what are the conditions to fulfil in order for it to work properly? etc. • We denote by: • A(Σ) = the set of all automata having the alphabet Σ
Incremental learning algorithm Lemma 1 It is not always true that: Lemma 2 It is not always true that: Lemma 3 It is not always true that:
Further Research on LCA • To prove that the number of CQs is always smaller than the number of MQs • To prove that the number of EQs is always less than or equal • To prove the following conjectures: • To show that we have improved on the running time • CQs are more expensive than MQs. How much does this affect the total running time?
Further Research on IA • To determine the complexity of the algorithm and to test it on large/sparse data • To determine how much time and resources we save using this algorithm instead of the classical ones • To design an algorithm to deal with newly introduced negative samples • To find the answer to the question: when is the automaton created with this method weakly equivalent to the one obtained with the entire sample? • To improve the software in order to be able to deal with new samples