31st of October 2005, Seminar IV. Attempts to extend correction queries. Cristina Bibire, Research Group on Mathematical Linguistics, Rovira i Virgili University, Pl. Imperial Tarraco 1, 43005, Tarragona, Spain. E-mail: cristina.bibire@estudiants.urv.es
Correction queries • PAC learning of DFA • Learning CFL • Learning WFA • Redefining the correcting string • References
Learning from corrections The correcting string of s in the language L is the smallest string s' (in lex-length order) such that s.s' belongs to L. The answer to a correction query for a string consists of its correcting string. Myhill-Nerode theorem: The number of states in the smallest DFA accepting L is equal to the number of equivalence classes of Σ* under the relation ∼L (where x ∼L y iff, for every z, xz belongs to L exactly when yz does).
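As a small illustration (not part of the original slides), a correcting string can be computed by brute force whenever membership in L can be tested: scan candidate suffixes in lex-length order and return the first one whose concatenation with s lands in L. The function name, the alphabet and the length bound below are assumptions made for the example.

```python
from itertools import product

def correction_query(s, in_language, alphabet=("a", "b"), max_len=6):
    """Smallest suffix s' in lex-length order with s.s' in L (brute force)."""
    for n in range(max_len + 1):
        for tail in product(alphabet, repeat=n):   # lexicographic within each length
            suffix = "".join(tail)
            if in_language(s + suffix):
                return suffix
    return None  # no completion found within the length bound

# Example with L = (ab)*: the correcting string of "a" is "b", of "ab" is "" (lambda).
is_in_L = lambda w: len(w) % 2 == 0 and all(w[i] == "ab"[i % 2] for i in range(len(w)))
print(correction_query("a", is_in_L))   # -> 'b'
print(correction_query("ab", is_in_L))  # -> ''
```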
How can we extend CQ? • PAC learning of DFA with CQ (?) • Learning CFL with CQ (?) • Learning WFA with CQ (?) • Redefining the correcting string (?)
PAC learning of DFA with CQ
• We assume that there is some probability distribution Pr on the set of all strings over the alphabet Σ, and let L be an unknown regular set.
• The Learner has access to information about L by means of two oracles:
• C(x) returns the correcting string for x
• Ex( ) is a random sampling oracle that selects a string x from Σ* according to the distribution Pr and returns the pair (x, C(x)).
• In addition, the Learner is given the accuracy ε and the confidence δ.
• Definition: We say that the language L1 is an ε-approximation of the language L2 provided that Pr(L1 Δ L2) ≤ ε, where L1 Δ L2 denotes the symmetric difference of the two languages.
• If A is a DFA, it is said to be an ε-approximation of the set L if L(A) is an ε-approximation of L.
PAC learning of DFA with CQ
• If A is an ε-approximation of L, then the probability of finding a discrepancy between L(A) and L with one call of the random sampling oracle Ex( ) is at most ε.
• The approximate learner LCAapprox is obtained by modifying LCA. A correction query for a string x is answered by a call to C(x). Each conjecture is tested by a number of calls to Ex( ).
• If any of the calls to Ex( ) returns a pair (t, C(t)) such that
• C(t) = λ but A(S,E,C) rejects t, or
• C(t) ≠ λ but A(S,E,C) accepts t,
• then t is said to be a counterexample and LCAapprox proceeds as LCA does.
• If none of the calls to Ex( ) returns a counterexample, then LCAapprox halts and outputs A(S,E,C).
PAC learning of DFA with CQ
• How many calls to Ex( ) does LCAapprox make to test a given conjecture? This depends on:
• the accuracy and confidence parameters, ε and δ
• how many previous conjectures have been tested.
• Let r_i = (1/ε)(ln(1/δ) + (i+1) ln 2), the choice used in Angluin's analysis. If i previous conjectures have been tested, then LCAapprox makes ⌈r_i⌉ calls to Ex( ).
• Theorem. If n is the number of states in the minimum DFA for the target language L, then LCAapprox terminates after O(n + (1/ε)(n ln(1/δ) + n²)) calls to the Ex( ) oracle. Moreover, the probability that the automaton output by LCAapprox is an ε-approximation of L is at least 1-δ.
PAC learning of DFA with CQ
• Sketch of the proof:
• the total number of counterexamples is at most n-1, so the total number of calls to Ex( ) is at most the sum of ⌈r_i⌉ over i = 0, ..., n-1, which is bounded by n + (1/ε)(n ln(1/δ) + (ln 2) n(n+1)/2) = O(n + (1/ε)(n ln(1/δ) + n²));
• the probability that LCAapprox will terminate with an automaton that is not an ε-approximation of L is at most the sum over i of (1-ε)^r_i ≤ the sum over i of δ·2^-(i+1) < δ.
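A minimal sketch of how one conjecture might be tested, assuming the r_i above and hypothetical helpers ex() (the sampling oracle) and accepts(dfa, t) (running the current hypothesis on t); none of these names come from the slides.

```python
import math

def test_conjecture(dfa, ex, accepts, eps, delta, i):
    """Test the i-th conjecture with ceil(r_i) calls to Ex(); return a counterexample or None."""
    r_i = math.ceil((1.0 / eps) * (math.log(1.0 / delta) + (i + 1) * math.log(2)))
    for _ in range(r_i):
        t, correction = ex()                      # pair (t, C(t)) drawn according to Pr
        accepted = accepts(dfa, t)
        if (correction == "" and not accepted) or (correction != "" and accepted):
            return t                              # discrepancy found: hand t back to LCA
    return None                                   # conjecture survived all r_i random tests
```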
How can we extend CQ? • PAC learning of DFA with CQ (√) • Learning CFL with CQ (?) • Learning WFA with CQ (?) • Redefining the correcting string (?)
Learning CFL • The setting • There is an unknown CFG G in Chomsky normal form. The Learner knows the set T of terminal symbols, the set N of nonterminal symbols and the start symbol S of G. The Teacher is assumed to answer two types of questions: • MEMBER(x,A) – if the string x can be derived from the non-terminal A in the grammar G, the answer is yes; otherwise, it is no • EQUIV(H) – if H is equivalent to G, the answer is yes; otherwise, it replies with a counterexample t.
Learning CFL • The Learner LCF • LCF can explicitly enumerate all the possible productions of G in polynomial time (in |T| and |N|). Initially LCF places all possible productions of G in the hypothesized set of productions P. • The main loop of LCF asks an EQUIV(H) question for the grammar H=(T,N,S,P). • if H is equivalent to G, then LCF halts and outputs H • otherwise, it “diagnoses” the counterexample t returned, which results in removing at least one production from P; the main loop is then repeated.
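A rough Python sketch of this loop, under stated assumptions: member(x, A) and equiv(P) stand in for the Teacher's MEMBER and EQUIV oracles, and parse(t, P) is an assumed CYK-style parser returning a derivation tree (A, substring, children) of the counterexample under the hypothesis grammar; these helper names are not from the original slides.

```python
def diagnose(node, member):
    """Given a tree deriving a string NOT derivable from its root in G,
    return one production that cannot belong to G."""
    A, x, children = node
    if not children:                           # terminal production A -> a (x is a single terminal)
        return (A, (x,))
    (B, x1, _), (C, x2, _) = children
    if not member(x1, B):
        return diagnose(children[0], member)   # the fault lies deeper, on the left
    if not member(x2, C):
        return diagnose(children[1], member)   # the fault lies deeper, on the right
    return (A, (B, C))                         # both halves are fine, so A -> B C is bad

def lcf(terminals, nonterminals, start, member, equiv, parse):
    # Start from every possible CNF production over N and T.
    P = {(A, (B, C)) for A in nonterminals for B in nonterminals for C in nonterminals}
    P |= {(A, (a,)) for A in nonterminals for a in terminals}
    while True:
        t = equiv(P)                           # None means "equivalent": we are done
        if t is None:
            return (terminals, nonterminals, start, P)
        P.discard(diagnose(parse(t, P), member))
```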
How can we extend CQ? • PAC learning of DFA with CQ (√) • Learning CFL with CQ (√) • Learning WFA with CQ (?) • Redefining the correcting string (?)
Learning WFA Let K be a field and let f: Σ* → K be a function. Associate with f an infinite matrix F with rows and columns indexed by strings in Σ*; the (x, y) entry of F contains the value f(x.y). The function f is called a power series and F its Hankel matrix. With every WFA A we can associate a function f_A and, vice versa, for every function f whose Hankel matrix has finite rank there exists a smallest WFA A such that f_A = f. Theorem [Carlyle, Paz 1971] Let f: Σ* → K be a function whose Hankel matrix F has finite rank. Then the size r of the smallest WFA A such that f_A = f satisfies r = rank(F).
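A tiny numeric illustration of the Carlyle-Paz theorem (the target function and the length bound are assumptions made only for this example): for f(w) = number of 'a' symbols in w, which a 2-state WFA computes, even a small finite block of the Hankel matrix already has rank 2.

```python
import numpy as np
from itertools import product

# Hypothetical target power series: f(w) = number of 'a's in w.
def f(w):
    return w.count("a")

# Row/column indices: all strings over {a, b} up to length 2.
sigma = ["a", "b"]
strings = [""] + ["".join(t) for n in (1, 2) for t in product(sigma, repeat=n)]

# Finite block of the Hankel matrix: entry (x, y) holds f(x.y).
F = np.array([[f(x + y) for y in strings] for x in strings], dtype=float)

# Its rank lower-bounds (and here equals) the size of the smallest WFA for f.
print(np.linalg.matrix_rank(F))  # -> 2
```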
Learning WFA
• Let f be a target function. The learning algorithm may ask the oracle two types of queries:
• EQ(h): if h is equivalent to f on all input assignments then the answer to the query is yes; otherwise, the answer is no and the algorithm receives a counterexample z (a string with h(z) ≠ f(z)).
• MQ(z): the oracle has to return f(z).
• The algorithm learns a function f using its Hankel matrix, F. Because of the theorem above, it is enough to keep a sub-matrix of F of full rank. Therefore the learning algorithm can be viewed as a search for appropriate r rows and r columns.
Learning WFA
• The algorithm
1. Initialize: X = Y = {λ} (one row and one column of the Hankel matrix).
2. Define a hypothesis h:
• Let F(X,Y) be the |X| × |Y| submatrix with entries f(x.y), obtained with MQ.
• For every σ ∈ Σ, define a matrix F_σ(X,Y) with entries f(x.σ.y), and let μ(σ) be the matrix such that μ(σ)·F(X,Y) = F_σ(X,Y).
• For every w = σ1…σk, define h(w) = e_λ·μ(σ1)⋯μ(σk)·γ, where e_λ is the coordinate (row) vector of the row λ and γ is the column of F(X,Y) indexed by λ.
3. Ask an equivalence query EQ(h).
• If the answer is yes, halt and output h.
• Otherwise, the answer is no and we receive a counterexample z.
4. Using MQ, find a string w.σ, prefix of z, such that (writing α_w = e_λ·μ(w))
(a) f(w.y) = Σ_x α_w(x)·f(x.y) for every y in Y, and
(b) f(w.σ.y) ≠ Σ_x α_w(x)·f(x.σ.y) for some y in Y.
Add w to X and σ.y to Y (this raises the rank of the kept submatrix by one) and go to (2).
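A minimal sketch of step (2), assuming membership queries are available as a function mq(w) returning f(w), that the empty string is the first element of both X and Y, and that the current submatrix is invertible over the reals; the variable names are illustrative and not taken from the slides.

```python
import numpy as np

def build_hypothesis(mq, X, Y, sigma):
    """Hypothesis WFA built from the rows X and columns Y of the Hankel matrix."""
    F_hat = np.array([[mq(x + y) for y in Y] for x in X], dtype=float)
    F_inv = np.linalg.inv(F_hat)                  # assumes the kept submatrix has full rank
    # One transition matrix per letter, chosen so that mu[s] . F_hat = [f(x.s.y)]_{x,y}.
    mu = {s: np.array([[mq(x + s + y) for y in Y] for x in X], dtype=float) @ F_inv
          for s in sigma}
    gamma = np.array([mq(x) for x in X], dtype=float)   # column of F_hat indexed by lambda

    def h(w):
        alpha = np.zeros(len(X)); alpha[0] = 1.0  # start in the row of lambda (X[0] = "")
        for s in w:
            alpha = alpha @ mu[s]
        return float(alpha @ gamma)

    return h
```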
How can we extend CQ? • PAC learning of DFA with CQ (√) • Learning CFL with CQ (√) • Learning WFA with CQ (√) • Redefining the correcting string (?)
Redefining the correcting string • Hamming distance (only for strings of the same length). For two strings s and t, H(s, t) is the number of places in which the two strings differ, i.e., have different characters.
(The following slides show example DFAs over the alphabet {0, 1}, with states q0 through q3, illustrating how correcting strings change when corrections are measured by the Hamming distance; the diagrams are not reproduced here.)
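A small helper (illustrative, not from the slides) that computes the Hamming distance directly from the definition:

```python
def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance is defined only for strings of equal length")
    return sum(a != b for a, b in zip(s, t))

print(hamming("0110", "0011"))  # -> 2
```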
Redefining the correcting string
• Levenshtein (or edit) distance. It also counts positions where one string has a character and the other does not.
• For two characters a and b, define cost(a, b) = 0 if a = b and cost(a, b) = 1 otherwise.
• Assume we are given two strings s and t of length n and m, respectively. We fill an (n+1)×(m+1) array d, with rows indexed 0..n and columns 0..m, such that the lower right element d(n, m) furnishes the required value of the Levenshtein distance Lev(s, t).
• The definition of the entries of d is recursive.
• First set d(i, 0) = i for 0 ≤ i ≤ n and d(0, j) = j for 0 ≤ j ≤ m.
• For other pairs i, j use d(i, j) = min( d(i-1, j) + 1, d(i, j-1) + 1, d(i-1, j-1) + cost(s_i, t_j) ).
(Again, example DFAs over {0, 1} with states q0 through q2 illustrate the corrections obtained under the Levenshtein distance; the diagrams are not reproduced here.)
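The recurrence defined above translates directly into a small dynamic program (a sketch, not from the slides; indices are 0-based, so the answer sits in d[n][m]):

```python
def levenshtein(s, t):
    """Levenshtein (edit) distance between strings s and t."""
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i                    # delete the first i characters of s
    for j in range(m + 1):
        d[0][j] = j                    # insert the first j characters of t
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution (or match)
    return d[n][m]

print(levenshtein("kitten", "sitting"))  # -> 3
```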
How can we extend CQ? • PAC learning of DFA with CQ (√) • Learning CFL with CQ (√) • Learning WFA with CQ (√) • Redefining the correcting string (?)
References
• D. Angluin. Learning Regular Sets from Queries and Counterexamples. Information and Computation 75, 87-106 (1987)
• L. Lee. Learning of Context-Free Languages: A Survey of the Literature. Harvard University Technical Report TR-12-1996 (written in 1994)
• C. de la Higuera. Learning Stochastic Finite Automata from Experts. In Proceedings of the 4th International Colloquium on Grammatical Inference, Lecture Notes in Computer Science 1433, 79-89 (1998)
• F. Bergadano, N. Bshouty, A. Beimel, E. Kushilevitz and S. Varricchio. Learning Functions Represented as Multiplicity Automata. Journal of the ACM 47, 506-530 (2000)
• http://www.cut-the-knot.org/do_you_know/Strings.shtml