
Grammatical inference: an introduction


Presentation Transcript


  1. Grammatical inference: an introduction. Colin de la Higuera, University of Nantes

  2. Nantes @wikipedia

  3. Acknowledgements • Pieter Adriaans, Hasan Ibne Akram, Anne-Muriel Arigon, Leo Becerra-Bonache, Cristina Bibire, Alex Clark, Rafael Carrasco, Paco Casacuberta, Pierre Dupont, Rémi Eyraud, Philippe Ezequel, Henning Fernau, Jeffrey Heinz, Jean-Christophe Janodet, Satoshi Kobayachi, Laurent Miclet, Thierry Murgue, Tim Oates, Jose Oncina, Frédéric Tantini, Franck Thollard, Sicco Verwer, Enrique Vidal, Menno van Zaanen,... http://pagesperso.lina.univ-nantes.fr/~cdlh/ http://videolectures.net/colin_de_la_higuera/

  4. Practical information • Grammatical Inference is module X9IT050 • 18 hours • http://pagesperso.lina.univ-nantes.fr/~cdlh/X9IT050.html • Exam: to be decided

  5. Some useful links • The Grammatical Inference Software Repository https://logiciels.lina.univ-nantes.fr/redmine/projects/gisr/wiki • Talks on http://videolectures.net • A book • Articles • Start here: http://pagesperso.lina.univ-nantes.fr/~cdlh/X9IT050.html

  6. What I plan to talk about • 11/9/2013 An introduction to grammatical inference. About what learning a language means and how we can measure success • 18/9/2013 An introduction to grammatical inference. A motivating example • 25/9/2013 Learning: identifying or approximating? • 2/10/2013 Learning from text • 9/10/2013 Learning from text: the window languages • 16/10/2013 Learning from an informant: the RPNI algorithm and variants • 23/10/2013 Learning distributions: why? How should we measure success? About distances between distributions • 6/11/2013 Learning distributions: learning the weights given a structure. EM, Gibbs sampling and the spectral methods • 13/11/2013 Learning distributions: state merging techniques • 20/11/2013 Active learning 1: about active learning • 27/11/2013 Active learning 2: the MAT algorithm • 4/12/2013 Learning transducers • 11/12/2013 Learning probabilistic transducers • 18/12/2013 Exam

  7. Outline (of this first talk) • What is grammatical inference about? • Why is it a difficult task? • Why is it a useful task? • Validation issues • Some criteria

  8. 1 Grammatical inference is about learning a grammar given information about a language • Information is strings, trees or graphs • Information can (typically) be • Text: only positive information • Informant: labelled data • Actively sought (query learning, teaching) The above lists are not exhaustive
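To make the two presentation modes concrete, here is a small Python sketch (not from the slides; the target language and the samples are invented for illustration): a text presentation supplies positive strings only, while an informant labels every string.

```python
# Invented example: the target language L is the set of strings over {a, b}
# with an even number of a's.

# Text presentation: positive examples only.
text_sample = ["", "b", "aa", "aba", "baab"]

# Informant presentation: every string comes with a label (in L / not in L).
informant_sample = [("", True), ("a", False), ("aa", True),
                    ("ab", False), ("bab", False)]

def in_target(w: str) -> bool:
    """Membership oracle for the (hypothetical) target language."""
    return w.count("a") % 2 == 0

# Sanity check: the data is consistent with the target.
assert all(in_target(w) for w in text_sample)
assert all(in_target(w) == label for w, label in informant_sample)
```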

  9. The functions/goals • Languages and grammars from the Chomsky hierarchy • Probabilistic automata and context-free grammars • Hidden Markov Models • Patterns • Transducers

  10. The Chomsky hierarchy: Recursively enumerable languages ⊃ Context-sensitive languages ⊃ Context-free languages ⊃ Regular languages

  11. The Chomsky hierarchy revisited • Regular languages • Recognized by DFA, NFA • Generated by regular grammars • Described by regular expressions • Context-free languages • Generated by CF grammars • Recognized by stack (pushdown) automata • Context-sensitive languages • CS grammars (parsing is not in P) • Turing machines • Parsing is undecidable
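As an illustration of the first level, here is a minimal DFA membership check (an invented example, not from the slides): the automaton accepts strings over {a, b} with an even number of a's, which is the kind of object a regular-language learner typically outputs.

```python
# Minimal DFA sketch: states, an initial state, accepting states, and a
# deterministic transition function delta.

dfa = {
    "start": "even",
    "accepting": {"even"},
    "delta": {
        ("even", "a"): "odd", ("even", "b"): "even",
        ("odd", "a"): "even", ("odd", "b"): "odd",
    },
}

def accepts(dfa, w: str) -> bool:
    """Run the DFA on w; determinism means there is exactly one path."""
    q = dfa["start"]
    for symbol in w:
        q = dfa["delta"][(q, symbol)]
    return q in dfa["accepting"]

print(accepts(dfa, "abab"))  # True: two a's
print(accepts(dfa, "ab"))    # False: one a
```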

  12. Other formalisms • Topological formalisms • Semilinear languages • Hyperplanes • Balls of strings

  13. Distributions of strings • A probabilistic automaton defines a distribution over the strings
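A hedged sketch of what this means, with an invented two-state automaton: each state carries transition probabilities and a stopping probability that sum to one, and the probability of a string is the sum over all generating paths (the forward computation).

```python
from collections import defaultdict

# trans[state][symbol] = list of (next_state, probability).
# Per state, outgoing probabilities plus the stopping probability sum to 1.
trans = {
    0: {"a": [(0, 0.3)], "b": [(1, 0.5)]},
    1: {"a": [(1, 0.2)], "b": []},
}
stop = {0: 0.2, 1: 0.8}   # probability of stopping in each state
start = 0

def prob(w: str) -> float:
    """Probability of generating w: sum over paths (forward algorithm)."""
    alpha = defaultdict(float)
    alpha[start] = 1.0
    for symbol in w:
        new_alpha = defaultdict(float)
        for q, p in alpha.items():
            for q2, t in trans[q].get(symbol, []):
                new_alpha[q2] += p * t
        alpha = new_alpha
    return sum(p * stop[q] for q, p in alpha.items())

print(prob("b"))    # 0.5 * 0.8 = 0.4
print(prob("ab"))   # 0.3 * 0.5 * 0.8 = 0.12
```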

  14. Fuzzy automata • An automaton will say that string w belongs to the language with probability p • The difference with probabilistic automata is that • The total sum of probabilities may be different from 1 (may even be infinite) • The fuzzy automaton cannot be used as a generator of strings
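For contrast, a sketch of a fuzzy automaton under the common max-min semantics (one of several possible choices; the automaton below is invented): weights are membership degrees, they need not sum to one per state, and the automaton grades strings rather than generating them.

```python
# Invented fuzzy automaton: weights are degrees in [0, 1] and do not
# normalize per state, so this is an acceptor with degrees, not a generator.
trans = {
    (0, "a"): [(0, 0.9), (1, 0.6)],
    (1, "b"): [(1, 0.7)],
}
final = {1: 1.0}   # degree of acceptance per state
start = 0

def degree(w: str) -> float:
    """Max over paths of the min weight along the path and at the end."""
    paths = {start: 1.0}
    for symbol in w:
        nxt = {}
        for q, d in paths.items():
            for q2, wgt in trans.get((q, symbol), []):
                nxt[q2] = max(nxt.get(q2, 0.0), min(d, wgt))
        paths = nxt
    return max((min(d, final.get(q, 0.0)) for q, d in paths.items()),
               default=0.0)

print(degree("ab"))   # min(0.6, 0.7, 1.0) = 0.6
```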

  15. The data: examples of strings A string in Gaelic and its translation to English: • Tha thu cho duaichnidh ri èarr àirde de a’ coisich deas damh • You are as ugly as the north end of a southward traveling ox

  16. http://www.flickr.com/photos/popfossa/3992549630/ • Time series pose the problem of the alphabet: • An infinite alphabet? • Discretizing? • An ordered alphabet
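One possible answer to the discretization question, as a sketch (equal-width binning is assumed here for simplicity; real pipelines often use quantile or SAX-style breakpoints): map each real value to a symbol of a small ordered alphabet, turning the series into a string.

```python
def discretize(series, alphabet="abcd"):
    """Map a real-valued series to a string over an ordered alphabet
    using equal-width bins between the series min and max."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / len(alphabet) or 1.0   # guard constant series
    symbols = []
    for x in series:
        i = min(int((x - lo) / width), len(alphabet) - 1)
        symbols.append(alphabet[i])
    return "".join(symbols)

print(discretize([0.1, 0.4, 0.35, 0.8, 0.95]))  # "abbdd"
```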

  17. Giorgio Bernardi, Regina Goursot, Edda Rayko, René Goursot, Baya Cherif-Zahar, and Roberta Melis http://www.scopenvironment.org/downloadpubs/scope44/chapter05.html

  18. >A BAC=41M14 LIBRARY=CITB_978_SKB AAGCTTATTCAATAGTTTATTAAACAGCTTCTTAAATAGGATATAAGGCAGTGCCATGTA GTGGATAAAAGTAATAATCATTATAATATTAAGAACTAATACATACTGAACACTTTCAAT GGCACTTTACATGCACGGTCCCTTTAATCCTGAAAAAATGCTATTGCCATCTTTATTTCA GAGACCAGGGTGCTAAGGCTTGAGAGTGAAGCCACTTTCCCCAAGCTCACACAGCAAAGA CACGGGGACACCAGGACTCCATCTACTGCAGGTTGTCTGACTGGGAACCCCCATGCACCT GGCAGGTGACAGAAATAGGAGGCATGTGCTGGGTTTGGAAGAGACACCTGGTGGGAGAGG GCCCTGTGGAGCCAGATGGGGCTGAAAACAAATGTTGAATGCAAGAAAAGTCGAGTTCCA GGGGCATTACATGCAGCAGGATATGCTTTTTAGAAAAAGTCCAAAAACACTAAACTTCAA CAATATGTTCTTTTGGCTTGCATTTGTGTATAACCGTAATTAAAAAGCAAGGGGACAACA CACAGTAGATTCAGGATAGGGGTCCCCTCTAGAAAGAAGGAGAAGGGGCAGGAGACAGGA TGGGGAGGAGCACATAAGTAGATGTAAATTGCTGCTAATTTTTCTAGTCCTTGGTTTGAA TGATAGGTTCATCAAGGGTCCATTACAAAAACATGTGTTAAGTTTTTTAAAAATATAATA AAGGAGCCAGGTGTAGTTTGTCTTGAACCACAGTTATGAAAAAAATTCCAACTTTGTGCA TCCAAGGACCAGATTTTTTTTAAAATAAAGGATAAAAGGAATAAGAAATGAACAGCCAAG TATTCACTATCAAATTTGAGGAATAATAGCCTGGCCAACATGGTGAAACTCCATCTCTAC TAAAAATACAAAAATTAGCCAGGTGTGGTGGCTCATGCCTGTAGTCCCAGCTACTTGCGA GGCTGAGGCAGGCTGAGAATCTCTTGAACCCAGGAAGTAGAGGTTGCAGTAGGCCAAGAT GGCGCCACTGCACTCCAGCCTGGGTGACAGAGCAAGACCCTATGTCCAAAAAAAAAAAAA AAAAAAAGGAAAAGAAAAAGAAAGAAAACAGTGTATATATAGTATATAGCTGAAGCTCCC TGTGTACCCATCCCCAATTCCATTTCCCTTTTTTGTCCCAGAGAACACCCCATTCCTGAC TAGTGTTTTATGTTCCTTTGCTTCTCTTTTTAAAAACTTCAATGCACACATATGCATCCA TGAACAACAGATAGTGGTTTTTGCATGACCTGAAACATTAATGAAATTGTATGATTCTAT

  19. http://bandelestudio.com/tutoriel-mao-sur-la-creation-musicale/

  20. http://fr.wikipedia.org/wiki/Philippe_VI_de_France

  21. <book> <part> <chapter> <sect1/> <sect1> <orderedlist numeration="arabic"> <listitem/> <f:fragbody/> </orderedlist> </sect1> </chapter> </part> </book>

  22. <?xml version="1.0"?><?xml-stylesheet href="carmen.xsl" type="text/xsl"?><?cocoon-process type="xslt"?> <!DOCTYPE pagina [<!ELEMENT pagina (titulus?, poema)><!ELEMENT titulus (#PCDATA)><!ELEMENT auctor (praenomen, cognomen, nomen)><!ELEMENT praenomen (#PCDATA)><!ELEMENT nomen (#PCDATA)><!ELEMENT cognomen (#PCDATA)><!ELEMENT poema (versus+)><!ELEMENT versus (#PCDATA)>]> <pagina><titulus>Catullus II</titulus><auctor><praenomen>Gaius</praenomen><nomen>Valerius</nomen><cognomen>Catullus</cognomen></auctor>

  23. And also • Business processes • Bird songs • Images (contours and shapes) • Robot moves • Web services • Malware • …

  24. 2 What does learning mean? • Suppose we write a program that can learn grammars… are we done? • A first question is: « why bother? » • If my programme works, why do something more about it? • Why should we do something when other researchers in Machine Learning are not?

  25. Motivating reflection #1 • Is 17 a random number? • Is 0110110110110101011000111101 a random sequence? (Is grammar G the correct grammar for a given sample S?)

  26. Motivating reflection #2 • In the case of languages, learning is an ongoing process • Is there a moment where we can say we have learnt a language?

  27. Motivating reflection #3 • Statement “I have learnt” does not make sense • Statement “I am learning” makes sense • At least when learning over infinite spaces

  28. What usually is called “having learnt” • That the grammar / automaton is the smallest, or the best with respect to some score → a combinatorial characterisation • That some optimisation problem has been solved • That the “learning” algorithm has converged (EM)

  29. What is not said • That, having solved some complex combinatorial question, we have an Occam / compression / MDL / Kolmogorov-complexity-style argument giving us some guarantee with respect to the future • Computational learning theory does have such results

  30. Why should we bother and those working in statistical machine learning not? • Whether with numerical functions or with symbolic functions, we are all trying to do some sort of optimisation • The difference is (perhaps) that numerical optimisation works much better than combinatorial optimisation! • [they actually do bother, only differently]
