Martin Kay Stanford University

Presentation Transcript


  1. Ling 138/238: Introduction to Computational Linguistics. Martin Kay, Stanford University

  2.      30  Introduction
     Oct   1  Complexity; String search
           6  Knuth-Morris-Pratt; Boyer-Moore
           8  Suffix Trees
          13  Tagging; Alignment
          15
          20  Chomsky Hierarchy; Regular Expressions
          22
          27  Finite-state automata
          29

  3. Nov   3  Morphology
           5
          10  Context-free grammar
          12
          17  Unification, HPSG, LFG
          19
          24  Machine Translation
          26
     Dec   1  Summary; Wrap-up
           3

  4. Linguistics 138/238
     Martin Kay
     KAY@csli.stanford.edu
     740 3043
     Margaret Jacks 124
     Office hours: TuTh 4.15-5.45 p.m.

  5. Prerequisites and Expectations • No prerequisites • Classroom participation • Occasional readings • Learn Prolog • Laboratory sessions • Homework Problems • Project

  6. Project
     • Learn something new about language
     • Significant programming
     • Group work
     • Modifying or amplifying existing code
       An HMM-based tagger
       A searcher for tagged text
       Implementation of suffix trees
       Morphological analysis
       Named-entity recognition

  7. Intellectual Relations Relation to • Linguistics • Psychology • Artificial Intelligence • Computer Science Abstract Process

  8. Computational Linguistics as Science: Computing as Inspiration

  9. Ideas from Computing
     • Search
     • Divide and Conquer
     • Guides and Oracles
     • Nondeterminism
     • Dynamic Programming
     • Scheduling, agendas
     • Compilation
     • Unification
     • Automata Theory
     • Co-routining and parallelism
     • Top-down vs. bottom-up
     • Complexity

  10. Ideas from Computing: Search, Nondeterminism, Dynamic Programming

  11. A Maze. Keep your right hand on the wall.

  12. A Maze. Out! Backup! Backup! Backup!

  13. Nondeterminism
      • A process is nondeterministic if there are points in it where a choice must be made, but the information necessary to make the choice is not available.
      • Solution: Pick one of the alternatives. If it does not work out, come back and pick another one.
      • Note: the information required to make the choice was available after all!
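The "pick one, and come back if it fails" strategy on this slide can be sketched as a backtracking search through a maze. This is a minimal illustration, not code from the course: the grid `MAZE`, the helper `find`, and the order in which directions are tried are all assumptions made for the example.

```python
# Hypothetical maze: '#' is a wall, 'S' the start, 'E' the exit.
# The border is solid wall, so neighbour lookups never leave the grid.
MAZE = [
    "#######",
    "#S..#.#",
    "#.#.#.#",
    "#.#...#",
    "#.###.#",
    "#....E#",
    "#######",
]

# Order in which alternatives are tried: up, right, down, left (an arbitrary choice).
DIRECTIONS = ((-1, 0), (0, 1), (1, 0), (0, -1))


def find(maze, symbol):
    """Return the (row, column) of the first cell containing symbol."""
    for r, row in enumerate(maze):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(f"{symbol!r} not found in maze")


def solve(maze, pos, visited=None):
    """Return a list of cells from pos to the exit, or None if there is none."""
    if visited is None:
        visited = set()
    r, c = pos
    if maze[r][c] == "E":
        return [pos]
    visited.add(pos)
    for dr, dc in DIRECTIONS:
        nxt = (r + dr, c + dc)
        if maze[nxt[0]][nxt[1]] != "#" and nxt not in visited:
            rest = solve(maze, nxt, visited)   # pick one alternative
            if rest is not None:
                return [pos] + rest
            # that alternative did not work out: fall through and try the next
    return None                                # every alternative failed: back up


path = solve(MAZE, find(MAZE, "S"))
```

With this direction order the search runs into the dead end in the top-right corridor and has to back up before it reaches the exit, which is the behaviour the maze slides depict.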

  14. Dynamic Programming
            p  o  u  r
         f  1  2  3  4
         o  2  1  2  3
         r  3  2  2  2
      [Figure: map with distances between Chalons, Metz, Paris, Strasbourg, Mulhouse and Dijon; edge labels 192, 266, 161, 458, 619, 288, 234, 115, 620, 344, 276]
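The small letter table on slide 14 has the same numbers as the Levenshtein (edit-distance) table for "for" against "pour". That reading is an inference from the numbers, assuming a cost of 1 for each insertion, deletion and substitution; the slide itself does not name the algorithm. Under that assumption, a minimal sketch of the dynamic program is:

```python
def edit_distance_table(source, target):
    """Fill the full edit-distance table, including the empty-prefix row and column."""
    rows, cols = len(source) + 1, len(target) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i          # delete all i characters of source
    for j in range(cols):
        table[0][j] = j          # insert all j characters of target
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if source[i - 1] == target[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,                 # deletion
                table[i][j - 1] + 1,                 # insertion
                table[i - 1][j - 1] + substitution,  # match or substitution
            )
    return table


table = edit_distance_table("for", "pour")
# Drop the empty-prefix row and column to compare with the slide.
inner = [row[1:] for row in table[1:]]
for letter, row in zip("for", inner):
    print(letter, *row)
```

Each cell is computed once from three already-filled neighbours, so the work grows with the product of the two string lengths rather than with the number of possible edit sequences.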

  15. The CKY Chart
      people   np np np s s s
      like     prep pp pp v vp vp
      the      det np np
      French   adj n n n
      drink    n vp
      Context free: all phrases with the same
      • coverage, and
      • category
      enter into larger phrases as a single item.
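The point about coverage and category can be made concrete with a small CKY recognizer in which each chart cell is a set, so a category is stored once per span however many ways it can be derived. The grammar and lexicon below are hypothetical: they were written for this sketch around the sentence on the slide and are not the grammar behind the original chart.

```python
from itertools import product

# Hypothetical toy lexicon and binary rules (Chomsky normal form).
LEXICON = {
    "people": {"NP", "N"},
    "like": {"V", "P"},
    "the": {"Det"},
    "French": {"Adj", "N"},
    "drink": {"N", "VP"},
}
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("P", "NP"): {"PP"},
    ("NP", "PP"): {"NP"},
    ("Det", "N"): {"NP"},
    ("Adj", "N"): {"N"},
}


def cky_chart(words):
    """Map each span (start, end) to the set of categories that cover it."""
    n = len(words)
    chart = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(LEXICON.get(word, ()))
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            cell = set()
            for mid in range(start + 1, end):
                # Combine every category pair from the two sub-spans.
                for left, right in product(chart[(start, mid)], chart[(mid, end)]):
                    cell |= BINARY_RULES.get((left, right), set())
            chart[(start, end)] = cell   # one entry per category, however derived
    return chart


def recognizes(sentence):
    words = sentence.split()
    if not words:
        return False
    return "S" in cky_chart(words)[(0, len(words))]


chart = cky_chart("people like the French drink".split())
```

Under this toy grammar the whole sentence is an `S` in two ways ("people [like the French drink]" and "[people like the French] drink"), yet the top cell holds a single `S`, which is what lets the chart stay polynomial in size.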

  16. Ideas from Computing: Unification

  17. Unification
      Attribute     Report 1          Report 2           Combined Report
      eyes          blue              blue               blue
      hair          black or brown    brown or red       brown
      accent        Italian (one report only)            Italian
      wife          see below         see below          see below
      children      Ahmed & Angela    Rebecca & Angela   Ahmed, Angela & Rebecca
      age           middle            48                 middle
      Wife
      eyes          brown                                brown
      weight        247 lbs           112 kg             247 lbs
      disposition   surly (one report only)              surly

  18. Unification
      Attribute     Report 1          Report 2           Combined Report
      eyes          blue              blue               blue
      hair          black or brown    brown or red       brown
      accent        Italian (one report only)            Italian
      wife          see below         see below          see below
      children      Ahmed & Angela    Rebecca & Angela   Ahmed, Angela & Rebecca
      age           middle            48                 middle
      Wife
      eyes          brown             grey               FAIL
      weight        247 lbs           112 kg             247 lbs
      disposition   surly (one report only)              surly
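The way the two reports combine, and the way a single clash makes the whole combination fail, can be sketched as unification over nested attribute-value structures. This is a simplified illustration under stated assumptions: atoms are strings, an "or" value is a `frozenset` of alternatives, and `FAIL` is a sentinel chosen for the sketch. Rows that need outside knowledge (lbs against kg, "middle" against 48) or set union (children) are left out, and placing "accent" in the first report is arbitrary.

```python
FAIL = None  # sentinel for "these descriptions cannot describe the same thing"


def unify(a, b):
    """Combine two descriptions, or return FAIL if they are incompatible."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_value in b.items():
            if key in result:
                merged = unify(result[key], b_value)
                if merged is FAIL:
                    return FAIL          # one clash anywhere fails the whole structure
                result[key] = merged
            else:
                result[key] = b_value    # information present on one side only
        return result
    if isinstance(a, dict) or isinstance(b, dict):
        return FAIL                      # a structure never unifies with an atom
    # Atoms and disjunctions: treat an atom as a one-element set of alternatives.
    a_set = a if isinstance(a, frozenset) else frozenset([a])
    b_set = b if isinstance(b, frozenset) else frozenset([b])
    common = a_set & b_set
    if not common:
        return FAIL
    return next(iter(common)) if len(common) == 1 else common


report_1 = {
    "eyes": "blue",
    "hair": frozenset({"black", "brown"}),
    "accent": "Italian",
    "wife": {"eyes": "brown"},
}
report_2 = {
    "eyes": "blue",
    "hair": frozenset({"brown", "red"}),
    "wife": {},
}
combined = unify(report_1, report_2)

# Slide 18: the second report now says the wife's eyes are grey.
conflicting = dict(report_2, wife={"eyes": "grey"})
failed = unify(report_1, conflicting)
```

Here `combined` narrows "hair" to the one value both reports allow and carries over attributes mentioned by only one report, while `failed` is `FAIL` because of the single disagreement nested under "wife".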

  19. English Agreement
      The dog sleeps
      The dogs sleep
      The dog slept
      The dogs slept
      The sheep sleeps
      The sheep sleep
      The sheep slept
      The sheep that was in the barn slept
      The sheep that were in the barn slept

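  20. German Case
      Der Junge sah den Lehrer (The boy saw the teacher)
      Den Lehrer sah der Junge (The boy saw the teacher)
      Das Mädchen sah der Junge (The boy saw the girl)
      Der Junge sah das Mädchen (The boy saw the girl)
      Die Lehrerin sah den Lehrer (The woman teacher saw the man teacher)
      Die Lehrerin sah das Mädchen (ambiguous: The teacher saw the girl, or The girl saw the teacher)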

  21. Ideas from Computing: Finite-State Methods

  22. Finite-State Methods in Language Processing
      The application of a branch of mathematics
      • the regular branch of automata theory
      to a branch of computational linguistics in which what is crucial is (or can be reduced to)
      • properties of string sets and string relations
      with
      • a notion of bounded dependency

  23. Applications
      • Finite languages: dictionaries, compression
      • Phenomena involving bounded dependency: morphology, spelling, hyphenation, tokenization, morphological analysis, phonology
      • Approximations to phenomena involving mostly bounded dependency: syntax
      • Phenomena that can be translated into the realm of strings with bounded dependency: syntax

  24. Ideas from Computing: Complexity

  25. The Chomsky Hierarchy
      Grammar                Language                      Automaton
      Type 0                 Recursively enumerable sets   Turing machines
      Context-sensitive      Context-sensitive             Nondeterministic linear-space-bounded Turing machines
      Context-free           Context-free                  Nondeterministic push-down automata
      LR(k)                  Deterministic context-free    Deterministic push-down automata
      Regular expressions,   Regular sets                  Finite-state automata
      left (right) linear
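The bottom row of the hierarchy pairs regular expressions with finite-state automata. As a small illustration of that pairing, the sketch below writes one regular set both ways and checks that the two descriptions agree on every short string. The language (strings over a and b that end in "ab"), the transition table and the function names are all choices made for this example; only `re.compile`, `fullmatch` and `itertools.product` are standard-library calls.

```python
import re
from itertools import product

# Hypothetical example language: strings over {a, b} that end in "ab".
# State 0: no useful suffix seen; state 1: just saw "a"; state 2: just saw "ab".
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
START = 0
ACCEPTING = {2}

PATTERN = re.compile(r"[ab]*ab")


def dfa_accepts(string):
    """Run the deterministic automaton: one table lookup per symbol, no memory beyond the state."""
    state = START
    for symbol in string:
        if (state, symbol) not in TRANSITIONS:
            return False         # symbol outside the alphabet
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


def regex_accepts(string):
    return PATTERN.fullmatch(string) is not None


def agree_up_to(max_length):
    """Check that both descriptions accept exactly the same strings up to max_length."""
    for length in range(max_length + 1):
        for letters in product("ab", repeat=length):
            s = "".join(letters)
            if dfa_accepts(s) != regex_accepts(s):
                return False
    return True
```

Checking strings up to some length is evidence rather than a proof of equivalence, but it shows the practical upshot of the table row: the same set can be specified declaratively or run as a fixed-memory machine.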

  26. Computation and Psychology Sentence Processing

  27. Computational Linguistics as Engineering: Computing as Power

  28. Tools for Linguists • TLF, OED • Corpus Linguistics • Field Notes • Grammar Testing

  29. Translation • MT, Translator's Tools • Alignment, Dictionaries, Term Banks • Normalization and Tuning

  30. Other Applications • Writer's Tools • Spelling • Dictionary, Thesaurus • Grammar • Natural Language Interfaces • Information Storage and Retrieval

  31. CL & AI • Text, Meaning, and Interpretation [Diagram relating Text, Interpretation and Meaning; labels: Linguistics, ???]
