NLTK & Python Day 8
LING 681.02 Computational Linguistics
Harry Howard, Tulane University
Course organization
• NLTK should be installed on the computers in this room!
LING 681.02, Prof. Howard, Tulane University
NLPP §2 Accessing text corpora and lexical resources §2.2 Conditional frequency
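As a reminder of what a conditional frequency distribution counts, here is a minimal sketch in plain Python. It does not use NLTK's own class; the helper `conditional_freq` and the (genre, word) pairs are hypothetical, chosen only to show the idea of one frequency count per condition.

```python
from collections import Counter, defaultdict


def conditional_freq(pairs):
    """Count how often each sample occurs under each condition."""
    cfd = defaultdict(Counter)
    for condition, sample in pairs:
        cfd[condition][sample] += 1
    return cfd


# Hypothetical (condition, sample) pairs standing in for a categorized corpus.
pairs = [
    ("news", "the"), ("news", "could"), ("news", "the"),
    ("romance", "could"), ("romance", "could"), ("romance", "the"),
]

cfd = conditional_freq(pairs)
print(cfd["news"]["the"])             # count of "the" under condition "news"
print(cfd["romance"].most_common(1))  # most frequent sample under "romance"
```

Each condition gets its own counter, so the same word can be compared across conditions.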
Practice
• Do "Your Turn" up to p. 55
• Exercises 2.8.2-4, 2.8.8
NLPP §2 Accessing text corpora and lexical resources §2.3 More Python: Reusing code
Creating a program with a text editor
• Create the monty.py program.
Other IDEs
• Eclipse (Java Dev) + PyDev plugin
    • http://www.eclipse.org/downloads/
    • Mac users should use the Cocoa version
    • http://pydev.org/index.html
• Xcode Tools now supports Python
    • It is part of the optional installation on the DVD.
    • You have to register as a developer to download it from http://developer.apple.com/
Functions
• What might you want to put in your program?
• Why, a function, of course!
• A function takes an input to produce an output or return value:
    >>> def my_function_name(my_inputs):
    ...     # calculate my_output
    ...     return my_output
    ...
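To make the template concrete, here is one possible function, offered as a sketch rather than the version from the book. The name `lexical_diversity` and the sample sentence are illustrative assumptions; the function takes a list of tokens as input and returns a single number.

```python
def lexical_diversity(tokens):
    """Return the ratio of distinct tokens to total tokens."""
    if not tokens:
        # Avoid dividing by zero on an empty text.
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical input: six tokens, five of them distinct.
tokens = "the cat sat on the mat".split()
print(lexical_diversity(tokens))
```

Once the function is defined, it can be called on any list of words, including the texts loaded from NLTK.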
Modules and higher
• As you accumulate functions, you will want to store them somewhere.
• Save them all in the same text file with the .py suffix, e.g. my_mod.py; such a file is called a module.
• Import them as needed:
    • from my_mod import my_function_name
• Hierarchy
    • function < module < package < library
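The save-then-import cycle can be tried in a single script. The sketch below writes a small my_mod.py into a temporary directory and imports from it. The body given to `my_function_name` is a placeholder assumption; in practice you would create the file in a text editor and keep it next to your program.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a tiny module to a temporary directory (placeholder contents).
module_dir = tempfile.mkdtemp()
Path(module_dir, "my_mod.py").write_text(
    "def my_function_name(my_inputs):\n"
    "    return len(my_inputs)\n"
)

# Make the directory importable, then import the function by name.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
from my_mod import my_function_name

print(my_function_name(["Monty", "Python"]))
```

The import statement only works if Python can find my_mod.py, which is why the directory is added to `sys.path` first.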
NLPP §2 Accessing text corpora and lexical resources §2.4 Lexical resources
Lexical resources
• What is a lexicon?
    • a collection of words and/or phrases, sometimes with additional information such as part of speech or meaning
• What is a lexical entry?
    • a headword/lemma, along with that other information:
        saw1 [verb] past tense of see
        saw2 [noun] cutting instrument
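One way to picture a lexicon in code is as a dictionary from headwords to lists of entries. The sketch below stores the two "saw" entries from the slide; the class name `LexicalEntry` and its field names are assumptions made for illustration, not an NLTK data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    headword: str  # the lemma, e.g. "saw"
    sense: int     # sense number, as in saw1 and saw2
    pos: str       # part of speech
    gloss: str     # short meaning


# The lexicon maps each headword to all of its entries.
lexicon = {}
for entry in (
    LexicalEntry("saw", 1, "verb", "past tense of see"),
    LexicalEntry("saw", 2, "noun", "cutting instrument"),
):
    lexicon.setdefault(entry.headword, []).append(entry)

for entry in lexicon["saw"]:
    print(f"{entry.headword}{entry.sense} [{entry.pos}] {entry.gloss}")
```

Keeping a list per headword lets homonyms such as the two senses of "saw" live under one key.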
More corpora
• Wordlist corpora
    • words
    • Names Corpus
        • Do ex. 2.8.8
• CMU Pronouncing Dictionary
    • Do ex. 2.8.12
• Comparative wordlists
    • Swadesh wordlist
• Shoebox/Toolbox
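A comparative wordlist pairs up words with the same meaning across languages, which makes it easy to build a simple lookup table. The sketch below uses three hand-typed French-English pairs as stand-ins; it does not load NLTK's Swadesh corpus, and the variable names are assumptions.

```python
# Hand-typed sample pairs, not loaded from NLTK's Swadesh corpus.
fr2en = [("chien", "dog"), ("eau", "water"), ("feu", "fire")]

# A dictionary turns the list of pairs into a French-to-English lookup.
translate = dict(fr2en)

# Swapping each pair gives the reverse direction.
en2fr = {en: fr for fr, en in fr2en}

print(translate["chien"])
print(en2fr["water"])
```

The same pattern applies to any list of word pairs, whatever the two languages are.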
NLPP §2 Accessing text corpora and lexical resources §2.5 WordNet
Semantic relations
• Synonym
    • Synonyms are grouped into synsets in WordNet
    • Look at code
    • Do "Your Turn"
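To see what grouping synonyms into synsets means without loading WordNet, here is a toy model in plain Python. The two synsets are typed in by hand and modelled on WordNet's entries for "car"; treat the data and the helper names `synsets_for` and `synonyms` as illustrative assumptions, not as NLTK's WordNet interface.

```python
# Toy data: each synset name maps to its lemma names (its synonyms).
synsets = {
    "car.n.01": ["car", "auto", "automobile", "machine", "motorcar"],
    "car.n.02": ["car", "railcar", "railway_car", "railroad_car"],
}


def synsets_for(word):
    """Return the names of all synsets that contain the word."""
    return [name for name, lemmas in synsets.items() if word in lemmas]


def synonyms(word):
    """Return every other lemma that shares a synset with the word."""
    found = set()
    for name in synsets_for(word):
        found.update(synsets[name])
    found.discard(word)
    return sorted(found)


print(synsets_for("motorcar"))
print(synsets_for("car"))
print(synonyms("motorcar"))
```

An ambiguous word such as "car" belongs to more than one synset, while "motorcar" belongs to only one.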
Next time
• Q/P2
• Do two of Ex. 2.8.16-19
• Start NLPP §3