Introduction to NLTK Text Analytics Giuseppe Attardi Università di Pisa
Installing NLTK • Download and install: • http://nltk.org/install.html • Download NLTK data:
>>> import nltk
>>> nltk.download()
Jupyter Notebook • Register with your UniPi credentials to activate your free G Suite account at: • this page. • Start your Jupyter Notebook here: • https://attardi-4.di.unipi.it:8000/
NLTK • Suite of classes for several NLP tasks • Parsing, POS tagging, classifiers… • Several text processing utilities, corpora • Brown, Penn Treebank corpus… • Your data was divided into sentences using ‘punkt’
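As a rough illustration of the sentence splitting mentioned above, here is a minimal sketch. It assumes an untrained `PunktSentenceTokenizer` instance so that no data download is needed. The pretrained 'punkt' model obtained through `nltk.download()` is what the slide refers to, and it handles abbreviations far better than this untrained version. The sample text is made up for the example.

```python
from nltk.tokenize.punkt import PunktSentenceTokenizer

# Made-up sample text (not from the course material).
text = "NLTK splits text into sentences. Punkt finds the boundaries."

# Assumption: an untrained tokenizer with default parameters, used here
# only to avoid downloading the pretrained 'punkt' model.
tokenizer = PunktSentenceTokenizer()
sentences = tokenizer.tokenize(text)

for sentence in sentences:
    print(sentence)
```

With the pretrained model installed, `nltk.sent_tokenize(text)` is the usual shortcut for the same step.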
NLTK • Text material • Raw text • Annotated Text • Tools • Part of speech taggers • Semantic analysis • Resources • WordNet, Treebanks
Linguistic Tasks • Part of Speech Tagging • Parsing • WordNet • Named Entity Recognition • Information Retrieval • Sentiment Analysis • Document Clustering • Topic Segmentation • Classification • Authoring • Machine Translation • Summarization • Information Extraction • Spoken Dialog Systems • Natural Language Generation • Word Sense Disambiguation
‘import nltk’ • You will need to import the necessary modules to create objects and call member functions • import ~ include objects from pre-built packages • FreqDist, ConditionalFreqDist are in nltk.probability • PlaintextCorpusReader is in nltk.corpus
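The imports named on this slide can be tried out with a small sketch like the one below. The toy sentence, the temporary directory, and the file name `sample.txt` are all assumptions made for illustration. A plain whitespace split is used so that no downloaded NLTK data is required.

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader
from nltk.probability import ConditionalFreqDist, FreqDist

# Toy input; a whitespace split avoids needing any downloaded NLTK data.
tokens = "the cat sat on the mat and the dog sat too".split()

# FreqDist counts how often each token occurs.
fdist = FreqDist(tokens)
top_word, top_count = fdist.most_common(1)[0]

# ConditionalFreqDist takes (condition, sample) pairs; here the condition
# is the word length and the sample is the word itself.
cfd = ConditionalFreqDist((len(w), w) for w in tokens)
three_letter_the = cfd[3]["the"]

# PlaintextCorpusReader exposes a directory of text files as a corpus.
# The directory and file below are hypothetical and created on the fly.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "sample.txt"), "w", encoding="utf-8") as f:
        f.write("NLTK reads plain text files.")
    corpus = PlaintextCorpusReader(root, r".*\.txt")
    file_ids = corpus.fileids()
    # words() uses the reader's default word tokenizer.
    corpus_words = list(corpus.words("sample.txt"))

print(top_word, top_count)
print(three_letter_the)
print(file_ids)
print(corpus_words)
```

Importing the classes by name, as here, keeps later calls short; `nltk.FreqDist` and `nltk.ConditionalFreqDist` are also reachable after a plain `import nltk`.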
Basic NLTK usage • Load the notebook ‘Intro to NLTK’ using: • File > Open > Text Analytics > Intro to NLTK • Explore the examples by advancing through them with the button ►
Exercise 1. • Run examples from Chapter 1 of NLTK book: • http://nltk.googlecode.com/svn/trunk/doc/book/ch01.html
Exercise 2. • Run examples from Chapter 3 of NLTK book • http://nltk.googlecode.com/svn/trunk/doc/book/ch03.html