( Natural Language Processing Using Python: https://www.edureka.co/python-natural... )
This presentation gives a detailed overview of two important aspects of Natural Language Processing, i.e. stemming and lemmatization. It also covers the differences between the two, with a demo of each. The following topics are covered:
• Introduction to Big Data
• What is Text Mining?
• What is NLP?
• Introduction to Stemming
• Introduction to Lemmatization
• Applications of Stemming & Lemmatization
• Difference between Stemming & Lemmatization
edureka! Agenda
1. What is Natural Language Processing?
2. NLP Components
3. Stemming
4. Lemmatization
5. Applications of Stemming & Lemmatization
6. The Differences between the Two
The Human Language: 6500
The 21st Century [chart: percentage of unstructured vs. structured data]
What is Text Mining?
Text mining (text analytics) is the process of deriving meaningful information from natural language text.
Text Mining and NLP
Text mining refers to the process of deriving high-quality information from text. The overall goal is essentially to turn text into data for analysis through the application of Natural Language Processing (NLP).
What is NLP?
Natural Language Processing (NLP) is a part of computer science and artificial intelligence that deals with human languages.
NLP Components
• Stemming
• Tokenization
• Lemmatization
• POS Tags
• Named Entity Recognition
• Chunking
Stemming and Lemmatization (1960's)
miss
misses
missing
NLTK
Stemming
Stemming is the process of reducing inflected words to their "root" form, so that a group of related words maps to the same stem.
• Porter (1979)
• Lancaster (1990)
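To make the definition concrete, here is a minimal sketch that runs the three example words from the earlier slides through NLTK's `PorterStemmer`. It assumes NLTK is installed; the variable names are only illustrative.

```python
from nltk.stem import PorterStemmer

# The Porter stemmer needs no extra corpus downloads.
porter = PorterStemmer()

# Example words taken from the slides.
words = ["miss", "misses", "missing"]

# Map every word to its stem.
stems = {word: porter.stem(word) for word in words}

for word, stem in stems.items():
    print(f"{word} -> {stem}")
```

All three words should reduce to the same stem, miss, which is the "group of words to the same stem" idea in practice.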
Stemming: Porter (1979)
• Suffix stripping
• 5 rules
• Step by step
Stemming: Lancaster (1990)
• Paice-Husk stemmer
• Iterative algorithm
• Over-stemming may occur
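One way to see how the two stemmers differ is to run them side by side. The sketch below assumes NLTK is installed; the word list is a hypothetical sample, not taken from the slides, and no particular output is claimed for the Lancaster column. Inspect the printed table to spot cases where the Lancaster stemmer cuts more aggressively.

```python
from nltk.stem import LancasterStemmer, PorterStemmer

porter = PorterStemmer()
lancaster = LancasterStemmer()

# Hypothetical sample words chosen only for illustration.
words = ["miss", "misses", "missing", "maximum", "organization"]

# For each word, store the (Porter stem, Lancaster stem) pair.
comparison = {w: (porter.stem(w), lancaster.stem(w)) for w in words}

print(f"{'word':<15}{'Porter':<15}{'Lancaster':<15}")
for word, (p_stem, l_stem) in comparison.items():
    print(f"{word:<15}{p_stem:<15}{l_stem:<15}")
```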
Stemming a Document
Steps to stem a document:
1. Take a document as the input
2. Read the document line by line
3. Tokenize the line
4. Stem the words
5. Output the stemmed words
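The steps above can be sketched roughly as follows. This is an illustration under a few assumptions: the "document" is an in-memory `io.StringIO` object instead of a real file, tokenization is a simple regular expression rather than an NLTK tokenizer (which would need extra data downloads), and `stem_document` is a hypothetical helper name.

```python
import io
import re

from nltk.stem import PorterStemmer

# Simplified tokenizer (assumption): runs of ASCII letters count as words.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+")


def stem_document(lines, stemmer=None):
    """Return one list of stemmed tokens per input line."""
    if stemmer is None:
        stemmer = PorterStemmer()
    stemmed_lines = []
    for line in lines:                          # step 2: read line by line
        tokens = TOKEN_PATTERN.findall(line)    # step 3: tokenize the line
        stemmed_lines.append([stemmer.stem(t) for t in tokens])  # step 4: stem
    return stemmed_lines


# Step 1: take a document as input (an in-memory stand-in for a file).
document = io.StringIO("She misses the train.\nHe is missing the point.\n")

result = stem_document(document)

# Step 5: output the stemmed words.
for stems in result:
    print(" ".join(stems))
```

Any iterable of lines works as input, so an open file handle could be passed in place of the `StringIO` object.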
Other Stemmers
• Snowball Stemmers
1. Danish
2. Dutch
3. English
4. French
5. German
6. Hungarian
7. Italian
8. Norwegian
9. Porter
10. Portuguese
11. Romanian
12. Russian
13. Spanish
14. Swedish
• ISRI Stemmer
• RSLP Stemmer
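As a small illustration of the Snowball family, the sketch below lists the languages that the installed NLTK version supports and stems one English word. It assumes NLTK is installed; the exact language list may differ between NLTK versions and can be longer than the one on the slide.

```python
from nltk.stem import SnowballStemmer

# Tuple of language names the Snowball stemmer accepts in this NLTK version.
supported = SnowballStemmer.languages
print(supported)

# Create a stemmer for one language and stem a sample word.
english = SnowballStemmer("english")
stem = english.stem("running")
print(stem)
```

Passing another supported language name, such as "danish", to the constructor selects the stemmer for that language.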
Lemmatization
• Groups together the different inflected forms of a word under a common base form, called the lemma
• Somewhat similar to stemming, as it maps several words to one common root
• The output of lemmatization is a proper word
• For example, a lemmatizer should map gone, going and went to go
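The example above can be sketched with a tiny lookup table. This is a deliberately simplified illustration, not how a real lemmatizer is built: the table and the `lemmatize` helper are hypothetical and hand-written. In NLTK this role is usually played by `WordNetLemmatizer`, which requires the WordNet corpus to be downloaded first.

```python
# Hypothetical, hand-written lookup table covering only the slide's example.
LEMMA_TABLE = {
    "gone": "go",
    "going": "go",
    "went": "go",
}


def lemmatize(word, table=LEMMA_TABLE):
    """Return the lemma for a known word, or the word itself if unknown."""
    return table.get(word.lower(), word)


lemmas = [lemmatize(w) for w in ["gone", "going", "went"]]
print(lemmas)
```

Note that went maps to go even though the two share no suffix pattern, which is why lemmatization relies on a dictionary instead of suffix stripping.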
Applications of Stemming & Lemmatization
• Sentiment Analysis
• Document Clustering
• Information Retrieval
Stemming vs. Lemmatization
• Stemming: the output might not be an actual language word; follows predefined steps
• Lemmatization: the output is an actual language word; uses the WordNet corpus
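A short sketch of the first difference, assuming NLTK is installed. The stem comes from NLTK's `PorterStemmer`; the lemma is supplied by hand in a hypothetical one-entry dictionary to keep the example free of corpus downloads, so it stands in for what a WordNet-based lemmatizer would be expected to return.

```python
from nltk.stem import PorterStemmer

porter = PorterStemmer()

# Hypothetical hand-written lemma, standing in for a dictionary lookup.
HAND_WRITTEN_LEMMAS = {"studies": "study"}

word = "studies"
stem = porter.stem(word)            # produced by suffix-stripping rules
lemma = HAND_WRITTEN_LEMMAS[word]   # a proper dictionary word

print(f"word:  {word}")
print(f"stem:  {stem}")
print(f"lemma: {lemma}")
```

The stem here is studi, which is not an English word, while the lemma study is.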