
Text mining: Finding nuggets in mountains of textual data

Authors: Jochen Dörre, Peter Gerstl, Roland Seiffert. Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California, August 15-18, 1999, pp. 398-401. Presented by Xxxxxx.


Presentation Transcript


  1. Text mining: Finding nuggets in mountains of textual data • Authors: Jochen Dörre, Peter Gerstl, Roland Seiffert • Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California, August 15-18, 1999, pp. 398-401 • Presented by Xxxxxx

  2. Outline • Motivation • Methodology • Feature Extraction • Clustering and Categorizing • Data Mining VS Text Mining • Conclusion

  3. Motivation • Problem: Most of the data in a company is unstructured or semi-structured • Examples: • Letters • Emails • Phone transcripts • Contracts

  4. Definition and Application • Text mining: The discovery by computer of new, previously unknown information, by automatically extracting information from different written resources. • Applications: • Summarizing documents • Discovering/monitoring relations among people • Customer profile analysis • Trend analysis

  5. Methodology • Aspect 1: Knowledge Discovery • Aspect 2: Information Distillation • Approaches: • Extraction • Analysis

  6. Feature Extraction • Recognize and classify significant vocabulary items from the text • Categories of vocabulary • Proper names • Multiword terms • Abbreviations • Relations • Other useful things
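To make the feature extraction slide concrete, here is a minimal Python sketch of recognizing and classifying vocabulary items. It is not the method used by IBM's Intelligent Miner for Text, which the slides do not describe in detail. The two regular expressions, the function name `extract_features`, the category labels, and the sample sentence are all assumptions made for illustration, and the sketch only covers abbreviations and capitalized multiword candidates for proper names.

```python
import re

# Hypothetical patterns, for illustration only. A real extractor would rely on
# dictionaries and linguistic heuristics rather than two regular expressions.

# Whole words made of two or more uppercase letters, e.g. "ACM".
ABBREVIATION = re.compile(r"\b[A-Z]{2,}\b")

# Two or more consecutive capitalized words, e.g. "Peter Gerstl".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def extract_features(text):
    """Return candidate vocabulary items grouped by (assumed) category."""
    return {
        "abbreviations": sorted(set(ABBREVIATION.findall(text))),
        "name_candidates": sorted(set(CAPITALIZED_RUN.findall(text))),
    }


# Sample sentence invented for this sketch.
sample = "Peter Gerstl presented at the ACM SIGKDD conference in San Diego."
features = extract_features(sample)
print(features)
```

Pattern matching of this kind cannot tell a person from a place and will miss single-word names, which is why the slide lists classification alongside recognition.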

  7. Clustering Model

  8. Categorization Model

  9. Data Mining VS Text Mining

  10. Conclusion • Introduction of text mining • Differences between data mining and text mining • Overview of IBM’s Intelligent Miner for Text • The tools and methods used in the past
