Authors: Jochen Dörre, Peter Gerstl, Roland Seiffert. Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California, August 15-18, 1999, pp. 398-401. Presented by Xxxxxx. Text mining: Finding nuggets in mountains of textual data.
Outline • Motivation • Methodology • Feature Extraction • Clustering and Categorizing • Data Mining vs. Text Mining • Conclusion
Motivation • Problem: Most of the data in a company is unstructured or semi-structured • Examples: • Letters • Emails • Phone transcripts • Contracts
Definition and Application • Text mining: The discovery by computer of new, previously unknown information, by automatically extracting information from different written resources. • Applications: • Summarizing documents • Discovering/monitoring relations among people • Customer profile analysis • Trend analysis
Methodology • Aspect 1: Knowledge Discovery • Aspect 2: Information Distillation • Approaches: • Extraction • Analysis
Feature Extraction • Recognize and classify significant vocabulary items from the text • Categories of vocabulary • Proper names • Multiword terms • Abbreviations • Relations • Other useful things
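The vocabulary categories on the feature extraction slide can be illustrated with a small Python sketch. It is not the method used by the paper or by IBM's Intelligent Miner for Text. The regular expressions, the stopword list, the "appears at least twice" threshold, and the function name `extract_features` are all assumptions made for illustration, and relations are left out entirely.

```python
import re
from collections import Counter

# Illustrative heuristics only (assumptions, not the paper's algorithm).
ABBREVIATION = re.compile(r"\b[A-Z]{2,}\b")                       # e.g. "IBM"
PROPER_NAME = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")     # e.g. "Peter Gerstl"
WORD = re.compile(r"[a-z]+")
STOPWORDS = {"a", "an", "the", "of", "in", "at", "and", "to", "is"}  # hypothetical, tiny list


def extract_features(text, min_count=2):
    """Return a rough classification of significant vocabulary items in text."""
    abbreviations = sorted(set(ABBREVIATION.findall(text)))
    proper_names = sorted(set(PROPER_NAME.findall(text)))

    # Multiword terms: adjacent word pairs that recur within sentences
    # and contain no stopword.
    bigrams = Counter()
    for sentence in re.split(r"[.!?]", text):
        words = WORD.findall(sentence.lower())
        bigrams.update(zip(words, words[1:]))
    multiword_terms = sorted(
        " ".join(pair)
        for pair, count in bigrams.items()
        if count >= min_count and not (set(pair) & STOPWORDS)
    )

    return {
        "proper_names": proper_names,
        "abbreviations": abbreviations,
        "multiword_terms": multiword_terms,
    }


if __name__ == "__main__":
    sample = (
        "Peter Gerstl works at IBM. IBM builds text mining tools. "
        "Text mining helps Roland Seiffert."
    )
    for category, items in extract_features(sample).items():
        print(category, items)
```

Pattern matching like this breaks down quickly on real text. A capitalized word at the start of a sentence that is followed by a name, for example, would be absorbed into the name. It is meant only to make the categories concrete.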
Conclusion • Introduction of text mining • Differences between data mining and text mining • Overview of IBM’s Intelligent Miner for Text • The tools and methods used in the past