Text Mining: Main Steps
• Obtain unstructured text
  External: social media, Twitter, web, blogs, patents, reviews
  Internal: records, emails, surveys, notebooks
• Convert to structured data: document-term matrix or term-document matrix
• Analyze & DataViz: frequency, associations, sentiment, categorization
Some analysis can be done without a document-term matrix.
Unstructured to Structured: Term-Document Matrix
“Document” can refer to one word, a phrase, a title, a tweet, an email, a paragraph, a page, etc.
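To make the conversion from unstructured to structured concrete, here is a minimal Python sketch of a term-document matrix in which each row is a term and each column is a document. The three sample documents, the simple regex tokenizer, and the function names are illustrative assumptions, not part of the original slides.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_document_matrix(documents):
    """Return (terms, matrix), where matrix[i][j] counts term i in document j."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    terms = sorted(set().union(*counts))  # every distinct term, alphabetically
    matrix = [[c[term] for c in counts] for term in terms]
    return terms, matrix


# Made-up sample "documents"; each could equally be a tweet, an email, or a page.
docs = ["Cold weather today", "The weather is cold, very cold", "Warm coffee"]
terms, tdm = term_document_matrix(docs)
for term, row in zip(terms, tdm):
    print(f"{term:<8}", row)
```

Transposing the rows and columns gives the document-term matrix instead; both hold the same counts.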
Unstructured Analytics
Various online tools exist for some text-based analysis and dataviz:
• Word frequency
• Sentiment analysis
Word Frequency
http://www.textfixer.com/tools/online-word-counter.php
Similar Results
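The kind of count an online word counter produces can be approximated locally. The sketch below is one simple way to do it in Python; the sample text, the stopword set, and the `word_frequencies` helper are assumptions made for illustration, and the online tool may tokenize differently.

```python
from collections import Counter
import re


def word_frequencies(text, stopwords=frozenset()):
    """Count how often each word occurs, ignoring case and any stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)


# Made-up sample text standing in for pasted tweets.
sample = "Cold cold weather. The weather is cold!"
freq = word_frequencies(sample, stopwords={"the", "is"})
for word, count in freq.most_common():
    print(word, count)
```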
http://www.wordle.net/
http://www.wordle.net/advanced
Entering all the cold weather tweets, not surprisingly, has those words outweighing everything else.
Alternatively, obtain word frequencies and enter them into the Advanced section. Here the Online Word Counter data was pasted into Excel and =CONCATENATE(A1,":",B1) was used to join each word and its count.
• Limit the number of words by using “Layout” and “Maximum Words”.
• Remove words by right-clicking on them and using the resulting popup menu. This will re-lay out the Wordle without the selected word.
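The Excel concatenation step can also be sketched in Python. The example below joins each word and its count with a colon, mirroring the formula above, and mimics the two bullet points by capping the number of words and dropping unwanted ones. The counts, the `wordle_lines` helper, and its parameters are made up for illustration; the exact input format the Advanced page accepts is assumed from the slide, not verified.

```python
def wordle_lines(frequencies, max_words=None, exclude=()):
    """Format word counts as 'word:count' lines, most frequent first."""
    excluded = {w.lower() for w in exclude}
    kept = [(w, n) for w, n in frequencies.items() if w.lower() not in excluded]
    # Sort by descending count, then alphabetically so ties are stable.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    if max_words is not None:
        kept = kept[:max_words]
    return [f"{word}:{count}" for word, count in kept]


# Hypothetical counts, as might come from a word-frequency tool.
counts = {"cold": 12, "weather": 9, "snow": 4, "ice": 4, "today": 2}
lines = wordle_lines(counts, max_words=3, exclude=["cold"])
print("\n".join(lines))
```

Excluding a dominant word before formatting serves the same purpose as removing it in the Wordle popup menu: the remaining words get room to show their relative weights.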