This systematic framework identifies sentiment from user-generated content on social media by considering peer influence, user preference, profile, and textual analysis. The methodology combines collaborative filtering algorithms and state-of-the-art textual sentiment identification to classify sentences into positive, negative, or objective categories with an accuracy of ~86%. The study sheds light on user social effects often overlooked in sentiment analysis, providing valuable insights for intelligent marketing decisions.
A Systematic Framework for Sentiment Identification by Modeling User Social Effects Kunpeng Zhang Assistant Professor Department of Information and Decision Sciences University of Illinois at Chicago kzhang6@uic.edu
Agenda • Introduction • Problem statement • Methodology • Experiments and results • Conclusion and future work A World-Class Education, A World-Class City
Co-authors • Yi Yang, Ph.D. student at Northwestern University • Aaron Sun, Research Scientist, Samsung Research America • Hengchang Liu, Assistant Professor at University of Science and Technology of China
Introduction • User-generated content on social media platforms • Data analysis for intelligent marketing decisions • Voice of consumers • Positive / negative aspects
Problem Statement • Given a sentence (usually user-generated content on social media platforms, such as comments on Facebook, tweets on Twitter, reviews on Amazon.com, etc.), we classify it into one of three categories: • Positive: directly or indirectly praises something, e.g. “I love it! (^_^)” • Negative: directly or indirectly criticizes something, e.g. “We don’t like it at all.” • Objective: expresses no sentiment or states a fact, e.g. “Apple will release a new iPhone in the next two months.”
Previous Work • Bag-of-words approaches • Collecting keywords [5, 7, 21, 26] • Rule-based methods • From the perspective of language characteristics [6, 22] • Machine learning based methods • Sentence-level and document-level [7, 8, 10, 29] • However, • None of them considers user social effects…
Methodology • Systematic framework • Classification problem • 4 major features: • Peer influence • User preference • User profile • Textual sentiment
Methodology 1 – User Preference (UserPref) • User preference can partly reflect user sentiment. • Item-based collaborative filtering on the user-item matrix • Row: user (millions) • Column: brand (thousands) • The element $m_{ij}$ is 1 if user $i$ “likes” brand $j$, otherwise 0
$$M = \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1n} \\ m_{21} & m_{22} & \cdots & m_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ m_{m1} & m_{m2} & \cdots & m_{mn} \end{pmatrix}$$
Note: “like” means liking a brand on Facebook, following a brand on Twitter, giving a high rating to a product on Amazon, etc.
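As a rough illustration of the user-item matrix described on this slide, the sketch below builds a small binary matrix from (user, brand) "like" pairs. The function name `build_user_item_matrix` and the toy users, brands, and likes are hypothetical and not taken from the talk; a dense list of lists is used only for readability and would not scale to millions of users, where a sparse representation would be needed.

```python
def build_user_item_matrix(likes, users, brands):
    """Return a binary matrix: rows are users, columns are brands.

    Entry [i][j] is 1 if user i "likes" brand j, otherwise 0.
    """
    user_index = {user: i for i, user in enumerate(users)}
    brand_index = {brand: j for j, brand in enumerate(brands)}
    matrix = [[0] * len(brands) for _ in users]
    for user, brand in likes:
        matrix[user_index[user]][brand_index[brand]] = 1
    return matrix


# Hypothetical toy data (not from the talk's datasets).
users = ["u1", "u2", "u3"]
brands = ["Mac", "iPhone", "Nike"]
likes = [("u1", "Mac"), ("u1", "iPhone"), ("u2", "iPhone"), ("u3", "Nike")]

M = build_user_item_matrix(likes, users, brands)
```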
Methodology 1 – User Preference (UserPref) • Two important issues when using collaborative filtering • Data sparsity • Integrate multiple low-level items into fewer high-level items • “Mac” and “iPhone” → “Computer and Electronics” • Similarity calculation and preference prediction • Which similarity measure is better? • Cosine, Pearson correlation, Tanimoto correlation, log-likelihood based, Euclidean distance-based. • Weighted sum strategy to approximate user preference
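The similarity and prediction step can be sketched as follows for two of the listed measures, cosine and Tanimoto, applied to binary item columns. The slide does not give the exact weighted-sum formula, so the similarity-weighted average in `predict_preference` is an assumption (a common item-based formulation), and all function names and the toy matrix are hypothetical.

```python
import math


def column(matrix, j):
    """Extract item column j from a row-major user-item matrix."""
    return [row[j] for row in matrix]


def cosine(a, b):
    """Cosine similarity between two item vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient for binary item vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def predict_preference(matrix, user, item, similarity):
    """Assumed weighted sum: average of the user's values on the other
    items, weighted by each item's similarity to the target item."""
    target = column(matrix, item)
    num = den = 0.0
    for j in range(len(matrix[0])):
        if j == item:
            continue
        s = similarity(target, column(matrix, j))
        num += s * matrix[user][j]
        den += abs(s)
    return num / den if den else 0.0


# Hypothetical toy matrix: 4 users x 3 brands.
M = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
]
score = predict_preference(M, user=2, item=1, similarity=tanimoto)
```

In this toy matrix, user 2 likes brand 0, which is often liked together with brand 1, so the predicted preference for brand 1 comes out high; user 3 only likes brand 2, which shares no users with brand 1, so the prediction is zero.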
Methodology 2 – Peer Influence (PeerInf) • Herding behavior in social psychology. • We assume that if most of the previous comments in a discussion are positive, a new comment is likely to be positive as well, and similarly for the negative case. • We randomly picked 1,000 posts from 5 different Facebook pages and 1,000 discussion threads from 5 different airlines on the Flyertalk.com forum. The average number of comments per post and per thread is 794 and 32, respectively. • The sentiments are identified by the state-of-the-art textual algorithm.
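One simple way to turn the herding assumption into a feature is to compute the share of positive and negative labels among the comments that precede the one being classified. The sketch below is only an illustration of that idea; the talk's actual peer-influence model is not recoverable from the slides, and the function name, the dictionary keys, and the toy thread are hypothetical.

```python
def peer_influence_features(previous_sentiments):
    """Return the share of positive and negative labels among the
    comments posted earlier in the same discussion."""
    total = len(previous_sentiments)
    if total == 0:
        # First comment in a thread: no peers to be influenced by.
        return {"pos_ratio": 0.0, "neg_ratio": 0.0}
    pos = sum(1 for s in previous_sentiments if s == "positive")
    neg = sum(1 for s in previous_sentiments if s == "negative")
    return {"pos_ratio": pos / total, "neg_ratio": neg / total}


# Hypothetical thread, labels assumed to come from a textual classifier.
thread = ["positive", "positive", "objective", "negative", "positive"]

# Features for the fifth comment use only the four comments before it.
features_for_fifth = peer_influence_features(thread[:4])
```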
Methodology 2 – Peer Influence
Methodology 2 – Peer Influence Modeling
Methodology 3 – User Profile (GenCat) • Female users are more positive than male users, and fashion pages have a higher percentage of positive sentiments than politician pages on Facebook and Twitter.
Methodology 4 – Textual Sentiment (TextSent) • State-of-the-art textual sentiment identification algorithm • Ensemble method integrating three individual algorithms • Semantic rules based on language characteristics • Numeric strength computing • Bag-of-words • Accuracy: ~86%
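The slide does not say how the three individual algorithms are integrated. As one plausible reading, the sketch below combines three per-sentence labels by majority vote and falls back to "objective" on a tie; both the voting rule and the fallback are assumptions, and `majority_vote` is a hypothetical helper rather than the authors' ensemble.

```python
from collections import Counter


def majority_vote(labels, fallback="objective"):
    """Combine labels from several classifiers by simple majority.

    Returns `fallback` when there are no labels or the top two labels tie.
    """
    if not labels:
        return fallback
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback
    return counts[0][0]


# Hypothetical outputs of the rule-based, numeric-strength, and
# bag-of-words components for one sentence.
combined = majority_vote(["positive", "positive", "negative"])
```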
Experiments and Results • Data collection • Facebook: posts, comments, likes, user profiles • Twitter: tweets, followers, user profiles • Amazon: products and reviews • Flyertalk (airline discussion forum): discussions • Data cleaning • Remove spam users
Experiments and Results • Features of the learning model for the 4 datasets and their differences. Topic is adapted from the raw Facebook category. “×”: missing; “√”: present.
Experiments and Results • Similarity measure check. • MAE and RMSE are used to compare the average estimation error between real and predicted preferences • Hadoop-based collaborative filtering implemented with Mahout. • Takes 34 and 21 minutes to approximate user preferences for Facebook and Twitter, respectively • Cannot complete within 10 hours on a single CPU.
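For reference, MAE (mean absolute error) and RMSE (root mean squared error) between real and predicted preferences can be computed as in the sketch below. The preference values are made-up toy numbers, not results from the talk.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between real and predicted preferences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large errors more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical held-out preferences and predictions.
actual = [1, 0, 1, 1]
predicted = [0.5, 0.0, 1.0, 0.0]

print(mae(actual, predicted), rmse(actual, predicted))
```

A similarity measure with lower MAE and RMSE on held-out preferences would be the better choice under this comparison.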
Experiments and Results • Facebook data • Twitter data • Amazon.com data
Experiments and Results • Classification accuracy (SS: semantic + syntactic features used in [28])
Conclusion and Future Work • We propose a systematic framework to identify social media sentiments by modeling user social effects: user preference, peer influence, user profile, and textual sentiment itself. • However, • More networked data could be incorporated. • More efficient algorithms are needed to calculate user preference.
Thank you