Twitter Sentiment Analysis for Product Promotion
Jake Johnson, Prashant K Thakur, Dawson Canby
CS 455: Introduction to Distributed Systems
Computer Science Department, Colorado State University
Background Information
• Social media can be used to promote products and services to millions of people.
• It can also be used to examine the public’s response to these promotions.
• Companies that can analyze the response to their social media marketing campaigns will be able to create more effective marketing strategies.
Problem Characterization
• Twitter is a social network designed to encourage conversation and the spread of information.
• We want to give companies a way to analyze the public’s opinions about their product or service in real time and on a large scale.
  • requires real-time streaming of tweets about the product or service
  • requires accurate analysis of the sentiment of those tweets
• We want to perform analytic tasks that let a company make marketing decisions based on the demographics of the individuals who responded positively or negatively to previous marketing attempts.
Methodology - Software Stack
• The Twitter Streaming API allows monitoring of tweets in real time.
  • allows filtering of tweets by keywords, hashtags, and usernames
  • tweet data gives us insight into how popular, controversial, or viral a tweet is
  • we used the “tweepy” Python library
  • filter example: languages=["en"], track=["#GoTS8", "#GoT", "#GameOfThrones", "@GameOfThrones", "@gameofthrones", "@GAMEOFTHRONES"]
• The Stanford CoreNLP library performs sentiment analysis of plain text.
  • determines whether a sentence has a positive, negative, or neutral sentiment associated with it
• Kafka provides a distributed, resilient, fast, scalable, fault-tolerant replacement for traditional message brokers.
  • data inside Kafka is immutable
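The filter example above lists three spellings of the same @GameOfThrones handle. Twitter's documentation describes `track` matching as ignoring case, so, assuming that behavior, the variants are redundant. The sketch below is a hypothetical helper (`build_filter_params` is not part of tweepy or of the original project) that removes case-insensitive duplicates and assembles the `track` and `languages` keyword arguments that a tweepy stream's `filter` call accepts.

```python
def build_filter_params(track_terms, languages=("en",)):
    """Build keyword arguments for a streaming filter call.

    Terms are de-duplicated case-insensitively, keeping the first
    spelling seen. This assumes the API ignores case when matching
    track terms; drop the de-duplication if that does not hold.
    """
    seen = set()
    unique_terms = []
    for term in track_terms:
        key = term.casefold()
        if key not in seen:
            seen.add(key)
            unique_terms.append(term)
    return {"track": unique_terms, "languages": list(languages)}


params = build_filter_params(
    ["#GoTS8", "#GoT", "#GameOfThrones",
     "@GameOfThrones", "@gameofthrones", "@GAMEOFTHRONES"]
)
print(params)
```

The resulting dictionary could then be unpacked into the stream's filter call, for example `stream.filter(**params)`, where `stream` stands for an already authenticated tweepy stream object.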
Methodology
• Spark consumes the data from Kafka
• Batch interval of 20 seconds
• Created a DStream of TweeterParser objects
• Converted the DStream to a Dataset
• Used Spark SQL to run queries relating the different fields extracted from the tweets
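To make the 20-second batching concrete, here is a plain-Python illustration of how a stream of records falls into fixed-width batches. It is not Spark code and not the project's implementation: Spark Streaming forms batches from arrival time inside the framework, while this sketch simply buckets records by an assumed timestamp field. The function name and the sample records are hypothetical.

```python
from collections import defaultdict

BATCH_INTERVAL_SECONDS = 20  # interval stated on the slide


def group_into_batches(records, interval=BATCH_INTERVAL_SECONDS):
    """Group (timestamp_seconds, payload) pairs into fixed-width batches.

    Each batch is keyed by the start time of its interval, so a record
    at t=19.9 lands in batch 0 and a record at t=20.0 lands in batch 20.
    """
    batches = defaultdict(list)
    for timestamp, payload in records:
        batch_start = int(timestamp // interval) * interval
        batches[batch_start].append(payload)
    return dict(sorted(batches.items()))


# Made-up records, purely for illustration.
records = [(0.5, "a"), (19.9, "b"), (20.0, "c"), (45.2, "d")]
batches = group_into_batches(records)
for start, payloads in batches.items():
    print(start, payloads)
```

A longer interval yields fewer, larger batches and higher latency before results appear; a shorter one does the opposite.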
Performance and Benchmarks - Streaming Tweets
• Checked the number of tweets received per minute, which depends heavily on the topic being watched.
  • usually 5 to 500 tweets per minute
• The Twitter Streaming API states that only about 1% of new tweets matching a filter will be streamed to an application.
  • more specific filters do not usually decrease the number of incoming tweets (as long as enough people are talking about the subject)
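A tweets-per-minute figure like the one above can be measured by bucketing arrival times by minute. The sketch below is one possible way to do this with the standard library; the function name and the sample timestamps are made up for illustration and are not measurements from the project.

```python
from collections import Counter
from datetime import datetime


def tweets_per_minute(arrival_times):
    """Count arrivals per calendar minute, returned in time order."""
    counts = Counter(
        t.replace(second=0, microsecond=0) for t in arrival_times
    )
    return dict(sorted(counts.items()))


# Hypothetical arrival times, not real measurements.
arrivals = [
    datetime(2019, 1, 1, 12, 0, 5),
    datetime(2019, 1, 1, 12, 0, 40),
    datetime(2019, 1, 1, 12, 1, 2),
]
per_minute = tweets_per_minute(arrivals)
for minute, count in per_minute.items():
    print(minute.strftime("%H:%M"), count)
```

In a live setup the arrival time would be recorded when each tweet is received from the stream.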
Performance and Benchmarks - Sentiment Analysis
Stanford CoreNLP Sentiment Analysis Live Demo
• Negative sentence: “Game of Thrones is the absolute worst show to exist.”
• Positive sentence: “Daenerys is my favorite character!”
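CoreNLP's sentiment annotator assigns each sentence one of five classes, numbered 0 (very negative) through 4 (very positive). A minimal sketch of collapsing that value into the positive, negative, or neutral label used in this deck follows; `to_polarity` is a hypothetical helper, and the thresholds are an assumption rather than the project's actual mapping.

```python
def to_polarity(sentiment_value):
    """Collapse a five-class sentiment value (0-4) into three labels.

    Assumed mapping: 0-1 -> negative, 2 -> neutral, 3-4 -> positive.
    """
    if not 0 <= sentiment_value <= 4:
        raise ValueError("sentiment value must be between 0 and 4")
    if sentiment_value < 2:
        return "negative"
    if sentiment_value > 2:
        return "positive"
    return "neutral"


for value in range(5):
    print(value, to_polarity(value))
```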
Performance and Benchmarks - Per State Analysis
[Figure: Average Sentiment for Quebec (QC), Ontario (ON), and Saskatchewan (SK)]
• Viewing the results of the sentiment analysis state by state (or, as in the chart, province by province) allows for targeted marketing changes.
• Targeting states with lower average sentiment scores can increase the effectiveness of advertisements.
• Less advertising should be needed for states where sentiment is already high.
• We also created popularity and viral scores for tweets so companies could see which opinions are most popular.
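The per-region averaging described above can be sketched as a group-and-mean over (region, score) pairs, sorted so the lowest-sentiment regions come first as candidates for targeted advertising. The function name and the scores below are invented for illustration; they are not the project's results, and the project itself ran this kind of query through Spark SQL.

```python
from collections import defaultdict


def average_sentiment_by_region(scored_tweets):
    """Return (region, mean score) pairs, lowest average first."""
    totals = defaultdict(lambda: [0, 0])  # region -> [score sum, tweet count]
    for region, score in scored_tweets:
        totals[region][0] += score
        totals[region][1] += 1
    averages = {
        region: total / count for region, (total, count) in totals.items()
    }
    return sorted(averages.items(), key=lambda item: item[1])


# Made-up scores on a 0-4 scale, purely for illustration.
sample = [("QC", 3), ("QC", 1), ("ON", 2), ("ON", 4), ("SK", 1)]
ranking = average_sentiment_by_region(sample)
for region, average in ranking:
    print(region, average)
```

Because many tweets carry no location data, a real run would first have to drop or separately count tweets whose region is unknown.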
Insights and Conclusions
• Tweet data is often too incomplete to be useful - most users do not want to share their location data when they tweet.
• Tweets are often full of colloquial language, emojis, and humor/sarcasm - this makes it difficult to perform accurate sentiment analysis.
• Certain companies can generate more buzz on social media because of the nature of their product.
  • tweets about products in the domain of pop culture (especially movies and television) generate more activity than those about less exciting (but still innovative) products
  • many companies make an effort to create humorous Twitter accounts to attract a wider audience