The Sentimental Hyperlink Spring 2006 Hypertext Term Project Paul Logasa Bogen II, Travis Gadberry, Shaobo Qiu
Google’s PageRank • Simulates a random surfer. • Each link is a vote for the usefulness of a website for key terms near and in the link as well as in the destination page. • Each page distributes its rank evenly among its outgoing links. • The process iterates until a semi-stable state is reached.
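The random-surfer idea above can be sketched in a few lines of Python. This is a minimal illustration of the textbook power-iteration form, not Google's production algorithm: the damping factor of 0.85, the tolerance, the treatment of pages without outgoing links, and the three-page toy graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, tol=1e-9, max_iter=100):
    """Iterate rank distribution until the ranks stop changing much.

    links maps each page to the list of pages it links to.
    damping, tol and max_iter are illustrative defaults (assumptions).
    """
    pages = sorted(set(links) | {d for dests in links.values() for d in dests})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Every page gets a small baseline: the surfer jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            outgoing = links.get(page, [])
            # A page with no links spreads its rank over all pages (assumption).
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for dest in targets:
                # The page splits its rank evenly between its targets.
                new_rank[dest] += share
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:  # the "semi-stable state"
            break
    return rank


# Hypothetical toy web: a links to b and c, b links to c, c links back to a.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page c collects votes from both a and b, so it ends up ranked highest, regardless of what the linking text says about it.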
The Problem? • “John Pike of Global Security was interviewed on NBC's Today Show about the hoax, and has a transcript of the show on his site.”* • “Bart Sibrel's website with a "smoking gun" video proving we never went to the Moon... sold for a price. Surprise!”* • Both sentences are from the same site but with completely different implications! • *See http://www.badastronomy.com/bad/misc/apollohoax.html
Google’s Response • They do not see a problem. • Why? – Laissez-faire ranking • Massive amounts of data will even out all anomalous behavior. • Why is this a mistake? • Google Bombing (http://en.wikipedia.org/wiki/Google_bomb) • Counter-intuitive.
Back to the Moon Hoax • “Bart Sibrel's website with a "smoking gun" video proving we never went to the Moon... sold for a price. Surprise!” • The second quote has these snippets: • “smoking gun” • for a price • Surprise! • These snippets are not saying Bart Sibrel’s website is a good source of information. • In fact, they are saying the opposite!
Sentiment Analysis • What is different between the first quote and the second? • Sentiment! • Information Retrieval, Textual Studies, and Document Engineering have investigated sentiment and have produced a number of methods for analyzing it.
So what is new here? • Hyperlinks! Hyperlinks! Hyperlinks! • The current communities using sentiment analysis are not analyzing hypertexts as such. • They are either analyzing traditional text or single nodes of a hypertext. • We want to analyze the sentiment of the hyperlink itself!
How? • Google already mines the source context and the destination for keywords. • It would add little complexity to also determine the sentiment of the source context and use it as a weighting factor. • We could then weight each link on a scale of [-1, 1] to indicate negative or positive sentiment.
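As a rough sketch of what such a weighting might look like, the Python below scores the text around a link with a tiny word list and scales the link's vote by that score. The lexicon, its values, the averaging rule, and the names `link_sentiment` and `weighted_vote` are all hypothetical and chosen only for illustration. A real system would use one of the established sentiment-analysis methods, and how a negative weight should enter the ranking itself is left open here.

```python
import re

# Hypothetical toy lexicon (an assumption, not a published word list).
LEXICON = {
    "interviewed": 0.5,
    "transcript": 0.5,
    "sold": -0.5,
    "price": -0.5,
    "surprise": -1.0,
}


def link_sentiment(context):
    """Return a sentiment score in [-1, 1] for the text around a link."""
    words = re.findall(r"[a-z']+", context.lower())
    scores = [LEXICON[w] for w in words if w in LEXICON]
    if not scores:
        return 0.0  # no sentiment-bearing words: treat the link as neutral
    mean = sum(scores) / len(scores)
    return max(-1.0, min(1.0, mean))


def weighted_vote(share, sentiment):
    """Scale a link's share of rank by the sentiment of its context."""
    return share * sentiment


# The two link contexts quoted from the Bad Astronomy page.
pike = ("John Pike of Global Security was interviewed on NBC's Today Show "
        "about the hoax, and has a transcript of the show on his site.")
sibrel = ("Bart Sibrel's website with a \"smoking gun\" video proving we "
          "never went to the Moon... sold for a price. Surprise!")

pike_score = link_sentiment(pike)
sibrel_score = link_sentiment(sibrel)
print(pike_score, weighted_vote(0.25, pike_score))
print(sibrel_score, weighted_vote(0.25, sibrel_score))
```

With this toy lexicon the first context scores positive and the second negative, so the same share of rank would count for one site and against the other.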
Applications? • We foresee two direct applications: • An improved search algorithm that is less vulnerable to “miserable failure”-style Google Bombs • A Firefox extension that lets surfers quickly see which links the author regards more positively.