Twitter NLP
Named Entity Recognition in Tweets: TwitterNLP
Ludymila Lobo
Resources
• Reading material
  • Ritter, Alan; Clark, Sam; Mausam; Etzioni, Oren. Named Entity Recognition in Tweets. Obtained from the Association for Computational Linguistics website: https://aclweb.org/anthology/D/D11/D11-1141.pdf
  • http://www.academia.edu/1128304/Shallow_parsing_as_part-of-speech_tagging
• Twitter NLP tool
  • https://github.com/aritter/twitter_nlp
• Application with Twitter NLP
  • statuscalendar.com
• Collecting tweets
  • https://dev.twitter.com
  • http://www.webdevdoor.com/jquery/twitter-feed-authentication-search
  • https://github.com/abraham/twitteroauth
  • http://sourceforge.net/projects/xampp/
Why Twitter?
• Large amount of data (even more than the Library of Congress in Washington, D.C., which holds about 151 million items)*
• Real-time information, sometimes more up to date than articles
• http://pt.wikipedia.org/wiki/Library_of_Congress
* Hachman (2011)
Challenges
• Noisy and informal nature of tweets
• Wide diversity of entity types (companies, products, bands, teams, movies, etc.), most of which are relatively infrequent, so a sample of tweets contains only a few examples of each
• Lack of context
• http://twitter.com
Tool
• Download from https://github.com/aritter/twitter_nlp
• Unzip the file, then in a Linux terminal type: sh build.sh
Tool • statuscalendar.com
How it works
• Chunking (shallow parsing)
• POS (part-of-speech) tagging -> NLP, clustering
Chunking example (token, chunk tag):
@paulwalk o | It b-np | 's b-vp | the b-np | view i-np | from b-pp | where b-advp | I b-np | 'm b-vp | living i-vp | for b-pp | two b-np | weeks i-np
POS tag dictionary (word: candidate tags):
best: ADJ ADV NP V
better: ADJ ADV V DET
close: ADV ADJ V N
cut: V N VN VD
even: ADV DET ADJ V
grant: NP N V
hit: V VD VN N DET
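The chunk tags in the example above follow the usual begin/inside/outside scheme: `b-np` starts a noun phrase, `i-np` continues it, and `o` marks a token outside any chunk. A minimal Python sketch of how such tags could be grouped back into phrases is shown here. The function name `group_chunks` is hypothetical and is not part of the twitter_nlp tool; the token list is copied from the slide.

```python
def group_chunks(tagged):
    """Group (token, tag) pairs with b-/i-/o chunk tags into (type, phrase) pairs."""
    chunks = []
    current_type, current_words = None, []
    for word, tag in tagged:
        tag = tag.lower()
        if tag.startswith("b-"):
            # A b- tag always closes the open chunk (if any) and starts a new one.
            if current_words:
                chunks.append((current_type, " ".join(current_words)))
            current_type, current_words = tag[2:], [word]
        elif tag.startswith("i-") and current_words and tag[2:] == current_type:
            # An i- tag of the same type extends the open chunk.
            current_words.append(word)
        else:
            # "o" (or a stray i- tag) closes the open chunk; the token is dropped.
            if current_words:
                chunks.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
    if current_words:
        chunks.append((current_type, " ".join(current_words)))
    return chunks


# Tokens and tags taken from the chunking example on the slide.
example = [
    ("@paulwalk", "o"), ("It", "b-np"), ("'s", "b-vp"), ("the", "b-np"),
    ("view", "i-np"), ("from", "b-pp"), ("where", "b-advp"), ("I", "b-np"),
    ("'m", "b-vp"), ("living", "i-vp"), ("for", "b-pp"), ("two", "b-np"),
    ("weeks", "i-np"),
]

for chunk_type, phrase in group_chunks(example):
    print(chunk_type, "->", phrase)
```

For the slide's sentence this yields phrases such as `np -> the view`, `vp -> 'm living`, and `np -> two weeks`.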
How it works
• POS (part-of-speech) tagging -> NLP, clustering
• Capitalization classifier: predicts whether or not a tweet is informatively capitalized (trained with an SVM)
• NER (Named Entity Recognition)
• Chunking (shallow parsing)
Example: "Tom Hanks was awesome in Forrest Gump" -> Tom Hanks: actor; Forrest Gump: movie
Tool: example output
Input: @cityofcalgary: Free swimming and golf tomorrow for @cbc Sports Day in Canada #yyc #sportsday http://ow.ly/2G4sf
Output: @cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O for/O @cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc #yyc/O #sportsday/O http://ow.ly/2G4sf/O
Input: Adam Beyer: Swedish Techno Pioneer: When it comes to his own DJing and sound, he's slightly more diverse and likes...
Output: Adam/B-person Beyer/I-person :/O Swedish/O Techno/O Pioneer/O :/O When/O it/O comes/O to/O his/O own/O DJing/O and/O sound/O ,/O he/O 's/O slightly/O more/O diverse/O and/O likes/O
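The tagged output above can be post-processed to list the entities found in each tweet. The following Python sketch assumes that every whitespace-separated item has the form `word/TAG` and that the tag follows the last slash (which matters for URL tokens such as `http://ow.ly/2G4sf/O`). The helper names `parse_tagged_line` and `extract_entities` are hypothetical and are not part of twitter_nlp.

```python
def parse_tagged_line(line):
    """Split 'word/TAG word/TAG ...' into (word, tag) pairs."""
    pairs = []
    for item in line.split():
        # Split on the LAST slash so URLs keep their own slashes.
        word, _, tag = item.rpartition("/")
        pairs.append((word, tag))
    return pairs


def extract_entities(pairs):
    """Collect (entity text, entity type) from B-/I-/O tagged pairs."""
    entities = []
    words, etype = [], None
    # A trailing ("", "O") sentinel flushes an entity that ends the line.
    for word, tag in pairs + [("", "O")]:
        if tag.startswith("I-") and words and tag[2:] == etype:
            words.append(word)
            continue
        if words:
            entities.append((" ".join(words), etype))
            words, etype = [], None
        if tag.startswith("B-"):
            words, etype = [word], tag[2:]
    return entities


# Output line copied from the first example on the slide.
tagged = ("@cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O "
          "for/O @cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc "
          "#yyc/O #sportsday/O http://ow.ly/2G4sf/O")

print(extract_entities(parse_tagged_line(tagged)))
# [('Sports Day', 'other'), ('Canada', 'geo-loc')]
```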
How to retrieve data from Twitter?
• https://dev.twitter.com
<?php
session_start();
require_once("twitteroauth/twitteroauth/twitteroauth.php"); // Path to the twitteroauth library

$search = "wpi OR #WPI"; // Search query
$notweets = 50;          // Number of tweets to request

// Placeholder credentials: replace with the keys of your own Twitter application
$consumerkey = "123456";
$consumersecret = "123456";
$accesstoken = "123456";
$accesstokensecret = "123456";

// Create an authenticated connection to the Twitter API
function getConnectionWithAccessToken($cons_key, $cons_secret, $oauth_token, $oauth_token_secret) {
    $connection = new TwitterOAuth($cons_key, $cons_secret, $oauth_token, $oauth_token_secret);
    return $connection;
}

$connection = getConnectionWithAccessToken($consumerkey, $consumersecret, $accesstoken, $accesstokensecret);

// Encode "#" so it is not read as a URL fragment
$search = str_replace("#", "%23", $search);
$tweets = $connection->get("https://api.twitter.com/1.1/search/tweets.json?q=".$search."&count=".$notweets);

// Print the search results as JSON
echo json_encode($tweets);
?>
Source: http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/
How to retrieve data from Twitter?
• Authentication library: https://github.com/abraham/twitteroauth
• Download it and place it in the same folder as the code
How to retrieve data from Twitter?
• Install XAMPP to run the PHP code on a local server: http://sourceforge.net/projects/xampp/
How to retrieve data from Twitter?
• Copy the project folder to C:\xampp\htdocs
How to retrieve data from Twitter?
• Open http://localhost/TwitterStreams/tweet.php in a browser