Weather and Tweets UCML 2013

Members: Vinh Dang, Wai I Iong, Matthew Dudley, Jiyuan Li

Background • Analyzing weather-related tweets to determine: • whether the tweet has a positive, negative, or neutral sentiment • whether the weather it describes is in the past, present, or future • what kind of weather the tweet references

The data • Training set (http://www.kaggle.com/c/crowdflower-weather-twitter): • contains tweets, locations, and a confidence score for each of 24 possible labels • about 78,000 tweets

The data • Labels: • sentiment: s1 + s2 + s3 + s4 + s5 = 1 • when: w1 + w2 + w3 + w4 = 1 • kind: k1 + k2 + … + k15 may be greater than 1 (a tweet can reference several kinds of weather)
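The label constraints can be checked mechanically. The following is a minimal sketch, assuming each training row is available as a dict keyed by the label names s1..s5, w1..w4, and k1..k15; the dict layout, the tolerance, and the example values are assumptions for illustration, not the team's code.

```python
import math

SENTIMENT = [f"s{i}" for i in range(1, 6)]   # 5 sentiment labels
WHEN = [f"w{i}" for i in range(1, 5)]        # 4 "when" labels
KIND = [f"k{i}" for i in range(1, 16)]       # 15 "kind" labels
ALL_LABELS = SENTIMENT + WHEN + KIND         # 24 labels in total


def check_label_row(row, tol=1e-3):
    """Return True if the sentiment and "when" confidences each sum to 1.

    The "kind" group is not checked because its sum may exceed 1.
    `tol` is an assumed tolerance for rounding in the data.
    """
    s_sum = sum(row[name] for name in SENTIMENT)
    w_sum = sum(row[name] for name in WHEN)
    return (math.isclose(s_sum, 1.0, abs_tol=tol)
            and math.isclose(w_sum, 1.0, abs_tol=tol))


# Hypothetical example row (values invented for illustration).
example = {name: 0.0 for name in ALL_LABELS}
example.update({"s3": 0.2, "s4": 0.8, "w2": 1.0, "k7": 0.4, "k13": 1.0})
print(len(ALL_LABELS), check_label_row(example))
```

A check like this is mainly useful after post-processing predictions, since a regressor trained per label has no built-in reason to respect the sum constraints.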

The data • Testing set: • contains the id, tweet, state, and location • no “sentiment”, “when”, or “kind” labels; these are what we need to predict • about 42,000 tweets

Data Preprocessing • Data “normalizing”: • convert HTML entities into characters (e.g., &gt; → >) • convert all the hyperlinks in the testing set into “{link}” • Tokenizing, for example: “What a bright sunny!” → “[what, a, bright, sunny, !]” • SQLite (for storing data)
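The normalizing and tokenizing steps can be sketched with the Python standard library. This is an illustration under stated assumptions, not the team's implementation: the URL pattern, the token pattern, and the function names `normalize` and `tokenize` are all hypothetical choices.

```python
import html
import re

# Assumed URL pattern; the slides do not say how links were detected.
URL_RE = re.compile(r"https?://\S+")
# A token is the {link} placeholder, a lowercase word (optionally with
# apostrophes), or a single punctuation character.
TOKEN_RE = re.compile(r"\{link\}|[a-z0-9]+(?:'[a-z0-9]+)*|[^\sa-z0-9]")


def normalize(tweet):
    """Decode HTML entities, then replace hyperlinks with "{link}"."""
    text = html.unescape(tweet)
    return URL_RE.sub("{link}", text)


def tokenize(tweet):
    """Normalize, lowercase, and split a tweet into tokens."""
    return TOKEN_RE.findall(normalize(tweet).lower())


print(tokenize("What a bright sunny!"))
```

Lowercasing before matching keeps the token pattern simple; whether the original pipeline kept punctuation other than "!" as separate tokens is not stated in the slides.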

Methodology • Bag of words • tf-idf weighting • Approaches: 1) Support vector regression (SVR) 2) Ridge regression
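To make the bag-of-words and tf-idf step concrete, here is a minimal standard-library sketch. It assumes raw term counts for tf and idf = ln(N / df) + 1, which is one common variant; the slides do not say which weighting the team used, and the function name `tfidf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    tf is the raw count of a term in a document; idf is ln(N / df) + 1,
    where df is the number of documents containing the term (an assumed
    variant, see the note above).
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))          # count each term once per document
    vocab = sorted(df)
    idf = {term: math.log(n_docs / df[term]) + 1.0 for term in vocab}
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append([counts[term] * idf[term] for term in vocab])
    return vocab, vectors


# Toy documents, invented for illustration.
docs = [["sunny", "day"], ["rainy", "day"], ["sunny", "sunny", "sky"]]
vocab, vectors = tfidf(docs)
print(vocab)
```

Each resulting vector would then serve as the feature row for a regressor such as SVR or ridge regression, with one target per label.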

Error Measurement • Root mean squared error (RMSE)
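The results in this deck are reported as RMSE, that is, the square root of the mean of (prediction − actual)² over all scored values. The sketch below assumes the mean is taken over every (tweet, label) pair; that averaging convention, the function name `rmse`, and the toy data are assumptions for illustration.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error over all (tweet, label) pairs.

    Both arguments are lists of equal-length rows, one row per tweet.
    """
    total, count = 0.0, 0
    for pred_row, true_row in zip(predictions, targets):
        for p, a in zip(pred_row, true_row):
            total += (p - a) ** 2
            count += 1
    return math.sqrt(total / count)


# Hypothetical toy data: 2 tweets, 3 labels each.
targets = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
all_zeros = [[0.0] * 3 for _ in targets]
print(rmse(all_zeros, targets))
```

Scoring an all-zeros prediction this way gives a simple reference point against which any trained model can be compared.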

Result • Our results: • SVR: RMSE = 0.26149 • Ridge: RMSE = 0.16997 • Others: • The winner: 0.14314 • Baseline (all zeros): 0.31957

Result • A better approach (testing data vs. actual results) • Review of labels

References • CrowdFlower (2013). “Partly Sunny with a Chance of Hashtags.” Kaggle. Retrieved from http://www.kaggle.com/c/crowdflower-weather-twitter • Chang, C.-C. and Lin, C.-J. (2011). “LIBSVM: A library for support vector machines.” ACM Transactions on Intelligent Systems and Technology, 2:27:1-27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm • Pedregosa et al. (2011). “Scikit-learn: Machine Learning in Python.” JMLR 12, pp. 2825-2830.

Questions? The End
