Analysis of Social Media
MLD 10-802, LTI 11-772
William Cohen
9-25-12
Some projects from 2010’s course
• Q&A sites and their community basis:
  • Two-student study of the AskMetaFilter Q&A site, a spin-off from MetaFilter (a long-standing community site)
  • Modeled after similar analyses of Yahoo! Answers and other Q&A sites, but AskMetaFilter is possibly different, since it’s built around an existing community
  • Research question: How do Q&A sites built around communities differ from those that accrue communities?
  • Data available: "Pete[r] Landwehr" <plandweh@cs.cmu.edu>
Some projects from 2010’s course
• TED: Comments worth understanding
  • Two-student project to analyze the 2600+ person community of commenters on 667 TED talks
  • Explored:
    • (Sub)community detection algorithms
    • Tag assignment (for folksonomy tags on talks)
    • Recommending commenters from ASR transcripts of talks
  • Data available (aasish@cs.cmu.edu)
Some projects from 2010’s course
• Topic Models for Twitter
  • One-student project (w/ help from outside)
  • Approximately: LDA with each user a document
  • Used for link prediction
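The "LDA with each user a document" idea can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names, the toy tweets, the hyperparameters (`alpha`, `beta`, `n_topics`) and the whitespace tokenizer are all assumptions. Scoring candidate links by cosine similarity of the users' topic proportions is likewise only one plausible way to use the topics for link prediction; the slide does not say how the project did it.

```python
import random
from collections import defaultdict

def user_documents(tweets):
    """Merge each user's tweets into one bag of words (one 'document' per user)."""
    docs = defaultdict(list)
    for user, text in tweets:
        docs[user].extend(text.lower().split())
    return dict(docs)

def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampler for LDA; returns topic proportions per user."""
    rng = random.Random(seed)
    vocab_size = len({w for words in docs.values() for w in words})
    doc_topic = {u: [0] * n_topics for u in docs}       # topic counts per user
    topic_word = [defaultdict(int) for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics
    assign = {}
    # Random initial topic assignment for every token.
    for u, words in docs.items():
        assign[u] = []
        for w in words:
            k = rng.randrange(n_topics)
            assign[u].append(k)
            doc_topic[u][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
    for _ in range(n_iter):
        for u, words in docs.items():
            for i, w in enumerate(words):
                # Remove the token's current assignment from the counts.
                k = assign[u][i]
                doc_topic[u][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[u][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[u][i] = k
                doc_topic[u][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    theta = {}
    for u, counts in doc_topic.items():
        total = sum(counts) + n_topics * alpha
        theta[u] = [(c + alpha) / total for c in counts]
    return theta

def cosine(p, q):
    """Cosine similarity between two topic-proportion vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sum(a * a for a in p) ** 0.5 * sum(b * b for b in q) ** 0.5
    return dot / norm

# Hypothetical toy data: (user, tweet text) pairs.
tweets = [
    ("alice", "topic models for text"),
    ("alice", "fitting topic models"),
    ("bob", "football game tonight"),
    ("carol", "text models and inference"),
]
theta = lda_gibbs(user_documents(tweets))
# Candidate link score (an assumption): similarity of topic mixtures.
score = cosine(theta["alice"], theta["carol"])
```

On real data one would use a proper tokenizer and many more topics, and compare the similarity ranking against held-out follower links.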
Some projects from 2010’s course
• Topic Models for Politics
  • Hierarchical topic model for text in classes
  • Also, community analysis of a political dataset
  • Data available (dong.p.ng@gmail.com)
Some projects from 2010’s course
• Structure, Tie Persistence and Event Detection in Large Phone and SMS Networks
  • Three-student project that used a single large dataset and considered several types of analysis:
    • Can you detect unusual calling patterns (e.g., holidays)?
    • Can you predict properties like reciprocity, average call duration, and node degree from other easily measurable properties?
    • Can you predict whether social interactions will persist (vs. be transient) over time?
  • Data not available (except through the Heinz School)
  • Paper at KDD workshop
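The properties named in this project (reciprocity, average call duration, node degree) are straightforward to compute from a call log. The sketch below is only an illustration: the record format `(caller, callee, duration_seconds)`, the function name, and the toy records are assumptions, since the project's dataset is not available.

```python
from collections import defaultdict

def call_graph_stats(calls):
    """Compute simple network properties from (caller, callee, seconds) records."""
    durations = defaultdict(list)
    for caller, callee, secs in calls:
        durations[(caller, callee)].append(secs)
    edges = set(durations)  # distinct directed caller -> callee pairs

    # Reciprocity: fraction of directed edges whose reverse edge also exists.
    reciprocated = sum(1 for (u, v) in edges if (v, u) in edges)
    reciprocity = reciprocated / len(edges) if edges else 0.0

    # Degree: number of distinct contacts, ignoring call direction.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}

    # Average call duration per directed edge.
    avg_duration = {e: sum(d) / len(d) for e, d in durations.items()}
    return reciprocity, degree, avg_duration

# Hypothetical toy call log.
calls = [("a", "b", 60), ("b", "a", 120), ("a", "c", 30), ("a", "b", 180)]
reciprocity, degree, avg_duration = call_graph_stats(calls)
```

Quantities like these could then serve as targets or features in the prediction questions above, for example predicting reciprocity from degree and call duration.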
Some projects from 2010’s course
• Language and Geography in Twitter
  • One-student project
  • Course project was exploratory:
    • Predicting location from various signals (e.g., unambiguous location names)
    • Identifying events from tweets
  • Follow-up project published at EMNLP 2010
  • Later cited on Slashdot, Ars Technica, …, NPR
  • Data available at http://www.ark.cs.cmu.edu/GeoText/
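One of the signals mentioned, unambiguous location names, can be illustrated with a simple gazetteer lookup. This is a hedged sketch, not the project's method: the tiny gazetteer, the list of ambiguous names, and the function name are all hypothetical, and a real system would need a far larger gazetteer and other signals.

```python
import re

# Hypothetical toy gazetteer: lowercase place name -> canonical label.
GAZETTEER = {
    "pittsburgh": "Pittsburgh, PA",
    "seattle": "Seattle, WA",
}
# Names that match many places are deliberately left out of the gazetteer.
AMBIGUOUS = {"springfield", "portland"}

def predict_location(tweet):
    """Return a location label only if exactly one known place is mentioned."""
    tokens = re.findall(r"[a-z]+", tweet.lower())
    hits = {GAZETTEER[t] for t in tokens if t in GAZETTEER and t not in AMBIGUOUS}
    if len(hits) == 1:
        return next(iter(hits))
    return None  # no mention, or conflicting mentions

guess = predict_location("Stuck in traffic in Pittsburgh again")
```

Returning `None` when several places are mentioned keeps the predictor conservative: it abstains rather than guessing between conflicting names.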
Additional datasets to know about
• Tae Yano and Noah Smith – political blogs and comments: http://www.ark.cs.cmu.edu/blog-data/
• Tae Yano and Noah Smith – biased vs. unbiased sentences in political blogs: http://sites.google.com/site/amtworkshop2010/data-1
• Tae is interested in talking to/working with people doing a project with either of these