Analysis of Social Media MLD 10-802, LTI 11-772

Analysis of Social Media MLD 10-802, LTI 11-772. William Cohen 1-27-010. Some projects from 2010’s course. Q&A sites and their community basis: Two-student study of the AskMetaFilter Q&A site, a spin-off from MetaFilter (a long-standing community site)


Presentation Transcript


  1. Analysis of Social Media MLD 10-802, LTI 11-772 William Cohen 1-27-010

  2. Some projects from 2010’s course
     • Q&A sites and their community basis:
     • Two-student study of the AskMetaFilter Q&A site, a spin-off from MetaFilter (a long-standing community site)
     • Modeled after similar analyses of Yahoo! Answers and other Q&A sites, but AskMetaFilter is possibly different, since it’s built around an existing community.
     • Research question: How do Q&A sites built around communities differ from those that accrue communities?
     • Data available: "Pete[r] Landwehr" <plandweh@cs.cmu.edu>

  3. Some projects from 2010’s course
     • TED: Comments worth understanding
     • Two-student project to analyze the 2600+ person community of commenters on 667 TED talks
     • Explored:
     • (Sub)community detection algorithms
     • Tag assignment (for folksonomy tags on talks)
     • Recommending commenters from ASR transcripts of talks
     • Data available (aasish@cs.cmu.edu)

  4. Some projects from 2010’s course
     • Topic Models for Twitter
     • One-student project (w/ help from outside)
     • Approximately: LDA with each user a document
     • Used for link prediction

  5. Some projects from 2010’s course
     • Topic Models for Politics
     • Hierarchical topic model for text in classes
     • Also, community analysis of a political dataset
     • Data available (dong.p.ng@gmail.com)

  6. Some projects from 2010’s course
     • Structure, Tie Persistence and Event Detection in Large Phone and SMS Networks
     • Three-student project that used a single large dataset and considered several types of analysis
     • Can you detect unusual calling patterns (e.g., holidays)?
     • Can you predict properties like reciprocity, average call duration, and node degree from other easily measurable properties?
     • Can you predict whether social interactions will persist (vs. be transient) over time?
     • Data not available (except through Heinz School)
     • Paper at KDD workshop

  7. Some projects from 2010’s course
     • Language and Geography in Twitter
     • One-student project
     • Course project was exploratory:
     • Predicting location from various signals (e.g., unambiguous location names)
     • Identifying events from tweets
     • Follow-up project published at EMNLP 2010
     • Later cited on Slashdot, Ars Technica, …, NPR
     • Data available at http://www.ark.cs.cmu.edu/GeoText/

  8. Additional datasets to know about
     • Tae Yano and Noah Smith – political blogs and comments: http://www.ark.cs.cmu.edu/blog-data/
     • Tae Yano and Noah Smith – biased vs. unbiased sentences in political blogs: http://sites.google.com/site/amtworkshop2010/data-1
     • Tae is interested in talking to/working with people doing a project with either of these
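
As a rough illustration of the "Topic Models for Twitter" project on slide 4 ("LDA with each user a document", used for link prediction), here is a minimal sketch in Python. It is not the student's implementation. It assumes a collapsed Gibbs sampler for LDA (one standard inference method; the slide does not say which was used) and assumes that candidate links are scored by cosine similarity of the users' topic distributions (the slide does not say how topics fed into link prediction). The function names `build_user_documents`, `lda_gibbs`, and `cosine`, the hyperparameters `alpha` and `beta`, and the toy tweets are all hypothetical.

```python
import math
import random
from collections import defaultdict


def build_user_documents(tweets):
    """Merge each user's tweets into one token list: one document per user."""
    docs = defaultdict(list)
    for user, text in tweets:
        docs[user].extend(text.lower().split())
    return dict(docs)


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (assumed inference method).

    Returns a dict mapping each user to a smoothed topic distribution.
    """
    rng = random.Random(seed)
    users = sorted(docs)
    vocab_size = len({w for u in users for w in docs[u]})
    doc_topic = {u: [0] * num_topics for u in users}      # topic counts per user
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = {}

    # Random initial topic assignment for every token.
    for u in users:
        assign[u] = []
        for w in docs[u]:
            z = rng.randrange(num_topics)
            assign[u].append(z)
            doc_topic[u][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1

    for _ in range(iters):
        for u in users:
            for i, w in enumerate(docs[u]):
                # Remove the token's current assignment from the counts.
                z = assign[u][i]
                doc_topic[u][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[u][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[u][i] = z
                doc_topic[u][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = {}
    for u in users:
        denom = len(docs[u]) + num_topics * alpha
        theta[u] = [(doc_topic[u][k] + alpha) / denom for k in range(num_topics)]
    return theta


def cosine(a, b):
    """Cosine similarity, used here as an assumed link-prediction score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: (user, tweet text) pairs.
    tweets = [
        ("ann", "steelers game tonight"), ("ann", "great game steelers"),
        ("bob", "steelers win the game"), ("cat", "new phone new apps"),
    ]
    theta = lda_gibbs(build_user_documents(tweets), num_topics=2)
    users = sorted(theta)
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            print(u, v, round(cosine(theta[u], theta[v]), 3))
```

Higher-scoring pairs would be proposed as more likely links. A real experiment would also need tokenization suited to tweets, held-out edges for evaluation, and many more sampling iterations than a toy run.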
