
Dynamics of Conversations

This research aims to analyze the properties of conversations in online social networks and design a generative model to produce these properties. The study includes the analysis of threads in Yahoo! Groups, Usenet, and Twitter, and proposes a baseline model and an improved time-based model to generate thread structures and determine authorship.

msmithers

Presentation Transcript


  1. Dynamics of Conversations Ravi Kumar, Y! Research Mohammad Mahdian, Y! Research Mary McGlohon, CMU

  2. Motivation • Online social networks are a major source people turn to for information. • Our goal: To understand the dynamics of information dissemination in social networks. • What are properties of conversations (threads)? • Can we design a generative model to produce these properties?

  3. Our methods: Analyzing threads • Analyze threads in online groups: Yahoo! Groups, Usenet, and Twitter. • Take each thread and represent it as a subgraph. • Compute network measures on the set of subgraphs. [Figure: example thread from "FakeKDD-mailing-list", subject "KDD11 venue", shown as a message list and as a reply tree. Messages: Alice, “KDD2011 in Fiji?”; Bob, “Great idea!”; Cal, “Too much sunshine”; Alice, “OK, you can stay home.”]
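
As a rough illustration of representing a thread as a tree and measuring it, here is a minimal Python sketch. The `thread_stats` function, the parent-pointer input format, and the reply structure of the example are assumptions made for illustration; the slide does not say which message each reply answers.

```python
from collections import defaultdict

def thread_stats(parents):
    """Compute simple measures for one thread.

    `parents` maps each message id to the id of the message it replies to
    (None for the message that starts the thread).
    """
    children = defaultdict(list)
    root = None
    for msg, parent in parents.items():
        if parent is None:
            root = msg
        else:
            children[parent].append(msg)

    # Breadth-first walk from the root to assign a depth (level) to each message.
    depth = {root: 0}
    queue = [root]
    for node in queue:
        for child in children[node]:
            depth[child] = depth[node] + 1
            queue.append(child)

    # Group the number of direct replies (degree) by level in the thread.
    degrees_by_level = defaultdict(list)
    for node, level in depth.items():
        degrees_by_level[level].append(len(children[node]))

    return {
        "size": len(depth),
        "depth": max(depth.values()),
        "degrees_by_level": dict(degrees_by_level),
    }

# Hypothetical reply structure: Bob and Cal answer Alice's first message,
# and Alice's second message answers Cal.
example = {"alice_1": None, "bob": "alice_1", "cal": "alice_1", "alice_2": "cal"}
stats = thread_stats(example)
```

Running this over every thread in a group would give the per-thread size, depth, and per-level degree values that the later observation slides compare.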

  4. Data • Yahoo! Groups • Public, moderated, active groups • 13,000 groups, 14.9 million messages • Usenet • 100 high-activity groups for 1 year • 22 million messages • Twitter • Sampled for one month • 69 million messages

  5. Observation: Size vs Depth • Q: If more replies occur in the thread, where do they join? • A: The depth of threads grows sub-linearly, but super-logarithmically in the size of the thread.

  6. Observation: Degree of threads • Q: Does the number of responses a message gets depend on its depth? • A: The degree distribution does change according to level in the thread.

  7. Observation: Authorship • Q: As a thread increases in size, how many new authors join? • A: There is a power-law relationship between: • Size of thread and maximum activity from one author. • Size of thread and number of authors participating.

  8. Baseline model: Branching Process • Each node has some k children with probability distribution p. • Pros • Conceptually simple • Cons • Not generative • Will not produce behavior by levels • Will not have heavy-tail depth distribution • Does not take into account recency effects [Figure: plot of log(p(k)) against log(k)]
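
The baseline can be sketched as a simple branching simulation in which every message independently draws its number of replies from one fixed distribution. The function name `branching_thread`, the list-of-probabilities input, and the `max_size` soft cap are assumptions added for illustration, not details from the slides.

```python
import random

def branching_thread(offspring_probs, rng, max_size=10_000):
    """Grow one thread with a branching process.

    offspring_probs[k] is the probability that a message gets exactly k replies.
    Returns a list where parents[i] is the parent index of message i
    (None for the root). max_size is a soft cap so the loop always ends.
    """
    ks = list(range(len(offspring_probs)))
    parents = [None]
    frontier = [0]  # messages whose replies have not been drawn yet
    while frontier and len(parents) < max_size:
        node = frontier.pop()
        k = rng.choices(ks, weights=offspring_probs)[0]
        for _ in range(k):
            parents.append(node)
            frontier.append(len(parents) - 1)
    return parents

rng = random.Random(0)
# Hypothetical offspring distribution: 0, 1, or 2 replies.
thread = branching_thread([0.5, 0.3, 0.2], rng)
```

Because every message uses the same distribution regardless of its level or age, this sketch shows directly why the baseline cannot capture level-dependent degree or recency effects.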

  9. Time model • Thread grows in discrete time steps • At each time tick, may stop thread • Or add message in reply to some node • New attachment point is chosen based on degree $d_v$ and recency $r_v$ of parent node v • $p(\mathrm{child}, v) \propto \alpha\, d_v + \tau^{r_v}$, $\alpha \ge 0$, $0 \le \tau \le 1$ • Pros • Will have both preferential attachment and recency effects [Figure: thread tree after one step; node label: t=1, r=0, d=0]

  10. Time model • Thread grows in discrete time steps • At each time tick, may stop thread • Or add message in reply to some node • New attachment point is chosen based on degree $d_v$ and recency $r_v$ of parent node v • $p(\mathrm{child}, v) \propto \alpha\, d_v + \tau^{r_v}$, $\alpha \ge 0$, $0 \le \tau \le 1$ • Pros • Will have both preferential attachment and recency effects [Figure: thread tree after two steps; node labels: (t=1, r=1, d=1), (t=2, r=0, d=0)]

  11. Time model • Thread grows in discrete time steps • At each time tick, may stop thread • Or add message in reply to some node • New attachment point is chosen based on degree $d_v$ and recency $r_v$ of parent node v • $p(\mathrm{child}, v) \propto \alpha\, d_v + \tau^{r_v}$, $\alpha \ge 0$, $0 \le \tau \le 1$ • Pros • Will have both preferential attachment and recency effects [Figure: thread tree after three steps; node labels: (t=1, r=2, d=2), (t=2, r=1, d=0), (t=3, r=0, d=0)]

  12. Time model • Thread grows in discrete time steps • At each time tick, may stop thread • Or add message in reply to some node • New attachment point is chosen based on degree $d_v$ and recency $r_v$ of parent node v • $p(\mathrm{child}, v) \propto \alpha\, d_v + \tau^{r_v}$, $\alpha \ge 0$, $0 \le \tau \le 1$ • Pros • Will have both preferential attachment and recency effects • Using only one will produce either “bushy” or “skinny” trees (stars/chains) [Figure: thread tree after three steps; node labels: (t=1, r=2, d=2), (t=2, r=1, d=0), (t=3, r=0, d=0)]
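
The growth rule on the time-model slides can be sketched as follows, under stated assumptions: the attachment weight of node v is read as alpha * d_v + tau ** r_v (the slide's "a dv + trv"), d_v counts direct replies, r_v counts ticks since v was posted (matching the node labels in the figures), and the thread stops with a fixed probability `stop_prob` at each tick. The stopping rule and the `max_size` cap are illustrative choices; the slides only say the thread "may stop".

```python
import random

def time_model_thread(alpha, tau, stop_prob, rng, max_size=1000):
    """Grow one thread under the time model.

    Returns parents, where parents[i] is the parent index of message i
    (None for the root). Message i is posted at tick i + 1.
    """
    parents = [None]
    degree = [0]  # number of direct replies each message has received
    while len(parents) < max_size and rng.random() >= stop_prob:
        n = len(parents)
        # Recency r_v = n - 1 - v, so the newest message has r_v = 0.
        weights = [alpha * degree[v] + tau ** (n - 1 - v) for v in range(n)]
        parent = rng.choices(range(n), weights=weights)[0]
        parents.append(parent)
        degree[parent] += 1
        degree.append(0)
    return parents

rng = random.Random(42)
# Hypothetical parameter values, chosen only to exercise the code.
thread = time_model_thread(alpha=0.5, tau=0.8, stop_prob=0.05, rng=rng)
```

With alpha = 0 and tau = 0 only the newest message has nonzero weight, so the sketch produces a pure chain, which is one of the degenerate "skinny" shapes the slide mentions.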

  13. Time model with identity • After generating the thread structure with the time model, assign author identities to nodes • Use a Polya urn-like process • Either pick a new author, or pick an author from further up in the chain (except the parent)
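
One possible reading of the identity step is sketched below. The fixed `new_author_prob`, the uniform choice over ancestor messages, and the decision to exclude any ancestor written by the parent's author are all assumptions; the slide gives only the outline of the process.

```python
import random

def assign_authors(parents, new_author_prob, rng):
    """Assign an author id to every message of a thread.

    parents[i] is the parent index of message i (None for the root), and
    parents[i] < i is assumed, i.e. messages are listed in posting order.
    """
    authors = [0]  # the root message is written by author 0
    next_author = 1
    for node in range(1, len(parents)):
        parent = parents[node]
        # Collect authors from the grandparent up to the root, skipping the
        # parent's author so that nobody replies to their own message.
        candidates = []
        anc = parents[parent]
        while anc is not None:
            if authors[anc] != authors[parent]:
                candidates.append(authors[anc])
            anc = parents[anc]
        if candidates and rng.random() >= new_author_prob:
            # Authors who appear more often on the chain are picked more often.
            authors.append(rng.choice(candidates))
        else:
            authors.append(next_author)
            next_author += 1
    return authors

rng = random.Random(7)
authors = assign_authors([None, 0, 1, 2, 3], new_author_prob=0.3, rng=rng)
```

Drawing from the list of ancestor messages rather than from the set of distinct authors gives the urn-like reinforcement: an author already active in the chain is more likely to return.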

  14. Model: Size vs. depth [Figure: size vs. depth plots, simulation compared with data]

  15. Model: Degree, author activity [Figure: degree and author activity plots, simulation compared with data]

  16. Conclusion • We examined several properties of conversations in three large data sets • Showed that the thread structure and degree of a node are inter-related • We proposed models to generate these properties • Baseline branching process model • An improved model that depends on time and determines authorship
