Dynamics of Conversations ACM SIGKDD ’10 By Ravi Kumar, Mohammad Mahdian, & Mary McGlohon Presented by Annie T. Chen on March 29, 2011.
Overview • RQ: What is the structure of online conversations? • Method • Proposed a simple mathematical model for the structure of conversations • Extended it to account for factors such as recency and author identity that may affect conversations • Compared the models' predictions against empirical data from three datasets: Usenet groups, Yahoo! Groups, and Twitter
Properties of Conversations • Size and depth of thread • Depth: length of the longest path from the root to a leaf in a thread • Size is roughly quadratic in depth • Degree distribution p • Close to a power law: $p(k) \propto k^{-\gamma}$ for some exponent $\gamma > 2$
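The properties above can be computed directly from a thread stored as a reply tree. The following is a minimal sketch, assuming a thread is given as a dictionary from message id to parent id (the root maps to `None`) and that depth is counted in edges; the function names and the toy thread are illustrative and not taken from the paper.

```python
from collections import Counter, defaultdict


def thread_size_and_depth(parent):
    """Return (size, depth) of a thread given a child -> parent mapping.

    Assumption: exactly one root, marked by a parent of None.
    Depth is the number of edges on the longest root-to-leaf path.
    """
    children = defaultdict(list)
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    size, depth = 0, 0
    stack = [(root, 0)]  # iterative traversal avoids recursion limits
    while stack:
        node, d = stack.pop()
        size += 1
        depth = max(depth, d)
        for child in children[node]:
            stack.append((child, d + 1))
    return size, depth


def degree_distribution(parent):
    """Return p(k): the fraction of nodes with exactly k children."""
    child_counts = Counter(par for par in parent.values() if par is not None)
    hist = Counter(child_counts.get(node, 0) for node in parent)
    n = len(parent)
    return {k: count / n for k, count in hist.items()}


# Toy thread: a is the root, b and c reply to a, d replies to b.
toy_thread = {"a": None, "b": "a", "c": "a", "d": "b"}
print(thread_size_and_depth(toy_thread))  # (4, 2)
print(degree_distribution(toy_thread))
```

On real data, p(k) estimated this way is what one would plot on log-log axes to check for the power-law shape.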
Branching Process Model (BP-Model) - 1 • The Galton-Watson branching process is a classic model for generating a random tree. • At the ith step of the process, each node generates a number of children drawn from the distribution p • p(k): fraction of nodes with k children in the data • $Z_i$: number of nodes at the ith level of the thread • Let $\mu = E[p]$, the mean of the distribution p
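Branching Process Model (BP-Model) - 2 • From the definition of a branching process, $E[Z_i] = \mu^i$, where $\mu$ is the mean of p; summing over levels gives the expected thread size $E[Z] = \sum_i \mu^i = (1-\mu)^{-1}$ • Since $\mu < 1$ for all datasets, the branching process dies out. • Figure panels: Empirical, Simulated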
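The expected-size formula can be checked with a small simulation. This is a sketch under stated assumptions: the offspring distribution `p` below is made up for illustration (it is not from any of the datasets), and `simulate_thread_size` is a hypothetical helper name.

```python
import random


def simulate_thread_size(p, rng, max_nodes=100_000):
    """Total node count of one Galton-Watson tree.

    p maps k -> probability that a node has k children.
    max_nodes is a safety cap; it is rarely reached when the mean is < 1.
    """
    ks = list(p)
    weights = [p[k] for k in ks]
    size = 1       # the root message
    frontier = 1   # number of nodes in the current generation
    while frontier > 0 and size < max_nodes:
        # Every node in the current generation draws its number of children.
        next_generation = sum(rng.choices(ks, weights=weights, k=frontier))
        size += next_generation
        frontier = next_generation
    return size


# Illustrative offspring distribution with mean 0.6 (an assumption).
p = {0: 0.6, 1: 0.2, 2: 0.2}
mu = sum(k * q for k, q in p.items())

rng = random.Random(0)
sizes = [simulate_thread_size(p, rng) for _ in range(20_000)]
empirical_mean = sum(sizes) / len(sizes)

print("mu =", mu)
print("predicted E[Z] =", 1 / (1 - mu))
print("simulated mean size =", empirical_mean)
```

With a mean below 1 every simulated thread terminates, and the simulated mean size should land close to the predicted value.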
Branching Process Model (BP-Model) - 3 • Problems with the BP-Model • Model is not generative (degree distributions are stipulated) • Model does not capture the depth distributions that are observed in reality • Number of children is determined by a single distribution • Timestamps are left out
T-Model • Concept: new messages receive more attention than old ones • The probability of adding a child to node v is proportional to a function $h(\deg_v, r_v)$ of the degree and recency of v • The probability of death (the thread ending) is proportional to a constant • $h(\deg_v, r_v) = \alpha \deg_v + \tau^{r_v}$ for constants $\alpha \ge 0$ and $\tau \in (0,1)$ • Thus, both degree and recency play a role in generating different types of threads
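The growth rule on this slide can be sketched as a short simulation. Several details here are assumptions rather than the paper's exact specification: the stopping constant is called `delta`, the degree of a node is taken to be its number of replies, and the recency of node v is the number of steps since it arrived.

```python
import random


def simulate_t_model(alpha, tau, delta, rng, max_nodes=2_000):
    """Grow one thread; returns parent[i] for each node i (root has None).

    At each step the thread stops with probability proportional to delta,
    or a new message attaches to node v with probability proportional to
    h(deg_v, r_v) = alpha * deg_v + tau ** r_v.
    """
    parent = [None]  # node 0 is the root message
    degree = [0]     # number of replies each node has received (assumption)
    while len(parent) < max_nodes:
        t = len(parent)  # arrival index of the next message
        # Recency r_v = t - v, so the newest node has recency 1 (assumption).
        weights = [alpha * degree[v] + tau ** (t - v) for v in range(t)]
        total = sum(weights)
        if rng.random() < delta / (delta + total):
            break  # the thread dies
        v = rng.choices(range(t), weights=weights, k=1)[0]
        parent.append(v)
        degree[v] += 1
    return parent


thread = simulate_t_model(alpha=0.1, tau=0.5, delta=1.0, rng=random.Random(1))
print("thread size:", len(thread))
```

Raising `alpha` shifts weight toward already popular messages, while `tau` controls how quickly a message's recency term decays with age.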
TI-Model - 1 • The TI-Model was developed to model author identity. • Concept: authors tend to reply to responses to their own earlier messages. • Based on the Pólya urn model • Original Pólya urn problem: • Initially, an urn has x balls of color 1 and y balls of color 2. At each time t, one ball is drawn and returned to the urn together with another ball of the same color. • “Rich get richer” process
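The urn process described above is easy to simulate. This is a minimal sketch of the classic two-color Pólya urn only, not of the TI-Model itself; the function name and the starting counts are illustrative.

```python
import random


def polya_urn(x, y, steps, rng):
    """Run a two-color Pólya urn and return the final [color1, color2] counts.

    Each step draws a ball with probability proportional to its color's
    current count, then returns it along with one extra ball of that color.
    """
    counts = [x, y]
    for _ in range(steps):
        p_color1 = counts[0] / (counts[0] + counts[1])
        color = 0 if rng.random() < p_color1 else 1
        counts[color] += 1  # the drawn color gets one more ball
    return counts


final_counts = polya_urn(x=1, y=1, steps=100, rng=random.Random(0))
print(final_counts)
```

Running this with different seeds shows the "rich get richer" effect: early draws have a lasting influence on the final color proportions.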
TI-Model - 2 • New message v arrives with u = parent(v) • “Identity copying” effect: the author of v is either an author on path(parent(u)) or a random author • Figure panels: Empirical, Simulated
Examples • Usenet • Yahoo! Groups • Twitter
Usenet • Figure panels: Empirical, Simulated
Usenet • High $\alpha$ (degree weight): higher degree of preferential attachment • Top groups tended to be politically related • High $\tau$ (recency parameter): high recency effect • Lower-traffic groups had a higher recency effect
Usenet • Identity copying rates • High parameter value (low copying rate): new authors tend to join in often • Low parameter value (high copying rate): authors of posts tend to have already authored an earlier post
Yahoo! Groups • Groups with “bushy” threads and high recency effects
Twitter • Groups with “bushy” threads and high recency effects
Conclusion • Employed various mathematical models to simulate patterns in online conversations • Strengths: • Incorporated time and author identity in the models • Were able to predict patterns that were found in actual datasets • Weaknesses / further directions: • Explanatory power: how well do these models explain differences between conversational environments and/or networks? • Could incorporate other elements of conversation: • Topics • Structural/semantic components of messages • Actor characteristics/roles • How well do these models emulate different types of communication tools, e.g. Twitter?
References • Aldous, D. (2003). Lecture 2: Branching Processes. Accessed March 29, 2011 at http://www.stat.berkeley.edu/~aldous/Networks/lec2.pdf. • Kumar, R., Mahdian, M., & McGlohon, M. (2010). Dynamics of conversations. ACM SIGKDD 2010. • Zhu, T. (2009). Nonlinear Polya Urn Models and Self-Organizing Processes. Accessed March 29, 2011 at http://www.math.upenn.edu/grad/dissertations/tongzhudissertation.pdf.