A Random Walk Approach to Modeling the Dynamics of the Blogosphere

Alex X. Liu, Dept. of Computer Science and Engineering, Michigan State University. Joint work with M. Zubair Shafiq.


Presentation Transcript


  1. A Random Walk Approach to Modeling the Dynamics of the Blogosphere • Alex X. Liu, Dept. of Computer Science and Engineering, Michigan State University • Joint work with M. Zubair Shafiq

  2. Background • Important niche of online social networks • Blogosphere consists of two networks • Blog network (Nodes = Blogs, Edges = Hyperlinks) • Post network (Nodes = Posts, Edges = Hyperlinks) • [Figure: the blog network and the post network within the blogosphere]

  3. Motivation • Modeling the evolution dynamics of the blogosphere • How do blogs produce posts? • What are the underlying mechanisms? • Applications • Advertising • Forecasting • Studying the effect of probing to improve platform design

  4. Problem Statement • Model • Generative model of individual bloggers in the blogosphere • Replicated for all bloggers • Allowed to execute over a given period of time • Requirements • Only use local mechanisms • Intuitive and realistic • Evaluation • Ground truth: properties of real-world blogosphere • Temporal properties, e.g. inter-posting time • Topological properties, e.g. degree distribution

  5. Limitations of Prior Art • Only topological properties • Second space [Karandikar08ICWSM] • Ad hoc in nature • Multiple input parameters • Kronecker graphs [Leskovec07ICML] • Not specifically designed for the blogosphere • Only temporal properties • Randomized blogspace [Kumar03WWW] • Mostly focuses on the burstiness property • Both topological and temporal properties • Zero-crossing [Gotz09ICWSM] • No parameters, so properties cannot be controlled • Uses global properties, e.g. total number of in-links to a blog

  6. Proposed Approach • Random walk process • Different variants of the random walk process • Emulate the topological and temporal characteristics of individual bloggers • [Figure: two-dimensional and one-dimensional random walks]

  7. Flow Chart of the Proposed Model • Series of random walks for each blogger • Random walk 1 • Post at zero-crossing • Random walk 2 • Select new blogs to link (explore) • Random walk 3 • Select previously linked blogs to link (exploit) • Random walk 4 • Select post to link • Random walk 5 • Link to posts referred by the selected post

  8. Proposed Model • Random walk 1 (one dimensional random walk) • Temporal dynamics of a blogger's posting behavior • Publish a new post at zero crossing • Reproduce burstiness, self-similarity in publishing behavior • Slope of entropy plot ≈ 0.7 < 1
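
The posting rule on slide 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the walk is taken to be a simple symmetric walk with steps of ±1, and "zero crossing" is read as a return to the origin; the slides do not fix either detail. The names `posting_times` and `inter_posting_times` are hypothetical.

```python
import random

def posting_times(num_steps, seed=None):
    """Simulate a symmetric +/-1 random walk (an assumption) and
    return the step indices at which the walk is back at zero.
    Each such step is treated as the publication of a new post."""
    rng = random.Random(seed)
    position = 0
    times = []
    for step in range(1, num_steps + 1):
        position += rng.choice((-1, 1))
        if position == 0:
            times.append(step)
    return times

def inter_posting_times(times):
    """Gaps between consecutive posting times."""
    return [later - earlier for earlier, later in zip(times, times[1:])]

times = posting_times(100_000, seed=42)
gaps = inter_posting_times(times)
```

Because returns to zero cluster together and are then separated by long excursions, the resulting `gaps` mix many short values with a few very long ones, which is the bursty behavior the slide refers to.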

  9. Proposed Model • Random walk 2 (random walk on blog graph) • Select new blogs to link (Explore) • Starts at a randomly chosen node of the blog graph • Blog reached at the end of random walk is selected • Random walk 3 (random walk on blog graph) • Select previously linked blogs to link (Exploit) • Starts at the corresponding node of the blog graph • Blog reached at the end of random walk is selected
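
A minimal sketch of random walks 2 and 3 on a blog graph follows. It assumes the graph is a dictionary mapping each blog to the list of blogs it links to, that the walk has a fixed length, and that "the corresponding node" in random walk 3 means the blogger's own blog; the slides do not spell out these details, and the function names are hypothetical.

```python
import random

def random_walk(graph, start, length, rng):
    """Follow up to `length` randomly chosen out-links from `start`.
    The walk stops early at a node with no out-links (an assumption)."""
    node = start
    for _ in range(length):
        neighbors = graph.get(node, [])
        if not neighbors:
            break
        node = rng.choice(neighbors)
    return node

def explore(graph, length, rng):
    """Random walk 2: start at a randomly chosen blog."""
    start = rng.choice(sorted(graph))
    return random_walk(graph, start, length, rng)

def exploit(graph, own_blog, length, rng):
    """Random walk 3: start at the blogger's own node (assumed reading)."""
    return random_walk(graph, own_blog, length, rng)

# Toy blog graph, purely illustrative.
blog_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
rng = random.Random(0)
new_target = explore(blog_graph, length=3, rng=rng)
familiar_target = exploit(blog_graph, own_blog="a", length=2, rng=rng)
```

Both variants use only the out-links visible from the current node, which matches the requirement that the model rely on local mechanisms.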

  10. Proposed Model • Random walk 4 (random walk on post graph) • Selects a post from the selected blog • Posts are ordered in reverse-chronological order • Starts at the latest post • Post reached at the end of random walk is selected • Random walk 5 (random walk on post graph) • Blogger recursively refers to some out-links of the selected post (link expansion) • Starts at the post selected in random walk 4 • Post reached at the end of random walk is selected
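
Random walks 4 and 5 might look like the sketch below. The slides do not say how the walk moves between posts, so the step rule here is an assumption: random walk 4 moves one position older or newer along the reverse-chronological list (clamped at both ends), and random walk 5 follows randomly chosen out-links of posts. The names `select_post` and `expand_link` are hypothetical.

```python
import random

def select_post(posts_newest_first, length, rng):
    """Random walk 4 sketch: start at the latest post (index 0) and
    take `length` steps of +/-1 along the list (assumed step rule).
    The list must be non-empty."""
    index = 0
    last = len(posts_newest_first) - 1
    for _ in range(length):
        index += rng.choice((-1, 1))
        index = max(0, min(last, index))  # stay inside the list
    return posts_newest_first[index]

def expand_link(post_links, start_post, length, rng):
    """Random walk 5 sketch: from the post chosen by random walk 4,
    follow up to `length` out-links and return the post reached."""
    post = start_post
    for _ in range(length):
        out_links = post_links.get(post, [])
        if not out_links:
            break
        post = rng.choice(out_links)
    return post

# Toy data, purely illustrative.
rng = random.Random(7)
posts = ["p3", "p2", "p1"]                      # newest first
links = {"p3": ["q1"], "q1": ["q2"], "q2": []}  # post -> linked posts
chosen = select_post(posts, length=4, rng=rng)
expanded = expand_link(links, "p3", length=2, rng=rng)
```

Starting at index 0 biases the selection toward recent posts when the walk is short, which is one plausible reason for ordering posts in reverse-chronological order.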

  11. Inter-posting times • Definition: time between two consecutive posts • The distribution of inter-posting times follows a power law • Implication: blogging activity is characterized by long periods of inactivity separated by short periods of activity
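
One way to see why posting at zero crossings yields power-law inter-posting times: if the walk is assumed to be a simple symmetric walk with steps of ±1 (the slides do not state this), then the gap between consecutive posts is the first-return time $T$ to the origin, whose distribution is a classical result. The symbols $T$ and $n$ are introduced here only for illustration.

```latex
\[
  \Pr[T = 2n] \;=\; \frac{1}{2n-1}\binom{2n}{n}\,2^{-2n}
  \;\sim\; \frac{1}{2\sqrt{\pi}}\; n^{-3/2}
  \qquad (n \to \infty).
\]
```

The tail decays as a power of $n$ with exponent $3/2$, so short gaps are common while very long gaps remain possible, consistent with bursts of activity separated by long inactivity.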

  12. Blog in-degree • Definition: count of in-linking blogs • The distribution of blog in-degree follows a power law • Implication: only a few blogs receive a large number of in-links and a majority of blogs remain unnoticed
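
As a rough illustration of how in-degree and its distribution could be measured, the sketch below counts distinct in-linking blogs from a list of (source, target) hyperlinks and computes the empirical complementary CDF, which is commonly plotted on log-log axes to inspect a heavy tail. The edge-list format and the function names are assumptions, not part of the presentation.

```python
from collections import Counter

def in_degrees(edges):
    """Count distinct in-linking sources per target.
    Repeated hyperlinks between the same pair count once."""
    distinct = set(edges)
    return Counter(target for _source, target in distinct)

def degree_ccdf(degrees):
    """Empirical P(degree >= k) for every observed value k."""
    values = sorted(degrees)
    n = len(values)
    ccdf = {}
    for i, k in enumerate(values):
        # The first index at which k appears gives the count of values >= k.
        ccdf.setdefault(k, (n - i) / n)
    return ccdf

# Toy edge list, purely illustrative.
edges = [("a", "b"), ("a", "b"), ("c", "b"), ("a", "c")]
degrees = in_degrees(edges)
ccdf = degree_ccdf(list(degrees.values()))
```

The same two functions apply unchanged to a post-level edge list.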

  13. Post in-degree • Definition: count of in-linking posts • The distribution of post in-degree follows a power law • Implication: only a few posts receive a large number of in-links and a majority of posts remain unnoticed

  14. PageRank • Definition: PageRank (an eigenvector centrality) assigns an importance weight to every node in a network • The distribution of PageRank follows a power law • Implication: only a few posts are highly cited
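
For readers unfamiliar with PageRank, here is a minimal power-iteration sketch in plain Python. The damping factor of 0.85, the fixed iteration count, and the uniform redistribution of rank from nodes without out-links are conventional choices assumed here; the slides do not say how PageRank was computed.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict mapping node -> list of out-links.
    Rank held by nodes without out-links is spread uniformly (assumed)."""
    nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source in nodes:
            targets = graph.get(source, [])
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank

# Toy post graph, purely illustrative: "c" is cited by both "a" and "b".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph the most-cited node "c" ends up with the highest rank and the uncited node "b" with the lowest.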

  15. Future Work • Study the effect of varying the length of the random walk on structural properties of generated blogospheres • Structural properties: • Transitivity • Average clustering coefficient • Average shortest path length • Size of largest connected component
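
Two of the structural properties listed above can be computed with the sketch below. It assumes hyperlinks are treated as undirected and that the graph is given as a dictionary mapping each node to a set of neighbors; both are simplifying assumptions, and the function names are hypothetical.

```python
from collections import deque

def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than
    two neighbors contribute zero."""
    total = 0.0
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            continue
        links = sum(1 for u in neighbors for v in neighbors
                    if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0

# Toy graph: a triangle (1, 2, 3), a pendant node 4, an isolated node 5.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}, 5: set()}
component = largest_component_size(adj)
clustering = average_clustering(adj)
```

Running these on blogospheres generated with different walk lengths would give one concrete way to carry out the comparison the slide proposes.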

  16. Conclusions • Proposed a random-walk-based model to simulate the evolution of the blogosphere • Intuitive and simple • Simultaneously considers temporal and topological properties • Model works with local information only and does not use global information • Experiments show that the properties of the evolved blogosphere follow those of the real-world blogosphere • Structural properties can be controlled by varying the length of the random walk

  17. Questions?
