Implicit Structure and Dynamics of Blogspace
Lada Adamic
Accelerating Change 2004
(joint work with: Eytan Adar, Li Zhang, and Rajan Lukose)
Blogs and the digital experience
• Use:
  • Record real-world and virtual experiences
  • Easy to record and discuss things “seen” on the net
• Structure: blog-to-blog linking
• Use + Structure
  • Great to track “memes”: ideas spreading in the blogosphere like an epidemic
Our interest
• Macroscopic patterns of blog epidemics
  • How does the popularity of a topic evolve over time?
• Microscopic patterns of blog epidemics
  • Implicit & Explicit
  • Who is getting information from whom?
• Ranking algorithms that take advantage of infection patterns
Tracking Blogs
• Blogdex: Earliest example
  • Lets you see which blogs linked to a site, and when
• Others emerged with similar/related functionality
• Can find epidemic profiles (popularity over time)
• Our question: do different types of information have different epidemic profiles?
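An epidemic profile, as described above, is just a count of how many blogs picked up a URL on each day. The following is a minimal sketch of that idea, not the tooling used in the talk. The input format (a list of `(blog, url, day)` tuples) and the function name `epidemic_profile` are assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def epidemic_profile(citations, url):
    """Return [(day_offset, new_blogs)] for `url`, starting at its first mention.

    `citations` is an iterable of (blog, url, day) tuples (hypothetical format).
    """
    # Keep only the earliest mention per blog so repeat posts are not double-counted.
    first_seen = {}
    for blog, cited_url, day in citations:
        if cited_url != url:
            continue
        if blog not in first_seen or day < first_seen[blog]:
            first_seen[blog] = day
    if not first_seen:
        return []
    start = min(first_seen.values())
    counts = Counter((day - start).days for day in first_seen.values())
    # Fill in days with no new mentions so the profile is a continuous series.
    return [(offset, counts.get(offset, 0)) for offset in range(max(counts) + 1)]


citations = [
    ("a", "u", date(2004, 1, 1)),
    ("b", "u", date(2004, 1, 1)),
    ("c", "u", date(2004, 1, 3)),
    ("a", "u", date(2004, 1, 4)),  # repeat mention by blog "a", ignored
    ("d", "v", date(2004, 1, 2)),  # different URL, ignored
]
profile = epidemic_profile(citations, "u")
```

Profiles built this way can be normalized and compared across URLs to ask whether different kinds of content spread differently.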
For Example…
[Figure: popularity (y-axis) over time (x-axis), with the Slashdot Effect and the BoingBoing Effect labeled]
Clusters reflect different epidemic profiles
• Major News – front page: more delayed death (broader interest)
• Slashdot: huge surge followed by sharp drop (slashdot-effect)
Clusters
• Products, etc.: sustained over a period of time
• Major-news site (editorial content) – back of the paper
Microscale Dynamics
• What do we need to track specific epidemics?
  • Timings
  • Graphs
[Diagram: blogs b1, b2, b3 along a time-of-infection axis from t0 to t1]
Microscale Dynamics
• Challenges
  • Root may be unknown
  • Multiple possible paths
  • Uncrawled space, alternate media (email, voice)
  • No links
[Diagram: blogs b1, b2, b3, bn along a time-of-infection axis from t0 to t1, with unknown (“?”) links]
Microscale Dynamics: who is getting info from whom
• Explicit blog-to-blog links (easy)
  • Via links are even better
• Implicit/inferred transfer (harder)
  • Use ML algorithm for link inference problem
    • Support Vector Machine (SVM)
    • Logistic Regression
  • What we can use
    • Full text
    • Blogs in common
    • Links in common
    • History of infection
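The four signals listed above can be turned into a feature vector for a pair of blogs and fed to a classifier such as logistic regression. The sketch below is one plausible encoding, not the features or model from the talk: the dictionary keys (`words`, `blogroll`, `links`, `infected_at`), the use of Jaccard overlap, and the helper names are all assumptions, and any weights passed to `logistic_score` would have to be learned from labeled data.

```python
import math


def jaccard(a, b):
    """Overlap between two collections, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def history_rate(src_times, dst_times):
    """Fraction of shared URLs that the candidate source cited before the target."""
    shared = set(src_times) & set(dst_times)
    if not shared:
        return 0.0
    earlier = sum(1 for url in shared if src_times[url] < dst_times[url])
    return earlier / len(shared)


def pair_features(src, dst):
    """Features for 'dst got its information from src' (hypothetical encoding)."""
    return [
        jaccard(src["words"], dst["words"]),        # full text
        jaccard(src["blogroll"], dst["blogroll"]),  # blogs in common
        jaccard(src["links"], dst["links"]),        # links in common
        history_rate(src["infected_at"], dst["infected_at"]),  # history of infection
    ]


def logistic_score(features, weights, bias):
    """Logistic regression score; weights and bias come from training."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


src = {"words": {"a", "b"}, "blogroll": {"x"}, "links": {"u1", "u2"},
       "infected_at": {"u1": 1, "u2": 5}}
dst = {"words": {"b", "c"}, "blogroll": {"x"}, "links": {"u1"},
       "infected_at": {"u1": 2, "u2": 3}}
features = pair_features(src, dst)
```

An SVM could consume the same feature vector; only the scoring function would change.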
Visualization
• Zoomgraph tool
  • Using GraphViz (by AT&T) layouts
• Simple algorithm
  • If a single explicit link exists, draw it
  • Otherwise use ML algorithm
    • Pick the most likely explicit link
    • Pick the most likely possible link
• Tool lets you zoom around space, control threshold, link types, etc.
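The link-selection rule above can be read as a small decision procedure. The sketch below is one interpretation of the slide, not the Zoomgraph code: it assumes that each infected blog gets at most one incoming edge, that `score` is a classifier probability for a candidate source, and that the threshold applies only to inferred links. The name `choose_parent` is hypothetical.

```python
def choose_parent(explicit, candidates, score, threshold=0.0):
    """Pick one likely source for an infected blog.

    explicit   -- earlier-infected blogs this blog links to directly
    candidates -- all earlier-infected blogs (possible implicit sources)
    score      -- callable mapping a blog to a likelihood from the ML model
    threshold  -- minimum score for drawing an inferred link (assumption)
    Returns (source, link_type).
    """
    if len(explicit) == 1:
        # A single explicit link is unambiguous, so draw it as-is.
        return explicit[0], "explicit"
    if explicit:
        # Several explicit links: keep the one the model finds most likely.
        return max(explicit, key=score), "explicit"
    if candidates:
        # No explicit link: fall back to the most likely possible link.
        best = max(candidates, key=score)
        if score(best) >= threshold:
            return best, "inferred"
    return None, "none"


scores = {"b1": 0.2, "b2": 0.9, "b3": 0.5}
single = choose_parent(["b1"], ["b1", "b2"], scores.get)
several = choose_parent(["b1", "b3"], ["b1", "b2", "b3"], scores.get)
inferred = choose_parent([], ["b1", "b2"], scores.get)
```

Raising `threshold` prunes weak inferred edges, which mirrors the threshold control mentioned for the tool.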
Giant Microbes epidemic visualization
[Figure legend: blog, explicit link, via link, inferred link]
iRank
• “Practical” uses of inferred epidemic information
  • Can use a simpler inference (timing)
• Finding good sources
  • Invisible authorities
[Diagram: b1 (true source), b2 (popular site), b3, b4, …, b5, bn]
iRank Algorithm
• Draw a weighted edge for all pairs of blogs that cite the same URL
  • Higher weight for mentions closer together
• Run PageRank
• Control for ‘spam’
[Diagram: time-of-infection axis from t0 to t1]
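The steps above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the iRank implementation: the slide does not give the weight function or the edge direction, so the code assumes edges point from the later blog to the earlier one (so rank flows toward likely sources), uses `1 / gap_in_days` as the weight, skips same-day pairs, and omits the spam control entirely. The function names `build_graph` and `pagerank` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(mentions):
    """mentions: iterable of (blog, url, day). Returns {later: {earlier: weight}}."""
    by_url = defaultdict(dict)
    for blog, url, day in mentions:
        # Keep the earliest mention of each URL per blog.
        if blog not in by_url[url] or day < by_url[url][blog]:
            by_url[url][blog] = day
    graph = defaultdict(lambda: defaultdict(float))
    for times in by_url.values():
        for a, b in combinations(times, 2):
            if times[a] == times[b]:
                continue  # same day: no evidence about direction (assumption)
            earlier, later = (a, b) if times[a] < times[b] else (b, a)
            gap = times[later] - times[earlier]
            graph[later][earlier] += 1.0 / gap  # closer mentions weigh more
    return {node: dict(edges) for node, edges in graph.items()}


def pagerank(graph, nodes, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration; dangling nodes spread rank uniformly."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = graph.get(v, {})
            total = sum(out.values())
            if total == 0:
                for u in nodes:
                    new[u] += damping * rank[v] / n
            else:
                for u, w in out.items():
                    new[u] += damping * rank[v] * w / total
        rank = new
    return rank


mentions = [("b1", "u", 0), ("b2", "u", 1), ("b3", "u", 2),
            ("b1", "v", 0), ("b3", "v", 1)]
graph = build_graph(mentions)
ranks = pagerank(graph, ["b1", "b2", "b3"])
```

In this toy input b1 always cites first, so it ends up with the highest rank even though no blog links to it explicitly.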
Do Bloggers Kill Kittens?
Friday morning Wired writes: “Warning: Blogs Can Be Infectious.”
Shortly thereafter Slashdot posts: “Bloggers' Plagiarism Scientifically Proven”
Which is picked up by Metafilter as “A good amount of bloggers are outright thieves.”
Research at the Information Dynamics Lab at HP: http://www.hpl.hp.com/research/idl ladamic@hpl.hp.com