Modeling the Spread of Influence on the Blogosphere
Akshay Java, Pranam Kolari, Tim Finin, and Tim Oates
UMBC Tech Report 04/12/06
Outline
• What is influence?
• Basic Influence Model
• Influence models for the blogosphere
• Results
• Conclusions
What is Influence?
Main Entry: in·flu·ence
Pronunciation: 'in-"flü-&n(t)s, esp Southern in-'
Function: noun
Etymology: Middle English, from Middle French, from Medieval Latin influentia, from Latin influent-, influens, present participle of influere to flow in, from in- + fluere to flow -- more at FLUID
1 a: an ethereal fluid held to flow from the stars and to affect the actions of humans b: an emanation of occult power held to derive from stars
2: an emanation of spiritual or moral force
3 a: the act or power of producing an effect without apparent exertion of force or direct exercise of command b: corrupt interference with authority for personal gain
4: the power or capacity of causing an effect in indirect or intangible ways : SWAY
5: one that exerts influence
- under the influence: affected by alcohol : DRUNK <was arrested for driving under the influence>
NOT This Kind of Influence! ;-)
Motivation
• Influence models have been studied for co-citation graphs
  • David Kempe, Jon Kleinberg, Eva Tardos, "Maximizing the Spread of Influence through a Social Network", KDD 2003
• The same models also apply to blogs
• Recent examples: startups, Microsoft Origami, Walmart, DoD
• GOAL: Predict influential blogs
• Target nodes to help achieve a “Tipping Point”*
* The Tipping Point, Malcolm Gladwell
Influence on the Blogosphere
Example: a blog post that was influenced by NPR and eWeek
Influence Models for the Blogosphere
• U links to V => U is influenced by V
• Edge weights: $W_{u,v} = C_{u,v} / d_v$
[Figure with two panels: Blog Graph, Influence Graph; example edge weights include 1/3, 2/5, 1/5, and 1/2]
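The edge-weight rule can be illustrated with a minimal Python sketch. The slide does not define $C_{u,v}$ or $d_v$, so the reading used here is an assumption: the count is the number of parallel links from the influenced blog to the influencing blog, and the normalizer is the influenced blog's total number of outgoing links, so that the weights flowing into each blog sum to 1. The function name `build_influence_graph` and the input layout are hypothetical.

```python
from collections import Counter, defaultdict


def build_influence_graph(links):
    """Turn blog hyperlinks into weighted influence edges.

    links: iterable of (source_blog, target_blog) pairs; repeated pairs
    count as parallel links.
    Returns {influenced_blog: {influencing_blog: weight}}.

    ASSUMPTION (not stated on the slide): the weight is the number of
    parallel links divided by the total number of outgoing links of the
    influenced blog.
    """
    link_counts = Counter(links)

    # Total number of outgoing links per source blog.
    out_links = Counter()
    for (src, _dst), count in link_counts.items():
        out_links[src] += count

    # "src links to dst" means "src is influenced by dst".
    influence = defaultdict(dict)
    for (src, dst), count in link_counts.items():
        influence[src][dst] = count / out_links[src]
    return dict(influence)


example_links = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "c")]
example_graph = build_influence_graph(example_links)
```

Under this reading, blog `a` in the example is influenced by `b` with weight 2/3 and by `c` with weight 1/3.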
Basic Influence Models
• Linear Threshold Model
  • Node v becomes active when $\sum_{w} b_{v,w} \ge \theta_v$, where the sum runs over the active neighbors w of v
• Cascade Model
  • $p_{v,w}$: the probability with which a node v can activate each of its neighbors w, independent of history
[Figure: Influence Graph showing active and inactive nodes, the threshold θv, and edge weights]
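Both models can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function names, the dictionary layouts, and the choice to pass thresholds in explicitly are assumptions (Kempe et al. draw each threshold uniformly at random from [0, 1]).

```python
import random


def linear_threshold(in_weights, seeds, thresholds):
    """Linear Threshold Model.

    in_weights: {v: {w: b_vw}}, the weight of neighbor w's influence on v.
    thresholds: {v: theta_v} for every v in in_weights (assumed given).
    Returns the set of nodes that are active once nothing changes.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, neighbors in in_weights.items():
            if v in active:
                continue
            # Sum the weights of v's currently active neighbors.
            total = sum(b for w, b in neighbors.items() if w in active)
            if total >= thresholds[v]:
                active.add(v)
                changed = True
    return active


def independent_cascade(out_probs, seeds, rng):
    """Cascade Model.

    out_probs: {v: {w: p_vw}}, the probability that v activates w.
    Each newly active node gets one chance to activate each inactive
    neighbor, independent of earlier attempts.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for v in frontier:
            for w, p in out_probs.get(v, {}).items():
                if w not in active and rng.random() < p:
                    active.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return active


lt_result = linear_threshold(
    {"b": {"a": 0.5, "c": 0.5}, "c": {"a": 1.0}},
    seeds={"a"},
    thresholds={"b": 0.8, "c": 0.6},
)
ic_result = independent_cascade(
    {"a": {"b": 1.0}, "b": {"c": 0.0}}, seeds={"a"}, rng=random.Random(0)
)
```

In the threshold example, `c` activates first (weight 1.0 against threshold 0.6), which then pushes `b` over its threshold of 0.8. In the cascade example, the probabilities 1.0 and 0.0 make the outcome deterministic: `b` is activated and `c` is not.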
Node Selection Heuristics
• Inlinks
  • Easily spammed
• Centrality
  • Expensive to compute for very large graphs
• PageRank
  • Requires link information
  • However, it is easy to compute
• Greedy Heuristic
  • Computationally expensive
  • However, it performs better
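The trade-off between the cheapest and the most expensive heuristic can be made concrete with a short Python sketch. The slide does not spell out the greedy procedure; the version below assumes the hill-climbing approach of Kempe et al., which repeatedly adds the node with the largest estimated gain in spread, with spread estimated by repeated cascade simulations. All function names, the `runs` and `seed` parameters, and the data layouts are hypothetical.

```python
import random
from collections import Counter


def top_by_inlinks(links, k):
    """Baseline: the k blogs with the most incoming links."""
    inlinks = Counter(dst for _src, dst in links)
    return [blog for blog, _count in inlinks.most_common(k)]


def simulate_cascade(out_probs, seeds, rng):
    """One independent-cascade run; returns the set of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for v in frontier:
            for w, p in out_probs.get(v, {}).items():
                if w not in active and rng.random() < p:
                    active.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return active


def expected_spread(out_probs, seeds, runs, rng):
    """Monte Carlo estimate of the number of nodes activated by seeds."""
    total = sum(len(simulate_cascade(out_probs, seeds, rng)) for _ in range(runs))
    return total / runs


def greedy_select(out_probs, nodes, k, runs=200, seed=0):
    """Greedy hill climbing: add the node with the best estimated spread.

    Each round costs one spread estimate per remaining candidate, which
    is why this heuristic is computationally expensive.
    """
    rng = random.Random(seed)
    nodes = list(nodes)
    chosen = []
    for _ in range(min(k, len(nodes))):
        best_node, best_spread = None, -1.0
        for v in nodes:
            if v in chosen:
                continue
            spread = expected_spread(out_probs, chosen + [v], runs, rng)
            if spread > best_spread:
                best_node, best_spread = v, spread
        chosen.append(best_node)
    return chosen


toy_probs = {"a": {"b": 1.0, "c": 1.0}, "d": {"e": 1.0}}
picked = greedy_select(toy_probs, ["a", "b", "c", "d", "e"], k=2, runs=5)
```

On the toy graph every activation probability is 1.0, so the estimates are exact: `a` reaches three nodes and is picked first, and `d` adds the most new nodes in the second round.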
Effect of Splogs on Node Selection (indegree vs. PageRank)
Almost 54% of the links were from splogs/failed to splogs/failed!
Effect of Splogs on Inlinks
Tightly knit community of splogs
Influence Models (without splog detection)
[Plot; axis label: Number of nodes selected]
Conclusions
• Influence models can be applied to blogs, not just co-citation graphs
• Splogs are a problem
• Greedy heuristics work well; PageRank is an inexpensive approximation
Ideas for CIKM 06
• Good or bad influence? Associating sentiment with links.
• Finding influential blogs for a topic. (SVM accuracy 75-85%)
• Community structure of blogs.
Questions
• Comments/Feedback?
• Thanks!
• Acknowledgement: Buzzmetrics/Blogpulse for the dataset.