
Viral Marketing – Learning Influence Probabilities


orli

Presentation Transcript


  1. Viral Marketing – Learning Influence Probabilities

  2. Learning Influence Models

  3. Where do the numbers come from?

  4. Learning influence models • Where do influence probabilities come from? • Real world social networks don’t have probabilities! • Can we learn the probabilities from action logs? • Sometimes we don’t even know the social network • Can we learn the social network, too? • Does influence probability change over time? • Yes! How can we take time into account? • Can we predict the time at which a user is most likely to perform an action?

  5. Where do the weights come from? • Influence Maximization – Gen 0: academic collaboration networks (real) with weights assigned arbitrarily using some models: • Trivalency: weights chosen uniformly at random from {0.1, 0.01, 0.001}. [Figure: example graph with edge weights drawn from these three values]
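The trivalency assignment on this slide can be sketched in a few lines. This is only an illustration: the function name `assign_trivalency`, the edge-list representation, and the `seed` parameter are assumptions, not part of the slides.

```python
import random

# The three candidate weights named on the slide.
TRIVALENCY_WEIGHTS = (0.1, 0.01, 0.001)


def assign_trivalency(edges, seed=None):
    """Give every directed edge a weight drawn uniformly from the three values.

    `edges` is an iterable of (u, v) pairs; `seed` (hypothetical parameter)
    makes the random draw reproducible.
    """
    rng = random.Random(seed)
    return {edge: rng.choice(TRIVALENCY_WEIGHTS) for edge in edges}


# Toy graph used only for illustration.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
weights = assign_trivalency(edges, seed=0)
```

The weights carry no information about real behavior, which is the slide's point: they are picked at random, not learned.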

  6. Where do the weights come from? • Influence Maximization – Gen 0: academic collaboration networks (real) with weights assigned arbitrarily using some models: • Weighted Cascade: p(u,v) = 1/d_in(v), where d_in(v) is the in-degree of v. • Other variants: uniform (constant), WC with parallel edges (weight scaled by the number of edges between u and v). • LT: the value is the edge weight; IC: the value is the prob. of success of each attempt. • Weight assignment not backed by real data. [Figure: example graph with incoming edges labeled 1/3]
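Weighted Cascade is commonly defined as giving each edge (u, v) the value 1/in-degree(v), which matches the 1/3 labels on the slide. A minimal sketch under that assumption follows; the function name and the edge-list input are hypothetical, and the graph is assumed to have no parallel edges.

```python
from collections import Counter
from fractions import Fraction


def assign_weighted_cascade(edges):
    """Assign each directed edge (u, v) the weight 1 / in_degree(v).

    Assumes a simple directed graph (no parallel edges). Fractions are
    used so the 1/3 values stay exact.
    """
    edges = list(edges)
    in_degree = Counter(v for _, v in edges)
    return {(u, v): Fraction(1, in_degree[v]) for u, v in edges}


# Node "v" has three in-neighbors, so each incoming edge gets 1/3.
wc = assign_weighted_cascade([("a", "v"), ("b", "v"), ("c", "v"), ("v", "a")])
```

By construction, the incoming weights of every node sum to 1, which keeps the assignment valid as LT weights as well.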

  7. Inference problems • Given a log • P1. Social network not given • Infer network and edge weights • P2. Social network given • Infer edge weights • P3. Social network and attribution given • Explicit “trackbacks” to parent user • Simple counting
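For case P3, "simple counting" could look like the sketch below. The slides do not give the exact estimator, so this is one plausible reading: the weight of (u, v) is the fraction of u's logged actions for which v named u as the parent. The log layout `(user, action, parent)` and the function name are assumptions.

```python
from collections import Counter


def count_based_probs(log):
    """Estimate p(u, v) by counting explicit attributions.

    `log` is an iterable of (user, action, parent) records, where `parent`
    is the user credited via a trackback, or None if there is none.
    Estimator (an assumption): credits from v to u / actions performed by u.
    """
    actions_by = Counter()
    credited = Counter()
    for user, _action, parent in log:
        actions_by[user] += 1
        if parent is not None:
            credited[(parent, user)] += 1
    return {
        (u, v): n / actions_by[u]
        for (u, v), n in credited.items()
        if actions_by[u]  # skip parents with no logged actions of their own
    }


toy_log = [
    ("u12", "a1", None),
    ("u45", "a1", "u12"),  # u45 credits u12 for action a1
    ("u12", "a2", None),
    ("u45", "a3", None),
]
p3_probs = count_based_probs(toy_log)
```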

  8. P1. Social network not given • Observe activation times, assume probability of a successful activation decays (e.g., exponentially) with time. [Figure: actual network vs. learned network] [Gomez-Rodriguez, Leskovec, & Krause KDD 2010]

  9. P2. Social network given • Input data: (1) social graph and (2) action log of past propagations. [Figure: example posts “I read this article”, “great movie” (09:30, u45), and “I liked this movie” (09:00, u12); u45 follows u12]

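  10. P2. Social network given • D(0), D(1), …, D(t): D(t) is the set of nodes that acted at time t. • The cumulative sets collect all nodes that acted up to time t. • Find the edge probabilities that maximize the likelihood of the observed successes and failures. • Very expensive (not scalable). • Assumes influence weights remain constant over time. [Saito et al. KES 2008]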
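To make "likelihood of successes and failures" concrete, here is a rough sketch of a log-likelihood for one cascade under the independent cascade (IC) model. It is an illustration, not a transcription of the formula in Saito et al.; the function name, the data layout, and the handling of the final time step are assumptions.

```python
import math


def ic_log_likelihood(steps, out_neighbors, prob):
    """Log-likelihood of one cascade under IC, for given edge probabilities.

    steps:         list of sets; steps[t] holds the nodes that acted at time t
    out_neighbors: dict mapping node -> set of nodes it can influence
    prob:          dict mapping (v, w) -> probability, assumed strictly in (0, 1)
    """
    log_lik = 0.0
    cumulative = set()
    for t, current in enumerate(steps):
        cumulative |= current
        newly_active = steps[t + 1] if t + 1 < len(steps) else set()
        # Success: w acted at t+1, so at least one attempt from time t worked.
        for w in newly_active:
            all_fail = 1.0
            for v in current:
                if w in out_neighbors.get(v, ()):
                    all_fail *= 1.0 - prob[(v, w)]
            if all_fail >= 1.0:
                raise ValueError(f"{w!r} acted without an active in-neighbor")
            log_lik += math.log(1.0 - all_fail)
        # Failure: v tried w, and w is still inactive after time t+1.
        for v in current:
            for w in out_neighbors.get(v, ()):
                if w not in cumulative and w not in newly_active:
                    log_lik += math.log(1.0 - prob[(v, w)])
    return log_lik


toy_steps = [{"a"}, {"b"}]
toy_out = {"a": {"b", "c"}, "b": {"c"}}
toy_prob = {("a", "b"): 0.5, ("a", "c"): 0.2, ("b", "c"): 0.1}
toy_ll = ic_log_likelihood(toy_steps, toy_out, toy_prob)
```

In the toy cascade, a succeeds on b (0.5), fails on c (0.8), and b then fails on c (0.9). A learning method searches for the `prob` values that maximize this quantity summed over all cascades.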

  11. P2. Social network given • Action log consists of multiple cascades, each with its corresponding sets of acting nodes. • Cascades may have different lengths. • Maximize the log-likelihood using EM.

  12. P2. Social network given • Several models of influence probability • in the context of General Threshold model + time • consistent with IC and LT models • Models are able to predict whether a user will perform an action and, if so, the time at which she will perform it • Introduce metrics of user and action influenceability • high values => genuine influence • Develop efficient algorithms to learn the parameters of the models; minimize the number of scans over the propagation log • Incrementality property [Goyal, Bonchi, and L. WSDM 2010]

  13. Generalized Threshold Model • Each node v chooses a threshold θ_v uniformly at random from [0, 1], just like in the LT model. • But now, the net inflow of influence can be a general function p_v(S), where S is the set of in-neighbors of v that are currently active. • v activates iff p_v(S) ≥ θ_v. • Note: different nodes can not only choose different thresholds θ_v, they can also use different activation functions p_v. • Special case: p_v(S) = Σ_{u∈S} w(u,v), the sum of the incoming edge weights, gives the LT model.
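The activation rule can be sketched as a function that takes the inflow function as a parameter, with the LT sum as one choice. All names here (`activates`, `lt_inflow`, the weight dictionary) are hypothetical, and the threshold is passed in explicitly rather than sampled, to keep the example deterministic.

```python
def lt_inflow(active_in_neighbors, weights_into_v):
    """LT special case: inflow is the sum of weights from active in-neighbors."""
    return sum(weights_into_v[u] for u in active_in_neighbors)


def activates(inflow, threshold):
    """General threshold rule: the node activates iff inflow >= its threshold."""
    return inflow >= threshold


# Hypothetical weights on the edges into node v, and a fixed threshold.
weights_into_v = {"a": 0.3, "b": 0.4}
theta_v = 0.5

one_active = activates(lt_inflow({"a"}, weights_into_v), theta_v)
two_active = activates(lt_inflow({"a", "b"}, weights_into_v), theta_v)
```

Swapping `lt_inflow` for another monotone function of the active set changes the model without changing the rule itself.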

  14. Incrementality • In a learning algorithm, need to compute activation probs of nodes repeatedly as new in-neighbors get activated. • Suppose we have already computed p_v(S) for a node v w.r.t. its current set S of active in-neighbors. • Suppose a new in-neighbor w has activated and we want to compute p_v(S ∪ {w}) fast, i.e., from p_v(S) and p(w,v) instead of from scratch.

  15. Incrementality /2 • One activation function that satisfies our criterion is: p_v(S) = 1 − ∏_{u∈S} (1 − p(u,v)) • It’s incremental: p_v(S ∪ {w}) = p_v(S) + (1 − p_v(S)) · p(w,v) • Furthermore, this function is monotone and submodular. These properties are required since they are assumed by IC/LT models. • Incrementality is desirable (for speed), not required.
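A quick numerical check of the incremental update, assuming the activation function is the usual "at least one independent attempt succeeds" form, 1 − ∏(1 − p). The function names and the probability values are illustrative only.

```python
import math


def joint_prob(probs):
    """From scratch: probability that at least one independent attempt succeeds."""
    return 1.0 - math.prod(1.0 - p for p in probs)


def add_neighbor(current, p_new):
    """Incremental update when one more in-neighbor with probability p_new activates."""
    return current + (1.0 - current) * p_new


# Hypothetical influence probabilities of three in-neighbors of one node.
neighbor_probs = [0.1, 0.2, 0.5]

incremental = 0.0
for p in neighbor_probs:
    incremental = add_neighbor(incremental, p)

from_scratch = joint_prob(neighbor_probs)
```

Both routes give the same value (up to floating-point rounding), but the incremental one needs only the previous result and the new edge probability, which is what keeps the number of scans over the log low.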

  16. Influence models • Static Models: probabilities are static and do not change over time. • Bernoulli: fraction of u’s actions that propagated to v. • Jaccard: number of actions that propagated from u to v, divided by the number of actions performed by either u or v. • Continuous Time (CT) Models: probabilities decay exponentially in time. Not incremental, hence very expensive to apply on large datasets. • Discrete Time (DT) Models: active neighbor u of v remains contagious in [t, t + τ(u,v)], has constant influence prob. p(u,v) in the interval and 0 outside. Monotone, submodular, and incremental!
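The two static estimators can be sketched from an action log as follows. The definitions used here are an assumption based on the usual reading of the Bernoulli and Jaccard models: an action "propagates" from u to v when both performed it and u did so strictly earlier. The log layout and the function name are hypothetical.

```python
def static_estimates(action_log, u, v):
    """Return (bernoulli, jaccard) estimates of p(u, v).

    action_log: dict mapping action -> {user: time at which the user acted}
    Assumes v follows u, so propagation from u to v is possible.
    """
    acts_u = acts_u_or_v = propagated = 0
    for times in action_log.values():
        t_u, t_v = times.get(u), times.get(v)
        if t_u is not None:
            acts_u += 1
        if t_u is not None or t_v is not None:
            acts_u_or_v += 1
        if t_u is not None and t_v is not None and t_u < t_v:
            propagated += 1
    bernoulli = propagated / acts_u if acts_u else 0.0
    jaccard = propagated / acts_u_or_v if acts_u_or_v else 0.0
    return bernoulli, jaccard


toy_actions = {
    "m1": {"u": 1, "v": 2},  # propagated from u to v
    "m2": {"u": 1},          # only u acted
    "m3": {"v": 5},          # only v acted
    "m4": {"u": 3, "v": 1},  # v acted first, so no propagation from u
}
bernoulli, jaccard = static_estimates(toy_actions, "u", "v")
```

In the toy log, one of u's three actions propagated, and four actions involve u or v, so the two estimates are 1/3 and 1/4.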

  17. Evaluation • Flickr groups dataset (action=joining) • ~1.3M nodes, 40M edges, 36M actions • 80/20 training/testing split • Predict whether user will become active or not, given active neighbors • Binary classification, evaluated with ROC curves: TPR = TP/P = TP/(TP+FN), FPR = FP/N = FP/(FP+TN) [Figure: ROC curve marking the ideal point and the operating point]
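The two rates on this slide can be computed from predictions and ground truth as in the sketch below. The function name and the toy labels are illustrative and are not data from the evaluation.

```python
def roc_point(predicted, actual):
    """Return (TPR, FPR) for boolean predictions against boolean ground truth."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Toy example: did the model predict "becomes active", and did the user act?
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
tpr, fpr = roc_point(predicted, actual)
```

Sweeping the decision threshold on the predicted activation probability and plotting each resulting (FPR, TPR) pair traces the ROC curve; the ideal point is FPR = 0, TPR = 1.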

  18. Comparison of Static, CT and DT models • Time-conscious models better than the static model • CT and DT models perform equally well • Static and DT models are far more efficient compared to CT models because of their incremental nature

  19. Predicting Time – Distribution of Error • Operating point is chosen corresponding to TPR: 82.5%, FPR: 17.5%. • Most of the time, error in the prediction is very small

  20. Learning Influence Probabilities Takeaways • Influence network and weights not always available • Learn from the action log • [Gomez-Rodriguez et al. 2010]: Infer social network and edge weights • [Saito et al. 2008]: Infer edge weights using EM approach • [Goyal et al. 2010]: Infer both static and time-conscious models of influence • Using CT models, it is possible to predict even the time at which a user will perform an action with good accuracy. • Introduce metrics of user and action influenceability. • High values => easier prediction of influence. • Can be utilized in Viral Marketing decisions.
