Lecture 3: Influence and Profit. Ding-Zhu Du, University of Texas at Dallas. PhD Dissertation: Optimization in Social Networks - Influence and Profit. Candidate: Yuqing Zhu. Advised by Prof. Weili Wu and Prof. Ding-Zhu Du.
13th IEEE International Conference on Data Mining (ICDM 2013) Influence and Profit: Two Sides of the Coin Yuqing Zhu
Overview • Background: Social Influence Propagation & Maximization • Influence and Profit • Proposed Model & Its Properties • BIP Maximization Algorithms • Experimental Results • Conclusions
Influence in Social Networks • We live in communities and interact with friends, families, and even strangers • This forms social networks • In social interactions, people may influence each other
Influence Diffusion & Viral Marketing • Word-of-mouth effect: one person's recommendation (“iPad Air is great”) is repeated by friends and spreads through the network • Source: Wei Chen’s KDD’10 slides
Social Network as Directed Graph [Figure: a directed graph whose edges are labeled with influence weights such as 0.13, 0.6, and 0.3] • Nodes: Individuals in the network • Edges: Links/relationships between individuals • Edge weight on $(u, v)$: Influence weight of $u$ on $v$
Linear Threshold (LT) Model – Definition • Each node $v$ chooses an activation threshold $\theta_v$ uniformly at random from $[0,1]$ • Time unfolds in discrete steps $0, 1, 2, \dots$ • At step 0, a set $S$ of seeds is activated • At any step $t$, activate node $v$ if $\sum_{\text{active in-neighbors } u \text{ of } v} w_{u,v} \ge \theta_v$, where $w_{u,v}$ is the influence weight on edge $(u, v)$ • The diffusion stops when no more nodes can be activated • Influence spread of $S$: The expected number of active nodes by the end of the diffusion, when targeting $S$ initially
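Linear Threshold (LT) Model – Example [Figure: LT diffusion from seed $v$ on a graph with nodes $u$, $v$, $w$, $x$, $y$; the legend marks inactive nodes, active nodes, thresholds, and total influence weights; the process runs until no further node can be activated] • Influence spread of {v} is 4 • Source: David Kempe’s slides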
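The LT process defined above can be sketched as a short simulation. This is a minimal illustration, not code from the slides: the names `simulate_lt`, `lt_spread`, and the `in_edges` dictionary layout are assumptions made for the example, and the influence spread is estimated by plain Monte Carlo averaging.

```python
import random

def simulate_lt(in_edges, seeds, rng):
    """Run one LT diffusion.

    in_edges[v] maps each in-neighbor u of v to the weight of edge (u, v).
    Returns the set of active nodes when the diffusion stops.
    """
    nodes = set(in_edges) | set(seeds)
    for nbrs in in_edges.values():
        nodes.update(nbrs)
    # Each node draws its threshold uniformly at random from [0, 1).
    theta = {v: rng.random() for v in sorted(nodes)}
    active = set(seeds)
    while True:
        newly = set()
        for v in nodes - active:
            total = sum(w for u, w in in_edges.get(v, {}).items() if u in active)
            # total > 0 guards against the measure-zero draw theta == 0.
            if total > 0 and total >= theta[v]:
                newly.add(v)
        if not newly:
            return active
        active |= newly

def lt_spread(in_edges, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_lt(in_edges, seeds, rng)) for _ in range(runs)) / runs
```

For instance, on the hypothetical chain `{"b": {"a": 1.0}, "c": {"b": 1.0}}` with seed set `{"a"}`, every run activates all three nodes, because a total weight of 1.0 meets any threshold.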
Independent Cascade (IC) Model – Definition • $p_{u,v}$ is the probability of success when $u$ tries to activate $v$ • Time unfolds in discrete steps $0, 1, 2, \dots$ • At step 0, a set $S$ of seeds is activated • At any step $t$, a newly activated node $u$ has one chance to activate its out-neighbor $v$, with probability $p_{u,v}$ • The diffusion stops when no more nodes can be activated • Influence spread of $S$: The expected number of active nodes by the end of the diffusion, when targeting $S$ initially
Independent Cascade (IC) Model – Example [Figure: diffusion from seed $v$ on a graph with nodes $u$, $v$, $w$, $x$, $y$ and edge labels between 0.1 and 0.6; the process runs until no further node can be activated] • Influence spread of {v} is 4 • Source: David Kempe’s slides
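The IC process can be sketched in the same way. The function names and the `out_edges` layout below are illustrative assumptions, not part of the slides; each newly activated node gets exactly one attempt per inactive out-neighbor.

```python
import random

def simulate_ic(out_edges, seeds, rng):
    """Run one IC diffusion.

    out_edges[u] maps each out-neighbor v of u to the success probability
    of u activating v. Returns the final set of active nodes.
    """
    active = set(seeds)
    frontier = sorted(seeds)  # nodes activated in the previous step
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in out_edges.get(u, {}).items():
                # One chance only: u is never placed on the frontier again.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def ic_spread(out_edges, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(out_edges, seeds, rng)) for _ in range(runs)) / runs
```

Exact computation of the spread is hard in general, which is why this sketch averages over repeated random runs instead.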
Influence Maximization Problem • Input: A directed graph representing a social network, with influence weights on edges • Problem: Select k individuals such that by activating them, influence spread is maximized • Output: The selected set of k seeds • NP-hard • #P-hard to compute exact influence
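Because the problem is NP-hard, a common practical approach is greedy hill-climbing with Monte Carlo spread estimates. The slide does not prescribe an algorithm at this point, so the sketch below is only one standard option, shown here under the IC model; `greedy_seeds`, `ic_spread`, and the graph layout are hypothetical names chosen for the example.

```python
import random

def ic_spread(out_edges, seeds, runs, rng):
    """Average number of active nodes over `runs` IC simulations."""
    total = 0
    for _ in range(runs):
        active = set(seeds)
        frontier = sorted(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v, p in out_edges.get(u, {}).items():
                    if v not in active and rng.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / runs

def greedy_seeds(out_edges, nodes, k, runs=200, seed=0):
    """Pick k seeds one at a time, each maximizing the estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        candidates = [v for v in sorted(nodes) if v not in chosen]
        best = max(candidates,
                   key=lambda v: ic_spread(out_edges, chosen + [v], runs, rng))
        chosen.append(best)
    return chosen
```

Each round re-estimates the spread for every remaining candidate, so the cost grows with `k`, the number of nodes, and `runs`; this is the scalability issue that faster heuristics try to address.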
Influence vs. Profit • Classical models do not fully capture monetary aspects of product adoptions • Being influenced ≠ being willing to purchase • Classical models do not consider the willingness of active nodes to spread the influence • Being influenced ≠ being willing to spread the influence
Influence vs. Profit • Influence: • Profit: • In the market, a famous company does not always make a generous profit, e.g., Twitter, SONY, Weibo
Product Adoption • Product adoption is a two-stage process (Kalish 85) • 1st stage: Awareness • Get exposed to the product • Become familiar with its features • 2nd stage: Actual adoption • Only if valuation outweighs price • Only in this case the company gains real profit • The 2nd stage is not captured in existing work
Our Contribution • Incorporate monetary aspects to model the willingness of nodes to spread influence • Price-Related (PR) Frame • PR-L model: extends the LT model • PR-I model: extends the IC model • Balanced Influence and Profit (BIP) Maximization Problem • Two Marketing Strategies: • BinarY priCing (BYC) • PAnoramic Pricing (PAP)
Price Related (PR) Frame [Diagram: Neutral → Influenced (rules in IC or LT) → Active] • Three node states: Neutral, Influenced, and Active • Neutral → Influenced: same as in LT or IC • Influenced → Active: Only if the valuation is at least the quoted price • Only active nodes will propagate influence to inactive neighbors
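The three-state rule can be illustrated with a small simulation of a PR-I style diffusion (IC rules for the Neutral → Influenced step). This is a sketch under stated assumptions, not the authors' implementation: each node has a fixed valuation, decides once when it becomes influenced, and the names `simulate_pr_i`, `valuation`, and `price` are hypothetical.

```python
import random

def simulate_pr_i(out_edges, seeds, valuation, price, rng):
    """Run one PR-I style diffusion and return (influenced, active).

    out_edges[u][v] is the IC success probability on edge (u, v).
    A node turns active only if its valuation is at least its quoted price.
    """
    influenced = set(seeds)
    active = {s for s in seeds if valuation[s] >= price[s]}
    frontier = sorted(active)
    while frontier:
        nxt = []
        for u in frontier:  # only active nodes propagate influence
            for v, p in out_edges.get(u, {}).items():
                if v in influenced:
                    continue  # v already made its adoption decision
                if rng.random() < p:
                    influenced.add(v)
                    if valuation[v] >= price[v]:
                        active.add(v)
                        nxt.append(v)
        frontier = nxt
    return influenced, active
```

On a chain a → b → c with all probabilities 1.0, quoting b a price above its valuation leaves b influenced but not active, so c is never reached; lowering b's price below its valuation lets the cascade continue to c.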
Pricing Strategies • BinarY priCing (BYC) • PAnoramic Pricing (PAP)
BIP: Notations • Price vector: the vector of quoted prices, one per node • $S$: the seed set • Influence function (real-valued): the expected influence earned by targeting $S$ and setting the quoted prices • Profit function (real-valued): the expected profit earned by targeting $S$ and setting the quoted prices • $B$: the objective function of the Balanced Influence and Profit problem
BIPMax Problem Definition • Input: A directed graph representing a social network, with influence weights on edges • Problem: Select a set of seeds and determine a vector of quoted prices, such that the BIP objective is maximized under the PR Frame • Output: The seed set and the price vector
BIPMax vs. InfMax • Differences from InfMax under LT/IC • Propagation models are different & have distinct properties • InfMax only requires a “binary decision” on nodes, while BIPMax also requires setting prices
A Restricted Special Case • Simplifying assumptions: • Valuation: • Pricing: BYC (Seeds get the item for free) • Every seed will automatically adopt the product and propagate the influence • The optimal price vector is not in question
A Restricted Special Case • : The uniform price for every non-seed • : production cost • max:
A Restricted Special Case • Theorem 1:Under PR-I model, when , maximizing B(S) is in P (can be solved in polynomial time). • Equivalent to: how to find the minimum set of nodes such that there is a path from this set to each node in this graph.
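The equivalent graph problem stated on this slide (a minimum set of nodes from which every node is reachable) has a well-known polynomial-time solution: take one node from each strongly connected component that has no incoming edge from another component. The sketch below illustrates only that graph problem, not the full proof of Theorem 1; `min_reaching_set` and its edge-list input are names assumed for the example.

```python
def min_reaching_set(nodes, edges):
    """Return one node per source SCC (Kosaraju's two-pass algorithm)."""
    adj = {v: [] for v in nodes}
    radj = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Pass 1: iterative DFS on the graph, recording nodes by finish time.
    order, seen = [], set()
    for s in nodes:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(adj[s]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(adj[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing finish time;
    # each search tree is one strongly connected component.
    comp = {}
    for s in reversed(order):
        if s in comp:
            continue
        comp[s] = s
        stack = [s]
        while stack:
            node = stack.pop()
            for prev in radj[node]:
                if prev not in comp:
                    comp[prev] = s
                    stack.append(prev)

    # A component is a source if no edge enters it from another component.
    has_incoming = {comp[v] for u, v in edges if comp[u] != comp[v]}
    return sorted({c for c in comp.values() if c not in has_incoming})
```

Every source component must contain a chosen node, since nothing outside can reach it, and one node per source component reaches everything else, so the returned set is minimum.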
A Restricted Special Case • Theorem 2:Under PR-I model, when , maximizing B(S) is NP-hard. • Reduction from the Set Cover problem • Theorem 3:Under PR-L model, for any , maximizing B(S) is NP-hard. • Reduction from the Vertex Cover problem
[Figure: example instances with nodes labeled s1 to s4, u1 to u7, and v1 to v8] • Set Cover problem. • Vertex Cover problem.
[Figure: reduction construction] • Theorem 2: Reduction from the Set Cover problem.
[Figure: reduction construction] • Theorem 3: Reduction from the Vertex Cover problem.
BIPMax Algorithms • Given the distribution function (CDF) of each node's valuation, define the myopic optimal price for that node • Myopic: Ignores network structures and “profit potential” (from influence) of seeds
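The exact formula for the optimal price did not survive in this transcript. As a hedged illustration only, the sketch below assumes one common definition of a myopic price: the price $p$ that maximizes single-node expected revenue $p \cdot (1 - F(p))$, where $F$ is the valuation CDF and production cost and network effects are ignored. The function name `myopic_price` and the grid search are assumptions of this example, not the slide's method.

```python
def myopic_price(cdf, grid_size=1000):
    """Grid-search the price in [0, 1] maximizing p * (1 - F(p)).

    cdf(p) is the probability that a node's valuation is below p, so
    1 - cdf(p) is the chance the node accepts the quoted price p.
    """
    best_price, best_revenue = 0.0, 0.0
    for i in range(grid_size + 1):
        p = i / grid_size
        revenue = p * (1.0 - cdf(p))
        if revenue > best_revenue:
            best_price, best_revenue = p, revenue
    return best_price
```

For a valuation uniform on [0, 1], `myopic_price(lambda p: p)` lands on 0.5, where $p(1-p)$ peaks. Such a price is myopic in the sense used on the slide: it looks at one node in isolation and ignores what that node's adoption could trigger downstream.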
Determining the Seeds and Prices under BYC • Assign all the nodes a uniform price • Pick the node that brings the maximum profit
Determining the Seeds and Prices under PAP • Two possible results after offering price to : • accepts, . The influence collected from is 1 and the profit is . • does not accept, . The influence collected from it is 1 and the profit is 0.
Determining the Seeds and Prices under PAP • BIP Margin Profit: • : • Define:
Determining the Seeds and Prices under PAP • Assign each node its myopic price • Decide the new price that maximizes BIP • Pick the node that brings the maximum profit with its new price
Network Datasets • Enron • A dataset about users who are mostly senior management at Enron • Epinions • A who-trusts-whom network from the customer review site Epinions.com • NetHEPT • A co-authorship network from the High Energy Physics Theory section of arxiv.org
Network Datasets • Statistics of the datasets:
Experimental Results: Influence and profit of trivalency PR-I on NetHEPT
Experimental Results: Influence and profit of weighted cascade PR-I on Epinions
Experimental Results: Price Assignment for Seeds
Experimental Results: Profit comparison of APAP and PAGE
Experimental Results: Running time on weighted cascade PR-I graph
Conclusions • Extended the LT and IC models to incorporate price and valuations & distinguish product adoption from social influence • Studied the properties of the extended models • Proposed the Balanced Influence and Profit maximization (BIPMax) problem & effective algorithms to solve it
Future Work • Approximation algorithm design for this problem, including approximation ratios and inapproximability results.
Future Work • Scalable algorithms for mining profit and influence in large-scale social networks, e.g., by considering simpler network structures such as arborescences and directed acyclic graphs.
Publications
Book Chapter
[1] Lidong Wu, Weili Wu, Zaixin Lu, Yuqing Zhu, and Ding-Zhu Du, “Sensor Cover and Double Partition,” Springer Proceedings in Mathematics & Statistics, Volume 59, 2013, pp. 203-221.
Conference Papers
[2] Yuqing Zhu, Weili Wu, Deying Li, and Hui Xiong, “Multi-Influence Maximization in Competitive Social Networks,” submitted to the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2014).
[3] Yuqing Zhu, Yiwei Jiang, Weili Wu, Ling Ding, Ankur Teredesai, Deying Li, and Wonjun Lee, “Minimizing Makespan and Total Completion Time in MapReduce-like Systems,” accepted by the 33rd IEEE International Conference on Computer Communications (INFOCOM 2014).
[4] Yuqing Zhu, Weili Wu, James Willson, Ling Ding, Lidong Wu, Deying Li, and WonJun Lee, “An Approximation Algorithm for Client Assignment in Client/Server Systems,” accepted by the 33rd IEEE International Conference on Computer Communications (INFOCOM 2014).
[5] Yuqing Zhu, Zaixin Lu, Yuanjun Bi, Weili Wu, Yiwei Jiang, and Deying Li, “Influence and Profit: Two Sides of the Coin,” accepted by IEEE International Conference on Data Mining (ICDM 2013).
[6] Yuqing Zhu, Weili Wu, Lidong Wu, Li Wang, and Jie Wang, “SmartPrint: A Cloud Print System for Office,” accepted by IEEE International Conference on Mobile Ad-hoc and Sensor Networks (MSN 2013).
[7] Lirong Xue, Donghyun Kim, Yuqing Zhu, Deying Li, Wei Wang, and Alade Tokuta, “A New Approximation Algorithm for Multiple Data Ferry Trajectory Planning Problem in Heterogeneous Wireless Sensor Networks,” accepted by the 33rd IEEE International Conference on Computer Communications (INFOCOM 2014).
[8] Yuanjun Bi, Weili Wu, and Yuqing Zhu, “CSI: Charged System Influence Model for Human Behavior Prediction,” accepted by IEEE International Conference on Data Mining (ICDM 2013).