Explore the fascinating modeling of web graphs using the power law distribution, including directed random surfer models and virtual degrees. Discover how the degrees in various models follow power laws and the open questions in analyzing degree distributions.
A Random-Surfer Web-Graph Model Avrim Blum, Hubert Chan, Mugizi Rwebangira Carnegie Mellon University
The Web as a Graph Consider the World Wide Web as a graph, with web pages as nodes and links between pages as edges. (Figure: example pages links.html, index.html, http://cnn.com and resume.html drawn as nodes.) Experiments suggest that the degree distribution of the Web graph follows a power law [FFF99].
Power Law The distribution of a quantity $X$ follows a power law if $\Pr(X = k) = C k^{-\alpha}$. Taking the logarithm of both sides: $\log \Pr(X = k) = \log C - \alpha \log k$. Thus if we take a log-log plot of a power-law distribution we will obtain a straight line with slope $-\alpha$.
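The straight-line claim can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names `power_law_pmf` and `loglog_slope` are hypothetical, and the fit is an ordinary least-squares line through the points (log k, log Pr(X = k)).

```python
import math


def power_law_pmf(k, alpha, c=1.0):
    """Unnormalized power-law value C * k^(-alpha) (illustrative helper)."""
    return c * k ** (-alpha)


def loglog_slope(ks, probs):
    """Least-squares slope of log(probs) against log(ks)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(q) for q in probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard closed-form slope for simple linear regression.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


ks = list(range(1, 50))
probs = [power_law_pmf(k, alpha=2.5, c=0.3) for k in ks]
slope = loglog_slope(ks, probs)  # expected to be close to -alpha
```

For an exact power law the fitted slope recovers the exponent with a minus sign; on empirical degree counts the same fit gives only a rough estimate.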
Previous Work Barabási and Albert proposed the Preferential Attachment model [BA99]: each new node connects to an existing node with probability proportional to that node's degree. It is known that Preferential Attachment gives a power-law distribution. [Mitzenmacher, Cooper & Frieze 03, KRRSTU00] Other models proposed include the “copying model.” [KRRSTU00]
Motivating Questions • Why would a new node connect to nodes of high degree? • Are high-degree nodes more attractive? • Or are there other explanations? How does a new node find out which nodes have high degree? • Motivating Observation: • Suppose each page has a small probability p of being interesting. • Suppose a user does an (undirected) random walk until they find an interesting page. • If p is small then this is essentially the same as preferential attachment, because a long undirected random walk visits nodes with probability proportional to their degree. • What about other processes and directed graphs?
Directed 1-Step Random Surfer At time 1, we start with a single node with a self-loop. At time t, a node is chosen uniformly at random; with probability p the new node connects to this node, and with probability 1-p it connects to a random out-neighbor of that node. (Extension: Repeat the process k times for each new node to get out-degree k.) Note: This model is just another way of stating the directed preferential attachment model.
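The process can be sketched in a few lines for the out-degree-1 case. This is an illustrative sketch under stated assumptions, not the authors' code: nodes are numbered from 0 (so the slide's "time 1" node is node 0), the graph is stored as a list `out` where `out[v]` is the single out-neighbor of `v`, and the function names are hypothetical.

```python
import random


def one_step_surfer(n, p, seed=None):
    """Grow an n-node graph with out-degree 1 using the 1-step rule."""
    rng = random.Random(seed)
    out = [0]  # node 0 starts with a self-loop
    for new in range(1, n):
        chosen = rng.randrange(new)  # uniform over existing nodes
        if rng.random() < p:
            out.append(chosen)       # connect to the chosen node
        else:
            out.append(out[chosen])  # connect to its out-neighbor
    return out


def attach_probabilities(out, p):
    """Exact probability that the next new node connects to each node."""
    t = len(out)
    probs = [0.0] * t
    for v in range(t):
        probs[v] += p / t             # v chosen, then linked directly
        probs[out[v]] += (1 - p) / t  # v chosen, then one step taken
    return probs


graph = one_step_surfer(200, 0.5, seed=1)
two_node_probs = attach_probabilities([0, 0], 0.5)
```

With two existing nodes and p = 0.5, `attach_probabilities` gives 3/4 for the root and 1/4 for the other node.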
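(Figure: Directed 1-Step Random Surfer, p = 0.5, showing the graph at T = 1 through T = 4. With two existing nodes, the new node connects to the root with probability (½)(½) + (½)(½) + (½)(½) = ¾ and to the other node with probability ¼; a later panel shows the sum (½)(⅓) + (½)(⅓) + (½)(⅓) + (½)(⅓).)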
Directed Coin Flipping model
1. At time 1, we start with a single node with a self-loop.
2. At time t, we choose a node uniformly at random.
3. We then flip a coin of bias p.
4. If the coin comes up heads, we connect to the current node.
5. Else we walk to a random out-neighbor and go to step 3.
“Each page has equal probability p of being interesting to us.”
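The steps above can be sketched as a short simulation for out-degree 1. This is a minimal sketch, not the authors' implementation: node numbering from 0, the `out` list representation, and the names `coin_flip_graph` and `in_degrees` are all assumptions. One shortcut is taken: once the walk reaches the root it can only follow the self-loop, so the sketch connects to the root immediately instead of flipping until heads, which gives the same outcome.

```python
import random
from collections import Counter


def coin_flip_graph(n, p, seed=None):
    """Grow an n-node graph with out-degree 1 by directed coin flipping."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    rng = random.Random(seed)
    out = [0]  # node 0 starts with a self-loop
    for new in range(1, n):
        cur = rng.randrange(new)  # uniform random starting node
        # Tails: follow the out-edge. Stop on heads, or at the root,
        # where the walk can never leave.
        while out[cur] != cur and rng.random() >= p:
            cur = out[cur]
        out.append(cur)
    return out


def in_degrees(out):
    """In-degree of every node (the root's self-loop counts once)."""
    counts = Counter(out)
    return [counts.get(v, 0) for v in range(len(out))]


graph = coin_flip_graph(1000, 0.5, seed=7)
degrees = in_degrees(graph)
```

Tallying `Counter(degrees)` over many runs is one way to look at the empirical degree distribution; with p = 1 the walk never moves, so each new node simply links to a uniformly random earlier node.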
(Figure: a new node picks a random starting node; the walk sees coin tosses tail, tail, head, and the new node connects to the node reached when the coin comes up heads.)
Is Directed Coin-Flipping Power-lawed? We don’t know, but we do have some partial results. Note: unlike for undirected graphs, the case p → 0 is not so interesting, since then every walk ends at the root and you just get a star.
Virtual Degree Definition: Let $l_i(u)$ be the number of level-$i$ descendants of $u$. Let $\beta_i$ ($i \ge 1$) be a sequence of real numbers with $\beta_1 = 1$. Then $v(u) = 1 + \sum_{i \ge 1} \beta_i \, l_i(u)$.
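Virtual Degree (Figure: example tree below a node $u$ with 2 level-1 descendants and 4 level-2 descendants.) $v(u) = 1 + \beta_1 l_1(u) + \beta_2 l_2(u) + \beta_3 l_3(u) + \beta_4 l_4(u) + \ldots = 1 + \beta_1 (2) + \beta_2 (4) + \beta_3 (0) + \beta_4 (0) + \ldots$ Easy observation: If we set $\beta_i = (1-p)^i$ then the expected increase in degree($u$) is proportional to $v(u)$.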
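The example can be reproduced with a small helper that counts level-i descendants and forms the weighted sum. This is an illustrative sketch: the `parent` list (each node's single out-neighbor, with a self-loop at the root), the function names, and the particular weights passed in are assumptions rather than anything specified on the slide.

```python
def level_counts(parent, u):
    """Map level i -> number of nodes whose path to u has length i."""
    counts = {}
    for w in range(len(parent)):
        x, depth = w, 0
        # Follow out-edges until we meet u or get stuck at the root.
        while x != u and parent[x] != x:
            x = parent[x]
            depth += 1
        if x == u and depth > 0:
            counts[depth] = counts.get(depth, 0) + 1
    return counts


def virtual_degree(parent, u, beta):
    """v(u) = 1 + sum_i beta(i) * l_i(u), with beta a function of i."""
    return 1 + sum(beta(i) * c for i, c in level_counts(parent, u).items())


# Node 0 has children 1 and 2; each of those has two children.
parent = [0, 0, 0, 1, 1, 2, 2]
counts = level_counts(parent, 0)
v = virtual_degree(parent, 0, lambda i: [1.0, 0.5][i - 1])
```

Here `counts` has 2 nodes at level 1 and 4 at level 2, matching the slide's example, and the chosen weights (1 and 0.5) give v = 1 + 2(1) + 4(0.5) = 5.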
Virtual Degree • Theorem: There always exist $\beta_i$ such that • For $i \ge 1$, $|\beta_i| \le 1$. • As $i \to \infty$, $\beta_i \to 0$ exponentially. • The expected increase in $v(u)$ is proportional to $v(u)$. Recurrence: $\beta_1 = 1$, $\beta_2 = p$, $\beta_{i+1} = \beta_i - (1-p)\beta_{i-1}$. E.g., for $p = 3/4$, $\beta_i = 1, 3/4, 1/2, 5/16, 3/16, 7/64, \ldots$; for $p = 1/2$, $\beta_i = 1, 1/2, 0, -1/4, -1/4, -1/8, 0, 1/16, \ldots$ Let $v_t(u)$ be the virtual degree of node $u$ at time $t$ and $t_u$ be the time when node $u$ first appears. Theorem: For any node $u$ and time $t \ge t_u$, $E[v_t(u)] = \Theta((t/t_u)^p)$.
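The listed coefficients can be regenerated exactly with rational arithmetic. This sketch reads the recurrence as β_{i+1} = β_i - (1-p)·β_{i-1} with β_1 = 1 and β_2 = p, which is the reading that reproduces both example sequences; the function name `beta_sequence` is hypothetical.

```python
from fractions import Fraction


def beta_sequence(p, count):
    """First `count` coefficients beta_1, beta_2, ... as exact fractions."""
    p = Fraction(p)
    betas = [Fraction(1), p]  # beta_1 = 1, beta_2 = p
    while len(betas) < count:
        # beta_{i+1} = beta_i - (1 - p) * beta_{i-1}
        betas.append(betas[-1] - (1 - p) * betas[-2])
    return betas[:count]


three_quarters = beta_sequence(Fraction(3, 4), 6)
one_half = beta_sequence(Fraction(1, 2), 8)
```

Using `Fraction` avoids floating-point drift, so the output can be compared term by term with the values on the slide, including the sign changes for p = 1/2.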
Virtual Degree, contd. We also have some weak concentration bounds on the virtual degree $v_t(u)$. Unfortunately they are not strong enough: if they could be strengthened, we would have a proof that virtual degrees (not just their expectations) follow a power law.
Actual Degree We can also obtain lower bounds on the actual degrees: Theorem: For any node $u$ and time $t \ge t_u$, $E[l_1(u)] \ge \Omega((t/t_u)^{p(1-p)})$.
Experiments • Random graphs of n=100,000 nodes • Compute statistics averaged over 100 runs. • K=1 (Every node has out-degree 1)
Conclusions • Directed random walk models appear to generate power laws (supported by partial theoretical results). • Power laws can emerge naturally even if all nodes have the same intrinsic “attractiveness” (and even in the absence of a “role model,” as in the copying model).
Open questions • Can we prove that the degrees in the directed coin-flipping model indeed follow a power law? • Can we analyze the degree distribution for the undirected coin-flipping model with p = 1/2? • Suppose page $i$ has “interestingness” $p_i$. Can we analyze the degree as a function of $t$, $i$ and $p_i$?