Power Law and Its Generative Models

Power Law and Its Generative Models Bo Young Kim 2010-03-16

Contents
• Recall the Definition of Power Law
• Recall Some Properties of Power Law
• Generative Models for Power Law
- Power Laws via Preferential Attachment
- Power Laws via Multiplicative Processes
Applied Algorithm Lab.


1. Recall the Definition of Power Law
• $X$: a nonnegative random variable
• Def (Power law): $X$ is said to have a power law distribution if $\Pr[X \ge x] \sim c x^{-\alpha}$ for constants $c > 0$, $\alpha > 0$.
• Def: $f(x) \sim g(x) \iff \lim_{x \to \infty} f(x)/g(x) = 1$
• What does this mean? In a power law distribution, the tail asymptotically falls off as $x^{-\alpha}$, which is a heavier tail than that of an exponential distribution.

2. Recall Some Properties of Power Law
• E.g. the Pareto distribution: $\Pr[X \ge x] = (x/k)^{-\alpha}$ for $x \ge k$, so $\ln \Pr[X \ge x] = -\alpha(\ln x - \ln k)$
• Linear log-log plot of the complementary cumulative distribution function:
- If $X$ has a power law distribution,
- then its log-log plot is a straight line with slope $-\alpha$ (in the asymptotic sense).
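The straight-line behavior on a log-log plot can be checked numerically. The following is a minimal sketch, not part of the original slides: it draws Pareto samples by inverse-transform sampling and fits a least-squares slope to the empirical complementary CDF. The helper names `pareto_sample` and `ccdf_loglog_slope`, the parameter values, and the seed are illustrative assumptions.

```python
import math
import random

def pareto_sample(alpha, k, rng):
    """Inverse-transform sample with Pr[X >= x] = (x/k)^(-alpha) for x >= k."""
    u = 1.0 - rng.random()          # u lies in (0, 1], so the power below is finite
    return k * u ** (-1.0 / alpha)

def ccdf_loglog_slope(samples):
    """Least-squares slope of ln(empirical Pr[X >= x]) against ln(x)."""
    xs = sorted(samples)
    n = len(xs)
    # For the i-th smallest sample (0-indexed), n - i samples are >= it.
    pts = [(math.log(x), math.log((n - i) / n)) for i, x in enumerate(xs)]
    mean_x = sum(p[0] for p in pts) / n
    mean_y = sum(p[1] for p in pts) / n
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in pts)
    sxx = sum((px - mean_x) ** 2 for px, _ in pts)
    return sxy / sxx

rng = random.Random(0)              # fixed seed, chosen only for repeatability
samples = [pareto_sample(alpha=2.0, k=1.0, rng=rng) for _ in range(20000)]
slope = ccdf_loglog_slope(samples)  # should be close to -alpha = -2
```

With these assumed parameters the fitted slope should land near $-\alpha$. A plain least-squares fit is used here for simplicity; it is not claimed to be the best estimator of the exponent.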

2. Recall Some Properties of Power Law: "Scale Invariance"
- Let $f(x) := \Pr[X \ge x]$, with $f(x) \sim c x^{-\alpha}$
- $f(kx) \sim c(kx)^{-\alpha} = k^{-\alpha}(c x^{-\alpha}) \sim k' f(x) \propto f(x)$, where $k' = k^{-\alpha}$
- Scaling the argument by a constant $k$ simply multiplies the original power law relation by the constant $k'$.
- If we change the unit of measurement (the scale), the distribution retains the same power law form with the same exponent.
⇒ We cannot tell which scale we are observing (as with fractals).
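A small numeric check may make scale invariance concrete. This sketch is an illustration added here, not taken from the slides: it compares the ratio $f(kx)/f(x)$ for an exact power law tail with the same ratio for an exponential tail. The function names and the constants $c = 1$, $\alpha = 1.5$, $\lambda = 1$, $k = 10$ are assumptions.

```python
import math

def power_tail(x, c=1.0, alpha=1.5):
    """Exact power law tail f(x) = c * x^(-alpha)."""
    return c * x ** (-alpha)

def exp_tail(x, lam=1.0):
    """Exponential tail f(x) = exp(-lam * x), used only as a contrast."""
    return math.exp(-lam * x)

k = 10.0
xs = [1.0, 5.0, 25.0]

# For the power law the ratio is k^(-alpha) at every x.
power_ratios = [power_tail(k * x) / power_tail(x) for x in xs]

# For the exponential the ratio exp(-lam * (k - 1) * x) depends on x.
exp_ratios = [exp_tail(k * x) / exp_tail(x) for x in xs]
```

The power law ratios are all equal to $k^{-\alpha}$, while the exponential ratios shrink as $x$ grows, so only the power law looks the same at every scale.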

2. Recall Some Properties of Power Law
• The Web follows a power law. [4]
• Recall the rank exponent:
- $d_v$: outdegree of a node $v$
- $r_v$: the rank of a node $v$
- $d_v = k \, r_v^{R}$ ($R$, $k$: constants)
• How can we design random graph models that yield Web-like graphs,
• i.e. models that yield power law distributions for the indegree and outdegree?

Generative Models for Power Law - Power Laws via Preferential Attachment
• Def (Preferential attachment process, = Yule process): any process in which some quantity (some form of wealth) is distributed among a number of individuals according to how much they already have, so that those who are already wealthy receive more than those who are not.
• "The rich get richer"

Generative Models for Power Law - Power Laws via Preferential Attachment
• The Chinese Restaurant Process (CRP)
- A Chinese restaurant has infinitely many tables.
- Each table can seat infinitely many customers.
- At each time step $t$, customer $X_t$ comes into the restaurant.
• When customer $X_{t+1}$ comes in, the customer
(CRP1) sits at an already occupied table $k$ with prob. $N_k/(t+\alpha)$ ($N_k$: # of customers at table $k$, so $\sum_k N_k = t$),
(CRP2) or sits at the next unoccupied table with prob. $\alpha/(t+\alpha)$.


Generative Models for Power Law - Power Laws via Preferential Attachment
• CRP rule: the next customer sits at a table with prob. proportional to the # of customers already sitting at it (and sits at a new table with prob. proportional to $\alpha$).
⇒ Customers tend to sit at the most popular tables.
⇒ The most popular tables attract the most new customers and become even more popular.
• The concentration parameter $\alpha$ controls how likely a customer is to sit at a fresh table.
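The seating rules (CRP1) and (CRP2) can be simulated directly. The sketch below is an illustration under stated assumptions, not code from the talk: the function name `chinese_restaurant_process`, the seed handling, and the example parameters are hypothetical, and $\alpha > 0$ is assumed.

```python
import random

def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Seat customers one at a time; return the list of table sizes N_k."""
    rng = random.Random(seed)
    tables = []                          # tables[k] = N_k, customers at table k
    for t in range(n_customers):         # t customers are already seated
        r = rng.random() * (t + alpha)   # uniform on [0, t + alpha)
        if r < t:
            # (CRP1) occupied table k is chosen with prob. N_k / (t + alpha).
            acc = 0
            for k, n_k in enumerate(tables):
                acc += n_k
                if r < acc:
                    tables[k] += 1
                    break
        else:
            # (CRP2) next unoccupied table, with prob. alpha / (t + alpha).
            tables.append(1)
    return tables

tables = chinese_restaurant_process(n_customers=1000, alpha=1.0)
largest = sorted(tables, reverse=True)[:5]   # the few most popular tables
```

In a typical run a handful of early tables hold most of the customers, which is the "rich get richer" effect; a larger $\alpha$ opens fresh tables more often.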

Generative Models for Power Law - Power Laws via Preferential Attachment
• Generating a power law distribution via preferential attachment (most models are variations of this form)
• Call it the "Web Page Process" (WPP):
- Start with a single page.
- This single page has a link to itself.
- At each time step, a new page appears, with outdegree 1.
(WPP1) With prob. $\alpha < 1$, the link of the new page points to a page chosen uniformly at random.
(WPP2) With prob. $1-\alpha$, the link of the new page points to a page chosen proportionally to the indegree of the page.
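The process described by (WPP1) and (WPP2) can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the name `web_page_process`, the seed, and the parameter values are assumptions. It relies on the observation that every page has outdegree 1, so copying the target of a uniformly random existing link selects a page with probability proportional to its indegree.

```python
import random

def web_page_process(n_pages, alpha, seed=0):
    """Simulate the process and return indegree[i] for every page i."""
    rng = random.Random(seed)
    targets = [0]        # targets[i] = page that page i links to; page 0 links to itself
    indegree = [1]       # the self-link gives page 0 indegree 1
    for new_page in range(1, n_pages):
        if rng.random() < alpha:
            # (WPP1) link to an existing page chosen uniformly at random.
            dest = rng.randrange(new_page)
        else:
            # (WPP2) copy the target of a uniformly random existing link,
            # which picks a page proportionally to its indegree.
            dest = targets[rng.randrange(new_page)]
        targets.append(dest)
        indegree.append(0)
        indegree[dest] += 1
    return indegree

indegree = web_page_process(n_pages=20000, alpha=0.5)
frac_zero = indegree.count(0) / len(indegree)   # fraction of pages nobody links to
```

For $\alpha = 0.5$ the fraction of indegree-0 pages in a long run should be close to $1/(1+\alpha) = 2/3$, the steady-state value that follows from $dX_0/dt = 1 - \alpha X_0/t$.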

Generative Models for Power Law - Power Laws via Preferential Attachment
• $X_j(t)$: # of pages with indegree $j$ when there are $t$ pages in the system
• $\Pr[X_j \text{ increases}] = \alpha X_{j-1}/t + (1-\alpha)(j-1)X_{j-1}/t$
• $\Pr[X_j \text{ decreases}] = \alpha X_j/t + (1-\alpha) j X_j/t$
• For $j \ge 1$, $X_j$ increases when the new link points to a page of indegree $j-1$ and decreases when it points to a page of indegree $j$; in each expression the first term comes from (WPP1) and the second from (WPP2).

Generative Models for Power Law - Power Laws via Preferential Attachment
• $\Pr[X_j \text{ increases}] = \alpha X_{j-1}/t + (1-\alpha)(j-1)X_{j-1}/t$
• $\Pr[X_j \text{ decreases}] = \alpha X_j/t + (1-\alpha) j X_j/t$
⇒ $\frac{dX_j}{dt} = \frac{\alpha(X_{j-1} - X_j) + (1-\alpha)\left((j-1)X_{j-1} - jX_j\right)}{t}$
• Intuitively appealing, BUT how can a continuous differential equation describe a discrete process?
⇒ This can be justified formally using martingales [Kumar et al. 00] and the theoretical frameworks of Kurtz and Wormald [Drinea et al. 00, Kurtz 81, Wormald 95].

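Generative Models for Power Law - Power Laws via Preferential Attachment
• $\frac{dX_0}{dt} = 1 - \alpha X_0/t$
• Suppose that in the steady-state limit $X_j(t) = c_j t$ (a fraction $c_j$ of the pages has indegree $j$).
⇒ $c_0 = \frac{dX_0}{dt} = 1 - \alpha X_0/t = 1 - \alpha c_0 \iff c_0 = 1/(1+\alpha)$
• Substitute this assumption into $\frac{dX_j}{dt} = \frac{\alpha(X_{j-1} - X_j) + (1-\alpha)\left((j-1)X_{j-1} - jX_j\right)}{t}$:
⇒ $c_j(1+\alpha+j(1-\alpha)) = c_{j-1}(\alpha+(j-1)(1-\alpha))$
⇒ We can determine $c_j$ exactly.
• Focusing on the asymptotics, for large $j$: $\frac{c_j}{c_{j-1}} = 1 - \frac{2-\alpha}{1+\alpha+j(1-\alpha)} \sim 1 - \frac{2-\alpha}{1-\alpha}\cdot\frac{1}{j}$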
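The recurrence for $c_j$ is easy to iterate numerically, which gives a quick check on the asymptotic ratio. The sketch below is an added illustration: the function name `steady_state_fractions`, the choice $\alpha = 0.5$, and the cutoff of 20000 are assumptions. It measures the local log-log slope of $c_j$ and compares it with $-(2-\alpha)/(1-\alpha)$, the exponent suggested by the ratio $1 - \frac{2-\alpha}{1-\alpha}\cdot\frac{1}{j}$.

```python
import math

def steady_state_fractions(alpha, j_max):
    """Iterate c_j (1 + alpha + j(1 - alpha)) = c_{j-1} (alpha + (j - 1)(1 - alpha))."""
    c = [1.0 / (1.0 + alpha)]                      # c_0 = 1 / (1 + alpha)
    for j in range(1, j_max + 1):
        ratio = (alpha + (j - 1) * (1 - alpha)) / (1 + alpha + j * (1 - alpha))
        c.append(c[-1] * ratio)
    return c

alpha = 0.5
c = steady_state_fractions(alpha, 20000)

# Slope of ln(c_j) against ln(j), measured between j = 10000 and j = 20000.
slope = math.log(c[20000] / c[10000]) / math.log(2.0)
predicted = -(2 - alpha) / (1 - alpha)             # equals -3 for alpha = 0.5
total = sum(c)                                     # the fractions should sum to about 1
```

For $\alpha = 0.5$ the recurrence reduces to $c_j/c_{j-1} = j/(j+3)$, so the measured slope should be very close to $-3$.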

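Generative Models for Power Law - Power Laws via Preferential Attachment
• We have $c_j \sim c\, j^{-\frac{2-\alpha}{1-\alpha}}$ for some constant $c$, giving a power law.
• Note that $c_j \sim c\, j^{-\frac{2-\alpha}{1-\alpha}}$ implies $\frac{c_j}{c_{j-1}} \sim \left(\frac{j-1}{j}\right)^{\frac{2-\alpha}{1-\alpha}} \sim 1 - \frac{2-\alpha}{1-\alpha}\cdot\frac{1}{j}$, which matches the ratio obtained from the recurrence.
• WTS: $\sum_{j \ge k} c_j$ behaves like the tail of a power law distribution.
(Proof) $\sum_{j \ge k} c_j \sim \sum_{j \ge k} c\, j^{-\frac{2-\alpha}{1-\alpha}} \sim \int_k^\infty c\, j^{-\frac{2-\alpha}{1-\alpha}}\, dj \sim c' k^{-\frac{1}{1-\alpha}}$ for some constant $c'$. So, we're done.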
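The tail claim can also be checked numerically. Summing (or integrating) $c\, j^{-(2-\alpha)/(1-\alpha)}$ over $j \ge k$ lowers the exponent by one, giving a tail that decays like $k^{-1/(1-\alpha)}$. The sketch below is an added illustration with assumed names and parameters (`tail_fractions`, $\alpha = 0.5$, a truncation at 200000 terms); it approximates the infinite tail sum by a long finite one.

```python
import math

def tail_fractions(alpha, j_max):
    """Return tail[k] = sum of c_j for k <= j <= j_max (a truncated tail sum)."""
    c = [1.0 / (1.0 + alpha)]                      # c_0 = 1 / (1 + alpha)
    for j in range(1, j_max + 1):
        ratio = (alpha + (j - 1) * (1 - alpha)) / (1 + alpha + j * (1 - alpha))
        c.append(c[-1] * ratio)
    tail = [0.0] * (j_max + 2)
    for j in range(j_max, -1, -1):                 # accumulate from the far end
        tail[j] = tail[j + 1] + c[j]
    return tail

alpha = 0.5
tail = tail_fractions(alpha, 200000)

# Slope of ln(tail[k]) against ln(k), measured between k = 100 and k = 200.
tail_slope = math.log(tail[200] / tail[100]) / math.log(2.0)
predicted_tail_slope = -1.0 / (1.0 - alpha)        # equals -2 for alpha = 0.5
```

At moderate $k$ the measured slope is only approximately $-2$, since the power law form holds asymptotically.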

Generative Models for Power Law - Power Laws via Multiplicative Processes
• Pareto: the income distribution obeys a power law.
• [Champernowne 53] offered an explanation for this behavior.
• Partition income in the following manner:
- 1st range: between $m$ and $\gamma m$ for some $\gamma > 1$
- 2nd range: between $\gamma m$ and $\gamma^2 m$
- …
- Persons in class $j$: their income is between $\gamma^{j-1} m$ and $\gamma^{j} m$.
• $P_{ij}$: prob. of a person moving from class $i$ to class $j$
• At each time step, $P_{ij}$ depends only on the value $(j-i)$.
⇒ Under this assumption, a Pareto distribution can be obtained.

Generative Models for Power Law - Power Laws via Multiplicative Processes
• E.g. $\gamma = 2$, with $P_{ij} = 2/3$ if $j-i = -1$ and $P_{ij} = 1/3$ if $j-i = 1$
• Special case: $i = 1$ ⇒ $P_{11} = 2/3$
• The equilibrium probability of being in class $k$: $1/2^{k}$
• $X$: a person's income ⇒ $\Pr[X \ge 2^{k-1} m] = 1/2^{k-1}$, i.e. $\Pr[X \ge x] = m/x$ for $x = 2^{k-1} m$
• This is a power law distribution.
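The equilibrium in this example can be reproduced by iterating the class-transition probabilities. The sketch below is an added illustration, not Champernowne's own construction: it truncates the infinitely many classes to 40 (an assumption, with the top class staying put on an upward move) and applies the transition rule repeatedly from a start in class 1. The names `step` and `equilibrium` and the iteration count are hypothetical.

```python
def step(dist, p_down=2/3, p_up=1/3):
    """Apply one time step of the class-transition rule to a distribution."""
    n = len(dist)
    new = [0.0] * n
    for i, mass in enumerate(dist):
        down = max(i - 1, 0)       # class 1 (index 0) stays put on a down move: P_11 = 2/3
        up = min(i + 1, n - 1)     # truncation assumption: top class stays put on an up move
        new[down] += p_down * mass
        new[up] += p_up * mass
    return new

def equilibrium(n_classes=40, n_steps=2000):
    """Iterate from 'everyone in class 1' until the distribution settles."""
    dist = [1.0] + [0.0] * (n_classes - 1)
    for _ in range(n_steps):
        dist = step(dist)
    return dist

pi = equilibrium()
# tail[k - 1] = Pr[class >= k] = Pr[X >= 2^(k-1) m]
tail = [sum(pi[k:]) for k in range(len(pi))]
```

Here `pi[k - 1]` should be close to $1/2^{k}$ and `tail[k - 1]` close to $1/2^{k-1}$, which is $m/x$ at $x = 2^{k-1} m$.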

References
[1] M. Mitzenmacher, A Brief History of Generative Models for Power Law and Lognormal Distributions, Internet Mathematics, vol. 1, no. 2, pp. 226-251, 2004.
[2] Mark Johnson, Chinese Restaurant Processes (CG168 notes), cog.brown.edu/~mj/classes/cg168/.../ChineseRestaurants.pdf
[3] C. Faloutsos, lecture notes, Carnegie Mellon University, 15-826 Multimedia Databases and Data Mining, Spring 2008, http://www.cs.cmu.edu/~christos/courses/826.S08/FOILS-pdf/195_powerLaws.pdf
[4] Bruno Bassetti, Mina Zarei, Marco Cosentino Lagomarsino, and Ginestra Bianconi, Statistical mechanics of the "Chinese restaurant process": Lack of self-averaging, anomalous finite-size effects, and condensation, Phys. Rev. E 80, 066118 (2009).
[5] http://en.wikipedia.org/wiki/Power_law, http://en.wikipedia.org/wiki/Chinese_restaurant_process, http://en.wikipedia.org/wiki/Preferential_attachment
