Algorithmic Problems in the Internet Christos H. Papadimitriou www.cs.berkeley.edu/~christos
Goals of TCS (1950-2000): Develop a productive mathematical understanding of the capabilities and limitations of the von Neumann computer and its software (the dominant and most novel computational artifacts of that time); Mathematical tools: combinatorics, logic What should the goals of TCS be today? (and what math tools will be handy?) Iowa State, April 2003
The Internet • huge, growing, open, emergent, mysterious • built, operated and used by a multitude of diverse economic interests • as information repository: open, huge, available, unstructured, critical • foundational understanding urgently needed
Today… • Games and mechanism design • Getting lost in the web • The Internet’s heavy tail
Games, games… [Figure: payoff matrix; rows and columns are the two players’ strategies, and each cell holds a pair of payoffs, e.g. 3,-2] (NB: also, many players)
e.g. matching pennies, prisoner’s dilemma, chicken
Nash equilibrium • Definition: double best response (problem: may not exist) • randomized Nash equilibrium Theorem [Nash 1951]: Always exists. • Problem: there are usually many . . .
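The "double best response" definition can be checked by brute force on small two-player games. The sketch below is only illustrative: the payoff numbers are common textbook values assumed here (the slides do not list them), and `pure_nash_equilibria` is a hypothetical helper name.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all (row, col) cells where each strategy is a best response to the other.

    payoffs[r][c] is a pair (row player's payoff, column player's payoff).
    """
    rows, cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r, c in product(range(rows), range(cols)):
        # Row player cannot gain by switching rows while the column stays fixed.
        row_best = all(payoffs[r][c][0] >= payoffs[r2][c][0] for r2 in range(rows))
        # Column player cannot gain by switching columns while the row stays fixed.
        col_best = all(payoffs[r][c][1] >= payoffs[r][c2][1] for c2 in range(cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria


# Assumed textbook payoffs, not taken from the slides.
MATCHING_PENNIES = [[(1, -1), (-1, 1)],
                    [(-1, 1), (1, -1)]]
# Strategy 0 = cooperate, 1 = defect.
PRISONERS_DILEMMA = [[(-1, -1), (-3, 0)],
                     [(0, -3), (-2, -2)]]
# Strategy 0 = swerve, 1 = go straight.
CHICKEN = [[(0, 0), (-1, 1)],
           [(1, -1), (-10, -10)]]

print(pure_nash_equilibria(MATCHING_PENNIES))   # no pure equilibrium
print(pure_nash_equilibria(PRISONERS_DILEMMA))  # mutual defection
print(pure_nash_equilibria(CHICKEN))            # two asymmetric equilibria
```

With these payoffs, matching pennies illustrates "may not exist" (in pure strategies) and chicken illustrates "there are usually many".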
The price of anarchy [Koutsoupias and P, 1998] = (cost of worst Nash equilibrium) / (“socially optimum” cost) • in network routing: = 2 [Roughgarden and Tardos, 2000; Roughgarden 2002]
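As a small worked instance of this ratio, the sketch below brute-forces a toy load-balancing game: each job picks one of several identical links, a job's cost is the total load on its link, and the social cost is the maximum load. This toy setting, the restriction to pure equilibria, and all function names are assumptions made for illustration; it is not the network-routing model behind the bound quoted above.

```python
from fractions import Fraction
from itertools import product


def link_loads(assignment, weights, n_links):
    """Total job weight placed on each link."""
    loads = [0] * n_links
    for job, link in enumerate(assignment):
        loads[link] += weights[job]
    return loads


def is_pure_nash(assignment, weights, n_links):
    """True if no single job can strictly lower its own link load by moving."""
    loads = link_loads(assignment, weights, n_links)
    for job, link in enumerate(assignment):
        for other in range(n_links):
            if other != link and loads[other] + weights[job] < loads[link]:
                return False
    return True


def price_of_anarchy(weights, n_links):
    """Worst pure-Nash maximum load divided by the optimal maximum load."""
    worst_nash = None
    optimum = None
    for assignment in product(range(n_links), repeat=len(weights)):
        cost = max(link_loads(assignment, weights, n_links))
        optimum = cost if optimum is None else min(optimum, cost)
        if is_pure_nash(assignment, weights, n_links):
            worst_nash = cost if worst_nash is None else max(worst_nash, cost)
    return Fraction(worst_nash, optimum)


# Jobs of weight 2, 2, 1, 1 on two links: putting both heavy jobs together
# is an equilibrium with maximum load 4, while the optimum is 3.
print(price_of_anarchy([2, 2, 1, 1], 2))
```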
mechanism design (or inverse game theory) • agents have utilities – but these utilities are known only to them • game designer prefers certain outcomes depending on players’ utilities • designed game (mechanism) has designer’s goals as dominant strategies
e.g., Vickrey auction • sealed-highest-bid auction encourages gaming and speculation • Vickrey auction: highest bidder wins, pays second-highest bid Theorem: Vickrey auction is a truthful mechanism. (Theorem: It maximizes social benefit and auctioneer expected revenue.)
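The second-price rule and the truthfulness claim can be illustrated with a minimal sketch. The bidder names, valuations, and the helper names `vickrey_auction` and `utility` are hypothetical; ties are broken in favour of the bidder listed first, which is an arbitrary choice made here.

```python
def vickrey_auction(bids):
    """bids maps bidder -> bid. Returns (winner, price paid = second-highest bid)."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


def utility(bidder, true_value, bids):
    """Winner's utility is value minus price; everyone else gets 0."""
    winner, price = vickrey_auction(bids)
    return true_value - price if winner == bidder else 0


# Bidder "a" values the item at 10; the other bids are fixed.
others = {"b": 7, "c": 4}
truthful = utility("a", 10, {"a": 10, **others})
# No alternative bid by "a" does better than bidding the true value.
best_deviation = max(utility("a", 10, {"a": bid, **others}) for bid in range(0, 21))
print(truthful, best_deviation)
```

The check covers only one valuation profile and integer bids, so it illustrates the theorem rather than proving it.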
Vickrey shortest paths [Figure: graph from s to t with edge costs 3, 6, 5, 4, 6, 10, 3, 11] Pay each edge e on the shortest path its declared cost c(e), plus a bonus equal to dist(s,t)|c(e)=∞ − dist(s,t)
Problem: [Figure: graph from s to t with edge costs 1, 1, 1, 1, 1 and 10]
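The payment rule for Vickrey shortest paths can be sketched as follows: each edge on the shortest path receives its declared cost plus the increase in dist(s,t) that removing the edge would cause. The example graph is an assumed reading of the "Problem" figure (a path of five cost-1 edges next to a single cost-10 edge); the node names and function names are hypothetical.

```python
import heapq
import math


def shortest_path(edges, source, target, banned=None):
    """Dijkstra over undirected edges {(u, v): cost}; returns (distance, edges used)."""
    graph = {}
    for (u, v), cost in edges.items():
        if (u, v) == banned:
            continue  # treat the banned edge as having infinite cost
        graph.setdefault(u, []).append((v, cost, (u, v)))
        graph.setdefault(v, []).append((u, cost, (u, v)))
    dist = {source: 0}
    via = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nxt, cost, edge in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                via[nxt] = (node, edge)
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return math.inf, []
    path, node = [], target
    while node != source:
        node, edge = via[node]
        path.append(edge)
    return dist[target], path[::-1]


def vickrey_payments(edges, source, target):
    """Payment to each shortest-path edge: declared cost plus its marginal benefit."""
    base, path = shortest_path(edges, source, target)
    payments = {}
    for edge in path:
        detour, _ = shortest_path(edges, source, target, banned=edge)
        payments[edge] = edges[edge] + (detour - base)
    return payments


# Assumed topology: five unit-cost edges in series, plus one direct edge of cost 10.
edges = {("s", "a"): 1, ("a", "b"): 1, ("b", "c"): 1,
         ("c", "d"): 1, ("d", "t"): 1, ("s", "t"): 10}
payments = vickrey_payments(edges, "s", "t")
print(payments, sum(payments.values()))
```

Under this reading, every unit-cost edge is paid 1 + (10 − 5) = 6, so the total payment is 30 for a path whose true cost is 5.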
But… • …in the Internet Vickrey overcharge would be only about 30% on the average [FPSS 2002] • Could this be the manifestation of rational behavior at network creation? • [FPSS 2002]: Vickrey charges depend on origin and destination, and can be computed on top of BGP
But… (cont) • [with Mihail and Saberi, 2003]: Vickrey charges are small in expectation in random graphs. • (Also: Why traffic grows moderately as the Internet grows…)
The web as a graph (cf. [Google 98], [Kleinberg 98]) • how do you sample the web? [Bar-Yossef, Berg, Chien, Fakcharoenphol, Weitz, VLDB 2000] • e.g.: 42% of web documents are in HTML. How do you find that? • What is a “random” web document?
Idea: random walk [Figure: graph with documents as nodes and hyperlinks as edges] Problems: 1. asymmetric 2. uneven degree 3. 2nd eigenvalue? λ = 0.99999
The web walker: results • mixing time is ~ log N/(1−λ), with λ the 2nd eigenvalue • predicted WW mixing time: 3,000,000 • actual WW mixing time: 100 • .com 49%, .jp 9%, .edu 7%, .cn 0.8%
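The 3,000,000 figure is consistent with the log N/(1−λ) estimate under plausible inputs. The sketch below assumes N = 10^9 documents and a base-2 logarithm; neither is stated in the slides, and both are chosen only because they reproduce the quoted order of magnitude with λ = 0.99999.

```python
import math


def mixing_time_estimate(n_docs, second_eigenvalue):
    """Rough mixing-time estimate log2(N) / (1 - lambda) for a random walk."""
    spectral_gap = 1 - second_eigenvalue
    return math.log2(n_docs) / spectral_gap


# Assumed inputs: one billion documents, second eigenvalue 0.99999.
estimate = mixing_time_estimate(10**9, 0.99999)
print(round(estimate))  # on the order of three million steps
```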
Q: Is the web a random graph? • Many K3,3’s (“communities”) • Indegrees/outdegrees obey “power laws” • Model [Kumar et al, FOCS 2000]: copying
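A copying process in the spirit of that model can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the exact construction from the cited paper: each new page picks a random earlier "prototype" page and, for each of its out-links, either copies the prototype's corresponding link or links to a uniformly random earlier page. The parameter names, the seed graph, and the function names are all hypothetical.

```python
import random


def copying_model(n_nodes, out_degree, copy_prob, seed=0):
    """Grow a directed graph; returns a list where links[v] holds v's out-links."""
    rng = random.Random(seed)
    # Assumed seed graph: out_degree + 1 pages, each linking to all the others.
    links = [[w for w in range(out_degree + 1) if w != u]
             for u in range(out_degree + 1)]
    for v in range(out_degree + 1, n_nodes):
        prototype = links[rng.randrange(v)]
        new_links = []
        for i in range(out_degree):
            if rng.random() < copy_prob:
                new_links.append(prototype[i])      # copy the prototype's i-th link
            else:
                new_links.append(rng.randrange(v))  # link to a random earlier page
        links.append(new_links)
    return links


def indegrees(links):
    """Count incoming links per page."""
    counts = [0] * len(links)
    for out in links:
        for target in out:
            counts[target] += 1
    return counts


graph = copying_model(n_nodes=2000, out_degree=3, copy_prob=0.7)
degs = indegrees(graph)
print(max(degs), sum(degs) / len(degs))  # compare the largest indegree with the mean
```

Copying reinforces pages that are already linked to, which is the intuition behind the skewed indegrees; the sketch makes no claim about the exact exponent.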
Also the Internet • [Faloutsos³ 1999]: the degrees of the Internet are power law distributed • Both autonomous systems graph and router graph • Eigenvalues: ditto!??! • Model?
The world according to Zipf • Power laws, Zipf’s law, heavy tails,… • i-th largest is ~ i^(-a) (cities, words: a = 1, “Zipf’s Law”) • Equivalently: prob[greater than x] ~ x^(-b) • (compare with law of large numbers) • “the signature of human activity”
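The "equivalently" step can be checked numerically: if the i-th largest item has size i^(-a), then the items larger than x are those with i < x^(-1/a), so the tail exponent is b = 1/a. The sketch below uses idealized sizes that follow the rank law exactly; the function names are hypothetical.

```python
def zipf_sizes(n, a):
    """Idealized sizes: the i-th largest item has size i^(-a)."""
    return [i ** -a for i in range(1, n + 1)]


def count_greater_than(sizes, x):
    """Number of items strictly larger than x."""
    return sum(1 for s in sizes if s > x)


for a, x in [(1, 0.015), (2, 0.00015)]:
    sizes = zipf_sizes(10_000, a)
    observed = count_greater_than(sizes, x)
    predicted = x ** (-1 / a)  # tail law with exponent b = 1/a
    print(a, observed, round(predicted, 2))
```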
Models • Size-independent growth (“the rich get richer,” or random walk in log paper) • Growing number of growing cities • In the web: copying links [Kumar et al, 2000] • Carlson and Doyle 1999: Highly optimized tolerance (HOT)
Our model [with Fabrikant and Koutsoupias, 2002]: node i attaches to the earlier node j that achieves min_{j<i} [ α·d_ij + hop_j ]
Theorem: • if α < const, then graph is a star: degree = n − 1 • if α > √n, then there is exponential concentration of degrees: prob(degree > x) < exp(−ax) • otherwise, if const < α < √n, heavy tail: prob(degree > x) > x^(−b)
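The growth rule can be simulated directly. The sketch below assumes that points arrive uniformly at random in the unit square, that d_ij is Euclidean distance, and that hop_j is the number of hops from j to the first node; these choices and the function name `fkp_tree` are assumptions made for illustration.

```python
import math
import random


def fkp_tree(n_nodes, alpha, seed=0):
    """Grow a tree: node i attaches to the earlier j minimizing alpha*d_ij + hop_j."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n_nodes)]
    hops = [0]          # node 0 is the root, zero hops from itself
    parent = [None]
    degree = [0] * n_nodes
    for i in range(1, n_nodes):
        def cost(j):
            # Trade-off between geometric distance and centrality in the tree.
            return alpha * math.dist(points[i], points[j]) + hops[j]
        j = min(range(i), key=cost)
        parent.append(j)
        hops.append(hops[j] + 1)
        degree[i] += 1
        degree[j] += 1
    return parent, degree


# Small alpha: distance barely matters, so every node attaches to the root (a star).
_, star_degree = fkp_tree(200, alpha=0.1)
# Large alpha: distance dominates, so nodes attach to nearby earlier nodes.
_, local_degree = fkp_tree(200, alpha=1000.0)
print(star_degree[0], max(local_degree))
```

With α = 0.1 the star is guaranteed in this setting, because α times any distance in the unit square is below one hop; the intermediate regime is the one the theorem associates with heavy tails.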
Heuristically optimized tradeoffs • Also: file sizes (trade-off between communication costs and file overhead) • Power law distributions seem to come from tradeoffs between conflicting objectives (a signature of human activity?) • cf. HOT, [Mandelbrot 1954] • Other examples? • General theorem?
PS: eigenvalues Model: Edge [i,j] has prob. ~ d_i d_j Theorem [with Mihail, 2002]: If the d_i’s obey a power law, then the n^b largest eigenvalues are almost surely very close to √d_1, √d_2, √d_3, … (NB: The eigenvalue exponent observed in Faloutsos³ is about ½ of the degree exponent) Corollary: Spectral methods are of dubious value in the presence of large features