Introduction to information complexity June 30, 2013 Mark Braverman Princeton University
Part I: Information theory • Information theory, in its modern form, was introduced in the 1940s to study the problem of transmitting data over physical channels. • [Diagram: Alice → communication channel → Bob]
Quantifying “information” • Information is measured in bits. • The basic notion is Shannon’s entropy. • The entropy of a random variable is the (typical) number of bits needed to remove the uncertainty of the variable. • For a discrete variable X: H(X) = Σ_x Pr[X = x] · log(1/Pr[X = x]).
Shannon’s entropy • Important examples and properties: • If X is a constant, then H(X) = 0. • If X is uniform on a finite set S of possible values, then H(X) = log |S|. • If X is supported on at most k values, then H(X) ≤ log k. • If Y is a random variable determined by X, then H(Y) ≤ H(X).
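As a quick sanity check of these properties, here is a minimal Python sketch; the helper name entropy and the example distributions are illustrative choices, not from the talk:

    from math import log2

    def entropy(dist):
        # Shannon entropy, in bits, of a discrete distribution given as {value: probability}.
        return sum(p * log2(1 / p) for p in dist.values() if p > 0)

    print(entropy({"a": 1.0}))                        # a constant: 0.0
    print(entropy({x: 1 / 8 for x in range(8)}))      # uniform on 8 values: 3.0 = log2(8)
    print(entropy({"a": 0.5, "b": 0.25, "c": 0.25}))  # 1.5 <= log2(3): at most log2(#values)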
Conditional entropy • For two (potentially correlated) variables X, Y, the conditional entropy of X given Y is the amount of uncertainty left in X once Y is known: H(X|Y) = E_{y~Y} [H(X | Y = y)]. • One can show H(XY) = H(Y) + H(X|Y). • This important fact is known as the chain rule. • If X and Y are independent, then H(X|Y) = H(X), and hence H(XY) = H(X) + H(Y).
Example • For instance, let B1, B2 be independent uniform bits, and set X = (B1, B2), Y = B1 ⊕ B2. • Then • H(X) = 2; H(Y) = 1; H(X|Y) = 1; H(XY) = 2.
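A small Python check of the chain rule on this example; a sketch, with helper names H and marginal of my own choosing:

    from itertools import product
    from math import log2

    def H(dist):
        return sum(p * log2(1 / p) for p in dist.values() if p > 0)

    # Joint distribution of X = (B1, B2) and Y = B1 xor B2 for independent uniform bits.
    joint = {}
    for b1, b2 in product([0, 1], repeat=2):
        joint[((b1, b2), b1 ^ b2)] = 0.25

    def marginal(joint, idx):
        out = {}
        for key, p in joint.items():
            out[key[idx]] = out.get(key[idx], 0) + p
        return out

    HX  = H(marginal(joint, 0))       # H(X)  = 2.0
    HY  = H(marginal(joint, 1))       # H(Y)  = 1.0
    HXY = H(joint)                    # H(XY) = 2.0
    print(HX, HY, HXY, HXY - HY)      # H(X|Y) = H(XY) - H(Y) = 1.0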
Mutual information • The mutual information of X and Y is defined as I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X). • “By how much does knowing Y reduce the entropy of X?” • Always non-negative: I(X;Y) ≥ 0. • Conditional mutual information: I(X;Y|Z) = H(X|Z) − H(X|YZ). • Chain rule for mutual information: I(XY;Z) = I(X;Z) + I(Y;Z|X). • Simple intuitive interpretation.
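The definition is easy to evaluate numerically. A sketch for a noisy bit, where Y equals X except flipped with probability 0.1 (the flip probability is just an illustrative choice):

    from math import log2

    def H(dist):
        return sum(p * log2(1 / p) for p in dist.values() if p > 0)

    # X is a uniform bit; Y equals X except flipped with probability 0.1.
    joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
    px = {0: 0.5, 1: 0.5}
    py = {0: 0.5, 1: 0.5}

    I_XY = H(px) + H(py) - H(joint)   # I(X;Y) = H(X) + H(Y) - H(XY)
    print(I_XY)                       # about 0.531 bits: knowing Y removes most of X's uncertainty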
Information Theory • The reason information theory is so important for communication is that information-theoretic quantities readily operationalize. • Can attach an operational meaning to Shannon’s entropy: “the cost of transmitting X”. • Let C(X) be the (expected) cost of transmitting a sample of X.
Is H(X) = C(X)? • Not quite. • Let X be a uniformly random trit: X ∈ {1, 2, 3}. • H(X) = log 3 ≈ 1.585, while C(X) = 5/3 ≈ 1.67 (e.g. the code 1 ↦ 0, 2 ↦ 10, 3 ↦ 11). • It is always the case that C(X) ≥ H(X).
But H(X) and C(X) are close • Huffman coding: C(X) ≤ H(X) + 1. • This is a compression result: “an uninformative message turned into a short one”. • Therefore: H(X) ≤ C(X) ≤ H(X) + 1.
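A hedged sketch of this bound on the trit example: build a Huffman code and compare its expected length C(X) with H(X). The helper huffman_expected_length is my own, not a standard library function:

    import heapq
    from math import log2

    def huffman_expected_length(probs):
        # Expected codeword length of a binary Huffman code for the given probabilities.
        # Each heap entry is (probability, unique_id, list of (leaf probability, current depth)).
        heap = [(p, i, [(p, 0)]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        uid = len(probs)
        while len(heap) > 1:
            p1, _, l1 = heapq.heappop(heap)
            p2, _, l2 = heapq.heappop(heap)
            merged = [(p, d + 1) for p, d in l1 + l2]   # merging pushes every leaf one level down
            heapq.heappush(heap, (p1 + p2, uid, merged))
            uid += 1
        return sum(p * d for p, d in heap[0][2])

    trit = [1 / 3, 1 / 3, 1 / 3]
    H = sum(p * log2(1 / p) for p in trit)       # log2(3) ~ 1.585
    C = huffman_expected_length(trit)            # 5/3 ~ 1.667 (codewords 0, 10, 11)
    print(H, C, H <= C <= H + 1)                 # True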
Shannon’s noiseless coding • The cost of communicating many copies of X scales as H(X). • Shannon’s source coding theorem: • Let C(Xⁿ) be the cost of transmitting n independent copies of X. Then the amortized transmission cost lim_{n→∞} C(Xⁿ)/n = H(X). • This equation gives H(X) its operational meaning.
[Diagram: X1, X2, X3, … sent over the communication channel; cost H(X) per copy to transmit the X’s.]
H(X) is nicer than C(X) • H is additive for independent variables: H(XY) = H(X) + H(Y) when X and Y are independent. • Let T1, …, Tn be independent trits. • H(T1 … Tn) = n log 3 ≈ 1.585·n, whereas sending the trits one at a time costs n · C(T) = 5n/3 ≈ 1.67·n. • Works well with concepts such as channel capacity.
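The gap between C and H also vanishes once many trits are encoded as a single block, which is one way to see why H is the “right” quantity; a quick illustrative computation:

    from math import ceil, log2

    # Encoding n independent trits as one block needs ceil(n * log2(3)) bits,
    # so the per-trit cost approaches H = log2(3) ~ 1.585 (vs 5/3 one trit at a time).
    for n in [1, 5, 10, 100]:
        bits = ceil(n * log2(3))
        print(n, bits / n)
    # 1 -> 2.0, 5 -> 1.6, 10 -> 1.6, 100 -> 1.59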
Operationalizing other quantities • Conditional entropy H(X|Y) (cf. Slepian-Wolf theorem). • [Diagram: Alice holds X1, X2, X3, …, Bob holds Y1, Y2, Y3, …; transmitting the X’s over the communication channel costs H(X|Y) per copy.]
Operationalizing other quantities • Mutual information I(X;Y). • [Diagram: sampling Y’s jointly distributed with Alice’s X1, X2, X3, … costs I(X;Y) per copy over the communication channel.]
Information theory and entropy • Allows us to formalize intuitive notions. • Operationalized in the context of one-way transmission and related problems. • Has nice properties (additivity, chain rule…) • Next, we discuss extensions to more interesting communication scenarios.
Communication complexity • Focus on the two-party randomized setting. • Alice holds an input X, Bob holds an input Y, and they share a random string R. • A & B implement a functionality F(X,Y), e.g. F(X,Y) = “is X = Y?”. • [Diagram: Alice (X) and Bob (Y), with shared randomness R, computing F(X,Y).]
Communication complexity • Goal: implement a functionality F(X,Y). • A protocol π computing F(X,Y): Alice and Bob, with inputs X and Y and shared randomness R, alternate messages m1(X,R), m2(Y,m1,R), m3(X,m1,m2,R), … until both can output F(X,Y). • Communication cost = # of bits exchanged.
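A toy Python sketch of this model, with a trivial protocol for Equality as the example; the runner run_protocol, its string encoding of messages, and the example inputs are all illustrative assumptions:

    def run_protocol(x, y, alice_msgs, bob_msgs, R=""):
        # Toy model: Alice and Bob alternate, and each message is a bit string that may
        # depend on the sender's own input, the shared randomness R, and the transcript so far.
        transcript = []
        speakers = [(alice_msgs, x), (bob_msgs, y)]
        turn = 0
        while speakers[turn % 2][0]:
            msg_fns, inp = speakers[turn % 2]
            m = msg_fns.pop(0)(inp, R, list(transcript))
            transcript.append(m)
            turn += 1
        return transcript, sum(len(m) for m in transcript)

    # Trivial protocol for EQ: Alice sends X (m1), Bob replies with the answer bit (m2).
    x, y = "1011", "1011"
    transcript, cost = run_protocol(
        x, y,
        alice_msgs=[lambda inp, R, t: inp],
        bob_msgs=[lambda inp, R, t: "1" if t[0] == inp else "0"],
    )
    print(transcript, cost)   # ['1011', '1'] 5  -- communication cost = |X| + 1 bits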
Communication complexity • Numerous applications/potential applications. • Considerably more difficult to obtain lower bounds than in the transmission setting (still much easier than for other models of computation!).
Communication complexity • (Distributional) communication complexity with input distribution μ and error ε: D^μ(F, ε). Error ε is measured w.r.t. μ. • (Randomized/worst-case) communication complexity: R(F, ε). Error ≤ ε on all inputs. • Yao’s minimax: R(F, ε) = max_μ D^μ(F, ε).
Examples • Equality: EQ(X,Y) := 1 if X = Y, and 0 otherwise. • With shared randomness, R(EQ, ε) = O(log 1/ε).
Equality • EQ(X,Y) is 1 iff X = Y. • μ is a distribution where w.p. ½ X = Y, and w.p. ½ X and Y are independent and uniformly random. • [Protocol: Alice sends MD5(X) (128 bits); Bob replies with “X = Y?” (1 bit).] • Error? Only on a hash collision. • Shows that 129 bits of communication suffice for EQ with tiny error under μ.
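A runnable sketch of this protocol, using Python’s hashlib implementation of MD5; the function name equality_protocol and the string inputs are illustrative:

    import hashlib

    def equality_protocol(x: str, y: str):
        # Alice sends a 128-bit hash of X; Bob compares it with the hash of Y and replies with one bit.
        alice_msg = hashlib.md5(x.encode()).digest()                 # 128 bits from Alice
        bob_answer = alice_msg == hashlib.md5(y.encode()).digest()   # 1 bit back
        return bob_answer, 128 + 1                                   # answer, communication cost

    print(equality_protocol("hello", "hello"))   # (True, 129)
    print(equality_protocol("hello", "world"))   # (False, 129) -- errs only on an MD5 collision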
Examples • Inner product: IP(X,Y) = Σ_i X_i · Y_i (mod 2). • R(IP, ε) = Ω(n). In fact, using information complexity: • R(IP, ε) ≥ (1 − o(1)) · n.
Information complexity • Information complexity is to communication complexity • as Shannon’s entropy is to transmission cost.
Information complexity • The smallest amount of information Alice and Bob need to exchange to solve F. • How is information measured? • Communication cost of a protocol? • Number of bits exchanged. • Information cost of a protocol? • Amount of information revealed.
Basic definition 1: The information cost of a protocol • Prior distribution: (X, Y) ~ μ. • IC_μ(π) = I(Π; Y | X) + I(Π; X | Y), where Π is the protocol transcript: • what Alice learns about Y + what Bob learns about X. • [Diagram: Alice (X) and Bob (Y) run protocol π, producing transcript Π.]
Example • EQ(X,Y) is 1 iff X = Y. • μ is a distribution where w.p. ½ X = Y, and w.p. ½ X and Y are independent and uniformly random. • [Protocol: Alice sends MD5(X) (128 bits); Bob replies with “X = Y?” (1 bit).] • IC_μ(π) ≈ 1 + 65 = 66 bits: what Alice learns about Y (the one answer bit) + what Bob learns about X (≈ ½·128 bits from the independent case, plus about one bit for learning whether X = Y).
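The two conditional mutual informations in the definition can be computed exactly for small protocols. A sketch that does so for a toy two-bit case: the trivial protocol for AND(X, Y), in which Alice sends X and Bob replies with the answer, under the uniform prior. The protocol choice and helper names are my own; the AND function itself reappears later in the talk.

    from math import log2

    def H(dist):
        return sum(p * log2(1 / p) for p in dist.values() if p > 0)

    def cond_mutual_info(joint, a, b, c):
        # I(A;B | C) = H(AC) + H(BC) - H(ABC) - H(C), with coordinates given by index lists.
        def marg(coords):
            out = {}
            for key, p in joint.items():
                k = tuple(key[i] for i in coords)
                out[k] = out.get(k, 0) + p
            return out
        return H(marg(a + c)) + H(marg(b + c)) - H(marg(a + b + c)) - H(marg(c))

    # Toy protocol for AND(X, Y) under the uniform prior on {0,1}^2:
    # Alice sends X, then Bob sends X AND Y, so the transcript is (X, X AND Y).
    joint = {}
    for x in (0, 1):
        for y in (0, 1):
            joint[(x, y, (x, x & y))] = 0.25

    info_cost = (cond_mutual_info(joint, [2], [1], [0])     # what Alice learns about Y (given X)
                 + cond_mutual_info(joint, [2], [0], [1]))  # what Bob learns about X (given Y)
    print(info_cost)   # 1.5 bits revealed, versus 2 bits communicated

So this trivial protocol reveals 1.5 bits while communicating 2; protocols for AND that reveal less information are what the exact bounds later in the talk build on.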
Prior matters a lot for information cost! • If μ is a singleton distribution (both inputs are fixed), then IC_μ(π) = 0 for every protocol π.
Example • EQ(X,Y) is 1 iff X = Y. • μ is a distribution where X and Y are just independent and uniformly random. • [Protocol: Alice sends MD5(X) (128 bits); Bob replies with “X = Y?” (1 bit).] • IC_μ(π) ≈ 0 + 128 = 128 bits: Alice learns essentially nothing about Y (the answer is almost surely “not equal”), while Bob learns the full 128 hash bits about X.
Basic definition 2: Information complexity • Communication complexity: D^μ(F, ε) = the minimum, over protocols π computing F with error ≤ ε w.r.t. μ, of the communication cost of π. • Analogously: IC_μ(F, ε) = the infimum, over protocols π computing F with error ≤ ε w.r.t. μ, of IC_μ(π). • The infimum (rather than a minimum) is needed!
Prior-free information complexity • Using minimax one can get rid of the prior. • For communication, we had: R(F, ε) = max_μ D^μ(F, ε). • For information: IC(F, ε) = inf_π max_μ IC_μ(π), where the infimum is over protocols π computing F with error ≤ ε on every input.
Operationalizing IC: Information equals amortized communication • Recall [Shannon]: lim_{n→∞} C(Xⁿ)/n = H(X). • Turns out [B.-Rao’11]: lim_{n→∞} D^{μⁿ}(Fⁿ, ε)/n = IC_μ(F, ε), for ε > 0. [Error ε allowed on each copy.] • For ε = 0: lim_{n→∞} D^{μⁿ}(Fⁿ, 0+)/n = IC_μ(F, 0). • [Whether the same holds for exactly zero error is an interesting open problem.]
Can interactive communication be compressed? • Is it true that D^μ(F, ε) = O(IC_μ(F, ε))? • Less ambitiously: can D^μ(F, ε) at least be bounded by a modest function of IC_μ(F, ε)? • (Almost) equivalently: Given a protocol π with IC_μ(π) = I, can Alice and Bob simulate π using about O(I) bits of communication? • Not known in general…
Applications • Information = amortized communication means that, to understand the amortized communication cost of a problem, it is enough to understand its information complexity.
Example: the disjointness function • S, T are subsets of {1, …, n}. • Alice gets S, Bob gets T. • Need to determine whether S ∩ T = ∅. • In binary notation (characteristic vectors X, Y ∈ {0,1}ⁿ) need to compute ⋁_{i=1..n} (X_i ∧ Y_i): the sets intersect iff this OR is 1. • An OR acting on n copies of the 2-bit AND function.
Set intersection • S, T are subsets of {1, …, n}. • Alice gets S, Bob gets T. • Want to compute S ∩ T. • This is just n independent copies of the 2-bit AND. • Understanding the information complexity of AND gives tight bounds on both problems!
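A minimal Python sketch of the reduction to per-coordinate ANDs, using the characteristic-vector encoding; the helper names are mine:

    def disjointness(x, y):
        # DISJ on characteristic vectors: are the sets disjoint? (NOT of an OR of 2-bit ANDs.)
        return not any(xi & yi for xi, yi in zip(x, y))

    def intersection(x, y):
        # Set intersection: just n independent copies of the 2-bit AND.
        return [xi & yi for xi, yi in zip(x, y)]

    # S = {0, 2}, T = {2, 3} as characteristic vectors over a universe of size 4.
    x, y = [1, 0, 1, 0], [0, 0, 1, 1]
    print(disjointness(x, y))   # False: the sets share element 2
    print(intersection(x, y))   # [0, 0, 1, 0]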
Exact communication bounds [B.-Garg-Pankratov-Weinstein’13] • R(Disj_n, ε) ≤ n + 1 (trivial). • R(Disj_n, ε) = Ω(n) [Kalyanasundaram-Schnitger’87, Razborov’92]. New: • R(Disj_n, ε) = C_DISJ · n ± o(n) as the error tends to 0, where C_DISJ ≈ 0.4827.
Small set disjointness • S, T are subsets of {1, …, n}, with |S|, |T| ≤ k. • Alice gets S, Bob gets T. • Need to determine whether S ∩ T = ∅. • Trivial: O(k log n). • [Hastad-Wigderson’07]: O(k). • [BGPW’13]: (2/ln 2) · k ± o(k) ≈ 2.885·k.
Open problem: Computability of IC • Given the truth table of F(X,Y), the error ε, and the prior μ, compute IC_μ(F, ε). • Via IC_μ(F, ε) = lim_{n→∞} D^{μⁿ}(Fⁿ, ε)/n one can compute a sequence of upper bounds. • But the rate of convergence as a function of n is unknown.
Open problem: Computability of IC • Can compute the r-round information complexity of F, IC^r_μ(F, ε). • But the rate of convergence as a function of r is unknown. • Conjecture: IC^r_μ(F, ε) − IC_μ(F, ε) = O_F(1/r²). • This is the relationship for the two-bit AND.