Information and interactive computation Mark Braverman Computer Science, Princeton University January 16, 2012
Prelude: one-way communication • Basic goal: send a message from Alice to Bob over a channel. [Diagram: Alice → communication channel → Bob]
One-way communication • Encode; • Send; • Decode. [Diagram: Alice → communication channel → Bob]
Coding for one-way communication • There are two main problems a good encoding needs to address: • Efficiency: use the least amount of the channel/storage necessary. • Error-correction: recover from (reasonable) errors.
Interactive computation Today's theme: extending information and coding theory to interactive computation. I will talk about interactive information theory, and Anup Rao will talk about interactive error correction.
Efficient encoding • Can measure the cost of storing a random variable X very precisely. • Entropy: H(X) = ∑_x Pr[X=x]·log(1/Pr[X=x]). • H(X) measures the average amount of information a sample from X reveals. • A uniformly random string of 1,000 bits has 1,000 bits of entropy.
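As a quick illustration of the definition, here is a minimal Python sketch (not from the talk; the distributions are made-up examples) that evaluates H(X) for a few small distributions:

```python
import math

def entropy(dist):
    """H(X) = sum over x of Pr[X=x] * log2(1 / Pr[X=x])."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

fair_bit = {0: 0.5, 1: 0.5}
print(entropy(fair_bit))           # 1.0 bit per fair coin flip
print(1000 * entropy(fair_bit))    # a uniform 1,000-bit string: 1,000 bits of entropy
biased_bit = {0: 0.99, 1: 0.01}
print(entropy(biased_bit))         # ~0.08 bits: a biased bit reveals far less per sample
```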
Efficient encoding • H(X) = ∑_x Pr[X=x]·log(1/Pr[X=x]). • The ZIP algorithm works because • H(X = a typical 1MB file) < 8 Mbits. • P[“Hello, my name is Bob”] >> P[“h)2cjCv9]dsnC1=Ns{da3”]. • For one-way encoding, Shannon’s source coding theorem states that • Communication ≈ Information.
Efficient encoding • The problem of sending many samples of X can be implemented in H(X) communication on average. • The problem of sending a single sample of X can be implemented in <H(X)+1 communication in expectation.
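To make the “< H(X)+1” bound concrete, here is a minimal sketch (my illustration, not part of the talk) that builds Huffman codeword lengths for a toy distribution and compares the expected codeword length with the entropy:

```python
import heapq, math

def huffman_lengths(dist):
    """Codeword lengths of a Huffman code for a distribution {symbol: probability}."""
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(dist.items())]
    heapq.heapify(heap)
    counter = len(heap)                    # tiebreaker so dicts are never compared
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**c1, **c2}.items()}
        counter += 1
        heapq.heappush(heap, (p1 + p2, counter, merged))
    return heap[0][2]

dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_lengths(dist)
H = sum(p * math.log2(1 / p) for p in dist.values())
avg = sum(dist[s] * lengths[s] for s in dist)
print(H, avg)   # expected length satisfies H(X) <= avg < H(X) + 1
```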
Communication complexity [Yao] • Focus on the two-party setting. • A & B implement a functionality F(X,Y), e.g. F(X,Y) = “X=Y?”. [Diagram: A holds X, B holds Y; they communicate to compute F(X,Y)]
Communication complexity • Goal: implement a functionality F(X,Y). • A protocol π(X,Y) computing F(X,Y) exchanges messages m1(X), m2(Y,m1), m3(X,m1,m2), … until both parties know F(X,Y). • Communication cost = # of bits exchanged.
Distributional communication complexity • The input pair (X,Y) is drawn according to some distribution μ. • Goal: make a mistake on at most an ε fraction of inputs. • The communication cost: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ).
Example • μ is a distribution over pairs of files. • F is “X=Y?”. • Protocol: Alice sends MD5(X) (128 bits); Bob replies “X=Y?” (1 bit). • Communication cost = 129 bits. ε ≈ 2^(-128).
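The slide's protocol is easy to spell out in code; the sketch below (file contents are placeholder strings) sends the 128-bit MD5 digest one way and the 1-bit answer back. Since MD5 is a fixed function, the error here is over the inputs colliding under the hash, heuristically about 2^(-128).

```python
import hashlib

def alice_message(x: bytes) -> bytes:
    # Alice's only message: the 128-bit MD5 digest of her file (16 bytes = 128 bits).
    return hashlib.md5(x).digest()

def bob_answer(y: bytes, digest: bytes) -> bool:
    # Bob's 1-bit reply: does Alice's digest match the digest of his own file?
    return hashlib.md5(y).digest() == digest

x = y = b"Hello, my name is Bob"
print(bob_answer(y, alice_message(x)))   # True; total communication 128 + 1 = 129 bits
```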
Randomized communication complexity • Goal: err with probability at most ε on every input. • The communication cost: R(F,ε). • Clearly: C(F,μ,ε) ≤ R(F,ε) for all μ. • What about the converse? • A minimax(!) argument [Yao]: R(F,ε) = max_μ C(F,μ,ε).
A note about the model • We assume a shared public source of randomness R. [Diagram: Alice (X) and Bob (Y) both see the public random string R]
The communication complexity of EQ(X,Y) • The communication complexity of equality: R(EQ,ε) ≈ log(1/ε). • Send log(1/ε) random hash functions applied to the inputs. Accept if all of them agree. • What if ε=0? R(EQ,0) ≈ n, where X,Y ∈ {0,1}^n.
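Here is a sketch of that public-randomness hashing protocol (my rendering, not the talk's code: it instantiates the shared hash functions as random GF(2) inner products and assumes equal-length inputs):

```python
import math
import random

def shared_hash_bit(x: bytes, seed: int) -> int:
    """One public-randomness hash: inner product of x's bits with a random 0/1 vector, mod 2."""
    rng = random.Random(seed)                      # the seed stands in for public randomness
    bits = [(byte >> j) & 1 for byte in x for j in range(8)]
    return sum(b & rng.randrange(2) for b in bits) % 2

def equality(x: bytes, y: bytes, eps: float, public_seed: int = 2012) -> bool:
    """Alice sends ~log(1/eps) one-bit hashes of X; accept iff they all match Bob's hashes of Y."""
    k = max(1, math.ceil(math.log2(1 / eps)))
    for i in range(k):
        if shared_hash_bit(x, public_seed + i) != shared_hash_bit(y, public_seed + i):
            return False                           # a mismatch proves X != Y
    return True                                    # if X != Y, wrong with prob. <= 2^{-k} <= eps

print(equality(b"file-A", b"file-A", eps=0.01))    # True
print(equality(b"file-A", b"file-B", eps=0.01))    # almost surely False
```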
Information in a two-way channel • H(X) is the “inherent information cost” of sending a message distributed according to X over the channel. [Diagram: Alice sends X over the communication channel to Bob] • What is the two-way analogue of H(X)?
Entropy of interactive computation • “Inherent information cost” of interactive two-party tasks. [Diagram: Alice (X) and Bob (Y) interact using public randomness R]
One more definition: Mutual Information • The mutual information of two random variables is the amount of information knowing one reveals about the other: I(A;B) = H(A) + H(B) − H(A,B). • If A, B are independent, I(A;B) = 0. • I(A;A) = H(A). [Venn diagram: H(A) and H(B) overlapping in I(A;B)]
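A brute-force check of these identities on toy joint distributions (my example code, not from the talk):

```python
import math
from collections import defaultdict

def entropy(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(A;B) = H(A) + H(B) - H(A,B), for joint = {(a, b): probability}."""
    p_a, p_b = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        p_a[a] += p
        p_b[b] += p
    return entropy(p_a) + entropy(p_b) - entropy(joint)

independent = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
identical = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(independent))   # 0.0: independent variables share no information
print(mutual_information(identical))     # 1.0: I(A;A) = H(A) for a fair bit
```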
Information cost of a protocol • [Chakrabarti-Shi-Wirth-Yao-01, Bar-Yossef-Jayram-Kumar-Sivakumar-04, Barak-B-Chen-Rao-10]. • Caution: different papers use “information cost” to denote different things! • Today, we have a better understanding of the relationship between those different things.
Information cost of a protocol • Prior distribution: (X,Y) ~ μ. • Alice and Bob run the protocol π; let π also denote its transcript. • I(π,μ) = I(π;Y|X) + I(π;X|Y) = what Alice learns about Y + what Bob learns about X.
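For intuition, the internal information cost of a small concrete protocol can be computed by brute force. The sketch below is a toy of mine (not an example from the talk): the one-round protocol in which Alice sends X and Bob replies with X AND Y, under the uniform prior on {0,1}×{0,1}; it evaluates I(π;Y|X) + I(π;X|Y) directly from the joint distribution of (X, Y, transcript).

```python
import math
from collections import defaultdict
from itertools import product

def cond_mutual_info(joint, a_of, b_of, c_of):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C), by brute-force marginalization."""
    def H(extractors):
        marginal = defaultdict(float)
        for outcome, p in joint.items():
            marginal[tuple(f(outcome) for f in extractors)] += p
        return sum(p * math.log2(1 / p) for p in marginal.values() if p > 0)
    return H([a_of, c_of]) + H([b_of, c_of]) - H([a_of, b_of, c_of]) - H([c_of])

# Toy protocol for AND(X,Y): Alice sends X; Bob replies X AND Y. Transcript = (X, X AND Y).
mu = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}        # uniform prior
joint = {(x, y, (x, x & y)): p for (x, y), p in mu.items()}      # joint law of (X, Y, transcript)
x_of, y_of, pi_of = (lambda w: w[0]), (lambda w: w[1]), (lambda w: w[2])

alice_learns = cond_mutual_info(joint, pi_of, y_of, x_of)   # I(pi; Y | X) = 0.5
bob_learns = cond_mutual_info(joint, pi_of, x_of, y_of)     # I(pi; X | Y) = 1.0
print(alice_learns + bob_learns)                            # internal information cost: 1.5 bits
```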
External information cost • (X,Y) ~ μ. • An external observer Charlie sees the transcript π of the protocol run by Alice and Bob. • Iext(π,μ) = I(π;XY) = what Charlie learns about (X,Y).
Another view on I and Iext • It is always the case that C(π, μ) ≥ Iext(π, μ) ≥ I(π, μ). • Iext measures the ability of Alice and Bob to compute F(X,Y) in an information-theoretically secure way if they are afraid of an eavesdropper. • I measures the ability of the parties to compute F(X,Y) if they are afraid of each other.
Example • F is “X=Y?”. • μ is a distribution where w.p. ½ X=Y, and w.p. ½ (X,Y) are random. • Protocol: Alice sends MD5(X) [128 bits]; Bob replies “X=Y?” [1 bit]. • Iext(π,μ) = I(π;XY) = 129 bits = what Charlie learns about (X,Y).
Example • F is “X=Y?”. • μ is a distribution where w.p. ½ X=Y, and w.p. ½ (X,Y) are random. • Same protocol: Alice sends MD5(X) [128 bits]; Bob replies “X=Y?” [1 bit]. • I(π,μ) = I(π;Y|X) + I(π;X|Y) ≈ 1 + 64.5 = 65.5 bits = what Alice learns about Y + what Bob learns about X.
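A rough accounting of where the 65.5 bits come from (my reconstruction of the arithmetic, not spelled out on the slide):

```latex
% Bob's side: when X \ne Y (prob. 1/2) the 128-bit digest is essentially fresh
% information about X; when X = Y (prob. 1/2) he learns only a ~1-bit confirmation.
I(\pi; X \mid Y) \;\approx\; \tfrac12\cdot 128 + \tfrac12\cdot 1 \;=\; 64.5 \text{ bits}
% Alice's side: the 1-bit reply tells her essentially only whether X = Y.
I(\pi; Y \mid X) \;\approx\; 1 \text{ bit}
% Total internal information cost:
I(\pi,\mu) \;\approx\; 1 + 64.5 \;=\; 65.5 \text{ bits}
```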
The (distributional) information cost of a problem F • Recall: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ). • By analogy: I(F,μ,ε) := inf_{π computes F with error ≤ ε} I(π, μ); Iext(F,μ,ε) := inf_{π computes F with error ≤ ε} Iext(π, μ).
I(F,μ,ε) vs. C(F,μ,ε): compressing interactive computation Source Coding Theorem: the problem of sending a sample of X can be implemented in expected cost <H(X)+1 communication – the information content of X. Is the same compression true for interactive protocols? Can F be solved in I(F,μ,ε) communication? Or in Iext(F,μ,ε) communication?
The big question • Can interactive communication be compressed? • Can π be simulated by π’ such that C(π’, μ) ≈ I(π, μ)? Does I(F,μ,ε) ≈ C(F,μ,ε)?
Compression results we know • Let ε, ρ be constants; let π be a protocol that computes F with error ε. • π’s costs: C, Iext, I. • Then π can be simulated using: • (I·C)^{1/2}·polylog(C) communication; [Barak-B-Chen-Rao’10] • Iext·polylog(C) communication; [Barak-B-Chen-Rao’10] • 2^{O(I)} communication; [B’11] • while introducing an extra error of ρ.
The amortized cost of interactive computation Source Coding Theorem: the amortized cost of sending many independent samples of X is H(X). What is the amortized cost of computing many independent copies of F(X,Y)?
Information = amortized communication • Theorem [B-Rao’11]: for ε>0, I(F,μ,ε) = lim_{n→∞} C(F^n,μ^n,ε)/n. • I(F,μ,ε) is the interactive analogue of H(X).
Information = amortized communication • Theorem [B-Rao’11]: for ε>0, I(F,μ,ε) = lim_{n→∞} C(F^n,μ^n,ε)/n. • I(F,μ,ε) is the interactive analogue of H(X). • Can we get rid of μ? I.e., make I(F,ε) a property of the task F?
Prior-free information cost • Define: I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ). • Want a protocol that reveals little information against all priors μ! • Definitions are cheap! • What is the connection between the “syntactic” I(F,ε) and the “meaningful” I(F,μ,ε)? • I(F,μ,ε) ≤ I(F,ε)…
Prior-free information cost • I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ). • I(F,μ,ε) ≤ I(F,ε) for all μ. • Recall: R(F,ε) = max_μ C(F,μ,ε). • Theorem [B’11]: I(F,ε) ≤ 2·max_μ I(F,μ,ε/2); I(F,0) = max_μ I(F,μ,0).
Prior-free information cost • Recall: I(F,μ,ε) = lim_{n→∞} C(F^n,μ^n,ε)/n. • Theorem: for ε>0, I(F,ε) = lim_{n→∞} R(F^n,ε)/n.
Example • R(EQ,0) ≈ n. • What is I(EQ,0)?
The information cost of Equality • What is I(EQ,0)? • Consider the following protocol, for X, Y ∈ {0,1}^n and a shared (random) non-singular matrix A with rows A1, A2, …: in round i, Alice sends Ai·X and Bob sends Ai·Y. • Continue for n steps, or until a disagreement is discovered.
Analysis (sketch) • If X≠Y, the protocol will terminate in O(1) rounds on average, and thus reveal O(1) information. • If X=Y… the players only learn the fact that X=Y (≤1 bit of information). • Thus the protocol has O(1) information complexity.
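A small simulation of this protocol (a sketch under my own simplification: independent random vectors A_i rather than the rows of a shared non-singular matrix, which gives up the zero-error guarantee at round n) shows the behavior described above: unequal inputs are caught after about two rounds on average, while equal inputs run for all n rounds and the players learn only the fact X = Y.

```python
import random

def equality_rounds(x_bits, y_bits, seed=0):
    """Round i: Alice sends <A_i, X> mod 2 and Bob sends <A_i, Y> mod 2 for a shared
    random vector A_i; stop at the first disagreement, or declare X = Y after n rounds."""
    shared = random.Random(seed)                   # stands in for the public randomness
    n = len(x_bits)
    for i in range(1, n + 1):
        a = [shared.randrange(2) for _ in range(n)]
        ax = sum(ai & xi for ai, xi in zip(a, x_bits)) % 2
        ay = sum(ai & yi for ai, yi in zip(a, y_bits)) % 2
        if ax != ay:
            return ("X != Y", i)                   # unequal inputs are caught here
    return ("X = Y", n)                            # equal inputs: only the fact X = Y is learned

rng = random.Random(1)
x = [rng.randrange(2) for _ in range(64)]
y = list(x)                                        # equal inputs: runs all 64 rounds
z = list(x); z[0] ^= 1                             # unequal inputs: caught quickly
print(equality_rounds(x, y))                       # ('X = Y', 64)
print(equality_rounds(x, z))                       # ('X != Y', small round number)
```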
Direct sum theorems • I(F,ε) = lim_{n→∞} R(F^n,ε)/n. • Questions: • Does R(F^n,ε) = Ω(n·R(F,ε))? • Does R(F^n,ε) = ω(R(F,ε))?
Direct sum strategy • The strategy for proving direct sum results: • Take a protocol for F^n that costs C_n = R(F^n,ε), and make a protocol for F that costs ≈ C_n/n. • This would mean that C ≲ C_n/n, i.e. C_n ≳ n·C. [Diagram: a protocol for n copies of F, cost C_n → a protocol for 1 copy of F, cost ≈ C_n/n?]
Direct sum strategy • If life were so simple… [Diagram: if the cost-C_n protocol for F^n split neatly into n independent pieces, one per copy and each of cost C_n/n, extracting a cost-C_n/n protocol for 1 copy of F would be easy]
Direct sum strategy • Theorem: I(F,ε) = I(F^n,ε)/n ≤ C_n/n = R(F^n,ε)/n. • Compression → direct sum!
The information cost angle • Restricting the protocol for F^n to a single copy gives a protocol for one copy of F with communication cost C_n but information cost ≤ C_n/n. • Compression? [Diagram: n copies, cost C_n → restriction → 1 copy of F with C_n/n information → compression? → ≈ C_n/n communication]
Direct sum theorems • Best known general simulation [BBCR’10]: a protocol with C communication and I information cost can be simulated using (I·C)^{1/2}·polylog(C) communication. • Implies: R(F^n,ε) = Ω̃(n^{1/2}·R(F,ε)).
Compression vs. direct sum • We saw that compression → direct sum. • A form of the converse is also true. • Recall: I(F,ε) = lim_{n→∞} R(F^n,ε)/n. • If there is a problem such that I(F,ε) = o(R(F,ε)), then R(F^n,ε) = o(n·R(F,ε)).
A complete problem • Can define a problem called Correlated Pointer Jumping – CPJ(C,I). • The problem has communication cost C and information cost I. • CPJ(C,I) is the “least compressible problem”. • If R(CPJ(C,I),1/3)=O(I), then R(F,1/3)=O(I(F,1/3)) for all F.
The big picture [Diagram of the four quantities and the relations connecting them: I(F^n,ε)/n ↔ I(F,ε) (direct sum for information); I(F^n,ε)/n ↔ R(F^n,ε)/n (information = amortized communication); I(F,ε) ↔ R(F,ε) (interactive compression?); R(F^n,ε)/n ↔ R(F,ε) (direct sum for communication?)]
Partial progress • Can compress bounded-round interactive protocols. • The main primitive is a one-shot version of the Slepian-Wolf theorem. • Alice gets a distribution P_X. • Bob gets a prior distribution P_Y. • Goal: both must sample from P_X.
Correlated sampling [Diagram: Alice holds P_X, Bob holds P_Y; at the end both hold the same sample M ~ P_X] • The best we can hope for is D(P_X||P_Y) communication.
Proof Idea • Sample using D(P_X||P_Y) + O(log 1/ε + D(P_X||P_Y)^{1/2}) communication, with statistical error ε. • Public randomness: ~|U| samples (u_1,q_1), (u_2,q_2), (u_3,q_3), … with u_i ∈ U and q_i ∈ [0,1]. [Figure: the points (u_i,q_i) plotted against the graphs of P_X and P_Y on [0,1]; Alice keeps the points falling under P_X, Bob the points falling under P_Y]
Proof Idea • Sample using D(P_X||P_Y) + O(log 1/ε + D(P_X||P_Y)^{1/2}) communication, with statistical error ε. [Figure: Alice’s sample is the point u_4 under P_X; she sends hashes h_1(u_4), h_2(u_4), … so that Bob can identify u_4 among his candidate points under P_Y]
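A toy rendering of this dart-throwing idea (heavily simplified and my own: a single fixed-length hash instead of the adaptively sized hashes of the real protocol, so the communication bound above is not reproduced; all names are illustrative):

```python
import hashlib
import random

def correlated_sampling_sketch(p_x, p_y, num_darts=20000, hash_bits=20, seed=0):
    """Both parties see the same public darts (u_i, q_i), u_i uniform in U and q_i uniform in [0,1].
    Alice keeps the first dart under the graph of P_X (its u-value is then distributed ~ P_X);
    she sends a short hash of it, and Bob looks for a matching dart under growing multiples of P_Y."""
    universe = list(p_x)
    rng = random.Random(seed)
    darts = [(rng.choice(universe), rng.random()) for _ in range(num_darts)]
    h = lambda u: hashlib.sha256(str(u).encode()).hexdigest()[: hash_bits // 4]

    # Alice: the first dart falling under P_X.
    u_alice = next(u for (u, q) in darts if q <= p_x[u])
    message = h(u_alice)                                   # ~hash_bits bits of communication

    # Bob: scan darts under 2^k * P_Y for k = 0, 1, 2, ... and take the first hash match.
    for k in range(32):
        for (u, q) in darts:
            if q <= min(1.0, (2 ** k) * p_y.get(u, 0.0)) and h(u) == message:
                return u_alice, u                          # (Alice's sample, Bob's guess)
    return u_alice, None

p_x = {"a": 0.7, "b": 0.2, "c": 0.1}
p_y = {"a": 0.1, "b": 0.2, "c": 0.7}
print(correlated_sampling_sketch(p_x, p_y))                # the two coordinates almost always agree
```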