Introduction to information complexity June 30, 2013 Mark Braverman Princeton University
Part I: Information theory • Information theory, in its modern form, was introduced in the 1940s to study the problem of transmitting data over physical channels. • [Diagram: Alice → communication channel → Bob]
Quantifying “information” • Information is measured in bits. • The basic notion is Shannon’s entropy. • The entropy of a random variable is the (typical) number of bits needed to remove the uncertainty of the variable. • For a discrete variable X: H(X) = Σ_x Pr[X = x] · log(1/Pr[X = x]).
Shannon’s entropy • Important examples and properties: • If X is a constant, then H(X) = 0. • If X is uniform on a finite set S of possible values, then H(X) = log |S|. • If X is supported on at most k values, then H(X) ≤ log k. • If Y is a random variable determined by X, then H(Y) ≤ H(X).
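As a quick sanity check of these properties, here is a minimal Python sketch; the helper name entropy and the example distributions are illustrative choices, not from the talk:

    from math import log2

    def entropy(dist):
        # Shannon entropy, in bits, of a discrete distribution given as {value: probability}.
        return sum(p * log2(1 / p) for p in dist.values() if p > 0)

    print(entropy({"a": 1.0}))                        # a constant: 0.0
    print(entropy({x: 1 / 8 for x in range(8)}))      # uniform on 8 values: 3.0 = log2(8)
    print(entropy({"a": 0.5, "b": 0.25, "c": 0.25}))  # 1.5 <= log2(3): at most log2(#values)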
Conditional entropy • For two (potentially correlated) variables X, Y, the conditional entropy of X given Y is the amount of uncertainty left in X once Y is known: H(X|Y) = E_{y~Y} [H(X | Y = y)]. • One can show H(XY) = H(Y) + H(X|Y). • This important fact is known as the chain rule. • If X and Y are independent, then H(X|Y) = H(X), and hence H(XY) = H(X) + H(Y).
Example • For instance, let B1, B2 be independent uniform bits, and set X = (B1, B2), Y = B1 ⊕ B2. • Then • H(X) = 2; H(Y) = 1; H(X|Y) = 1; H(XY) = 2.
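A small Python check of the chain rule on this example; a sketch, with helper names H and marginal of my own choosing:

    from itertools import product
    from math import log2

    def H(dist):
        return sum(p * log2(1 / p) for p in dist.values() if p > 0)

    # Joint distribution of X = (B1, B2) and Y = B1 xor B2 for independent uniform bits.
    joint = {}
    for b1, b2 in product([0, 1], repeat=2):
        joint[((b1, b2), b1 ^ b2)] = 0.25

    def marginal(joint, idx):
        out = {}
        for key, p in joint.items():
            out[key[idx]] = out.get(key[idx], 0) + p
        return out

    HX  = H(marginal(joint, 0))       # H(X)  = 2.0
    HY  = H(marginal(joint, 1))       # H(Y)  = 1.0
    HXY = H(joint)                    # H(XY) = 2.0
    print(HX, HY, HXY, HXY - HY)      # H(X|Y) = H(XY) - H(Y) = 1.0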
Mutual information • The mutual information of X and Y is defined as I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X). • “By how much does knowing Y reduce the entropy of X?” • Always non-negative: I(X;Y) ≥ 0. • Conditional mutual information: I(X;Y|Z) = H(X|Z) − H(X|YZ). • Chain rule for mutual information: I(XY;Z) = I(X;Z) + I(Y;Z|X). • Simple intuitive interpretation.
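The definition is easy to evaluate numerically. A sketch for a noisy bit, where Y equals X except flipped with probability 0.1 (the flip probability is just an illustrative choice):

    from math import log2

    def H(dist):
        return sum(p * log2(1 / p) for p in dist.values() if p > 0)

    # X is a uniform bit; Y equals X except flipped with probability 0.1.
    joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
    px = {0: 0.5, 1: 0.5}
    py = {0: 0.5, 1: 0.5}

    I_XY = H(px) + H(py) - H(joint)   # I(X;Y) = H(X) + H(Y) - H(XY)
    print(I_XY)                       # about 0.531 bits: knowing Y removes most of X's uncertainty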
Information Theory • The reason information theory is so important for communication is that information-theoretic quantities readily operationalize. • Can attach an operational meaning to Shannon’s entropy: “the cost of transmitting X”. • Let C(X) be the (expected) cost of transmitting a sample of X.
Is H(X) = C(X)? • Not quite. • Let X be a uniformly random trit: X ∈ {1, 2, 3}. • H(X) = log 3 ≈ 1.585, while C(X) = 5/3 ≈ 1.67 (e.g. the code 1 ↦ 0, 2 ↦ 10, 3 ↦ 11). • It is always the case that C(X) ≥ H(X).
But H(X) and C(X) are close • Huffman coding: C(X) ≤ H(X) + 1. • This is a compression result: “an uninformative message turned into a short one”. • Therefore: H(X) ≤ C(X) ≤ H(X) + 1.
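A hedged sketch of this bound on the trit example: build a Huffman code and compare its expected length C(X) with H(X). The helper huffman_expected_length is my own, not a standard library function:

    import heapq
    from math import log2

    def huffman_expected_length(probs):
        # Expected codeword length of a binary Huffman code for the given probabilities.
        # Each heap entry is (probability, unique_id, list of (leaf probability, current depth)).
        heap = [(p, i, [(p, 0)]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        uid = len(probs)
        while len(heap) > 1:
            p1, _, l1 = heapq.heappop(heap)
            p2, _, l2 = heapq.heappop(heap)
            merged = [(p, d + 1) for p, d in l1 + l2]   # merging pushes every leaf one level down
            heapq.heappush(heap, (p1 + p2, uid, merged))
            uid += 1
        return sum(p * d for p, d in heap[0][2])

    trit = [1 / 3, 1 / 3, 1 / 3]
    H = sum(p * log2(1 / p) for p in trit)       # log2(3) ~ 1.585
    C = huffman_expected_length(trit)            # 5/3 ~ 1.667 (codewords 0, 10, 11)
    print(H, C, H <= C <= H + 1)                 # True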
Shannon’s noiseless coding • The cost of communicating many copies of X scales as H(X). • Shannon’s source coding theorem: • Let C(Xⁿ) be the cost of transmitting n independent copies of X. Then the amortized transmission cost lim_{n→∞} C(Xⁿ)/n = H(X). • This equation gives H(X) its operational meaning.
[Diagram: X1, X2, X3, … sent over the communication channel; cost H(X) per copy to transmit the X’s.]
H(X) is nicer than C(X) • H is additive for independent variables: H(XY) = H(X) + H(Y) when X and Y are independent. • Let T1, …, Tn be independent trits. • H(T1 … Tn) = n log 3 ≈ 1.585·n, whereas sending the trits one at a time costs n · C(T) = 5n/3 ≈ 1.67·n. • Works well with concepts such as channel capacity.
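The gap between C and H also vanishes once many trits are encoded as a single block, which is one way to see why H is the “right” quantity; a quick illustrative computation:

    from math import ceil, log2

    # Encoding n independent trits as one block needs ceil(n * log2(3)) bits,
    # so the per-trit cost approaches H = log2(3) ~ 1.585 (vs 5/3 one trit at a time).
    for n in [1, 5, 10, 100]:
        bits = ceil(n * log2(3))
        print(n, bits / n)
    # 1 -> 2.0, 5 -> 1.6, 10 -> 1.6, 100 -> 1.59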
Operationalizing other quantities • Conditional entropy H(X|Y) (cf. Slepian-Wolf theorem). • [Diagram: Alice holds X1, X2, X3, …, Bob holds Y1, Y2, Y3, …; transmitting the X’s over the communication channel costs H(X|Y) per copy.]
Operationalizing other quantities • Mutual information I(X;Y). • [Diagram: sampling Y’s jointly distributed with Alice’s X1, X2, X3, … costs I(X;Y) per copy over the communication channel.]
Information theory and entropy • Allows us to formalize intuitive notions. • Operationalized in the context of one-way transmission and related problems. • Has nice properties (additivity, chain rule…) • Next, we discuss extensions to more interesting communication scenarios.
Communication complexity • Focus on the two-party randomized setting. • Alice holds an input X, Bob holds an input Y, and they share a random string R. • A & B implement a functionality F(X,Y), e.g. F(X,Y) = “is X = Y?”. • [Diagram: Alice (X) and Bob (Y), with shared randomness R, computing F(X,Y).]
Communication complexity • Goal: implement a functionality F(X,Y). • A protocol π computing F(X,Y): Alice and Bob, with inputs X and Y and shared randomness R, alternate messages m1(X,R), m2(Y,m1,R), m3(X,m1,m2,R), … until both can output F(X,Y). • Communication cost = # of bits exchanged.
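A toy Python sketch of this model, with a trivial protocol for Equality as the example; the runner run_protocol, its string encoding of messages, and the example inputs are all illustrative assumptions:

    def run_protocol(x, y, alice_msgs, bob_msgs, R=""):
        # Toy model: Alice and Bob alternate, and each message is a bit string that may
        # depend on the sender's own input, the shared randomness R, and the transcript so far.
        transcript = []
        speakers = [(alice_msgs, x), (bob_msgs, y)]
        turn = 0
        while speakers[turn % 2][0]:
            msg_fns, inp = speakers[turn % 2]
            m = msg_fns.pop(0)(inp, R, list(transcript))
            transcript.append(m)
            turn += 1
        return transcript, sum(len(m) for m in transcript)

    # Trivial protocol for EQ: Alice sends X (m1), Bob replies with the answer bit (m2).
    x, y = "1011", "1011"
    transcript, cost = run_protocol(
        x, y,
        alice_msgs=[lambda inp, R, t: inp],
        bob_msgs=[lambda inp, R, t: "1" if t[0] == inp else "0"],
    )
    print(transcript, cost)   # ['1011', '1'] 5  -- communication cost = |X| + 1 bits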
Communication complexity • Numerous applications/potential applications. • Considerably more difficult to obtain lower bounds than in the transmission setting (still much easier than for other models of computation!).
Communication complexity • (Distributional) communication complexity with input distribution μ and error ε: D^μ(F, ε). Error ε is measured w.r.t. μ. • (Randomized/worst-case) communication complexity: R(F, ε). Error ≤ ε on all inputs. • Yao’s minimax: R(F, ε) = max_μ D^μ(F, ε).
Examples • Equality: EQ(X,Y) := 1 if X = Y, and 0 otherwise. • With shared randomness, R(EQ, ε) = O(log 1/ε).
Equality • EQ(X,Y) is 1 iff X = Y. • μ is a distribution where w.p. ½ X = Y, and w.p. ½ X and Y are independent and uniformly random. • [Protocol: Alice sends MD5(X) (128 bits); Bob replies with “X = Y?” (1 bit).] • Error? Only on a hash collision. • Shows that 129 bits of communication suffice for EQ with tiny error under μ.
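A runnable sketch of this protocol, using Python’s hashlib implementation of MD5; the function name equality_protocol and the string inputs are illustrative:

    import hashlib

    def equality_protocol(x: str, y: str):
        # Alice sends a 128-bit hash of X; Bob compares it with the hash of Y and replies with one bit.
        alice_msg = hashlib.md5(x.encode()).digest()                 # 128 bits from Alice
        bob_answer = alice_msg == hashlib.md5(y.encode()).digest()   # 1 bit back
        return bob_answer, 128 + 1                                   # answer, communication cost

    print(equality_protocol("hello", "hello"))   # (True, 129)
    print(equality_protocol("hello", "world"))   # (False, 129) -- errs only on an MD5 collision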
Examples • Inner product: IP(X,Y) = Σ_i X_i · Y_i (mod 2). • R(IP, ε) = Ω(n). In fact, using information complexity: • R(IP, ε) ≥ (1 − o(1)) · n.
Information complexity • Information complexity is to communication complexity • as Shannon’s entropy is to transmission cost.
Information complexity • The smallest amount of information Alice and Bob need to exchange to solve F. • How is information measured? • Communication cost of a protocol? • Number of bits exchanged. • Information cost of a protocol? • Amount of information revealed.
Basic definition 1: The information cost of a protocol • Prior distribution: (X, Y) ~ μ. • IC_μ(π) = I(Π; Y | X) + I(Π; X | Y), where Π is the protocol transcript: • what Alice learns about Y + what Bob learns about X. • [Diagram: Alice (X) and Bob (Y) run protocol π, producing transcript Π.]
Example • EQ(X,Y) is 1 iff X = Y. • μ is a distribution where w.p. ½ X = Y, and w.p. ½ X and Y are independent and uniformly random. • [Protocol: Alice sends MD5(X) (128 bits); Bob replies with “X = Y?” (1 bit).] • IC_μ(π) ≈ 1 + 65 = 66 bits: what Alice learns about Y (the one answer bit) + what Bob learns about X (≈ ½·128 bits from the independent case, plus about one bit for learning whether X = Y).
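The two conditional mutual informations in the definition can be computed exactly for small protocols. A sketch that does so for a toy two-bit case: the trivial protocol for AND(X, Y), in which Alice sends X and Bob replies with the answer, under the uniform prior. The protocol choice and helper names are my own; the AND function itself reappears later in the talk.

    from math import log2

    def H(dist):
        return sum(p * log2(1 / p) for p in dist.values() if p > 0)

    def cond_mutual_info(joint, a, b, c):
        # I(A;B | C) = H(AC) + H(BC) - H(ABC) - H(C), with coordinates given by index lists.
        def marg(coords):
            out = {}
            for key, p in joint.items():
                k = tuple(key[i] for i in coords)
                out[k] = out.get(k, 0) + p
            return out
        return H(marg(a + c)) + H(marg(b + c)) - H(marg(a + b + c)) - H(marg(c))

    # Toy protocol for AND(X, Y) under the uniform prior on {0,1}^2:
    # Alice sends X, then Bob sends X AND Y, so the transcript is (X, X AND Y).
    joint = {}
    for x in (0, 1):
        for y in (0, 1):
            joint[(x, y, (x, x & y))] = 0.25

    info_cost = (cond_mutual_info(joint, [2], [1], [0])     # what Alice learns about Y (given X)
                 + cond_mutual_info(joint, [2], [0], [1]))  # what Bob learns about X (given Y)
    print(info_cost)   # 1.5 bits revealed, versus 2 bits communicated

So this trivial protocol reveals 1.5 bits while communicating 2; protocols for AND that reveal less information are what the exact bounds later in the talk build on.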
Prior matters a lot for information cost! • If μ is a singleton distribution (both inputs are fixed), then IC_μ(π) = 0 for every protocol π.
Example • EQ(X,Y) is 1 iff X = Y. • μ is a distribution where X and Y are just independent and uniformly random. • [Protocol: Alice sends MD5(X) (128 bits); Bob replies with “X = Y?” (1 bit).] • IC_μ(π) ≈ 0 + 128 = 128 bits: Alice learns essentially nothing about Y (the answer is almost surely “not equal”), while Bob learns the full 128 hash bits about X.
Basic definition 2: Information complexity • Communication complexity: D^μ(F, ε) = the minimum, over protocols π computing F with error ≤ ε w.r.t. μ, of the communication cost of π. • Analogously: IC_μ(F, ε) = the infimum, over protocols π computing F with error ≤ ε w.r.t. μ, of IC_μ(π). • The infimum (rather than a minimum) is needed!
Prior-free information complexity • Using minimax one can get rid of the prior. • For communication, we had: R(F, ε) = max_μ D^μ(F, ε). • For information: IC(F, ε) = inf_π max_μ IC_μ(π), where the infimum is over protocols π computing F with error ≤ ε on every input.
Operationalizing IC: Information equals amortized communication • Recall [Shannon]: lim_{n→∞} C(Xⁿ)/n = H(X). • Turns out [B.-Rao’11]: lim_{n→∞} D^{μⁿ}(Fⁿ, ε)/n = IC_μ(F, ε), for ε > 0. [Error ε allowed on each copy.] • For ε = 0: lim_{n→∞} D^{μⁿ}(Fⁿ, 0+)/n = IC_μ(F, 0). • [Whether the same holds for exactly zero error is an interesting open problem.]
Can interactive communication be compressed? • Is it true that D^μ(F, ε) = O(IC_μ(F, ε))? • Less ambitiously: can D^μ(F, ε) at least be bounded by a modest function of IC_μ(F, ε)? • (Almost) equivalently: Given a protocol π with IC_μ(π) = I, can Alice and Bob simulate π using about O(I) bits of communication? • Not known in general…
Applications • Information = amortized communication means that, to understand the amortized communication cost of a problem, it is enough to understand its information complexity.
Example: the disjointness function • S, T are subsets of {1, …, n}. • Alice gets S, Bob gets T. • Need to determine whether S ∩ T = ∅. • In binary notation (characteristic vectors X, Y ∈ {0,1}ⁿ) need to compute ⋁_{i=1..n} (X_i ∧ Y_i): the sets intersect iff this OR is 1. • An OR acting on n copies of the 2-bit AND function.
Set intersection • S, T are subsets of {1, …, n}. • Alice gets S, Bob gets T. • Want to compute S ∩ T. • This is just n independent copies of the 2-bit AND. • Understanding the information complexity of AND gives tight bounds on both problems!
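A minimal Python sketch of the reduction to per-coordinate ANDs, using the characteristic-vector encoding; the helper names are mine:

    def disjointness(x, y):
        # DISJ on characteristic vectors: are the sets disjoint? (NOT of an OR of 2-bit ANDs.)
        return not any(xi & yi for xi, yi in zip(x, y))

    def intersection(x, y):
        # Set intersection: just n independent copies of the 2-bit AND.
        return [xi & yi for xi, yi in zip(x, y)]

    # S = {0, 2}, T = {2, 3} as characteristic vectors over a universe of size 4.
    x, y = [1, 0, 1, 0], [0, 0, 1, 1]
    print(disjointness(x, y))   # False: the sets share element 2
    print(intersection(x, y))   # [0, 0, 1, 0]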
Exact communication bounds [B.-Garg-Pankratov-Weinstein’13] • R(Disj_n, ε) ≤ n + 1 (trivial). • R(Disj_n, ε) = Ω(n) [Kalyanasundaram-Schnitger’87, Razborov’92]. New: • R(Disj_n, ε) = C_DISJ · n ± o(n) as the error tends to 0, where C_DISJ ≈ 0.4827.
Small set disjointness • S, T are subsets of {1, …, n}, with |S|, |T| ≤ k. • Alice gets S, Bob gets T. • Need to determine whether S ∩ T = ∅. • Trivial: O(k log n). • [Hastad-Wigderson’07]: O(k). • [BGPW’13]: (2/ln 2) · k ± o(k) ≈ 2.885·k.
Open problem: Computability of IC • Given the truth table of F(X,Y), the error ε, and the prior μ, compute IC_μ(F, ε). • Via IC_μ(F, ε) = lim_{n→∞} D^{μⁿ}(Fⁿ, ε)/n one can compute a sequence of upper bounds. • But the rate of convergence as a function of n is unknown.
Open problem: Computability of IC • Can compute the r-round information complexity of F, IC^r_μ(F, ε). • But the rate of convergence as a function of r is unknown. • Conjecture: IC^r_μ(F, ε) − IC_μ(F, ε) = O_F(1/r²). • This is the relationship for the two-bit AND.