Noise-Insensitive Boolean-Functions are Juntas

Explore the impact of variables on Boolean functions, influential people, dictatorship, Juntas, and Long-Code tests in the realm of theoretical computer science and social choice theory. Delve into Fourier/Walsh analysis, noise-sensitivity, high-frequency weight, and more.


Presentation Transcript


  1. Noise-Insensitive Boolean-Functions are Juntas. Guy Kindler & Muli Safra. Slides prepared with help of: Adi Akavia

  2. Influential People • The theory of the influence of variables on Boolean functions [BL, KKL], and related issues, was introduced to tackle social choice problems, and has since motivated a magnificent sequence of works related to economics [K], percolation [BKS], and hardness of approximation [DS], all revolving around the Fourier/Walsh analysis of Boolean functions… • And the really important question:

  3. Where to go for Dinner? Who has suggestions? Each casts their vote in an (electronic) envelope, and the system decides, not necessarily according to majority… It turns out someone (in the Florida wing) has the power to flip some votes. [Figure labels: power, influence]

  4. Voting Systems • n agents, each voting either “for” (T) or “against” (F); the outcome is a Boolean function f over n variables • The values of the agents (variables) may each, independently, flip with some small probability • It turns out: one cannot design an f that is robust to such noise (that is, one that on average changes value w.p. < O(1)) unless it takes into account only very few of the votes

  5. Dictatorship Def: a Boolean function f: P([n]) → {-1,1} is a monotone e-dictatorship (denoted f_e) if:

  6. Juntas Def: a Boolean function f: P([n]) → {-1,1} is a j-junta if ∃ J ⊆ [n] with |J| ≤ j, s.t. for every x ⊆ [n]: f(x) = f(x ∩ J). Def: f is an [ε, j]-junta if ∃ a j-junta f′ s.t. f is ε-close to f′. Def: f is an [ε, j, p]-junta if ∃ a j-junta f′ s.t. f is ε-close to f′ under the p-biased product distribution. We tend to omit p.
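The exact (ε = 0) junta definition can be checked by brute force for small n. The sketch below is illustrative only: it represents a subset x ⊆ [n] as a 0/1 indicator tuple, and the helper names `relevant_variables`, `is_junta` and `maj3_padded` are hypothetical, not taken from the slides. It uses the fact that f(x) = f(x ∩ J) for some |J| ≤ j exactly when f depends on at most j coordinates.

```python
from itertools import product

def relevant_variables(f, n):
    """Return the set of coordinates i whose membership can change f."""
    relevant = set()
    for x in product([0, 1], repeat=n):
        for i in range(n):
            y = list(x)
            y[i] ^= 1  # add or remove element i from the subset x
            if f(x) != f(tuple(y)):
                relevant.add(i)
    return relevant

def is_junta(f, n, j):
    """True if f depends on at most j of its n coordinates."""
    return len(relevant_variables(f, n)) <= j

def maj3_padded(x):
    """Majority of the first three votes; any further votes are ignored."""
    return 1 if x[0] + x[1] + x[2] >= 2 else -1
```

For example, `maj3_padded` over n = 5 voters is a 3-junta but not a 2-junta.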

  7. Long-Code • In the long-code L: [n] → {0,1}^{2^n}, each element is encoded by 2^n bits • This is the most extensive binary code, having one bit for every subset in P([n])

  8. Long-Code • Encoding an element e ∈ [n]: • E_e legally encodes an element e if E_e = f_e. [Figure: sample truth-table entries T F F T T]

  9. Long-Code ⇔ Monotone-Dictatorship • The truth-table of a Boolean function over n elements can be considered as a 2^n-bit-long string (each bit corresponding to one input setting, i.e. a subset of [n]). For the long-code, the legal code-words are all the monotone dictatorships. How about the Hadamard code?
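As a small illustration of the encoding described on slides 7 to 9, the sketch below writes out the long-code word of an element as the truth table of its dictatorship. The function name `long_code`, the 0-based element indices, the subset ordering, and the choice of bit 1 for "e belongs to the subset" are all assumptions made for the example.

```python
from itertools import product

def long_code(e, n):
    """Truth table of the dictatorship of element e: one bit per subset of [n].

    Subsets are enumerated as 0/1 indicator tuples; the bit is 1 exactly
    when e belongs to the subset (an assumed convention).
    """
    return [x[e] for x in product([0, 1], repeat=n)]

# Every code-word has 2^n bits, half of them set.
word = long_code(1, 3)
```

Under this convention two distinct code-words disagree on exactly half of the 2^n positions, which is what makes the code so redundant.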

  10. Long-code Tests • Def (a long-code test): given a code-word w, probe it in a constant number of entries, and • accept w.h.p if w is a monotone dictatorship • reject w.h.p if w is not close to any monotone dictatorship

  11. Efficient Long-code Tests For some applications, it suffices if the test may accept illegal code-words, provided they have a short list-decoding: Def (a long-code list-test): given a code-word w, probe it in 2/3 places, and • accept w.h.p. if w is a monotone dictatorship, • reject w.h.p. if w is not even approximately determined by a short list of domain elements, that is, if ∄ a junta J ⊆ [n] s.t. f is close to f′ and f′(x) = f′(x ∩ J) for all x. Note: a long-code list-test distinguishes between the case where w is a dictatorship and the case where w is far from a junta.

  12. Background • Thm (Friedgut): a Boolean function f with small average-sensitivity is an [ε,j]-junta • Thm (Bourgain): a Boolean function f with small high-frequency weight is an [ε,j]-junta • Thm (Kindler & Safra): a Boolean function f with small high-frequency weight in a p-biased measure is an [ε,j]-junta • Corollary: a Boolean function f with small noise-sensitivity is an [ε,j]-junta • Parameters: average-sensitivity [BL,KKL,F]; high-frequency weight [KKL,B]; noise-sensitivity [BKS]

  13. Noise-Sensitivity How often does the value of f change when the input is perturbed? [Figure labels: [n], I, z, x]

  14. Noise-Sensitivity • Def (μ_{λ,p,x}): Let 0<λ<1, and x ∈ P([n]). Then y ~ μ_{λ,p,x} if y = (x\I) ∪ z, where • I ~ μ_λ^{[n]} is a noise subset, and • z ~ μ_p^I is a replacement. Def (λ-noise-sensitivity): let 0<λ<1, then [When p=½, this is equivalent to flipping each coordinate in x w.p. λ/2.]
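The defining formula of the noise-sensitivity is missing from the transcript. The sketch below assumes the usual reading, namely the probability that f(x) ≠ f(y) for a uniform x and its perturbation y, and covers only the unbiased case p = ½, where each coordinate of x is flipped independently w.p. λ/2. The names `noise_sensitivity`, `dictator` and `parity` are hypothetical.

```python
from itertools import product

def noise_sensitivity(f, n, lam):
    """Exact Pr[f(x) != f(y)] for uniform x in {-1,1}^n and y obtained by
    flipping each coordinate of x independently with probability lam / 2."""
    points = list(product([-1, 1], repeat=n))
    total = 0.0
    for x in points:
        for y in points:
            if f(x) != f(y):
                d = sum(a != b for a, b in zip(x, y))  # number of flipped coordinates
                total += (lam / 2) ** d * (1 - lam / 2) ** (n - d)
    return total / len(points)

def dictator(x):
    return x[0]

def parity(x):
    result = 1
    for v in x:
        result *= v
    return result
```

A dictatorship changes value only when its own coordinate flips, so its noise-sensitivity is λ/2, while parity over n coordinates gives (1 − (1 − λ)^n)/2, which approaches ½ as n grows.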

  15. Fourier/Walsh Transform Write f: {-1,1}^n → {-1,1} as a polynomial. What would be the monomials? • For every set S ⊆ [n] we have a monomial which is the product of all variables in S (the only relevant powers are either 0 or 1). It now makes sense to consider the degree of f, or to break f up according to the various degrees of the monomials.
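To make the polynomial view concrete, here is a minimal sketch that computes the coefficient of each monomial under the uniform measure by averaging f against the product of the variables in S. The names `fourier_coefficients` and `maj3` are illustrative, and coordinates are indexed from 0.

```python
from itertools import combinations, product
from math import prod

def fourier_coefficients(f, n):
    """Map each S (a tuple of coordinates) to E_x[f(x) * prod_{i in S} x_i]."""
    points = list(product([-1, 1], repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for S in combinations(range(n), size):
            coeffs[S] = sum(f(x) * prod(x[i] for i in S) for x in points) / len(points)
    return coeffs

def maj3(x):
    """Majority of three ±1 votes."""
    return 1 if sum(x) > 0 else -1

coeffs = fourier_coefficients(maj3, 3)
```

For majority on three variables this recovers the polynomial (x1 + x2 + x3)/2 − x1·x2·x3/2, and the squared coefficients sum to 1, as Parseval requires for a {-1,1}-valued function.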

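  16. High/Low Frequencies and their Weights Def: the high-frequency portion of f: $f^{>k} = \sum_{|S|>k} \hat f(S)\,\chi_S$. Def: the low-frequency portion of f: $f^{\le k} = \sum_{|S|\le k} \hat f(S)\,\chi_S$. Def: the high-frequency weight is $\|f^{>k}\|_2^2$. Def: the low-frequency weight is $\|f^{\le k}\|_2^2$.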

  17. Low High-Frequency Weight Prop: the λ-noise-sensitivity can be expressed in Fourier transform terms as Prop: low ns ⇒ low high-freq weight. Proof: By the above proposition, low noise-sensitivity implies nevertheless, f being a {-1,1} function, the Parseval formula (the 2-norms of the function and of its Fourier transform are equal) implies
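The formulas on this slide did not survive in the transcript. As a hedged reconstruction for the unbiased case p = ½ only, assume ns_λ(f) denotes Pr[f(x) ≠ f(y)] for the perturbed input y of slide 14, and write \hat f(S) for the Fourier coefficients (notation assumed). Each coordinate is kept w.p. 1 − λ and re-sampled otherwise, so E[χ_S(x)χ_S(y)] = (1 − λ)^{|S|}, and the standard computation runs as follows:

```latex
\begin{align*}
\mathrm{ns}_\lambda(f)
  &= \tfrac12 - \tfrac12\,\mathbb{E}\bigl[f(x)f(y)\bigr]
   = \tfrac12 - \tfrac12 \sum_{S\subseteq[n]} (1-\lambda)^{|S|}\,\hat f(S)^2 \\
  &= \tfrac12 \sum_{S\subseteq[n]} \bigl(1-(1-\lambda)^{|S|}\bigr)\,\hat f(S)^2
     \qquad \text{(Parseval: } \textstyle\sum_S \hat f(S)^2 = 1\text{)} \\
  &\ge \tfrac12 \bigl(1-(1-\lambda)^{k}\bigr) \sum_{|S|>k} \hat f(S)^2 .
\end{align*}
```

With k chosen so that (1 − λ)^k = ½, the last line reads ns_λ(f) ≥ ¼ · Σ_{|S|>k} \hat f(S)², so a small noise-sensitivity forces a small high-frequency weight. The p-biased case needs the biased characters and is not covered by this sketch.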

  18. Average and Restriction Def: Let I ⊆ [n], x ∈ P([n]\I); the restriction function is Def: the average function is Note: [Figure labels: [n], I, x, y]

  19. Fourier Expansion • Prop: • Prop????: • Corollary:

  20. Variation Def: the variation of f: Prop: the following are equivalent definitions to the variation of f:

  21. Low-freq Variation and Low-freq Average-Sensitivity Def: the low-frequency variation is: Def: the average sensitivity is And in Fourier representation: Def: the low-frequency average sensitivity is:
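The two formulas for the average sensitivity are missing from the transcript. The sketch below assumes the standard uniform-measure versions: combinatorially, the expected number of coordinates whose flip changes f(x), and in Fourier terms the sum of |S|·\hat f(S)² over all S. Function names are hypothetical, and the p-biased normalisation is not attempted.

```python
from itertools import combinations, product
from math import prod

def average_sensitivity(f, n):
    """Expected number of coordinates i such that flipping x_i changes f(x)."""
    points = list(product([-1, 1], repeat=n))
    flips = 0
    for x in points:
        for i in range(n):
            y = x[:i] + (-x[i],) + x[i + 1:]  # x with coordinate i flipped
            flips += f(x) != f(y)
    return flips / len(points)

def average_sensitivity_fourier(f, n):
    """Sum over S of |S| times the squared Fourier coefficient of S."""
    points = list(product([-1, 1], repeat=n))
    total = 0.0
    for size in range(n + 1):
        for S in combinations(range(n), size):
            c = sum(f(x) * prod(x[i] for i in S) for x in points) / len(points)
            total += size * c * c
    return total

def maj3(x):
    return 1 if sum(x) > 0 else -1
```

Restricting the second sum to |S| ≤ k would give the low-frequency variant named on the slide; for three-way majority both full versions evaluate to 3/2.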

  22. Main Result Theorem: ∃ constant >0 s.t. any Boolean function f: P([n]) → {-1,1} satisfying is an [ε,j]-junta for j=O(-2k32k). Corollary: fix a p-biased distribution μ_p over P([n]). Let λ>0 be any parameter. Set k = log_{1-λ}(1/2). Then ∃ constant >0 s.t. any Boolean function f: P([n]) → {-1,1} satisfying is an [ε,j]-junta for j=O(-2k32k).

  23. Where to go for Dinner? Who has suggestions? Each casts their vote in an (electronic) envelope, and the system decides, not necessarily according to majority… It turns out someone (in the Florida wing) has the power to flip some votes. Form a committee. Of course they’ll have to discuss it over dinner… [Figure labels: power, influence]

  24. First Attempt: Following Friedgut’s Proof Thm: any Boolean function f is an [ε,j]-junta for Proof: • Specify the junta where, let k=O(as(f)/ε) and fix the threshold at 2^{-O(k)} • Show the complement of J has small variation [Figure labels: P([n]), J]

  25. Following Friedgut - Cont Lemma: Proof: Now, let’s bound each argument: Prop: Proof: characters of size > k contribute to the average-sensitivity at least (since ) [Figure labels: P([n]), J]

  26. Following Friedgut - Cont Prop: Proof: [Annotations: true only since this is a {-1,0,1} function. We do not know whether as(f) is small! So we cannot proceed this way with only as^{≤k}!]

  27. If k were 1 Easy case (!?!): If we had a bound on the non-linear weight, we would be done. The linear part is a set of independent characters (the singletons). In order for those to hit close to 1 or -1 most of the time, they must avoid the law of large numbers, namely be almost entirely placed on one singleton [by a Chernoff-like bound]. Thm [FKN, ext.]: Assume f is close to linear; then f is close to shallow (a constant function or a dictatorship).

  28. How to Deal with Dependency between Characters Recall (theorem’s premise) Idea: Let • Partition [n]\J into I1,…,Ir, for r >> k • w.h.p. f_I[x] is close to linear (low-frequency characters are expected to intersect I in at most 1 element, while the high-frequency weight is low). [Figure: P([n]) split into J and the parts I1, I2, …, Ir, with I one of the parts]

  29. So what? f_I[x] is close to linear ⇒ by FKN, f_I[x] is either a constant function or a dictatorship, for any x. Still, f_I[x] could be a different dictatorship for every x, hence the variation of each i ∈ I might be low. [Figure: P([n]) split into J and the parts I1, I2, …, Ir, with I one of the parts]

  30. almost linear ⇒ almost shallow Theorem ([FKN]): ∃ global constant M, s.t. ∀ Boolean function f, ∃ shallow Boolean function g, s.t. • Hence, ‖f_I[x]^{>1}‖_2 is small ⇒ f_I[x] is close to shallow!

  31. Dictatorship and its Singleton • Prop: if f_I[x] is a dictatorship, then ∃ coordinate i s.t. (where p is the bias). • Corollary (from [FKN]): ∃ global constant M, s.t. ∀ Boolean function h, either or [Figure: weight per character, over {1}, {2}, …, {i}, …, {n}, {1,2}, {1,3}, …, {n-1,n}, …, S, …, {1,..,n}; annotation: total weight of no more than 1-p]

  32. f_I[x] Mostly Constant • Lemma: >0, s.t. for any  and any function g: P([m]) → ℝ • Def: Let D_I be the set of x ∈ P(I) s.t. f_I[x] is a dictatorship • Next we show that |D_I| must be small, hence for most x, f_I[x] is constant.

  33. |D_I| must be small • Lemma: • Proof: let , then [Annotations: Parseval; previous lemma] Each S is counted only for one index i ∈ I. (Otherwise, if S was counted for both i and j in I, then |S ∩ I| > 1!)

  34. Simple Prop • Prop: let $\{a_i\}_{i\in I}$ be a sub-distribution, that is, $\sum_{i\in I} a_i \le 1$ and $0 \le a_i$; then $\sum_{i\in I} a_i^2 \le \max_{i\in I}\{a_i\}$. • Proof: [Figure: bar charts of the a_i for i = 1, …, n, each no more than a_max]
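The proof on the slide is given only as a picture. One way to write the one-line argument, using nothing beyond the stated assumptions (non-negative a_i summing to at most 1), is:

```latex
\[
\sum_{i\in I} a_i^2
  \;\le\; \sum_{i\in I} a_i \cdot \max_{j\in I} a_j
  \;=\; \Bigl(\max_{j\in I} a_j\Bigr) \sum_{i\in I} a_i
  \;\le\; \max_{j\in I} a_j .
\]
```

The first step replaces one factor of each a_i² by the maximum, and the last step uses the sub-distribution bound on the sum.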

  35. |DI| must be small - Cont • Therefore(since ), • Hence

  36. Obtaining the Lemma • Recall • However {χ_S}_S are orthonormal, and • It remains to show that indeed: • Prop1: • Prop2:

  37. Obtaining the Lemma – Cont. • Prop3: • Proof: separate by freq: • Small freq: • Large freq: • Corollary(from props 2,3):

  38. Obtaining the Lemma – Cont. • Recall: by corollary from [FKN], Either or • Hence • By Corollary • Combined with Prop1 we obtain: |DI| is small

  39. Important Lemma • Lemma: >0, s.t. for any  and any function g:P([m]) , the following holds: high-freq Low-freq

  40. Beckner/Nelson/Bonami Inequality Def: let T_δ be the following operator on f Thm: for any p ≥ r and δ ≤ ((r-1)/(p-1))^{½} Corollary: for f s.t. f^{>k} = 0

  41. Probability Concentration • Simple Bound: • Proof: • Low-freq Bound: Let g:P([m])  be of degree k and >0, then >0 s.t. • Proof: recall the corollary: 

  42. Lemma’s Proof • Now, let’s prove the lemma: • Bounding low and high freq separately:, simple bound Low-freq bound

  43. Shallow Function • Def: a function f is linear, if only singletons have non-zero weight • Def: a function f is shallow, if f is either a constant or a dictatorship. • Claim: Boolean linear functions are shallow. [Figure: weight by character size 0, 1, 2, 3, …, k, …, n]

  44. Boolean Linear ⇒ Shallow • Claim: Boolean linear functions are shallow. • Proof: let f be a Boolean linear function; we next show: • ∃ {i0} s.t. (i.e. ) • And conclude that either or, i.e. f is shallow
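The claim can be sanity-checked exhaustively for small n. The sketch below is not the slides' proof. It enumerates every {-1,1}-valued function on three variables under the uniform measure, keeps those with no Fourier weight on sets of size 2 or more, and tests whether each is shallow. Two assumptions are made: "linear" is read as degree at most 1 (a constant term is allowed), and a negated dictatorship counts as a dictatorship. All names are hypothetical.

```python
from itertools import combinations, product
from math import prod

def low_degree_boolean_functions(n):
    """All f: {-1,1}^n -> {-1,1} whose Fourier coefficients vanish on |S| >= 2."""
    points = list(product([-1, 1], repeat=n))
    big_sets = [S for size in range(2, n + 1) for S in combinations(range(n), size)]
    found = []
    for values in product([-1, 1], repeat=len(points)):
        # The coefficient of S is zero iff this unnormalised sum is zero.
        if all(sum(v * prod(x[i] for i in S) for v, x in zip(values, points)) == 0
               for S in big_sets):
            found.append(dict(zip(points, values)))
    return found

def is_shallow(table, n):
    """True if the function is constant or equals +x_i or -x_i for some i."""
    if len(set(table.values())) == 1:
        return True
    return any(all(v == s * x[i] for x, v in table.items())
               for i in range(n) for s in (1, -1))

linear_functions = low_degree_boolean_functions(3)
```

For n = 3 exactly eight functions survive the filter: the two constants and ±x_i for each of the three coordinates.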

  45. Claim 1 • Claim 1: let f be a Boolean linear function, then ∃ {i0} s.t. • Proof: w.l.o.g. assume • for any z ⊆ {3,…,n}, consider x00 = z, x10 = z ∪ {1}, x01 = z ∪ {2}, x11 = z ∪ {1,2} • then . • Next value must be far from {-1,1}, • A contradiction! (Boolean function) • Therefore

  46. Claim 2 • Claim 2: let f be a Boolean function, s.t. Then either or • Proof: consider f(∅) and f({i0}): • Then • but f is Boolean, hence • therefore

  47. Proving FKN: almost-linear ⇒ close to shallow • Theorem: Let f: P([n]) → ℝ be linear, • Let • let i0 be the index s.t. is maximal then • Note: f is linear, hence w.l.o.g., assume i0=1, then all we need to show is: We show that in the following claim and lemma.

  48. Corollary • Corollary: Let f be linear, and then ∃ a shallow Boolean function g s.t. • Proof: let , let g be the Boolean function closest to l. Then, this is true, as • is small (by theorem), • and additionally is small, since

  49. Claim 1 • Claim 1: Let f be linear. w.l.o.g., assume then ∃ global constant c = min{p,1-p} s.t. [Figure: weight per character, over {}, {1}, {2}, …, {i}, …, {n}, {1,2}, {1,3}, …, {n-1,n}, …, S, …, {1,..,n}; annotation: each of weight no more than c]

  50. Proof of Claim 1 • Proof: assume • for any z ⊆ {3,…,n}, consider x00 = z, x10 = z ∪ {1}, x01 = z ∪ {2}, x11 = z ∪ {1,2} • then • Next value must be far from {-1,1}! • A contradiction! (to )
