
The Learnability of Quantum States

The Learnability of Quantum States. . Scott Aaronson University of Waterloo. Outline. A Quantum Occam’s Razor Theorem - Why you should want it to be true - Why it is true - Application to quantum communication - Application to quantum advice.


Presentation Transcript


  1. The Learnability of Quantum States. Scott Aaronson, University of Waterloo

  2. Outline
     A Quantum Occam’s Razor Theorem
     - Why you should want it to be true
     - Why it is true
     - Application to quantum communication
     - Application to quantum advice
     Sneak Preview: Quantum Software Copy-Protection
     - What it has to do with learning
     - Why it might be possible

  3. [Images: the Sun; David Hume (1711-1776)]
     Why do we believe the sun will rise tomorrow? The hypothesis that it will rise every day until tomorrow is equally compatible with the evidence…
     In my view, a branch of CS called computational learning theory has pretty much solved this Humean Problem of Induction, insofar as it has a solution…

  4. Occam’s Razor Theorem (Valiant, Vapnik, Blumer et al…)
     “If the possible hypotheses have sufficiently fewer bits than the data you’ve collected, and if one of those hypotheses succeeds in explaining your data, then that hypothesis will probably also explain most of the data you haven’t collected”
     In particular: If you want to output a hypothesis from a set H that explains at least a 1−ε fraction of future data with probability at least 1−δ, then O((1/ε) log(|H|/δ)) data points suffice.
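
A rough numerical illustration of this sample bound: the sketch below computes a sufficient number of data points for a finite hypothesis set H when some hypothesis fits the data exactly, using the standard union-bound form m ≥ (ln|H| + ln(1/δ))/ε. The function name and the example numbers are illustrative assumptions, not taken from the talk.

```python
import math


def occam_sample_size(num_hypotheses: int, eps: float, delta: float) -> int:
    """Number of samples sufficient so that, with probability >= 1 - delta,
    every hypothesis in a finite set that is consistent with the sample
    errs on at most an eps fraction of future data.

    Union bound: |H| * (1 - eps)^m <= |H| * exp(-eps * m) <= delta.
    """
    return math.ceil((math.log(num_hypotheses) + math.log(1 / delta)) / eps)


# Hypotheses describable with 20 bits, 10% error, 95% confidence.
samples_needed = occam_sample_size(2 ** 20, eps=0.1, delta=0.05)
print(samples_needed)
```

With |H| = 2^20, ε = 0.1 and δ = 0.05 this gives 169 samples: the count grows with the number of bits needed to describe a hypothesis, not with the size of the data universe.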

  5. Trouble in QuantumLand
     To describe a quantum state of n qubits takes ~2^n classical bits. Indeed, traditional quantum state tomography requires Ω(2^(2n)) measurements on copies of the state.
     Does this mean that a generic 10,000-particle state can never be “learned” within the lifetime of the universe? If so, this would call into question the operational status of many-body quantum states themselves…
     Fear not, physicists! Why would he even be raising this “dilemma” if he wasn’t gonna demolish it on the very next slide?
     [Figure: an “operationally meaningful subset” inside Hilbert space]

  6. The Quantum Occam’s Razor Theorem
     Let ρ be an n-qubit mixed state. Let D be a distribution over two-outcome measurements. Suppose we draw m measurements E_1,…,E_m independently from D, and then output a “hypothesis state” σ such that |Tr(E_i σ) − Tr(E_i ρ)| ≤ η for all i. Then provided η ≤ γε/10 and
         m ≥ (K/(γ²ε²)) · ( (n/(γ²ε²)) · log²(1/(γε)) + log(1/δ) )
     for a suitable constant K, we’ll have
         Pr_{E∈D}[ |Tr(Eσ) − Tr(Eρ)| ≤ γ ] ≥ 1−ε
     with probability at least 1−δ over E_1,…,E_m.
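
To make the objects in this theorem concrete, here is a minimal single-qubit sketch in plain Python. It computes acceptance probabilities Tr(Eρ) and checks whether a hypothesis state σ agrees with a target state ρ on a training set of measurements to within η. The helper names and the toy states are assumptions chosen for illustration; the theorem itself concerns n qubits and measurements drawn from a distribution D.

```python
def trace_of_product(a, b):
    """Tr(a b) for two square matrices given as nested lists."""
    size = len(a)
    return sum(a[i][k] * b[k][i] for i in range(size) for k in range(size))


def accept_probability(measurement, state):
    """Probability Tr(E rho) that the two-outcome measurement E accepts."""
    return trace_of_product(measurement, state).real


def is_consistent(hypothesis, target, measurements, eta):
    """True if the hypothesis matches the target within eta on every E_i."""
    return all(
        abs(accept_probability(e, hypothesis) - accept_probability(e, target)) <= eta
        for e in measurements
    )


RHO = [[1.0, 0.0], [0.0, 0.0]]    # target state |0><0|
SIGMA = [[0.5, 0.0], [0.0, 0.5]]  # hypothesis: maximally mixed state
E_Z = [[1.0, 0.0], [0.0, 0.0]]    # accepts |0>
E_X = [[0.5, 0.5], [0.5, 0.5]]    # accepts |+>

print(is_consistent(SIGMA, RHO, [E_X], eta=0.1))       # agrees on E_X alone
print(is_consistent(SIGMA, RHO, [E_X, E_Z], eta=0.1))  # fails once E_Z is included
```

The example shows why the guarantee is relative to D: a hypothesis far from ρ can still be a fine predictor if D only ever produces measurements like E_X.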

  7. Q: But what if I can’t estimate the Tr(Eρ)’s? What if for each measurement E, all I get is a bit that’s 1 with probability Tr(Eρ) and 0 with probability 1−Tr(Eρ)?
     A: In that case you need this many measurements: [formula not preserved in the transcript]
     Upshot for Experimentalists: You can do “pretty good tomography” on an arbitrary entangled state of n spins, using a number of measurements that scales only linearly (!) with n. Here “pretty good” means with respect to any fixed distribution over observables.
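
The measurement model in the question can be sketched as follows: each application of E yields one bit that is 1 with probability Tr(Eρ). The code simulates such bits for a known acceptance probability and uses the standard Hoeffding bound to say how many repetitions of the same measurement would estimate it to within η. This is an illustrative assumption about how one might estimate Tr(Eρ) by repetition; it is not the bound from the slide, whose formula is not reproduced here.

```python
import math
import random


def hoeffding_shots(eta: float, delta: float) -> int:
    """Repetitions so the empirical mean is within eta of the true
    acceptance probability with probability >= 1 - delta (Hoeffding)."""
    return math.ceil(math.log(2 / delta) / (2 * eta ** 2))


def estimate_accept_probability(true_probability: float, shots: int, rng: random.Random) -> float:
    """Simulate `shots` one-bit outcomes and return the fraction of 1s."""
    ones = sum(1 for _ in range(shots) if rng.random() < true_probability)
    return ones / shots


shots = hoeffding_shots(eta=0.05, delta=0.01)
estimate = estimate_accept_probability(0.3, shots, random.Random(0))
print(shots, estimate)
```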

  8. To prove the theorem, we need a notion introduced by Kearns and Schapire called Fat-Shattering Dimension
     Let C be a class of functions from S to [0,1]. We say a set {x_1,…,x_k} ⊆ S is γ-shattered by C if there exist reals a_1,…,a_k such that, for all 2^k possible statements of the form f(x_1) ≤ a_1−γ ∧ f(x_2) ≥ a_2+γ ∧ … ∧ f(x_k) ≤ a_k−γ, there’s some f ∈ C that satisfies the statement.
     Then fat_C(γ), the γ-fat-shattering dimension of C, is the size of the largest set γ-shattered by C.
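
The definition can be checked by brute force for a small finite class. The sketch below takes the witness reals a_1,…,a_k as given (searching for them is a separate problem) and tests all 2^k above/below patterns. The function name, the toy class, and the choice of witnesses are illustrative assumptions.

```python
from itertools import product


def is_gamma_shattered(points, functions, witnesses, gamma):
    """True if, for every above/below pattern over the points, some f in
    `functions` has f(x_i) >= a_i + gamma where the pattern says "above"
    and f(x_i) <= a_i - gamma where it says "below"."""
    for pattern in product((False, True), repeat=len(points)):
        def satisfies(f):
            return all(
                (f(x) >= a + gamma) if above else (f(x) <= a - gamma)
                for x, a, above in zip(points, witnesses, pattern)
            )
        if not any(satisfies(f) for f in functions):
            return False
    return True


# Toy class on S = {0, 1}: all four functions taking values in {0, 1}.
points = [0, 1]
all_functions = [lambda x, v=v: v[x] for v in product((0.0, 1.0), repeat=2)]

print(is_gamma_shattered(points, all_functions, [0.5, 0.5], gamma=0.5))
print(is_gamma_shattered(points, all_functions[:3], [0.5, 0.5], gamma=0.5))
```

The full toy class shatters both points with margin 0.5; dropping the function that is 1 on both points breaks the "above, above" pattern, so the reduced class does not.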

  9. Small Fat-Shattering Dimension Implies Small Sample Complexity (proof uses a 1996 result of Bartlett and Long)
     Let C be a class of functions from S to [0,1], and let f ∈ C. Suppose we draw m elements x_1,…,x_m independently from some distribution D, and then output a hypothesis h ∈ C such that |h(x_i) − f(x_i)| ≤ η for all i. Then provided η ≤ γε/7 and
         m ≥ (K/(γ²ε²)) · ( fat_C(γε/35) · log²(1/(γε)) + log(1/δ) )
     for a suitable constant K, we’ll have
         Pr_{x∈D}[ |h(x) − f(x)| ≤ γ ] ≥ 1−ε
     with probability at least 1−δ over x_1,…,x_m.

  10. Upper-Bounding the Fat-Shattering Dimension of Quantum States (proof uses Ashwin Nayak’s lower bound for “quantum random access codes,” which in turn uses Holevo’s Theorem on quantum channel capacity)
     Let S be the set of two-outcome measurements on n qubits. Let C_n be the set of functions f: S → [0,1] defined by f(E) = Tr(Eρ) for some n-qubit mixed state ρ. Then fat_{C_n}(γ) = O(n/γ²).
     The Quantum Occam’s Razor Theorem is then just plug & chug…
     [Speech bubble: “No need to thank me!”]

  11. Simple Application of Quantum Occam’s Razor Theorem to Communication Complexity
     [Figure: “Alice Walker” with input x, “Bob Dylan” with input y, output f(x,y)]
     • f: Boolean function mapping Alice’s N-bit string x and Bob’s M-bit string y to a binary output
     • D^1(f), R^1(f), Q^1(f): deterministic, randomized, and quantum one-way communication cost of f
     • How much can quantum communication save?
     • It’s known that D^1(f) = O(M Q^1(f)) for all total f
     • In 2004 I showed that for all f, D^1(f) = O(M Q^1(f) log Q^1(f))

  12. Theorem: R^1(f) = O(M Q^1(f)) for all f, partial or total
     Proof:
     - By Yao’s minimax principle, Alice can consider a worst-case distribution Dx over Bob’s input y
     - Alice’s classical message will consist of y_1,…,y_T drawn from Dx, together with f(x,y_1),…,f(x,y_T). Here T = Θ(Q^1(f))
     - Bob searches for a quantum message ρ that yields the right answers on y_1,…,y_T (certainly such a ρ exists)
     - By the Quantum Occam’s Razor Theorem, with high probability such a ρ yields the right answers on most y drawn from Dx

  13. Computational Complexity of Learning Quantum States
     I showed that, if you find a state σ that explains O(n) measurements drawn from D, with high probability that σ will correctly explain most future measurements drawn from D.
     This says nothing about the computational problem of finding σ! Indeed, if σ can always be prepared by a polynomial-time quantum algorithm, then no one-way function is secure against quantum attack.

  14. To say more, we need to visit the bestiary…
     [Figure: diagram of the complexity classes BQP, YQP, BQP/poly, QMA, YQP/poly, QMA/poly, BQP/qpoly, PostBQP/poly]
     YQP (“Yaroslav Quantum Polynomial-Time”): the class of problems solvable efficiently on a quantum computer, with the help of polynomial-size untrusted quantum advice

  15. Theorem: AvgBQP/qpoly = AvgYQP/poly
     Or in English: We can use trusted classical advice to verify that untrusted quantum advice will work on most inputs.
     Proof Idea:
     - The classical advice will consist of “training inputs” x_1,…,x_m, as well as whether x_i ∈ L for all 1 ≤ i ≤ m
     - Given a purported advice state |ψ⟩, first check that |ψ⟩ yields the right answers on x_1,…,x_m, and only then use it on the x you care about
     - By the Quantum Occam’s Razor Theorem, m = O(poly(n)) is enough to ensure |ψ⟩ will work on most inputs w.h.p.
     - The technical part is to do the verification without damaging |ψ⟩ too badly

  16. Quantum Copy-Protection
     We say a program P is copy-protected if there’s no efficient algorithm that, given P’s source code, outputs two programs with the same input/output behavior as P.
     Classically, copy-protection is trivially impossible (tell that to Sony/BMG…)
     Quantumly: well, it’s called the “No-Cloning Theorem” for a reason…
     Connection to learning: If P can be learned from input/output behavior, then it can’t be copy-protected

  17. A Weird Example
     Let G be a finite group, such that we can efficiently prepare |G⟩ (a uniform superposition over g ∈ G). Let H ≤ G be a subgroup with |H| ≥ |G|/polylog|G|. Let f(g)=1 if g ∈ H and f(g)=0 otherwise.
     - Given |H⟩ (a uniform superposition over H), Watrous showed that we can efficiently compute f: test whether |H⟩ and |gH⟩ are equal or orthogonal
     - Conversely, given a black box that computes f, we can efficiently prepare |H⟩: first prepare |G⟩, then postselect on f(g)=1
     So any program for f can be pirated, but (apparently) only in an indirect, quantum way
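
A toy classical simulation of the two directions in this example, under the assumption that G is the cyclic group Z_12 and H is the subgroup of multiples of 3 (both chosen only for illustration). State vectors are stored as plain lists of amplitudes. Postselecting the uniform state |G⟩ on f(g)=1 yields |H⟩, and the inner product of |H⟩ with a coset state |gH⟩ is 1 when g ∈ H and 0 otherwise. The actual quantum procedure for distinguishing equal from orthogonal states is not simulated here; the code only computes the overlap directly.

```python
import math

GROUP_ORDER = 12         # toy assumption: G = Z_12 under addition
SUBGROUP = [0, 3, 6, 9]  # toy assumption: H = multiples of 3


def f(g):
    """Membership function: 1 if g is in H, else 0."""
    return 1 if g % 3 == 0 else 0


def uniform_state(elements):
    """Uniform superposition over the given group elements."""
    amplitude = 1 / math.sqrt(len(elements))
    state = [0.0] * GROUP_ORDER
    for g in elements:
        state[g] = amplitude
    return state


def postselect(state, predicate):
    """Keep amplitudes where predicate(g) holds, then renormalize.
    Returns the new state and the probability that postselection succeeds."""
    kept = [a if predicate(g) else 0.0 for g, a in enumerate(state)]
    success_probability = sum(a * a for a in kept)
    norm = math.sqrt(success_probability)
    return [a / norm for a in kept], success_probability


def coset_state(g):
    """Uniform superposition over the coset g + H."""
    return uniform_state([(g + h) % GROUP_ORDER for h in SUBGROUP])


def overlap(u, v):
    """Inner product of two real state vectors."""
    return sum(a * b for a, b in zip(u, v))


h_state, success = postselect(uniform_state(range(GROUP_ORDER)), f)
print(success, overlap(h_state, coset_state(6)), overlap(h_state, coset_state(1)))
```

The success probability of the postselection step is |H|/|G|, which is why the slide asks for |H| to be at least |G|/polylog|G|.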

  18. The Pirate’s Nightmare
     In the quantum world, can any program that can’t be learned be copy-protected?
     Main Result: There exists a “quantum oracle” relative to which the answer is yes
     Upshot: Even if the answer is no, we can’t prove it without using “quantumly nonrelativizing techniques”

  19. Handwaving Proof Idea
     For each circuit C, choose a “meaningless quantum label” |ψ_C⟩ according to the Haar measure. The quantum oracle will map |ψ_C⟩|x⟩|0⟩ to |ψ_C⟩|x⟩|C(x)⟩, as well as |C⟩|0⟩ to |C⟩|ψ_C⟩.
     Intuitively, then, being given |ψ_C⟩ is “no better” than being given a black box for C.
     To prove this, we need to simulate an algorithm that prepares |ψ_C⟩ given another copy of |ψ_C⟩, by an algorithm that prepares |ψ_C⟩ given only black-box access to C.
     Strategy: Mimic the copying algorithm, by “mocking up” a random pure state |φ⟩ that plays the same role as |ψ_C⟩.
     Problem: “Mocking up” a random pure state takes exponential time.

  20. Solution: Pseudorandom States
         |ψ_p⟩ = 2^(−n/2) Σ_{x∈GF(2^n)} (−1)^(p_0(x)) |x⟩
     where p is a degree-d univariate polynomial over GF(2^n) for some d=poly(n), and p_0(x) is the “leading bit” of p(x).
     Clearly the |ψ_p⟩’s can be prepared in polynomial time.
     Lemma: If p is chosen uniformly at random, then |ψ_p⟩ “looks like” it was chosen under the Haar measure
     - Even if we get polynomially many copies of |ψ_p⟩
     - Even if we query the quantum oracle, which depends on |ψ_p⟩
     So the simulator can use |ψ_p⟩’s in place of |ψ_C⟩’s.
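
A small sketch of how such a state could be written down for tiny n, assuming the state is the phase state whose amplitude on |x⟩ is (−1)^(p_0(x))/2^(n/2). That form is an assumption here, as are the field GF(2^3) with reduction polynomial x^3 + x + 1 and all helper names. Elements of GF(2^n) are stored as n-bit integers, and the "leading bit" is taken to be the most significant bit.

```python
import math

N_BITS = 3
MODULUS = 0b1011  # x^3 + x + 1, irreducible over GF(2) (toy choice)


def gf_mul(a, b):
    """Multiply two elements of GF(2^N_BITS), reducing modulo MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N_BITS:  # degree reached N_BITS, so reduce
            a ^= MODULUS
    return result


def poly_eval(coefficients, x):
    """Evaluate a polynomial over the field by Horner's rule.
    Coefficients are listed from the highest degree down."""
    acc = 0
    for c in coefficients:
        acc = gf_mul(acc, x) ^ c
    return acc


def leading_bit(value):
    """Most significant of the N_BITS bits of a field element."""
    return (value >> (N_BITS - 1)) & 1


def phase_state(coefficients):
    """Amplitudes (-1)^(p_0(x)) / sqrt(2^n) for x = 0 .. 2^n - 1."""
    scale = 1 / math.sqrt(2 ** N_BITS)
    return [
        (-1) ** leading_bit(poly_eval(coefficients, x)) * scale
        for x in range(2 ** N_BITS)
    ]


state = phase_state([1, 0])  # p(x) = x
print(state)
```

Each amplitude costs one polynomial evaluation, so the classical description stays small (d+1 field elements) even though the vector has 2^n entries.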

  21. Open Problems
     - Can we tighten the Quantum Occam’s Razor Theorem? The best lower bounds I can prove go like Ω(n/ε²), or Ω(n/ε⁴) in the case where each measurement is applied only once
     - Does BQP/qpoly = YQP/poly? I.e., can we use classical advice to verify quantum advice in the worst-case setting?
     - Is D^1(f) = O(M Q^1(f))? Or even O(M + Q^1(f))? Even more ambitiously, could learning theory techniques help us show that R^1(f) = O(Q^1(f)) for all total f?
     - In the real world, are there nontrivial programs that can be quantumly copy-protected? What about point functions (f(x)=1 if x equals a secret password s; otherwise f(x)=0)?
