
Research in Theoretical Computer Science Madhu Sudan CSAIL

Explore the theoretical foundations of computation, complexity, algorithms, and their applications in various fields. Focus on recent research directions and their implications.



Presentation Transcript


  1. Research in Theoretical Computer Science • Madhu Sudan • CSAIL

  2. Overview • Part I: Introduction to Theory of Computation. • Part II: Perspective on (immediate) relevance. • Part III: A current research direction. • Introverted Algorithms • Communication with errors: Meaning of bits

  3. Part I: Introduction to Theory of CS

  4. Theory of Computing • Mathematical study of Computation and its consequences. • Computation: Sequence of simple steps, leading to complex change in information. • Measures: Efficiency of algorithm/program: • Depends on hardware and implementation. • Can ask how it scales? • If I double the hardware capacity (speed/memory) • Will this increase the biggest size of problem I can solve by constant factor? (polynomial solution) • Or by additive constant? (exponential solution)
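The scaling question on this slide can be checked numerically. The sketch below is only an illustration: the cost functions `n**2` and `2**n` and the step budgets are assumed stand-ins for a "polynomial solution" and an "exponential solution", and `largest_solvable` is a hypothetical helper name.

```python
from typing import Callable


def largest_solvable(cost: Callable[[int], int], budget: int) -> int:
    """Largest n with cost(n) <= budget, for a non-decreasing cost function."""
    hi = 1
    # Exponential search for an upper bound that exceeds the budget.
    while cost(hi) <= budget:
        hi *= 2
    lo = hi // 2  # either 0, or a size known to fit in the budget
    # Binary search between the last size that fits and the first that does not.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if cost(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


budget = 10**6  # assumed number of steps the hardware can perform
poly_before = largest_solvable(lambda n: n**2, budget)
poly_after = largest_solvable(lambda n: n**2, 2 * budget)
expo_before = largest_solvable(lambda n: 2**n, budget)
expo_after = largest_solvable(lambda n: 2**n, 2 * budget)

print(poly_before, poly_after)  # size grows by a constant factor (about sqrt(2))
print(expo_before, expo_after)  # size grows by an additive constant
```

Under these assumptions, doubling the budget multiplies the solvable size of the quadratic algorithm by roughly 1.41, while the exponential algorithm gains exactly one unit of problem size.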

  5. Theory of Computing • Mathematical study of Computation and its consequences. • Computation: Sequence of simple steps, leading to complex change in information. • Issues: • Algorithms: Design efficient sequence of steps that produce a desired effect. What is efficient? • Complexity: When is inefficiency inherent? • Implications: What effect does (in)efficiency have on human (intelligent) interaction? • Surprisingly broad in scope and impact.

  6. Example: Integer Arithmetic • Addition: 231567 + 58914 = 290481. Linear! • Multiplication: • Factoring:

  7. Example: Integer Arithmetic • Addition: Linear! • Multiplication: 231567 x 58914 = 13642538238 (partial products: 926268, 231567, 2084103, 1852536, 1157835). Quadratic! Fastest? Not Linear? • Factoring?

  8. Example: Integer Arithmetic • Addition: Linear! • Multiplication: Quadratic! Fastest? Not linear? • Factoring? Write 13642538238 as a product of two integers (each less than 1000000). • Inverse of the above problem. • Not known to be linear/quadratic/cubic. • Believed to require super-polynomial time.
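The three arithmetic examples can be sketched in code by counting digit-level steps. This is a minimal illustration, not a statement about the fastest known methods: the function names are hypothetical, the multiplication is the grade-school method from the slides, and the factoring routine is naive trial division chosen only because it is easy to read.

```python
from math import isqrt


def to_digits(n: int) -> list[int]:
    """Decimal digits of n, least significant first."""
    return [int(c) for c in str(n)][::-1]


def from_digits(digits: list[int]) -> int:
    return int("".join(str(d) for d in reversed(digits)))


def schoolbook_add(a: int, b: int) -> tuple[int, int]:
    """Column addition; returns (sum, number of digit steps)."""
    x, y = to_digits(a), to_digits(b)
    out, carry, steps = [], 0, 0
    for i in range(max(len(x), len(y))):
        s = (x[i] if i < len(x) else 0) + (y[i] if i < len(y) else 0) + carry
        out.append(s % 10)
        carry = s // 10
        steps += 1
    if carry:
        out.append(carry)
    return from_digits(out), steps


def schoolbook_multiply(a: int, b: int) -> tuple[int, int]:
    """Grade-school multiplication; returns (product, number of digit steps)."""
    x, y = to_digits(a), to_digits(b)
    out = [0] * (len(x) + len(y))
    steps = 0
    for i, dx in enumerate(x):
        carry = 0
        for j, dy in enumerate(y):
            t = out[i + j] + dx * dy + carry
            out[i + j] = t % 10
            carry = t // 10
            steps += 1
        out[i + len(y)] += carry
    return from_digits(out), steps


def factor_below(n: int, limit: int) -> tuple[int, int, int]:
    """Trial division downward from sqrt(n); returns (p, q, candidates tried)."""
    tried = 0
    for d in range(isqrt(n), 1, -1):
        tried += 1
        if n % d == 0 and n // d < limit:
            return d, n // d, tried
    raise ValueError("no factorization with both factors below the limit")


total, add_steps = schoolbook_add(231567, 58914)
product, mul_steps = schoolbook_multiply(231567, 58914)
p, q, tried = factor_below(13642538238, 1000000)
print(total, add_steps)
print(product, mul_steps)
print(p, q, tried)
```

Addition takes one step per digit column and multiplication one step per pair of digits, while this trial division tries a number of candidates that grows exponentially in the number of digits.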

  9. Fundamental quests of CS Theory • Algorithms: Given a task (e.g., multiplication) find fast algorithms. • First algorithm we think of may not be fastest. • Complexity: Prove lower bounds on resources required to solve problem. • Is multiplication harder than addition? • Is factoring harder than multiplication? • Implications: Cryptography … • Economics: Markets implement efficient computation. • Biology: Nature implements efficient computation. • Networks: Errors implement efficient computation.

  10. Long-range questions • Is P = NP? • Formally, is all computation reversible? (e.g., multiplication vs. factoring) • Philosophically, can every designer (mathematician, physicist, engineer, biologist) be replaced by a computer? • (Most of us don’t expect this). • Can we factor integers efficiently? • (Hopefully, still no). • If not, can we build secure communication based on this? • Led to RSA. Still many challenges today.

  11. Modern addenda to long-term quests • Is the universe random? • Maybe … if so: • Can build efficient algorithms this way (modern examples due to Karger, Rubinfeld, Indyk, Kelner) • Can synchronize distributed systems (essential, as shown by Lynch et al.) • Can generate and preserve secrets (essential, as shown by Goldwasser and Micali). • Maybe not … if so • Might still look random to us, because P ≠ NP. (Long history … Blum, Micali, Yao) • Is the universe quantum? Factoring easy (Shor)

  12. Current quests in computation • Algorithms for Massive data sets • How can we leverage the computational power of a laptop to understand data such as the WWW? • Main issue: Massive data – won’t fit in our storage. • Factors in our favor: • We can perform random sampling • We don’t have to deliver “guaranteed answers” • Many Results [Karger, Vempala, Rubinfeld, Indyk] • Can tell if there’s a “trend change” [Rubinfeld et al.] • Can tell if a signal has high intensity in some frequency. [Indyk et al.] • Underlying emphasis on Randomness.

  13. Part II: Perspective of theory

  14. History of theoretical CS • 1930s: Turing – invented Turing machine. • Universality: One machine implements all algorithms. • Why? To model thought/reasoning/logic • theorems and proofs • Became foundation of modern computers (von Neumann) • 1960s: Non-trivial algorithms: • Peterson – BCH decoder • Cooley-Tukey – FFT • Dijkstra – shortest paths • 1970s: NP-completeness, Cryptography, RSA. • 1990s: Internet algorithms (Yahoo!, Akamai, Google).

  15. Theory vs. Practice • Theoretical Perspective • Focus on Long-term time horizon; not very close attention to current nature of: • Hardware • Domain-specific information • Solution feasibility • Why should you care (today?) • Lessons learned from past are useful (theories more important than theorems). • Good insight into problems of the future. • Occasionally … solutions useful today!

  16. Part III: Recent Research: Problems, Solutions

  17. Part IIIa: Introverted Algorithms

  18. Sublinear time algorithms [R. Rubinfeld, P. Valiant] • Typical Algorithmic Tasks. • Given x, compute some f(x) in time |x|. Linear time! • Modern challenges: • Data too “massive” to allow time |x| to process it. • Can we do much faster? • Allow “randomness” in algorithms. • Allow some “approximation error”.
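A minimal sketch of the idea of trading exactness for speed: estimate the fraction of items with some property by random sampling, without reading the whole input. The function name, the parameters `eps` and `delta`, and the use of a Hoeffding bound to pick the sample size are assumptions made for this illustration, not the specific algorithms cited on the slide.

```python
import math
import random
from typing import Callable, Optional, Sequence


def estimate_fraction(
    data: Sequence[int],
    predicate: Callable[[int], bool],
    eps: float = 0.02,
    delta: float = 1e-6,
    rng: Optional[random.Random] = None,
) -> tuple[float, int]:
    """Estimate the fraction of items satisfying predicate.

    By Hoeffding's inequality, the estimate is within eps of the true
    fraction with probability at least 1 - delta. The number of samples
    depends only on eps and delta, not on len(data).
    """
    rng = rng or random.Random()
    k = math.ceil(math.log(2 / delta) / (2 * eps**2))
    hits = sum(1 for _ in range(k) if predicate(data[rng.randrange(len(data))]))
    return hits / k, k


# Hypothetical "massive" input: one million flags, 30% of them set.
data = [1 if i % 10 < 3 else 0 for i in range(1_000_000)]
estimate, samples_used = estimate_fraction(data, lambda v: v == 1, rng=random.Random(0))
print(estimate, samples_used)
```

The sample count stays fixed as the input grows, which is what makes the running time sublinear in |x|; the price is the randomness and the approximation error that the slide allows.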

  19. Motivations • Internet Traffic • Suppose we maintain vast amounts of logs of internet traffic through a router. • Was there a major shift in the nature of requests within the last hour (perhaps a denial of service attack)? • Disease Patterns • Suppose we have data for the spread of a disease. • What are the causal factors? • … • Theme: Data Abundant; Processing bottleneck

  20. “Introverted Algorithms” • New Area: Many Problems, Few Tools • [P. Valiant]: Symmetric Approximation Properties of Distributions • Invariant under renaming: “Uniform a–m” = “Uniform n–z” (“Intrinsic properties”) • [Figure: Distribution Space divided into yes / ? / no regions]

  21. “Introverted Algorithms” • New Area: Many Problems, Few Tools • [P. Valiant]: Symmetric Approximation Properties of Distributions • Invariant under renaming: “Uniform a–m” = “Uniform n–z” (“Intrinsic properties”) • [Figure: map from Distribution Space to the Reals, with thresholds α and β and regions yes / ? / no]

  22. “Introverted Algorithms” • New Area: Many Problems, Few Tools • [P. Valiant]: Symmetric Approximation Properties of Distributions • Invariant under renaming: “Uniform a–m” = “Uniform n–z” (“Intrinsic properties”) • [Figure: continuous map from Distribution Space to the Reals, with thresholds α and β and regions yes / ? / no] • Includes: approximating Entropy, Statistical (L1) Distance, Support Size, Information Divergences, other Lc distances, weighted distances …

  23. New Contribution • Entropy Approximation: <α or >β? n^{α/β} • Statistical Distance: <α or >β? n • Two Components of a Solution: • An Upper Bound (Algorithm): n^{α/β} [BDKR ’02], n [B ’01] • A Lower Bound (Impossibility Proof): n^{1/2} [BFRSW ’00], n^{2α/3β} [RRSS ’07]

  24. New Contribution: Canonical Tester • Entropy Approximation: <α or >β? n^{α/β} • Statistical Distance: <α or >β? n • Canonical Testing Theorem: “If the Canonical Tester does not work, nothing does.” • Both an upper and a lower bound • Determining the sample complexity of property testing is now a question of algorithm analysis • What’s the algorithm?

  25. log n 2 The Canonical Tester yes no (a,b,b,a,a,a,f,e,e,e) estimate high frequencies threshold: 3 ∩{yes,no} constrain low frequencies yes ? no   .4 .3 <.3 <.3 <.3 “If the Canonical Tester does not work, nothing will”  is (,)-weakly continuous: if |d1-d2|<  then |(d1)-(d2)|< If the k-sample Canonical Tester with threshold O( ) does not correctly distinguish <α-ε from >β+ε, then no tester can distinguish <α+ε from >β-ε in k/no(1) samples.

  26. Part IIIb: Robust Intelligent Communication

  27. Intelligence and Interaction [Juba & S.] • Typical communication “protocols” non-robust. • Depend on perfect understanding between sender and receiver. Require universal adoption of fixed standards. Is this essential? • Why? • To reduce human oversight in critical tasks. • E.g., Cars that exchange information, hospitals exchanging medical records. • Heterogeneity leads to violation of “standards”. • Technical issues: • Classical communication suppresses/fears intelligence of communicators. Need new models, methods to exploit intelligence of sender & receiver.

  28. Modelling the Problem • Alice wishes to send algorithm A to Bob • Both know programming; but do so in different languages. • Can she send him the algorithm? • Theorem: Not possible to do this unambiguously. • Implications: Perfect understanding impossible in evolving settings (when two communicators evolve).

  29. Modelling the Problem • Alice wishes to send algorithm A to Bob • Both know programming; but do so in different languages. • Can she send him the algorithm? • Theorem [Juba & S.]: Not possible to do this unambiguously. • Implications: Perfect understanding impossible in evolving settings (when two communicators evolve) • What should we do?

  30. Communication & Goals • Communication is not an end in itself, it is a means to some (selfish, verifiable) end. • Bob must be trying to use Alice to some benefit • E.g., to alter the environment (remote control) • To learn something (intellectual curiosity). • Test Case: Bob (weak computer) tries to communicate with Alice (strong computer) to use her computational abilities. • Theorem [Juba & S.]: Bob can use Alice’s help to solve his problem iff problem is verifiable (without common prior background).
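A toy sketch of why verifiability helps: Bob does not know Alice's language, so he tries candidate interpreters one by one and keeps the first answer he can verify himself. Everything here is an assumption made for illustration (the small set of number encodings standing in for "languages", the factoring task, and the names `make_alice` and `bob_factor`); it is not the construction from the Juba and Sudan work.

```python
from math import isqrt
from typing import Callable, Optional

# Hypothetical "languages": (encode, decode) pairs for writing integers as text.
LANGUAGES = {
    "decimal": (str, int),
    "binary": (lambda n: format(n, "b"), lambda s: int(s, 2)),
    "hex": (lambda n: format(n, "x"), lambda s: int(s, 16)),
    "reversed-decimal": (lambda n: str(n)[::-1], lambda s: int(s[::-1])),
}


def make_alice(language: str) -> Callable[[str], str]:
    """Alice factors whatever number she believes she was sent."""
    encode, decode = LANGUAGES[language]

    def alice(message: str) -> str:
        try:
            n = decode(message)
        except ValueError:
            return "?"  # message is meaningless in Alice's language
        if n < 2 or n > 10**9:
            return "?"  # keep the toy helper fast
        p = next((d for d in range(2, isqrt(n) + 1) if n % d == 0), n)
        return encode(p) + "," + encode(n // p)

    return alice


def bob_factor(n: int, alice: Callable[[str], str]) -> Optional[tuple[str, int, int]]:
    """Try each candidate language; accept only an answer that verifies."""
    for name, (encode, decode) in LANGUAGES.items():
        reply = alice(encode(n))
        try:
            p, q = (decode(part) for part in reply.split(","))
        except ValueError:
            continue  # reply does not parse under this guess
        if 1 < p < n and p * q == n:  # Bob's own check, no trust needed
            return name, p, q
    return None


results = {lang: bob_factor(91, make_alice(lang)) for lang in LANGUAGES}
print(results)
```

Because multiplication lets Bob check any claimed factorization, a wrong guess about the language is caught and he simply moves on. For an unverifiable task he would have no way to tell a misunderstanding from a correct answer.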

  31. Examples • Bob uses Alice to determine which programs are viruses. • Undecidable problem. Bob cannot verify. • Eventually he will make an error. • Bob uses Alice to break a cryptosystem. • He knows when he has broken in. Should do so. • In the process of doing so he learns Alice’s language (and realizes he is learning). • Bob uses Alice to add integers. • Can verify – so he won’t make mistakes. • But probably won’t learn her language.

  32. Implications • Architecture for communicating computers: • Each interface should have a dedicated “interpreter” • Interpreter is constantly in mode of checking and adapting. • Will future of communication look like this? • Answer in 20 years …

  33. Recap … Why is Theory Important? • Lessons learned from past are useful (theories more important than theorems). • Message of FoxConn Algorithms Course! • Good insight into problems of the future. • Occasionally … solutions useful today! • RSA, Akamai (CSAIL has more royalties from theory than all other sources put together)!

  34. Thank You!
