
The beauty of prime numbers vs the beauty of the random

Ely Porat, Bar-Ilan University, Israel. Outline: Applications; Prime Numbers Group Testing; De-randomized approach for group testing; Applications getting into details; Length Reduction. Pattern Matching.


Presentation Transcript


  1. The beauty of prime numbers vs the beauty of the random. Ely Porat, Bar-Ilan University, Israel

  2. Outline • Applications • Prime Numbers Group Testing • De-randomized approach for group testing • Applications getting into details • Length Reduction

  3. Pattern Matching • Given a text T and a pattern P, the problem is to find all substrings of T that are equal to P.

  4. Streaming Model • The characters of T arrive one by one • We can't save T • Automata? • Our goal is to do that without saving P (figure labels: T, P, Φ(P))

  5. Hamming distance with wildcards • Find a pattern in a text with 2 complications: • Don't cares (wildcards Ø) • Mismatches

  6. Summary of results • Offline • $O(nk\log^2 m)$ Hamming distance with wildcards • Online pattern matching • Hamming distance • $O(k\log^2 m)$ Hamming distance with wildcards • $O(k\log m)$ edit distance • Streaming • $O(\log^2 m)$ space, $O(\log m)$ time: exact match • $O(k^3\log^5 m)$ space, $O(k^2\log^2 m)$ time: Hamming distance

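  7. Open problem • Online convolution in $o(\log^2 m)$ time per symbol. • Offline is done by FFT in $O(n\log m)$. • Example with m=5: the pattern $p_1 p_2 p_3 p_4 p_5$ aligned under $t_1 \ldots t_5$ of the text $t_1 t_2 \ldots t_n$ gives $t_1p_1+t_2p_2+\ldots+t_5p_5$; shifted by one position it gives $t_2p_1+t_3p_2+\ldots+t_6p_5$.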
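The sums on this slide can be written out directly. The following is a minimal sketch of the quantity being computed, using the naive O(nm) definition rather than the FFT method the slide refers to; the function name `sliding_dot_products` is an illustrative assumption, not something from the talk.

```python
def sliding_dot_products(text, pattern):
    """Return, for every alignment i, the sum over j of text[i+j] * pattern[j].

    This is the naive definition (O(n*m) time). The slide's offline bound
    of O(n log m) is obtained with FFT, which is not implemented here.
    """
    n, m = len(text), len(pattern)
    return [
        sum(text[i + j] * pattern[j] for j in range(m))
        for i in range(n - m + 1)
    ]


# m = 5 as on the slide: two alignments for a text of length 6.
products = sliding_dot_products([1, 2, 3, 4, 5, 6], [1, 0, 1, 0, 1])
print(products)
```

In the online setting the i-th sum has to be reported as soon as the last text symbol it depends on arrives, which is what makes the per-symbol time bound the interesting quantity.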

  8. Problem Definition • m people • at most k are sick • Query: Is someone in this set sick? • Goal: identify the sick people using only a few tests. • Non-adaptive

  9. Motivations • Syphilis, HIV [Dor43] • Mapping genomes [BLC91, BBK+95, TJP00] • Quality control in product testing [SG59] • Searching files in storage systems [KS64] • Sequential screening of experimental variables [Li62] • Efficient contention resolution algorithms for multiple access communication [KS64, Wol85] • Data compression [HL00] • Software testing [BG02, CDFP97] • DNA sequencing [PL94] • Molecular biology [DH00, FKKM97, ND00, BBKT96]

  10. Background • Same conditions: • Deterministic [KS64] • Random [KS64] • Heavy deterministic [AMS06] • Lower bound: • [CR96] • Relaxed conditions: • Fully adaptive • Two staged group testing and selectors [CGR00, Kni95, BGV03, CMS01, BV03, BGV05] • Optimal monotone encoding [AH08] • Similar problems: • Inhibitors [FKKM97, Dam98, BV98, BGV03] • Bayesian case [Kni95, BL02, BL03, A.J98, BGV03] • Errors [BGV98] • DIMACS 2006 (Chart: scheme size for Deterministic, Random and Heavy deterministic, Lower bound)

  11. Our Results • Deterministic • Size • Fast construction (Chart: scheme size for Deterministic, Random and Heavy deterministic, Lower bound)

  12. Prime Numbers Group Testing • Positions of the sick people • Bad event: there exists y s.t.

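  13. Prime Numbers Group Testing • Bad event: there exists y such that there is a dot below each prime, that is, on every prime y collides with one of $x_1, x_2, \ldots, x_k$ • Then there exists $x_i$ such that for primes with $p_{i_1} p_{i_2} \cdots p_{i_d} > n$ we have $y \bmod p_{i_j} = x_i \bmod p_{i_j}$ • By CRT, $x_i = y$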

  14. Prime Numbers Group Testing • This gives a group testing scheme of size $p_1+p_2+\ldots+p_r$ • By choosing good enough primes we get $O(k^2\log^2 m)$
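The construction on the last three slides can be sketched as follows. This is a minimal illustration under stated assumptions, not the talk's exact choice of primes: people are numbered 0..n-1, there is one test per (prime p, residue a) containing everyone with number ≡ a (mod p), and the primes are simply the first k(d-1)+1 primes, where d is the number of smallest primes whose product reaches n. With that choice, a healthy person who collides with a sick person on every prime would have to collide with one particular sick person on at least d primes, and the CRT then forces the two to be equal. All function names are hypothetical.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demo)."""
    found = []
    candidate = 2
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1


def choose_primes(n, k):
    """First k*(d-1)+1 primes, where the first d primes multiply to >= n.

    Assumes n >= 2. This is one simple sufficient choice, not an optimized one.
    """
    gen = primes()
    chosen, product = [], 1
    while product < n:
        p = next(gen)
        chosen.append(p)
        product *= p
    d = len(chosen)
    while len(chosen) < k * (d - 1) + 1:
        chosen.append(next(gen))
    return chosen


def run_tests(sick, chosen):
    """Positive tests: (p, a) is positive if some sick x has x mod p == a."""
    return {(p, x % p) for p in chosen for x in sick}


def decode(n, chosen, positive):
    """Declare y sick iff every test containing y came back positive."""
    return [y for y in range(n) if all((p, y % p) in positive for p in chosen)]


chosen = choose_primes(30, 2)
positive = run_tests({4, 17}, chosen)
print(chosen, "scheme size:", sum(chosen))
print(decode(30, chosen, positive))
```

The scheme size is the sum of the chosen primes, matching the slide's $p_1+p_2+\ldots+p_r$. For tiny n the scheme is larger than testing everyone individually; the bound only pays off asymptotically.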

  15. Randomized Group Testing • Just choose $O(k^2\log n)$ random sets of size n/k.
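A minimal sketch of the randomized scheme, for comparison with the prime-based one. The constant `c` hidden in the O(·), the use of log base 2, and the function names are assumptions made for illustration. Decoding clears everyone who appears in a negative test; sick people are never cleared, and with enough random sets every healthy person is cleared with high probability (not with certainty).

```python
import math
import random


def random_scheme(n, k, c=3, seed=0):
    """c * k^2 * ceil(log2 n) random test sets, each of size n // k.

    The constant c and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    num_tests = c * k * k * max(1, math.ceil(math.log2(n)))
    size = max(1, n // k)
    return [set(rng.sample(range(n), size)) for _ in range(num_tests)]


def decode(n, tests, sick):
    """Return everyone who is not contained in any negative test."""
    cleared = set()
    for test in tests:
        if not (test & sick):  # negative test: all its members are healthy
            cleared |= test
    return set(range(n)) - cleared


tests = random_scheme(64, 2)
suspects = decode(64, tests, {5, 40})
print(len(tests), sorted(suspects))
```

The returned set always contains the sick people; whether it contains anyone else depends on the random choices, which is the gap the derandomization in the following slides closes.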

  16. Overall derandomization plan

  17. Error correction codes • Length of words = m • Number of words = • Distance = • Rate = R • Relative distance = • Linear code (generating matrix of size Rm × m)

  18. Good random linear error correction codes • GV bound: there exists … with … • Linear codes → faster construction • Algorithm: Pick the entries of the generating matrix uniformly and independently.

  19. Method of conditional probabilities • Algorithm: Pick the entries of the generating matrix one by one. • In each step minimize the expected number of collisions between code words.

  20. Example code: $C=[3,2,2]_3$-RS (Reed-Solomon). (Figure: the codewords of C)

  21. Reduction from error correction codes to group testing schemes • $C=[3,2,2]_3$-RS codewords: 1: 000, 2: 111, 3: 222, 4: 012, 5: 120, 6: 201, 7: 021, 8: 210, 9: 102 • GT scheme: {1,4,7}, {2,5,9}, {3,6,8}, {1,6,9}, {2,4,8}, {3,5,7}, {1,5,8}, {2,6,7}, {3,4,9}
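The reduction on this slide can be reproduced mechanically: for every codeword position and every symbol, form the set of people whose codeword has that symbol at that position. The sketch below uses the slide's numbering of the nine codewords and also checks that they are the evaluations of the polynomials a + b·x over F_3 at x = 0, 1, 2. The function names are illustrative assumptions.

```python
# Codewords of C = [3,2,2]_3-RS, numbered 1..9 as on the slide.
codewords = {
    1: "000", 2: "111", 3: "222",
    4: "012", 5: "120", 6: "201",
    7: "021", 8: "210", 9: "102",
}


def reed_solomon_words(q=3, length=3):
    """Evaluations of a + b*x over F_q at x = 0..length-1 (q prime)."""
    return {
        "".join(str((a + b * x) % q) for x in range(length))
        for a in range(q)
        for b in range(q)
    }


def code_to_scheme(words, q=3):
    """One test per (position, symbol): everyone whose word has that symbol there."""
    length = len(next(iter(words.values())))
    return [
        {person for person, word in words.items() if word[pos] == str(sym)}
        for pos in range(length)
        for sym in range(q)
    ]


scheme = code_to_scheme(codewords)
for test in scheme:
    print(sorted(test))
```

A code of length m over an alphabet of size q therefore yields m·q tests; here 3·3 = 9.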

  22. Why should it work? • Theorem: Let C be an … code. Then F(C) is a group testing scheme for n people with up to … sick people. • $C=[3,2,2]_3$-RS codewords: 1: 000, 2: 111, 3: 222, 4: 012, 5: 120, 6: 201, 7: 021, 8: 210, 9: 102 (up to 2 sick people) • GT scheme: {1,4,7}, {2,5,9}, {3,6,8}, {1,6,9}, {2,4,8}, {3,5,7}, {1,5,8}, {2,6,7}, {3,4,9}
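For the small example, the "up to 2 sick people" claim can be checked exhaustively. The sketch below hard-codes the nine tests from the slide and uses the usual decoding rule (anyone in a negative test is healthy), which is an assumption about how the scheme is read out. The intuition it confirms: two distinct codewords agree in at most one of the three positions, so two sick people can cover at most two of the three tests containing a healthy person.

```python
from itertools import combinations

# The nine tests of the GT scheme from the slide, people numbered 1..9.
scheme = [
    {1, 4, 7}, {2, 5, 9}, {3, 6, 8},
    {1, 6, 9}, {2, 4, 8}, {3, 5, 7},
    {1, 5, 8}, {2, 6, 7}, {3, 4, 9},
]
people = set(range(1, 10))


def decode(sick):
    """Everyone not contained in any negative test is declared sick."""
    cleared = set()
    for test in scheme:
        if not (test & sick):  # negative test
            cleared |= test
    return people - cleared


# Every sick set of size 0, 1 or 2 is recovered exactly.
all_recovered = all(
    decode(set(sick)) == set(sick)
    for size in range(3)
    for sick in combinations(sorted(people), size)
)
print(all_recovered)

# With three sick people the guarantee can fail: here nobody is cleared.
print(sorted(decode({1, 2, 3})))
```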

  23. Why should it work? Proof Codewords representing sick men: k A codeword representing a healthy man:

  24. Worst Case Codewords representing sick men: k A codeword representing a healthy man:

  25. What did we get? (Chart: scheme size for Deterministic, Random and Heavy deterministic, Lower bound)

  26. Applications getting into details • Streaming • Up to 1 mismatch: • Assume we have a black box for searching for an exact match. • Split $P = p_1p_2p_3p_4p_5\ldots p_m$ into $P_{1,2} = p_1 p_3 p_5 \ldots$ and $P_{2,2} = p_2 p_4 \ldots$ • If both $P_{1,2}$ and $P_{2,2}$ fail to match, there is more than one mistake. • The other way around isn't true.

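  27. Streaming: Up to 1 mismatch • $P = p_1p_2p_3p_4p_5\ldots p_m$ • Split by residue for each prime: $P_{1,2} = p_1 p_3 p_5 \ldots$, $P_{2,2} = p_2 p_4 \ldots$, $P_{1,3} = p_1 p_4 \ldots$, $P_{2,3} = p_2 p_5 \ldots$, $P_{3,3} = p_3 \ldots$, up to $P_{q,q}$ • Choose primes with $2\cdot3\cdot5\cdot7\cdot11\cdots q > m$ • With CRT we will be able to find the position of the mismatch. • In order to support more mistakes we will add on top of that the prime numbers group testing.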
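The CRT step can be sketched as follows. This is an offline illustration of the idea, not a streaming implementation: `exact_match` stands in for the black box (a real streaming algorithm would compare fingerprints rather than the strings themselves), positions and residues are 0-indexed, and all function names are assumptions. For each prime q, exactly one residue class fails when there is a single mismatch at position x, and that class is x mod q.

```python
def small_primes():
    """Yield 2, 3, 5, ... by trial division."""
    found = []
    candidate = 2
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1


def exact_match(a, b):
    """Stand-in for the exact-match black box (assumption: plain equality)."""
    return a == b


def crt(residues, moduli):
    """Smallest x >= 0 with x = r (mod q) for each pair; moduli pairwise coprime."""
    x, modulus = 0, 1
    for r, q in zip(residues, moduli):
        # Pick t so that x + modulus * t is congruent to r modulo q.
        t = ((r - x) * pow(modulus, -1, q)) % q  # pow(..., -1, q): Python 3.8+
        x += modulus * t
        modulus *= q
    return x


def locate_single_mismatch(window, pattern):
    """Position of the mismatch if there is exactly one, otherwise None.

    Assumes len(window) == len(pattern) >= 2.
    """
    m = len(pattern)
    moduli, product = [], 1
    gen = small_primes()
    while product < m:  # product >= m suffices for positions 0..m-1
        q = next(gen)
        moduli.append(q)
        product *= q
    residues = []
    for q in moduli:
        failing = [
            r for r in range(q)
            if not exact_match(window[r::q], pattern[r::q])
        ]
        if len(failing) != 1:
            return None  # no mismatch, or more than one
        residues.append(failing[0])
    return crt(residues, moduli)


pattern = "abcdefghijklmnopqrst"
window = pattern[:13] + "X" + pattern[14:]
print(locate_single_mismatch(window, pattern))
```

Because the product of the primes is at least m, two distinct positions always differ modulo some chosen prime, so two or more mismatches show up as two failing residue classes for that prime and the function returns None.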
