An optimal algorithm for identifying a maximum-density segment • Hsueh-I Lu (呂學一), Institute of Information Science, Academia Sinica • http://www.iis.sinica.edu.tw/~hil/ • Microsoft Office XP is needed to see all the animation effects. Maximum-Density Segment @ EE.NTU
What do algorithm people do? • Inventing efficient recipes to solve combinatorial problems
A famous combinatorial problem • The Factorization Problem • Input: a number N • Output: “yes” if N is a prime number; a factorization of N if N is not a prime number. • For example, N = 323264989793317 gives Output = 18672511 × 17312347.
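A minimal sketch of the problem statement above, using plain trial division. The names `factorize` and `factorization_problem` are illustrative and not from the talk. Trial division needs about sqrt(N) steps, which is exponential in the number of digits of N, so it is not an efficient recipe in the sense used here.

```python
def factorize(n: int) -> list[int]:
    """Return the prime factors of n >= 2 in nondecreasing order (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # divide out every copy of d
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever is left is a prime factor
        factors.append(n)
    return factors


def factorization_problem(n: int):
    """Answer 'yes' for a prime n, otherwise return a factorization of n."""
    factors = factorize(n)
    return "yes" if len(factors) == 1 else factors


print(factorization_problem(97))    # a prime input
print(factorization_problem(323))   # a composite input
# Checking a claimed factorization is cheap even when finding one is slow.
print(18672511 * 17312347 == 323264989793317)
```

Verifying the slide's example takes a single multiplication, while finding the two factors by trial division takes millions of divisions.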
OPEN QUESTION • Is there an efficient recipe for the Factorization Problem?
Why Factorization? • The security of many encryption schemes is based upon the assumption that the factorization problem is difficult.
RSA encryption, 1978 • Rivest, Shamir, Adleman
RSA factorization challenges
US$10,000 –– RSA-576 • 1881988129206079638386972394616504398071635633794138270076335642298885971523466548531906060650474304531738801130339671619969232120573403187955065699621305168759307650257059
RSA-576 factored on December 3, 2003 • 398075086424064937397125500550386491199064362342526708406385189575946388957261768583317 • 472772146107435302536223071973048224632914695302097116459852171130520711256363590397527 • Around the same time, Adi Shamir gave two talks at NTU (Dec. 4, 2003)
US$20,000 –– RSA-640 • 3107418240490043721350750035888567930037346022842727545720161948823206440518081504556346829671723286782437916272838033415471073108501919548529007337724822783525742386454014691736602477652346609
US$200,000 –– RSA-2048 • 25195908475657893494027183240048398571429282126204032027777137836043662020707595556264018525880784406918290641249515082189298559149176184502808489120072844992687392807287776735971418347270261896375014971824691165077613379859095700097330459748808428401797429100642458691817195118746121515172654632282216869987549182422433637259085141865462043576798423387184774447920739934236584823824281198163815010674810451660377306056201619676256133844143603833904414952634432190114657544454178424020924616515723350778707749817125772467962926386356373289912154831438167899885040445364023527381951378636564391212010397122822120720357
Short of cash? www.rsasecurity.com/rsalabs/challenges/factoring/
RSA 2003 (April ’03)
2002 Turing Award (June ’03)
The awarded paper • Only 7 pages. • “A Method for Obtaining Digital Signatures and Public-Key Cryptosystems”, Communications of the ACM 21(2), 120-126, 1978.
“PRIMES is in P” • Agrawal, Kayal, and Saxena • August 6, 2002
PRIMES is in P • The PRIMES problem: • Input: a number N. • Output: “yes” if N is a prime number; “no” if N is not a prime number. • Only 9 pages! • Running time is O(n^12), where n is the number of digits.
NEW YORK TIMES, Aug. 8, 2002 • Previous algorithmic results that caught the attention of the New York Times • 1984, Karmarkar’s algorithm for solving linear programs. • 1979, Khachiyan’s algorithm for solving linear programs.
The latest version (v.3) of AKS’s paper • The running time is now improved from O(n^12) to O(n^7.5).
What do algorithm people do? • Looking for important/interesting combinatorial problems • Coming up with efficient recipes to solve them exactly or approximately.
Bioinformatics • A gold mine of combinatorial problems
An example: My results
Finding a DNA segment with maximum GC-density in linear time • WABI • J. Comput. Syst. Sci. • ESA • SIAM J. Computing
DNA Sequences • [Chargaff and Vischer, 1949] • DNA consists of A, G, T, C • Adenine • Guanine • Cytosine • Thymine
[Vischer, Zamenhof, Chargaff, 1949] • Evidence against the widely believed %A = %G = %T = %C.
Erwin Chargaff, 1905-2002 • Observed that %A ~ %T and %G ~ %C • “A comparison of the molar proportions reveals certain striking, but perhaps meaningless, regularities”
Double Helix • [Watson and Crick, Nature, April 25, 1953] • Biologist (age 23, fresh Ph.D.) + Physicist (age 35, still a Ph.D. student) • 900 words, 2 pages
1962 Nobel Prize in Physiology or Medicine • Crick, Watson, and Wilkins
DNA’s picture • [Alexander Rich, 1973] • Structural biologist at MIT. • DNA’s picture at atomic resolution.
Celebrating 50 years of Double Helix (April 25, 1953 – 2003)
Francis Crick 1916-2004 • Passed away on July 28, 2004 • (Photo taken in 1993 in Paris)
Maurice Wilkins 1916-2004 • Passed away on Oct 5, 2004
GC-content • Non-uniformity of nucleotide composition • 25%-75% across the genomes of all organisms • 40%-50% in typical mammalian genomes • 30%-60% in human chromosomes • The underlying causes are still unknown.
GC-content • GC-content is positively correlated with • gene length, • gene density, • patterns of codon usage, • recombination rate within chromosomes, • …
The Problem • Input: • an n-bit string S, • an integer L. • Output: • a substring S[i, j] of S with maximum density over all substrings of S with at least L bits.
Example • S = 1 0 0 1 1 0 0 1 0 1 1 0 1 0 0 1 • L = 1, 2, 3: [a maximum-density segment of S is highlighted for each L on the original slide]
Density of each segment in O(1) time • prefix-sum(i) = S[1]+S[2]+…+S[i], • all n prefix sums are computable in O(n) time. • sum(i, j) = prefix-sum(j) – prefix-sum(i-1) • density(i, j) = sum(i, j) / (j-i+1)
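The prefix-sum formulas above can be sketched as follows, together with a brute-force search that tries every segment of at least L bits. This is only a quadratic-time baseline for checking answers, not the algorithm of the talk; the function names are illustrative. Indices are 1-based and inclusive, as on the slides.

```python
from fractions import Fraction
from itertools import accumulate


def prefix_sums(bits):
    """prefix[i] = S[1] + ... + S[i], with prefix[0] = 0; computed in O(n) time."""
    return [0] + list(accumulate(bits))


def density(prefix, i, j):
    """Density of S[i, j] (1-based, inclusive) in O(1) time."""
    return Fraction(prefix[j] - prefix[i - 1], j - i + 1)


def max_density_segment(bits, L):
    """Brute force: return (density, i, j) of a best segment with at least L bits."""
    n = len(bits)
    prefix = prefix_sums(bits)
    best = None
    for i in range(1, n - L + 2):
        for j in range(i + L - 1, n + 1):
            d = density(prefix, i, j)
            if best is None or d > best[0]:
                best = (d, i, j)
    return best


S = [1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1]
for L in (1, 2, 3):
    d, i, j = max_density_segment(S, L)
    print(f"L = {L}: density {d} at S[{i}, {j}]")
```

For the 16-bit example string, the best density is 1 for L = 1 and L = 2, and 3/4 for L = 3 (for instance S[8, 11] = 1 0 1 1).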
Good partners • Finding the best ending position g(i) for each i = 1, 2, …, n, i.e., the one maximizing avg[i, g(i)]. • [Diagram: positions i, i + L, and g(i), with a span of length L]
Previous Work • [Huang, CABIOS ’94] • O(nL) time. • Key observation: no need to examine substrings longer than 2L. • [Diagram: positions i, i + L, and g(i), with two spans of length L]
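A sketch of why the length cap works, under the usual argument (stated here as an assumption about the reasoning, since the slide gives only the observation): a segment with 2L or more bits splits into two pieces of at least L bits each, and one of the pieces is at least as dense as the whole. So lengths L through 2L-1 suffice, giving O(nL) candidate segments. The function names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate


def best_density(bits, L, max_len=None):
    """Maximum density over segments with L <= length <= max_len (no cap if None)."""
    n = len(bits)
    prefix = [0] + list(accumulate(bits))
    best = None
    for i in range(n - L + 1):                      # 0-based start position
        longest = n - i if max_len is None else min(n - i, max_len)
        for length in range(L, longest + 1):
            d = Fraction(prefix[i + length] - prefix[i], length)
            if best is None or d > best:
                best = d
    return best


def best_density_bounded(bits, L):
    """Examine only lengths L .. 2L-1: O(nL) segments instead of O(n^2)."""
    return best_density(bits, L, max_len=2 * L - 1)


S = [1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1]
print(best_density(S, 3), best_density_bounded(S, 3))
```

Both calls return the same density; only the number of segments examined differs.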
Recent Progress • [Lin, Jiang, Chao, Journal of Computer and System Sciences (JCSS), 2002] • O(n log L) time. • Techniques: • Right-skew decomposition. • Jumping tables that allow binary search. • [Diagram: positions i, i + L, and g(i), with two spans of length L]
Our results • Reducing the running time to O(n).
Reviewing Lin, Jiang, and Chao’s Algorithm
Right-Skew Substring • S[i, j] is right-skew if for each k = i, …, j-1: density[i, k] ≤ density[k+1, j]. • S = 1 0 0 1 1 0 0 1 0 1 1 0 1 0 0 1 • [Example right-skew substrings of S are highlighted on the original slide]
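The definition above can be checked directly by testing every split point k. This is a literal, quadratic-time transcription for illustration; the helper names are not from the talk.

```python
from fractions import Fraction


def density(bits):
    """Fraction of ones in a non-empty list of bits."""
    return Fraction(sum(bits), len(bits))


def is_right_skew(bits):
    """True if every proper prefix is no denser than the remaining suffix."""
    return all(density(bits[:k]) <= density(bits[k:]) for k in range(1, len(bits)))


print(is_right_skew([0, 1, 1]))      # prefixes 0 and 0 1 are no denser than what follows
print(is_right_skew([1, 0, 0, 1]))   # the prefix 1 is denser than the suffix 0 0 1
```

A single bit is trivially right-skew, since it has no split point.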
Right-Skew Decomposition • Partition S into substrings S1,S2,…,Sk such that • each Si is a right-skew substring of S • density(S1) > density(S2) > … > density(Sk) • [Lin, Jiang, Chao] • Unique • Computable in linear time.
An example • 1 1 1 0 1 1 0 1 0 1 1 0 0 • 1 > 2/3 > 3/5 > 1/3
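One way to compute such a decomposition in linear time is to scan left to right and merge the last two blocks whenever the earlier one is no denser than the later one. This is a sketch of a standard merging technique, not necessarily the construction used by Lin, Jiang, and Chao. The bit string below is the slide's string with one extra trailing 1, an assumption made here so that the segment densities come out as the 1 > 2/3 > 3/5 > 1/3 shown on the slide.

```python
from fractions import Fraction


def rs_decomposition(bits):
    """Return [(first, last, density), ...] with 0-based inclusive indices."""
    blocks = []  # each block is [number_of_ones, length]
    for b in bits:
        blocks.append([b, 1])
        # Merge while density(second-to-last) <= density(last),
        # compared by cross-multiplication to stay in integers.
        while (len(blocks) >= 2
               and blocks[-2][0] * blocks[-1][1] <= blocks[-1][0] * blocks[-2][1]):
            ones, length = blocks.pop()
            blocks[-1][0] += ones
            blocks[-1][1] += length
    segments, start = [], 0
    for ones, length in blocks:
        segments.append((start, start + length - 1, Fraction(ones, length)))
        start += length
    return segments


# Hypothetical input: the slide's 13 bits plus an assumed trailing 1.
S = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
for first, last, d in rs_decomposition(S):
    print(S[first:last + 1], d)
```

Each bit is pushed once and each merge removes a block, so the total work is O(n).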
Why RS-decomposition? • It suffices to search for g(i) among the boundaries of RS-decomposition of S[i, n]. • The boundaries’ “potential” of being a good partner is bi-tonic. • density[i, j1], density[i, j2], …, density[i, jk] is first monotonically increasing then monotonically decreasing.
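A sketch of how the bi-tonic property can be used. It assumes, as one reading of the slides, that the first L bits S[i, i+L-1] are mandatory and that the candidate endpoints are the boundaries of the RS-decomposition of the remainder starting at i+L. Because the densities rise and then fall along these boundaries, it is enough to keep extending while the next right-skew block is at least as dense as the current segment. A plain left-to-right scan is used here; the jumping tables of the talk would replace it with a binary search. All names and the 0-based indexing are illustrative.

```python
from fractions import Fraction
from itertools import accumulate


def rs_decomposition(bits, start=0):
    """RS-decomposition of bits[start:] as [(first, last), ...], 0-based inclusive."""
    blocks = []  # each block is [number_of_ones, length]
    for b in bits[start:]:
        blocks.append([b, 1])
        while (len(blocks) >= 2
               and blocks[-2][0] * blocks[-1][1] <= blocks[-1][0] * blocks[-2][1]):
            ones, length = blocks.pop()
            blocks[-1][0] += ones
            blocks[-1][1] += length
    segments, pos = [], start
    for _, length in blocks:
        segments.append((pos, pos + length - 1))
        pos += length
    return segments


def good_partner(bits, i, L):
    """Best ending position for a segment starting at i with at least L bits."""
    prefix = [0] + list(accumulate(bits))

    def dens(a, b):
        return Fraction(prefix[b + 1] - prefix[a], b - a + 1)

    end = i + L - 1                      # the mandatory first L bits
    for first, last in rs_decomposition(bits, i + L):
        if dens(first, last) < dens(i, end):
            break                        # past the peak: densities only fall from here
        end = last                       # still on the rising side
    return end


S = [1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1]
print(good_partner(S, 7, 3))   # 0-based start 7 is position 8 on the slides
```

Recomputing the decomposition for every i costs linear time per start position; avoiding that is the purpose of the preprocessing steps.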
Illustration • [Diagram: positions i, i + L, and g(i), with spans of length L]
Preprocessing steps • RS-decomposition of S[i, n] for each i. • Jumping table that enables binary search among the boundaries.
First preprocessing: All RS-decompositions • The RS-decomposition of each S[i, n] • Linear time for each i = 1, …, n. • All n RS-decompositions • [Lin et al.] O(n^2) time improved to O(n) time.
Key: nested structures • 1 1 1 0 1 1 0 1 0 1 1 0 0
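One way to read the nested structure, sketched under that assumption: the decomposition of S[i, n] starts with a first block and then continues with the blocks already computed for a later suffix. Storing only p[i], the last position of the first block for each suffix, therefore encodes all n decompositions in O(n) space. The array name `p`, the 0-based indexing, and the test string (the slide's 13 bits plus an assumed trailing 1) are choices made for this illustration.

```python
from fractions import Fraction
from itertools import accumulate


def first_segment_ends(bits):
    """p[i] = last index of the first right-skew block of the suffix bits[i:]."""
    n = len(bits)
    prefix = [0] + list(accumulate(bits))

    def dens(a, b):
        return Fraction(prefix[b + 1] - prefix[a], b - a + 1)

    p = [0] * n
    for i in range(n - 1, -1, -1):       # suffixes from shortest to longest
        p[i] = i
        # Absorb the next block while the current block is no denser than it.
        while p[i] < n - 1 and dens(i, p[i]) <= dens(p[i] + 1, p[p[i] + 1]):
            p[i] = p[p[i] + 1]
    return p


def decomposition_of_suffix(p, i):
    """Read off the blocks of the suffix starting at i by following p."""
    segments = []
    while i < len(p):
        segments.append((i, p[i]))
        i = p[i] + 1
    return segments


S = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1]   # assumed trailing 1, see above
p = first_segment_ends(S)
print(decomposition_of_suffix(p, 0))
print(decomposition_of_suffix(p, 7))
```

Every absorbed block disappears from the decomposition of the longer suffix, so the total number of loop iterations over all i is at most n.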