On the Range Maximum-Sum Segment Query Problem Kuan-Yu Chen and Kun-Mao Chao Department of Computer Science and Information Engineering, National Taiwan University, Taiwan Chen and Chao
The Maximum-Sum Segment
• Also called the maximum-sum interval or the maximum-scoring region.
• Given a sequence of numbers, the maximum-sum segment is the contiguous subsequence with the greatest total sum.
• Example: <5, -5.1, 1, 3, -4, 2, 3, -4, 7> has greatest total sum = 8.
• Zero prefix-/suffix-sums are possible.
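
To make the definition concrete, here is a minimal sketch that finds one maximum-sum segment with a single left-to-right scan (the scan commonly known as Kadane's algorithm, which is not taken from the slides). The function name `max_sum_segment` and the 0-based indexing are assumptions made for illustration.

```python
def max_sum_segment(a):
    """Return (best_sum, start, end) for a non-empty list, 0-based inclusive."""
    best_sum, best_start, best_end = a[0], 0, 0
    cur_sum, cur_start = a[0], 0
    for k in range(1, len(a)):
        if cur_sum < 0:
            # A negative running sum can only hurt, so restart at k.
            cur_sum, cur_start = a[k], k
        else:
            cur_sum += a[k]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, k
    return best_sum, best_start, best_end


seq = [5, -5.1, 1, 3, -4, 2, 3, -4, 7]
result = max_sum_segment(seq)  # (8, 2, 8): the segment <1, 3, -4, 2, 3, -4, 7>
```

The shorter segment <2, 3, -4, 7> also sums to 8, because the prefix <1, 3, -4> sums to zero. This is the situation the slide's remark about zero prefix sums refers to, so the maximum-sum segment need not be unique.
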
A Relevant Problem - RMQ
• Range Minima (Maxima) Query Problem (also called Discrete Range Searching).
• Given a sequence of numbers, we wish to preprocess it so that the minimum (maximum) value within any query interval can be retrieved efficiently.
• Example: <5, -5.1, 1, 3, -4, 2, 3, -4, 7>
(Figure: the minimum and the maximum within a query interval are marked.)
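
As a rough illustration of what an RMQ structure offers, the sketch below uses a sparse table. This is a swapped-in, simpler technique: it needs O(n log n) preprocessing rather than the O(n) preprocessing the talk relies on, but it does answer each query in O(1). The class name `SparseTableRMQ` and its `want_max` flag are hypothetical.

```python
class SparseTableRMQ:
    """Range minima/maxima by sparse table: O(n log n) build, O(1) query."""

    def __init__(self, values, want_max=False):
        self.values = list(values)
        self.want_max = want_max
        n = len(self.values)
        # table[k][i] = index of the best value in values[i : i + 2**k]
        self.table = [list(range(n))]
        k = 1
        while (1 << k) <= n:
            prev, half = self.table[-1], 1 << (k - 1)
            self.table.append([self._pick(prev[i], prev[i + half])
                               for i in range(n - (1 << k) + 1)])
            k += 1

    def _pick(self, i, j):
        vi, vj = self.values[i], self.values[j]
        if self.want_max:
            return i if vi >= vj else j
        return i if vi <= vj else j

    def query(self, lo, hi):
        """Index of the best value in values[lo..hi] (0-based, inclusive)."""
        k = (hi - lo + 1).bit_length() - 1
        # Two overlapping blocks of length 2**k cover [lo, hi].
        return self._pick(self.table[k][lo], self.table[k][hi - (1 << k) + 1])


seq = [5, -5.1, 1, 3, -4, 2, 3, -4, 7]
lowest = SparseTableRMQ(seq).query(2, 6)                   # index of the minimum
highest = SparseTableRMQ(seq, want_max=True).query(2, 6)   # index of a maximum
```
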
Range Maximum-Sum Segment Query Problem
Definition:
• The input is a sequence <a1, a2, …, an> of real numbers, which is to be preprocessed.
• A query consists of two intervals S and E.
• Our goal is to return the maximum-sum segment whose starting index lies in S and whose end index lies in E.
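
The definition can be pinned down with a brute-force reference that tries every admissible pair of indices. This sketch is only a specification to test faster methods against, not the algorithm of the talk. The name `rmsq_bruteforce`, the 1-based inclusive intervals, and the query regions in the usage lines are assumptions chosen for illustration.

```python
def rmsq_bruteforce(a, S, E):
    """Best (sum, i, j) with i in S, j in E and i <= j; indices are 1-based."""
    best = None
    for i in range(S[0], S[1] + 1):
        for j in range(max(i, E[0]), E[1] + 1):
            total = sum(a[i - 1:j])          # a_i + ... + a_j
            if best is None or total > best[0]:
                best = (total, i, j)
    return best


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
answer = rmsq_bruteforce(a, S=(2, 5), E=(7, 10))  # (8, 3, 9)
```
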
A Nonoverlapping Example
• Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
• Total sum = 6
(Figure: the starting region and the end region are marked on the sequence.)
An Overlapping Example
• Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
• Total sum = 8
(Figure: the starting region and the end region are marked on the sequence.)
Our Results
• We propose an algorithm that runs in O(n) preprocessing time and O(1) query time under the unit-cost RAM model.
• We show that the RMSQ techniques yield alternative O(n)-time algorithms for the following problems:
  • The maximum-sum segment with length constraints
  • All maximal-sum segments
Strategy
• Reduce the RMSQ problem to the RMQ problem.
• Theorem. If there is a <f(n), g(n)>-time solution for the RMQ problem, then there is a <f(n)+O(n), g(n)+O(1)>-time solution for the RMSQ problem.
(Figure: RMSQ reduces to RMQ with O(n) extra preprocessing time and O(1) extra query time.)
Cumulative Sum / Prefix Sum
• prefix-sum(i) = a1 + a2 + … + ai
Computing sum(i, j) in O(1) Time
• prefix-sum(i) = a1 + a2 + … + ai
• All n prefix sums are computable in O(n) time.
• sum(i, j) = prefix-sum(j) - prefix-sum(i-1)
(Figure: prefix-sum(j) and prefix-sum(i-1) marked at positions j and i-1.)
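
The identity can be checked with a short sketch. It stores prefix-sum(0) = 0 at position 0 so that the 1-based indices of the slides carry over directly. The helper names `prefix_sums` and `segment_sum` are assumptions for illustration.

```python
from itertools import accumulate


def prefix_sums(a):
    """P[k] = a_1 + ... + a_k, with P[0] = 0; computed in O(n) time."""
    return [0] + list(accumulate(a))


def segment_sum(P, i, j):
    """sum(i, j) = prefix-sum(j) - prefix-sum(i-1), in O(1) time (1-based)."""
    return P[j] - P[i - 1]


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P = prefix_sums(a)
# P == [0, 9, -1, 3, 1, 5, 0, 4, 1, 7, -4, 4, 1, 5, 0, 3]
total = segment_sum(P, 3, 9)  # 8, the sum of a_3 .. a_9
```
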
Case 1: Nonoverlapping
• sum(i, j) = prefix-sum(j) - prefix-sum(i-1), so to maximize sum(i, j) we maximize prefix-sum(j) and minimize prefix-sum(i-1).
• Prefix-sum plot of the input sequence 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
• Find the lowest point in the starting region and the highest point in the end region, each by a range minima (maxima) query.
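
A minimal sketch of the nonoverlapping case follows. Plain `min`/`max` scans stand in for the two range queries, so each call here costs linear time. With an O(1) RMQ structure in their place, the query would take constant time. The function name and the query regions are hypothetical.

```python
from itertools import accumulate


def rmsq_nonoverlapping(P, S, E):
    """Best (sum, i, j) with i in S and j in E, assuming S ends no later than E starts.

    P holds prefix sums with P[0] = 0; intervals are 1-based and inclusive.
    """
    (s1, s2), (e1, e2) = S, E
    assert s2 <= e1, "this sketch only covers the nonoverlapping case"
    i = min(range(s1, s2 + 1), key=lambda s: P[s - 1])  # lowest prefix-sum(i-1)
    j = max(range(e1, e2 + 1), key=lambda e: P[e])      # highest prefix-sum(j)
    return P[j] - P[i - 1], i, j


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P = [0] + list(accumulate(a))
answer = rmsq_nonoverlapping(P, S=(2, 5), E=(7, 10))  # (8, 3, 9)
```

Because every start in S precedes every end in E, the two choices are independent and always form a valid segment.
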
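Case 2: Overlapping
• Some problems may occur: if the highest and lowest points are chosen independently, the lowest point may lie after the highest point, which gives a negative sum.
• Prefix-sum plot of the input sequence 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
(Figure: "find the highest point here" and "find the lowest point here" regions, with the resulting negative sum marked.)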
Case 2: Overlapping
• Divide into 3 possible cases.
• Prefix-sum plot of the input sequence 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
• Two of the cases are nonoverlapping: find the highest and the lowest point by range minima (maxima) queries, with preprocessing time f(n) and query time g(n).
• In the remaining case the highest and the lowest point must be found in the same region. What should we do?
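
One way to realize the three-way split is sketched below, assuming the query intervals satisfy s1 <= e1 <= s2 <= e2. The slide does not spell out the exact split, so the decomposition used here is an assumption: starts before the overlap, ends after the overlap, and both indices inside the overlap. The single-range case is handled by a brute-force placeholder. All function names are hypothetical.

```python
from itertools import accumulate


def best_pair(P, starts, ends):
    """Independent min/max choice; valid when every start precedes every end."""
    if len(starts) == 0 or len(ends) == 0:
        return None
    i = min(starts, key=lambda s: P[s - 1])
    j = max(ends, key=lambda e: P[e])
    return P[j] - P[i - 1], i, j


def single_range(P, lo, hi):
    """Placeholder for the special case: start and end both in [lo, hi]."""
    return max((P[j] - P[i - 1], i, j)
               for i in range(lo, hi + 1) for j in range(i, hi + 1))


def rmsq_overlapping(P, S, E):
    (s1, s2), (e1, e2) = S, E
    assert s1 <= e1 <= s2 <= e2
    candidates = [
        best_pair(P, range(s1, e1), range(e1, e2 + 1)),          # start before overlap
        best_pair(P, range(e1, s2 + 1), range(s2 + 1, e2 + 1)),  # end after overlap
        single_range(P, e1, s2),                                 # both inside overlap
    ]
    return max(c for c in candidates if c is not None)


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P = [0] + list(accumulate(a))
answer = rmsq_overlapping(P, S=(3, 9), E=(6, 12))  # (8, 3, 9)
```

Only the third candidate needs new machinery, which is why the remaining slides concentrate on the single range query.
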
Dealing with the Special Case: Single Range Query
• Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
• Total sum = 6
• Challenge: can this special case be reduced to the RMQ problem?
Reduction Procedure
• Step 1. Find a partner for each index.
• Step 2. Record the sum of each pair in an array.
• Step 3. Retrieve the maximum-sum pair by applying the RMQ techniques.
Our First Attempt (1)
• Step 1: For each index i, we define the lowest point preceding i as its partner.
(Figure: on the prefix-sum plot, the partner of i is the lowest point within the region preceding i.)
Our First Attempt (2)
• Step 2: Record sum(partner(i), i) in an array.
(Figure: the array entry for i holds sum(partner(i), i).)
Our First Attempt (3)
• Step 3: Apply the RMQ techniques to the array.
• Querying an interval of this array retrieves the maximum-sum pair.
(Figure: the RMQ is applied to the sequence of sum(partner(i), i) values over the querying interval.)
Bumping into Difficulties
• What if partner(i) lies outside the querying interval?
• Then sum(partner(i), i) needs to be updated, and we might have to update every pair!
(Figure: partner(i) lies to the left of the querying interval.)
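
The difficulty can be reproduced with a small sketch of the first attempt, in which partner(i) is the start index whose preceding prefix sum is the lowest seen so far. The function name `first_attempt` and the query interval [5, 9] are assumptions chosen to trigger the problem.

```python
from itertools import accumulate


def first_attempt(P):
    """partner[i] = start k <= i minimizing P[k-1]; pair_sum[i] = sum(partner[i], i)."""
    n = len(P) - 1
    partner = [0] * (n + 1)        # index 0 is unused (1-based indices)
    pair_sum = [None] * (n + 1)
    best = 1
    for i in range(1, n + 1):
        if P[i - 1] < P[best - 1]:
            best = i               # a new lowest point precedes i
        partner[i] = best
        pair_sum[i] = P[i] - P[best - 1]
    return partner, pair_sum


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P = [0] + list(accumulate(a))
partner, pair_sum = first_attempt(P)

lo, hi = 5, 9
i = max(range(lo, hi + 1), key=lambda r: pair_sum[r])
# i == 9 with pair_sum 8, but partner[9] == 3 lies left of the interval [5, 9].
```

Inside [5, 9] the best segment is a_7..a_9 with sum 7, so the recorded pair cannot be used as is, and nothing in this scheme bounds how many recorded pairs are affected.
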
A Better Partner
• On the prefix-sum plot, Left_bound(i) is the nearest point to the left of i that is at least as high as i.
• The new partner(i) is the lowest point between Left_bound(i) and i.
Why Is It Better? (1)
• It remains the best choice.
• It saves many update steps.
• It turns out that zero or one point needs to be updated.
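Why Is It Better? (2): Remains the Best
• Left_bound(i) is the nearest higher point to the left of i, and partner(i) is the lowest point between Left_bound(i) and i.
(Figure: the region to the left of Left_bound(i) is marked as the impossible region.)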
Why Is It Better? (3): Minimal-Maximal Property
• Height(partner(i)) < Height(j) < Height(i), for all partner(i) < j < i.
• i is the maximal point: no one in between is higher than i.
• partner(i) is the minimal point: no one in between is lower than partner(i).
(Figure: the next higher point lies to the left of partner(i).)
Why Is It Better? (4): Save Some Updates
• No point between partner(i) and i is higher than i, so the points of the querying interval to the left of i cannot be the right end of the maximum-sum segment.
(Figure: prefix-sum plot with the next higher point, partner(i), i, and the querying interval.)
Why Is It Better? (5): Nesting Property
• For two indices i < j, it cannot be the case that partner(i) < partner(j) ≤ i < j.
(Figure: two pairs, each with its minimal point partner(·) and its maximal point, drawn without crossing.)
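
The partner definition and the two properties can be checked directly with a quadratic sketch written straight from the definitions. Several details are assumptions of this sketch rather than statements from the slides: points are positions 0..n of the prefix-sum array, the left bound is the nearest earlier point at least as high as point i, ties for the lowest point are broken towards the right, and when no point lies strictly between the left bound and i, the segment is the single element a_i.

```python
from itertools import accumulate


def partners_by_definition(P):
    """Return (left_bound, partner); partner[i] is a 1-based start index."""
    n = len(P) - 1
    left_bound = [0] * (n + 1)
    partner = [0] * (n + 1)
    for i in range(1, n + 1):
        lb = i - 1
        while lb >= 0 and P[lb] < P[i]:
            lb -= 1                      # lb == -1 means no such point exists
        region = range(lb + 1, i)        # points strictly between lb and i
        if len(region) == 0:
            point = i - 1
        else:
            point = min(reversed(region), key=lambda k: P[k])  # rightmost lowest
        left_bound[i] = lb
        partner[i] = point + 1           # segment a_partner .. a_i
    return left_bound, partner


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P = [0] + list(accumulate(a))
left_bound, partner = partners_by_definition(P)
# partner[1:] == [1, 2, 3, 4, 3, 6, 7, 8, 3, 10, 11, 12, 11, 14, 15]
```

On this input the pairs ending at 5 and 9 both start at 3, and the pair ending at 13 starts at 11, so pairs are either nested or disjoint.
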
Why Is It Better? (6): An Example
• No overlapping is allowed:
  9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
• Nesting property:
  9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
When a Query Comes, Case 1: No Exceeding
• Retrieve the maximum pair (partner(i), i) over the querying interval.
• The maximum pair lies within the querying interval.
• We are done. Output (partner(i), i).
When a Query Comes, Case 2: Exceeding
• Retrieve the maximum pair (partner(i), i) over the querying interval; here partner(i) goes beyond the querying interval.
• Update partner(i) to new_partner(i), the lowest point inside the querying interval.
• By the minimal-maximal property, the points of the querying interval to the left of i cannot be the right end of the maximum-sum segment.
• Retrieve the maximum pair (partner(j), j) among the indices j to the right of i. By the nesting property, partner(j) lies within the querying interval.
• Compare (new_partner(i), i) and (partner(j), j).
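
The two cases can be put together in a sketch of the single range query. It is one reading of the slides, not a transcription of the authors' code: partners are computed in O(n) with a stack, ties for the lowest point go to the right, and `max`/`min` scans stand in for the O(1) RMQ calls. The names `build` and `single_range_query` are hypothetical.

```python
from itertools import accumulate


def build(a):
    """Prefix sums P, partner[i] (1-based start index), pair_sum[i] = sum(partner[i], i)."""
    P = [0] + list(accumulate(a))
    n = len(a)
    partner = [0] * (n + 1)
    pair_sum = [float("-inf")] * (n + 1)
    stack = [(0, 0)]  # (point, rightmost lowest point since the previous stack entry)
    for i in range(1, n + 1):
        low = None
        while stack and P[stack[-1][0]] < P[i]:   # pop points lower than i
            _, cand = stack.pop()
            if low is None or P[cand] < P[low]:
                low = cand
        point = i - 1 if low is None else low      # nothing popped: single element a_i
        partner[i] = point + 1
        pair_sum[i] = P[i] - P[point]
        stack.append((i, i if low is None else low))
    return P, partner, pair_sum


def single_range_query(P, partner, pair_sum, lo, hi):
    """Best (sum, start, end) with lo <= start <= end <= hi."""
    i = max(range(lo, hi + 1), key=lambda r: pair_sum[r])      # RMQ on pair sums
    if partner[i] >= lo:                                       # Case 1: no exceeding
        return pair_sum[i], partner[i], i
    start = min(range(lo, i + 1), key=lambda s: P[s - 1])      # Case 2: update partner(i)
    best = (P[i] - P[start - 1], start, i)
    if i < hi:
        j = max(range(i + 1, hi + 1), key=lambda r: pair_sum[r])
        best = max(best, (pair_sum[j], partner[j], j))
    return best


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P, partner, pair_sum = build(a)
exceeding = single_range_query(P, partner, pair_sum, 5, 9)     # (7, 7, 9)
contained = single_range_query(P, partner, pair_sum, 10, 15)   # (9, 11, 13)
```

Each query touches at most one pair that needs updating, which matches the claim that zero or one point needs to be updated.
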
Time Complexity
• RMSQ can be reduced to the RMQ problem in O(n) time.
• Since there is a <O(n), O(1)>-time solution for the RMQ problem under the unit-cost RAM model, there is a <O(n), O(1)>-time solution for the RMSQ problem.
• Conversely, RMQ can be reduced to the RMSQ problem in O(n) time, too. (Range Maxima Query: between every two adjacent elements, we insert a negative number whose absolute value is larger than both of them.)
(Figure: RMQ and RMSQ reduce to each other with O(n) preprocessing overhead and O(1) query overhead.)
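
The reverse reduction can be illustrated with a small sketch: after the negative separators are inserted, every segment with more than one element sums to less than one of its own original elements, so the maximum-sum segment in a range is a single element, namely the range maximum. The brute-force search below only stands in for an RMSQ structure. The function names, the 0-based indexing, and the choice of separator value are assumptions.

```python
def augment(x):
    """Insert between neighbours a negative number whose absolute value exceeds both."""
    b = [x[0]]
    for prev, cur in zip(x, x[1:]):
        b.append(-(max(abs(prev), abs(cur)) + 1))
        b.append(cur)
    return b  # x[k] now sits at position 2*k


def range_max_via_rmsq(x, lo, hi):
    """Index of a maximum of x[lo..hi] (0-based, inclusive) via a max-sum segment."""
    b = augment(x)
    first, last = 2 * lo, 2 * hi
    # Stand-in for an RMSQ query with S = E = [first, last].
    total, i, j = max((sum(b[i:j + 1]), i, j)
                      for i in range(first, last + 1)
                      for j in range(i, last + 1))
    return i // 2  # the winning segment is a single original element


x = [3, 1, 4, 1, 5, 9, 2, 6]
idx = range_max_via_rmsq(x, 2, 6)  # 5, since x[5] == 9 is the maximum of x[2..6]
```
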
Use RMSQ Techniques to Solve Two Relevant Problems
• 1. Finding the maximum-sum segment with length constraints in O(n) time.
  - Y.-L. Lin, T. Jiang, K.-M. Chao, 2002
  - T.-H. Fan et al., 2003
• 2. Finding all maximal scoring subsequences in O(n) time.
  - W. L. Ruzzo & M. Tompa, 1999
Problem 1: The Maximum-Sum Segment with Length Constraints
• Lin, Jiang, and Chao [JCSS 2002] and Fan et al. [CIAA 2003] gave O(n)-time algorithms for this problem.
• Length at least L, and at most U.
(Figure: a segment whose length lies between L and U.)
Problem 1: Finding the Maximum-Sum Segment with Length Constraints
• Length at least L, at most U.
• For each index i, issue an RMSQ query for the maximum-sum segment whose starting point lies in [i-U+1, i-L+1] and whose end point is i.
• Runs in O(n) time since each query costs O(1) time.
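
The per-index queries can be sketched as follows. Since the end point is fixed at i, each query reduces to finding the lowest prefix-sum(s-1) for s in [i-U+1, i-L+1]. This sketch swaps in a monotonic deque (sliding-window minimum) for the RMSQ structure, which also gives O(n) total time but is not the method of the talk. The function name is hypothetical and 1 <= L <= U is assumed.

```python
from collections import deque
from itertools import accumulate


def max_sum_with_length(a, L, U):
    """Best (sum, start, end), 1-based, among segments of length in [L, U]."""
    n = len(a)
    P = [0] + list(accumulate(a))
    window = deque()   # candidate starts s; P[s-1] increases from front to back
    best = None
    for i in range(L, n + 1):
        s_new = i - L + 1                        # newest admissible start for end i
        while window and P[window[-1] - 1] >= P[s_new - 1]:
            window.pop()                         # dominated by the new start
        window.append(s_new)
        while window[0] < i - U + 1:
            window.popleft()                     # start is now too far from i
        s = window[0]                            # start with the lowest P[s-1]
        cand = (P[i] - P[s - 1], s, i)
        if best is None or cand[0] > best[0]:
            best = cand
    return best


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
answer = max_sum_with_length(a, L=2, U=3)  # (9, 11, 13)
```
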
Problem 2: All Maximal-Sum Segments
• Ruzzo and Tompa [ISMB 1999] gave an O(n)-time algorithm for this problem.
• Recursive definition.
(Figure: a segment S with the part L(S) to its left and the part R(S) to its right.)
Problem 2: Finding All Maximal Scoring Subsequences
• Recursive calls: find S in the input sequence with an RMSQ query, then recurse on L(S) and R(S).
• Runs in O(n) time since each query costs O(1) time.
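
The recursion can be sketched as below: find a maximum-sum segment S of the current range, report it if its sum is positive, then recurse on the part to its left and the part to its right. A linear scan replaces the O(1) RMSQ query here, so this sketch is quadratic in the worst case. Tie-breaking between equal-sum segments is not treated carefully. The function names and the 1-based ranges are assumptions.

```python
def best_segment(a, lo, hi):
    """Linear scan over a_lo..a_hi (1-based, inclusive); returns (sum, i, j)."""
    best = None
    cur_sum, cur_start = 0, lo
    for k in range(lo, hi + 1):
        if cur_sum <= 0:
            cur_sum, cur_start = a[k - 1], k   # restart at k
        else:
            cur_sum += a[k - 1]
        if best is None or cur_sum > best[0]:
            best = (cur_sum, cur_start, k)
    return best


def maximal_segments(a, lo=1, hi=None):
    """List of (start, end, sum) in left-to-right order."""
    if hi is None:
        hi = len(a)
    if lo > hi:
        return []
    total, i, j = best_segment(a, lo, hi)
    if total <= 0:
        return []                              # no positive segment remains
    return (maximal_segments(a, lo, i - 1)     # recurse on L(S)
            + [(i, j, total)]
            + maximal_segments(a, j + 1, hi))  # recurse on R(S)


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
segments = maximal_segments(a)
# [(1, 1, 9), (3, 9, 8), (11, 13, 9), (15, 15, 3)]
```
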