1 / 80

Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor

Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor. Raymond Chi-Wing Wong (Hong Kong University of Science and Technology) M. Tamer Ozsu (University of Waterloo) Philip S. Yu (University of Illinois at Chicago) Ada Wai-Chee Fu (Chinese University of Hong Kong)

Download Presentation

Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Efficient Method for Maximizing Bichromatic Reverse Nearest Neighbor Raymond Chi-Wing Wong (Hong Kong University of Science and Technology) M. Tamer Ozsu (University of Waterloo) Philip S. Yu (University of Illinois at Chicago) Ada Wai-Chee Fu (Chinese University of Hong Kong) Lian Liu (Hong Kong University of Science and Technology) Presented by Raymond Chi-Wing Wong Presented by Raymond Chi-Wing Wong

  2. Outline • Introduction • Related work – Bichromatic Reverse Nearest Neighbor • Problem - MaxBRNN • Algorithm - MaxOverlap • Empirical Study • Conclusion

  3. 1. Introduction • Bichromatic Reverse Nearest Neighbor (BRNN or RNN) • Given • P and O are two sets of points in the same data space • Problem • Given a point pP, a BRNN query finds all the points oO whose nearest neighbor (NN) in P are p.

  4. o3 o1 p1 o4 o2 p2 o5 1. Introduction Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} NN in P = p1 NN in P = p2 NN in P = p2 NN in P = p1 RNN = {o1, o2} RNN = {o3, o4 , o5} NN in P = p2

  5. o3 o1 p1 p o4 o2 p2 o5 Placement 1 RNN = {o1, o2} 2 1. Introduction Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Placement 1 Suppose that we want to set up a new convenience store p Where should we set up? RNN = {o1, o2} Influence value = 2

  6. o3 o1 p1 p o4 o2 p2 o5 Placement 1 RNN = {o1, o2} 2 1. Introduction Placement 2 RNN = {o1, o2 , o3, o4 , o5} 5 Which placement is better? Placement 2 Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Placement 2 Suppose that we want to set up a new convenience store p Where should we set up? Different placements of p may have different RNN sets RNN = {o1, o2 , o3, o4 , o5} Influence value = 5

  7. o3 o1 p1 p o4 o2 p2 o5 Placement 1 RNN = {o1, o2} 2 1. Introduction Placement 2 RNN = {o1, o2 , o3, o4 , o5} 5 Placement 3 RNN = {o1, o2 , o3, o4 , o5} 5 Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Placement 3 Suppose that we want to set up a new convenience store p Where should we set up? Different placements of p may have the same RNN set RNN = {o1, o2 , o3, o4 , o5} Influence value = 5

  8. o3 o1 p1 o4 o2 p2 o5 Placement 1 RNN = {o1, o2} 2 1. Introduction Placement 2 RNN = {o1, o2 , o3, o4 , o5} 5 Placement 3 RNN = {o1, o2 , o3, o4 , o5} 5 Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Suppose that we want to set up a new convenience store p Where should we set up? Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized.

  9. 1. Introduction • Related Work • Arrangement • Running Time = O(|O| log |P| + |O|2 +2(|O|))where (|O|) is a function on |O| and is (|O|) • Our Proposed Algorithm MaxOverlap • Running Time = O(|O| log |P| + k2 |O| +k |O| log |O|)where k << |O| • Significant improvementon Running Time Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized.

  10. o3 o1 p1 p o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} RNN = {o1, o2 , o3, o4 , o5} Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized.

  11. o3 o1 p1 p o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} RNN = {o1, o2 , o3, o4 , o5} Consistent region Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized. Influence value = 5 For any two possible placements in this region, their RNN sets are the same

  12. o3 o1 p1 o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized.

  13. o3 o1 p1 p o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} RNN = {o1, o2} Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized.

  14. o3 o1 p1 p o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} RNN = {o1, o2 , o3, o4 , o5} Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized. Non-Consistent region

  15. o3 o1 p1 o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Consistent region Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized.

  16. o3 o1 p1 o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Consistent region Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized. Influence value = 5 Many consistent regions!

  17. o3 o1 p1 o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Maximal consistent region Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized. Influence value = 5 There does not exist another consistent region R’ where (1) R’ covers R and (2) the RNN sets of R and R’ are equal

  18. o3 o1 p1 o4 o2 p2 o5 2. Problem Convenience stores NN: Nearest neighbor RNN: Reverse nearest neighbor P = {p1, p2} Customers O = {o1, o2, o3 , o4, o5} Maximal consistent region Maximal consistent region Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized. Influence value = 5 There does not exist another consistent region R’ where (1) R’ covers R and (2) the RNN sets of R and R’ are equal

  19. o3 o1 p1 o4 o2 p2 o5 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. 2. Problem We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Maximal consistent region Problem: We want to find a region R (or area) such that when p is placed in R, the influence value of p is maximized.

  20. o3 o1 p1 o4 o2 p2 o5 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. 2. Problem We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Two challenges: Challenge 1: It is difficult to find a maximal consistent region Challenge 2: We need to return the maximal consistent region with the greatest influence value

  21. p1 o1 o2 p2 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. 2. Problem Nearest location circle (NLC) We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Convenience stores P = {p1, p2} Customers Construct a circle centered at o2 with radius |p2, o2| O = {o1, o2} NN in P = p2 Two challenges: Challenge 1: It is difficult to find a maximal consistent region Challenge 2: We need to return the maximal consistent region with the greatest influence value Construct a circle centered at o1 with radius |p1, o1| NN in P = p1

  22. p1 o1 o2 p2 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. 2. Problem Nearest location circle (NLC) We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Convenience stores P = {p1, p2} Customers O = {o1, o2} Two challenges: Challenge 1: It is difficult to find a maximal consistent region A Challenge 2: We need to return the maximal consistent region with the greatest influence value

  23. p1 o1 o2 p2 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. 2. Problem Nearest location circle (NLC) We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Convenience stores P = {p1, p2} Customers O = {o1, o2} Two challenges: Challenge 1: It is difficult to find a maximal consistent region B A Challenge 2: We need to return the maximal consistent region with the greatest influence value

  24. p1 o1 o2 p2 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. 2. Problem Nearest location circle (NLC) We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Convenience stores P = {p1, p2} Customers O = {o1, o2} Two challenges: Challenge 1: It is difficult to find a maximal consistent region B C A Challenge 2: We need to return the maximal consistent region with the greatest influence value

  25. p1 o1 o2 p2 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. 2. Problem Nearest location circle (NLC) We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Convenience stores P = {p1, p2} Customers O = {o1, o2} Two challenges: D Challenge 1: It is difficult to find a maximal consistent region B C A Challenge 2: We need to return the maximal consistent region with the greatest influence value

  26. p1 o1 o2 p2 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. Four maximal consistent regions 2. Problem Solution: Region A Intersection between two NLCs Nearest location circle (NLC) We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Convenience stores P = {p1, p2} Customers O = {o1, o2} 0 RNN set = {} Two challenges: D Challenge 1: It is difficult to find a maximal consistent region B C A Challenge 2: We need to return the maximal consistent region with the greatest influence value RNN set = {o1} RNN set = {o2} 1 1 RNN set = {o1, o2} 2

  27. p1 o1 o2 p2 Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. Four maximal consistent regions 2. Problem Solution: Region A Intersection between two NLCs Nearest location circle (NLC) We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Lemma: The solution of MaxBRNN can be represented by an intersection of multiple nearest location circles. Two challenges: D Challenge 1: It is difficult to find a maximal consistent region B C A Challenge 2: We need to return the maximal consistent region with the greatest influence value

  28. Problem: We want to find a maximal consistent region R such that when the influence value of R is maximized. 2. Problem We call this problem Maximizing Bichromatic Reverse Nearest Neighbor (MaxBRNN) Two challenges: We propose an algorithm called MaxOverlap Challenge 1: It is difficult to find a maximal consistent region Challenge 2: We need to return the maximal consistent region with the greatest influence value

  29. 3. Algorithm • Make use of the principle of region-to-point transformation • Search a limited number of points • Find the optimal point This optimal point can be mapped to the optimal region in Optimal Region Search Problem Optimal Point Search Problem Optimal Region Search Problem

  30. p3 p4 o4 o3 o5 o2 p2 o6 p5 o1 p1 3. Algorithm Convenience stores P = {p1, p2 , p3 , p4 , p5} Customers O = {o1, o2, o3 , o4, o5 , o6}

  31. 3. Algorithm p3 p4 o4 o3 o5 o2 p2 o6 p5 o1 p1

  32. Solution Intersection of c1, c2 and c3 3. Algorithm NLC c3 o4 o3 o5 NLC c2 o2 o6 The maixmal consistent region which maximizes the RNN set o1 NLC c1 Intersection of c1, c2 and c3

  33. 3. Algorithm • Algorithm MaxOverlap • Three-Step Algorithm

  34. 3. Algorithm Step 1 (Finding Intersection Point) o4 o3 o5 o2 o6 o1

  35. 3. Algorithm Step 1 (Finding Intersection Point) q7 o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1

  36. 3. Algorithm Step 2 (Point Query) q7 o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1 Point query for q4

  37. 3. Algorithm Step 2 (Point Query) c1 Result for q4 = { } q7 o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1 Point query for q4

  38. 3. Algorithm Step 2 (Point Query) c1 Result for q4 = { } , c3 q7 o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1 Point query for q4

  39. 3. Algorithm Step 2 (Point Query) c1 Result for q4 = { } , c3 q7 o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1 Point query for q3

  40. 3. Algorithm Step 2 (Point Query) c1 Result for q4 = { } , c3 c1 Result for q3 = { } q7 o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1 Point query for q3

  41. 3. Algorithm Step 2 (Point Query) c1 Result for q4 = { } , c3 c1 , c2 Result for q3 = { } q7 o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1 Point query for q3

  42. 3. Algorithm Step 2 (Point Query) c1 Result for q4 = { } , c3 c1 , c2 , c3 Result for q3 = { } q7 o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1 Point query for q3

  43. 3. Algorithm Step 2 (Point Query) c1 Result for q4 = { } , c3 c1 , c2 , c3 Result for q3 = { } q7 c1, c2, c3 Result for q1 = { } o4 o3 q6 o5 q8 o2 q9 q1 o6 q4 q3 q5 q2 o1 Point query for q1

  44. 3. Algorithm Step 2 (Point Query) c1 Result for q4 = { } , c3 c1 , c2 , c3 Result for q3 = { } q7 c1, c2, c3 Result for q1 = { } o4 c1, c2, c3 Result for q5 = { } o3 q6 o5 q8 o2 … q9 q1 o6 q4 q3 q5 q2 o1 Point query for q5

  45. 3. Algorithm The intersection of c1, c2 and c3 corresponds to the solution. Step 3 (Finding Maximum Size) c1 Result for q4 = { } , c3 c1 , c2 , c3 Result for q3 = { } q7 c1, c2, c3 Result for q1 = { } o4 c1, c2, c3 Result for q5 = { } o3 q6 o5 q8 o2 … q9 q1 o6 q4 q3 q5 q2 o1 Optimal Point Search Problem Optimal Region Search Problem

  46. 3. Algorithm • Theorem: The running time of algorithm MaxOverlap is O(|O| log |P| + k2|O| + k |O| log |O|)where • k is typically much smaller than |O|

  47. 3. Algorithm • Enhancement 1: We process the intersection points q in a pre-defined order • Enhancement 2: • Step 2 and Step 3 can be combined • We introduce a pruning technique such that some intersection points will not be processed.

  48. 4. Empirical Study • Synthetic Dataset • P: Gaussian distribution • O: Zipfian distribution • Real Dataset • Rtree Portalhttp://www.rtreeportal.org/spatial.html • CA (62,556) • LB (53,145) • GR (23,268) • GM (36,334) • P: one of the above datasets • O: one of the above datasets

  49. 4. Empirical Study • Measurements • Execution Time • Storage • Our proposed algorithms • MaxOverlap-P • MaxOverlap with Pruning • MaxOverlap-NP • MaxOverlap without pruning • Comparison with adapted algorithms • Arrangement • Buffer-Adapt

  50. 4. Empirical Study • Small dataset

More Related