Finding k-Dominant Skylines in High Dimensional Space (SIGMOD 2006) • K-Dominant Skyline Computation by Using Sort-Filtering Method (PAKDD 2009)
Outline • Motivation • Definition • Analysis • One-scan • Two-scan • Sorted Retrieval • Sort-Filtering Method • Experimental Result • Conclusion
Motivation • The number of skyline points may be huge in a high-dimensional space. • A new concept, the k-dominant skyline, is introduced to alleviate the effect of the curse of dimensionality on skyline queries in high-dimensional spaces.
Definition • d-dimensional space S = {s1, s2, …, sd}. • A set of points D = {p1, p2, …, pn} is a data set on S if every pi ∈ D is a d-dimensional data point on S. • Each dimension si has a total order relationship, denoted ≻i; we assume > here (larger values are better).
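Definition • A point p dominates a point q if p is at least as good as q on every dimension and strictly better on at least one. • p2 dominates p5.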
Definition • p5 is dominated by p2, so p5 is not a skyline point. • SP(D,S) = {p1, p2, p3, p4}
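The dominance test and the skyline SP(D,S) can be sketched in a few lines of Python. The slide's data table is not reproduced in this transcript, so the three-dimensional points below are hypothetical. The helper names `dominates` and `skyline` are likewise illustrative. Larger values are assumed to be better, as in the definition above.

```python
from typing import List, Tuple

Point = Tuple[float, ...]

def dominates(p: Point, q: Point) -> bool:
    """p dominates q: p >= q on every dimension and p > q on at least one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def skyline(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (quadratic scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical data, not the table from the slides.
data = [(5, 1, 3), (4, 4, 4), (1, 5, 2), (3, 3, 3)]
result = skyline(data)
print(result)
```

Here (3, 3, 3) drops out because (4, 4, 4) beats it on every dimension, while the other three points each win somewhere and therefore stay.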
Definition • Assume k = 5. • p1 is better than p4 on s1, s2, s3, s5, and s6, so p1 5-dominates p4.
Definition • p1 cannot be 5-dominated by any other point, so p1 is a 5-dominant skyline point.
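A minimal sketch of k-dominance follows, assuming the usual reading: p k-dominates q if p is at least as good as q on some k dimensions and strictly better on at least one of them. The six-dimensional values are hypothetical, chosen only to reproduce the relationships the slides state (p1 5-dominates p4, p2 dominates p5). They are not the original table.

```python
def k_dominates(p, q, k):
    """p k-dominates q: p >= q on at least k dimensions, strictly on one."""
    at_least_as_good = sum(a >= b for a, b in zip(p, q))
    strictly_better = any(a > b for a, b in zip(p, q))
    return at_least_as_good >= k and strictly_better

def k_dominant_skyline(points, k):
    """Naive quadratic check: keep points that nobody k-dominates."""
    return [p for p in points if not any(k_dominates(q, p, k) for q in points)]

# Hypothetical 6-dimensional data (larger is better).
p1 = (5, 5, 3, 3, 1, 1)
p2 = (1, 1, 5, 5, 3, 3)
p3 = (3, 3, 1, 1, 5, 5)
p4 = (2, 1, 2, 1, 2, 1)
p5 = (1, 1, 4, 1, 3, 3)
D = [p1, p2, p3, p4, p5]

print(k_dominates(p1, p4, 5))    # p1 is at least as good on five dimensions
print(k_dominant_skyline(D, 5))
```

With k = d = 6 the same function returns the ordinary skyline, since 6-dominance on six dimensions is plain dominance.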
Analysis • A user who wants more choices should pick a larger k. • p1 5-dominates p4, and therefore p1 also 4-dominates p4.
Analysis • k = 5: the 5-dominant skyline points are p1, p2, p3. • k = 6: the 6-dominant skyline points are p1, p2, p3, p4.
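The effect of k on the result size can be checked directly. The sketch below counts the k-dominant skyline points for every k on a hypothetical six-dimensional data set (the values are assumptions, not the slide's table). Because k-dominance implies (k−1)-dominance, the count can only grow as k grows. For small k it can even reach zero when points k-dominate each other in a cycle.

```python
def k_dominates(p, q, k):
    """p k-dominates q: p >= q on at least k dimensions, strictly on one."""
    return (sum(a >= b for a, b in zip(p, q)) >= k
            and any(a > b for a, b in zip(p, q)))

def k_dominant_skyline(points, k):
    return [p for p in points if not any(k_dominates(q, p, k) for q in points)]

# Hypothetical data (larger is better), d = 6.
D = [
    (5, 5, 3, 3, 1, 1),
    (1, 1, 5, 5, 3, 3),
    (3, 3, 1, 1, 5, 5),
    (2, 1, 2, 1, 2, 1),
    (1, 1, 4, 1, 3, 3),
]

# Number of k-dominant skyline points for each k from 1 to d.
sizes = {k: len(k_dominant_skyline(D, k)) for k in range(1, 7)}
print(sizes)
```

On this data the first three points 4-dominate one another in a cycle, so nothing survives for k ≤ 4, three points survive for k = 5, and four for k = 6.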
One-Scan • Skyline points: P1, P2, P3, and P4 (the free skyline points). • k = 3: • P1 3-dominates P2, and P1 is 3-dominated by P4. • P2 3-dominates P3. • P4 3-dominates P3. • The 3-dominant skyline point P4 belongs to the free skyline. • P2 is not a 3-dominant skyline point: it is 3-dominated by P1, even though P1 is itself not a 3-dominant skyline point.
One-Scan • Thus, based on Lemma 4.1, our algorithm computes k-dominant skyline points by actually computing the free skyline points in D and using them to eliminate non-k-dominant skyline points. • 1. R stores the set of intermediate k-dominant skyline points in D. • 2. T stores the set of intermediate skyline points in D that are not k-dominant (i.e., not in R). • Together, R ∪ T gives the set of skyline points in D.
One-Scan • Each point p in D is first compared against the points in T. 1. If a point p′ ∈ T is dominated by p (p′ is not a skyline point), then p′ is removed from T. 2. If a point p′ ∈ T dominates p (p is not a skyline point) or p = p′ (p is not unique), then p is ignored. • Case 1: p is a unique skyline point, so it is compared against the points in R to check k-dominance.
One-Scan • For each point p′ in R: 1. If p k-dominates p′, then p′ is moved from R to T. 2. If p′ k-dominates p, then p is not k-dominant. • After p has been compared against all points in R: if p is not k-dominated, insert it into R; if p is k-dominated, insert it into T.
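The two comparison phases above can be sketched as follows. This is a reading of the slides, not the authors' code. It makes one explicit assumption beyond them: p is also tested for k-dominance against the members of T. Without that test, a point k-dominated only by a member of T would be reported. The data values are hypothetical.

```python
def dominates(p, q):
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def k_dominates(p, q, k):
    return (sum(a >= b for a, b in zip(p, q)) >= k
            and any(a > b for a, b in zip(p, q)))

def one_scan(points, k):
    R, T = [], []  # R: k-dominant candidates, T: other retained points
    for p in points:
        # Phase 1: compare p against T.
        if any(dominates(t, p) or t == p for t in T):
            continue  # p is dominated or a duplicate: ignore it
        T = [t for t in T if not dominates(p, t)]
        # Assumption (not on the slide): members of T may also k-dominate p.
        k_dominated = any(k_dominates(t, p, k) for t in T)
        # Phase 2: compare p against R.
        still_in_R = []
        for r in R:
            if k_dominates(r, p, k):
                k_dominated = True
            if k_dominates(p, r, k):
                T.append(r)          # r is no longer k-dominant
            else:
                still_in_R.append(r)
        R = still_in_R
        (T if k_dominated else R).append(p)
    return R, T

# Hypothetical data (larger is better), d = 6.
p1, p2, p3 = (5, 5, 3, 3, 1, 1), (1, 1, 5, 5, 3, 3), (3, 3, 1, 1, 5, 5)
p4, p5 = (2, 1, 2, 1, 2, 1), (1, 1, 4, 1, 3, 3)
R, T = one_scan([p1, p2, p3, p4, p5], 5)
print("R:", R)
print("T:", T)
```

As in the slides, a point that is dominated by a member of R is still appended to T, because phase 2 only tests k-dominance.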
One-Scan • k = 5 • p1: initially, p1 is inserted into R. T: {}, R: {p1}
One-Scan • p2: T: {}, R: {p1} -> T is empty, so check the points in R. p2 is not 5-dominated by p1, and p1 is not 5-dominated by p2 -> p2 is inserted into R. T: {}, R: {p1,p2}
One-Scan • p3: T: {}, R: {p1,p2} -> T is empty, so check the points in R. p3 is not 5-dominated by p1 or p2, and neither p1 nor p2 is 5-dominated by p3 -> p3 is inserted into R. T: {}, R: {p1,p2,p3}
One-Scan • p4: T: {}, R: {p1,p2,p3} -> T is empty, so check the points in R. p4 is 5-dominated by p1, p2, and p3 -> p4 is inserted into T. T: {p4}, R: {p1,p2,p3}
One-Scan • p5: T: {p4}, R: {p1,p2,p3} -> check the points in T. • p5 does not dominate p4, and p4 does not dominate p5 -> check the points in R. • p5 is 5-dominated by p2 and p3 -> p5 is inserted into T. • T: {p4,p5}, R: {p1,p2,p3}
One-Scan • Note: p5 is dominated by p2, so p5 is not a skyline point, yet it ends up in T.
Two Scan • In the One-Scan algorithm, the free skyline points (i.e., T) need to be maintained to compute the k-dominant skyline points. • Scanning D twice avoids the need to maintain T. • The first scan of D computes a set R of candidate k-dominant skyline points. • Based on Lemma 4.1 p2, false positives can exist in R. • The second scan compares the points in D − R against R to determine whether each candidate is indeed a k-dominant skyline point.
Two Scan • k = 3 • First scan: initially, p1 is inserted into R. R = {p1}
Two Scan • k = 3 • First scan: p2 is compared against the points in R = {p1}. p2 3-dominates p1, so p1 is removed from R and p2 is inserted into R. R = {p2}
Two Scan • k = 3 • First scan: p3 is compared against the points in R = {p2}. p3 is 3-dominated by p2, so it is not inserted. R = {p2}
Two Scan • k = 3 • First scan: p4 is compared against the points in R = {p2}. p4 is inserted into R. R = {p2,p4}
Two Scan • k = 3 • Second scan: R = {p2,p4}, D − R = {p1,p3}. p1 is compared against the points in R = {p2,p4}; nothing is removed. R = {p2,p4}
Two Scan • k = 3 • Second scan: R = {p2,p4}, D − R = {p1,p3}. p3 is compared against the points in R = {p2,p4}. p3 3-dominates p4 (a false positive), so p4 is removed from R. R = {p2} • 3-dominant skyline point: p2
Sorted Retrieval • Initially, T = D. • p3 and p4 are 4-dominated, so they are removed from T.
Sorted Retrieval • d − k + 1 = 6 − 4 + 1 = 3 • p1 is a 4-dominant skyline point, so it is moved from T to R.
Sort-Filtering Method • K-Dominant Skyline Algorithm (calculation from k = d): 1. Domination Power Calculation 2. k-Dominant Checking
Sort-Filtering Method • Domination Power Calculation • Example: p(9,1,2) and q(3,2,3) in 3-D space (smaller values are better). • Domination power: p = 2, q = 1. • sum(p) = 12, sum(q) = 8. • sum(p) > sum(q), but the domination power of p is greater than that of q: p 2-dominates q.
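The example can be reproduced with a short sketch. It assumes that an object's domination power is the number of dimensions on which it holds the best (here: smallest) value in the data set. The slide's two-object example is consistent with that reading, but the transcript does not state the definition. The helper name `domination_power` and the sort key are illustrative.

```python
def domination_power(points):
    """For each point, count the dimensions where it has the minimum value."""
    best = [min(column) for column in zip(*points)]
    return [sum(v == b for v, b in zip(p, best)) for p in points]

p = (9, 1, 2)
q = (3, 2, 3)
objects = [q, p]

powers = domination_power(objects)
sums = [sum(o) for o in objects]

# Sort by domination power (descending), breaking ties by sum (ascending).
ranked = [o for _, _, o in sorted(zip([-w for w in powers], sums, objects))]
print(powers, sums, ranked)
```

Sorting by sum alone would place q first, since 8 < 12. Sorting by domination power first places p at the front, ahead of the object it 2-dominates.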
Sort-Filtering Method • Domination Power Calculation • Calculate the domination power and the sum for each object.
Sort-Filtering Method • Domination Power Calculation
Sort-Filtering Method • k-Dominant Checking • Consider the 5-dominant case. • N5, N3, N8, N1, and N6 are 5-dominated by the first object, N2; remove the 5-dominated objects and output N2.
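The checking step can be approximated by a conservative sketch: sort the objects by domination power and sum, then test each object against the sorted list. This is not the authors' filtering procedure, whose details the transcript omits. The sketch never discards a comparison partner, because k-dominance is not transitive; the sort only helps `any` stop early. Smaller values are assumed to be better, and the data are hypothetical.

```python
def k_dominates(p, q, k):
    """Smaller is better: p <= q on at least k dimensions, strictly on one."""
    return (sum(a <= b for a, b in zip(p, q)) >= k
            and any(a < b for a, b in zip(p, q)))

def sort_filter(points, k):
    best = [min(column) for column in zip(*points)]

    def power(p):
        # Assumed definition: dimensions where p holds the minimum value.
        return sum(v == b for v, b in zip(p, best))

    # Strong dominators first, so a k-dominated object is rejected quickly.
    order = sorted(points, key=lambda p: (-power(p), sum(p)))
    return [p for p in order if not any(k_dominates(q, p, k) for q in order)]

# Hypothetical 3-D data (smaller is better).
objects = [(3, 2, 3), (4, 4, 4), (9, 1, 2)]
two_dominant = sort_filter(objects, 2)
three_dominant = sort_filter(objects, 3)
print(two_dominant, three_dominant)
```

With k = 2 only (9, 1, 2) survives, and with k = 3 the ordinary skyline of two objects is returned.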
Conclusion • Use domination power to find the k-dominant skyline? • Choose k to reduce the number of k-dominant skyline points.