K-Route Diversification CSci 5980 - Team 6 Reem Ali Amr Magdy University of Minnesota
Diversity: Overview • Select results that cover different possible aspects of the query • Usually supported with top-k queries • e.g., get the top-3 images related to “jaguar”
k-Route Diversification: Motivation • A plethora of spatial data is available • Between a source and a destination, some applications need to select routes that satisfy certain conditions • E.g., fastest paths, fuel-efficient paths, etc. • Summarizing and selecting representative results for spatial queries is becoming more important • Some applications need to select diverse routes • E.g., transporting hazardous material
Problem Statement • Input • n routes between source S and destination D • Integer k: required number of diverse routes • Output • k diverse routes between S and D • Objective • Maximizing the diversity measure among the selected routes
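The problem statement can be illustrated with an exhaustive baseline: score every k-subset and keep the best one. This is only a sketch. The names `best_k_subset` and `min_gap` are hypothetical, and plain numbers stand in for routes so that the example stays self-contained. Because it enumerates every k-subset, it is only practical for small n.

```python
from itertools import combinations

def best_k_subset(items, k, diversity):
    """Exhaustively score every k-subset and return the most diverse one."""
    return max(combinations(items, k), key=diversity)

def min_gap(subset):
    """Toy diversity measure (assumption): smallest pairwise gap between numbers."""
    return min(abs(a - b) for a, b in combinations(subset, 2))

# Hypothetical input: five "routes" reduced to single numbers.
best = best_k_subset([0, 1, 2, 50, 100], 3, min_gap)
print(best)
```

Any diversity measure that maps a subset to a number can be passed in place of `min_gap`.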
Challenges • Selecting k diverse items is NP-hard • Lack of standard datasets for multiple routes between a single source and destination • Diversity algorithms use distance measures that are defined on points rather than routes • e.g., Euclidean distance, road-network distance, etc. • New measures are needed to capture properties of routes
Novelty • Related work addresses diversity in general without addressing the properties of spatial routes • No prior work incorporates diversity indexes for spatial data • E.g., Shannon index, Gini index
Proposed Solution: Overview 3 Alternatives: • Cluster routes into k clusters and select a prototype for each cluster • using k-medoids • Employ the Minack diversity algorithm, adapting an existing objective function • Define a diversity measure for this problem (new objective function)
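The clustering alternative can be sketched with a small swap-based k-medoids. This is a minimal illustration, not the team's implementation. It assumes a precomputed symmetric distance matrix `dist`, starts from the first k items, and uses 1-D positions as a hypothetical stand-in for routes. The returned medoids play the role of the cluster prototypes.

```python
def total_cost(dist, medoids):
    """Sum over all items of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))

def k_medoids(dist, k):
    """Swap-based k-medoids: accept any medoid/non-medoid swap that lowers the cost."""
    n = len(dist)
    medoids = list(range(k))          # naive initialization (assumption)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                if total_cost(dist, trial) < total_cost(dist, medoids):
                    medoids, improved = trial, True
    return sorted(medoids)

# Hypothetical toy data: three well-separated groups on a line.
positions = [0, 1, 2, 10, 11, 12, 50, 51, 52]
dist = [[abs(a - b) for b in positions] for a in positions]
prototypes = k_medoids(dist, 3)
print([positions[i] for i in prototypes])
```

The loop stops because every accepted swap strictly lowers the total cost and there are finitely many medoid sets.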
Minack Diversity Algorithm • General algorithm applicable to any set of items • Objective function: • Maximize the minimum pair-wise distance between two items • Distance between 2 routes: average pair-wise node distance • Works in two phases: • Init Phase: Get an initial solution of k items • Refine Phase: For each remaining item, check if considering this item improves the objective function • [Figure: min dist illustrated for k=3]
Diversity Measures: Shannon Entropy • Popular diversity index in ecology • Reflects how many different types there are in a dataset, and how evenly entities are distributed among those types • Increases when the number of types increases and when evenness increases
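The two phases can be sketched as follows. This loosely follows the phases named on the slide and is not the published algorithm. The first-k initialization, the swap rule in the refine phase, and the use of Euclidean distance between (x, y) nodes are all assumptions made for illustration.

```python
import math
from itertools import combinations

def route_distance(r1, r2):
    """Average Euclidean distance over all node pairs (one node from each route)."""
    total = sum(math.dist(a, b) for a in r1 for b in r2)
    return total / (len(r1) * len(r2))

def min_pairwise(routes):
    """Objective: the smallest distance between any two selected routes."""
    return min(route_distance(a, b) for a, b in combinations(routes, 2))

def diversify(routes, k):
    """Two-phase sketch: take the first k routes, then refine by swapping."""
    selected = list(routes[:k])                  # init phase
    for candidate in routes[k:]:                 # refine phase
        best_score, best_idx = min_pairwise(selected), None
        for i in range(k):
            trial = selected[:i] + [candidate] + selected[i + 1:]
            score = min_pairwise(trial)
            if score > best_score:               # keep only strict improvements
                best_score, best_idx = score, i
        if best_idx is not None:
            selected[best_idx] = candidate
    return selected

# Hypothetical toy data: horizontal two-node routes at different heights.
routes = [[(0, y), (10, y)] for y in (0, 1, 2, 50, 100)]
picked = diversify(routes, 3)
print(sorted(r[0][1] for r in picked))
```

Each candidate is compared against the current selection only once, so the result depends on the input order.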
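For reference, the Shannon index over a list of type labels can be computed as below. This is a minimal sketch. The function name and the choice of log base 2 are assumptions; base 2 matches the entropy values used in the worked examples later in the deck.

```python
import math
from collections import Counter

def shannon_entropy(labels, base=2):
    """Shannon entropy H = -sum(p_i * log(p_i)) over the type frequencies."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total, base) for c in counts.values())

even = shannon_entropy(["a", "b", "c", "d"])      # four types, evenly spread
uneven = shannon_entropy(["a", "a", "a", "b"])    # two types, skewed
print(even, uneven)
```

The even sample scores higher on both counts: it has more types, and they are evenly represented.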
Entropy-based K-Route Diversification • Route diversity increases: • With spatial spread • With uniform distribution of routes
Entropy-based K-Route Diversification • Proposed Diversity Measure Where:
Entropy-based K-Route Diversification • Proposed Diversity Measure Region 1
Entropy-based K-Route Diversification • Proposed Diversity Measure Region 2
Entropy-based K-Route Diversification Figure 1: • p1 = (20/2)/20 = 1/2 • p2 = (20/2)/20 = 1/2 • Entropy = 1 Figure 2: • p1 = ((15+20)/2)/30 = 17.5/30 • p2 = ((20+5)/2)/30 = 12.5/30 • Entropy ≈ 0.98 More uniformly distributed
Entropy-based K-Route Diversification Figure 1: • p1 = (40/1)/80 = 1/2 • p2 = (40/1)/80 = 1/2 • Entropy = 1 Figure 2: • p1 = (60/1)/180 = 1/3 • p2 = (60/1)/180 = 1/3 • p3 = (60/1)/180 = 1/3 • Entropy ≈ 1.585 Larger Spatial Spread
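The formula itself did not survive in this transcript, but the worked examples are consistent with the following reading: each region gets a per-region total divided by a per-region count, these averages are normalized to sum to 1, and the base-2 entropy of the resulting p values is taken. The sketch below encodes that reading only. What the totals measure and the function name `region_entropy` are assumptions, and the count appears to be the number of routes in the region.

```python
import math

def region_entropy(totals, counts):
    """Entropy over regions, with p_i = (totals[i] / counts[i]) / sum of those averages."""
    averages = [t / c for t, c in zip(totals, counts)]
    norm = sum(averages)
    probs = [a / norm for a in averages]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Inputs copied from the worked examples on the slides.
uniform = region_entropy([20, 20], [2, 2])            # p = 1/2, 1/2
skewed = region_entropy([15 + 20, 20 + 5], [2, 2])    # p = 17.5/30, 12.5/30
two_regions = region_entropy([40, 40], [1, 1])        # p = 1/2, 1/2
three_regions = region_entropy([60, 60, 60], [1, 1, 1])  # p = 1/3 each
print(round(uniform, 2), round(skewed, 2), round(two_regions, 2), round(three_regions, 3))
```

Under this reading the value rises both when the p values become more even and when routes occupy more regions, which matches the two properties stated for route diversity.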
Experimental Evaluation • Diversity (using min-distance) versus Clustering • Tried k = 3, 5, 8, 10 on different datasets (10 to 100 routes) • K=3 out of 20 • min-dist (Diversity) = 59.43 • min-dist (Clustering) = 5.78
Experimental Evaluation • Diversity (using min-distance) versus Clustering • K=5 out of 100 • min-dist (Diversity) = 3954.84 • min-dist (Clustering) = 238.54
Experimental Evaluation • Diversity (using min-distance) versus Clustering • K=10 out of 100 • min-dist (Diversity) = 1867.51 • min-dist (Clustering) = 155.05
Experimental Evaluation • Diversity (using min-distance) versus Entropy-based Diversity • K=3 out of 10 • Min-distance solution: min-dist = 10.73, Entropy = 0.98 • Entropy-based solution: min-dist = 4.23, Entropy = 1.58
Experimental Evaluation • Diversity (using min-distance) versus Entropy-based Diversity • K=5 out of 20 • Min-distance solution: min-dist = 45.67, Entropy = 0.66 • Entropy-based solution: min-dist = 5.01, Entropy = 1.57
Experimental Evaluation • Diversity (using min-distance) versus Entropy-based Diversity • K=10 out of 100 • Min-distance solution: min-dist = 1867.5, Entropy = 0.97 • Entropy-based solution: min-dist = 187.97, Entropy = 0.99
Conclusion • Diversity algorithms are better suited to route diversification than clustering approaches. • Maximizing the minimum distance does not guarantee a more diverse solution. • The proposed measure is less sensitive to differences in the uniformity of route distribution when each region has roughly the same number of routes. • As the difference in entropy values grows, the higher-entropy solution becomes increasingly better than the lower-entropy one. • The variance of pair-wise distances may better capture the uniformity of routes.
Thank You. Questions