Walking on Minimax Paths for k-NN Search. Kye-Hyeon Kim and Seungjin Choi, Machine Learning Laboratory, POSTECH, Korea. AAAI-13. Motivation: Clustered Data. Motivation: k-NN Search. Motivation: Euclidean Distance. Link-based Measures.
Walking on Minimax Paths for k-NN Search • Kye-Hyeon Kim and Seungjin Choi • Machine Learning Laboratory, POSTECH, Korea • AAAI-13
Motivation: k-NN Search • Query
Link-based Measures • Pseudo-Inverse of Graph Laplacian (’03, ’07) • (Laplacian) Exponential Diffusion Kernel (’03) • Von Neumann Diffusion Kernel (’04) • Euclidean Commute Time Distance (’04) • Regularized Laplacian Kernel (’05) • Markov Diffusion Kernel (’06) • Cross-Entropy Diffusion Matrix (’06) • Random Walk with Restart (’06, ’08) • Regularized Commute Time Kernel (’12) • (See Fouss et al. 2012 and references therein)
Neighborhood Graph • Connect Close Points
Link-based Measures: Example • Connect Close Points
Link-based Measures: k-NN Graph • Time • Connect Close Points
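The "connect close points" step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `knn_graph`, the dictionary-of-dictionaries graph layout, and the choice to make every edge undirected are all assumptions made for the example.

```python
import math

def knn_graph(points, k):
    """Build an undirected k-NN graph as {node: {neighbor: edge_length}}.

    Assumption for this sketch: an edge is kept if either endpoint
    lists the other among its k nearest neighbors.
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Euclidean distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph

# Two well-separated pairs of points on a line.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
graph = knn_graph(points, k=1)
```

With `k=1` each point links only to its partner, so the two pairs stay disconnected. This brute-force construction compares every pair of points and is meant only to show the idea.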
Link-based Measures: Example • More & Shorter Paths → More Similar
Link-based Measures: Computation • Evaluation Needed for All Possible Paths between Nodes • More & Shorter Paths → More Similar
Link-based Measures: Computation • Similarity = Sum of Scores over All Possible Paths between Nodes • More & Shorter Paths → More Similar
Link-based Measures: Computation • Similarity = Sum of Scores over All Possible Paths between Nodes • Naïve Algorithm: Time • Recent Algorithm: Time (Yen, Mantrach, & Shimbo 2008) • More & Shorter Paths → More Similar
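One way to picture "sum of scores over all paths" is a truncated power series over the adjacency matrix, in the style of the von Neumann diffusion kernel listed earlier. The sketch below is an assumption-laden illustration, not any cited algorithm: `walk_sum_similarity`, `alpha`, and `max_len` are hypothetical names, and the sum runs over walks (which may revisit nodes) up to a fixed length.

```python
def mat_mul(A, B):
    """Multiply two square matrices stored as lists of lists."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def walk_sum_similarity(A, alpha, max_len):
    """Return S = sum over t = 1..max_len of alpha**t * A**t.

    (A**t)[i][j] counts walks of length t from i to j, so more walks and
    shorter walks both raise S[i][j] when 0 < alpha < 1.
    """
    n = len(A)
    S = [[0.0] * n for _ in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # A**0
    for t in range(1, max_len + 1):
        power = mat_mul(power, A)  # now holds A**t
        weight = alpha ** t
        for i in range(n):
            for j in range(n):
                S[i][j] += weight * power[i][j]
    return S

# Triangle 0-1-2 with an extra node 3 attached to node 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
S = walk_sum_similarity(A, alpha=0.1, max_len=3)
```

Here node 0 scores higher with node 1 (a direct edge plus a detour through node 2) than with node 3 (reachable only through node 2). Each step multiplies full matrices, which hints at why such measures scale poorly with the number of nodes.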
Shortest Path Distance • Shorter Path Exists → More Similar
Shortest Path Distance: Computation • Distance = Total Length of Shortest Path between Nodes • Dijkstra’s Algorithm: Time • Shorter Path Exists → More Similar
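For reference, a compact version of Dijkstra's algorithm on a weighted graph might look like the sketch below. The graph layout `{node: {neighbor: edge_length}}` and the function name `dijkstra` are assumptions for this example, and edge lengths are taken to be non-negative.

```python
import heapq

def dijkstra(graph, source):
    """Return {node: total length of the shortest path from source}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph[u].items():
            nd = d + w  # path length accumulates by summing edge lengths
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

graph = {
    0: {1: 1.0, 2: 5.0},
    1: {0: 1.0, 2: 1.0},
    2: {0: 5.0, 1: 1.0},
}
dist = dijkstra(graph, 0)
```

In this toy graph the two-hop route 0, 1, 2 (total length 2) beats the direct edge of length 5, so node 2 ends up at distance 2 from node 0.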
Shortest Path Distance: “Shortcuts” • Shortcuts Make Distinct Clusters Closer
Problem Summary & Our Goal • Poor Scalability of Link-based Measures • Poor Robustness of Shortest Path Distance • Link-based Measure on More Reliable Paths than Shortest Paths
New Link-based Measure • Similarity = Sum of New Scores over All Possible Paths between Nodes
New Link-based Measure: Varying • Similarity = Sum of New Scores over All Possible Paths between Nodes • Long Paths (Large Norms) → Low Scores
New Link-based Measures: Example (figure; values shown: 8, 13, 10, 12, 7, 11)
New Link-based Measure: Varying • With a Small Value, Similarity ≈ Sum of Scores over Only a Few Small Paths Between Nodes • In the Limit, Distance = Norm of Smallest Path Between Nodes
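The collapse from "sum over paths" to "smallest path" can be illustrated with a soft-minimum. The sketch below rests on assumptions that are not in the slides: each path is scored as exp(-cost / T), the name `temperature` (T) is hypothetical because the slide's own symbol was lost, and the list of path costs is made up for the example.

```python
import math

def soft_min(costs, temperature):
    """Compute -T * log(sum_i exp(-c_i / T)) in a numerically stable way.

    A large T lets many paths contribute to the sum; as T shrinks toward
    zero, the smallest cost dominates and the value approaches min(costs).
    """
    m = min(costs)
    # Subtract the minimum before exponentiating to avoid overflow.
    total = sum(math.exp(-(c - m) / temperature) for c in costs)
    return m - temperature * math.log(total)

path_costs = [3.0, 4.0, 10.0]  # hypothetical norms of three candidate paths
for T in (5.0, 1.0, 0.1, 0.01):
    print(T, soft_min(path_costs, T))
```

The printed values rise toward 3.0, the cost of the smallest path, as the temperature decreases. Whether the original measure uses exactly this exponential form is not recoverable from the slide text.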
New Link-based Measure: Varying • Loose Path: 1 1 1 1 5 1 1 • Compact Path: 2 2 2 2 2 2 2 2
New Link-based Measure: Varying • With a Large Value: Compact Path Small
New Link-based Measures: Example • Long Links between Distinct Clusters
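The minimax path named in the title scores a path by its longest edge, so a single long link between clusters dominates the cost. The sketch below is a standard bottleneck variant of Dijkstra's algorithm, offered as an illustration only and not as the algorithm of the talk; the name `minimax_distances` and the graph layout `{node: {neighbor: edge_length}}` are assumptions.

```python
import heapq

def minimax_distances(graph, source):
    """Return {node: smallest achievable longest-edge over paths from source}."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        b, u = heapq.heappop(heap)
        if b > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nb = max(b, w)  # a path costs as much as its longest edge
            if nb < best.get(v, float("inf")):
                best[v] = nb
                heapq.heappush(heap, (nb, v))
    return best

# Cluster {0, 1, 2} and cluster {3, 4}, joined by one long link 2-3.
graph = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 6.0},
    3: {2: 6.0, 4: 1.0},
    4: {3: 1.0},
}
best = minimax_distances(graph, 0)
```

Nodes in the same cluster as the source get distance 1, while every node across the long link gets distance 6, however many short hops follow it.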