
Learning Trajectory Patterns by Clustering: Comparative Evaluation


Presentation Transcript


  1. Learning Trajectory Patterns by Clustering: Comparative Evaluation Group D

  2. Problem Description & Definition

  3. Problem Description & Definition • Preprocessing: grid quantization • Clustering: distance/similarity measures (modified Euclidean distance, dynamic time warping, and longest common subsequence) and clustering algorithms (divisive bisection, agglomerative, and min-cut graph-based, with the number of clusters predefined) • Clustering validation: ground-truth based, using the Hungarian algorithm to match the generated clusters against the ground-truth clusters

  4. Preprocessing • Normalization • Grid quantization (figure: grid quantization with cell size s = 2)
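A minimal sketch of the grid quantization step: each normalized (x, y) point is mapped to the index of the grid cell it falls in, turning a trajectory into a sequence of cell symbols. The cell size s = 2 comes from the slide; the grid width and the row-major flattening convention are assumptions for illustration only.

```python
import numpy as np

def grid_quantize(trajectory, s=2, grid_width=10):
    """Map each normalized (x, y) point to a grid-cell symbol.

    trajectory : (N, 2) array of normalized coordinates
    s          : grid cell size (s = 2 on the slide)
    grid_width : cells per row (assumed value, for illustration only)
    """
    cells = np.floor(np.asarray(trajectory, dtype=float) / s).astype(int)
    # Flatten (col, row) indices into a single symbol per point (row-major)
    return cells[:, 1] * grid_width + cells[:, 0]

# Example: a short trajectory quantized with cell size 2
traj = np.array([[0.5, 1.0], [2.3, 1.1], [4.8, 3.2]])
print(grid_quantize(traj))  # -> [ 0  1 12]
```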

  5. Preprocessing • Computational complexity reduction • Entry and exit detection by clustering the starting and ending points of each trajectory (k-means with k = 4) (figure: the four detected entry/exit zones, Locations 1–4)
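A minimal sketch of the entry/exit detection step, assuming scikit-learn is available: the first and last points of every trajectory are clustered with k-means (k = 4 on the slide) to recover the entry and exit locations.

```python
import numpy as np
from sklearn.cluster import KMeans

def entry_exit_zones(trajectories, k=4):
    """Cluster trajectory start/end points into k entry and exit locations.

    trajectories : list of (N_i, 2) arrays of (x, y) positions
    Returns the fitted k-means models for entry points and exit points.
    """
    starts = np.array([t[0] for t in trajectories])   # first point of each trajectory
    ends = np.array([t[-1] for t in trajectories])    # last point of each trajectory
    entry_km = KMeans(n_clusters=k, n_init=10).fit(starts)
    exit_km = KMeans(n_clusters=k, n_init=10).fit(ends)
    return entry_km, exit_km
```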

  6. Distance Metrics • Modified Euclidean distance, defined for trajectories of unequal length (m > n) • Dynamic time warping: DTW compares unequal-length signals by finding a time warping that minimizes the total distance between matched points
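A minimal sketch of the dynamic time warping distance described above: a dynamic-programming table aligns two trajectories of unequal length so that the total distance between matched points is minimized. The point-wise Euclidean cost is an assumption; the slides do not spell out the exact cost function.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between trajectories a (m x 2) and b (n x 2) of unequal length."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    m, n = len(a), len(b)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # point-to-point cost
            D[i, j] = cost + min(D[i - 1, j],            # insertion
                                 D[i, j - 1],            # deletion
                                 D[i - 1, j - 1])        # match
    return D[m, n]
```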

  7. Distance Metrics • Longest common subsequence (LCSS) • Example: s1 = {a, b, c, d, e, f}; s2 = {b, d, e, f, m, n}; LCSS(s1, s2) = {b, d, f} • δ is a constant that controls how far back in time we may look for matches, and ε is a constant that controls the size of the spatial proximity within which two points count as a match
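A minimal sketch of LCSS for trajectories, using the δ and ε constants defined on the slide (δ bounds the index offset between matched samples, ε bounds their spatial distance). Turning the match count into a distance via 1 − LCSS/min(m, n) is a common convention and an assumption here.

```python
import numpy as np

def lcss_length(a, b, delta=3, eps=8):
    """Length of the longest common subsequence of trajectories a and b.

    delta : max index offset between matched points (temporal window)
    eps   : max spatial distance for two points to count as a match
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    m, n = len(a), len(b)
    L = np.zeros((m + 1, n + 1), dtype=int)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if abs(i - j) <= delta and np.linalg.norm(a[i - 1] - b[j - 1]) < eps:
                L[i, j] = L[i - 1, j - 1] + 1            # points match
            else:
                L[i, j] = max(L[i - 1, j], L[i, j - 1])  # skip one point
    return L[m, n]

def lcss_distance(a, b, delta=3, eps=8):
    # Assumed normalization: 1 - (match count / length of the shorter trajectory)
    return 1.0 - lcss_length(a, b, delta, eps) / min(len(a), len(b))
```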

  8. Distance to Similarity Metrics • Gaussian kernel function: a similarity matrix S = {s_ij}, representing a fully connected graph, is constructed from the trajectory distances using a Gaussian kernel, where D is one of the distance measures defined previously and the parameter σ describes the trajectory neighborhood. Large values of σ cause trajectories that are far apart to still receive a high similarity score, while small values lead to a sparser similarity matrix (more entries become very small). (Figure: DTW similarity matrices for σ = 0.1, 0.9, 2.1, 4.1, 7.1)
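A minimal sketch of the distance-to-similarity conversion. The slide does not give the exact kernel expression, so the common form s_ij = exp(−d_ij² / (2σ²)) is assumed; σ controls the trajectory neighborhood exactly as described above.

```python
import numpy as np

def similarity_matrix(D, sigma=2.1):
    """Convert a pairwise distance matrix D into a similarity matrix S.

    Assumed kernel: s_ij = exp(-d_ij^2 / (2 * sigma^2)).
    Larger sigma -> distant trajectories still receive noticeable similarity;
    smaller sigma -> a sparser (near-zero) similarity matrix.
    """
    D = np.asarray(D, dtype=float)
    return np.exp(-np.square(D) / (2.0 * sigma ** 2))
```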

  9. Clustering Methods (CLUTO) • Divisive: top-down clustering in which the entire trajectory training set starts as a single cluster. The K clusters are obtained by K − 1 repeated bisections, where each bisection produces an optimal 2-way division of the similarity matrix. In addition to ensuring local optimality, a global optimization step refines the solution across all bisections. • Agglomerative: a bottom-up strategy that initially treats each trajectory as an individual cluster and merges similar clusters hierarchically in a tree-like structure, stopping when only K clusters remain. • Graph (min-cut): like the divisive method, graph methods divide the full dataset into individual clusters, but instead of operating directly on the similarity matrix, a nearest-neighbor graph is constructed in which each trajectory is a vertex connected by weighted edges to its most similar trajectories. The K clusters are found with a min-cut partitioning algorithm, which finds a division of the graph with minimal loss of edge weight.
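The experiments use CLUTO's scluster for all three methods; as an illustration only, the agglomerative variant can be sketched with SciPy's hierarchical clustering on the precomputed distances (average linkage is an assumption, not necessarily CLUTO's merge criterion).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def agglomerative_clusters(D, k=18):
    """Bottom-up agglomerative clustering from a symmetric distance matrix D.

    Each trajectory starts as its own cluster; clusters are merged until
    only k remain (k = 18 for the Lankershim dataset).
    """
    condensed = squareform(D, checks=False)        # condensed pairwise distances
    Z = linkage(condensed, method='average')       # assumed merge criterion
    return fcluster(Z, t=k, criterion='maxclust')  # cut the tree at k clusters
```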

  10. Clustering Validation • The ground-truth clusters (c1, c2, c3) are matched against the clusters to be evaluated with the Hungarian algorithm, which maximizes the number of matched trajectories • Accuracy = n_matched / n_total (figure: matching between ground-truth clusters and generated clusters)
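A minimal sketch of the ground-truth validation, assuming SciPy's linear_sum_assignment as the Hungarian solver: a contingency table counts the overlap between every ground-truth/generated cluster pair, the assignment maximizes the total overlap, and accuracy is n_matched / n_total.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(labels_true, labels_pred):
    """Accuracy after Hungarian matching of predicted clusters to ground truth."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    true_ids, pred_ids = np.unique(labels_true), np.unique(labels_pred)
    # Contingency table: trajectories shared by each (true, predicted) cluster pair
    C = np.array([[np.sum((labels_true == t) & (labels_pred == p))
                   for p in pred_ids] for t in true_ids])
    rows, cols = linear_sum_assignment(-C)  # negate to maximize total overlap
    return C[rows, cols].sum() / labels_true.size
```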

  11. Evaluation • CLUTO: a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the resulting clusters; its standalone program scluster is used to cluster the trajectories • Dataset: Lankershim dataset, 1032 trajectories, 18 ground-truth clusters

  12. Evaluation-Distance Metrics • How the size of the Gaussian kernel influences the conversion from distance matrix to similarity matrix: σ should be large enough (figure: accuracy vs. σ for DTW + agglomerative)

  13. Evaluation-Distance Metrics • How the size of the Gaussian kernel influences the conversion from distance matrix to similarity matrix: σ should be large enough (figure: accuracy vs. σ for DTW + divisive)

  14. Evaluation-Distance Metrics • How the size of the Gaussian kernel influences the conversion from distance matrix to similarity matrix: σ should be large enough (figure: accuracy vs. σ for modified Euclidean + divisive)

  15. Evaluation-Distance Metrics • How the (δ, ε) parameters of LCSS influence the clustering results (figure: accuracy vs. δ for LCSS + graph)

  16. Evaluation-Clustering • How the (δ, ε) parameters of LCSS influence the clustering results (figure: accuracy vs. ε for LCSS + graph)

  17. Evaluation-Clustering • Parameter settings: modified Euclidean and DTW with σ = 7.1; LCSS with δ = 3, ε = 8 • Legend: d1 = modified Euclidean, d2 = DTW, d3 = LCSS; c1 = divisive, c2 = agglomerative, c3 = graph (figure: accuracy for each distance/clustering combination)

  18. Conclusion • Distance metric computational complexity: d1 < d3 < d2 • Distance metric distinguishability: d1 < d2 < d3 • Clustering capability: c2 < c3 ≈ c1 • Clustering computational complexity: c1 < c3 ≈ c2 • Overall, d3 (LCSS) + c3 (graph) is the best combination

  19. Demo

  20. Thanks 
