Computational Movement Analysis, Lecture 2: Clustering. Joachim Gudmundsson. Fundamental tools: clustering. Clustering: Group similar objects into clusters. Clustering: Group similar (sub)curves into clusters. Similarity measure: Fréchet distance.
Computational Movement Analysis. Lecture 2: Clustering. Joachim Gudmundsson
Fundamental tools: clustering Clustering: Group similar objects into clusters.
Fundamental tools: clustering Clustering: Group similar (sub)curves into clusters. Similarity measure: Fréchet distance. Question: Do we need any constraints on a cluster? Constraints on subcurves in a cluster?
Aim: Cluster subcurves Cluster of subcurves
Recall: Fréchet Distance Fréchet Distance measures the similarity of two curves. Dog walking example • Person is walking his dog (person on one curve and the dog on other) • Allowed to control their speeds but not allowed to go backwards! • Fréchet distance of the curves: minimal leash length necessary for both to walk the curves from beginning to end
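Recall: Fréchet Distance Input: Two polygonal chains $P = p_1, \dots, p_n$ and $Q = q_1, \dots, q_m$ in $\mathbb{R}^d$. The Fréchet distance between P and Q is: $\delta_F(P,Q) = \inf_{\alpha,\beta} \max_{t \in [0,1]} \lVert \alpha(t) - \beta(t) \rVert$, where $\alpha: [0,1] \to P$ and $\beta: [0,1] \to Q$ range over all continuous non-decreasing reparametrizations. Note that $\alpha(0)=p_1$, $\alpha(1)=p_n$, $\beta(0)=q_1$ and $\beta(1)=q_m$. Well-suited for the comparison of curves since it takes the continuity of the curves into account.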
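A minimal sketch of the leash idea, assuming the discrete Fréchet distance as a stand-in: person and dog may only stop at the vertices of P and Q. This is a simplification, not the continuous measure defined above (it never underestimates the continuous distance). The name `discrete_frechet` and the sample curves are illustrative assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def discrete_frechet(P: Sequence[Point], Q: Sequence[Point]) -> float:
    """Discrete Frechet distance between the vertex sequences P and Q."""
    n, m = len(P), len(Q)
    if n == 0 or m == 0:
        raise ValueError("both curves need at least one vertex")
    # ca[i][j] = shortest leash that lets both walkers reach P[i] and Q[j]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Neither walker may go backwards: only three predecessors.
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Two parallel horizontal curves one unit apart (made-up data).
P = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
Q = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_frechet(P, Q))  # prints 1.0
```

The table has one entry per vertex pair, so the running time is O(mn), matching the size of the free space diagram used for the continuous version.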
Decision algorithm: compute path Algorithm: 1. Compute the free space diagram: mn cells, O(mn) time. 2. Compute a monotone path (non-decreasing in both x and y) from (q1,p1) to (qm,pn): build network, O(mn) time; find a path, O(mn) time. (Figure: free space diagram with axes P and Q, from (q1,p1) to (qm,pn).)
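The two steps can be sketched for polygonal curves in the plane as follows. This is a sketch under stated assumptions, not the lecture's implementation: instead of building an explicit network it propagates reachable intervals cell by cell, which answers the same question in O(mn) time. Here P indexes the columns and Q the rows of the diagram; `free_interval`, `clip_from` and `frechet_at_most` are hypothetical names.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Interval = Optional[Tuple[float, float]]  # closed sub-interval of [0, 1], or None


def free_interval(a: Point, b: Point, c: Point, eps: float) -> Interval:
    """Parameters t in [0, 1] for which b + t*(c - b) lies within eps of a."""
    dx, dy = c[0] - b[0], c[1] - b[1]
    fx, fy = b[0] - a[0], b[1] - a[1]
    qa = dx * dx + dy * dy
    qb = 2.0 * (fx * dx + fy * dy)
    qc = fx * fx + fy * fy - eps * eps
    if qa == 0.0:  # degenerate segment: b == c
        return (0.0, 1.0) if qc <= 0.0 else None
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    lo = max(0.0, (-qb - root) / (2.0 * qa))
    hi = min(1.0, (-qb + root) / (2.0 * qa))
    return (lo, hi) if lo <= hi else None


def clip_from(iv: Interval, lo: float) -> Interval:
    """Part of the interval iv that lies at or above lo."""
    if iv is None:
        return None
    start = max(iv[0], lo)
    return (start, iv[1]) if start <= iv[1] else None


def frechet_at_most(P: Sequence[Point], Q: Sequence[Point], eps: float) -> bool:
    """Decide whether the continuous Frechet distance of P and Q is <= eps."""
    n, m = len(P), len(Q)
    if n < 2 or m < 2:
        raise ValueError("each curve needs at least two vertices")
    if math.dist(P[0], Q[0]) > eps or math.dist(P[-1], Q[-1]) > eps:
        return False
    # Step 1: free space on the cell boundaries.
    # L[i][j]: P at vertex i, Q on segment j.  B[i][j]: P on segment i, Q at vertex j.
    L = [[free_interval(P[i], Q[j], Q[j + 1], eps) for j in range(m - 1)] for i in range(n)]
    B = [[free_interval(Q[j], P[i], P[i + 1], eps) for j in range(m)] for i in range(n - 1)]
    # Step 2: reachable parts of those boundaries (monotone paths only).
    LR: List[List[Interval]] = [[None] * (m - 1) for _ in range(n)]
    BR: List[List[Interval]] = [[None] * m for _ in range(n - 1)]
    for j in range(m - 1):  # walk up the left border of the diagram
        if math.dist(P[0], Q[j]) > eps:
            break
        LR[0][j] = L[0][j]
    for i in range(n - 1):  # walk along the bottom border of the diagram
        if math.dist(P[i], Q[0]) > eps:
            break
        BR[i][0] = B[i][0]
    for i in range(n - 1):
        for j in range(m - 1):
            left, bottom = LR[i][j], BR[i][j]
            # Free space inside a cell is convex, so straight moves suffice.
            if bottom is not None:
                LR[i + 1][j] = L[i + 1][j]
            elif left is not None:
                LR[i + 1][j] = clip_from(L[i + 1][j], left[0])
            if left is not None:
                BR[i][j + 1] = B[i][j + 1]
            elif bottom is not None:
                BR[i][j + 1] = clip_from(B[i][j + 1], bottom[0])
    # The end corner is free (checked above), so reaching the last cell's
    # right or top edge is enough.
    return LR[n - 1][m - 2] is not None or BR[n - 2][m - 1] is not None


# Q backtracks from x=2 to x=1, so the walker on P has to wait (made-up data).
P = [(0.0, 0.0), (3.0, 0.0)]
Q = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
print(frechet_at_most(P, Q, 0.6), frechet_at_most(P, Q, 0.4))
```

The backtracking example shows why the path must be monotone: the walker on P cannot follow Q backwards, so a leash shorter than 0.5 is not enough.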
Cluster Input: A polygonal curve T, an integer m>1 and a distance d. Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves. Constraints?
Cluster Input: A polygonal curve T, an integer m>1 and a distance d. Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves. Constraint 1: subcurves are pairwise disjoint
Cluster Input: A polygonal curve T, an integer m>1 and a distance d. Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves. Constraint 1: subcurves are pairwise disjoint. More constraints? (Figure: for a given d there can be an infinite number of clusters.)
Cluster Input: A polygonal curve T, an integer m>1 and a distance d. Cluster: m subcurves T1, … , Tm of T with distance at most d between any two subcurves. Constraint 1: subcurves are pairwise disjoint. Constraint 2: the cluster has to be of maximal “length”. (Figure: for a given d there can be an infinite number of clusters.)
Decision Problem Given a curve T, a subcurve cluster SC(m,l,d) of T consists of at least m subcurves T1, … , Tm of T such that: the subcurves are pairwise disjoint, the distance between any two subcurves is at most d, and at least one subcurve has length l.
Decision Problem Given a curve T, a subcurve cluster SC(m,l,d) of T consists of at least m subcurves T1, … , Tm of T such that: the subcurves are pairwise disjoint, the distance between any two subcurves is at most d, and at least one subcurve has length l. The length of a subcurve cluster is assumed to be maximal.
Decision Problem Given a trajectory T, a subtrajectory cluster SC(m,l,d) of T consists of at least m subtrajectories T1, … , Tm of T such that: the subtrajectories are pairwise disjoint, the distance between any two subtrajectories is at most d, and at least one subtrajectory has length l. The length of a subtrajectory cluster is assumed to be maximal.
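The three conditions can be checked for a candidate cluster with a short sketch. Several things here are assumptions for illustration: subtrajectories are given as inclusive vertex index ranges into T and may not share a vertex, "has length l" is read as "has length at least l", and the discrete Fréchet distance stands in for the continuous one. All function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Span = Tuple[int, int]  # inclusive range of vertex indices into T


def discrete_frechet(P: Sequence[Point], Q: Sequence[Point]) -> float:
    """Discrete Frechet distance (stand-in for the continuous measure)."""
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1]), d)
    return ca[n - 1][m - 1]


def path_length(C: Sequence[Point]) -> float:
    """Euclidean length of the polygonal curve C."""
    return sum(math.dist(C[k], C[k + 1]) for k in range(len(C) - 1))


def is_subtrajectory_cluster(T: Sequence[Point], spans: List[Span],
                             m: int, l: float, d: float) -> bool:
    """Check the SC(m, l, d) conditions for the given candidate spans."""
    if len(spans) < m:
        return False
    if any(not (0 <= s <= e < len(T)) for s, e in spans):
        return False
    # Condition 1: pairwise disjoint (assumed: no shared vertex).
    ordered = sorted(spans)
    for (_, e1), (s2, _) in zip(ordered, ordered[1:]):
        if s2 <= e1:
            return False
    curves = [T[s:e + 1] for s, e in spans]
    # Condition 2: distance between any two subtrajectories is at most d.
    if any(discrete_frechet(a, b) > d for a, b in combinations(curves, 2)):
        return False
    # Condition 3 (assumed reading): at least one has length at least l.
    return any(path_length(c) >= l for c in curves)


# Made-up trajectory that crosses from x=0 to x=4 three times.
T = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0), (0.0, 1.0), (4.0, 1.0),
     (2.0, -3.0), (0.0, 2.0), (4.0, 2.0)]
spans = [(0, 1), (3, 4), (6, 7)]
print(is_subtrajectory_cluster(T, spans, m=3, l=4.0, d=2.0))
```

This only verifies a given candidate; finding a cluster is the hard part addressed by the rest of the lecture.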
Problem Decision version: Subtrajectory cluster SC(m,l,d). Given a trajectory T, is there a subtrajectory cluster with parameters m, l and d? Optimisation versions: SC(m,max,d) – maximise length of cluster
Hardness results Theorem 1: Finding any approximation of the SC(m,max,d) problem is 3SUM-hard. Theorem 2: The decision problem SC(m,l,d) is NP-complete. Theorem 3: The problem of computing a (2−ε)-distance approximation of the SC(m,max,d)-problem is NP-hard. [Gudmundsson & van Kreveld’08]
Hardness results Theorem 2: The decision problem SC(m,l,d) is NP-complete. Reduction from MaxClique. MaxClique: Is there a clique of size k in a given graph G=(V,E)? (Figure: clique of size 4.)
MaxClique → Longest subtrajectory cluster: NP-complete. Problem: SC(m,l=n,d). (Figure: a graph on the vertices a, b, c, d, e and the trajectory constructed from it.)
MaxClique → Longest subtrajectory cluster: NP-complete. Problem: SC(m,l=n,d). SC(m,l=n,d) ⇔ clique of size m in G. The problem is as hard as MaxClique!
Hardness results Theorem 3: The problem of computing a (2−ε)-distance approximation of the SC(m,max,d)-problem is NP-hard. Corollary 1: The problem of computing a (2−ε)-distance approximation of SC(max, l, r), for any constant 0 < ε < 1, is at least as hard as approximating MaxClique. Can we find a 2-distance approximation in polynomial time?
Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. The Fréchet distance of F can be computed by computing the Fréchet distance between every pair of curves. Time: $O(\sum_{i,j} n_i n_j \log(n_i n_j))$. If |Fi| = n/m for all i, each of the $O(m^2)$ pairs costs $O((n/m)^2 \log(n/m))$, for a total of $O(n^2 \log(n/m))$.
Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. Observation: Given F1, F2 and F3, we have: $\delta_F(F_1,F_3) \le \delta_F(F_1,F_2) + \delta_F(F_2,F_3)$. [Dumitrescu & Rote’04] (Figure: distances a and b bound the third distance by a+b.) Can we use this observation to get an approximation?
Fréchet distance between m curves Input: Set of m polygonal curves F = {F1, …, Fm} with |Fi| = ni. Idea: Select a representative curve F1 of F. Compute the maximum Fréchet distance D between F1 and all other curves in F. Observation: Gives a 2-approximation, since by the triangle inequality any two curves Fi and Fj satisfy $\delta_F(F_i,F_j) \le \delta_F(F_i,F_1) + \delta_F(F_1,F_j) \le 2D$, while the Fréchet distance of F is at least D. Time: $O(\sum_i n_1 n_i \log(n_1 n_i))$.
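A small sketch of the representative idea, compared against the all-pairs computation. It assumes the discrete Fréchet distance as a stand-in (it also satisfies the triangle inequality); `representative_bound` and `all_pairs_distance` are hypothetical names and the curves are made up.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def discrete_frechet(P: Sequence[Point], Q: Sequence[Point]) -> float:
    """Discrete Frechet distance (stand-in for the continuous measure)."""
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1]), d)
    return ca[n - 1][m - 1]


def all_pairs_distance(curves: List[Sequence[Point]]) -> float:
    """Largest distance over every pair of curves: m*(m-1)/2 computations."""
    return max(discrete_frechet(a, b) for a, b in combinations(curves, 2))


def representative_bound(curves: List[Sequence[Point]], rep: int = 0) -> float:
    """Largest distance D from the representative: only m-1 computations."""
    return max(discrete_frechet(curves[rep], c)
               for k, c in enumerate(curves) if k != rep)


# Three parallel curves; the middle one (index 0) is the representative.
curves = [
    [(0.0, 0.0), (4.0, 0.0)],
    [(0.0, 1.0), (4.0, 1.0)],
    [(0.0, -1.0), (4.0, -1.0)],
]
D = representative_bound(curves)
exact = all_pairs_distance(curves)
print(D, exact)  # prints 1.0 2.0
```

In this example the exact value equals 2D, so the factor 2 cannot be improved for this choice of representative.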
Decision algorithm: compute path Recall: Deciding whether the Fréchet distance between two curves P and Q is at most r can be done in O(mn) time. The Fréchet distance between two polygonal curves P and Q can be computed in O(mn log mn) time using parametric search. (Figure: free space diagram of P and Q with a monotone path from (q1,p1) to (qm,pn).)
Recall the problem Given a trajectory T, a subtrajectory cluster SC(m,l,d) of T consists of at least m subtrajectories T1, … , Tm of T such that: the subtrajectories are pairwise disjoint, the distance between any two subtrajectories is at most d, and at least one subtrajectory has length l.
Recall the problem • Input: A trajectory T with n points, an integer m>1 and a real value d>0. • Output: SC(m,max,d). Constraint: For simplicity we will assume that all subtrajectories in a cluster have to start and end at a vertex. Idea: Create a free space diagram describing the distance between T and T.
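As a rough illustration of the idea, the free space of T against itself can be sampled at the vertices only. This is a simplified sketch, not the continuous diagram: it records for every pair of vertices whether they are within distance d. The name `self_free_space` and the sample trajectory are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def self_free_space(T: Sequence[Point], d: float) -> List[List[bool]]:
    """grid[i][j] is True when vertices T[i] and T[j] are within distance d."""
    return [[math.dist(p, q) <= d for q in T] for p in T]


# Made-up trajectory: out along y=0, back along y=1.
T = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0), (0.0, 1.0)]
grid = self_free_space(T, 1.0)

# Print with row 0 at the bottom, like a free space diagram.
for row in reversed(grid):
    print("".join("#" if free else "." for free in row))
```

The diagram is symmetric about its diagonal, and the diagonal itself is always free because every point has distance 0 to itself.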
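Free space diagram of T × T. If D(A,C) ≤ d and D(B,C) ≤ d, then D(A,B) ≤ 2d. (Figure: subtrajectories A, B and C marked in the free space diagram of T against itself.)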
Free space diagram of T. C: representative trajectory. The length of the SC {A,B,C} is the length of the representative trajectory. (Figure: subtrajectories A, B and C.)