Gentian GUSHO
LUSSI department, ENST Bretagne, France
Properties of Minimum Rigidity Graphs Associated with a Clustering System
(How to account for internal structure of clusters?)
CLASSIFICATION: binarity – rigidity – stability
INTRODUCTION
Single Linkage Hierarchy (SLH) (an agglomerative hierarchical clustering, AHC, algorithm)
Minimum Spanning Tree (MST) (Prim's algorithm)
• Each class of the SLH is connected in every MST.
• An MST has the minimum number of edges of any connected graph.
• The MST is not unique, but all MSTs have the same length.
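As a rough illustration of the link between the MST and the single linkage classes, the sketch below runs Prim's algorithm on a small dissimilarity matrix and then reads the single linkage classes at a threshold h off the tree. The matrix D and the helper names prim_mst and classes_at are assumptions made for this example; they do not come from the slides.

```python
# Hypothetical symmetric dissimilarity matrix on 4 objects (0..3).
D = [
    [0, 1, 3, 4],
    [1, 0, 2, 4],
    [3, 2, 0, 2],
    [4, 4, 2, 0],
]

def prim_mst(D):
    """Prim's algorithm on a complete weighted graph; returns (u, v, weight) edges."""
    n = len(D)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known link from the tree to each vertex
    parent = [None] * n
    best[0] = 0
    edges = []
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] is not None:
            edges.append((parent[u], u, D[parent[u]][u]))
        for v in range(n):
            if not in_tree[v] and D[u][v] < best[v]:
                best[v] = D[u][v]
                parent[v] = u
    return edges

def classes_at(n, edges, h):
    """Connected components of the tree restricted to edges of weight <= h."""
    root = list(range(n))
    def find(a):
        while root[a] != a:
            a = root[a]
        return a
    for u, v, w in edges:
        if w <= h:
            root[find(u)] = find(v)
    groups = {}
    for x in range(n):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())

mst = prim_mst(D)
print(mst)                    # [(0, 1, 1), (1, 2, 2), (2, 3, 2)]
print(classes_at(4, mst, 1))  # [[0, 1], [2], [3]]
print(classes_at(4, mst, 2))  # [[0, 1, 2, 3]]
```

Each class printed at a given threshold is a connected piece of the tree, which is the first property listed on the slide.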
CLUSTERING
Clustering and rigidity
Clustering system (CS) (folklore)
Let X be a set of objects and S a set of subsets of X called clusters.
Definition: S is called a clustering system if:
• ∅ ∉ S
• ∀ x ∈ X, {x} ∈ S, and X ∈ S (trivial clusters)
Minimum Rigidity Graph (MRG) associated with a CS (Flament et al. (1976))
Let G = (X, E) be a connected graph on X and S a CS.
Definition: G is called a minimum rigidity graph of S if:
• each cluster of S is a connected class of G (that is, it induces a connected subgraph of G)
• the cardinality of E is minimum for this property
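The definition can be tried out on a toy clustering system with an exhaustive search. This is only a sketch: the set X, the clustering system S and the function names are hypothetical, and the search over all edge sets is exponential, so it is usable on very small examples only.

```python
from itertools import combinations

def is_connected(cluster, edges):
    """True if the edges with both ends in `cluster` connect all of it."""
    cluster = set(cluster)
    start = min(cluster)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a in cluster and b in cluster and u in (a, b):
                v = b if u == a else a
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return seen == cluster

def is_rigid(edges, clusters):
    """Every cluster induces a connected subgraph."""
    return all(is_connected(c, edges) for c in clusters)

def minimum_rigidity_graphs(X, clusters):
    """All rigid edge sets of minimum cardinality (brute force)."""
    pairs = list(combinations(sorted(X), 2))
    for k in range(len(pairs) + 1):
        found = [E for E in combinations(pairs, k) if is_rigid(E, clusters)]
        if found:
            return found
    return []

# Hypothetical clustering system on X = {1, 2, 3, 4}.
X = {1, 2, 3, 4}
S = [{1}, {2}, {3}, {4}, {1, 2}, {2, 3}, {1, 2, 3}, {1, 2, 3, 4}]

graphs = minimum_rigidity_graphs(X, S)
for E in graphs:
    print(E)
```

On this toy system the search returns three graphs with three edges each: the edges (1, 2) and (2, 3) are forced by the two-element clusters, and object 4 may be attached to any of the other three objects, so the minimum rigidity graph is not unique here.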
DEFINITION OF A CLASS
Classes of a dissimilarity (Jardine & Sibson (1971))
Every maximal clique of a threshold graph of d is called a class of d. The set of all the classes of d, denoted by C_d, is a CS.
In the slide's example, {1}, {2, 3} and {4, 5, 6} are classes of d.
• The index of a class is the smallest threshold of the graph in which it appears as a maximal clique.
• The index is exactly the diameter of the class.
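A minimal sketch of this definition, assuming a small hypothetical dissimilarity d (not the one pictured on the slide): it lists the maximal cliques of every threshold graph by brute force and records for each class the first threshold at which it appears. The names diss, diameter and classes are illustrative.

```python
from itertools import combinations

# Hypothetical dissimilarity on X = {1, 2, 3, 4}.
X = [1, 2, 3, 4]
d = {frozenset(p): w for p, w in {
    (1, 2): 1, (1, 3): 3, (1, 4): 4,
    (2, 3): 2, (2, 4): 4, (3, 4): 2,
}.items()}

def diss(x, y):
    return 0 if x == y else d[frozenset((x, y))]

def diameter(A):
    return max((diss(x, y) for x, y in combinations(A, 2)), default=0)

def classes():
    """Map each class (maximal clique of some threshold graph) to its index."""
    subsets = [frozenset(c) for k in range(1, len(X) + 1)
               for c in combinations(X, k)]
    index = {}
    for h in sorted({0} | set(d.values())):
        # A subset is a clique of the threshold graph at h iff its diameter is <= h.
        cliques = [A for A in subsets if diameter(A) <= h]
        for A in cliques:
            if not any(A < B for B in cliques):   # maximal at this threshold
                index.setdefault(A, h)            # keep the smallest threshold
    return index

for A, h in sorted(classes().items(), key=lambda item: (item[1], sorted(item[0]))):
    print(sorted(A), "index", h)
```

On this data the classes are the four singletons (index 0), {1, 2} (index 1), {2, 3} and {3, 4} (index 2), {1, 2, 3} (index 3) and X (index 4), and each index equals the diameter of its class, as stated on the slide.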
DEFINITION OF A CLASS
Balls and 2-balls of a dissimilarity
Ball of d with centre x and radius r:
B(x, r) = { y : d(x, y) ≤ r }
2-ball of d generated by x and y:
B(x, y) = B(x, d(x, y)) ∩ B(y, d(x, y)) = { z : max{ d(x, z), d(y, z) } ≤ d(x, y) }
We denote by B_d the set of all the balls of d and by B²_d the set of its 2-balls. They both form a CS.
Example: B(3, d(3, 5)) = {2, 3, 5, 6} and B(5, d(3, 5)) = {3, 4, 5, 6}, so B(3, 5) = {3, 5, 6}.
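Both forms of the 2-ball can be checked against each other in a few lines. The dissimilarity below is a hypothetical one chosen for the sketch (the slide's own data are not reproduced in the text), and the function names are illustrative.

```python
# Hypothetical dissimilarity on X = {1, 2, 3, 4}.
X = [1, 2, 3, 4]
d = {frozenset(p): w for p, w in {
    (1, 2): 1, (1, 3): 3, (1, 4): 4,
    (2, 3): 2, (2, 4): 4, (3, 4): 2,
}.items()}

def diss(x, y):
    return 0 if x == y else d[frozenset((x, y))]

def ball(x, r):
    """B(x, r): all objects within dissimilarity r of the centre x."""
    return {y for y in X if diss(x, y) <= r}

def two_ball(x, y):
    """B(x, y) as the intersection of the two balls of radius d(x, y)."""
    r = diss(x, y)
    return ball(x, r) & ball(y, r)

def two_ball_direct(x, y):
    """B(x, y) from the max-based formula."""
    return {z for z in X if max(diss(x, z), diss(y, z)) <= diss(x, y)}

print(ball(2, diss(2, 3)))   # {1, 2, 3}
print(ball(3, diss(2, 3)))   # {2, 3, 4}
print(two_ball(2, 3))        # {2, 3}
print(two_ball(1, 3))        # {1, 2, 3}
```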
DEFINITION OF A CLASS
Realizations of a dissimilarity (Brucker, 2003)
Realization of a pair x, y:
R(x, y) = ∩ { C ∈ C_d : x, y ∈ C }
Properties:
• closest elements to x and y
• same behaviour relative to the other objects
• diam_d(R(x, y)) = d(x, y)
The set of all the realizations, denoted by R_d, is a CS.
Another definition:
R(x, y) = ∩ { B ∈ B_d : x, y ∈ B and diam_d(B) ≤ d(x, y) }
Property: R(x, y) ⊆ B(x, y). Example: R(3, 5) = {3, 5}, B(3, 5) = {3, 5, 6}.
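The first definition can be sketched as follows, reading R(x, y) as the intersection of all classes of d that contain both x and y. The dissimilarity is hypothetical and was picked so that one realization is strictly smaller than the corresponding 2-ball; the brute-force class enumeration is for illustration only and is not the polynomial method the slides refer to.

```python
from itertools import combinations

# Hypothetical dissimilarity on X = {1, 2, 3, 4}.
X = [1, 2, 3, 4]
d = {frozenset(p): w for p, w in {
    (1, 2): 2, (1, 3): 1, (2, 3): 1,
    (1, 4): 2, (2, 4): 2, (3, 4): 3,
}.items()}

def diss(x, y):
    return 0 if x == y else d[frozenset((x, y))]

def diameter(A):
    return max((diss(x, y) for x, y in combinations(A, 2)), default=0)

def classes():
    """All maximal cliques of all threshold graphs of d (brute force)."""
    subsets = [frozenset(c) for k in range(1, len(X) + 1)
               for c in combinations(X, k)]
    found = set()
    for h in sorted({0} | set(d.values())):
        cliques = [A for A in subsets if diameter(A) <= h]
        found |= {A for A in cliques if not any(A < B for B in cliques)}
    return found

def realization(x, y):
    """R(x, y): intersection of the classes containing both x and y."""
    containing = [C for C in classes() if x in C and y in C]
    return frozenset.intersection(*containing)  # X is always a class, so non-empty

def two_ball(x, y):
    return frozenset(z for z in X if max(diss(x, z), diss(y, z)) <= diss(x, y))

print(sorted(realization(1, 2)))  # [1, 2]
print(sorted(two_ball(1, 2)))     # [1, 2, 3, 4]
```

Here the classes {1, 2, 3} and {1, 2, 4} both contain the pair (1, 2), so R(1, 2) = {1, 2}, while the 2-ball B(1, 2) is the whole of X. This mirrors the slide's example, where R(3, 5) is strictly contained in B(3, 5).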
COMPLEXITY
Computational complexity
Enumeration issues
Proposition (Capobianco and Molluzzo (1978)): The maximum number of maximal cliques of a graph with n nodes is bounded exponentially in n, by O(3^(n/3)).
Note: The number of balls of d is bounded by O(n), and the numbers of 2-balls and of realizations are bounded by O(n²).
Complexity issues
• Maximal cliques: exponential time
• Balls: polynomial time, O(n²)
• 2-balls: polynomial time, O(n³)
• Realizations: polynomial time, O(n⁵)
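The exponential bound can be observed on a standard extremal construction: split the n nodes into groups of three and join two nodes exactly when they lie in different groups. A maximal clique then picks one node per group, which gives 3^(n/3) of them. The brute-force count below is a sketch with hypothetical function names, meant for small n only.

```python
from itertools import combinations

def maximal_cliques(n, adjacent):
    """All maximal cliques of the graph on nodes 0..n-1 (exhaustive search)."""
    cliques = [frozenset(c) for k in range(1, n + 1)
               for c in combinations(range(n), k)
               if all(adjacent(a, b) for a, b in combinations(c, 2))]
    return [A for A in cliques if not any(A < B for B in cliques)]

def grouped_by_three(a, b):
    """Nodes are adjacent iff they belong to different groups {0,1,2}, {3,4,5}, ..."""
    return a // 3 != b // 3

for n in (3, 6, 9):
    print(n, len(maximal_cliques(n, grouped_by_three)), 3 ** (n // 3))
```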
TRENDS
Some approaches in classification
• To approximate the data by a classification model: partition, hierarchy, quasi-hierarchy.
• To extract classes from the data as they are.
[Figure labels: articulation point, complete graph, support, minimum rigidity graph, clique cluster, connected cluster]
PROPERTIES
Properties of the MRG of the realizations R_d
The realizations of d (R_d), each written as the string of its elements (45 stands for {4, 5}): 45, 46, 56, 23, 35, 15, 356, 345, 145, 1456, 123456
[Figure: a minimum rigidity graph of R_d]
• The MRG of the realizations is not unique, but all MRGs have the same length.
• The MRG of the realizations is computed in polynomial time, O(n⁵).
• Every MRG of the realizations contains at least one MST of d.
• Every MST of an MRG of the realizations is an MST of d.
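These properties can be checked on a toy case. The sketch below assumes a hypothetical dissimilarity on four objects whose non-trivial realizations were worked out by hand, finds every minimum rigidity graph by exhaustive search (not the polynomial method of the slides), and compares lengths and spanning-tree weights. It verifies the statements on this one example only and proves nothing in general.

```python
from itertools import combinations

# Hypothetical dissimilarity on X = {1, 2, 3, 4} (not the slide's data).
X = [1, 2, 3, 4]
d = {frozenset(p): w for p, w in {
    (1, 2): 2, (1, 3): 1, (2, 3): 1,
    (1, 4): 2, (2, 4): 2, (3, 4): 3,
}.items()}
# Non-trivial realizations of this d, computed by hand for the sketch.
realizations = [{1, 3}, {2, 3}, {1, 2}, {1, 2, 4}, {1, 2, 3, 4}]

def spanning_forest(vertices, edges):
    """Kruskal's algorithm: (number of components, weight) of a minimum spanning forest."""
    root = {v: v for v in vertices}
    def find(a):
        while root[a] != a:
            a = root[a]
        return a
    weight, parts = 0, len(root)
    for e in sorted(edges, key=lambda e: d[e]):
        a, b = (find(v) for v in e)
        if a != b:
            root[a] = b
            weight += d[e]
            parts -= 1
    return parts, weight

def is_rigid(edges):
    """Every realization induces a connected subgraph."""
    return all(spanning_forest(c, [e for e in edges if e <= c])[0] == 1
               for c in realizations)

all_pairs = [frozenset(p) for p in combinations(X, 2)]
mrgs = []
for k in range(len(all_pairs) + 1):
    mrgs = [E for E in combinations(all_pairs, k) if is_rigid(E)]
    if mrgs:
        break

mst_weight = spanning_forest(X, all_pairs)[1]                 # MST of d itself
mrg_lengths = {sum(d[e] for e in E) for E in mrgs}            # total length of each MRG
mrg_mst_weights = {spanning_forest(X, E)[1] for E in mrgs}    # MST weight inside each MRG

print(len(mrgs), mrg_lengths, mrg_mst_weights, mst_weight)    # 2 {6} {4} 4
```

Two minimum rigidity graphs exist here, both of length 6, and a minimum spanning tree taken inside either of them has the same weight (4) as an MST of d. Such a tree is therefore an MST of d, which also means each MRG contains one.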
EXAMPLE
Adding an object
Without object 7:
SLH: 45, 456, 23, 23456, X
R: 45, 46, 56, 23, 35, 356, 15, 345, 145, 1456, 12345, 1345, X
With object 7:
SLH: 45, 456, 23, 17, 23456, X
R: 45, 46, 56, 23, 17, 35, 356, 47, 15, 27, 345, 14, 467, 14567, 1457, 123457, 13457, X
EXAMPLE
Realizations and quasi-hierarchy: an example (but an open problem)
Without object 7:
R: 45, 46, 56, 23, 35, 356, 15, 345, 145, 1456, 12345, 1345, X
QH: 45, 456, 23, 3456, 15, 145, 1456, X
With object 7:
R: 45, 46, 56, 23, 17, 35, 356, 47, 15, 27, 345, 14, 467, 14567, 1457, 123457, 13457, X
QH: 45, 46, 456, 23, 17, 3456, 47, 15, 27, 14567, X
CONCLUSIONS
• R_d represents the data as they are.
• R_d is computed in polynomial time.
• The MRG provides information about the internal structure of the classes.
• The MRG of R_d is not unique, but … (Osswald 2003)
• The MRG of R_d contains at least one MST of d.
PERSPECTIVES
• Stability of the method (adding noise to the data)
• Study of “the” minimum hypergraph of rigidity
• Other relationships between MSTs and MRGs
• Realizations and other classificatory models
Thank you!