KmL & KML3D: K-Means for Longitudinal Data. Christophe Genolini, Bernard Desgraupes, Bruno Falissard. Definition. Two trajectories. Ten trajectories. Too many trajectories. Solution: clusters. Cluster example. How to cluster? Parametric algorithms. Non-parametric algorithms.
KmL & KML3D: K-Means for Longitudinal Data • Christophe Genolini • Bernard Desgraupes • Bruno Falissard
How to cluster? • Parametric algorithms • Example: proc traj • Based on likelihood • Non-parametric algorithms • K-means (KmL)
Likelihood for size • Size = 1.84 • Small likelihood • Big likelihood
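The "small likelihood / big likelihood" contrast can be sketched in R. The two clusters below (a "short" and a "tall" group with normally distributed sizes) and their means and standard deviation are assumptions made up for illustration; they are not taken from the slides.

```r
# Hypothetical clusters: sizes in metres, normally distributed (assumed values)
size <- 1.84

lik_short <- dnorm(size, mean = 1.60, sd = 0.07)  # density under the "short" cluster
lik_tall  <- dnorm(size, mean = 1.80, sd = 0.07)  # density under the "tall" cluster

# Small likelihood under "short", big likelihood under "tall"
c(short = lik_short, tall = lik_tall)
```

A likelihood-based method would assign this individual to the cluster under which the observed size is most likely.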
Parametric algorithms • The user must specify: • Number of clusters • Trajectory shape (linear, polynomial, …) • Distribution of the variable (Poisson, normal, …) • Principle: maximization of the likelihood
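A minimal sketch of what these three ingredients mean for a single trajectory, assuming a linear shape and normally distributed noise. The trajectory values, the two candidate clusters ("flat" and "rising") and the helper `loglik_linear` are hypothetical; this is not the proc traj implementation.

```r
times <- 0:10
# Hypothetical observed trajectory (one individual, 11 measurements)
y <- c(2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 8.2, 8.9, 10.1, 10.8, 12.2)

# Log-likelihood of a trajectory under a linear mean curve with normal noise
loglik_linear <- function(y, times, intercept, slope, sd) {
  sum(dnorm(y, mean = intercept + slope * times, sd = sd, log = TRUE))
}

# Two assumed clusters: a flat one and a rising one
ll_flat   <- loglik_linear(y, times, intercept = 2, slope = 0, sd = 1)
ll_rising <- loglik_linear(y, times, intercept = 2, slope = 1, sd = 1)

c(flat = ll_flat, rising = ll_rising)
```

A parametric algorithm searches for the cluster parameters (here intercept, slope and standard deviation) that maximize the total likelihood over all trajectories, which is why shape and distribution have to be chosen up front.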
Non-parametric algorithms • The user must specify: • Number of clusters • Principle: maximization of some criterion
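As an illustration of the non-parametric route, here is a sketch using base R's `kmeans()` on a matrix with one trajectory per row. It assumes Euclidean distance between trajectories and simulated data; the kml package itself is not used here, and the simulated groups are hypothetical.

```r
set.seed(1)
times <- 0:10

# Simulated data: 20 flat trajectories around 2, 20 rising trajectories from 2 to 12
flat   <- t(replicate(20, 2 + rnorm(length(times), sd = 0.5)))
rising <- t(replicate(20, 2 + times + rnorm(length(times), sd = 0.5)))
traj   <- rbind(flat, rising)          # 40 rows (individuals) x 11 columns (times)

# Only the number of clusters is specified: no shape, no distribution
fit <- kmeans(traj, centers = 2, nstart = 10)

# Cross-tabulate the simulated groups against the clusters found
truth <- rep(c("flat", "rising"), each = 20)
table(truth = truth, cluster = fit$cluster)
```

Here k-means minimizes the within-cluster sum of squares, which plays the role of the criterion being optimized.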
Example > kml(cld3,4,1,print.traj=TRUE)
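Example
# Generate artificial longitudinal data and convert it to the clustering object
longData <- as.cld(gald())
# Run k-means for 2 to 5 clusters, 10 redrawings each, plotting the trajectories
kml(longData,2:5,10,print.traj=TRUE)
# Inspect the partitions and choose one interactively
choice(longData)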
Solution: cluster • C1: partition for V1 • C2: partition for V2 • C1 × C2: partition for joint trajectories? • C1 = {small, medium, big} • C2 = {blue, red} • C1 × C2 = {small blue, small red, medium blue, medium red, big blue, big red}
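The product partition can be sketched in base R with `interaction()`. The six individuals and their cluster labels below are made up for illustration.

```r
# Hypothetical cluster memberships of six individuals on each variable
c1 <- factor(c("small", "big", "medium", "small", "big", "medium"),
             levels = c("small", "medium", "big"))   # partition for V1
c2 <- factor(c("blue", "blue", "red", "red", "red", "blue"),
             levels = c("blue", "red"))              # partition for V2

# Product partition: one joint cluster per (C1, C2) combination
joint <- interaction(c1, c2, sep = " ")

nlevels(joint)   # number of joint clusters = nlevels(c1) * nlevels(c2)
table(joint)     # size of each joint cluster
```

The number of joint clusters is the product of the two cluster counts, so it grows quickly as variables are added.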
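Solution: third dimension
library(rgl)  # provides points3d(), axes3d(), title3d(), ...
# Two variables measured at the same 11 time points, plotted separately
par(mfrow=c(1,2))
a <- c(1,2,1,3,2,3,3,4,5,3,5)
b <- c(6,6,6,5,6,6,5,5,4,3,3)
plot(a,type="l",ylim=c(0,10),xlab="First variable",ylab="")
plot(b,type="l",ylim=c(0,10),xlab="Second variable",ylab="")
# The same joint trajectory as a set of points in 3D (time, first variable, second variable)
points3d(1:11,a,b)
axes3d(c("x", "y", "z"))
title3d(, , "Time","First variable","Second variable")
box3d()
aspect3d(c(2, 1, 1))
rgl.viewpoint(0, -90, zoom = 1.2)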
Cluster in 3D
cl <- gald(
  functionClusters=list(function(t){c(-4,-4)},function(t){c(5,0)},function(t){c(0,5)}),
  functionNoise=function(t){c(rnorm(1,0,2),rnorm(1,0,2))}
)
plot3d(cl)
kml(cl,3,1,paramKml=parKml(startingCond="randomAll"))
plot3d(cl,paramTraj=parTraj(col="clusters"))
Award: best “number of clusters” finder… • The nominees are: • Calinski & Harabasz • Ray & Turi • Davies & Bouldin • ... • The winner is… • Falissard & Genolini • (or G & F ?)
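To show how such a criterion picks the number of clusters, here is a sketch of one nominee, the Calinski & Harabasz index, computed with base R's `kmeans()` on simulated trajectories. The data, the three simulated groups and the helper `calinski_harabasz` are assumptions for illustration; the Falissard & Genolini criterion is not defined on these slides and is not reproduced here.

```r
set.seed(42)
n_per_group <- 30
n_times     <- 11
group_means <- c(0, 5, 10)   # three simulated groups of flat trajectories

# One trajectory per row: 90 individuals x 11 time points
traj <- do.call(rbind, lapply(group_means, function(m) {
  matrix(rnorm(n_per_group * n_times, mean = m, sd = 1), nrow = n_per_group)
}))

# Calinski & Harabasz index: between-cluster variance over within-cluster variance,
# each divided by its degrees of freedom (higher is better)
calinski_harabasz <- function(x, k) {
  fit <- kmeans(x, centers = k, nstart = 20, iter.max = 50)
  n <- nrow(x)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
}

ks <- 2:5
ch <- sapply(ks, function(k) calinski_harabasz(traj, k))
names(ch) <- ks

ch                            # criterion value for each candidate k
best_k <- ks[which.max(ch)]   # candidate with the highest index
best_k
```

The same loop works for any other criterion: only the body of the scoring function changes, and for criteria where lower is better `which.max` becomes `which.min`.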
Perspective: clustering according to shape • “Classic” distance • “Shape” distance