KmL & KML3D: K-Means for Longitudinal Data

Christophe Genolini, Bernard Desgraupes, Bruno Falissard


Presentation Transcript


  1. KmL & KML3D: K-Means for Longitudinal Data. Christophe Genolini, Bernard Desgraupes, Bruno Falissard

  2. Definition

  3. Two trajectories

  4. TEN trajectories

  5. Too many trajectories...

  6. Solution: clusters

  7. Cluster example

  8. How to cluster? • Parametric algorithms • Non-parametric algorithms

  9. How to cluster? • Parametric algorithms • Example: proc traj • Based on likelihood • Non-parametric algorithms • K-means (KmL)

  10. I ♥ Quebec…

  11. Likelihood for size • Size = 1.84 • Small likelihood • Big likelihood

  12. Big likelihood?

  13. Parametric algorithms • Number of clusters • Shape of the trajectories (linear, polynomial, …) • Distribution of the variables (Poisson, normal, …) • Maximization of the likelihood
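
To make the likelihood idea of slides 11 to 13 concrete, here is a minimal sketch in R. It assumes sizes follow a normal distribution; the two candidate groups, their means and their standard deviation are hypothetical values chosen for illustration and do not come from the slides.

```r
# Observed size from slide 11.
size <- 1.84

# Hypothetical candidate groups (assumed values, not from the slides):
# each group is modelled as a normal distribution of sizes.
groups <- data.frame(
  name = c("short group", "tall group"),
  mean = c(1.60, 1.80),
  sd   = c(0.07, 0.07),
  stringsAsFactors = FALSE
)

# Likelihood of the observation under each group: normal density at `size`.
groups$likelihood <- dnorm(size, mean = groups$mean, sd = groups$sd)

# A likelihood-based method favours the group with the larger likelihood.
best <- groups$name[which.max(groups$likelihood)]
print(groups)
print(best)
```

A parametric algorithm applies the same comparison to whole trajectories, after the user has fixed the number of clusters, the trajectory shape and the distribution.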

  14. Non-parametric algorithms • Number of clusters • Maximization of some criteria

  15. K-means: kml

  16. K Means Longitudinal

  17. K Means Longitudinal

  18. K Means Longitudinal

  19. K Means Longitudinal
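
The slides titled "K Means Longitudinal" are figures only in this transcript. As a rough illustration of the general k-means idea applied to trajectories, the sketch below alternates an assignment step and an update step on a matrix with one row per individual and one column per time point. `simple_kml` is a hypothetical helper written for this example; it is not the kml package's implementation.

```r
# Minimal k-means for trajectories (rows = individuals, columns = time points).
# `simple_kml` is a hypothetical helper for illustration, not the kml package API.
simple_kml <- function(traj, k, max_iter = 100) {
  # Start from k trajectories picked at random as initial cluster means.
  centers <- traj[sample(nrow(traj), k), , drop = FALSE]
  clusters <- rep(0L, nrow(traj))
  for (iter in seq_len(max_iter)) {
    # Assignment step: squared Euclidean distance from each trajectory to each mean.
    d <- sapply(seq_len(k), function(j) rowSums(sweep(traj, 2, centers[j, ])^2))
    new_clusters <- max.col(-d, ties.method = "first")
    if (all(new_clusters == clusters)) break   # partition is stable: stop
    clusters <- new_clusters
    # Update step: each mean trajectory is the point-wise mean of its members.
    for (j in seq_len(k)) {
      members <- traj[clusters == j, , drop = FALSE]
      if (nrow(members) > 0) centers[j, ] <- colMeans(members)
    }
  }
  list(clusters = clusters, centers = centers)
}

# Toy data: 20 flat trajectories around 0 and 20 around 10, over 5 time points.
set.seed(1)
traj <- rbind(matrix(rnorm(100, mean = 0), ncol = 5),
              matrix(rnorm(100, mean = 10), ncol = 5))
res <- simple_kml(traj, k = 2)
table(res$clusters)
```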

  20. Example
      > kml(cld3, 4, 1, print.traj=TRUE)

  21. Strength: Missing values
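
One way a distance-based method can tolerate missing values is to compare two trajectories only at the time points where both are observed, then rescale the result. The sketch below uses a Gower-style adjustment; `dist_na` is a hypothetical helper, and whether it matches the exact distance used by the package should be checked against the package documentation.

```r
# Euclidean distance restricted to time points observed in both trajectories,
# rescaled as if all time points had been observed (Gower-style adjustment).
# `dist_na` is a hypothetical helper name used for illustration.
dist_na <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)
  if (!any(ok)) return(NA_real_)   # no common time point: distance undefined
  sqrt(sum((x[ok] - y[ok])^2) * length(x) / sum(ok))
}

a <- c(1, 2, NA, 4, 5)
b <- c(2, 3, 4, NA, 6)
dist_na(a, b)   # uses time points 1, 2 and 5 only
```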

  22. Weakness: local maximum

  23. Solution: re-running
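
Re-running can be sketched with base R's `kmeans` as a stand-in for the package: the algorithm is started several times from different random positions and the best run is kept. The toy data and the choice of 20 runs are assumptions made for this example.

```r
set.seed(42)
# Toy trajectories: three groups of 30, flat around 0, 5 and 10 (6 time points).
traj <- rbind(matrix(rnorm(180, 0), ncol = 6),
              matrix(rnorm(180, 5), ncol = 6),
              matrix(rnorm(180, 10), ncol = 6))

# Run k-means 20 times from different random starts and record the
# total within-cluster sum of squares of each run (lower is better).
runs <- lapply(1:20, function(i) kmeans(traj, centers = 3, nstart = 1))
wss  <- sapply(runs, function(r) r$tot.withinss)

# Keep the best of the 20 runs.
best <- runs[[which.min(wss)]]
range(wss)
table(best$cluster)
```

Base R offers the same behaviour in one call through the `nstart` argument of `kmeans`, which keeps the best of `nstart` random starts.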

  24. Problem: number of clusters

  25. Example
      longData <- as.cld(gald())
      kml(longData, 2:5, 10, print.traj=TRUE)
      choice(longData)
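
Choosing the number of clusters usually means computing a quality criterion for each candidate number and keeping the best one. The sketch below does this with base R's `kmeans` and a hand-written Calinski & Harabasz criterion on toy data. It is an illustration under these assumptions, not the code that `kml` or `choice` runs internally; `calinski_harabasz` is a hypothetical helper name.

```r
# Calinski & Harabasz criterion: (between / (k - 1)) / (within / (n - k)).
# Higher values indicate compact, well-separated clusters.
calinski_harabasz <- function(fit, n) {
  k <- length(fit$size)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
}

set.seed(7)
# Toy trajectories: three groups of 30 flat trajectories (levels 0, 5, 10).
traj <- rbind(matrix(rnorm(180, 0), ncol = 6),
              matrix(rnorm(180, 5), ncol = 6),
              matrix(rnorm(180, 10), ncol = 6))

# Try 2 to 5 clusters, 10 restarts each, and score every partition.
ks <- 2:5
scores <- sapply(ks, function(k) {
  fit <- kmeans(traj, centers = k, nstart = 10)
  calinski_harabasz(fit, n = nrow(traj))
})
names(scores) <- ks
scores
ks[which.max(scores)]   # number of clusters with the highest criterion
```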

  26. kml3D

  27. Joint trajectories

  28. Joint trajectories

  29. Solution: cluster • C1: partition for V1 • C2: partition for V2 • C1×C2: partition for joint trajectories? • C1 = {small, medium, big} • C2 = {blue, red} • C1×C2 = {small blue, small red, medium blue, medium red, big blue, big red}
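
Crossing two partitions can be written in a few lines of base R. The six individuals and their labels below are made up for illustration; only the label sets come from the slide.

```r
# Two partitions of the same six individuals, one per variable (toy labels).
C1 <- factor(c("small", "small", "medium", "big", "big", "medium"),
             levels = c("small", "medium", "big"))
C2 <- factor(c("blue", "red", "blue", "blue", "red", "red"),
             levels = c("blue", "red"))

# Crossing the partitions gives one joint label per individual.
joint <- interaction(C1, C2, sep = " ")
nlevels(joint)   # number of possible joint clusters: 3 x 2
table(joint)
```

The number of joint clusters is the product of the sizes of the two partitions, so it grows quickly as variables are added.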

  30. Problem

  31. Problem

  32. Problem

  33. Problem

  34. Problem

  35. Problem

  36. Problem

  37. Solution: third dimension

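  38. Solution: third dimension
      library(rgl)   # provides points3d, axes3d, title3d, box3d, aspect3d
      # Two single-variable trajectories, drawn side by side
      par(mfrow=c(1,2))
      a <- c(1,2,1,3,2,3,3,4,5,3,5)
      b <- c(6,6,6,5,6,6,5,5,4,3,3)
      plot(a,type="l",ylim=c(0,10),xlab="First variable",ylab="")
      plot(b,type="l",ylim=c(0,10),xlab="Second variable",ylab="")
      # The same data as one joint trajectory: time, first variable, second variable
      points3d(1:11,a,b)
      axes3d(c("x", "y", "z"))
      title3d(, , "Time","First variable","Second variable")
      box3d()
      aspect3d(c(2, 1, 1))
      rgl.viewpoint(0, -90, zoom = 1.2)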

  39. Cluster in 3D
      cl <- gald(
        functionClusters=list(function(t){c(-4,-4)},function(t){c(5,0)},function(t){c(0,5)}),
        functionNoise=function(t){c(rnorm(1,0,2),rnorm(1,0,2))})
      plot3d(cl)
      kml(cl,3,1,paramKml=parKml(startingCond="randomAll"))
      plot3d(cl,paramTraj=parTraj(col="clusters"))
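
The idea of clustering joint trajectories directly can also be sketched without the package. In the example below each individual is described by two variables measured at the same time points, and both are placed in a single row so that the Euclidean distance takes the two variables into account at once. Base R's `kmeans` is used as a stand-in, and the group sizes and number of time points are assumptions; the cluster means and noise level mirror the slide 39 call. This is not the kml3d implementation.

```r
set.seed(3)
n_per_group <- 30
times <- 1:10
# Constant cluster means (V1, V2), mirroring the three functions on slide 39.
means <- list(c(-4, -4), c(5, 0), c(0, 5))

# One row per individual: V1 at every time point, then V2 at every time point.
joint <- do.call(rbind, lapply(means, function(m) {
  v1 <- matrix(rnorm(n_per_group * length(times), m[1], 2), nrow = n_per_group)
  v2 <- matrix(rnorm(n_per_group * length(times), m[2], 2), nrow = n_per_group)
  cbind(v1, v2)
}))

# Clustering the flattened rows uses the Euclidean distance over both variables.
fit <- kmeans(joint, centers = 3, nstart = 10)
truth <- rep(1:3, each = n_per_group)
table(truth, fit$cluster)
```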

  40. Perspectives

  41. Award: best “number of clusters” finder… • The nominees are: • Calinski & Harabasz • Ray & Turi • Davies & Bouldin • ... • The winner is…

  42. Award: best “number of clusters” finder… • The nominees are: • Calinski & Harabasz • Ray & Turi • Davies & Bouldin • ... • The winner is… • Falissard & Genolini • (or G & F ?)

  43. Perspective: shape distance

  44. Perspective: cluster according to shape • “classic” distance • “shape” distance

  45. Imputation

  46. Imputation

  47. Imputation

  48. Imputation
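
The imputation slides carry only their title in this transcript. As an illustration of what imputing a trajectory can look like, the sketch below fills the gaps of one trajectory with two common techniques, linear interpolation and last observation carried forward (LOCF). Both are assumptions chosen for the example and are not necessarily the methods shown in the talk.

```r
# A trajectory with missing values in the middle and at the end.
traj <- c(2, NA, 4, NA, NA, 10, NA)
time <- seq_along(traj)
obs  <- !is.na(traj)

# Linear interpolation between observed neighbours; rule = 2 repeats the
# nearest observed value beyond the first and last observations.
linear <- approx(time[obs], traj[obs], xout = time, rule = 2)$y

# Last observation carried forward: reuse the latest observed value.
# This one-liner assumes the first value of the trajectory is observed.
locf <- traj[obs][cumsum(obs)]

linear
locf
```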

  49. Thank you!
