
Clustering



Presentation Transcript


  1. Clustering

  2. Hypothesis: Hebbian synaptic plasticity enables a perceptron to compute the mean of its preferred stimuli.

  3. Unsupervised learning • Sequence of data vectors • Learn something about their structure • Multivariate statistics • Neural network algorithms • Brain models

  4. Data can be summarized by a few prototypes.

  5. Vector quantization • Many telecom applications • Codebook of prototypes • Send index of prototype rather than whole vector • Lossy encoding
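
The codebook idea can be made concrete with a small sketch. This is an illustration only; the codebook values, function names, and NumPy implementation are not from the presentation.

```python
import numpy as np

# Hypothetical 4-entry codebook of 2-D prototypes.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

def encode(x, codebook):
    # Transmit only the index of the nearest prototype (lossy encoding).
    return int(np.argmin(np.linalg.norm(codebook - x, axis=1)))

def decode(index, codebook):
    # The receiver reconstructs the vector from its codebook entry.
    return codebook[index]

x = np.array([0.9, 0.2])
i = encode(x, codebook)      # send the integer 1 instead of two floats
x_hat = decode(i, codebook)  # reconstruction: array([1., 0.])
```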

  6. A single prototype • Summarize all data with the sample mean.
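
The formula on this slide was an image; the standard definition it refers to is the sample mean of the N data vectors:

```latex
\bar{x} \;=\; \frac{1}{N}\sum_{n=1}^{N} x_n
```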

  7. Multiple prototypes • Each prototype is the mean of a subset of the data. • Divide data into k clusters. • One prototype for each cluster.

  8. Assignment matrix • Data structure for cluster memberships, with one entry for each (data vector, cluster) pair.
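
The slide's diagram did not survive extraction. One common convention (an assumption about the intended layout) is a binary matrix A with one row per data vector and one column per cluster, where A_{na} = 1 if vector n is assigned to cluster a:

```latex
A \;=\;
\begin{pmatrix}
1 & 0 \\
1 & 0 \\
0 & 1 \\
0 & 1
\end{pmatrix}
\qquad \text{(4 data vectors, 2 clusters; each row contains exactly one 1)}
```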

  9. k-means algorithm • Alternate between computing means and computing assignments.
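
A minimal NumPy sketch of this alternation, assuming Euclidean distance and random initialization from the data; the names and details are illustrative, not taken from the slides:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Alternate between assigning each vector to its nearest prototype
    and recomputing each prototype as the mean of its cluster."""
    rng = np.random.default_rng(seed)
    # Initialize prototypes with k randomly chosen data vectors.
    W = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: index of the nearest prototype for each vector.
        dists = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Update step: each prototype becomes the mean of its cluster
        # (prototypes with no members are left where they are).
        W_new = np.array([X[assign == a].mean(axis=0) if np.any(assign == a)
                          else W[a] for a in range(k)])
        if np.allclose(W_new, W):
            break  # assignments can no longer change
        W = W_new
    return W, assign
```

Neither step can increase the objective function discussed on the following slides, which is why the alternation converges (to a local minimum).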

  10. Objective function • Why does it work? • Method of minimizing an objective function.

  11. Rubber band computer • Attach rubber band from each data vector to the prototype vector. • The prototype will converge to the sample mean.
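
One way to make the analogy precise (a reconstruction; not shown on the slide): if each rubber band pulls on the prototype w with a force proportional to its extension, the net force vanishes exactly at the sample mean, so that is where the prototype comes to rest:

```latex
F(w) \;=\; \kappa \sum_{n=1}^{N} (x_n - w) \;=\; 0
\quad\Longrightarrow\quad
w \;=\; \frac{1}{N}\sum_{n=1}^{N} x_n
```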

  12. The sample mean maximizes likelihood • Gaussian distribution • Maximize the likelihood of the data with respect to the mean.
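
The slide's equations were images; the standard argument runs as follows. For an isotropic Gaussian with fixed variance \sigma^2, the log likelihood of the data as a function of the mean \mu is

```latex
\log L(\mu) \;=\; -\frac{1}{2\sigma^2}\sum_{n=1}^{N}\|x_n-\mu\|^2 \;+\; \text{const},
\qquad
\frac{\partial \log L}{\partial \mu} = 0
\;\;\Longrightarrow\;\;
\mu \;=\; \frac{1}{N}\sum_{n=1}^{N} x_n ,
```

so the maximum-likelihood estimate is the sample mean.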

  13. Objective function for k-means
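
The formula itself was an image; the usual k-means objective, written with the assignment matrix A of slide 8, is

```latex
E\bigl(\{w_a\}, A\bigr) \;=\; \sum_{a=1}^{k}\sum_{n=1}^{N} A_{na}\,\|x_n - w_a\|^2 ,
```

which the assignment step minimizes over A with the prototypes fixed, and the update step minimizes over each w_a with the assignments fixed.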

  14. Local minima can exist

  15. Model selection • How to choose the number of clusters? • Tradeoff between model complexity and objective function.

  16. Neural implementation • A single perceptron can learn the mean in its weight vector. • Many competing perceptrons can learn prototypes for clustering data.

  17. Batch vs. online learning • Batch: store all data vectors in memory explicitly. • Online: data vectors appear sequentially; use one, then discard it; the only memory is in the learned parameters.
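
A small NumPy sketch contrasting the two regimes (illustrative code, not from the slides); note that the online version keeps no memory beyond the current estimate w:

```python
import numpy as np

def batch_mean(X):
    # Batch: all data vectors are stored in memory at once.
    return X.mean(axis=0)

def online_mean(stream, dim):
    # Online: vectors arrive one at a time and are discarded after use;
    # the only memory is the learned parameter vector w.
    w = np.zeros(dim)
    for t, x in enumerate(stream, start=1):
        w += (x - w) / t  # running average, i.e. learning rate 1/t
    return w
```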

  18. Learning rule 1

  19. Learning rule 2 • “weight decay”

  20. Learning rule 2 again • Is there an objective function?
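
The rules on slides 18-20 were shown as equations that did not survive extraction. A rule consistent with the "weight decay" label and with the stochastic-gradient slides that follow would be (an assumption, not a transcription):

```latex
\Delta w \;=\; \eta\,(x - w)
\;=\; \underbrace{\eta\, x}_{\text{Hebbian-like term}} \;-\; \underbrace{\eta\, w}_{\text{weight decay}},
\qquad
E(w) \;=\; \tfrac{1}{2}\,\bigl\langle \|x - w\|^2 \bigr\rangle .
```

Under this reading the answer to the slide's question is yes: the rule performs stochastic gradient descent on E(w), whose minimum is the mean of the stimuli.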

  21. Stochastic gradient following • On average, the update moves in the direction of the negative gradient of the objective function.
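
In symbols (standard form, using the rule and objective sketched above):

```latex
\langle \Delta w \rangle \;=\; \eta\,\langle x - w \rangle \;=\; -\,\eta\, \nabla_{\!w} E(w)
```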

  22. Stochastic gradient descent

  23. Convergence conditions • Learning rate vanishes: slowly, but not too slowly. • Every limit point of the sequence w_t is a stationary point of E(w).
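
The standard way to state "slowly, but not too slowly" is the Robbins-Monro conditions (the slide's own formulas were images):

```latex
\sum_{t=1}^{\infty} \eta_t \;=\; \infty \;\;\text{(vanishes slowly)}
\qquad\text{and}\qquad
\sum_{t=1}^{\infty} \eta_t^{2} \;<\; \infty \;\;\text{(but not too slowly)},
\qquad\text{e.g. } \eta_t = 1/t .
```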

  24. Competitive learning • Online version of k-means
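
A minimal sketch of competitive learning as online k-means (illustrative NumPy code; the constant learning rate and random initialization are assumptions): for each incoming vector, only the winning prototype moves.

```python
import numpy as np

def competitive_learning(stream, k, dim, eta=0.05, seed=0):
    # Online k-means: the prototype nearest to the current input
    # (the "winner") is moved a small step toward that input.
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(k, dim))
    for x in stream:
        winner = np.argmin(np.linalg.norm(W - x, axis=1))
        W[winner] += eta * (x - W[winner])
    return W
```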

  25. Competition with WTA • If the w_a are normalized, the nearest prototype can be selected by a winner-take-all competition on the dot products w_a · x (see the identity below).
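
The identity behind the normalization condition (standard algebra, reconstructed):

```latex
\|x - w_a\|^2 \;=\; \|x\|^2 \;-\; 2\, w_a \cdot x \;+\; \|w_a\|^2 ,
```

so when all the \|w_a\| are equal, the prototype nearest to x is exactly the one with the largest dot product w_a · x, which a winner-take-all layer of perceptrons can select.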

  26. Objective function

  27. Cortical maps

  28. Ocular dominance columns • About 0.5 mm wide.

  29. Orientation map

  30. Kohonen feature map
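
A sketch of the standard Kohonen update, assuming prototypes arranged on a one-dimensional grid with a Gaussian neighborhood function (an illustration; the slide's figures are not reproduced). Because the winner's neighbors are also pulled toward each input, nearby map units end up with similar prototypes, as in cortical maps.

```python
import numpy as np

def kohonen_map(stream, grid_size, dim, eta=0.1, sigma=1.0, seed=0):
    # Self-organizing (Kohonen) feature map on a 1-D grid of units.
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(grid_size, dim))
    grid = np.arange(grid_size)
    for x in stream:
        winner = np.argmin(np.linalg.norm(W - x, axis=1))
        # Neighborhood: units close to the winner on the grid learn more.
        h = np.exp(-(grid - winner) ** 2 / (2 * sigma ** 2))
        W += eta * h[:, None] * (x - W)
    return W
```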

  31. Hypothesis: Receptive fields are learned by computing the mean of a subset of images

  32. Nature vs. nurture • Are cortical maps dependent on visual experience, or preprogrammed?
