
Dimensionality Reduction

Learn about high-dimensional data reduction techniques in computer graphics, including images, videos, and documents. Understand the concept of the curse of dimensionality and explore linear and non-linear reduction methods such as PCA and MDS.


Presentation Transcript


  1. Dimensionality Reduction June 2013 Computer Graphics Course

  2. What is high dimensional data? • Images • Videos • Documents • Most data, actually!

  3. How many dimensions? • Images – dimension 3·X·Y • This is the number of bytes in an uncompressed 8-bit RGB image • We can treat each byte as a dimension • Each image is a point in high dimensional space • Which space? • “space of images of size X·Y”
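
A minimal NumPy sketch of this idea (Pillow is assumed for loading, and "photo.png" is only a placeholder filename): every pixel channel becomes one coordinate, so an X-by-Y RGB image is a single point in a 3·X·Y-dimensional space.

import numpy as np
from PIL import Image

img = np.asarray(Image.open("photo.png").convert("RGB"))  # shape (Y, X, 3)
point = img.reshape(-1).astype(np.float64)                 # one coordinate per byte
print(point.shape)                                         # (3*X*Y,): a point in the "space of images of size X*Y"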

  4. How many dimensions? • But we can describe an image using fewer bytes! • “Blue sky, green grass, yellow road…” • “Drawing of a kung-fu rat”

  5. Why do Dimensionality Reduction? • Visualization: Understanding the structure of data

  6. Why do Dimensionality Reduction? • Visualization: Understanding the structure of data • Fewer dimensions make it easier to describe the data and to find correlations (rules) • Compression of data for efficiency • Clustering • Discovering similarities between elements

  7. Why do Dimensionality Reduction? • Curse of dimensionality • 100000000000 • 010000000000 • 001000000000 • 000100000000 • … • All these vectors are the same Euclidean distance from each other • But some dimensions could be “worth more” • Can you work with 1,000 images of 1,000,000 dimensions?
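
A quick numeric check of the slide's example, sketched in NumPy: all one-hot vectors in 12 dimensions are exactly the same Euclidean distance (sqrt(2)) apart, so plain distances say nothing about which dimensions matter.

import numpy as np

vectors = np.eye(12)                               # 100000000000, 010000000000, ...
print(np.linalg.norm(vectors[0] - vectors[1]))     # ~1.414
print(np.linalg.norm(vectors[0] - vectors[7]))     # ~1.414 as well: every pair is equidistant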

  8. How to reduce dimensions? • Image features: • Average colors • Histograms • FFT-based features (frequency space) • More… • Video features • Document features • Etc…
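
A minimal sketch of two of the listed image features, assuming the image is an 8-bit RGB NumPy array of shape (Y, X, 3); with 8 bins per channel the joint histogram has 8^3 = 512 dimensions, which is the kind of feature size mentioned on the next slide.

import numpy as np

def average_color(img):
    """Mean R, G, B over all pixels -> a 3-dimensional feature."""
    return img.reshape(-1, 3).mean(axis=0)

def color_histogram(img, bins=8):
    """Joint RGB histogram with bins^3 cells -> e.g. 512 dimensions."""
    hist, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()   # normalize so images of different sizes are comparable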

  9. How to reduce dimensions? • Feature dimension is still quite high (512, 1024, etc.) • What now?

  10. Linear Dimensionality Reduction • Simplest way: Project all points on a plane (2D) or a lower-dimensional sub-space

  11. Linear Dimensionality Reduction • Simplest way: Project all points on a plane (2D) • Only one question: Which plane is the best? • PCA (SVD)

  12. Linear Dimensionality Reduction • Simplest way: Project all points on a plane (2D) • Only one question: Which plane is the best? • PCA (SVD) • For specific applications: • CCA (correlation) • LDA (data with labels) • NMF (non-negative components) • ICA (multiple sources)
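
A minimal PCA-by-SVD sketch in NumPy: center the data and project it onto the top-k right singular vectors, i.e. the k-dimensional subspace that retains the most variance (the random feature matrix and its size are only illustrative).

import numpy as np

def pca_project(X, k=2):
    Xc = X - X.mean(axis=0)                        # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                           # coordinates in the best-fitting k-D subspace

points_2d = pca_project(np.random.rand(1000, 512), k=2)   # e.g. 1000 feature vectors of dimension 512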

  13. Non-Linear Dimensionality Reduction • What if data is not linear? • No plane will work here

  14. Non-Linear Dimensionality Reduction • MDS – Multidimensional Scaling • Use only distances between elements • Try to reconstruct element positions from distances, such that the pairwise distances are preserved as well as possible • Reconstruction can happen in 1D, 2D, 3D, … • More dimensions = less error

  15. Non-Linear Dimensionality Reduction

  16. Non-Linear Dimensionality Reduction • MDS – Multidimensional Scaling • Classical MDS: an algebraic solution • Construct a squared proximity matrix and normalize it (“double centering”) • Extract the d largest eigenvectors / eigenvalues • Multiply each eigenvector by sqrt(eigenvalue) • Each row then gives the coordinates of its corresponding point
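
A minimal NumPy sketch of these steps, given an n-by-n matrix D of pairwise distances:

import numpy as np

def classical_mds(D, d=2):
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                    # double-centered squared proximities
    evals, evecs = np.linalg.eigh(B)               # eigh: B is symmetric
    order = np.argsort(evals)[::-1][:d]            # d largest eigenvalues
    scale = np.sqrt(np.maximum(evals[order], 0))   # guard against tiny negative eigenvalues
    return evecs[:, order] * scale                 # row i = coordinates of point i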

  17. Non-Linear Dimensionality Reduction • MDS – Multidimensional Scaling • Classical MDS: an algebraic solution • [Diagram: the resulting coordinate matrix, with points x1…x5 as rows and eigenvectors e1…e5 as columns – each eigenvector adds a dimension to the mapping]

  18. Non-Linear Dimensionality Reduction • Non-metric MDS: Optimization problem • Example: Sammon’s projection • Start from random positions for each element • Define the stress of the system: E = (1 / Σ d*ij) · Σ (d*ij – dij)² / d*ij, where d*ij are the original distances and dij the distances in the current low-dimensional layout • In each step, move towards positions that reduce the stress (gradient descent) • Continue until convergence
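
A minimal NumPy sketch of this loop; note that Sammon's original method uses a Newton-like step, while this simplified version applies plain gradient descent on the stress above (the learning rate and iteration count are arbitrary illustrative choices).

import numpy as np

def sammon(D, d=2, iters=500, lr=0.1, eps=1e-12):
    n = D.shape[0]
    c = D[np.triu_indices(n, 1)].sum()                   # normalization constant (sum of original distances)
    Y = np.random.rand(n, d)                             # random starting positions
    for _ in range(iters):
        diff = Y[:, None, :] - Y[None, :, :]             # pairwise differences in the layout
        dist = np.linalg.norm(diff, axis=2) + np.eye(n)  # avoid dividing by zero on the diagonal
        w = (D - dist) / (D * dist + eps)                # per-pair stress weights
        np.fill_diagonal(w, 0)
        grad = (-2.0 / c) * (w[:, :, None] * diff).sum(axis=1)
        Y -= lr * grad                                   # move towards lower stress
    return Y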

  19. Non-Linear Dimensionality Reduction • Spectral embedding: • Create a graph of nearest neighbors • Compute the graph Laplacian (relates to the probability of walking on each edge in a random walk) • Compute eigenvectors / eigenvalues – why? • Computing eigenvectors is like multiplying the matrix by itself many, many times (towards infinity), which is like performing random walks over and over until we reach a stable point • Again, the eigenvectors are the coordinates • Does not preserve distances like MDS – instead it groups together points that are likely neighbors
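
A minimal Laplacian-eigenmaps-style sketch in NumPy, assuming a data matrix X with one point per row: build a k-nearest-neighbor graph, form its (unnormalized) Laplacian, and use the eigenvectors of the smallest non-zero eigenvalues as coordinates, so points that are likely graph neighbors end up close together.

import numpy as np

def spectral_embedding(X, k=10, d=2):
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    W = np.zeros((n, n))
    for i in range(n):                         # connect each point to its k nearest neighbors
        nn = np.argsort(dist[i])[1:k + 1]
        W[i, nn] = 1.0
    W = np.maximum(W, W.T)                     # symmetrize the adjacency matrix
    L = np.diag(W.sum(axis=1)) - W             # graph Laplacian: degree matrix minus adjacency
    evals, evecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    return evecs[:, 1:d + 1]                   # skip the constant eigenvector, keep the next d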

  20. Non-Linear Dimensionality Reduction • Other non-linear methods • Locally Linear Embedding (LLE): express each point as a linear combination of its neighbors • Isomap: takes an adjacency graph as input and calculates MDS of the geodesic distances (distances along the graph) • Self Organizing Maps (SOM): Next part…
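
A minimal Isomap-style sketch in NumPy: build a k-nearest-neighbor graph, turn it into geodesic distances with Floyd–Warshall shortest paths, and feed those distances to the classical_mds function sketched after slide 16 (k must be large enough for the graph to stay connected).

import numpy as np

def isomap(X, k=10, d=2):
    n = X.shape[0]
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    G = np.full((n, n), np.inf)                # start with no edges
    np.fill_diagonal(G, 0)
    for i in range(n):                         # keep only edges to the k nearest neighbors
        nn = np.argsort(D[i])[1:k + 1]
        G[i, nn] = D[i, nn]
    G = np.minimum(G, G.T)                     # make the graph undirected
    for m in range(n):                         # Floyd–Warshall: all-pairs shortest (geodesic) paths
        G = np.minimum(G, G[:, [m]] + G[[m], :])
    return classical_mds(G, d)                 # classical MDS of the geodesic distances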

  21. Self Organizing Maps & recent applications June 2013 Computer Graphics Course

  22. Self Organizing Maps (SOM) • Originated from neural networks • Created by Kohonen, 1982 • Also known as Kohonen Maps • Teuvo Kohonen: a Finnish researcher working on learning and neural networks • Due to SOM, he became the most cited Finnish scientist! • More than 8,000 citations • So what is it?

  23. What is a SOM? • A type of neural network • What is a neuron? • A function with several inputs and one output • In this case – usually a linear combination of the input according to weights

  24. What is a SOM? • [Diagram: a grid of neurons, each holding a weight vector mi (components mik), all connected to the inputs xk; there are no connections (feedback / feed forward) between the neurons themselves]

  25. Training a SOM • Start from random weights • For each input X(t) at iteration t: • Find the Best Matching Cell (BMC) (also called Best Matching Unit or BMU) for X(t) • Update weights for each neuron close to the BMU • Weights are updated according to a decaying learning rate and radius
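
A minimal NumPy sketch of this training loop for an H-by-W grid of neurons; the exponential decay schedules for the learning rate and the neighborhood radius are illustrative choices, not the only possibility.

import numpy as np

def train_som(X, H=10, W=10, iters=5000, lr0=0.5, sigma0=5.0):
    dim = X.shape[1]
    M = np.random.rand(H, W, dim)                              # random initial weights
    rows, cols = np.indices((H, W))                            # grid coordinates of every neuron
    for t in range(iters):
        x = X[np.random.randint(len(X))]                       # pick an input x(t)
        dist = np.linalg.norm(M - x, axis=2)
        bi, bj = np.unravel_index(dist.argmin(), dist.shape)   # Best Matching Cell
        lr = lr0 * np.exp(-t / iters)                          # decaying learning rate
        sigma = sigma0 * np.exp(-t / iters)                    # decaying neighborhood radius
        g = np.sqrt((rows - bi) ** 2 + (cols - bj) ** 2)       # grid distance to the BMC
        h = lr * np.exp(-(g ** 2) / (2 * sigma ** 2))          # neighborhood kernel
        M += h[:, :, None] * (x - M)                           # pull the BMC and its neighbors towards x
    return M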

  26. Training a SOM • [Diagram: two training steps on the neuron grid (mi) – input X(1) pulls its best matching cell BMC(1) and nearby neurons towards it, then input X(2) does the same around BMC(2)]

  27. Training a SOM – The Math • Best Matching Cell: the neuron mc for which ||x(t) – mc(t)|| is minimal • Another option for BMC: maximal dot product x(t)^T mc(t) • Weight adaptation: mi(t+1) = mi(t) + hci(t) · [x(t) – mi(t)] • hci(t) is a learning rate dependent on both the time and the distance of mi from the BMC mc

  28. Training a SOM – The Math • Example (motion map): [Annotated formula for the learning kernel hci(t) – a Gaussian in the distance between the BMC and mi, with a learning rate and a kernel width that shrink with the iteration number; the annotations mark the distance between the BMC and mi, the learning rate, the kernel width, the maximum number of iterations, and the height and width of the neuron map]

  29. Presenting a SOM • Option 1: at each node, present the data that relates to vector mi (3D data, colors, continuous spaces) • So for a color map with 3 inputs, if a neuron’s weights are (0.7, 0.2, 0.3) we would show a reddish color with 0.7 red component, 0.2 green component and 0.3 blue component • For a map of points on the plane with 2 inputs, we would draw a point for each neuron at position (Wx, Wy)
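
A minimal sketch of "option 1" for such a color SOM, assuming matplotlib and a trained map M of shape (H, W, 3) with weights in [0, 1] (e.g. the output of the training sketch above): each neuron's three weights are simply displayed as an RGB cell.

import matplotlib.pyplot as plt

def show_color_som(M):
    plt.imshow(M)           # M has shape (H, W, 3): the weight vectors are shown directly as colors
    plt.axis("off")
    plt.show()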

  30. Presenting a SOM • Option 1: at each node present the data that relates to vector mi (3D data, colors, continuous spaces)

  31. Presenting a SOM • Option 2: represent each neuron by the sample from the training set X that is closest to its vector mi

  32. More Examples

  33. More Examples

  34. More Examples

  35. Motion Map • Motion Map: Image-based Retrieval and Segmentation of Motion Data • Sakamoto, Kuriyama, Kaneko • SCA: Symposium on Computer Animation 2004 • Goal: Presenting the user with a grid of postures in order to select a clip of motion data from a large database • Perform clustering on the SOM instead of on the abstract data

  36. Motion Map • Example results: 436 posture samples from 55K frames of 51 motion files

  37. Motion Map • Example results: Clustering based on SOM

  38. Motion Map - Details • A map of posture samples is created from all motion files together • Samples whose similarity to their closest sample exceeds a given threshold are discarded, to reduce computation time • A standard SOM is calculated • Each posture is then connected to a hash table of the motion files that contain similar postures • Clustering the SOM enables display of a simplified map to the user (next slide)

  39. Motion Map - Details • Simplified map after SOM clustering: 17 dance styles

  40. Procedural Texture Preview • Eurographics 2012 • Goal: Present the user with a single image that shows all possibilities of a procedural texture • Method overview: • Selecting candidate vectors of parameters which maximize completeness, variety and smoothness • Organizing the candidates in a SOM • Synthesis of a continuous map

  41. Procedural Texture Preview • Results: [Figure comparing, for several sets of texture parameters, thumbnails of random parameters against the texture preview combined into a single image]

  42. Procedural Texture Preview - Details • Selecting candidates for the parameter map using the following optimizations: C = a set of dense samples, X = the candidates in the parameter map • Completeness: minimize the distance from each sample in C to its closest candidate in X • Variety: maximize the distances between the candidates in X • Smoothness: minimize the distances between candidates that are adjacent in the map

  43. Procedural Texture Preview - Details • A standard SOM will jointly optimize the completeness and the smoothness • To optimize the variety as well, the SOM implementation switches between the variety and the completeness objectives • Instead of a regular learning-rate update, at each step the candidates (weight vectors) are replaced by new candidates according to the above optimizations

  44. Procedural Texture Preview - Details • After the candidate selection, an image is synthesized which smoothly combines all selected candidates • Stitching is done using standard patch-based texture synthesis methods (Graphcut Textures, Kwatra et al., TOG 2003)

  45. Procedural Texture Preview • Some more results

  46. That’s all folks! • Questions?
