
Dimensionality Reduction and Metric Learning STAT 946


Presentation Transcript


  1. Dimensionality Reduction and Metric Learning STAT 946 Ali Ghodsi Spring 2009

  2. Knowledge is power [slide fills the screen with long strings of raw digits — measurements without interpretation]

  3. Two Problems Classical statistics: • Infer information from small data sets (not enough data) Machine learning: • Infer information from large data sets (too much data)

  4. Other Names for ML • Data mining • Applied statistics • Adaptive (stochastic) signal processing • Probabilistic planning or reasoning These are all closely related to the second problem (large data sets).

  5. Applications Machine Learning is most useful when the structure of the task is not well understood but can be characterized by a dataset with strong statistical regularity. • Search and recommendation (e.g. Google, Amazon) • Automatic speech recognition and speaker verification • Text parsing • Face identification • Tracking objects in video • Financial prediction, fraud detection (e.g. credit cards) • Medical diagnosis

  6. Tasks • Supervised learning: given examples of inputs and corresponding desired outputs, predict outputs on future inputs. e.g.: classification, regression • Unsupervised learning: given only inputs, automatically discover representations, features, structure, etc. e.g.: clustering, dimensionality reduction, feature extraction

  7. Dimensionality Reduction • Dimensionality: the number of measurements available for each item in a data set. • The dimensionality of real-world items is very high. For example, the dimensionality of a 600 by 600 image is 360,000. • The key to analyzing data is comparing these measurements to find relationships among the data points. • Usually these measurements are highly redundant, and relationships among data points are predictable.
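As a minimal sketch of the counting argument (using NumPy; the blank image is a stand-in for real data), a 600 by 600 grayscale image flattens into one point in a 360,000-dimensional space:

```python
import numpy as np

# A grayscale 600-by-600 image, viewed as one point in a
# 360,000-dimensional space: each pixel is one coordinate.
image = np.zeros((600, 600))
vector = image.reshape(-1)   # flatten to a single measurement vector
print(vector.shape)          # (360000,)
```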

  8. Dimensionality Reduction • Knowing the value of a pixel in an image, it is easy to predict the values of nearby pixels, since they tend to be similar. • Knowing that the word “corporation” occurs often in articles about economics but rarely in articles about art and poetry, it is easy to predict that it will not occur often in articles about love. • Although there are many measurements per item, far fewer are likely to vary. A representation that keeps only the dimensions likely to vary makes changes in high-dimensional data easy to recognize.
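The redundancy argument can be illustrated with a synthetic data set (invented for this example, not from the slides): 50 measurements per item that are all driven by just two underlying factors. The singular values of the centered data show that only two directions genuinely vary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical redundant data: 500 items, 50 measurements each,
# but every measurement is a mixture of only 2 underlying factors.
factors = rng.standard_normal((500, 2))
mixing = rng.standard_normal((2, 50))
data = factors @ mixing + 0.01 * rng.standard_normal((500, 50))

# Singular values reveal how many directions actually vary:
# the first two are large, the remaining ones are near zero.
centered = data - data.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)
print(s[:4])
```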

  9. Data Representation

  10. Data Representation

  11. Data Representation

  12. [Figure: each 23 by 28 image is stacked as a column of a 644 by 103 data matrix, which is factored into a 644 by 2 basis times a 2 by 103 coordinate matrix; each image is summarized by a 2 by 1 coordinate vector such as (-2.19, -3.19) or (-0.02, 1.02) and reconstructed as a 23 by 28 image.]
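The dimensions on this slide describe a rank-2 factorization: 103 images of 23 by 28 = 644 pixels each, factored as a 644 by 2 basis times 2 by 103 coordinates. A hedged sketch with random stand-in data (the slides presumably used real images):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the slide's data: 103 images of 23-by-28 = 644
# pixels each, stacked as the columns of a 644-by-103 matrix.
X = rng.standard_normal((644, 103))

# Rank-2 factorization via SVD: X is approximated by the product
# of a 644-by-2 basis and a 2-by-103 coordinate matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
basis = U[:, :2]                      # 644 by 2
coords = np.diag(s[:2]) @ Vt[:2, :]   # 2 by 103

# Each image is reconstructed from its 2-by-1 coordinate vector.
approx = basis @ coords               # 644 by 103
first_image = (basis @ coords[:, 0]).reshape(23, 28)
print(approx.shape, first_image.shape)
```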

  13. Arranging words: each word was initially represented by a high-dimensional vector counting the number of times it appeared in different encyclopedia articles. After dimensionality reduction, words with similar contexts lie near one another.
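A toy illustration of the same idea (the corpus and counts below are invented for the example): represent each word by its per-article count vector, then project to two dimensions with an SVD so words used in similar contexts land close together.

```python
import numpy as np

# Hypothetical toy corpus: rows = words, columns = articles,
# entries = how often each word appears in each article.
words = ["corporation", "profit", "poetry", "love"]
counts = np.array([
    [9, 7, 0, 0],   # corporation: frequent in economics articles
    [8, 6, 0, 1],   # profit
    [0, 1, 7, 8],   # poetry: frequent in arts articles
    [0, 0, 6, 9],   # love
], dtype=float)

# Reduce each word's count vector to 2-D with an SVD; words with
# similar contexts receive nearby low-dimensional coordinates.
U, s, Vt = np.linalg.svd(counts - counts.mean(axis=1, keepdims=True))
embedding = U[:, :2] * s[:2]
for w, xy in zip(words, embedding):
    print(w, xy.round(2))
```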

  14. Different Features

  15. Glasses vs. No Glasses

  16. Beard vs. No Beard

  17. Beard Distinction

  18. Glasses Distinction

  19. Multiple-Attribute Metric

  20. Embedding of a sparse music similarity graph (Platt, 2004)

  21. Reinforcement learning (Mahadevan and Maggioni, 2005)

  22. Semi-supervised learning Use a graph-based discretization of the manifold to infer missing labels: build classifiers from the bottom eigenvectors of the graph Laplacian. (Belkin & Niyogi, 2004; Zien et al., Eds., 2005)
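A minimal sketch of this idea on a hypothetical toy graph (the graph and the two given labels are invented for illustration): the bottom nontrivial eigenvector of the Laplacian varies slowly over the graph, so its sign pattern respects the cluster structure and can propagate labels.

```python
import numpy as np

# Hypothetical toy graph: nodes 0-2 densely connected, nodes 3-5
# densely connected, with one weak bridge edge between the groups.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

# Graph Laplacian L = D - W; its bottom eigenvectors are the
# smoothest functions on the graph.
L = np.diag(W.sum(axis=1)) - W
eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]          # second-smallest eigenvector

# Infer missing labels: suppose only node 0 (+1) and node 5 (-1)
# are labelled; classify each node by whether its sign in the
# Fiedler vector matches node 0's sign.
labels = np.where(np.sign(fiedler) == np.sign(fiedler[0]), +1, -1)
print(labels)
```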

  23. Learning correspondences: how can we learn manifold structure that is shared across multiple data sets? (c et al, 2003, 2005)

  24. Mapping and robot localization (Bowling, Ghodsi, Wilkinson, 2005; Ham, Lin, D.D., 2005)

  25. The Big Picture

  26. Manifold and Hidden Variables

  27. Reading • Journals: Neural Computation, JMLR, Machine Learning, IEEE PAMI • Conferences: NIPS, UAI, ICML, AISTATS, IJCAI, IJCNN • Vision: CVPR, ECCV, SIGGRAPH • Speech: EuroSpeech, ICSLP, ICASSP • Online: CiteSeer, Google • Books: Elements of Statistical Learning (Hastie, Tibshirani, Friedman); Learning from Data (Cherkassky, Mulier); Machine Learning (Mitchell); Neural Networks for Pattern Recognition (Bishop); Introduction to Graphical Models (Jordan et al.)
