
Data Visualization and Feature Selection: New Algorithms for Nongaussian Data




Presentation Transcript


  1. Data Visualization and Feature Selection: New Algorithms for Nongaussian Data Howard Hua Yang and John Moody, NIPS '99

  2. Contents • Data visualization • Good 2-D projections for high dimensional data interpretation • Feature selection • Eliminate redundancy • Joint mutual information • ICA

  3. Introduction • Visualization of input data and feature selection are intimately related. • Input variable selection is the most important step in the model selection process. • Model-independent approaches select input variables before the model is specified. • Data visualization helps humans understand the structural relations among the variables in a system.

  4. Joint mutual information for input/feature selection • Mutual information • Kullback-Leibler divergence • Joint mutual information
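The equations on this slide did not survive the transcript; the standard information-theoretic definitions the three bullets refer to are, as a sketch:

```latex
% Mutual information between an input X and the target Y
I(X;Y) \;=\; \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}

% ... which is the Kullback-Leibler divergence between the joint
% distribution and the product of the marginals:
I(X;Y) \;=\; D\!\left(p(x,y)\,\middle\|\,p(x)\,p(y)\right),
\qquad
D(p\,\|\,q) \;=\; \sum_{x} p(x)\,\log\frac{p(x)}{q(x)}

% Joint mutual information of an input pair with the target,
% and its chain-rule decomposition into marginal plus conditional MI:
I\big((X_i,X_j);Y\big) \;=\; I(X_j;Y) \;+\; I(X_i;Y \mid X_j)
```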

  5. Conditional MI • When • Use the joint mutual information, rather than the marginal mutual information, to select inputs for a neural network classifier and for data visualization.
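A self-contained illustration (not from the paper) of why joint MI can succeed where marginal MI fails: for an XOR target, each input alone is independent of the output, yet the pair determines it exactly. A minimal plug-in estimator over integer-coded variables:

```python
import numpy as np

def contingency(a, b):
    """Joint count table for two integer-coded variables."""
    t = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(t, (a, b), 1)
    return t

def mutual_info(a, b):
    """I(A; B) in nats, estimated from empirical counts."""
    p = contingency(a, b)
    p = p / p.sum()
    pa = p.sum(axis=1, keepdims=True)
    pb = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (pa @ pb)[nz])).sum())

def joint_mutual_info(xi, xj, y):
    """I((X_i, X_j); Y): encode the input pair as a single variable."""
    pair = xi * (xj.max() + 1) + xj
    return mutual_info(pair, y)

# XOR: marginal MI of each input with y is 0, but the JMI is log 2.
x1 = np.array([0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1])
y = x1 ^ x2
```

Ranking inputs by `joint_mutual_info` against an already-selected input is the kind of selection criterion the slides describe; a marginal-MI ranking would discard both XOR inputs.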

  6. Data visualization methods • Supervised methods based on JMI (cf. CCA) • Unsupervised methods based on ICA (cf. PCA) • Efficient method for JMI
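For the unsupervised, ICA-based 2-D projection, a rough NumPy sketch (not the authors' implementation) of deflationary FastICA with a tanh nonlinearity; the mixing matrix and sample sizes below are illustrative:

```python
import numpy as np

def whiten(X):
    """Center and ZCA-whiten the data (samples x dims)."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    d, E = np.linalg.eigh(cov)
    return Xc @ (E @ np.diag(1.0 / np.sqrt(d)) @ E.T)

def fastica(X, n_components=2, n_iter=200, seed=0):
    """Deflationary FastICA: extract components one at a time."""
    Z = whiten(X)
    rng = np.random.default_rng(seed)
    n, dims = Z.shape
    W = np.zeros((n_components, dims))
    for k in range(n_components):
        w = rng.normal(size=dims)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            s = Z @ w
            g, gp = np.tanh(s), 1.0 - np.tanh(s) ** 2
            w_new = (Z * g[:, None]).mean(axis=0) - gp.mean() * w
            # Deflate: orthogonalize against previously found components.
            w_new -= W[:k].T @ (W[:k] @ w_new)
            w_new /= np.linalg.norm(w_new)
            w = w_new
        W[k] = w
    return Z @ W.T  # 2-D coordinates for plotting

# Demo on synthetic mixtures of two independent uniform sources.
rng = np.random.default_rng(1)
S = rng.uniform(-1.0, 1.0, size=(2000, 2))   # independent sources
A = np.array([[2.0, 1.0], [1.0, 1.5]])       # hypothetical mixing matrix
X = S @ A.T                                  # observed mixtures
Y = fastica(X, n_components=2)               # unsupervised 2-D projection
```

PCA would return the directions of maximal variance of `X`; the ICA projection instead recovers statistically independent axes, which is the contrast the slide draws.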

  7. Application to Signal Visualization and Classification • JMI and visualization of radar pulse patterns • Radar pattern • 15-dimensional vector, 3 classes • Compute JMIs, select inputs

  8. Radar pulse classification • 7 hidden units • Experiments • all inputs vs. 4 selected inputs • 4 inputs with the largest JMI vs. randomly selected 4 inputs

  9. Conclusions • Advantages of the JMI • Can distinguish inputs even when all of them have the same marginal mutual information with the output • Can eliminate redundancy in the inputs when one input is a function of other inputs
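The redundancy claim can be checked numerically: if one input is a deterministic function of another, the pair carries exactly the same information about the target as the single input. A small sketch with assumed synthetic data (not the paper's radar data):

```python
import numpy as np

def mi(a, b):
    """I(A; B) in nats from empirical joint counts of integer codes."""
    t = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(t, (a, b), 1)
    p = t / t.sum()
    pa = p.sum(axis=1, keepdims=True)
    pb = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (pa @ pb)[nz])).sum())

rng = np.random.default_rng(0)
x1 = rng.integers(0, 4, 5000)
x2 = (x1 + 1) % 4             # x2 is a function of x1: fully redundant
y = (x1 >= 2).astype(int)     # target depends on x1 only
pair = x1 * 4 + x2            # encode (x1, x2) as one variable
# I((X1, X2); Y) equals I(X1; Y): adding the redundant input gains nothing.
```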
