Data Visualization and Feature Selection: New Algorithms for Nongaussian Data. Howard Hua Yang and John Moody, NIPS '99.
Contents • Data visualization: good 2-D projections for interpreting high-dimensional data • Feature selection: eliminate redundancy • Joint mutual information • ICA
Introduction • Visualization of input data and feature selection are intimately related. • Input variable selection is the most important step in the model selection process. • Model-independent approaches select input variables before the model is specified. • Data visualization helps humans understand the structural relations among the variables in a system.
Joint mutual information for input/feature selection • Mutual information • Kullback-Leibler divergence • Joint mutual information
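For reference, the standard textbook definitions of the three quantities named on this slide, written for discrete variables. The notation ($X_1, \dots, X_k$ for the inputs, $Y$ for the class label) is an assumption and may differ from the authors' own.

```latex
% Mutual information between one input X and the target Y
I(X;Y) = H(Y) - H(Y \mid X)
       = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}

% Kullback-Leibler divergence between two distributions p and q
D(p \,\|\, q) = \sum_{z} p(z)\,\log\frac{p(z)}{q(z)},
\qquad\text{so}\qquad
I(X;Y) = D\bigl(p(x,y) \,\|\, p(x)\,p(y)\bigr)

% Joint mutual information between k inputs and the target
I(X_1,\dots,X_k;Y)
  = D\bigl(p(x_1,\dots,x_k,y) \,\|\, p(x_1,\dots,x_k)\,p(y)\bigr)
```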
Conditional MI • When • Use joint mutual information instead of the mutual information to select inputs for a neural network classifier and for data visualization.
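A minimal sketch of how conditional and joint mutual information relate, assuming discrete inputs and simple plug-in (frequency count) estimates. The function names and the toy XOR data are illustrative assumptions, not taken from the slides. It relies on the chain rule I(X, Z; Y) = I(Z; Y) + I(X; Y | Z).

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def joint_mutual_information(columns, ys):
    """I(X_1,...,X_k;Y): treat the tuple of inputs as a single variable."""
    return mutual_information(list(zip(*columns)), ys)


def conditional_mutual_information(xs, ys, given_columns):
    """I(X;Y | Z_1,...,Z_m) via the chain rule: I(X,Z;Y) - I(Z;Y)."""
    return (joint_mutual_information([xs] + given_columns, ys)
            - joint_mutual_information(given_columns, ys))


# Toy data (hypothetical): y is the XOR of x1 and x2; x3 duplicates x1.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
x3 = list(x1)
y = [a ^ b for a, b in zip(x1, x2)]

print("I(x1;y)       =", mutual_information(x1, y))
print("I(x2;y)       =", mutual_information(x2, y))
print("I(x1,x2;y)    =", joint_mutual_information([x1, x2], y))
print("I(x2;y|x1)    =", conditional_mutual_information(x2, y, [x1]))
print("I(x3;y|x1,x2) =", conditional_mutual_information(x3, y, [x1, x2]))
```

In this toy case each input alone carries no information about y, the pair (x1, x2) carries one full bit, and the duplicated input x3 adds nothing once x1 and x2 are known. Scoring inputs one at a time would miss both effects.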
Data visualization methods • Supervised methods based on JMI (cf. CCA) • Unsupervised methods based on ICA (cf. PCA) • Efficient method for JMI
Application to Signal Visualization and Classification • JMI and visualization of radar pulse patterns • Radar pattern: 15-dimensional vector, 3 classes • Compute JMIs, select inputs
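The "compute JMIs, select inputs" step could be sketched as a greedy forward search. This is an illustrative assumption: the slides do not say how the authors searched over input subsets, and their efficient JMI estimator is not reproduced here. The sketch uses plug-in estimates on already-discretized inputs, and the toy data below is hypothetical, not the radar data.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def jmi(columns, ys):
    """Plug-in estimate of I(X_1,...,X_k;Y) for discrete input columns."""
    joint_x = list(zip(*columns))
    return entropy(joint_x) + entropy(ys) - entropy(list(zip(joint_x, ys)))


def greedy_jmi_selection(columns, ys, k):
    """Pick k input indices, each time adding the input that maximizes
    the JMI of the selected set with the class label."""
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda j: jmi([columns[i] for i in selected] + [columns[j]], ys),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 3 classes, 4 discrete inputs.
ys = [0, 0, 1, 1, 2, 2]
columns = [
    [0, 0, 1, 1, 1, 1],  # input 0: separates class 0 from the rest
    [0, 0, 1, 1, 1, 1],  # input 1: exact copy of input 0 (redundant)
    [0, 0, 0, 0, 1, 1],  # input 2: separates class 2 from the rest
    [0, 1, 0, 1, 0, 1],  # input 3: unrelated to the class label
]

selected = greedy_jmi_selection(columns, ys, k=2)
print("selected inputs:", selected)
print("JMI of selection:", jmi([columns[i] for i in selected], ys))
```

Inputs 0, 1, and 2 have the same individual mutual information with the label, so ranking inputs one at a time cannot separate them. Scoring the pair jointly avoids choosing the duplicated inputs together. Plug-in estimates become unreliable as the number of jointly scored inputs grows relative to the sample size, so continuous, high-dimensional data such as the 15-dimensional radar patterns would need discretization or a density-based estimator.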
Radar pulse classification • Neural network classifier with 7 hidden units • Experiments • All inputs vs. 4 selected inputs • The 4 inputs with the largest JMI vs. 4 randomly selected inputs
Conclusions • Advantages of JMI over single-input MI • Can distinguish inputs when all of them have the same mutual information • Can eliminate redundancy in the inputs when one input is a function of other inputs