Visualization of High-Dimensional Datasets. Jahangheer Shaik
Why do we need Visualization? Data visualization reduces cognitive load, making it easier to turn data into information and knowledge for subsequent applications. It helps answer questions about the data: • Noise? • Distribution? • Classes? • Structure?
Line Graphs • Line graphs display single-valued or piecewise-continuous functions of one variable
Problems • Different line styles (colors, dashes) are needed to distinguish the labeled classes • Each dimension may have a different scale
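The scale problem is commonly handled by rescaling every dimension to a shared range before plotting. The following is a minimal sketch, assuming min-max scaling to [0, 1]; the slides do not say which normalization was used, and the function name `min_max_scale` and the sample data are illustrative only.

```python
def min_max_scale(rows):
    """Rescale each column (dimension) of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))  # transpose: one tuple per dimension
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    # Transpose back so each row is one sample again.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical samples: (height in cm, income in dollars)
samples = [(150.0, 20000.0), (175.0, 50000.0), (200.0, 80000.0)]
scaled = min_max_scale(samples)
```

After scaling, both dimensions share the same vertical range, so lines for different dimensions can be compared on one axis.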
Bar Charts, Histograms • A histogram shows how many values fall into each bin; when normalized, it approximates a discrete probability distribution
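To make the link between histograms and probabilities concrete, here is a small sketch that counts values into equal-width bins and normalizes the counts so they sum to 1. The function name `normalized_histogram`, the bin range, and the data are assumptions chosen for illustration.

```python
def normalized_histogram(values, num_bins, lo, hi):
    """Count values into equal-width bins over [lo, hi], then normalize to sum to 1."""
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for v in values:
        if lo <= v <= hi:
            # Clamp the index so that v == hi falls in the last bin.
            index = min(int((v - lo) / width), num_bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * num_bins


data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
probs = normalized_histogram(data, num_bins=5, lo=0, hi=10)
```

Each entry of `probs` is the fraction of samples in that bin, which is what the bar heights of a normalized histogram represent.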
Scatter Plot • Most popular tool • Helps find clusters, outliers, trends, correlations, etc. • Glyphs, icons, colors, etc. may be used to encode additional information • Becomes less intuitive as the number of dimensions increases
Eigenvectors (contd.) • A transformation matrix maps a vector from its original position to another position • If the transformation yields a scalar multiple of the vector itself, then that vector and all of its nonzero multiples are eigenvectors of the transformation matrix
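The defining property can be checked numerically: an eigenvector keeps its direction under the transformation and is only scaled. A minimal sketch, using a hypothetical 2 x 2 scaling matrix `A` and a hand-written helper `mat_vec` (neither is from the slides):

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


A = [[2.0, 0.0],
     [0.0, 3.0]]  # scales x by 2 and y by 3

v = [1.0, 0.0]      # lies along the x axis
Av = mat_vec(A, v)  # same direction as v, scaled by 2: v is an eigenvector

w = [1.0, 1.0]
Aw = mat_vec(A, w)  # direction changes: w is not an eigenvector

scaled_v = [5.0 * x for x in v]
A_scaled_v = mat_vec(A, scaled_v)  # still scaled by 2: multiples of v are eigenvectors too
```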
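Properties of eigenvectors • Eigenvectors are defined only for square matrices • An n x n matrix has at most n linearly independent eigenvectors (exactly n when the matrix is symmetric) • Only the direction matters, not the scale • Eigenvectors of a symmetric matrix (such as a covariance matrix) that belong to distinct eigenvalues are orthogonal to each other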
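Orthogonality is guaranteed when the matrix is symmetric, as covariance matrices are. The sketch below uses the closed-form solution for a symmetric 2 x 2 matrix [[a, b], [b, d]] and checks that the two eigenvectors have zero dot product. The function name `symmetric_2x2_eigen` and the example matrix are illustrative assumptions, and the formula assumes b is nonzero.

```python
import math


def symmetric_2x2_eigen(a, b, d):
    """Eigenpairs of the symmetric matrix [[a, b], [b, d]], assuming b != 0."""
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    pairs = []
    for lam in (mean + radius, mean - radius):
        # (b, lam - a) satisfies the first row of (A - lam*I) v = 0.
        pairs.append((lam, (b, lam - a)))
    return pairs


(lam1, v1), (lam2, v2) = symmetric_2x2_eigen(2.0, 1.0, 2.0)
dot = v1[0] * v2[0] + v1[1] * v2[1]  # zero when the eigenvectors are orthogonal
```

The eigenvectors are returned unnormalized, which is harmless here because only their direction matters.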
Linear Discriminant Analysis • Finds the projection that maximizes the ratio of between-class variance to within-class variance
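For two classes, the direction that maximizes this ratio is proportional to S_W^{-1}(m1 - m0), where S_W is the within-class scatter matrix and m0, m1 are the class means (Fisher's formulation). The following is a minimal 2D sketch under that two-class assumption; the slides do not specify the variant used, and the helper names and data points are hypothetical.

```python
def mean(points):
    """Mean of a list of 2D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """2x2 scatter matrix: sum of outer products of the centered points."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return [[sxx, sxy], [sxy, syy]]


def fisher_direction(class0, class1):
    """Projection direction w = S_W^{-1} (m1 - m0) for two classes in 2D."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]  # assumed nonzero
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Apply the closed-form inverse of the 2x2 matrix S_W to diff.
    return [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
            (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]


# Hypothetical labeled samples
class0 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (4.0, 5.0)]
class1 = [(4.0, 1.0), (5.0, 2.0), (6.0, 2.0), (7.0, 4.0)]

w = fisher_direction(class0, class1)
proj0 = [w[0] * x + w[1] * y for x, y in class0]  # 1D projections of class 0
proj1 = [w[0] * x + w[1] * y for x, y in class1]  # 1D projections of class 1
```

Projecting onto `w` reduces each 2D sample to one number, and for this data the two classes occupy disjoint intervals along that axis.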
Dimensions: Orthogonality • Dimensions are arranged so that their axes are orthogonal to each other • Inselberg points out that orthogonal axes use up the available display space rapidly
Star Coordinate Projection • J. Shaik and M. Yeasin, "Visualization of High Dimensional Data using an Automated 3D Star Co-ordinate System," Proceedings of IEEE IJCNN'06, Vancouver, Canada, pp. 1339-1346, 2006.