Neural Network for classification • Harnessing the power of a neural network for classifying samples.
Neural Network for classification • Reduce the no. of genes: • We have to reduce the data dimensionality, i.e. reduce the no. of genes to consider. • PCA can be used to select the most informative genes, but it is computationally expensive to obtain the eigenvectors for high-dimensional data. • Instead, use the method suggested by Golub et al. to obtain the informative genes.
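To make the gene-selection step concrete, here is a minimal sketch (not the authors' code): it ranks genes by the signal-to-noise statistic (mu1 − mu2)/(sigma1 + sigma2) from Golub et al. and keeps the top-scoring ones; the cutoff of 50 genes is an illustrative assumption.

```python
import numpy as np

def golub_scores(X, y):
    """Golub et al. signal-to-noise statistic per gene:
    (mu_class1 - mu_class2) / (sigma_class1 + sigma_class2).
    X: samples x genes expression matrix, y: binary class labels (0/1)."""
    X1, X2 = X[y == 0], X[y == 1]
    return (X1.mean(axis=0) - X2.mean(axis=0)) / (X1.std(axis=0) + X2.std(axis=0))

def select_informative_genes(X, y, n_genes=50):
    """Return the column indices of the n_genes with the largest |score|.
    n_genes=50 is only an illustrative choice."""
    scores = np.abs(golub_scores(X, y))
    return np.argsort(scores)[::-1][:n_genes]
```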
Neural Network for classification • Steps in classification: • Obtain the informative genes using Golub's method. • Normalize each gene by subtracting its mean & dividing by its standard deviation. • Train the neural network on the training data & targets, and obtain the weights. • Classify the test data using the weights obtained above.
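A hedged sketch of these four steps, assuming a two-class problem and using scikit-learn's MLPClassifier as a stand-in for whatever network the authors actually used (the slides do not specify the architecture); the hidden-layer size, iteration count and gene count below are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def classify_with_nn(X_train, y_train, X_test, n_genes=50):
    # Step 1: pick informative genes with Golub's signal-to-noise statistic
    g1, g2 = X_train[y_train == 0], X_train[y_train == 1]
    s2n = np.abs((g1.mean(axis=0) - g2.mean(axis=0)) /
                 (g1.std(axis=0) + g2.std(axis=0)))
    idx = np.argsort(s2n)[::-1][:n_genes]
    Xtr, Xte = X_train[:, idx], X_test[:, idx]

    # Step 2: normalize each gene: subtract mean, divide by std (fit on training data)
    mu, sigma = Xtr.mean(axis=0), Xtr.std(axis=0)
    Xtr, Xte = (Xtr - mu) / sigma, (Xte - mu) / sigma

    # Steps 3-4: train the network on the training data & targets, then classify the test data
    net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
    net.fit(Xtr, y_train)
    return net.predict(Xte)
```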
Neural Network for classification • Results obtained:
Hierarchical Merging: When to stop? • Question: When to stop the merging? • Suggested solutions: • Diameter(C) ≤ MaxD • Avg(sim(Oi, Oj)) ≥ ε (a similarity threshold), for Oi, Oj ∈ C • Problem: It is difficult to estimate these parameters in high dimensions.
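A small sketch of how the two suggested criteria could be checked; the similarity function sim = 1 / (1 + distance) and the thresholds MaxD and ε are assumptions, since the slide does not define them.

```python
import numpy as np
from scipy.spatial.distance import pdist

def diameter(points):
    """Diameter(C): the largest pairwise distance within cluster C."""
    return float(pdist(points).max()) if len(points) > 1 else 0.0

def avg_similarity(points):
    """Avg(sim(Oi, Oj)) over all pairs in C; sim = 1/(1 + distance) is assumed."""
    return float(np.mean(1.0 / (1.0 + pdist(points)))) if len(points) > 1 else 1.0

def merging_allowed(clusters, max_d=2.0, eps=0.3):
    """Keep merging only while every cluster satisfies both criteria.
    max_d and eps are the hard-to-estimate thresholds mentioned on the slide."""
    return all(diameter(c) <= max_d and avg_similarity(c) >= eps for c in clusters)
```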
Hierarchical Merging: When to stop? • Another solution: Stop merging when m clusters are present. • Problem: The m clusters might include single-point clusters. • Use the concept of MinPts (from DBScan): a set of points is a significant cluster only if it contains at least MinPts points. • When there are m significant clusters, stop.
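A sketch of this MinPts-based stopping rule, assuming scipy's average-linkage agglomerative clustering as the merging procedure (the slides do not name one); the values of m and MinPts are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def merge_until_m_significant(X, m=3, min_pts=5):
    """Follow the agglomerative merge sequence and stop once m 'significant'
    clusters (clusters with at least min_pts members) are present."""
    Z = linkage(X, method='average')
    n = X.shape[0]
    # Walk from n singleton clusters down towards one cluster.
    for n_clusters in range(n, 1, -1):
        labels = fcluster(Z, t=n_clusters, criterion='maxclust')
        sizes = np.bincount(labels)[1:]              # size of each cluster
        if int((sizes >= min_pts).sum()) == m:       # m significant clusters -> stop
            return labels
    # Fallback if the rule is never satisfied: just cut into m clusters.
    return fcluster(Z, t=m, criterion='maxclust')
```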
Hierarchical Merging: When to stop? • [Plot: no. of significant clusters vs. no. of iterations]
Visualization of data: Vizstruct • Equation used: • How do we weigh each dimension, i.e. how do we select λ? Default value = 0.5 • Use the eigenvalues of each dimension to obtain the value of λ.
Visualization of data: Vizstruct • Steps for visualization: • Project the data into eigen space. • The eigenvalue of each dimension i gives λi. • Now use the same formula as above for calculating the 2D point, where λi = eigenvalue of the ith dimension.
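The slide's formula did not survive the text export, so the sketch below is built on an assumption: the standard Vizstruct mapping (the first Fourier harmonic of each sample), with every dimension additionally weighted by its eigenvalue λi after projecting into eigen space. The normalisation of the weights is also an assumption, not quoted from the slides.

```python
import numpy as np

def eigen_weighted_vizstruct(X):
    """2D projection: rotate into eigen space, then apply a Vizstruct-style
    first-harmonic mapping with each dimension weighted by its eigenvalue.
    The exact weighting on the slide is assumed, not quoted."""
    Xc = X - X.mean(axis=0)
    # Eigen-decomposition of the covariance matrix (eigenvalues sorted descending)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 1: project the data into eigen space
    scores = Xc @ eigvecs

    # Step 2: first Fourier harmonic, each dimension scaled by lambda_i (assumed form)
    p = scores.shape[1]
    phases = np.exp(2j * np.pi * np.arange(p) / p)
    weights = eigvals / eigvals.sum()        # normalised lambda_i (assumption)
    z = scores @ (weights * phases)          # one complex number per sample
    return np.c_[z.real, z.imag]             # 2D coordinates
```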
Visualization of data: Vizstruct • Results: • The visualization obtained by this method is more representative of the data than the original Vizstruct projection. • Demo