Breast Cancer Diagnosis via Neural Network Classification Jing Jiang May 10, 2000
Outline • Introduction and Motivation • K-means, k-nearest neighbor and maximum likelihood classification • Backpropagation multi-layer perceptron • Support vector machine (SVM) • Learning vector quantization (LVQ) • Linear programming
Introduction and Motivation • The data file contains 30 attributes for both benign and malignant fine needle aspirates (FNAs). • Our goals are to find a discriminating function that determines whether an unknown sample is benign or malignant, and to choose a pair of the 30 attributes to be used in diagnosis. • Linear programming has done a good job of solving this problem. • We expect that neural network classification algorithms can also be useful for this problem.
K-means • First we use the k-means algorithm to find the clusters of the training data set. • The k-means algorithm does not give us a discriminating function.
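The slides do not include code, so as an illustrative sketch (not the original implementation), here is a minimal k-means in NumPy on synthetic 2-D data standing in for two of the 30 FNA attributes; the initial centroids are passed in explicitly for reproducibility:

```python
import numpy as np

def kmeans(X, centroids, iters=50):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid update for a fixed iteration budget."""
    centroids = centroids.copy()
    for _ in range(iters):
        # Assign each sample to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned samples
        # (keep the old centroid if a cluster went empty).
        centroids = np.array([
            X[labels == j].mean(axis=0) if (labels == j).any() else centroids[j]
            for j in range(len(centroids))
        ])
    return centroids, labels

# Synthetic 2-D data: 20 "benign-like" points near the origin and
# 20 "malignant-like" points near (5, 5).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])
centroids, labels = kmeans(X, np.array([[1.0, 1.0], [4.0, 4.0]]))
```

As the slide notes, this yields cluster centers and assignments but no explicit discriminating function for new samples.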
KNN and ML • Confusion matrices for 100 nearest neighbors, for 20 nearest neighbors, and for the maximum likelihood algorithm. [Matrices shown as figures in the original slides.]
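For reference, a minimal k-nearest-neighbor classifier of the kind the slide evaluates can be sketched as follows; the toy data and the benign/malignant label encoding (0/1) are illustrative assumptions, not the study's data:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k):
    # Majority vote among the k training samples closest to x.
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(d)[:k]
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]

# Toy 2-attribute data: 0 = benign, 1 = malignant (hypothetical labels).
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(X_train, y_train, np.array([0.5, 0.5]), k=3)  # → 0
```

The slide's k = 100 and k = 20 runs use the same rule with larger neighborhoods over the full training set.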
BP-MLP • After careful choice of network parameters, we get the same confusion matrix (Cmat) and classification rate (C-rate) for the 30-attribute problem and for any 2-attribute problem. • It is interesting to note that these are the same as the results we get with the ML method. • The low classification rate may be due to the data not being linearly separable.
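A backpropagation MLP of the kind used here can be sketched from scratch in NumPy; the architecture (one hidden layer of 4 sigmoid units, MSE loss) and the toy separable data are assumptions for illustration, not the slide's actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 2-attribute training set; the label equals the first attribute,
# so this example is linearly separable and easy to learn.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [0.0], [1.0], [1.0]])

# One hidden layer of 4 sigmoid units, sigmoid output, MSE loss.
W1 = rng.normal(0.0, 0.5, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 0.5, (4, 1)); b2 = np.zeros(1)

lr, losses = 1.0, []
for _ in range(2000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))
    # Backward pass: chain rule through the MSE and both sigmoids.
    d_out = (out - y) * out * (1 - out) * (2.0 / len(X))
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
```

On data that is not linearly separable, the same loop can stall in poor local minima, which is consistent with the low classification rate the slide reports.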
Support Vector Machine • For attributes 1 and 23, we have 6 errors in testing. • For attributes 14 and 28, we have 8 errors in testing. • It takes a long time to train an SVM for the 30-attribute problem; even the 2-attribute case is time-consuming.
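As one illustrative sketch (the slide does not say which SVM solver was used), a linear SVM can be trained in the primal by subgradient descent on the hinge loss; the data, learning rate, and regularization constant here are assumptions:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Primal linear SVM via subgradient descent on the hinge loss.
    Labels y must be +1 / -1."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:      # margin violated: hinge gradient
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                           # margin satisfied: only shrink w
                w -= lr * lam * w
    return w, b

# Toy 2-attribute data with labels -1 (benign) / +1 (malignant), hypothetical.
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [6.0, 6.0], [7.0, 6.0], [6.0, 7.0]])
y = np.array([-1, -1, -1, 1, 1, 1])
w, b = train_linear_svm(X, y)
preds = np.sign(X @ w + b)
```

Classic kernel SVM training instead solves a quadratic program over all training pairs, which helps explain the long training times the slide reports for the 30-attribute problem.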
LVQ • Using LVQ on attributes 1 and 23, the number of errors is 8. • For attributes 14 and 18, we have 25 errors. • Training is faster than for the SVM, but so far we are only able to handle the 2-attribute problem, not the 30-attribute problem.
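The basic LVQ1 update rule behind this slide can be sketched as follows; the prototype initialization and toy data are assumptions for illustration:

```python
import numpy as np

def train_lvq1(X, y, prototypes, proto_labels, lr=0.1, epochs=30):
    """LVQ1: pull the winning prototype toward same-class samples,
    push it away from other-class samples."""
    P = prototypes.copy()
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            j = np.argmin(np.linalg.norm(P - xi, axis=1))  # winning prototype
            if proto_labels[j] == yi:
                P[j] += lr * (xi - P[j])
            else:
                P[j] -= lr * (xi - P[j])
    return P

def lvq_predict(P, proto_labels, x):
    # Classify by the label of the nearest prototype.
    return proto_labels[np.argmin(np.linalg.norm(P - x, axis=1))]

# Toy 2-attribute data: 0 = benign, 1 = malignant (hypothetical labels).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])
proto_labels = np.array([0, 1])
P = train_lvq1(X, y, np.array([[0.5, 0.5], [5.5, 5.5]]), proto_labels)
```

Because each step only moves one prototype, training cost grows mildly with dimension, consistent with LVQ being faster to train than the SVM here.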
Linear Program • The algorithm used is similar to the SVM, but simpler. • We devise a separating plane and try to minimize the error. • For 30 attributes we have only 3 errors. • For 2 attributes, the best combinations give 2 errors.
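One common LP formulation of a separating plane (not necessarily the exact one used in the slides) minimizes the total constraint-violation slack; a sketch using `scipy.optimize.linprog`, with toy data standing in for the FNA attributes:

```python
import numpy as np
from scipy.optimize import linprog

def lp_separate(X, y):
    """Find a plane w.x + b by minimizing total slack:
    minimize sum(e_i)  s.t.  y_i (w.x_i + b) >= 1 - e_i,  e_i >= 0."""
    n, d = X.shape
    c = np.concatenate([np.zeros(d + 1), np.ones(n)])  # cost only on slacks
    A = np.zeros((n, d + 1 + n))
    A[:, :d] = -y[:, None] * X                         # -y_i x_i . w
    A[:, d] = -y                                       # -y_i b
    A[np.arange(n), d + 1 + np.arange(n)] = -1.0       # -e_i
    res = linprog(c, A_ub=A, b_ub=-np.ones(n),
                  bounds=[(None, None)] * (d + 1) + [(0, None)] * n)
    assert res.success
    return res.x[:d], res.x[d]

# Toy separable 2-attribute data with -1 / +1 labels (hypothetical).
X = np.array([[1.0, 1.0], [2.0, 1.0], [6.0, 6.0], [7.0, 6.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
w, b = lp_separate(X, y)
```

Unlike the SVM's quadratic program, this is a plain linear program, which fits the slide's observation that it is similar to the SVM but simpler and much cheaper to solve.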
Conclusion • We tried various neural network classification algorithms. So far, the simpler linear programming approach gives the best result; more exploration needs to be done. • BP is not very good at dealing with non-separable data. • SVM is a good candidate, but takes a long time to train. • LVQ is comparable to the SVM. • A question remains to be answered: why does the maximum likelihood method give the same result as BP?