This study focuses on the early detection of ovarian cancer by analyzing the mass spectrum of blood samples. The research compares different classification algorithms, such as decision trees, support vector machines, and neural networks, to identify the most informative points in the mass-spectrum curves. The results show promising accuracy rates, and future work suggests exploring other datasets and combining methods with genetic algorithms.
Diagnosis of Ovarian Cancer Based on Mass Spectrum of Blood Samples • Hong Tang • Committee: Eugene Fink, Lihua Li, Dmitry B. Goldgof
Outline • Introduction • Previous work • Feature selection • Experiments
Motivation Early cancer detection is critical for successful treatment. • Five-year survival for ovarian cancer: • Early stage: 90% • Late stage: 35% • 80% of cases are diagnosed at a late stage.
Motivation • Desired features of cancer detection: • Early detection • High accuracy • Low cost
Mass spectrum [Figure: intensity, on a logarithmic scale from 10^-4 to 10^2, plotted against the ratio of molecular weight to electrical charge, from 0 to 20,000] We can detect some early-stage cancers by analyzing the blood mass spectrum.
Mass spectrum [Diagram: Blood → Mass spectrum → Data mining → Results]
Initial work • Vlahou et al. (2001): Manual diagnosis of bladder cancer based on mass spectra • Petricoin et al. (2002): Application of clustering to mass spectra for ovarian-cancer diagnosis
Later work • Decision trees: Adam et al. (2002), 96% accuracy for prostate cancer; Qu et al. (2002), 98% accuracy for prostate cancer • Clustering: Petricoin et al. (2002), 80% accuracy for prostate cancer • Neural networks: Poon et al. (2003), 91% accuracy for liver cancer
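Feature selection • Statistical difference between cancer and healthy spectra [Figure: intensity curves for cancer and healthy samples plotted against the ratio of molecular weight to electrical charge, with axis ticks at 200, 400, and 600]
Feature selection • Window size: minimal distance between selected points [Figure: intensity curves for cancer and healthy samples plotted against the ratio of molecular weight to electrical charge, with axis ticks at 200, 400, and 600]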
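The two feature-selection slides can be illustrated with a minimal sketch. The slides do not preserve the formula for the statistical difference, so the t-like score below is an assumption, as are the function names `difference_scores` and `select_points` and the greedy selection order. The window size is applied as stated on the slide: a minimal distance, in spectrum points, between any two selected points.

```python
from statistics import mean, stdev


def difference_scores(cancer, healthy):
    """Score each spectrum point by a t-like statistic.

    Assumption: the slides do not give the formula, so this uses
    |mean difference| / (sum of the two standard deviations).
    Each argument is a list of spectra; each spectrum is a list of intensities.
    """
    scores = []
    for j in range(len(cancer[0])):
        c = [spectrum[j] for spectrum in cancer]
        h = [spectrum[j] for spectrum in healthy]
        spread = stdev(c) + stdev(h)
        scores.append(abs(mean(c) - mean(h)) / spread if spread > 0 else 0.0)
    return scores


def select_points(scores, num_features, window):
    """Greedily pick the highest-scoring points that are at least `window` apart."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    chosen = []
    for j in ranked:
        if all(abs(j - k) >= window for k in chosen):
            chosen.append(j)
            if len(chosen) == num_features:
                break
    return sorted(chosen)
```

With a window of 1, the selection reduces to taking the top-scoring points; a larger window spreads the selected points across the spectrum instead of clustering them on one peak.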
Learning algorithms • Decision trees (C4.5) • Support vector machines (SVMFu) • Neural networks (Cascor 1.2)
Control variables • Number of features, 1–64 • Window size, 1–1024
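As an illustration of how the two control variables span the experiments, the sketch below enumerates candidate settings. The slide gives only the ranges (1–64 and 1–1024); stepping through them in powers of two is an assumption, and `powers_of_two` and `settings` are hypothetical names.

```python
def powers_of_two(low, high):
    """Return low, 2*low, 4*low, ... up to and including high."""
    values = []
    value = low
    while value <= high:
        values.append(value)
        value *= 2
    return values


# Assumed sampling of the ranges on the slide; the actual step is not stated.
feature_counts = powers_of_two(1, 64)    # number of features, 1-64
window_sizes = powers_of_two(1, 1024)    # window size, 1-1024

# Every (number of features, window size) pair to evaluate.
settings = [(n, w) for n in feature_counts for w in window_sizes]
```

Each pair in `settings` would be run with each learning algorithm, and the resulting accuracies compared.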
Learning curve, data set 1 [Figure: accuracy (%), from 60 to 100, plotted against training size, up to 250, for decision trees, SVM, and neural networks]
Learning curve, data set 2 [Figure: accuracy (%), from 60 to 100, plotted against training size, from 0 to 250, for decision trees, SVM, and neural networks]
Learning curve, data set 3 [Figure: accuracy (%), from 60 to 100, plotted against training size, from 0 to 250, for decision trees, SVM, and neural networks]
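A learning curve of this kind can be computed as in the sketch below. It is not the procedure used in the study: a nearest-centroid classifier stands in for C4.5, SVMFu, and Cascor, and the fixed held-out test set, the function names, and the `seed` parameter are all assumptions made for illustration.

```python
import random


def nearest_centroid_fit(samples, labels):
    """Average the feature vectors of each class (a stand-in classifier)."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, sample):
    """Return the label whose centroid is closest to the sample."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(sample, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def learning_curve(samples, labels, train_sizes, test_size, seed=0):
    """Return (training size, accuracy %) pairs measured on a held-out test set."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    test_ids, pool = order[:test_size], order[test_size:]
    curve = []
    for size in train_sizes:
        train_ids = pool[:size]
        model = nearest_centroid_fit([samples[i] for i in train_ids],
                                     [labels[i] for i in train_ids])
        hits = sum(nearest_centroid_predict(model, samples[i]) == labels[i]
                   for i in test_ids)
        curve.append((size, 100.0 * hits / len(test_ids)))
    return curve
```

Here each sample would be the list of intensities at the selected points, and each label would be cancer or healthy; plotting the returned pairs gives accuracy against training size.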
Main results • Automated detection of ovarian cancer by analyzing the mass spectrum of the blood • Identification of the most informative points of the mass-spectrum curves • Experimental comparison of decision trees, SVM, and neural networks
Future work • Experiments with other data sets • Other methods for feature selection • Combining with genetic algorithms