Diagnosis of Ovarian Cancer Based on Mass Spectrum of Blood Samples • Hong Tang • Committee: Eugene Fink, Lihua Li, Dmitry B. Goldgof
Outline • Introduction • Previous work • Feature selection • Experiments
Motivation Early cancer detection is critical for successful treatment. • Five-year survival for ovarian cancer: • Early stage: 90% • Late stage: 35% • 80% of cases are diagnosed at a late stage.
Motivation • Desired features of cancer detection: • Early detection • High accuracy • Low cost
Mass spectrum [Figure: intensity (log scale, 10^-4 to 10^2) versus ratio of molecular weight to electrical charge (0 to 20,000)] We can detect some early-stage cancers by analyzing the blood mass spectrum.
[Diagram: Blood → Mass spectrum → Data mining → Results]
Outline • Introduction • Previous work • Feature selection • Experiments
Initial work • Vlahou et al. (2001): Manual diagnosis of bladder cancer based on mass spectra • Petricoin et al. (2002): Application of clustering to mass spectra for ovarian-cancer diagnosis
Later work • Decision trees: Adam et al. (2002), 96% accuracy for prostate cancer; Qu et al. (2002), 98% accuracy for prostate cancer • Clustering: Petricoin et al. (2002), 80% accuracy for prostate cancer • Neural networks: Poon et al. (2003), 91% accuracy for liver cancer
Outline • Introduction • Previous work • Feature selection • Experiments
Feature selection • Statistical difference between cancer and healthy spectra [Figure: intensity versus ratio of molecular weight to electrical charge (ticks at 200, 400, 600); curves for Cancer and Healthy]
Feature selection • Window size: minimal distance between selected points [Figure: intensity versus ratio of molecular weight to electrical charge (ticks at 200, 400, 600); curves for Cancer and Healthy]
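The two feature-selection slides can be illustrated with a minimal sketch. The slides do not preserve the statistical-difference formula, so the score below (absolute difference of group means divided by the sum of group standard deviations) is an assumption, as are the function names `difference_scores` and `select_features` and the choice to measure distance in index positions along the spectrum. Only the role of the window size as a minimal distance between selected points comes from the slides.

```python
from statistics import mean, stdev


def difference_scores(cancer, healthy):
    """Score each spectrum point by how strongly the two groups differ.

    Assumed score: |mean_c - mean_h| / (std_c + std_h). The deck's own
    formula is not available, so this is only a stand-in.
    """
    scores = []
    # zip(*rows) walks the spectra column by column, one point at a time.
    for c_vals, h_vals in zip(zip(*cancer), zip(*healthy)):
        spread = stdev(c_vals) + stdev(h_vals)
        gap = abs(mean(c_vals) - mean(h_vals))
        scores.append(gap / spread if spread > 0 else 0.0)
    return scores


def select_features(scores, num_features, window_size):
    """Greedily pick the highest-scoring points that lie at least
    window_size positions away from every point already picked."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in ranked:
        if all(abs(i - j) >= window_size for j in chosen):
            chosen.append(i)
            if len(chosen) == num_features:
                break
    return sorted(chosen)


# Toy data (invented): three spectra per group, six points per spectrum.
cancer = [[1.0, 5.0, 5.2, 1.1, 3.0, 1.0],
          [1.2, 5.5, 5.1, 0.9, 3.2, 1.1],
          [0.9, 5.2, 5.3, 1.0, 3.1, 0.9]]
healthy = [[1.1, 1.0, 1.2, 1.0, 1.1, 1.0],
           [1.0, 1.2, 1.1, 1.1, 0.9, 1.1],
           [1.2, 1.1, 1.0, 0.9, 1.0, 0.9]]

scores = difference_scores(cancer, healthy)
selected = select_features(scores, num_features=2, window_size=2)
```

On this toy data, points 1 and 2 are both highly discriminative but adjacent, so a window size of 2 keeps point 2 and skips its neighbor in favor of point 4. A window size of 1 imposes no spacing and would keep both neighbors.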
Outline • Introduction • Previous work • Feature selection • Experiments
Learning algorithms • Decision trees (C4.5) • Support vector machines (SVMFu) • Neural networks (Cascor 1.2)
Control variables • Number of features, 1–64 • Window size, 1–1024
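A small sketch of how the two control variables might be swept together. The slides give only the ranges (1–64 features, window size 1–1024). Stepping through powers of two is an assumption made here to keep the grid small, and `powers_of_two` is a hypothetical helper.

```python
from itertools import product


def powers_of_two(low, high):
    """Return low, 2*low, 4*low, ... up to and including high (low >= 1)."""
    values = []
    value = low
    while value <= high:
        values.append(value)
        value *= 2
    return values


# Assumed sampling of the ranges stated on the slide.
feature_counts = powers_of_two(1, 64)    # 1, 2, 4, ..., 64
window_sizes = powers_of_two(1, 1024)    # 1, 2, 4, ..., 1024

# Every (number of features, window size) pair to evaluate.
grid = list(product(feature_counts, window_sizes))
```

Each pair in `grid` would then be passed to feature selection and to one of the learning algorithms, with the resulting accuracy recorded per pair.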
Learning curve, data set 1 [Figure: accuracy (%), 60 to 100, versus training size, 50 to 250; curves for decision trees, SVM, and neural networks]
Learning curve, data set 2 [Figure: accuracy (%), 60 to 100, versus training size, 0 to 250; curves for decision trees, SVM, and neural networks]
Learning curve, data set 3 [Figure: accuracy (%), 60 to 100, versus training size, 0 to 250; curves for decision trees, SVM, and neural networks]
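The learning-curve slides plot accuracy against training size. A minimal sketch of how such a curve can be computed follows. It uses a nearest-centroid classifier as a simple stand-in, not the C4.5, SVMFu, or Cascor learners used in the experiments, and the data are synthetic. All function names and the train/test split are assumptions for illustration.

```python
import random


def nearest_centroid_fit(samples, labels):
    """Return the mean vector of each class (stand-in classifier only)."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, sample):
    """Predict the class whose centroid is closest to the sample."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(sample, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def learning_curve(train, test, sizes):
    """Accuracy (%) on a fixed test set for each training-set size."""
    curve = []
    for size in sizes:
        subset = train[:size]
        model = nearest_centroid_fit([x for x, _ in subset],
                                     [y for _, y in subset])
        hits = sum(nearest_centroid_predict(model, x) == y for x, y in test)
        curve.append((size, 100.0 * hits / len(test)))
    return curve


# Synthetic two-feature data standing in for selected spectrum points.
rng = random.Random(0)


def make_sample(label):
    center = 2.0 if label == "cancer" else 0.0
    return ([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label)


# Alternate labels so every training prefix contains both classes.
data = [make_sample("cancer" if i % 2 else "healthy") for i in range(300)]
train, test = data[:250], data[250:]
curve = learning_curve(train, test, sizes=[50, 100, 150, 200, 250])
```

Each entry of `curve` is a `(training size, accuracy)` pair, which corresponds to one point on a plot like the ones in the slides. Repeating the loop with a different learner would give one curve per algorithm.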
Main results • Automated detection of ovarian cancer by analyzing the mass spectrum of the blood • Identification of the most informative points of the mass-spectrum curves • Experimental comparison of decision trees, SVM, and neural networks
Future work • Experiments with other data sets • Other methods for feature selection • Combination with genetic algorithms