DNA Microarray Data: We feed data on multiple cancer types into the Oracle DB: 16,063 genes, 144 cancer patients.
• Multiple examples of tumor tissue (public data from Whitehead/MIT)
Oracle Data Mining: We mine the data using Support Vector Machines and create the confusion matrix.
SVM Classification of Multiple Tumor Types: 78.25% accuracy. Green=Correct, Red=Errors.
SVM Classification of Multiple Tumor Types: Oracle Data Mining's SVM models predict the tumor type in the multi-class problem with 78.25% accuracy. In the confusion matrix, Green=Correct and Red=Errors.
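As a rough illustration of how an accuracy figure is read off a confusion matrix, the sketch below divides the diagonal (correct predictions, the green cells) by the total count. The 3x3 matrix is hypothetical and is not the tumor-type matrix from the slide. The function name `accuracy_from_confusion` is likewise an assumption made for this example.

```python
def accuracy_from_confusion(confusion):
    """Return the fraction of correct predictions in a square confusion matrix.

    Rows are actual classes, columns are predicted classes, so the
    diagonal holds the correctly classified samples.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical 3-class matrix for illustration only (not the slide's data).
example_matrix = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 3, 7],
]

example_accuracy = accuracy_from_confusion(example_matrix)
print(f"accuracy = {example_accuracy:.2%}")
```

The off-diagonal cells (the red cells on the slide) show which tumor types are confused with each other, which a single accuracy number hides.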
Identify Biomarkers for DLBC Lymphoma Treatment Outcome: Attribute Importance identifies genes correlated with lymphoma.
Find a Cure for Lymphoma
• Literature search on Lymphoma
• Set up a project workspace
• Set up a meeting
• Check lab protocols
• Store cell histology images
• Analyze gene expression results
• Study the markers
• Find a lead
Study the Markers
• Statistical analysis
• Protein sequence analysis (SwissProt)
• BLAST search
• Protein secondary structure study
• Search of genes and genetic disorders (OMIM)
• Pathway modeling
Statistical Analysis: Create an external table to read data from lymphoma.txt.
Statistical Analysis: Calculate the mean and standard deviation. The t-test shows that the PKC expression levels in cured and fatal patients are significantly different.
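The mean, standard deviation, and t-test step can be sketched outside the database as follows. The slide does not say which t-test variant was used, so Welch's unequal-variance form is an assumption here. The expression values are hypothetical and are not taken from lymphoma.txt, and the names `welch_t`, `cured`, and `fatal` are illustrative.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    se_a, se_b = var_a / n_a, var_b / n_b
    t_stat = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical expression levels for one gene (NOT the real PKC data).
cured = [1.0, 2.0, 3.0, 4.0, 5.0]
fatal = [4.0, 5.0, 6.0, 7.0, 8.0]

print("cured: mean", statistics.mean(cured), "sd", statistics.stdev(cured))
print("fatal: mean", statistics.mean(fatal), "sd", statistics.stdev(fatal))

t_stat, dof = welch_t(cured, fatal)
print("t =", t_stat, "df =", dof)
```

A large absolute t value relative to the t distribution with the computed degrees of freedom is what supports calling the two groups significantly different. Turning it into a p-value needs a t-distribution table or a statistics library.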
Protein Sequence Analysis: Load SwissProt into Oracle XML DB to learn more about expressed genes of interest.
Load SwissProt into XML DB: Transfer the SwissProt data and schema into Oracle XML DB over FTP.
Load SwissProt into XML DB: Access the XML schema using XML Spy (an XML editor), which connects to the database using WebDAV.
Register the XML Schema: Once the schema is registered, XML DB automatically generates the tables.