Goals
• Goal #1: Are there differentially expressed (DE) genes unique to follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL)?
• Goal #2: Can we use those genes to distinguish between these two subtypes?
• Goal #3: Is the classification generalizable across different datasets?
• Goal #4: Do epigenetic modifications between these two subtypes differ?
• Goal #5: Do these subtypes play different biological roles?
Non-Hodgkin's B-cell Lymphomas
• Lymphoma is a cancer of the lymphatic cells of the immune system
• Follicular lymphoma (FL) can acquire properties similar to diffuse large B-cell lymphomas (DLBCL) derived from normal B-cell germinal centers
• FL is an indolent (slow-growing) cancer caused by the t(14;18) translocation, which results in over-expression of the anti-apoptotic protein BCL2
Image sources: FL: http://en.wikipedia.org/wiki/Follicular_lymphoma; DLBCL: http://en.wikipedia.org/wiki/Diffuse_large_B-cell_lymphoma
First 500 probes with the most missing values in Ruiz-Vela et al. 2008
(Figure axis label: Samples)
Top 200 genes are able to cluster DLBCL vs FL in Shipp et al. 2002
• Well-known genes found: BCL2A1, EZH2, MIF, LDHA, CLU, CTSD, TRAP1
• No clustering when the bottom genes are used in Shipp et al. 2002
(Figure axis label: Samples)
SVM training performance
• sensitivity: 0.929
• specificity: 1
• threshold probability = 0.5
(Confusion matrix: predicted vs. true)
Top 200 genes classification performance
• sensitivity: 0.8
• specificity: 1
• threshold probability = 0.5
• 591/1000 have at most 1 misclassification
(Confusion matrix: predicted vs. true)
Differentially expressed genes from another dataset show a slight decrease in performance
• sensitivity: 0.6
• specificity: 1
• threshold probability = 0.5
• 924/1000 have at most 2 misclassifications
(Confusion matrix: predicted vs. true)
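The sensitivity and specificity figures on the performance slides above come from a confusion matrix at a probability threshold of 0.5. The following is a minimal sketch of that calculation, not the authors' code. It assumes FL is the positive class, which is an inference from the numbers (0.929 is consistent with 13 of 14 FL training samples being called correctly) and is not stated on the slides. The probabilities are made up for illustration.

```python
def confusion_counts(probs, labels, threshold=0.5, positive="FL"):
    """Count TP, FN, TN, FP given predicted probabilities of the positive class.

    ASSUMPTION: FL is the positive class; the slides do not say which is.
    """
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_positive = p >= threshold
        if y == positive:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative (made-up) probabilities: 14 FL samples, one below threshold,
# and 44 DLBCL samples, all below threshold.
probs = [0.9] * 13 + [0.3] + [0.1] * 44
labels = ["FL"] * 14 + ["DLBCL"] * 44

sens, spec = sensitivity_specificity(*confusion_counts(probs, labels))
print(round(sens, 3), spec)  # 0.929 1.0
```

Under the same assumption, 0.8 and 0.6 would correspond to 4 of 5 and 3 of 5 FL test samples, respectively.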
Supervised Machine Learning
• Split training (14 FL, 44 DLBCL) and test (5 FL, 14 DLBCL) sets
• Feature selection using limma
• 14 genes overlap between Shipp et al. 2002 and Ruiz-Vela et al. 2008
• Select top 200 genes
• Optimize SVM parameters
  • radial kernel
  • cost 50, gamma 0.00125
• Classify training set using 10-fold CV
• Classify entire data
• Classify entire data with Ruiz-Vela et al. 2008 DE genes
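The feature-selection step above ranks genes by differential expression and keeps the top 200. The slides use limma, which computes moderated t-statistics. The sketch below swaps in a plain Welch t-statistic as a simpler stand-in to show the ranking idea only. The gene names and expression values are hypothetical, and `top_genes` is an illustrative helper, not part of limma.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two lists of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def top_genes(expr, labels, k, group_a="FL", group_b="DLBCL"):
    """Return the k genes with the largest absolute t-statistic.

    expr maps gene name -> list of expression values aligned with labels.
    NOTE: a stand-in for limma's moderated t-statistic, not equivalent to it.
    """
    scores = {}
    for gene, values in expr.items():
        a = [v for v, y in zip(values, labels) if y == group_a]
        b = [v for v, y in zip(values, labels) if y == group_b]
        scores[gene] = abs(welch_t(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: three genes measured in 3 FL and 3 DLBCL samples.
labels = ["FL", "FL", "FL", "DLBCL", "DLBCL", "DLBCL"]
expr = {
    "geneA": [8.1, 7.9, 8.3, 5.0, 5.2, 4.9],  # large group difference
    "geneB": [6.0, 6.1, 5.9, 6.0, 6.2, 5.8],  # essentially no difference
    "geneC": [4.0, 4.5, 3.8, 5.0, 5.6, 5.1],  # moderate difference
}

selected = top_genes(expr, labels, k=2)
print(selected)  # ['geneA', 'geneC']
```

In the workflow on the slide, k would be 200 and the selected genes would then be passed to the SVM (radial kernel, cost 50, gamma 0.00125).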
Future Work
• Apply better methods for matching probes to genes across different data sets
• Apply SVM regression for DLBCL prognosis
• Extend DLBCL vs FL classification to other lymphomas and non-lymphoma tumor types
References
• Shipp MA et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nature Medicine 2002;8(1):68-74.
• Ruiz-Vela A, Aggarwal M, de la Cueva P, Treda C et al. Lentiviral (HIV)-based RNA interference screen in human B-cell receptor regulatory networks reveals MCL1-induced oncogenic pathways. Blood 2008 Feb 1;111(3):1665-76. PMID: 18032706
• Rodríguez-Paredes M, Esteller M. Cancer epigenetics reaches mainstream oncology. Nature Medicine 2011 Mar;17(3):330-339.
• Shaknovich R, Geng H, Johnson NA, Tsikitas L et al. DNA methylation signatures define molecular subtypes of diffuse large B-cell lymphoma. Blood 2010 Nov 18;116(20):e81-9.
• Gemma http://www.chibi.ubc.ca/Gemma
• CMA http://www.bioconductor.org/packages/2.3/bioc/html/CMA.html
• ermineJ http://www.bioinformatics.ubc.ca/ermineJ/
Acknowledgement
• CIHR Bioinformatics Graduate Training Program