Integration II: Prediction
Kernel-based data integration • SVMs and the kernel “trick” • Multiple-kernel learning • Applications • Protein function prediction • Clinical prognosis
SVMs These are expression measurements from two genes for two populations (cancer types). The goal is to define a cancer-type classifier. One type of classifier is a “hyper-plane” that separates measurements from the two cancer types: a one-dimensional hyper-plane (a line) when samples are described by two genes, or a two-dimensional hyper-plane (a plane) when a third gene is added. [Noble, Nat. Biotechnology, 2006]
SVMs Suppose that the measurements are separable: there exists a hyperplane that separates the two types. Then there are infinitely many separating hyperplanes. Which one to use? The maximum-margin hyperplane; equivalently, the minimizer of ||w||^2 subject to y_i (w · x_i + b) ≥ 1 for every sample i. [Noble, Nat. Biotechnology, 2006]
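A minimal sketch of the maximum-margin idea, using scikit-learn on synthetic two-gene data (the values are illustrative, not from the paper). A very large C approximates the hard-margin SVM, and the geometric margin width is 2 / ||w||:

```python
# Sketch: fitting a (near) maximum-margin linear SVM on synthetic
# two-gene expression data for two separable "cancer type" clusters.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),    # type A cluster
               rng.normal(2.0, 0.3, (20, 2))])   # type B cluster
y = np.array([0] * 20 + [1] * 20)

# A very large C approximates the hard-margin (maximum-margin) SVM
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]
margin = 2.0 / np.linalg.norm(w)   # geometric margin width = 2 / ||w||
print(margin)
```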
SVMs Which hyper-plane to use? In reality the data are rarely separable, so we minimize a trade-off between (1) classification error and (2) margin size: min over (w, b, ξ) of ½||w||^2 + C Σ_i ξ_i, subject to y_i (w · x_i + b) ≥ 1 − ξ_i and ξ_i ≥ 0. The first term is the margin penalty; the second is the loss from samples that are misclassified or fall inside the margin.
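The role of C in this trade-off can be sketched on synthetic, overlapping data: a small C tolerates classification errors to keep ||w|| small (a wide margin), while a large C penalizes errors and shrinks the margin.

```python
# Sketch: the soft-margin trade-off on overlapping synthetic classes.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
               rng.normal(1.5, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)          # overlapping: not separable

norms = {C: np.linalg.norm(SVC(kernel="linear", C=C).fit(X, y).coef_)
         for C in (0.01, 100.0)}
print(norms)   # smaller C yields smaller ||w||, i.e. a wider margin
```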
SVMs This is the primal problem: min over (w, b, ξ) of ½||w||^2 + C Σ_i ξ_i, subject to y_i (w · x_i + b) ≥ 1 − ξ_i and ξ_i ≥ 0. This is the dual problem: max over α of Σ_i α_i − ½ Σ_{i,j} α_i α_j y_i y_j K_ij, subject to 0 ≤ α_i ≤ C and Σ_i α_i y_i = 0, where K_ij = x_i · x_j.
SVMs What is K? The kernel matrix: each entry K_ij is the inner product between samples i and j. One interpretation: sample similarity. The measurements enter the dual problem only through K.
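Since the dual sees the data only through K, the linear-kernel case is just the Gram matrix of sample inner products. A minimal sketch with made-up values:

```python
# Sketch: the linear kernel matrix K[i, j] = <x_i, x_j> is the Gram
# matrix of the samples; a valid kernel is symmetric and PSD.
import numpy as np

X = np.array([[1.0, 2.0],
              [0.5, -1.0],
              [3.0, 0.0]])      # 3 samples, 2 genes (illustrative)
K = X @ X.T                     # 3 x 3 kernel matrix
print(K)
```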
SVMs Implication: non-linearity is obtained by appropriately defining the kernel matrix K. E.g. the quadratic kernel: K(x, y) = (x · y + 1)^2.
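A sketch of this implication: with a quadratic kernel matrix handed to the SVM as a precomputed kernel, a quadratic decision boundary is learned without ever forming the quadratic feature map explicitly. Data here is synthetic with a circular class boundary:

```python
# Sketch: non-linearity through the kernel matrix alone.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, (60, 2))
y = (np.sum(X**2, axis=1) > 1.0).astype(int)  # circular boundary

K = (X @ X.T + 1.0) ** 2                      # quadratic kernel matrix
clf = SVC(kernel="precomputed").fit(K, y)     # SVM never sees X directly
print(clf.score(K, y))                        # training accuracy
```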
SVMs Another implication: no need for measurement vectors at all; all that is required is a similarity between samples. E.g. string kernels for sequences.
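One standard string kernel is the k-spectrum kernel, which scores two sequences by their shared k-mers; no feature vectors are ever materialized. A minimal sketch (the function name and toy sequences are illustrative):

```python
# Sketch: a k-spectrum string kernel — the inner product of the
# k-mer count vectors of two strings, computed from counts directly.
from collections import Counter

def spectrum_kernel(s, t, k=3):
    """Similarity between strings s and t via shared k-mer counts."""
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(cs[m] * ct[m] for m in cs)

print(spectrum_kernel("MKVLAA", "MKVLSA", k=3))  # shares MKV and KVL -> 2
```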
Protein Structure Prediction [Figure: from protein sequence, via sequence similarity, to protein structure]
Kernel-based data fusion Core idea: use different kernels for different genomic data sources. A linear combination of kernel matrices is itself a kernel (provided the combination weights are non-negative).
Kernel-based data fusion The kernel to use in prediction is the weighted combination K = Σ_i μ_i K_i, with μ_i ≥ 0.
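A small sketch of why this combination is valid: non-negative weights preserve positive semi-definiteness. The two "data source" kernels below are synthetic:

```python
# Sketch: a non-negative combination of PSD kernel matrices is PSD.
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(10, 5))       # e.g. an "expression" view
B = rng.normal(size=(10, 8))       # e.g. a "sequence" view
K1, K2 = A @ A.T, B @ B.T          # two PSD kernels, same samples

mu = np.array([0.7, 0.3])          # non-negative mixing weights
K = mu[0] * K1 + mu[1] * K2        # combined kernel for prediction
print(np.linalg.eigvalsh(K).min()) # no (numerically) negative eigenvalues
```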
Kernel-based data fusion In general, the task is to estimate the SVM function jointly with the coefficients μ_i of the kernel combination. This is a well-studied type of optimization problem (a semi-definite program).
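The source solves this jointly as a semi-definite program; as a rough stand-in (not the paper's method), one can grid-search a single mixing weight, scoring each candidate combined kernel by cross-validation. All data and variable names here are synthetic and illustrative:

```python
# Sketch only: grid search over one kernel mixing weight, as a crude
# substitute for the joint SDP described in the text.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X1 = rng.normal(size=(40, 10))     # synthetic "expression" view
X2 = rng.normal(size=(40, 6))      # synthetic "copy number" view
y = (X1[:, 0] + 0.1 * rng.normal(size=40) > 0).astype(int)  # signal in view 1

K1, K2 = X1 @ X1.T, X2 @ X2.T
best = max(
    (cross_val_score(SVC(kernel="precomputed"),
                     m * K1 + (1 - m) * K2, y, cv=5).mean(), m)
    for m in np.linspace(0.0, 1.0, 11)
)
print(best)   # (best CV accuracy, weight placed on the informative kernel)
```

scikit-learn's cross-validation slices a precomputed square kernel matrix on both axes, so the combined kernel can be passed where X normally goes.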
Kernel-based data fusion Same idea applied to cancer classification from expression and proteomic data
Kernel-based data fusion • Prostate cancer dataset • 55 samples • Expression from microarray • Copy number variants • Outcomes predicted: • Grade, stage, metastasis, recurrence