Estimating variable structure and dependence in multi-task learning via gradients
By: Justin Guinney, Qiang Wu and Sayan Mukherjee
Presented by: John Paisley
Outline
• General problem
• Review of the single-task solution
• Extension to multi-task
• Experiments
General problem
• We have a small number of high-dimensional data points, x, each with a corresponding response variable, y (fully supervised).
• We want to simultaneously build a classification or regression function and learn which features are important, as well as how features are correlated (to know whether two features are important in the same way).
• Xuejun presented the authors' single-task solution; this paper extends it to the multi-task setting.
Single-Task Solution (classification)
• By a first-order Taylor expansion, estimate the classification function as
  $f(x) \approx f(x_0) + \nabla f(x_0) \cdot (x - x_0)$
• Seek to minimize the expected error
  $\mathcal{E}(f, \vec{f}) = \frac{1}{n^2} \sum_{i,j=1}^{n} w_{i,j}\, \phi\big( y_j [\, f(x_i) + \vec{f}(x_i) \cdot (x_j - x_i) \,] \big)$
  where $w_{i,j} = \exp(-\|x_i - x_j\|^2 / 2s^2)$ is a weight function that localizes the expansion, $\vec{f}$ models the gradient $\nabla f$, and $\phi$ is a convex loss function (e.g., the hinge loss).
• To solve this, regularize in an RKHS:
  $(\hat{f}, \hat{\vec{f}}) = \arg\min_{f, \vec{f}} \; \mathcal{E}(f, \vec{f}) + \lambda_1 \|f\|_K^2 + \lambda_2 \|\vec{f}\|_K^2$
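To make the objective concrete, here is a minimal NumPy sketch (not the authors' code) that evaluates the weighted empirical error above for candidate function values and gradients at the sample points; the hinge loss and all function names are illustrative choices.

```python
import numpy as np

def hinge(t):
    # One common convex loss phi; any convex surrogate would do.
    return np.maximum(0.0, 1.0 - t)

def classification_error(X, y, f_vals, grads, s):
    """Weighted Taylor-expansion error for the classification objective.

    X      : (n, p) sample points
    y      : (n,) labels in {-1, +1}
    f_vals : (n,) candidate values f(x_i)
    grads  : (n, p) candidate gradient estimates f_vec(x_i)
    s      : bandwidth of the Gaussian weights w_ij
    """
    n = X.shape[0]
    diffs = X[None, :, :] - X[:, None, :]                   # diffs[i, j] = x_j - x_i
    W = np.exp(-np.sum(diffs ** 2, axis=-1) / (2 * s**2))   # w_ij localizes the expansion
    # First-order expansion of f around x_i, evaluated at x_j.
    taylor = f_vals[:, None] + np.einsum('ip,ijp->ij', grads, diffs)
    return np.sum(W * hinge(y[None, :] * taylor)) / n**2
```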
Single-Task (regression)
• In regression, the response variable $y_i$ takes the place of $f(x_i)$ for each input, so only the gradient needs to be learned:
  $\hat{\vec{f}} = \arg\min_{\vec{f}} \; \frac{1}{n^2} \sum_{i,j=1}^{n} w_{i,j} \big( y_i - y_j + \vec{f}(x_i) \cdot (x_j - x_i) \big)^2 + \lambda \|\vec{f}\|_K^2$
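Under the same conventions as the sketch above, the regression version drops f entirely and fits only the gradient; a minimal adaptation:

```python
import numpy as np

def regression_error(X, y, grads, s):
    """Gradient-only weighted error for the regression objective."""
    n = X.shape[0]
    diffs = X[None, :, :] - X[:, None, :]                   # diffs[i, j] = x_j - x_i
    W = np.exp(-np.sum(diffs ** 2, axis=-1) / (2 * s**2))
    # Residual of the expansion y_j ~ y_i + f_vec(x_i) . (x_j - x_i).
    resid = y[:, None] - y[None, :] + np.einsum('ip,ijp->ij', grads, diffs)
    return np.sum(W * resid ** 2) / n**2
```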
Single-Task (solution and value of interest)
• By the representer theorem, the minimizer has the form
  $\hat{\vec{f}}(x) = \sum_{i=1}^{n} c_i K(x, x_i)$, with coefficient vectors $c_i \in \mathbb{R}^p$.
• The gradient outer product (GOP) is the matrix that carries all of the feature information,
  $\Gamma = \mathbb{E}\big[ \nabla f \, \nabla f^T \big]$, approximated as $\hat{\Gamma} = \frac{1}{n} \sum_{j=1}^{n} \hat{\vec{f}}(x_j)\, \hat{\vec{f}}(x_j)^T$
• This paper, Xuejun's paper, and the Matlab code each write this approximation in a slightly different form.
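A minimal sketch of the GOP estimate in the representer-theorem form, assuming a Gaussian kernel and an (n, p) coefficient matrix C whose rows are the c_i (this is the textbook form, not necessarily the exact variant any of the three sources uses):

```python
import numpy as np

def gaussian_kernel(X, s):
    # K[i, j] = exp(-||x_i - x_j||^2 / (2 s^2))
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2 * s**2))

def gop_estimate(C, K):
    """GOP estimate (1/n) sum_j grad(x_j) grad(x_j)^T.

    C : (n, p) representer coefficients, row i is c_i
    K : (n, n) kernel matrix on the sample points
    """
    grads = K @ C                          # row j: sum_i K(x_j, x_i) c_i = f_vec(x_j)
    return grads.T @ grads / K.shape[0]    # (p, p) gradient outer product
```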
GOP
• This matrix is central to the paper because it contains all of the information about the importance of each feature: the diagonal can be used to rank each feature's importance, and the off-diagonal entries tell how features are correlated (so if two features are important in the same way, only one need be selected).
• My confusion:
• I take this to mean that the kernel matrix should satisfy an identity that makes the competing expressions on the previous page equal.
• However, constructing a discrete Gaussian kernel in Matlab, this isn't true (and it makes no sense to me why it should be).
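As a toy illustration of how the GOP is read off (reusing the helpers from the sketch above, with random stand-in coefficients rather than learned ones): rank features by the diagonal and inspect the normalized off-diagonal entries for redundancy.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))    # n = 30 points in p = 5 dimensions
C = rng.standard_normal((30, 5))    # stand-in coefficients (learned in practice)
K = gaussian_kernel(X, s=1.0)

gop = gop_estimate(C, K)
ranking = np.argsort(np.diag(gop))[::-1]   # most important features first
d = np.sqrt(np.diag(gop))
covariation = gop / np.outer(d, d)         # normalized feature covariation
print(ranking)
print(np.round(covariation, 2))
```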
Extension to multi-task
• A very natural extension: each task's function is assumed to decompose into a shared base function plus a task-specific correction,
  $\vec{f}_t = \vec{f}_0 + \vec{g}_t, \quad t = 1, \ldots, T$
• Classification: sum the single-task error over tasks and regularize both parts in the RKHS,
  $\sum_{t=1}^{T} \mathcal{E}_t(f_t, \vec{f}_0 + \vec{g}_t) + \lambda_1 \|\vec{f}_0\|_K^2 + \lambda_2 \sum_{t=1}^{T} \|\vec{g}_t\|_K^2$
• Regression: the same decomposition with the squared-error objective and the analogous RKHS penalties.
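A sketch of how the base-plus-correction decomposition composes at the sample points, assuming for simplicity that all tasks share the same inputs and kernel matrix (coefficient matrices C0 and Ct mirror the single-task representer form; all names are illustrative):

```python
import numpy as np

def multitask_gradients(C0, C_tasks, K):
    """Per-task gradient estimates under f_t = f_0 + g_t.

    C0      : (n, p) coefficients of the shared base gradient
    C_tasks : list of (n, p) coefficient matrices, one per task correction
    K       : (n, n) kernel matrix on the shared sample points
    """
    base = K @ C0
    return [base + K @ Ct for Ct in C_tasks]   # task t: base + correction

def multitask_penalty(C0, C_tasks, K, lam1, lam2):
    # RKHS norms in coefficient form: ||f||_K^2 = trace(C^T K C).
    pen = lam1 * np.trace(C0.T @ K @ C0)
    pen += lam2 * sum(np.trace(Ct.T @ K @ Ct) for Ct in C_tasks)
    return pen
```

The relative sizes of lam1 and lam2 control how strongly the individual tasks are shrunk toward the shared base function.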