
A Bayesian Matrix Factorization Model for Relational Data UAI 2010



Presentation Transcript


  1. Relational Learning via Collective Matrix Factorization SIGKDD 2008 A Bayesian Matrix Factorization Model for Relational Data UAI 2010 Authors: Ajit P. Singh & Geoffrey J. Gordon Presenter: Xian Xing Zhang

  2. Basic ideas • Collective matrix factorization (CMF) is proposed for relational learning when an entity participates in multiple relations. • Several matrices (with different types of support) are factored simultaneously with shared parameters. • CMF is extended to a hierarchical Bayesian model to enhance the sharing of statistical strength.
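The idea of factoring several matrices simultaneously with a shared factor can be sketched in a few lines of numpy. In this toy version (all shapes, step sizes, and data are invented for illustration), the binary matrix X is fit as σ(UVᵀ) under a Bernoulli log-loss and the real-valued Y as VWᵀ under squared loss; the shared factor V accumulates gradients from both relations:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Clip for numerical stability before exponentiating.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# Toy shapes: X is m x n (word x stimulus, binary), Y is n x r (stimulus x voxel, real).
m, n, r, k = 20, 15, 10, 4
U_true = rng.normal(size=(m, k))
V_true = rng.normal(size=(n, k))
W_true = rng.normal(size=(r, k))
X = (U_true @ V_true.T > 0).astype(float)
Y = V_true @ W_true.T + 0.1 * rng.normal(size=(n, r))

# Factors to learn; V is shared between the two factorizations.
U = rng.normal(scale=0.3, size=(m, k))
V = rng.normal(scale=0.3, size=(n, k))
W = rng.normal(scale=0.3, size=(r, k))
lr, lam = 0.02, 0.01
for _ in range(2000):
    Ex = sigmoid(U @ V.T) - X          # Bernoulli log-loss residual (m x n)
    Ey = V @ W.T - Y                   # Gaussian squared-loss residual (n x r)
    gU = Ex @ V + lam * U
    gW = Ey.T @ V + lam * W
    gV = Ex.T @ U + Ey @ W + lam * V   # V receives gradients from BOTH matrices
    U -= lr * gU
    V -= lr * gV
    W -= lr * gW
```

After training, predictions for Y come from V @ W.T and for X from thresholding sigmoid(U @ V.T); the coupling through V is what lets side information in X improve predictions in Y.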

  3. An example of application • Functional Magnetic Resonance Imaging (fMRI): • fMRI data can be viewed as a real-valued relation, Response(stimulus, voxel) ∈ [0, 1] • Stimulus side-information is a binary relation, Co-occurs(word, stimulus) ∈ {0, 1}, collected as statistics of whether the stimulus word co-occurs with other commonly used words in a large corpus • The goal is to predict unobserved values of the Response relation

  4. Basic model description • In the fMRI example, the Co-occurs relation is an m×n matrix X; the Response relation is an n×r matrix Y. • Each matrix has its own likelihood: Co-occurs (p_X) is modeled by a Bernoulli distribution; Response (p_Y) by a Gaussian.
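The likelihood equations on this slide were images and did not survive transcription. A plausible reconstruction, in notation assumed here rather than quoted from the paper (U for word factors, V for shared stimulus factors, W for voxel factors, σ the logistic link), is:

```latex
p(X \mid U, V) = \prod_{i,j} \mathrm{Bernoulli}\!\left( X_{ij} \;\middle|\; \sigma\big((UV^{\top})_{ij}\big) \right),
\qquad
p(Y \mid V, W) = \prod_{j,l} \mathcal{N}\!\left( Y_{jl} \;\middle|\; (VW^{\top})_{jl},\; \sigma_Y^{2} \right).
```

The shared factor V appears in both likelihoods, which is what couples the two matrices.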

  5. Hierarchical Collective Matrix Factorization • Information between entities can only be shared indirectly, through another factor: e.g., in f(UV’), two distinct rows of U are correlated only through V. • The hierarchical prior acts as a shrinkage estimator for the rows of a factor, pooling information indirectly through Θ.
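A generic form of such a hierarchical shrinkage prior (a sketch of the idea, not the paper's exact specification) places a common, itself-random prior over the rows of each factor:

```latex
U_{i\cdot} \sim \mathcal{N}(\theta_U, \Sigma_U), \qquad
\Theta_U = (\theta_U, \Sigma_U) \sim p(\Theta_U),
```

so that every row of U informs the posterior over Θ_U, which in turn shrinks all other rows toward the pooled estimate.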

  6. Bayesian Inference • Hessian Metropolis-Hastings: • In random-walk Metropolis-Hastings, samples are drawn from a Gaussian proposal with mean equal to the sample at time t, F_i(t), and a fixed covariance matrix, which is problematic. • HMH uses both the gradient and the Hessian to automatically construct a proposal distribution at each sampling step. This is claimed as the main technical contribution of the UAI 2010 paper.
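A minimal one-dimensional sketch of the gradient-and-Hessian proposal idea (the target, names, and step counts are illustrative, not the paper's setup): the proposal is a Gaussian centered at a Newton step from the current point, with variance given by the inverse negative Hessian, and the asymmetric proposal requires the full Metropolis-Hastings correction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative 1-D target: Gaussian with mean 2 and variance 0.5.
mu, var = 2.0, 0.5
def log_p(x):  return -(x - mu) ** 2 / (2 * var)
def grad(x):   return -(x - mu) / var
def hess(x):   return -1.0 / var          # constant for a Gaussian target

def proposal_params(x):
    """Mean and variance of the Hessian-based Gaussian proposal at x."""
    h = hess(x)
    m = x - grad(x) / h                   # Newton step toward the local mode
    s2 = -1.0 / h                         # inverse negative Hessian as variance
    return m, s2

def log_q(xp, x):
    """Log-density of the proposal q(xp | x)."""
    m, s2 = proposal_params(x)
    return -(xp - m) ** 2 / (2 * s2) - 0.5 * np.log(2 * np.pi * s2)

x, samples = 0.0, []
for _ in range(5000):
    m, s2 = proposal_params(x)
    xp = m + np.sqrt(s2) * rng.normal()
    # Asymmetric proposal -> Metropolis-Hastings acceptance ratio.
    log_alpha = log_p(xp) + log_q(x, xp) - log_p(x) - log_q(xp, x)
    if np.log(rng.random()) < log_alpha:
        x = xp
    samples.append(x)
samples = np.asarray(samples)
```

For this Gaussian target the local Laplace approximation is exact, so every proposal is accepted; for a non-Gaussian posterior the same construction adapts the proposal's location and scale at every step, which is the point of HMH.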

  7. Related work

  8. Experiment setting • The Co-occurs(word, stimulus) relation is collected by measuring whether or not the stimulus word occurs within five tokens of a word in the Google Tera-word corpus. • Hold-out prediction: • Fold-in prediction (to predict a new row in Y)
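Fold-in prediction for a new row of Y can be sketched as a ridge regression of the row's observed entries onto the already-learned factors (a simplification of the paper's procedure; all names, shapes, and constants here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

# Learned voxel factors W (r x k) are assumed fixed from training.
r, k = 30, 5
W = rng.normal(size=(r, k))

# A new stimulus row y, generated from an unknown factor v_true plus noise.
v_true = rng.normal(size=k)
y = W @ v_true + 0.05 * rng.normal(size=r)

obs = rng.random(r) < 0.6                 # mask of observed voxels
lam = 1e-3                                # small ridge term for stability

# Solve v_hat = argmin ||y_obs - W_obs v||^2 + lam ||v||^2 on observed entries.
A = W[obs].T @ W[obs] + lam * np.eye(k)
v_hat = np.linalg.solve(A, W[obs].T @ y[obs])

y_pred = W[~obs] @ v_hat                  # predictions for the held-out voxels
```

The new row's factor is estimated without retraining the whole model, which is what distinguishes fold-in from hold-out prediction.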

  9. Experiment results

  10. Discussions • Existing methods force one to choose between ignoring parameter uncertainty and making Gaussianity assumptions. • Modeling non-Gaussian response types significantly improves predictive accuracy, even though non-Gaussianity complicates the construction of proposal distributions for Metropolis-Hastings.
