Bayesian Sets Zoubin Ghahramani and Katherine A. Heller NIPS 2005 Presented by Qi An Mar. 17th, 2006
Outline • Introduction • Bayesian Sets • Implementation • Binary data • Exponential families • Experimental results • Conclusions
Introduction • Inspired by "Google™ Sets" • What do Jesus and Darwin have in common? • Two different views on the origin of man • There are colleges at Cambridge University named after them • The objective is to retrieve items from a concept or cluster, given a query consisting of a few items from that cluster
Introduction • Consider a universe of items D, which can be a set of web pages, movies, people or any other objects, depending on the application • Make a query consisting of a small subset of items Dc ⊂ D, which are assumed to be examples of some cluster in the data • The algorithm provides a completion to the query set Dc: it presumably includes all the elements of Dc together with the other elements of D that are also in this cluster
Introduction • View the problem from two perspectives: • Clustering on demand • Unlike completely unsupervised clustering algorithms, here the query provides supervised hints or constraints as to the membership of a particular cluster • Information retrieval • Retrieve the items that are relevant to the query and rank the output by relevance to the query
Bayesian Sets • Very simple algorithm • Given D and Dc, we aim to rank the elements of D by how well they would "fit into" a set which includes Dc • Define a score for each x ∈ D: score(x) = p(x | Dc) / p(x) • From Bayes rule, the score can be re-written as: score(x) = p(x, Dc) / (p(x) p(Dc))
Bayesian Sets • Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ′.
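To make the ratio concrete, here is a minimal one-feature Beta-Bernoulli sketch (our own illustration, not from the slides) that evaluates score(x) = p(x, Dc) / (p(x) p(Dc)) with exact marginal likelihoods:

```python
from math import lgamma, exp

# One binary feature with a Beta(1, 1) prior on the Bernoulli parameter.
def log_marginal(heads, tails, a=1.0, b=1.0):
    """log p(data) for Bernoulli data under a Beta(a, b) prior."""
    return (lgamma(a + b) - lgamma(a) - lgamma(b)
            + lgamma(a + heads) + lgamma(b + tails)
            - lgamma(a + b + heads + tails))

# Query Dc: three items, all with the feature present.
log_p_dc = log_marginal(3, 0)
for x in (1, 0):
    log_joint = log_marginal(3 + x, 1 - x)   # p(x, Dc): pooled counts
    log_p_x = log_marginal(x, 1 - x)         # p(x) on its own
    print(f"x={x}: score = {exp(log_joint - log_p_x - log_p_dc):.2f}")
# Prints 1.60 for x=1 and 0.40 for x=0: a candidate that shares the
# query's feature is more plausibly from the same model.
```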
Sparse Binary Data • Assume each item xi is a binary vector xi = (xi1, …, xiJ) where each component xij ∈ {0, 1} is a binary variable from an independent Bernoulli distribution: p(xi | θ) = ∏j θj^xij (1 − θj)^(1 − xij) • The conjugate prior for a Bernoulli distribution is a Beta distribution: p(θ | α, β) = ∏j [Γ(αj + βj) / (Γ(αj) Γ(βj))] θj^(αj − 1) (1 − θj)^(βj − 1) • For a query Dc = {x1, …, xN}, define α̃j = αj + Σi xij and β̃j = βj + N − Σi xij
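A small sketch of the conjugate update above (array names and prior values are ours; the paper sets the hyperparameters from empirical feature means): given a binary query matrix, the Beta posterior parameters are just prior counts plus observed counts.

```python
import numpy as np

# Hypothetical query: N = 3 items, J = 5 binary features.
Xc = np.array([[1, 0, 1, 0, 0],
               [1, 1, 1, 0, 0],
               [1, 0, 0, 0, 1]])
N, J = Xc.shape

# Placeholder Beta prior hyperparameters.
alpha = np.full(J, 0.5)
beta = np.full(J, 0.5)

# Conjugate Beta-Bernoulli update given the query:
alpha_tilde = alpha + Xc.sum(axis=0)      # alpha_j + sum_i x_ij
beta_tilde = beta + N - Xc.sum(axis=0)    # beta_j + N - sum_i x_ij
```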
Sparse Binary Data • The score can be computed as: score(x) = ∏j [(αj + βj) / (αj + βj + N)] (α̃j / αj)^xj (β̃j / βj)^(1 − xj) • If we take the log of the score and put the entire data set into one large matrix X with J columns, we can compute a vector s of log scores for all points using a single matrix-vector multiplication: s = c + Xq, where c = Σj [log(αj + βj) − log(αj + βj + N) + log β̃j − log βj] and qj = log α̃j − log αj − log β̃j + log βj
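The vectorized log score translates directly into a few lines of NumPy; this is a sketch of that computation (function and variable names are ours):

```python
import numpy as np

def bayesian_sets_log_scores(X, query_rows, alpha, beta):
    """Log scores s = c + X q for every item, where X has one row per
    item and J binary feature columns, and query_rows indexes Dc."""
    Xc = X[query_rows]
    N = Xc.shape[0]
    alpha_t = alpha + Xc.sum(axis=0)
    beta_t = beta + N - Xc.sum(axis=0)
    c = np.sum(np.log(alpha + beta) - np.log(alpha + beta + N)
               + np.log(beta_t) - np.log(beta))
    q = (np.log(alpha_t) - np.log(alpha)
         - np.log(beta_t) + np.log(beta))
    return c + X @ q      # the single matrix-vector multiplication
```

Sorting items by descending score completes the query set; for sparse X the product Xq only touches nonzero entries, which is what makes the method fast in practice.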
Exponential Families • If the model distribution is not a Bernoulli distribution, but a member of the exponential family: p(x | θ) = f(x) g(θ) exp(θᵀ u(x)) we can use the conjugate prior: p(θ | η, ν) = h(η, ν) g(θ)^η exp(θᵀ ν) so that the score is: score(x) = [h(η + N, ν + Σi u(xi)) h(η + 1, ν + u(x))] / [h(η, ν) h(η + N + 1, ν + u(x) + Σi u(xi))]
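The exponential-family score only needs the conjugate prior's normalizer. Below is a minimal sketch, assuming the caller supplies a `log_h` function returning log h(η, ν) for their chosen family (all names here are ours):

```python
import numpy as np

def log_score_expfam(u_x, U_c, eta, nu, log_h):
    """Log of the exponential-family score above.

    u_x is u(x) for the candidate item; U_c stacks u(x_i) for the N
    query items row by row; log_h(eta, nu) is the log normalizer of
    the conjugate prior for the chosen family.
    """
    N = U_c.shape[0]
    s = U_c.sum(axis=0)                 # sum_i u(x_i) over the query
    return (log_h(eta + N, nu + s)
            + log_h(eta + 1, nu + u_x)
            - log_h(eta, nu)
            - log_h(eta + N + 1, nu + u_x + s))
```

For the Beta-Bernoulli case, log_h is a sum of log-Gamma terms per feature, and this expression reduces to the binary-data score on the previous slide.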
Experimental results • The experiments are performed on three different datasets: the Grolier Encyclopedia dataset, the EachMovie dataset and the NIPS authors dataset. • The algorithm runs very quickly on all three datasets.
Conclusions • A simple algorithm which takes a query consisting of a small set of items and returns additional items from D belonging to this set. • The score is computed w.r.t. a statistical model, and the unknown model parameters are all marginalized out. • With conjugate priors, the score can be computed exactly and efficiently. • The method does well when compared to Google Sets in terms of set completions. • The algorithm is very flexible in that it can be combined with a wide variety of data types and probabilistic models.