Dependency Networks for Collaborative Filtering and Data Visualization
UAI-2000
Presenter: Kyu-Baek Hwang
Abstract
• Dependency networks
• An alternative to Bayesian networks
• A (possibly cyclic) directed graph
• Basic properties of dependency networks
• Dependency networks for collaborative filtering
• Dependency networks for data visualization
Introduction
• A dependency network
• A collection of regression/classification models among variables, combined using Gibbs sampling
• Disadvantages
• Not useful for encoding causal relationships
• Advantages
• Quite useful for encoding predictive relationships
Representation of Joint Distribution
• In Bayesian networks: p(x) = Πi p(xi | pai)
• In dependency networks: the joint distribution is defined implicitly, via an ordered Gibbs sampler
• Initialize each variable randomly.
• Resample each Xi according to its local distribution p(xi | pai).
• Theorem 1: An ordered Gibbs sampler applied to a dependency network for X, where each Xi is discrete and each local distribution p(xi | pai) is positive, has a unique stationary distribution for X.
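The ordered Gibbs sampler above can be sketched in a few lines for binary variables. This is a minimal illustration, not the paper's implementation; the names `local_dists`, `parents`, and `n_sweeps` are illustrative.

```python
import random

def ordered_gibbs(local_dists, parents, n_vars, n_sweeps=1000, seed=0):
    """Draw one sample from a dependency network over binary variables.

    local_dists[i](pa) returns P(X_i = 1 | pa), where pa is a tuple of the
    current values of X_i's parents; parents[i] lists those parent indices.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_vars)]    # random initialization
    for _ in range(n_sweeps):
        for i in range(n_vars):                       # fixed variable order
            pa = tuple(x[j] for j in parents[i])
            x[i] = 1 if rng.random() < local_dists[i](pa) else 0
    return x
```

By Theorem 1, repeated sweeps converge to a unique stationary distribution; by Theorem 2, if the local distributions are consistent with some positive joint p(x), samples collected this way are samples from p(x).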
Conditional Distribution
• Gibbs sampling is used to answer conditional queries.
• Not a serious practical disadvantage
• Learning
• Local distributions do not represent causal relationships.
• Each local distribution can be learned without regard to acyclicity constraints.
• Consistency and inconsistency
• Inconsistent dependency networks
• The conditional distributions cannot all be obtained from a single joint distribution p(x).
• Theorem 2: If a dependency network for X is consistent with a positive distribution p(x), then the stationary distribution defined in Theorem 1 is equal to p(x).
Other Properties of Dependency Networks
• Markov networks and dependency networks
• Theorem 3: The set of positive distributions consistent with a dependency network structure is equal to the set of positive distributions defined by a Markov network structure with the same adjacencies.
• The two define the same distributions, but their representational forms differ: potentials vs. conditional probabilities.
• Minimality of a dependency network
• For every node Xi and every parent paij, Xi is not independent of paij given the remaining parents of Xi.
• Theorem 4: A minimal consistent dependency network for a positive distribution p(x) must be bi-directional.
Learning Dependency Networks
• Each local distribution for Xi is simply a regression/classification model for xi with X \ {xi} as inputs.
• Generalized linear models, neural networks, support vector machines, …
• In this paper, probabilistic decision trees were used.
• Trees are grown by a simple hill-climbing search guided by a Bayesian score.
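The key point above is that each local distribution is fit independently, with no acyclicity check across models. The sketch below makes that concrete with Laplace-smoothed counting in place of the paper's decision-tree learner; the function names and the fixed `parent_sets` input are assumptions for illustration.

```python
from collections import Counter

def learn_local(data, i, parent_idx):
    """Estimate P(X_i = 1 | pa_i) from binary data by smoothed counting."""
    ones, totals = Counter(), Counter()
    for row in data:
        pa = tuple(row[j] for j in parent_idx)
        totals[pa] += 1
        ones[pa] += row[i]
    # Laplace smoothing: an unseen parent configuration defaults to 0.5
    return lambda pa: (ones[pa] + 1) / (totals[pa] + 2)

def learn_dependency_network(data, parent_sets):
    # Each local model is fit on its own; parent_sets may contain cycles.
    return [learn_local(data, i, ps) for i, ps in enumerate(parent_sets)]
```

In the paper the parent set is not fixed in advance: the tree-growing search itself decides which input variables each local model uses.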
Collaborative Filtering
• Preference prediction
• Implicit/explicit voting
• Binary/non-binary preferences
• Bayesian network approach: infer p(xi | observed votes) for each unrated item
• In a dependency network, the same quantity is read directly from the local distribution for Xi.
Datasets for Collaborative Filtering
• MS.COM (visits to web pages)
• Nielsen (TV shows watched)
• MSNBC (stories read within the site)
Evaluation Criteria and Experimental Procedure
• Accuracy of the ranked list produced by a predictive model
• Average accuracy of a model over test cases
• Each case in the test set is randomly partitioned into <input set | measurement set>, e.g.
• <0, 1, 1, 0, 1, 0 | 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1>
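One common way to score such a ranked list is the half-life utility of Breese et al. (1998): measurement-set items are sorted by predicted probability, and items the user actually liked contribute more the nearer the top they land. Whether this deck used exactly this criterion is an assumption; the sketch below is illustrative.

```python
def cf_list_score(predicted_probs, actual_votes, halflife=5):
    """Score one test case: rank measurement items by predicted probability,
    then sum 2**(-k / (halflife - 1)) over 0-based positions k of items
    the user actually voted for (actual_votes[i] in {0, 1})."""
    order = sorted(range(len(predicted_probs)),
                   key=lambda i: predicted_probs[i], reverse=True)
    return sum(actual_votes[i] / 2 ** (k / (halflife - 1))
               for k, i in enumerate(order))
```

A model's overall accuracy is then the average of this score over all test cases, often normalized by the best achievable score.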
Results on Accuracy • Higher score indicates better performance.
Results on the Prediction Time • Number of predictions per second
Results on Computational Resources • Computational resources for model learning
Data Visualization
• Goal: visualize predictive (not causal) relationships
• Bayesian networks often interfere with the visualization of such relationships.
• Dependent or independent
• Example: DNViewer applied to Media Metrix data
DNViewer • A dependency network for Media Metrix data
DNViewer for Local Distribution • Local probability distribution
Summary and Future Work
• The dependency network
• defines a joint distribution over its variables.
• is easy to learn from data.
• is useful for collaborative filtering and data visualization.
• is specified through conditional distributions, whereas the Bayesian network is specified through a factored joint distribution.