Bayesian Optimization Algorithm, Decision Graphs, and Occam’s Razor

Martin Pelikan, David E. Goldberg, and Kumara Sastry. IlliGAL Report No. 2000020, May 2000.

Presentation Transcript


  1. Bayesian Optimization Algorithm, Decision Graphs, and Occam’s Razor • Martin Pelikan, David E. Goldberg, and Kumara Sastry • IlliGAL Report No. 2000020, May 2000

  2. Abstract • The use of various scoring metrics for Bayesian networks. • The use of decision graphs in Bayesian networks to improve the performance of the BOA. • BDe metric for Bayesian networks with decision graphs.

  3. Bayesian Networks • Two basic components in learning Bayesian networks • A scoring metric that discriminates between networks • A search algorithm that finds the network with the best value of the scoring metric • BOA (in previous work) • The complexity of the considered models was bounded by the maximum number of incoming edges into any node. • To search the space of networks, a simple greedy algorithm was used because of its efficiency.

  4. Bayesian-Dirichlet Metric • The BDe metric combines prior knowledge about the problem with the statistical data from a given data set. • Bayes' theorem: p(B|D) = p(B) p(D|B) / p(D) • The higher p(B|D), the more likely it is that the network B is a correct model of the data; p(B|D) is the Bayesian scoring metric, or posterior probability. • Moreover, the data set D is fixed, so p(D) is a constant when comparing networks.

  5. Bayesian-Dirichlet Metric • p(B): prior probability of the network B • The BDe metric gives preference to simpler networks • But this preference alone is not enough!

  6. Bayesian-Dirichlet Metric • p(B|D) • Data is a multinomial sample • Parameters are independent • The parameters associated with each variable are independent (global parameter independence) • The parameters associated with each instance of the parents of a variable are independent (local parameter independence) • Dirichlet distribution • No missing data (complete data)

  7. Bayesian-Dirichlet Metric • Often referred to as the K2 metric (when no prior information is used)
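
The formula on this slide did not survive in the transcript. As a rough sketch, assuming the usual K2 form (all Dirichlet prior counts set to 1) and data coded as integers in `range(arity)`, the score can be computed in log space. The function name `log_k2_score` and the `parents` dictionary layout are illustrative assumptions, not taken from the report.

```python
from collections import Counter
from math import lgamma

def log_k2_score(data, parents, arity=2):
    """Log of the K2 metric, assuming all prior counts equal 1.

    data    -- list of tuples, each value in range(arity)
    parents -- dict mapping node index -> list of parent indices (hypothetical layout)
    """
    total = 0.0
    for i, pa in parents.items():
        joint = Counter()  # m(x_i, pi_i): counts of value and parent configuration
        marg = Counter()   # m(pi_i): counts of parent configuration alone
        for row in data:
            key = tuple(row[j] for j in pa)
            joint[(key, row[i])] += 1
            marg[key] += 1
        # Unobserved parent configurations contribute log(1) = 0, so they are skipped.
        for key, m_pi in marg.items():
            total += lgamma(arity) - lgamma(arity + m_pi)
            for x in range(arity):
                total += lgamma(1 + joint[(key, x)])
    return total

data = [(0, 0), (1, 1), (0, 0), (1, 1)]
dependent = log_k2_score(data, {0: [], 1: [0]})
independent = log_k2_score(data, {0: [], 1: []})
```

The score is a sum of one term per variable, so a change to the parents of one node only requires recomputing that node's term. On the toy data, where the second variable copies the first, `dependent` is larger than `independent`.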

  8. Minimum Description Length Metric • Not well suited to incorporating prior information

  9. Constructing a Network • Constructing the best network is NP-complete. • Most of the commonly used metrics can be decomposed into independent terms, each of which corresponds to one variable. • Empirical results show that more sophisticated search algorithms do not improve the obtained result significantly.

  10. Decision Graphs in Bayesian Networks • The use of local structures such as decision trees, decision graphs, and default tables to represent equalities among parameters was proposed. • The network construction algorithm takes advantage of decision graphs by directly manipulating the network structure through the graphs.

  11. Decision Graphs • A decision graph is an extension of a decision tree in which each non-root node can have multiple parents.
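
As an illustration only, a decision graph for one variable might be represented as below. The class names `Leaf` and `Split`, the helper `find_leaf`, and the probability values are hypothetical. The point is that a leaf can be reached along several paths, which a tree does not allow.

```python
class Leaf:
    """Terminal node holding one shared set of conditional probabilities."""
    def __init__(self, p_one):
        self.p_one = p_one  # p(x_i = 1 | context); illustrative value only

class Split:
    """Internal node that branches on the value of one parent variable."""
    def __init__(self, var, children):
        self.var = var            # index of the variable tested at this node
        self.children = children  # children[v] is followed when var has value v

def find_leaf(node, assignment):
    """Follow the tests from the root until a leaf is reached."""
    while isinstance(node, Split):
        node = node.children[assignment[node.var]]
    return node

shared = Leaf(0.9)
other = Leaf(0.2)
# Graph for x_0: test x_1 first; if x_1 == 1, test x_2.
# The leaf `shared` has two parents, so this is a graph rather than a tree.
graph = Split(1, [shared, Split(2, [other, shared])])

leaf_a = find_leaf(graph, {1: 0, 2: 0})
leaf_b = find_leaf(graph, {1: 1, 2: 1})
leaf_c = find_leaf(graph, {1: 1, 2: 0})
```

Here the contexts `x_1 = 0` and `x_1 = 1, x_2 = 1` end in the same leaf, so they share one parameter set.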

  12. Advantages of Decision Graphs • Far fewer parents can be used to represent a model • A more complex class of models, called Bayesian multinets, can be learned • Construction proceeds in smaller and more specific steps, which results in better models with respect to their likelihood. • A network complexity measure can be incorporated into the scoring metric.

  13. Bayesian Score for Networks with Decision Graphs

  14. Operators on Decision Graphs • Split • Merge
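
The slide's figure is missing, so here is a minimal sketch of what the two operators could look like on a pointer-based graph. It assumes that a split turns a leaf into a test on one variable with a fresh leaf per value, and that a merge makes two leaves one. The `Node` class and all function names are hypothetical.

```python
class Node:
    """Graph node; var is None for a leaf, otherwise the variable tested."""
    def __init__(self):
        self.var = None
        self.children = []

def split(leaf, var, arity=2):
    """Turn a leaf into a test on `var` with one new leaf per value."""
    leaf.var = var
    leaf.children = [Node() for _ in range(arity)]
    return leaf.children

def merge(root, keep, drop):
    """Redirect every edge that points at leaf `drop` to leaf `keep`."""
    stack, seen = [root], set()
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        node.children = [keep if c is drop else c for c in node.children]
        stack.extend(node.children)
    return keep

def leaves(root):
    """Collect each distinct leaf reachable from the root once."""
    found, stack, seen = [], [root], set()
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        if node.var is None:
            found.append(node)
        else:
            stack.extend(node.children)
    return found

root = Node()
l0, l1 = split(root, 1)   # test x_1 at the root
l10, l11 = split(l1, 2)   # test x_2 when x_1 == 1
merge(root, l0, l11)      # contexts x_1=0 and (x_1=1, x_2=1) now share a leaf
```

After these three operations the graph has two distinct leaves instead of three, and `l0` has two parents.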

  15. Constructing BN with DG • (1) Initialize a decision graph Gi for each node xi to a graph containing only a single leaf. • (2) Initialize the network B to an empty network. • (3) Choose the best split or merge that does not result in a cycle in B. • (4) If the best operator does not improve the score, finish.

  16. Constructing BN with DG • (5) Execute the chosen operator. • (6) If the operator was a split, update the network B by adding a new edge. • (7) Go to (3).
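
The construction loop on the two slides above can be sketched as follows. This is a simplified sketch under several assumptions: variables are binary, leaves are scored with a K2-style term (prior counts of 1), each leaf is stored as the list of data rows that reach it, and only the split operator is implemented (merge is omitted to keep the example short). All function names are hypothetical.

```python
from math import lgamma

def leaf_score(rows, data, i):
    """K2-style log term for one leaf of the graph of binary variable x_i."""
    ones = sum(data[r][i] for r in rows)
    zeros = len(rows) - ones
    return lgamma(2) - lgamma(2 + len(rows)) + lgamma(1 + ones) + lgamma(1 + zeros)

def reaches(edges, src, dst):
    """True if dst can be reached from src along (parent, child) edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(c for (p, c) in edges if p == node)
    return False

def build_network(data, n):
    # Step 1: one single-leaf graph per variable (a leaf = rows reaching it).
    leaves = {i: [list(range(len(data)))] for i in range(n)}
    # Step 2: empty network.
    edges = set()
    while True:
        best = None  # (gain, i, leaf index, new leaves, j)
        # Step 3: find the best split that keeps the network acyclic.
        for i in range(n):
            for k, rows in enumerate(leaves[i]):
                for j in range(n):
                    new_edge = (j, i) not in edges
                    # A new edge j -> i closes a cycle if j is reachable from i.
                    if j == i or (new_edge and reaches(edges, i, j)):
                        continue
                    parts = [[r for r in rows if data[r][j] == v] for v in (0, 1)]
                    gain = sum(leaf_score(p, data, i) for p in parts)
                    gain -= leaf_score(rows, data, i)
                    if best is None or gain > best[0]:
                        best = (gain, i, k, parts, j)
        # Step 4: stop when no operator improves the score.
        if best is None or best[0] <= 0:
            return edges, leaves
        # Steps 5 and 6: apply the split and record the edge j -> i.
        _, i, k, parts, j = best
        leaves[i][k:k + 1] = parts
        edges.add((j, i))

data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
edges, leaves = build_network(data, 3)
```

On this toy data the first two variables are identical and the third is unrelated, so the loop adds a single edge between the first two and then stops.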

  17. Experiments • One-max • 3-deceptive • Spin-glass • Graph bisection
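
For orientation, the first two test problems can be written in a few lines. The slide does not give the definitions, so the block values used for 3-deceptive below (0.9, 0.8, 0, 1 for zero to three ones per block) are an assumption based on a commonly used form of the function, and the function names are illustrative.

```python
# Assumed fitness of one 3-bit block as a function of its number of ones.
TRAP3 = {0: 0.9, 1: 0.8, 2: 0.0, 3: 1.0}

def onemax(bits):
    """One-max: the number of ones in the string."""
    return sum(bits)

def deceptive3(bits):
    """3-deceptive: sum of block fitnesses over consecutive 3-bit blocks.

    Assumes len(bits) is a multiple of 3.
    """
    return sum(TRAP3[sum(bits[i:i + 3])] for i in range(0, len(bits), 3))

f_onemax = onemax([1, 0, 1, 1])
f_trap_best = deceptive3([1, 1, 1, 1, 1, 1])
f_trap_zero = deceptive3([0, 0, 0, 0, 0, 0])
```

Within a block, fitness falls as ones are added until the block is all ones, so treating bits independently leads away from the optimum. This is why the problem is used to test whether the model captures dependencies between variables.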
