Learning Bayesian Network using Genetic Algorithms

Ashish Kalya (200701195), Dhirubhai Ambani Institute of Information & Communication Technology (DA-IICT), Gujarat. Mentor: Prof. Suman Mitra.


Presentation Transcript


  1. Learning Bayesian Network using Genetic Algorithms
     • Ashish Kalya (200701195)
     • Dhirubhai Ambani Institute of Information & Communication Technology (DA-IICT), Gujarat
     • Mentor: Prof. Suman Mitra

  2. Introduction to Bayesian Network
     • A directed acyclic graph (DAG) represents the Bayesian network structure
     • Conditional probability distributions form the Bayesian network parameters
     • Image source: "bayesiangraph.png", 1 May 2011, http://www.ra.cs.uni-tuebingen.de/software/JCell/images/docbook/bayesianGraph.png

  3. Two Approaches for Learning Bayesian Structure
     • Constraint-based methods: find a Bayesian network structure whose conditional independence constraints match those found in the data.
     • Heuristic search methods: traverse the search space heuristically, looking for high-scoring structures, to find the DAG that best explains the data (i.e. could have generated the data). Example: K2 algorithm.

  4. The Need for Heuristic Search Algorithms
     Ideally we would search the space of all DAGs exhaustively and find the DAG that maximizes the Bayesian scoring criterion. However, this becomes infeasible even for a modest number of nodes [6]:

     Number of nodes    Number of DAGs
     0                  1
     1                  1
     2                  3
     3                  25
     4                  543
     5                  29281
     6                  3781503
     7                  1138779265
     8                  783702329343
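
The counts in the table can be reproduced with Robinson's recurrence for the number of labeled DAGs, a(n) = Σ_{k=1..n} (-1)^(k+1) C(n,k) 2^(k(n-k)) a(n-k) with a(0) = 1. The slides do not say how the table was produced; the function name `count_dags` below is chosen only for illustration.

```python
from functools import lru_cache
from math import comb  # Python 3.8+


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes, via Robinson's recurrence."""
    if n == 0:
        return 1
    # k is the number of nodes with no incoming edge; inclusion-exclusion
    # over the choice of those k nodes gives the alternating sign.
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


for n in range(9):
    print(n, count_dags(n))
```

The count grows super-exponentially, which is why exhaustive search is ruled out even for an 8-node network.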

  5. Genetic Algorithms (GAs): Overview
     • Inspired by the process of biological evolution
     • Encoding: each individual is coded as a finite-length string called a chromosome, generally a binary string
     • Fitness function: maps the space of strings to the set of rational numbers, giving each individual a fitness value

  6. Components of GAs
     • Selection: the probability of selecting an individual as a parent is inversely proportional to its fitness value
     • Crossover: strings (chromosomes) are randomly mixed to form new offspring.
       Parents: 00000000, 11111111   Offspring: 11100000, 00011111
     • Mutation: randomly changes a string (chromosome).
       Parent: 00000000   Offspring: 00100000
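
The two operators can be sketched as follows on binary strings. This is a minimal illustration, not the presentation's code: the function names are hypothetical, and the crossover point is passed explicitly so the slide's example can be reproduced (in a GA it would be drawn at random).

```python
import random
from typing import Tuple


def one_point_crossover(a: str, b: str, point: int) -> Tuple[str, str]:
    """Swap the tails of two equal-length chromosomes after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chrom: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[bit] if rng.random() < rate else bit for bit in chrom)


# Crossing the slide's parents at point 3 yields the slide's two offspring.
print(one_point_crossover("00000000", "11111111", 3))
# One possible mutation outcome; the result depends on the random seed.
print(mutate("00000000", 0.1, random.Random(0)))
```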

  7. The Evolutionary Cycle
     [Figure: the evolutionary cycle. Labels: initiate & evaluation, population, selection, parents, crossover & mutation, modified offspring, evaluate, evaluated offspring. Caption: Introduction to Genetic Algorithms]

  8. Related Work
     • A genetic algorithm based on the score-based greedy algorithm approach was proposed in [2].
     • Semantic crossover and mutation operators were introduced in [3].
     • Encoding of individuals using a dual chromosome was done in [7]; we used that approach to generate our initial population.

  9. Scoring Metric == Fitness Function
     • Maximum Likelihood Estimate (MLE)
     • Bayesian Information Criterion (BIC)
     • BIC penalizes network complexity, since simple networks are desirable [1]
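
The slides do not give the exact formula used. A common form of BIC for discrete data is the maximized log-likelihood minus (log N / 2) times the number of free parameters, and the sketch below assumes that form. The function name `bic_score`, the data layout (a list of dicts), and the `parents` and `arity` arguments are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def bic_score(
    data: List[Dict[str, int]],
    parents: Dict[str, Sequence[str]],
    arity: Dict[str, int],
) -> float:
    """BIC of a discrete Bayesian network structure (assumed standard form)."""
    n_cases = len(data)
    log_lik = 0.0
    n_params = 0
    for node, pa in parents.items():
        joint = Counter()  # counts of (parent configuration, node value)
        marg = Counter()   # counts of parent configuration alone
        for row in data:
            cfg = tuple(row[p] for p in pa)
            joint[(cfg, row[node])] += 1
            marg[cfg] += 1
        # Log-likelihood under the maximum likelihood parameter estimates.
        log_lik += sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
        q = math.prod(arity[p] for p in pa)  # number of parent configurations
        n_params += (arity[node] - 1) * q
    return log_lik - 0.5 * math.log(n_cases) * n_params


# Toy data in which Y always copies X, so the edge X -> Y should score higher.
data = [{"X": 0, "Y": 0}, {"X": 0, "Y": 0}, {"X": 1, "Y": 1}, {"X": 1, "Y": 1}]
arity = {"X": 2, "Y": 2}
print(bic_score(data, {"X": [], "Y": ["X"]}, arity))
print(bic_score(data, {"X": [], "Y": []}, arity))
```

The first term is the MLE-based likelihood score from the first bullet; the second term is the complexity penalty that distinguishes BIC from it.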

  10. Simulation Details
     • Algorithm: GA with elitist model
     • Encoding: a DAG is represented by the string A11 A12 ... A21 A22 ... A1N A2N ... ANN, formed from the entries of its adjacency matrix A
     • Fitness function: -1 * Bayesian Information Criterion metric
     • One-point crossover with crossover rate = 0.9
     • Mutation rate = 0.01, 0.1 and variable rate (see fig.)
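
A minimal sketch of this kind of encoding, flattening an N x N adjacency matrix into a binary string and back. The row-by-row ordering used here is an assumption; the order of entries in the presentation's string may differ. The names `encode` and `decode` are hypothetical.

```python
from typing import List


def encode(adj: List[List[int]]) -> str:
    """Flatten an adjacency matrix into a bit string (row-by-row order assumed)."""
    return "".join(str(bit) for row in adj for bit in row)


def decode(chrom: str, n: int) -> List[List[int]]:
    """Rebuild the n x n adjacency matrix from a bit string produced by encode."""
    return [[int(chrom[i * n + j]) for j in range(n)] for i in range(n)]


adj = [[0, 1, 1],
       [0, 0, 1],
       [0, 0, 0]]
chrom = encode(adj)
print(chrom)
print(decode(chrom, 3) == adj)
```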

  11. Simulation Details
     • Initial population: a set of random DAGs
     • How to generate random DAGs:
       • A strictly upper triangular adjacency matrix always represents an acyclic graph
       • Permute the order of the nodes, then rearrange the rows and columns of the matrix correspondingly
     • Example: permute (1 2 3) = (1 3 2); the rows and columns labeled 1 2 3 are reordered as 1 3 2 [matrix figure not recovered]
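
The procedure above can be sketched as follows. The edge probability `edge_prob` is an assumption (the slides do not say how the upper triangular entries were filled), and `random_dag` and `is_acyclic` are hypothetical names; `is_acyclic` is included only to check the result.

```python
import random
from typing import List


def random_dag(n: int, edge_prob: float, rng: random.Random) -> List[List[int]]:
    """Random DAG: strictly upper triangular matrix with relabeled nodes."""
    # Edges only go from i to j with i < j, so no cycle is possible.
    upper = [
        [1 if i < j and rng.random() < edge_prob else 0 for j in range(n)]
        for i in range(n)
    ]
    perm = list(range(n))
    rng.shuffle(perm)
    # Relabel node i as perm[i]; rows and columns move together.
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            adj[perm[i]][perm[j]] = upper[i][j]
    return adj


def is_acyclic(adj: List[List[int]]) -> bool:
    """Kahn's algorithm: a graph is acyclic iff every node can be peeled off."""
    n = len(adj)
    indeg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    stack = [v for v in range(n) if indeg[v] == 0]
    seen = 0
    while stack:
        u = stack.pop()
        seen += 1
        for v in range(n):
            if adj[u][v]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    stack.append(v)
    return seen == n


dag = random_dag(8, 0.3, random.Random(42))
print(is_acyclic(dag))
```

Relabeling nodes cannot create a cycle, so every matrix produced this way stays acyclic even though it is generally no longer upper triangular.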

  12. ASIA Network Structure
     “A very small belief network for a fictitious medical example about whether a patient has tuberculosis, lung cancer or bronchitis, related to their X-ray, dyspnea, visit-to-Asia and smoking status.” [8]
     Image source: "asia.png", 1 May 2011, http://www.stanford.edu/class/cs221/project2_files/asia.png

  13. Simulation Details
     • Algorithms simulated on data sets of 500, 1000, 2000 and 5000 cases
     • Number of generations considered: 50, 100, 150 and 250
     • Population sizes considered: 10, 20, 50 and 100

  14. Issues Faced
     • Both the crossover and mutation operators can generate individuals that are not DAGs.
     • Cyclic directed graphs need to be detected and their cycles removed.

  15. Modified GA with Elitist Model
     • A simple directed graph G has a directed cycle if and only if there is a back edge in DFS(G) [5]
     • Once all new individuals have been generated, we check each for back edges and remove any that are found.
     [Figure: the evolutionary cycle with an added "remove cycle" step. Labels: initiate & evaluation, population, selection, parents, crossover & mutation, remove cycle, evaluate, evaluated offspring]
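
A minimal sketch of back-edge removal with a depth-first search, assuming the graph is given as an adjacency matrix. Which edge of a cycle gets removed depends on the DFS visiting order; the slides do not specify that order, and `remove_cycles` is a hypothetical name.

```python
from typing import List


def remove_cycles(adj: List[List[int]]) -> List[List[int]]:
    """Return a copy of adj with every DFS back edge deleted."""
    n = len(adj)
    result = [row[:] for row in adj]
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = [WHITE] * n

    def visit(u: int) -> None:
        color[u] = GRAY
        for v in range(n):
            if not result[u][v]:
                continue
            if color[v] == GRAY:
                # v is an ancestor of u (or u itself): this is a back edge.
                result[u][v] = 0
            elif color[v] == WHITE:
                visit(v)
        color[u] = BLACK

    for start in range(n):
        if color[start] == WHITE:
            visit(start)
    return result


# The 3-cycle 0 -> 1 -> 2 -> 0 loses its closing edge 2 -> 0.
print(remove_cycles([[0, 1, 0], [0, 0, 1], [1, 0, 0]]))
```

Because only back edges are deleted, a graph that is already a DAG is returned unchanged.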

  16. Observations and Analysis (1)
     Observed values:
     • AF: the fitness value of the best individual of each GA run, averaged over 10 runs
     • AH: the Hamming distance between the string representing the best individual of each GA run and the string representing the original structure, averaged over 10 runs
     • Example: the Hamming distance between 0000011 and 0010010 is 2
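
The Hamming distance counts the positions at which two equal-length strings differ. A short sketch that reproduces the slide's example (the function name `hamming` is chosen for illustration):

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


print(hamming("0000011", "0010010"))  # 2
```

Applied to encoded adjacency matrices, each differing position corresponds to one edge that is present in one structure and absent in the other.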

  17. Observations and Analysis (2)
     • For 500 and 1000 cases, AF values lower than that of the original network were reached quite frequently.
     • For larger numbers of cases, a larger population size gives better results, but for 500 cases no such difference was observed between population sizes of 50 and 100.
     • A smaller data size reduces the impact of population size and number of generations.

  18. Observations and Analysis (3)
     • Very close or similar values of AF can correspond to very different values of AH.
     • For the normal GA, low and variable mutation rates gave comparatively better results. For the modified GA, a high mutation rate clearly gave better results.
     • The modified GA performed better.

  19. Conclusions
     • Less than a 0.00000032 fraction of the search space was explored.
     • Results for AH using the modified GA are comparable with those obtained from the K2 algorithm for the ASIA network [9].
     • GAs make sense because approximate answers are acceptable, especially when the number of cases is not large.
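
The slides do not show how the explored fraction was computed. One way to arrive at a bound of this size, offered purely as an assumption, is to divide the largest number of structures the GA could have evaluated (largest population size times largest number of generations times the 10 runs) by the number of DAGs on the 8 nodes of the ASIA network from slide 4:

```python
# Assumed accounting: every individual in every generation of every run
# counts as one explored structure, and none of them repeat.
N_DAGS_8 = 783_702_329_343  # labeled DAGs on 8 nodes (table on slide 4)
population_size = 100       # largest population size considered
generations = 250           # largest number of generations considered
runs = 10                   # runs averaged in the observations

evaluated = population_size * generations * runs
fraction = evaluated / N_DAGS_8
print(f"{fraction:.2e}")
```

Under these assumptions the fraction stays below 0.00000032, consistent with the first conclusion.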

  20. Future Work
     • Remove cycles by making an informed choice about which edge to remove
     • Carry out these simulations on larger and more complex networks

  21. Acknowledgements
     • I would like to thank Prof. Suman Mitra for the initial conceptualization of the idea. His regular inputs and the study material he provided were of great help.

  22. References
     • [1] R. E. Neapolitan, Learning Bayesian Networks, Prentice Hall Series in Artificial Intelligence, Prentice Hall, December 2000.
     • [2] P. Larrañaga, M. Poza, Y. Yurramendi, R. H. Murga, and C. M. H. Kuijpers, "Structure learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 18, no. 9, 1996.
     • [3] S. Shetty and M. Song, "Structure learning of Bayesian networks using a semantic genetic algorithm-based approach," in Third International Conference on Information Technology: Research and Education (ITRE 2005), 2005, pp. 454–458.
     • [4] R. Etxeberria, P. Larrañaga, and J. M. Pikaza, "Analysis of the behaviour of the genetic algorithms when searching Bayesian networks from data," Pattern Recognition Letters, vol. 18, no. 11-13, pp. 1269-1273, 1997.
     • [5] J. Bang-Jensen and G. Z. Gutin, Digraphs: Theory, Algorithms and Applications, 2nd ed., Springer, 2010.
     • [6] B. D. McKay, G. F. Royle, I. M. Wanless, F. E. Oggier, N. J. A. Sloane, and H. Wilf, "Acyclic digraphs and eigenvalues of (0,1)-matrices," Journal of Integer Sequences, vol. 7, Article 04.3.3, 2004, http://www.cs.uwaterloo.ca/journals/JIS/VOL7/Sloane/sloane15.html
     • [7] J. Lee, W. Chung, and E. Kim, "Structure Learning of Bayesian Networks Using Dual Genetic Algorithm," IEICE Trans. Inf. & Syst., 2007.
     • [8] "Norsys Bayes Net Library" [online], accessed 22 April 2011, http://www.norsys.com/networklibrary.html#
     • [9] K. P. Murphy, Bayes Net Toolbox, Technical Report, MIT Artificial Intelligence Laboratory, 2002.

  23. Thank You
