Automatic Threshold Selection for Conditional Independence Tests in Learning Bayesian Networks

Rafi Bojmel, supervised by Dr. Boaz Lerner

Overview
• Machine Learning (ML) investigates the mechanisms by which knowledge is acquired through experience.
• Hard-core ML-based applications:
  • Web search engines, on-line help services
  • Document processing (text classification, OCR)
  • Biological data analysis, military applications
• The Bayesian network (BN) has become one of the most studied machine learning models for knowledge representation, probabilistic inference and, recently, classification.

BN Example (1): Chest Clinic (Asia) Problem
[Figure: Bayesian network over the nodes Recent visit to Asia, Smoker, Tuberculosis, Lung cancer, Bronchitis, Either Tuberculosis or Lung cancer, Positive X-ray, Dyspnea (shortness of breath)]

BN Example (2): Chest Clinic (Asia) Problem
[Figure: the same Chest Clinic network with the Markov blanket of Lung cancer highlighted]
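The Markov blanket of a node consists of its parents, its children, and the other parents of its children. The figure itself did not survive in this transcript, so the minimal sketch below assumes the edge list under which the Asia network is usually published. The names `EDGES` and `markov_blanket` are illustrative only.

```python
# Edges of the Asia network as usually published (an assumption here, since
# the slide's figure is not available); each pair is (parent, child).
EDGES = [
    ("Asia", "Tuberculosis"),
    ("Smoker", "Lung cancer"),
    ("Smoker", "Bronchitis"),
    ("Tuberculosis", "Either"),
    ("Lung cancer", "Either"),
    ("Either", "X-ray"),
    ("Either", "Dyspnea"),
    ("Bronchitis", "Dyspnea"),
]


def markov_blanket(node, edges):
    """Return the parents, children, and children's other parents of `node`."""
    parents = {p for p, c in edges if c == node}
    children = {c for p, c in edges if p == node}
    spouses = {p for p, c in edges if c in children and p != node}
    return parents | children | spouses


print(sorted(markov_blanket("Lung cancer", EDGES)))
# → ['Either', 'Smoker', 'Tuberculosis']
```

Under this assumed structure, Lung cancer is shielded from the rest of the network by Smoker (parent), Either (child), and Tuberculosis (the other parent of Either).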

Bayesian Networks
• Inference (e.g., classification)
• Learning Bayesian networks
  • Structure learning (structure/graph)
    • Search-and-score
    • Constraint-based
  • Parameter learning

BN Structure Learning

      Asia  Smoker  Tuberculosis  Lung cancer  Bronchitis  Either  X-ray  Dyspnea
#1    0     1       0             0            1           0       0      1
#2    0     0       0             0            1           0       0      1
#3    1     1       0             1            1           1       0      1
#4    0     1       1             1            0           1       1      0
#5    1     0       1             0            0           1       1      0
#6    0     0       0             0            0           0       0      1
…     …     …       …             …            …           …       …      …

• Database → Training set → Model construction → Test set → Bayesian inference (classification)
• Two main approaches to BN structure learning:
  • Search-and-score: uses a heuristic search method.
  • Constraint-based (CB): analyzes dependency relationships among nodes using conditional independence (CI) tests. The PC algorithm is a CB algorithm.

PC algorithm (1)
• Inputs:
  • V: set of variables (and corresponding database)
  • I*(Xi,Xj|{S}) <> ε: a test of conditional independence
  • ε: threshold
  • Order{V}: ordering of V
• Output:
  • Directed Acyclic Graph (DAG)
Notation:
• Xi, Xj = any two nodes in the graph
• I*(Xi,Xj|{S}) = normalized conditional mutual information
• {S} = subset of variables (other than Xi, Xj)
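To make the CI test concrete, here is a minimal sketch of an empirical conditional mutual information estimate from a table of discrete cases. It rests on several assumptions. The slides do not say how I* is normalized, so the value is left unnormalized (in nats). The function names, the default `eps`, and the reading "below ε means independent" are illustrative choices, not the authors' implementation.

```python
from collections import Counter
from math import log


def conditional_mutual_information(rows, i, j, cond=()):
    """Empirical I(Xi; Xj | S) in nats from a list of equal-length tuples.

    `i` and `j` are column indices; `cond` holds the column indices of S.
    With cond=() this reduces to ordinary mutual information.
    """
    n = len(rows)
    n_xys = Counter((r[i], r[j], tuple(r[k] for k in cond)) for r in rows)
    n_xs = Counter((r[i], tuple(r[k] for k in cond)) for r in rows)
    n_ys = Counter((r[j], tuple(r[k] for k in cond)) for r in rows)
    n_s = Counter(tuple(r[k] for k in cond) for r in rows)
    total = 0.0
    for (x, y, s), c in n_xys.items():
        # p(x,y,s) * log[ p(x,y|s) / (p(x|s) p(y|s)) ], written with counts
        total += (c / n) * log(c * n_s[s] / (n_xs[(x, s)] * n_ys[(y, s)]))
    return total


def is_independent(rows, i, j, cond=(), eps=0.01):
    """Assumed decision rule: declare independence when the estimate is below eps."""
    return conditional_mutual_information(rows, i, j, cond) < eps
```

For example, `is_independent(rows, 0, 1, cond=(2,))` would test columns 0 and 1 given column 2. The threshold `eps` is exactly the quantity the presentation aims to select automatically.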

PC algorithm (2)
• The algorithm contains three stages:
  • Stage I: Start from the complete graph and find an undirected graph using conditional independence tests.
  • Stage II: Find some head-to-head links (V-structures): X – Y – Z becomes X → Y ← Z.
  • Stage III: Orient all remaining links that can be oriented.
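Stages I and II can be sketched as follows. This is a simplified illustration, not the authors' implementation. The `independent(x, y, s)` callback stands in for the thresholded test on I*. The function names and the toy oracle for a single collider are hypothetical, and Stage III is omitted.

```python
from itertools import combinations


def pc_skeleton(nodes, independent):
    """Stage I: prune the complete graph with CI tests of growing order."""
    adj = {v: set(nodes) - {v} for v in nodes}
    sepset = {}
    order = 0
    # Continue while some node has enough neighbours to form a condition set.
    while any(len(adj[x]) - 1 >= order for x in nodes):
        for x in nodes:
            for y in list(adj[x]):
                for s in combinations(sorted(adj[x] - {y}), order):
                    if independent(x, y, set(s)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(s)
                        break
        order += 1
    return adj, sepset


def orient_v_structures(nodes, adj, sepset):
    """Stage II: orient X - Y - Z as X -> Y <- Z when X and Z are
    non-adjacent and Y is not in their separating set."""
    arrows = set()
    for y in nodes:
        for x, z in combinations(sorted(adj[y]), 2):
            if z not in adj[x] and y not in sepset.get(frozenset((x, z)), set()):
                arrows.add((x, y))
                arrows.add((z, y))
    return arrows


def collider_oracle(x, y, s):
    """Hypothetical CI oracle for X -> Y <- Z: X and Z are independent
    only while Y is not conditioned on."""
    return {x, y} == {"X", "Z"} and "Y" not in s


nodes = ["X", "Y", "Z"]
adj, sepset = pc_skeleton(nodes, collider_oracle)
print(sorted(orient_v_structures(nodes, adj, sepset)))
# → [('X', 'Y'), ('Z', 'Y')]
```

With real data, the oracle would be replaced by a test that compares the estimated conditional mutual information against ε, which is why the choice of ε directly shapes the learned graph.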

PC Algorithm Simulation
[Figure: step-by-step simulation on the Chest Clinic (Asia) network through Stage I, Stage II (V-structures marked) and Stage III, ending with the precise structure]

Threshold Selection – existing methods
The “risk” in selecting the wrong threshold:
• Too small → too many edges (causality, run-time)
• Too large → lose important edges (inaccuracy)
• Arbitrary (trial-and-error) selection
  • Disadvantages: haphazardness, inaccuracy, time
• Likelihood- or classifier-accuracy-based selection
  • Disadvantages: exponential run-time

Threshold selection - Novel Technique (1)
Based on the probability density functions (PDFs) of mutual information (MI) values:
• Calculate the MI values, I*(Xi,Xj | {S}), for different sizes (orders) of the condition set S.
• Create histograms (a PDF estimation technique).
• Techniques to define the best threshold automatically:
  • Zero-Crossing-Decision (ZCD)
  • Best-Candidate (BC)
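The first two steps can be illustrated for order 0 (empty condition set) with a small sketch. It uses only the six cases visible in the example table earlier in the deck, which is far too few for a meaningful estimate. It computes unnormalized MI, since the normalization behind I* is not given, and it stops at the histogram because the slides do not define the ZCD and BC rules. All function names and the bin count are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations
from math import log

# The six cases shown in the deck's example table
# (columns: Asia, Smoker, Tuberculosis, Lung cancer, Bronchitis, Either, X-ray, Dyspnea).
ROWS = [
    (0, 1, 0, 0, 1, 0, 0, 1),
    (0, 0, 0, 0, 1, 0, 0, 1),
    (1, 1, 0, 1, 1, 1, 0, 1),
    (0, 1, 1, 1, 0, 1, 1, 0),
    (1, 0, 1, 0, 0, 1, 1, 0),
    (0, 0, 0, 0, 0, 0, 0, 1),
]


def mutual_information(rows, i, j):
    """Empirical order-0 mutual information (nats) between columns i and j."""
    n = len(rows)
    n_xy = Counter((r[i], r[j]) for r in rows)
    n_x = Counter(r[i] for r in rows)
    n_y = Counter(r[j] for r in rows)
    return sum((c / n) * log(c * n / (n_x[x] * n_y[y]))
               for (x, y), c in n_xy.items())


def histogram(values, bins, lo, hi):
    """Count values in `bins` equal-width bins over [lo, hi]; clamp outliers."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        k = min(int((v - lo) / width), bins - 1)
        counts[max(k, 0)] += 1
    return counts


n_vars = len(ROWS[0])
mi_values = [mutual_information(ROWS, i, j)
             for i, j in combinations(range(n_vars), 2)]
# For binary variables, MI is bounded above by log(2).
print(histogram(mi_values, bins=5, lo=0.0, hi=log(2)))
```

A threshold-selection rule would then operate on such histograms, one per condition-set order.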

Threshold selection - Novel Technique (2)

Zero-Crossing-Decision (ZCD)
[Figure: two panels, ZCD (order=0) and ZCD (order=1)]

Experiments and Results
• Classification experiments were performed on 8 real-world databases from the UCI Repository.
• Database sizes: 128 - 3,200 cases.
• Graph sizes: 5 - 17 nodes.
• Dimension of class variable: 2 - 10.

Summary
• The PC algorithm requires selecting a threshold for structure learning, a time-consuming process that also undermines automatic structure learning.
• An initial examination of our novel techniques suggests that they can both automate the process and improve performance.
• Further research is under way to validate and improve the proposed techniques.
