Introduction to Bayesian Networks
4/20/2005
CSE634 Data Mining, Prof. Anita Wasilewska
105269827 Hiroo Kusaba
Software Engineering Laboratory
References
• [1] D. Heckerman: “A Tutorial on Learning with Bayesian Networks”, in “Learning in Graphical Models”, ed. M. I. Jordan, The MIT Press, 1998.
• [2] http://www.cs.huji.ac.il/~nir/Nips01-Tutorial/
• [3] Jiawei Han: “Data Mining: Concepts and Techniques”, ISBN 1-55860-489-8
• [4] J. Whittaker: “Graphical Models in Applied Multivariate Statistics”, John Wiley and Sons, 1990.
Contents
• Brief introduction
• Review
• A little review of probability
• Bayes theorem
• Bayesian classification
• Steps of using a Bayesian network
• Random variables: X, Y, Xi, Θ (capital letters)
• Condition (or value) of a variable: x, y, xi, θ (lowercase letters)
• Set of variables: X, Y, Xi, Θ (bold capital letters)
• Set of conditions (or values): x, y, xi, θ (bold lowercase letters)
• P(x|a): probability that an event x occurs under the condition a
What is a Bayesian Network?
• A network that expresses the dependencies among random variables
• Each node has a conditional probability distribution that depends on its parent variables
• The whole network also expresses the joint probability distribution over all of the random variables: P(x1, ..., xn) = Π_i P(xi|pai)
• Pai is the set of parent(s) of node i, and pai is its value
How is it used?
• Bayesian learning: estimating the dependencies between the random variables from actual data
• Bayesian inference: when the values of some random variables are known, it calculates the probabilities of the others
• Example: with a patient's conditions as random variables, it predicts the disease from the observed conditions
What is so good about it?
• Conditional independencies and the graphical representation capture the structure of many real-world distributions [1]
• A learned model can be used for many tasks
• Supports all the features of probabilistic learning, such as model selection criteria and dealing with missing data and hidden variables
Example of Bayesian Network
• Structure of the network: X → Y → Z
• X, Y, Z are random variables, each of which takes the value 0 or 1
• Conditional probabilities: p(X), p(Y|X), p(Z|Y)
Example of Bayesian Network 2
• What is the joint probability P(X, Y, Z)?
• P(X, Y, Z) = P(X)*P(Y|X)*P(Z|Y)
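
The factorization above can be checked numerically. The following is a minimal Python sketch for the chain X → Y → Z; all probability values in the tables are hypothetical numbers chosen for illustration, not values from the slides.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain X -> Y -> Z.
# Every number below is an illustrative assumption.
p_x = {0: 0.7, 1: 0.3}                                    # P(X)
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}  # P(Y|X), indexed [x][y]
p_z_given_y = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # P(Z|Y), indexed [y][z]


def joint(x, y, z):
    """P(X=x, Y=y, Z=z) = P(x) * P(y|x) * P(z|y)."""
    return p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]


# Enumerate all 2^3 assignments to build the full joint distribution.
joint_table = {(x, y, z): joint(x, y, z) for x, y, z in product((0, 1), repeat=3)}

# A valid joint distribution sums to 1.
total = sum(joint_table.values())

# Marginal P(Z=1), obtained by summing the joint over X and Y.
p_z1 = sum(p for (x, y, z), p in joint_table.items() if z == 1)

print(f"sum of joint = {total:.4f}, P(Z=1) = {p_z1:.4f}")
```

Three small tables (1 + 2 + 2 free parameters) are enough to define all 8 joint probabilities, which is the saving the network structure provides.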
A little review of probability 1
• Probability: how likely is it that an event will happen?
• Sample space S
• An element of S is an elementary event
• An event A is a subset of S
• P(A) ≥ 0
• P(S) = 1
A little review of probability 2
• Discrete probability distribution: P(A) = Σ_{s∈A} P(s)
• Conditional probability distribution: P(A|B) = P(A, B) / P(B)
• If the events are independent: P(A, B) = P(A)*P(B)
• Bayes theorem
(Figure: events A and B)
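
As an illustration of these definitions, here is a small Python sketch on a fair six-sided die. The die and the events A, B, C are assumptions introduced for this example only; they do not appear in the slides.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}


def prob(event):
    """P(A) = sum of P(s) for s in A, with each elementary event at 1/6."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}     # the roll is even
B = {4, 5, 6}     # the roll is greater than 3
C = {1, 2, 3, 4}  # the roll is at most 4

# Conditional probability: P(A|B) = P(A, B) / P(B)
p_a_given_b = prob(A & B) / prob(B)

# Independence check: P(A, B) == P(A) * P(B)?
independent_ab = prob(A & B) == prob(A) * prob(B)
independent_ac = prob(A & C) == prob(A) * prob(C)

print(p_a_given_b, independent_ab, independent_ac)
```

Using `Fraction` keeps the arithmetic exact, so the independence test can use plain equality instead of a floating-point tolerance.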
Bayes Theorem
• P(A|B) = P(B|A)*P(A) / P(B)
Example of Bayes Theorem
• You are about to be tested for a rare disease. How worried should you be if the test result is positive?
• Accuracy of the test: P(T) = 85%
• Chance of infection: P(I) = 0.01%
• What is P(I | not T)?
• http://www.gametheory.net/Mike/applets/Bayes/Bayes.html
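
One way to work through the question in the first bullet is sketched below in Python. It rests on an assumption the slide does not state: the 85% accuracy is read as both the true positive rate and the true negative rate of the test. The function name and parameter names are hypothetical.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(infected | positive test) via Bayes theorem.

    prior        -- P(I), chance of infection before testing
    sensitivity  -- P(positive | infected)
    specificity  -- P(negative | not infected)
    """
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    # The denominator is P(positive), by the law of total probability.
    return true_positive / (true_positive + false_positive)


prior = 0.0001   # 0.01% chance of infection, from the slide
accuracy = 0.85  # assumed to apply to both infected and healthy people

p_infected = posterior_given_positive(prior, accuracy, accuracy)
print(f"P(infected | positive) = {p_infected:.4%}")
```

Under these assumptions the posterior stays well below 1%, because false positives among the many healthy people far outnumber true positives among the few infected.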
Bayesian Classification
• Suppose that there are m classes c1, ..., cm. Given an unknown data sample x, the Bayesian classifier assigns x to the class ci if and only if P(ci|x) > P(cj|x) for all j ≠ i, 1 ≤ j ≤ m
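We have to maximize P(x|ci)*P(ci)
• By Bayes theorem, P(ci|x) = P(x|ci)*P(ci) / P(x), and P(x) is the same for every class
• In order to reduce computation, the assumption of class conditional independence is made: P(x|ci) = Π_k P(xk|ci), where xk is the value of the k-th attribute of x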
Example of Bayesian Classification in the textbook [3]
• A customer is under 30, has a “medium” income, is a student, and has a “fair” credit rating. Which category does the customer belong to: buys or does not buy?
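
The classification question above can be sketched as a small naive Bayes classifier in Python. The slide does not reproduce the training data, so the 14-row table below is an assumption written in the style of the textbook example; check it against [3] before relying on the numbers. The function names are hypothetical, and no smoothing is applied to zero counts.

```python
from collections import Counter, defaultdict

# Assumed training table (age, income, student, credit_rating, buys).
# The rows are an illustrative assumption, not data taken from the slides.
ROWS = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]


def train(rows):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for *values, label in rows:
        for i, value in enumerate(values):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def class_scores(sample, class_counts, value_counts):
    """Return P(x|c) * P(c) for each class c, assuming class conditional independence."""
    n = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / n  # prior P(c)
        for i, value in enumerate(sample):
            score *= value_counts[(i, label)][value] / count  # P(x_k | c)
        scores[label] = score
    return scores


sample = ("<=30", "medium", "yes", "fair")  # the customer from the slide
class_counts, value_counts = train(ROWS)
scores = class_scores(sample, class_counts, value_counts)
predicted = max(scores, key=scores.get)
print(scores, "->", predicted)
```

The scores are not normalized by P(x), which is fine for classification because only the largest value matters.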
Bayesian Network
(Figure: network with nodes X, Y, Z)
• A network that expresses the dependencies among random variables
• The whole network also expresses the joint probability distribution over all of the random variables
• Pai is the set of parent(s) of node i; it is a subset of the variables
Steps to apply Bayesian Network
• Step 1: Create a Bayesian belief network
  • Include all the variables that are important in your system
  • Use causal knowledge to guide the connections made in the graph
  • Use your prior knowledge to specify the conditional distributions
• Step 2: Calculate p(xi|pai) for your goal
Example from [1]
• Example of building a BN from prior knowledge: a BN to detect credit card fraud
• Define the random variables
  • Fraud (F): whether the current purchase is fraudulent
  • Gas (G): whether gas was bought in the last 24 hours
  • Jewelry (J): whether jewelry was bought in the last 24 hours
  • Age (A): age of the owner of the card
  • Sex (S): gender of the owner of the card
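Give an order to the random variables
• Define the dependencies following that order, but you have to be careful: a poorly chosen order can hide conditional independencies and produce a densely connected network [1]
(Figures: two candidate networks over the nodes F, A, S, G, J)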
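
Once a structure is fixed, inference can be done by summing the joint distribution over the unobserved variables. The Python sketch below assumes one particular structure for the fraud example (F, A, and S have no parents, G depends on F, J depends on F, A, and S) and uses hypothetical probability values throughout; neither the numbers nor the function names come from the slides.

```python
from itertools import product

# Hypothetical parameters; every number here is an illustrative assumption.
p_f = {True: 0.001, False: 0.999}                     # P(F)
p_a = {"<30": 0.25, "30-50": 0.40, ">50": 0.35}       # P(A)
p_s = {"male": 0.5, "female": 0.5}                    # P(S)
p_g_yes = {True: 0.2, False: 0.01}                    # P(G=yes | F)
p_j_yes_no_fraud = {                                  # P(J=yes | F=no, A, S)
    ("<30", "male"): 0.0001, ("30-50", "male"): 0.0004, (">50", "male"): 0.0002,
    ("<30", "female"): 0.0005, ("30-50", "female"): 0.002, (">50", "female"): 0.001,
}
P_J_YES_FRAUD = 0.05                                  # P(J=yes | F=yes, any A, S)


def joint(f, a, s, g, j):
    """P(f, a, s, g, j) = P(f) P(a) P(s) P(g|f) P(j|f, a, s)."""
    pg = p_g_yes[f] if g else 1.0 - p_g_yes[f]
    pj_yes = P_J_YES_FRAUD if f else p_j_yes_no_fraud[(a, s)]
    pj = pj_yes if j else 1.0 - pj_yes
    return p_f[f] * p_a[a] * p_s[s] * pg * pj


def posterior_fraud(g, j):
    """P(F=yes | G=g, J=j), summing out the unobserved A and S."""
    weight = {
        f: sum(joint(f, a, s, g, j) for a, s in product(p_a, p_s))
        for f in (True, False)
    }
    return weight[True] / (weight[True] + weight[False])


# Sanity check: the joint over all assignments sums to 1.
total = sum(
    joint(f, a, s, g, j)
    for f, a, s, g, j in product((True, False), p_a, p_s, (True, False), (True, False))
)

p_fraud = posterior_fraud(g=True, j=True)
print(f"sum = {total:.4f}, P(F=yes | G=yes, J=yes) = {p_fraud:.4f}")
```

Enumeration is only practical for small networks like this one; it is shown here because it follows the definition of the joint distribution directly.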
Next topic
• Training with Bayesian Network
• Bayes inference
• If the training data is complete
• If the training data is incomplete (has missing values)
• Network evaluation
Thank you for listening.