Bayesian Networks for Data Mining
David Heckerman, Microsoft Research (Data Mining and Knowledge Discovery 1, 79-119 (1997))
The Bayesian approach
Question #1: What is Bayesian probability?
• A person’s degree of belief in a certain event.
• Personal (subjective).
• Example: your degree of belief that the coin will land heads.
The Classical approach
• Probability is a physical property of the world.
• Measured by repeated trials (frequency).
• Example: the probability that a coin will land heads.
Question #2: What are the advantages and disadvantages of the Bayesian and classical interpretations of probability?
Bayesian probability:
+ Reflects an expert’s knowledge.
+ Complies with the rules of probability.
- Arbitrary.
Classical probability:
+ Objective, unbiased.
- Not available in most situations.
Bayes' Theorem
Posterior = (likelihood × prior) / evidence
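As a rough illustration of the formula above, the sketch below updates a belief about a coin after one observed flip. The scenario and all numbers (a 10% prior that the coin is biased, a biased coin landing heads 80% of the time) are assumptions made up for this example, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, likelihood, likelihood_if_false):
    """Return P(h | e) for a binary hypothesis h using Bayes' theorem."""
    # Evidence: P(e) = P(e | h) P(h) + P(e | not h) P(not h)
    evidence = likelihood * prior + likelihood_if_false * (1.0 - prior)
    # Posterior = (likelihood x prior) / evidence
    return likelihood * prior / evidence


# Assumed numbers: prior belief that the coin is biased is 0.1;
# a biased coin lands heads with probability 0.8, a fair one with 0.5.
p_biased = posterior(prior=0.1, likelihood=0.8, likelihood_if_false=0.5)
print(f"P(biased | one head) = {p_biased:.4f}")
```

A single head moves the belief only from 0.10 to about 0.15, because heads is also quite likely under the fair-coin hypothesis.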
Bayesian Networks
• A graphical model that encodes the joint probability distribution (JPD) for a set of variables X.
• Its structure is a directed acyclic graph (no directed cycles).
• Each node represents one variable and carries a set of local probability distributions (LPDs) associated with that variable.
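To make the link between the LPDs and the JPD concrete, here is a minimal sketch in which the joint probability of a full assignment is the product of each node's local probability given its parents. The three-variable network (Rain -> WetGrass <- Sprinkler) and every number in it are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical structure: each variable maps to its tuple of parents.
parents = {"Rain": (), "Sprinkler": (), "WetGrass": ("Rain", "Sprinkler")}

# Hypothetical LPDs: P(variable = True | parent values), keyed by parent values.
p_true = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    },
}


def joint(assignment):
    """Joint probability of a full assignment: product of the local terms."""
    prob = 1.0
    for var, pars in parents.items():
        p = p_true[var][tuple(assignment[q] for q in pars)]
        prob *= p if assignment[var] else 1.0 - p
    return prob


# Summing the joint over all 2^3 assignments should give 1.
total = sum(
    joint(dict(zip(parents, values)))
    for values in product([True, False], repeat=len(parents))
)
example = joint({"Rain": True, "Sprinkler": False, "WetGrass": True})
print(total, example)
```

The full joint table over three binary variables has 8 entries, while the network above is specified by only 6 numbers; the saving grows quickly as variables are added.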
Bayesian Networks
• Nodes
• Parents
• Children
• Conditional probability tables
• Construction
Inference
The computation of a probability of interest given a model is known as probabilistic inference.
P(X|e) = P(X,e) / P(e) = c P(X,e), where c = 1/P(e) is a normalizing constant.
Example on board.
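The board example is not part of this transcript, so the following is a stand-in sketch under assumed numbers: inference by enumeration on a hypothetical Rain -> WetGrass <- Sprinkler network. It computes the unnormalized terms P(X,e) by summing out the unobserved variable, then multiplies by c = 1/P(e).

```python
# Hypothetical parameters, invented for this sketch.
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {  # P(WetGrass = True | Rain, Sprinkler)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """P(Rain, Sprinkler, WetGrass) as a product of the local distributions."""
    p = P_RAIN if rain else 1.0 - P_RAIN
    p *= P_SPRINKLER if sprinkler else 1.0 - P_SPRINKLER
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1.0 - p_wet)


def posterior_rain(wet):
    """P(Rain | WetGrass = wet) by enumeration over the hidden Sprinkler."""
    # Unnormalized terms P(Rain = r, e), with Sprinkler summed out.
    unnorm = {
        r: sum(joint(r, s, wet) for s in (True, False)) for r in (True, False)
    }
    c = 1.0 / sum(unnorm.values())  # c = 1 / P(e)
    return {r: c * v for r, v in unnorm.items()}


post = posterior_rain(wet=True)
print(f"P(Rain | WetGrass) = {post[True]:.4f}")
```

Enumeration is exponential in the number of hidden variables, so it is suitable only for small networks like this one.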
Learning
• Learning from data
• Refine the structure and LPDs of a BN
• Combine prior knowledge with data
• Result: IMPROVED KNOWLEDGE
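One common way to combine prior knowledge with data, sketched here for a single LPD with the structure held fixed, is to treat the prior as pseudo-counts and add the observed counts to them. The function `learn_cpt`, the pseudo-count values, and the four records are all hypothetical and serve only to show the arithmetic.

```python
from collections import Counter


def learn_cpt(records, child, parent, prior_true=1.0, prior_false=1.0):
    """Estimate P(child = True | parent) from complete records.

    prior_true and prior_false are pseudo-counts encoding prior belief;
    the estimate is the posterior mean under that prior.
    """
    counts = Counter((rec[parent], rec[child]) for rec in records)
    cpt = {}
    for parent_value in (True, False):
        n_true = counts[(parent_value, True)]
        n_false = counts[(parent_value, False)]
        cpt[parent_value] = (prior_true + n_true) / (
            prior_true + prior_false + n_true + n_false
        )
    return cpt


# Hypothetical data set with four complete observations.
data = [
    {"Rain": True, "WetGrass": True},
    {"Rain": True, "WetGrass": True},
    {"Rain": True, "WetGrass": False},
    {"Rain": False, "WetGrass": False},
]
cpt = learn_cpt(data, child="WetGrass", parent="Rain")
print(cpt)
```

With few records the pseudo-counts dominate and keep the estimate away from 0 or 1; as data accumulates, the observed frequencies take over.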
Question #3: Mention at least 3 advantages of Bayesian Networks for data analysis. Explain each one.
• Handle incomplete data sets.
• Learn about causal relationships.
• Combine domain knowledge with data.
• Avoid overfitting.