CS4445/B12. Provided by: Kenneth J. Loomis.
Homework 3: Solutions
Topology of the Network
• This is the naïve topology, so we can easily construct the graph for this dataset.
[Figure: network graph in which likes is the only parent of genre, critics-reviews, rating, and IMAX]
• Then we need to calculate the conditional probability table (CPT) for each of the nodes in this network.
• This will be shown in three steps for each node:
  • Build a frequency table.
  • Add 1 to each count to account for “fake” instances that prevent zero probabilities.
  • Convert the frequency table to a conditional probability table.
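The three steps can be sketched in a few lines of Python. This is a minimal sketch rather than the course's own code: the helper name `smoothed_cpt` is hypothetical, and the counts used for likes (9 yes, 5 no) are the class counts that appear on the later “Convert to CPT” slides.

```python
from fractions import Fraction

def smoothed_cpt(freq):
    """Turn a frequency table {value: count} into one CPT row (add-one smoothing)."""
    # Step 2: add one "fake" instance per value.
    padded = {value: count + 1 for value, count in freq.items()}
    # Step 3: divide each padded count by the padded total.
    total = sum(padded.values())
    return {value: Fraction(count, total) for value, count in padded.items()}

# Step 1: frequency table for likes, which has no parents.
likes_freq = {"yes": 9, "no": 5}
likes_cpt = smoothed_cpt(likes_freq)

for value, p in likes_cpt.items():
    print(value, p, float(p))
```

With these counts the padded total is 14 + 2 = 16, so the sketch yields 10/16 = 0.625 for yes and 6/16 = 0.375 for no.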
Likes: Calculate Frequencies
• Notice that there are no parents for this node.
Likes: Include “fake” instances
• Add 1 to each of the frequency counts.
Likes: Convert to CPT
• Notice that we added two “fake” instances (one per value of likes), so we add 2 to the total number of instances.
Genre: Calculate Frequencies
• Notice that this node has a parent (likes).
Genre: Include “fake” instances
• Add 1 to each of the frequency counts.
Genre: Convert to CPT
• Notice that, after adding one fake instance for each of the three genre values, there are
  • 5+3=8 “noes” and
  • 9+3=12 “yesses”.
Critics-Reviews: Calculate Frequencies
• Notice that this node has a parent (likes).
Critics-Reviews: Include “fake” instances
• Add 1 to each of the frequency counts.
Critics-Reviews: Convert to CPT
• Notice that, after adding one fake instance for each of the three critics-reviews values, there are
  • 5+3=8 “noes” and
  • 9+3=12 “yesses”.
Rating: Calculate Frequencies
• Notice that this node has a parent (likes).
Rating: Include “fake” instances
• Add 1 to each of the frequency counts.
Rating: Convert to CPT
• Notice that, after adding one fake instance for each of the two rating values, there are
  • 5+2=7 “noes” and
  • 9+2=11 “yesses”.
IMAX: Calculate Frequencies
• Notice that this node has a parent (likes).
IMAX: Include “fake” instances
• Add 1 to each of the frequency counts.
IMAX: Convert to CPT
• Notice that, after adding one fake instance for each of the two IMAX values, there are
  • 5+2=7 “noes” and
  • 9+2=11 “yesses”.
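The denominators quoted on the “Convert to CPT” slides all follow one pattern: the class count plus one fake instance per value of the attribute. A small sketch of that bookkeeping follows. The dictionary names are illustrative only; the class counts and the number of values per attribute are read off the “+3” and “+2” terms on the slides.

```python
# Class counts for likes, as used on the slides.
class_counts = {"no": 5, "yes": 9}

# Number of distinct values per attribute: 3 for genre and critics-reviews,
# 2 for rating and IMAX (inferred from the "+3" and "+2" on the slides).
n_values = {"genre": 3, "critics-reviews": 3, "rating": 2, "IMAX": 2}

# Denominator of each CPT row = class count + one fake instance per value.
denominators = {
    attr: {cls: count + k for cls, count in class_counts.items()}
    for attr, k in n_values.items()
}

for attr, row in denominators.items():
    print(attr, row)
```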
Topology of the Network
• This is the network with the conditional probability tables.
[Figure: network graph with nodes likes, genre, critics-reviews, rating, and IMAX, each annotated with its CPT]
Classification
• Classify
  • genre = action,
  • critics-reviews = neutral,
  • rating = R,
  • IMAX = TRUE,
  • likes = ?
• Recall how we classify using a Bayesian network.
• Find the value of likes (yes or no) that maximizes the probability, writing Pr(yes) and Pr(no) for the score of each value.
• argmax( Pr(yes), Pr(no) )
• = argmax( Pr(genre = action, critics-reviews = neutral, rating = R, IMAX = TRUE | likes = yes) * Pr(likes = yes),
            Pr(genre = action, critics-reviews = neutral, rating = R, IMAX = TRUE | likes = no) * Pr(likes = no) )
• = argmax( ( Pr(genre = action | likes = yes) * Pr(critics-reviews = neutral | likes = yes) *
              Pr(rating = R | likes = yes) * Pr(IMAX = TRUE | likes = yes) * Pr(likes = yes) ),
            ( Pr(genre = action | likes = no) * Pr(critics-reviews = neutral | likes = no) *
              Pr(rating = R | likes = no) * Pr(IMAX = TRUE | likes = no) * Pr(likes = no) ) )
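The argmax rule above can be sketched as a short Python function. This is an illustration under stated assumptions, not the course's implementation: the function names `score` and `classify` are hypothetical, and the CPT numbers below are made-up placeholders, not the entries from the homework's tables.

```python
from math import prod

def score(evidence, cls, prior, cpts):
    """Pr(likes = cls) times the product of Pr(attr = value | likes = cls)."""
    return prior[cls] * prod(
        cpts[attr][cls][value] for attr, value in evidence.items()
    )

def classify(evidence, prior, cpts):
    """Return the class with the highest score, together with all scores."""
    scores = {cls: score(evidence, cls, prior, cpts) for cls in prior}
    return max(scores, key=scores.get), scores

# PLACEHOLDER probabilities, chosen only to exercise the code.
# They are NOT the CPT entries from the homework.
prior = {"yes": 0.6, "no": 0.4}
cpts = {
    "genre": {"yes": {"action": 0.2}, "no": {"action": 0.5}},
    "IMAX": {"yes": {"TRUE": 0.5}, "no": {"TRUE": 0.5}},
}

label, scores = classify({"genre": "action", "IMAX": "TRUE"}, prior, cpts)
print(label, scores)
```

Storing each CPT as `cpts[attribute][class][value]` mirrors the naïve topology, in which likes is the only parent of every attribute.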
Classification
• Pr(yes)
• = Pr(genre = action, critics-reviews = neutral, rating = R, IMAX = TRUE | likes = yes) * Pr(likes = yes)
• = Pr(genre = action | likes = yes) * Pr(critics-reviews = neutral | likes = yes) *
    Pr(rating = R | likes = yes) * Pr(IMAX = TRUE | likes = yes) * Pr(likes = yes)
• ≈ 0.008609
Classification
• Pr(no)
• = Pr(genre = action, critics-reviews = neutral, rating = R, IMAX = TRUE | likes = no) * Pr(likes = no)
• = Pr(genre = action | likes = no) * Pr(critics-reviews = neutral | likes = no) *
    Pr(rating = R | likes = no) * Pr(IMAX = TRUE | likes = no) * Pr(likes = no)
• ≈ 0.021524
Classification
• Thus we have the following:
• argmax( Pr(yes), Pr(no) ) = argmax( 0.008609, 0.021524 )
• The highest probability is for likes = no.
• Classified as
  • genre = action,
  • critics-reviews = neutral,
  • rating = R,
  • IMAX = TRUE,
  • likes = no
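The two scores are unnormalized: each is the prior times the likelihood, so they need not sum to 1. If an actual posterior probability is wanted, dividing each score by their sum gives it. A minimal sketch using the two values quoted above (the variable names are illustrative):

```python
# Scores taken from the slides (prior times likelihood for each class).
scores = {"yes": 0.008609, "no": 0.021524}

# The predicted class is simply the one with the larger score.
label = max(scores, key=scores.get)

# Normalizing turns the scores into posterior probabilities that sum to 1.
total = sum(scores.values())
posterior = {cls: s / total for cls, s in scores.items()}

print(label, posterior)
```

Normalizing does not change the argmax, which is why the slides can compare the raw scores directly.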
Classifying instances with missing values
• Notice we have missing values this time.
• Classify
  • genre = action,
  • critics-reviews = ?,
  • rating = R,
  • IMAX = ?,
  • likes = ?
• We could account for a missing value by summing over all of its possible values, for example:
  Pr(IMAX = ? | likes = no) = Pr(IMAX = FALSE | likes = no) + Pr(IMAX = TRUE | likes = no) = 1
• Since this sum is always 1 (and so does not alter the product), we can ignore the missing attributes.
• argmax( Pr(yes), Pr(no) )
• = argmax( Pr(genre = action, rating = R | likes = yes) * Pr(likes = yes),
            Pr(genre = action, rating = R | likes = no) * Pr(likes = no) )
• = argmax( ( Pr(genre = action | likes = yes) * Pr(rating = R | likes = yes) * Pr(likes = yes) ),
            ( Pr(genre = action | likes = no) * Pr(rating = R | likes = no) * Pr(likes = no) ) )
• = argmax( 0.056818, 0.100446 )
• Hence this instance is classified as likes = no.
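The claim that a missing attribute contributes a factor of 1 can be checked numerically. The sketch below is an illustration, not the course's code: the helper `smoothed_row` is hypothetical, and `None` stands in for the “?” marker. Because the actual TRUE/FALSE split of IMAX among the 5 “no” instances is not given here, the loop tries every possible split.

```python
from fractions import Fraction

def smoothed_row(counts):
    """Add-one smoothed probabilities for one CPT row."""
    total = sum(counts) + len(counts)
    return [Fraction(c + 1, total) for c in counts]

# IMAX given likes = no: 5 real instances split between TRUE and FALSE.
# Whatever the split, the smoothed row sums to exactly 1.
for true_count in range(6):
    row = smoothed_row([true_count, 5 - true_count])
    assert sum(row) == 1

# Dropping missing attributes before scoring (None plays the role of "?").
instance = {"genre": "action", "critics-reviews": None, "rating": "R", "IMAX": None}
observed = {attr: value for attr, value in instance.items() if value is not None}
print(observed)
```

Only the observed attributes are then multiplied together with the prior, exactly as in the two-factor products above.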
CPT: Critics-Reviews
• Notice that we now have two parents for critics-reviews:
  • likes
  • genre
• The conditional probability table must reflect this.
[Figure: network graph with nodes likes, genre, critics-reviews, IMAX, and rating, in which critics-reviews has parents likes and genre]
Critics-Reviews: Calculate Frequencies
• Ordered from left to right, top to bottom.
• Notice that there are 2 × 3 = 6 possible combinations of parent values (2 values of likes × 3 values of genre).
Critics-Reviews: Include “fake” instances
• Ordered from left to right, top to bottom.
• Add 1 to every recorded frequency.
Critics-Reviews: Convert to CPT
• Ordered from left to right, top to bottom.
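The same three steps apply with two parents, except that one row is built per combination of parent values. The sketch below is a hedged illustration: the function name `two_parent_cpt` is hypothetical, the value names `genre_b`, `genre_c`, `review_b`, and `review_c` are placeholders for values not listed in this section, and the three records are made up purely to exercise the code.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_parent_cpt(records, likes_values, genre_values, child_values):
    """Build Pr(critics-reviews | likes, genre) with add-one smoothing.

    records: iterable of (likes, genre, critics_reviews) tuples.
    """
    counts = Counter(records)  # step 1: frequency table
    cpt = {}
    for likes, genre in product(likes_values, genre_values):
        # Step 2: add one fake instance per child value in this row.
        padded = {c: counts[(likes, genre, c)] + 1 for c in child_values}
        # Step 3: normalize the row.
        total = sum(padded.values())
        cpt[(likes, genre)] = {c: Fraction(n, total) for c, n in padded.items()}
    return cpt

# PLACEHOLDER value names and records, not the homework's dataset.
likes_values = ["yes", "no"]
genre_values = ["action", "genre_b", "genre_c"]
child_values = ["neutral", "review_b", "review_c"]
records = [
    ("yes", "action", "neutral"),
    ("yes", "action", "neutral"),
    ("no", "genre_b", "review_b"),
]

cpt = two_parent_cpt(records, likes_values, genre_values, child_values)
print(len(cpt))  # one row per (likes, genre) combination
```

A parent combination that never occurs in the data still gets a valid row, since the fake instances make every entry in it equal.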