Structure Refinement in First Order Conditional Influence Language
Sriraam Natarajan, Weng-Keen Wong, Prasad Tadepalli
School of EECS, Oregon State University

First-Order Conditional Influence Language (FOCIL)

Prior Program (includes irrelevant attributes)

  Weighted Mean {
    If {task(t), doc(d), role(d,r,t)}
      then t.id, r.id, t.creationDate, t.lastAccessed Qinf (Mean) d.folder
    If {doc(s), doc(d), source(s,d)}
      then s.folder, s.lastAccessed Qinf (Mean) d.folder
  }

Learned Program (relevant attributes only)

  Weighted Mean {
    If {task(t), doc(d), role(d,r,t)} then t.id, r.id Qinf (Mean) d.folder
    If {doc(s), doc(d), source(s,d)} then s.folder Qinf (Mean) d.folder
  }

[Figures, not recoverable from the extraction: Synthetic Data Set, Prior Network, "Unrolled" Network for Folder Prediction, Learned Network, Folder Prediction results]

Scoring Metric
• Conditional BIC score: CBIC = -2 * CLL + d_m log N
• Different instantiations of the same rule share parameters
• Conditional likelihood: EM, maximizing the joint likelihood
• Greedy search with random restarts

Conclusions
• Data is expensive: exploit prior knowledge in the structure search
• Derived the CBIC score for our setting
• Learned the "true" network on the synthetic data set
• Folder data set: learned the best network using only the relevant attributes
• Folder data set with irrelevant attributes: CBIC score with the penalty scaled down

Issues
• What is the correct complexity penalty in the presence of multi-valued variables? Counting the number of parameters may not be the right solution.
• What is the right scoring metric for classification in a relational setting?
• Can the search space be intelligently pruned?

Future Work
• Different scoring metrics: BDeu, bias/variance
• Choosing the best combining rule that fits the data
• Structure refinement in large real-world domains
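The Conditional BIC score above can be sketched in a few lines. This is a minimal illustration, assuming the conditional log-likelihood has already been computed; the function and parameter names are illustrative, not from the poster, and the `penalty_scale` knob is one plausible reading of "penalty scaled down":

```python
import math

def cbic_score(cll, num_params, num_examples, penalty_scale=1.0):
    """Conditional BIC: -2 * CLL + d_m * log N (lower is better).

    cll           -- conditional log-likelihood of the data under the model
    num_params    -- d_m, the number of free parameters
    num_examples  -- N, the number of training examples
    penalty_scale -- values below 1.0 scale the complexity penalty down,
                     as done for the folder data set with irrelevant
                     attributes (assumption: a simple multiplier)
    """
    return -2.0 * cll + penalty_scale * num_params * math.log(num_examples)
```

Because instantiations of the same rule share parameters, `num_params` counts the shared parameters once per rule, not once per instantiation.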
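The Weighted Mean combining rule in the FOCIL programs merges the per-rule distributions over `d.folder` into a single prediction. A minimal sketch, assuming each rule has already produced a probability distribution and that the rule weights sum to one (the dict-based representation is an assumption for illustration):

```python
def weighted_mean(distributions, weights):
    """Combine per-rule distributions over d.folder with a weighted mean.

    distributions -- one dict per rule, mapping folder -> probability
    weights       -- one weight per rule, assumed to sum to 1
    """
    combined = {}
    for dist, w in zip(distributions, weights):
        for folder, p in dist.items():
            # Accumulate each rule's weighted vote for this folder.
            combined[folder] = combined.get(folder, 0.0) + w * p
    return combined
```

The inner `Qinf (Mean)` of each rule would average over that rule's instantiations in the same way, with equal weights.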
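The search procedure, greedy search with random restarts minimizing the CBIC score, can be sketched generically. The poster states only the strategy, so every name here is illustrative: `neighbors` enumerates candidate structure refinements, `score` is the CBIC score, and `perturb` randomizes the starting structure of each restart:

```python
import random

def greedy_search(initial, neighbors, score, restarts=5, perturb=None, rng=None):
    """Greedy hill-climbing with random restarts, minimizing `score`."""
    rng = rng or random.Random(0)
    best, best_score = initial, score(initial)
    for _ in range(restarts):
        current = perturb(initial, rng) if perturb else initial
        current_score = score(current)
        improved = True
        while improved:
            improved = False
            for cand in neighbors(current):
                s = score(cand)
                if s < current_score:
                    # Move to the first improving neighbor and keep climbing.
                    current, current_score, improved = cand, s, True
                    break
        if current_score < best_score:
            best, best_score = current, current_score
    return best, best_score
```

On a toy one-dimensional problem, e.g. `greedy_search(0, lambda x: [x - 1, x + 1], lambda x: (x - 3) ** 2)`, the search walks to the minimum at 3; in the structure-learning setting, the states would be FOCIL programs and the neighbors attribute additions or deletions.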