Statistical Relational AI
590AI/DMSL Seminar, Autumn 2003
Overview
• The AI view
• The data mining view
• The statistical view
• Applications
• Relational extensions of statistical models
• Statistical extensions of first-order logic
• Major problem types
• Crosscutting issues
• Plan
The AI View
[Diagram with four labels: Propositional Logic, First-Order Logic, Probability, Statistical Relational AI]
The Data Mining View
• Most databases contain multiple tables
• Data mining algorithms assume one table
• Manual conversion: slow, costly bottleneck
• Important patterns may be missed
• Solution: Multi-relational data mining
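To make the manual conversion step concrete, here is a minimal sketch of flattening a two-table database into the single table that standard algorithms expect. The tables, column names, and the `flatten` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical two-table database: customers and their purchases.
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
]
purchases = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 5.0},
]


def flatten(customers, purchases):
    """Collapse the purchases table into per-customer aggregate columns."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for p in purchases:
        totals[p["customer_id"]] += p["amount"]
        counts[p["customer_id"]] += 1

    rows = []
    for c in customers:
        cid = c["customer_id"]
        rows.append({
            "customer_id": cid,
            "age": c["age"],
            "n_purchases": counts[cid],    # aggregate: count of related rows
            "total_amount": totals[cid],   # aggregate: sum over related rows
        })
    return rows


single_table = flatten(customers, purchases)
for row in single_table:
    print(row)
```

The aggregates keep one row per customer but discard the individual purchases, which is one way important patterns can be missed.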
The Statistical View
• Most statistical models assume i.i.d. data (independent and identically distributed)
• A few assume simple regular dependence (e.g., Markov chain)
• This is a huge restriction – Let’s remove it!
• Allow dependencies between samples
• Allow samples with different distributions
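The assumptions above can be written as factorizations of the joint distribution over samples. The notation is an assumption made for illustration: $x_1, \dots, x_n$ are the samples, and $P_i$ marks a distribution that may differ from sample to sample.

```latex
% i.i.d.: every sample is drawn independently from the same distribution P
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P(x_i)

% First-order Markov chain: each sample depends only on its predecessor
P(x_1, \dots, x_n) = P(x_1) \prod_{i=2}^{n} P(x_i \mid x_{i-1})

% No restriction (chain rule): arbitrary dependencies, and each sample
% may have its own conditional distribution P_i
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P_i(x_i \mid x_1, \dots, x_{i-1})
```

The last form always holds by the chain rule; the first two are the special cases that most models assume.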
Applications
• Bottom line: Using statistical and relational information gives better results
• Web search (Brin & Page, WWW-98)
• Text classification (Chakrabarti et al, SIGMOD-98)
• Marketing (Domingos & Richardson, KDD-01)
• Record linkage (Pasula et al, NIPS-02)
• Gene expression (Segal et al, UAI-03)
• Information extraction (McCallum & Wellner, IIW-03)
• Etc.
Relational Extensions of Statistical Models
• Probabilistic relational models (Friedman et al, IJCAI-99)
• Relational Markov networks (Taskar et al, UAI-02)
• Relational Markov models (Anderson et al, KDD-02)
• Relational dependency networks (Neville & Jensen, MRDM-03)
• Stochastic graph grammars (Oates et al, SRL-03)
• Etc.
Statistical Extensions of First-Order Logic
• Knowledge-based model construction (Wellman et al, 1992)
• Stochastic logic programs (Muggleton, 1996)
• PRISM (Sato & Kameya, 1997)
• Bayesian logic programs (Kersting, 2000)
• CLP(BN) (Costa et al, 2003)
• Etc.
Major Problem Types
• Collective classification
• Link discovery
• Link-based search
• Link-based clustering
• Co-clustering
• Learning across distributions
• Object identification
• Etc.
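As one illustration of link-based search from the list above, here is a simplified PageRank-style power iteration. It is a sketch under stated assumptions, not the exact method of any cited paper: the toy graph, the function name `pagerank`, the damping value 0.85, and the fixed iteration count are all choices made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by link structure alone.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives a small uniform share ("random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                # Split this page's damped rank evenly among its out-links.
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share
            else:
                # Dangling page (no out-links): spread its rank uniformly.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Hypothetical four-page web.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

A page's score depends on the scores of the pages linking to it, so the pages cannot be treated as independent samples.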
Cross-Cutting Issues
• Propositionalization and aggregation
• Efficient inference and learning
• Incorporation of knowledge
• Integration with databases
• Generative vs. discriminative learning
• Time-changing data
• Structuring the field
• Etc.
Plan
• Next week: Overview of background by Matt Richardson
• Review uncertainty, first-order logic (Good source: Russell & Norvig, AIMA 2nd ed.)
• Following weeks: Paper presentations
• Volunteer to present a paper
• List at www.cs.washington.edu/590ai
• Email pedrod@cs.washington.edu
• Subscribe to 590ai mailing list