Lecture 10 KDD Presentation (3 of 3): Rule Induction Wednesday, February 7, 2001 William H. Hsu Department of Computing and Information Sciences, KSU http://www.cis.ksu.edu/~bhsu Readings: “Using Inductive Learning to Generate Rules for Semantic Query Optimization”, Hsu and Knoblock
Presentation Outline • Paper • “Using Inductive Learning to Generate Rules for Semantic Query Optimization” • Authors: C.-N. Hsu and C. A. Knoblock • In Advances in Knowledge Discovery and Data Mining (Fayyad, Piatetsky-Shapiro, Smyth, Uthurusamy, eds.) • Overview • Learning semantic knowledge • Rule induction • Purpose: semantic query optimization (SQO) • Analogue: inductive logic programming (ILP) • Knowledge representation: Horn clauses • Idea: use reformulation of queries to learn (induce) rules • Application of Machine Learning to KDD: Issues • Rules: Good hypothesis language for performance element (SQO)? • How are goals of database query speedup achieved? • Key strengths: straightforward induction method; can use domain theory
Induction as Inverted Deduction: Design Principles • Recall: Definition of Induction • Induction: finding h such that ∀⟨xi, f(xi)⟩ ∈ D . (B ∧ h ∧ xi) ⊢ f(xi) • A ⊢ B means A logically entails B • xi: the ith target instance • f(xi): the target function value for example xi (data set D = {⟨xi, f(xi)⟩}) • Background knowledge B (e.g., inductive bias in inductive learning) • Idea • Design inductive algorithm by inverting operators for automated deduction • Same deductive operators as used in theorem proving • [Figure: deductive system for inductive learning. Inputs: training examples, new instance, assertion {c ∈ H} (inductive bias made explicit); a theorem prover outputs the classification of the new instance (or “Don’t Know”)]
Induction as Inverted Deduction: Example • Deductive Query • “Pairs ⟨u, v⟩ of people such that u is a child of v” • Relations (predicates) • Child (target predicate) • Father, Mother, Parent, Male, Female • Learning Problem • Formulation • Concept learning: target function f is Boolean-valued • i.e., target predicate • Components • Target function f(xi): Child (Bob, Sharon) • xi: Male (Bob), Female (Sharon), Father (Sharon, Bob) • B: {Parent (x, y) ← Father (x, y). Parent (x, y) ← Mother (x, y).} • What satisfies ∀⟨xi, f(xi)⟩ ∈ D . (B ∧ h ∧ xi) ⊢ f(xi)? (see sketch below) • h1: Child (u, v) ← Father (v, u). - doesn’t use B • h2: Child (u, v) ← Parent (v, u). - uses B
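A minimal sketch (not from the paper; all identifiers illustrative) of checking the constraint B ∧ h ∧ xi ⊢ f(xi) for hypothesis h2: encode B and h2 as single-body Horn rules over binary predicates and forward-chain the instance facts to a fixpoint.

```python
# Minimal forward-chaining sketch (illustrative only): verify that
# B and h2, together with the instance description x_i, entail f(x_i).
# Rules are Horn clauses with one body literal; facts are ground atoms.

RULES = [  # ((head_pred, v1, v2), (body_pred, v1', v2')) sharing variables
    (("Parent", "x", "y"), ("Father", "x", "y")),   # B
    (("Parent", "x", "y"), ("Mother", "x", "y")),   # B
    (("Child",  "u", "v"), ("Parent", "v", "u")),   # hypothesis h2
]

FACTS = {("Male", "Bob"), ("Female", "Sharon"), ("Father", "Sharon", "Bob")}

def forward_chain(facts, rules):
    """Apply rules until no new ground atom is derivable (a fixpoint)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (hp, hv1, hv2), (bp, bv1, bv2) in rules:
            for fact in list(derived):
                if fact[0] != bp or len(fact) != 3:
                    continue
                env = {bv1: fact[1], bv2: fact[2]}  # bind body variables
                head = (hp, env[hv1], env[hv2])
                if head not in derived:
                    derived.add(head)
                    changed = True
    return derived

# f(x_i) = Child(Bob, Sharon) follows from B, h2, and x_i:
print(("Child", "Bob", "Sharon") in forward_chain(FACTS, RULES))  # True
```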
Induction as Inverted Deduction: Advantages and Disadvantages • Advantages (Pros) • Subsumes earlier idea of finding h that “fits” training data • Domain theory B helps define meaning of “fitting” the data: B ∧ h ∧ xi ⊢ f(xi) • Suggests algorithms that search H guided by B • Theory-guided constructive induction [Donoho and Rendell, 1995] • aka Knowledge-guided constructive induction [Donoho, 1996] • Disadvantages (Cons) • Doesn’t allow for noisy data • Q: Why not? • A: The constraint ∀⟨xi, f(xi)⟩ ∈ D . (B ∧ h ∧ xi) ⊢ f(xi) must hold for every example, so a mislabeled f(xi) must still be entailed, corrupting h • First-order logic gives a huge hypothesis space H • Overfitting… • Intractability of calculating all acceptable h’s
Inverting Resolution: Example • Resolution: from C1: PassExam ∨ ¬KnowMaterial and C2: KnowMaterial ∨ ¬Study, derive the resolvent C: PassExam ∨ ¬Study • Inverse Resolution: from the resolvent C: PassExam ∨ ¬Study and C1: PassExam ∨ ¬KnowMaterial, recover C2: KnowMaterial ∨ ¬Study (see sketch below)
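A propositional sketch of this example (illustrative, not the chapter’s notation): clauses are sets of literals, with "~" marking negation. resolve derives C from C1 and C2; inverse_resolve recovers C2 from the resolvent C and C1.

```python
# Propositional resolution and its inversion on the PassExam example.

def neg(lit):
    """Complement a literal: 'P' <-> '~P'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Resolve on the first complementary pair of literals found."""
    for lit in c1:
        if neg(lit) in c2:
            return (c1 - {lit}) | (c2 - {neg(lit)})

def inverse_resolve(c, c1):
    """Recover C2 given the resolvent C and one parent clause C1."""
    lit = next(l for l in c1 if l not in c)   # literal resolution removed
    return (c - (c1 - {lit})) | {neg(lit)}

C1 = frozenset({"PassExam", "~KnowMaterial"})
C2 = frozenset({"KnowMaterial", "~Study"})
C  = resolve(C1, C2)                 # {'PassExam', '~Study'}
print(inverse_resolve(C, C1) == C2)  # True
```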
Semantic Query Optimization (SQO): Methodology • Goals (Section 17.1) • Use semantic rules to find “shortcuts” to queries • Example: all CIS 864 students have studied basic probability • Query: “Find all CIS 864 students who have had courses in probability and stochastic processes” • Can drop the probability condition (see sketch below) • Learn rules from data • Observe when a query can be simplified • Generalize over these “training cases” • Background (Section 17.2) • Queries: Datalog select-from-where subset of Structured Query Language (SQL) • Semantic rules: Horn clauses (cf. Prolog) • Learning Framework (Section 17.3) • Concept: SatisfyInputQuery (+ iff instance, i.e., tuple, satisfies query) • Algorithm for dropping constraints (generalization): greedy min-set-cover • Heuristic (preference bias): gain/cost ratio
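A toy sketch of the reformulation idea (rule and condition strings are hypothetical, not the paper’s syntax): if the rule base implies one query condition from the remaining ones, the optimizer can safely drop it before the query hits the database.

```python
# Illustrative SQO reformulation: drop conditions entailed by semantic rules.

RULES = {  # antecedent conditions -> implied (redundant) condition
    frozenset({"enrolled(s, 'CIS864')"}): "studied(s, 'probability')",
}

def drop_redundant(query_conditions, rules):
    """Remove any condition the rules derive from the other conditions."""
    conds = set(query_conditions)
    for antecedent, implied in rules.items():
        if implied in conds and antecedent <= (conds - {implied}):
            conds.discard(implied)   # implied by the rest: safe to drop
    return conds

q = {"enrolled(s, 'CIS864')", "studied(s, 'probability')"}
print(drop_redundant(q, RULES))      # {"enrolled(s, 'CIS864')"}
```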
Learning Framework and Algorithm • Given: Few Example Queries, Data Set D (Many Tuples) • Methodology (Sections 17.3-4) • Step 1 (Optimizer): optimize queries by dropping constraints if possible • Use Greedy-Min-Set-Cover algorithm • Call learning module to add rules to rule base • Step 2 (Find Alternative Queries): • 2a (Construct Candidate Constraints): use gain/cost ratio (number of negative cases excluded / syntactic length of constraint); see the sketch below • Rationale: Occam’s Razor bias, min-set-cover (ratio-bounded approximation) • 2b (Search for Constraints): build on newly-introduced relations • Step 3 (Update Rule Bank): apply newly discovered rules • Put newly-induced rules into rule base • Use inference engine (Prolog) to generate facts that will shorten query search
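A sketch of Step 2a’s greedy selection (assumed data structures, not the paper’s code): candidates are scored by gain/cost, i.e., negatives newly excluded divided by syntactic length, and picked until every negative tuple is excluded, mirroring the ratio-bounded min-set-cover approximation.

```python
# Greedy min-set-cover over candidate constraints, ranked by gain/cost.

def induce_constraints(candidates, negatives):
    """candidates: list of (constraint, set_of_excluded_negatives, length)."""
    chosen, uncovered = [], set(negatives)
    while uncovered:
        # gain/cost = negatives newly excluded / syntactic length
        best = max(candidates, key=lambda c: len(c[1] & uncovered) / c[2])
        gain = best[1] & uncovered
        if not gain:
            break                    # no candidate excludes a remaining negative
        chosen.append(best[0])
        uncovered -= gain
    return chosen

negatives = {1, 2, 3, 4}             # hypothetical negative tuple ids
candidates = [
    ("range(salary)", {1, 2},    2),
    ("eq(dept,'CS')", {2, 3, 4}, 1),
    ("null(bonus)",   {1},       1),
]
print(induce_constraints(candidates, negatives))
# -> ["eq(dept,'CS')", "null(bonus)"]  (all negatives excluded)
```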
Design Rationale • Problem (Sections 17.1-4) • How to generalize well over reformulable queries? • Want to make sure inducer does not overfit observed pattern of training examples • Solution Approach (Sections 17.3-4) • Idea: Occam’s Razor bias • Prefer shorter hypotheses, all other things being equal • Why does this work? • Types of Bias • Preference bias • Captured (“encoded”) in learning algorithm • Compare: search heuristic • Language bias • Captured (“encoded”) in knowledge (hypothesis) representation • Compare: restriction of search space • aka restriction bias
Experimental Method • Experimental Results (Section 17.5) • Improvement using SQO by rule induction (Table 17.4) • Reformulation using induced rules improves both short and long queries (roughly uniformly) • Speedup • Breakdown of savings: NIL queries vs. overall • Claims (Section 17.5) • SQO is scalable: can use rule induction on large DBs • SQO is general: can apply other search techniques, heuristics
Summary: Content Critique • Key Contribution • Simple, direct integration of inductive rule learning with SQO • Significance to KDD: good way to apply ILP-like learning in DB optimization • Applications • Inference • Decision support systems (DSS) • Strengths • Somewhat generalizable approach • Significant for KDD • Applies to other learning-for-optimization inducers • Formal analysis of SQO complexity • Experiments measure: • Speedup learning: % time saved • Where time is saved (NIL queries; short vs. long queries); cf. performance profiling • Weaknesses, Tradeoffs, and Questionable Issues • Insufficient comparison of alternative heuristics (MDL, etc.) • Empirical performance of exhaustive search?
Summary: Presentation Critique • Audience: Researchers and Practitioners of • AI (machine learning, intelligent database optimization) • Database management systems • Applied logic • Positive and Exemplary Points • Good, abstract examples illustrating the roles of SQO and ILP • Real DB optimization example (3 Oracle DBs) • Negative Points and Possible Improvements • Insufficient description of analytical hypothesis representations • Semantics: not clear how to apply other rule induction algorithms • Decision tree • First-order ILP (e.g., FOIL)