Inductive Logic Programming
Content
• Introduction to ILP
• Basic ILP techniques
• An overview of the different ILP systems
• The application field of ILP
• Summary
Introduction to ILP
• Inductive logic programming (ILP) = Inductive concept learning (I) ∩ Logic programming (LP)
• Goal:
  • Develop a theoretical framework for induction
  • Build practical algorithms for inductive learning of relational concepts described in the form of logic programs
• Background:
  • ILP theory is based on the theory of LP
  • ILP algorithms are based on experimental and application-oriented ML research
• Motivation:
  • Use of a representational formalism more expressive than propositional logic
  • Use of background knowledge in learning (in AI the use of domain knowledge is essential for achieving intelligent behaviour)
Introduction to ILP 2
• Inductive learning with background knowledge: given a set of training examples E = E+ ∪ E− and background knowledge B, find a hypothesis H, expressed in some concept description language L, such that H is complete and consistent with respect to B and E
• A hypothesis H is complete with respect to B and E if it covers all the positive examples, i.e., if B ∪ H ⊨ e for every e ∈ E+
• A hypothesis H is consistent with respect to B and E if it covers none of the negative examples, i.e., if B ∪ H ⊭ e for every e ∈ E−
Introduction to ILP 4
• Example: the task is to define the target relation daughter(X,Y)
• Background knowledge consists of ground facts about the predicates female(X) and parent(Y,X):
    parent(ann, mary)    female(ann)
    parent(ann, tom)     female(mary)
    parent(tom, eve)     female(eve)
    parent(tom, ian)
• Training examples:
    +: daughter(mary, ann)   daughter(eve, tom)
    −: daughter(tom, ann)    daughter(eve, ann)
• Possible target relation, complete and consistent with respect to these facts and examples:
    daughter(X,Y) ← female(X), parent(Y,X)
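The completeness and consistency conditions can be checked mechanically on this small example. The following is a minimal Python sketch, not part of the original slides: the tuple encoding of facts and the names `BACKGROUND`, `daughter_hypothesis`, `is_complete` and `is_consistent` are assumptions made for illustration only.

```python
# Ground background facts from the slide, encoded as (predicate, arg1, ...).
BACKGROUND = {
    ("parent", "ann", "mary"), ("parent", "ann", "tom"),
    ("parent", "tom", "eve"), ("parent", "tom", "ian"),
    ("female", "ann"), ("female", "mary"), ("female", "eve"),
}
POSITIVE = [("mary", "ann"), ("eve", "tom")]   # daughter(X, Y) examples in E+
NEGATIVE = [("tom", "ann"), ("eve", "ann")]    # daughter(X, Y) examples in E-


def daughter_hypothesis(x, y):
    """Coverage test for daughter(X,Y) <- female(X), parent(Y,X)."""
    return ("female", x) in BACKGROUND and ("parent", y, x) in BACKGROUND


def is_complete(hypothesis, positives):
    """Complete: every positive example is covered."""
    return all(hypothesis(x, y) for x, y in positives)


def is_consistent(hypothesis, negatives):
    """Consistent: no negative example is covered."""
    return not any(hypothesis(x, y) for x, y in negatives)


print(is_complete(daughter_hypothesis, POSITIVE))
print(is_consistent(daughter_hypothesis, NEGATIVE))
```

Because the background knowledge is a set of ground facts and the clause has no variables beyond those in its head, coverage reduces to two set lookups; a general ILP system would need a proof procedure instead.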
Introduction to ILP 5
• Dimensions of ILP
  • The learner may learn either a single concept or multiple concepts
  • The learner may require all the training examples before the learning process starts (batch learners) or accept examples one by one (incremental learners)
  • The learner may rely on an oracle to verify the validity of generalisations and/or to classify examples generated by the learner (interactive vs. non-interactive learners)
  • The learner may try to learn a concept from scratch, or it can accept an initial hypothesis (theory) which is then revised in the learning process. Systems of the latter kind are called theory revisers.
• Existing ILP systems
  • Empirical ILP system: batch, non-interactive system that learns single predicates from scratch
  • Interactive ILP system: interactive and incremental theory revision system that learns multiple predicates
Content
• Introduction to ILP
• Basic ILP techniques
  • Generalisation techniques
  • Specialisation techniques
• An overview of the different ILP systems
• The application field of ILP
• Summary
θ-subsumption
• Structuring the hypothesis space: introducing a partial ordering into a set of clauses, based on θ-subsumption
• Def: A substitution θ = {X1/t1, ..., Xk/tk} is a function from variables to terms. The application of a substitution θ to a well-formed formula (wff) W, written Wθ, is obtained by replacing all occurrences of each variable Xj in W by the same term tj.
• Def: Let c and c' be two program clauses. Clause c θ-subsumes c' if there exists a substitution θ such that cθ ⊆ c'.
• Def: Two clauses c and d are θ-subsumption equivalent if c θ-subsumes d and d θ-subsumes c.
• Def: A clause is reduced if it is not θ-subsumption equivalent to any proper subset of itself.
θ-subsumption (2)
• Example 1: Let c be the clause c = daughter(X,Y) ← parent(Y,X). A substitution θ = {X/mary, Y/ann} is applied to clause c by applying θ to each of its literals: cθ = daughter(mary, ann) ← parent(ann, mary).
• Example 2: Clause c θ-subsumes the clause c' = daughter(X,Y) ← female(X), parent(Y,X) under the empty substitution θ = ∅, since c is a proper subset of c'.
• Example 3: Clause c θ-subsumes the clause c' = daughter(mary,ann) ← female(mary), parent(ann,mary), parent(ann,tom) under the substitution θ = {X/mary, Y/ann}.
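The definition cθ ⊆ c' can be turned into a small backtracking test. The sketch below is an illustration under stated assumptions, not an algorithm from the slides: literals are encoded as `(sign, predicate, args)` with `"+"` for the head and `"-"` for body literals, terms are function-free strings, variables start with an upper-case letter, and variables of the subsumed clause are treated as constants. All function names are hypothetical.

```python
def is_var(term):
    """Variables start with an upper-case letter, as in Prolog."""
    return term[:1].isupper()


def match_literal(lit, target, theta):
    """Extend theta so that lit under theta equals target; None on failure."""
    if lit[0] != target[0] or lit[1] != target[1] or len(lit[2]) != len(target[2]):
        return None
    theta = dict(theta)
    for a, b in zip(lit[2], target[2]):
        if is_var(a):
            # Bind a to b, or fail if a is already bound to something else.
            if theta.setdefault(a, b) != b:
                return None
        elif a != b:
            return None
    return theta


def _search(literals, d, theta):
    """Map every literal of c onto some literal of d, backtracking on failure."""
    if not literals:
        return theta
    for target in d:
        extended = match_literal(literals[0], target, theta)
        if extended is not None:
            found = _search(literals[1:], d, extended)
            if found is not None:
                return found
    return None


def theta_subsumes(c, d):
    """Return a substitution theta with c.theta a subset of d, or None."""
    theta = _search(list(c), list(d), {})
    if theta is None:
        return None
    # Drop identity bindings such as X/X so that "no change" shows as {}.
    return {v: t for v, t in theta.items() if v != t}


c = [("+", "daughter", ("X", "Y")), ("-", "parent", ("Y", "X"))]
c2 = c + [("-", "female", ("X",))]
c3 = [("+", "daughter", ("mary", "ann")), ("-", "female", ("mary",)),
      ("-", "parent", ("ann", "mary")), ("-", "parent", ("ann", "tom"))]

print(theta_subsumes(c, c2))   # Example 2
print(theta_subsumes(c, c3))   # Example 3
print(theta_subsumes(c2, c))   # reverse direction of Example 2
```

The search is exponential in the worst case, which is expected: deciding θ-subsumption is NP-complete in general.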
θ-subsumption (3)
• θ-subsumption introduces a syntactic notion of generality:
  • Clause c is at least as general as clause c' (c ≤ c') if c θ-subsumes c'
  • Clause c is more general than clause c' (c < c') if c ≤ c' holds and c' ≤ c does not hold
  • In this case c' is a refinement (specialisation) of c, and c is a generalisation of c'
θ-subsumption (4)
• θ-subsumption is important for learning:
  • It provides a generality order for hypotheses, thus structuring the hypothesis space
  • It can be used to prune large parts of the search space:
    • When generalising c to c', all the examples covered by c will also be covered by c'. This property is used to prune the search of more general clauses: if c is inconsistent (it covers a negative example e), then all its generalisations will also be inconsistent. Hence, the generalisations of c do not need to be considered.
    • When specialising c to c', an example not covered by c will not be covered by any of its specialisations either. This property is used to prune the search of more specific clauses: if c does not cover a positive example, none of its specialisations will. Hence, the specialisations of c do not need to be considered.
θ-subsumption (5)
• Techniques based on θ-subsumption:
  • Bottom-up: creating the least general generalisation from the training examples, relative to the background knowledge
  • Top-down: searching refinement graphs
Least General Generalisation
• Properties of θ-subsumption:
  • If c θ-subsumes c' then c logically entails c'; the reverse is not always true
  • The relation ≤ defines a lattice over the set of reduced clauses. This means that any two clauses have a least upper bound (lub) and a greatest lower bound (glb).
• Def: The least general generalisation (lgg) of two reduced clauses c and c', denoted by lgg(c,c'), is the least upper bound of c and c' in the θ-subsumption lattice.
• Example:
  • Let c and c' be two clauses:
      c = daughter(mary,ann) ← female(mary), parent(ann,mary).
      c' = daughter(eve,tom) ← female(eve), parent(tom,eve).
  • lgg of c and c': daughter(X,Y) ← female(X), parent(Y,X).
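Least General Generalisation 2
• Computation of lgg with θ-subsumption:
• lgg of terms, lgg(t1, t2):
  • lgg(t, t) = t
  • lgg(f(s1,...,sn), f(t1,...,tn)) = f(lgg(s1,t1), ..., lgg(sn,tn))
  • lgg(f(s1,...,sm), g(t1,...,tn)) = V, where f ≠ g and V is a variable which represents lgg(f(s1,...,sm), g(t1,...,tn))
  • lgg(s, t) = V, where s ≠ t and at least one of s and t is a variable; in this case, V is a variable which represents lgg(s,t)
• Example: lgg(f(a,a), f(b,b)) = f(lgg(a,b), lgg(a,b)) = f(V,V), where V stands for lgg(a,b)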
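Least General Generalisation 3
• lgg of atoms
  • If the atoms have the same predicate symbol p: lgg(p(s1,...,sn), p(t1,...,tn)) = p(lgg(s1,t1), ..., lgg(sn,tn)); otherwise the lgg is undefined
• lgg of literals
  • If L1 and L2 are atoms, then lgg(L1,L2) is computed as defined above
  • If both L1 and L2 are negative literals, L1 = ¬A1 and L2 = ¬A2, then lgg(L1,L2) = ¬lgg(A1,A2)
  • If L1 is a positive and L2 is a negative literal, or vice versa, lgg(L1,L2) is undefined
• Example: lgg(parent(ann,mary), parent(ann,tom)) = parent(ann,X), where X stands for lgg(mary,tom); lgg(parent(ann,mary), ¬parent(ann,tom)) is undefined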
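Least General Generalisation 4
• lgg of clauses
  Let c1 = {L1, ..., Ln} and c2 = {K1, ..., Km}. Then
    lgg(c1,c2) = { lgg(Li,Kj) | Li ∈ c1, Kj ∈ c2 and lgg(Li,Kj) is defined }
• Example:
  If c1 = daughter(mary,ann) ← female(mary), parent(ann,mary)
  and c2 = daughter(eve,tom) ← female(eve), parent(tom,eve)
  then lgg(c1,c2) = daughter(X,Y) ← female(X), parent(Y,X)
  where X stands for lgg(mary,eve) and Y stands for lgg(ann,tom)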
Least General Generalisation 5
Search for lgg(L1, L2)
Input: L1 and L2 are atoms or literals
Output: lgg(L1, L2) = G
function lgg(L1, L2):
begin
  if head(L1) ≠ head(L2) then
    G := undefined; return(G)
  else
    W1 := L1; W2 := L2
    while W1 ≠ W2 do
      find_position(W1, W2, s, t);
      generate_variable(X);
      substitute(W1, W2, s, t, X)
    endwhile
    G := W1; return(G)
  endif
end
Least General Generalisation 6
Input: W1 and W2 are terms or literals
Output: s and t, the first pair of differing subterms found at the same position in W1 and W2
procedure find_position(W1, W2, s, t)
begin
  if W1 = W2 then return(nil, nil) endif
  if (W1 is atomic or W2 is atomic) then return(W1, W2) endif
  if functor(W1) ≠ functor(W2) then return(W1, W2) endif
  i := 1
  while i ≤ arity(W1) do
    find_position(arg(i, W1), arg(i, W2), s, t)
    if s ≠ nil then return(s, t) endif
    i := i+1
  endwhile
end
Least General Generalisation 7
Input: W1 and W2 are terms or literals, s and t terms, X a variable
Output: W1 and W2 with X substituted for s and t wherever s and t occur at the same position
procedure substitute(W1, W2, s, t, X)
begin
  if W1 = s and W2 = t then return(X, X) endif
  if W1 = W2 then return(W1, W2) endif
  if (W1 is atomic or W2 is atomic) then return(W1, W2) endif
  f := functor(W1); g := functor(W2);
  if f ≠ g then return(W1, W2) endif
  i := 1
  while i ≤ arity(W1) do
    substitute(arg(i, W1), arg(i, W2), s, t, X);
    i := i+1
  endwhile
end
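The lgg rules for terms, literals and clauses can also be written recursively. The following Python sketch is a swapped-in formulation rather than the find_position/substitute procedure of the slides: it keeps a table from each pair of differing subterms to one shared variable, which has the same effect as replacing that pair everywhere it occurs. The encoding (strings for constants and variables, tuples `(functor, arg1, ...)` for compound terms and atoms, `(sign, atom)` for literals) and the generated variable names `V1`, `V2`, ... are assumptions; inputs are assumed not to use those names already.

```python
def lgg_term(s, t, table):
    """lgg of two terms; table maps each differing pair (s, t) to one variable."""
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # Same functor and arity: generalise argument by argument.
        return (s[0],) + tuple(lgg_term(a, b, table) for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:
        table[(s, t)] = f"V{len(table) + 1}"
    return table[(s, t)]


def lgg_literal(l1, l2, table):
    """lgg of two literals (sign, atom); None when the lgg is undefined."""
    (sign1, a1), (sign2, a2) = l1, l2
    if sign1 != sign2 or a1[0] != a2[0] or len(a1) != len(a2):
        return None
    return (sign1, lgg_term(a1, a2, table))


def lgg_clause(c1, c2):
    """lgg of two clauses: all defined pairwise lggs, sharing one variable table."""
    table, result = {}, []
    for l1 in c1:
        for l2 in c2:
            g = lgg_literal(l1, l2, table)
            if g is not None and g not in result:
                result.append(g)
    return result


c1 = [("+", ("daughter", "mary", "ann")),
      ("-", ("female", "mary")), ("-", ("parent", "ann", "mary"))]
c2 = [("+", ("daughter", "eve", "tom")),
      ("-", ("female", "eve")), ("-", ("parent", "tom", "eve"))]

print(lgg_term(("f", "a", "a"), ("f", "b", "b"), {}))
print(lgg_clause(c1, c2))
```

Here `V1` plays the role of X = lgg(mary,eve) and `V2` the role of Y = lgg(ann,tom) in the clause example. Sharing one table across the whole clause is what makes the same variable appear in the head and in the body.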
Generalisation techniques
• Start from the most specific clause that covers a given positive example and then generalise the clause until it cannot be further generalised without covering negative examples.
• Generalisation operator: Let L be a language bias. A generalisation operator ρ maps a clause c to a set of clauses ρ(c) which are generalisations of c: ρ(c) = {c' | c' ∈ L, c' < c}
• Generalisation operators perform two basic syntactic operations on a clause:
  • Apply an inverse substitution to the clause
  • Remove a literal from the body of the clause
• Basic generalisation techniques:
  • Relative least general generalisation (rlgg)
  • Inverse resolution
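Relative Least Generalisation 1
• Relative least general generalisation: the rlgg of two positive examples A1 and A2 is their least general generalisation relative to background knowledge B:
    rlgg(A1, A2) = lgg((A1 ← K), (A2 ← K))
  where K is the conjunction of ground facts from B, and A1 and A2 are positive examples (ground atoms)
• Example:
  • Positive examples: e1 = daughter(mary,ann) and e2 = daughter(eve,tom), and B as before
  • rlgg(e1, e2) = lgg((e1 ← K), (e2 ← K)), where K is the conjunction parent(ann,mary), parent(ann,tom), parent(tom,eve), parent(tom,ian), female(ann), female(mary), female(eve)
  • Result, after removing irrelevant and redundant literals: daughter(X,Y) ← female(X), parent(Y,X)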
Relative Least Generalisation 2
Search for rlgg(c1, c2)
Input: c1 and c2 are two clauses in the form c1 = {L1, ..., Ln} and c2 = {K1, ..., Km}
Output: rlgg(c1, c2) = G
function rlgg(c1, c2):
begin
  G := ∅; k := 1;
  while k ≤ n do
    l := 1;
    while l ≤ m do
      lgg(Lk, Kl, L);
      if L is defined then G := G ∪ {L} endif;
      l := l+1
    endwhile
    k := k+1
  endwhile;
  return(G)
end
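For the family example, rlgg(e1, e2) = lgg((e1 ← K), (e2 ← K)) can be computed directly. The sketch below is illustrative only: it assumes function-free ground atoms encoded as `(predicate, arg1, ...)`, names each introduced variable after the pair of constants it generalises (`V_mary_eve` and so on), and uses a crude relevance filter, keeping only body literals whose arguments are all head variables, which is an assumption standing in for proper literal reduction.

```python
from itertools import product

# K: the conjunction of ground facts from the background knowledge B.
K = [("parent", "ann", "mary"), ("parent", "ann", "tom"),
     ("parent", "tom", "eve"), ("parent", "tom", "ian"),
     ("female", "ann"), ("female", "mary"), ("female", "eve")]


def lgg_atom(a, b, table):
    """lgg of two function-free atoms, or None if predicate or arity differ."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    args = []
    for s, t in zip(a[1:], b[1:]):
        # The same pair of constants always maps to the same variable.
        args.append(s if s == t else table.setdefault((s, t), f"V_{s}_{t}"))
    return (a[0],) + tuple(args)


def rlgg(e1, e2, facts):
    """lgg of (e1 <- facts) and (e2 <- facts); returns (head, body)."""
    table = {}
    head = lgg_atom(e1, e2, table)
    body = []
    for a, b in product(facts, facts):
        g = lgg_atom(a, b, table)
        if g is not None and g not in body:
            body.append(g)
    return head, body


def reduce_body(head, body):
    """Assumed relevance filter: keep literals whose arguments are all head variables."""
    head_vars = {t for t in head[1:] if t.startswith("V_")}
    return [lit for lit in body if all(t in head_vars for t in lit[1:])]


head, body = rlgg(("daughter", "mary", "ann"), ("daughter", "eve", "tom"), K)
print(head)
print(len(body), "literals before reduction")
print(reduce_body(head, body))
```

The unreduced body contains one literal for every pair of background facts with the same predicate, which is why practical rlgg-based systems restrict the background facts considered and reduce the resulting clause.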
Inverse resolution
• Basic idea: invert the resolution rule of deductive inference (Robinson), e.g., invert the SLD-resolution proof procedure for definite programs
• Example: Given the theory T = {u ← v, v ← w, w}, suppose we want to derive u. The proposition w resolves with v ← w to give v, which is then resolved with u ← v to derive u.
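Inverse resolution 2
• Example: first-order derivation tree for the family example
  • b1 = female(mary) and b2 = parent(ann,mary) and c = daughter(X,Y) ← female(X), parent(Y,X)
  • Let T = {b1, b2, c}
  • Suppose we want to derive the fact daughter(mary,ann) from T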
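Inverse resolution 3
• Inverse resolution inverts the resolution process using a generalisation operator based on inverting substitution
• Given a wff W, an inverse substitution θ⁻¹ of a substitution θ is a function that maps terms in Wθ to variables such that Wθθ⁻¹ = W
• Example:
  • Take c = daughter(X,Y) ← female(X), parent(Y,X) and the substitution θ = {X/mary, Y/ann}: cθ = daughter(mary,ann) ← female(mary), parent(ann,mary)
  • Applying the inverse substitution θ⁻¹ = {mary/X, ann/Y}, the original clause is obtained: c = cθθ⁻¹
• Example: inverse substitution with places
  • Let W = loves(X, daughter(Y)) and θ = {X/ann, Y/ann}
  • Applying θ to W: Wθ = loves(ann, daughter(ann))
  • The inverse substitution θ⁻¹ = {(ann, {⟨1⟩})/X, (ann, {⟨2,1⟩})/Y} specifies that the first occurrence of the subterm ann in the term Wθ is replaced by variable X and the second occurrence is replaced by Y.
  • The use of places ensures that Wθθ⁻¹ = W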
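The role of places can be illustrated with a few lines of Python. This is a sketch under assumed conventions, not code from the slides: terms are nested tuples `(functor, arg1, ...)`, a place is a tuple of 1-based argument indices, and an inverse substitution is a list of `(subterm, places, variable)` entries. The function names are hypothetical.

```python
def apply_substitution(term, theta):
    """Replace every symbol that is a key of theta, everywhere in term."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply_substitution(a, theta) for a in term[1:])
    return theta.get(term, term)


def subterm_at(term, place):
    """Subterm reached by following the 1-based argument indices in place."""
    for i in place:
        term = term[i]
    return term


def replace_at(term, place, new):
    """Copy of term with the subterm at place replaced by new."""
    if not place:
        return new
    i = place[0]
    return term[:i] + (replace_at(term[i], place[1:], new),) + term[i + 1:]


def apply_inverse(term, inverse):
    """Apply an inverse substitution given as (subterm, places, variable) entries."""
    for subterm, places, var in inverse:
        for place in places:
            if subterm_at(term, place) != subterm:
                raise ValueError(f"{subterm!r} does not occur at place {place}")
            term = replace_at(term, place, var)
    return term


W = ("loves", "X", ("daughter", "Y"))
theta = {"X": "ann", "Y": "ann"}
W_theta = apply_substitution(W, theta)

inverse = [("ann", [(1,)], "X"), ("ann", [(2, 1)], "Y")]
print(W_theta)
print(apply_inverse(W_theta, inverse))
# Without places, every occurrence of ann gets the same variable.
print(apply_substitution(W_theta, {"ann": "X"}))
```

The last call shows why places are needed: a place-free inverse substitution ann/X yields loves(X, daughter(X)), which differs from the original W.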
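Inverse resolution 4
• Example: inverse resolution
  • B consists of two clauses, b1 = female(mary) and b2 = parent(ann,mary)
  • Let H = ∅
  • The learner encounters the positive example e1 = daughter(mary,ann)
• The inverse resolution proceeds as follows:
  • It attempts to find a clause c1 which will, together with b2, entail e1 and can be added to the current hypothesis H. Choosing the inverse substitution θ2⁻¹ = {ann/Y}, the inverse resolution step generates c1 = daughter(mary,Y) ← parent(Y,mary). Clause c1 becomes the current hypothesis H, such that {b2} ∪ H ⊨ e1.
  • It takes b1 = female(mary) and the current hypothesis H = {c1}. By computing c' from b1 and c1 using the inverse substitution θ1⁻¹ = {mary/X}, it generalises c1 with respect to B, yielding c' = daughter(X,Y) ← female(X), parent(Y,X). In H, c1 can be replaced by c', which together with B entails e1.
  • The induced hypothesis is H = {daughter(X,Y) ← female(X), parent(Y,X)}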
Specialisation techniques
• They search the hypothesis space in a top-down manner, from general to specific hypotheses, using a θ-subsumption-based specialisation operator (refinement operator)
• Refinement operator: Given a language bias L, a refinement operator ρ maps a clause c to a set of clauses ρ(c) which are specialisations (refinements) of c: ρ(c) = {c' | c' ∈ L, c < c'}
• This operator typically computes only the set of minimal (most general) specialisations of a clause under θ-subsumption. It employs two basic syntactic operations on a clause:
  • Apply a substitution to the clause
  • Add a literal to the body of the clause
Specialisation techniques (2)
• The basic specialisation technique is top-down search of the refinement graph
• Top-down learners start from the most general clauses and repeatedly refine them until they no longer cover negative examples
Specialisation techniques 2
• For a selected L and a given B, the hypothesis space of program clauses is a lattice structured by the θ-subsumption generality ordering
• In this lattice a refinement graph can be defined and used to direct the search from general to specific hypotheses
• The refinement graph is a directed, acyclic graph in which nodes are program clauses and edges correspond to the basic refinement operations: substituting a variable with a term, and adding a literal to the body of the clause
• First used in the Model Inference System (MIS, Shapiro 1983)
Specialisation techniques 4
• Generic top-down algorithm:
Input: E the set of training examples, B the background knowledge, L the description language
Output: hypothesis H
procedure top_down_ILP(E, B, H)
begin
  H := ∅;
  repeat
    choose the most general clause c in L
    repeat
      find the best refinement c' ∈ ρ(c);
      c := c'
    until c is acceptable;
    H := H ∪ {c};
    remove from E the positive examples covered by c
  until hypothesis H satisfies the stopping criterion
  return(H)
end
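A toy instance of the generic top-down loop can be run on the daughter example. The following Python sketch makes several assumptions that are not in the slides: the language bias is a fixed list of four candidate body literals over the head variables X and Y, refinement only adds a literal (no substitutions), the best refinement is chosen greedily by "positives covered minus negatives covered", and a clause is acceptable once it covers no negative example.

```python
BACKGROUND = {
    ("parent", "ann", "mary"), ("parent", "ann", "tom"),
    ("parent", "tom", "eve"), ("parent", "tom", "ian"),
    ("female", "ann"), ("female", "mary"), ("female", "eve"),
}
POSITIVE = [("mary", "ann"), ("eve", "tom")]
NEGATIVE = [("tom", "ann"), ("eve", "ann")]

# Assumed language bias: body literals allowed for daughter(X, Y).
CANDIDATES = [("female", "X"), ("female", "Y"),
              ("parent", "X", "Y"), ("parent", "Y", "X")]


def covers(body, example):
    """True if every body literal, instantiated with the example, is in B."""
    binding = dict(zip(("X", "Y"), example))
    return all((lit[0],) + tuple(binding[v] for v in lit[1:]) in BACKGROUND
               for lit in body)


def refinements(body):
    """Refinement operator: add one candidate literal not yet in the body."""
    return [body + [lit] for lit in CANDIDATES if lit not in body]


def score(body, positives, negatives):
    return (sum(covers(body, e) for e in positives)
            - sum(covers(body, e) for e in negatives))


def top_down_ilp(positives, negatives):
    hypothesis, remaining = [], list(positives)
    while remaining:                      # covering loop
        body = []                         # most general clause: empty body
        while any(covers(body, e) for e in negatives):   # specialisation loop
            options = refinements(body)
            if not options:
                break
            body = max(options, key=lambda b: score(b, remaining, negatives))
        covered = [e for e in remaining if covers(body, e)]
        if not covered or any(covers(body, e) for e in negatives):
            break                         # no acceptable clause was found
        hypothesis.append(body)
        remaining = [e for e in remaining if e not in covered]
    return hypothesis


print(top_down_ilp(POSITIVE, NEGATIVE))
```

Each inner list in the result is the body of one clause for daughter(X,Y). Greedy search is only one possible strategy; systems differ mainly in the refinement operator, the heuristic and the stopping criteria.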
An overview of the different ILP systems
• Prehistory (1970)
  • Plotkin: lgg, rlgg 1970-1971
• Early enthusiasm (1975-1983)
  • Vere 1975, 1977; Summers 1977; Kodratoff and Jouannaud 1979
  • Shapiro 1981, 1983
• Dark ages (1983-87)
An overview of the different ILP systems 2
• Renaissance (1987- ...)
  • MARVIN - Sammut & Banerji 1986, DUCE - Muggleton 1987
  • Helft 1987
  • QuMAS - Mozetic 1987, LINUS - Lavrac et al. 1991
  • CIGOL - Muggleton & Buntine 1988
  • ML-SMART - Bergadano, Giordana & Saitta 1988
  • CLINT - De Raedt & Bruynooghe 1988
  • BLIP, MOBAL - Morik, Wrobel et al. 1988, 1989
  • GOLEM - Muggleton & Feng 1990
  • FOIL - Quinlan 1990, mFOIL - Dzeroski 1991, FOCL - Brunk & Pazzani, FFOIL - Quinlan 1996
  • CLAUDIEN - De Raedt & Bruynooghe 1993
  • PROGOL - Muggleton et al. 1995
  • FORS - Karalic & Bratko 1996
  • TILDE - Blockeel & De Raedt 1997
  • ....
• Studies comparing the inductive learning system FOIL with its variants FOCL, mFOIL, FFOIL, FOIL-I, FOSSIL, FOIDL and IFOIL
The application field of ILP
• Application areas:
  • Knowledge acquisition for expert systems
  • Knowledge discovery in databases
  • Scientific knowledge discovery
  • Logic program synthesis and verification
  • Inductive data engineering
• Successful applications:
  • Finite element mesh design
  • Structure-activity prediction for drug design
  • Protein secondary-structure prediction
  • Predicting mutagenicity of chemical compounds
Summary
• Inductive logic programming (ILP) = Inductive concept learning (I) ∩ Logic programming (LP)
• Empirical vs. interactive ILP systems
• Generalisation vs. specialisation techniques