Inductive Logic Programming and its Use in Data Mining Filip Zelezny, Center of Applied Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague
Structure of Talk • Intro: ML & Datamining • ILP: Motivation, Concept • Basic Technique • Some Applications • Novel Approaches • Conclusions
Introduction • Machine Learning (ML) • a subfield of artificial intelligence that studies artificial systems which improve their behavior on the basis of experience, described formally by data. This is often achieved by reasoning analogically, or by building a model of the given domain from the data. • E.g. pattern recognition by a trained neural network • Data Mining (DM) • is concerned with discovering understandably formulated knowledge that is valid but previously unknown in given data. This is often achieved by employing ML methods that produce human-understandable models with predictive (e.g. predict an object's attribute knowing the other attributes) or descriptive (e.g. find a frequently repeating pattern in the data) capabilities. • E.g. the ‘shopping bag rule’: sausage → mustard
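The ‘shopping bag rule’ above can be illustrated with a minimal sketch (not from the talk; the transaction data is invented for illustration): the rule sausage → mustard is judged by its support and confidence over a set of shopping bags.

```python
# Toy shopping-bag data (hypothetical): each transaction is a set of items.
transactions = [
    {"sausage", "mustard", "bread"},
    {"sausage", "mustard"},
    {"sausage", "beer"},
    {"bread", "milk"},
]

def support(itemset, transactions):
    # fraction of shopping bags containing every item of the itemset
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    # estimated P(consequent | antecedent) over the transactions
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))
```

Here sausage and mustard appear together in 2 of 4 bags (support 0.5), and 2 of the 3 bags with sausage also contain mustard (confidence 2/3).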
ILP: Points of View • Software Engineering View • ILP synthesizes logic programs from examples • ... but the programs may be used for data classification • Machine Learning View • ILP develops theories about data using predicate logic • ... but the theories are as expressive as algorithms (Turing machine)
Data Mining Example 1 • Table of cars: • Predict the attribute ‘affordable’! • Rule discovered: size=small & luxury=low → affordable • Attribute learning is appropriate.
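A minimal sketch of what the discovered rule means (the talk's actual car table is not shown, so the rows below are assumptions): the rule body is a conjunction of attribute tests, and a good rule covers all positives and no negatives.

```python
# Hypothetical car table; the real table from the slide is not available.
cars = [
    {"size": "small", "luxury": "low",  "affordable": "yes"},
    {"size": "small", "luxury": "high", "affordable": "no"},
    {"size": "big",   "luxury": "low",  "affordable": "no"},
]

def rule_covers(car):
    # body of the discovered rule: size=small & luxury=low
    return car["size"] == "small" and car["luxury"] == "low"

def rule_is_consistent(cars):
    # consistent: covers no car labelled 'no'
    return not any(rule_covers(c) and c["affordable"] == "no" for c in cars)

def rule_is_complete(cars):
    # complete: covers every car labelled 'yes'
    return all(rule_covers(c) for c in cars if c["affordable"] == "yes")
```

On this toy table the rule is both consistent and complete, which is why plain attribute learning suffices here.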
Data Mining Example 2 (1)[L. De Raedt, 2000] • Positive Examples • Negative Examples
Data Mining Example 2 (2) [L. De Raedt, 2000] • How to represent this in an attribute-value language (AVL)? • Assume a fixed number of objects • Problem 1: exchanging objects 1 & 2 • an exponential number of different representations for the same entity
Data Mining Example 2 (3) [L. De Raedt, 2000] • Problem 2: positional relations • explosion of false attributes • Problem 3: variable number of objects • explosion of empty fields • explosion of the entire table. We need a structural representation!
Data Mining Example 2 (4) • It could be done with more relations (tables) • BUT! Standard ML / data mining algorithms can work with one relation only • neural nets, AQ (rules), C4.5 (decision trees), … We need multirelational learning algorithms!
The Language of Prolog - Informal Introduction (1) • Ground facts (predicate with constants): add(1,1,2). • Variables: add(X,0,X). • Functions: e.g. s(X) - the successor of X • Rules (implications): add(s(X),Y,s(Z)) ← add(X,Y,Z). add(0,X,X).
The Language of Prolog - Informal Introduction (2) • Invertibility: minus(A,B,C) ← add(B,C,A). • Functions can be avoided (flattening): suc(X,Y) ← X is Y-1. (built-in arithmetic) add(0,X,X). add(X,Y,Z) ← suc(A,X) & suc(B,Z) & add(A,Y,B).
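The flattened program above can be read operationally. A minimal Python emulation (a sketch, not how a Prolog engine actually executes, since it fixes one direction of use) mirrors the two clauses line by line:

```python
def add(x, y):
    # add(0,X,X). -- base clause
    if x == 0:
        return y
    # add(X,Y,Z) <- suc(A,X) & suc(B,Z) & add(A,Y,B). -- recursive clause
    a = x - 1        # suc(A,X): A is X - 1
    b = add(a, y)    # add(A,Y,B)
    return b + 1     # suc(B,Z): Z is B + 1
```

Unlike this function, the Prolog clauses are invertible: the same program can also be queried to compute subtraction, as the minus/3 rule on the slide shows.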
Deduction (in Logic Programming) • A priori (background) knowledge about integers: suc(X,Y) ← X is Y-1. • Theory (hypothesis) about addition: add(0,X,X). add(X,Y,Z) ← suc(A,X) & suc(B,Z) & add(A,Y,B). • Positive examples of addition: add(1,1,2), add(3,5,8), add(4,1,5), ... • Negative examples of addition: add(1,3,5), add(8,7,6), add(1,1,1), ...
Induction (in Inductive Logic Programming) • A priori (background) knowledge about integers: suc(X,Y) ← X is Y-1. • Positive examples of addition: add(1,1,2), add(3,5,8), add(4,1,5), ... • Negative examples of addition: add(1,3,5), add(8,7,6), add(1,1,1), ... • Theory (hypothesis) about addition: add(0,X,X). add(X,Y,Z) ← suc(A,X) & suc(B,Z) & add(A,Y,B).
Basic ILP Technique (1) • Search through a clause implication lattice • From general to specific (top-down) • From specific to general (bottom-up) add(X,Y,Z) ← add(X,Y,Z) ← suc(A,X) add(X,Y,Z) ← suc(B,Z) add(X,Y,Z) ← suc(A,X) & suc(B,X) ... etc. add(X,Y,Z) ← suc(A,X) & suc(B,Z) & add(A,Y,B)
Basic ILP Technique (2) • Clauses are usually constructed one by one • e.g. specialize until the clause covers no negatives, then begin a new clause for the remaining positives • Implication is undecidable • instead use syntactic θ-subsumption (NP-hard) • measure clause generality with respect to the background knowledge • Efficiency: use a strong bias! • syntactic: indicate input/output variables; maximum clause length • semantic: e.g. preference heuristics
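The covering loop described above can be sketched in a few lines. This is a schematic, propositional simplification (not an actual ILP system: real systems refine first-order clauses and handle noise); here a "literal" is just a named test over an example.

```python
def learn_rules(positives, negatives, literals):
    """Greedy covering. literals: dict mapping a name to a test over an example."""
    rules, uncovered = [], list(positives)
    while uncovered:
        body, neg = [], list(negatives)
        while neg:
            # specialize: add the literal that excludes the most remaining negatives
            best = max(literals,
                       key=lambda l: sum(not literals[l](e) for e in neg))
            if all(literals[best](e) for e in neg):
                break                      # no literal excludes anything further
            body.append(best)
            neg = [e for e in neg if literals[best](e)]
        rules.append(body)                 # clause covers no (excludable) negatives
        # start over with the positives the new clause does not cover
        uncovered = [e for e in uncovered
                     if not all(literals[l](e) for l in body)]
    return rules
```

On the toy car data from Example 1 this recovers the single rule with body {size=small, luxury=low}.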
Protein Structure Prediction (1) [Muggleton, 1992] • Predict the secondary structure of a protein • examples: • alpha(Protein, Position). - the residue at Position in Protein is in an alpha helix • negatives: all other residues • background knowledge: • position(Protein, Pos, Residue) • chemical properties of residues • basic arithmetic • etc.
Protein Structure Prediction (2) [Muggleton, 1992] • Results • added to the background knowledge, then a 2nd search • again added to B for the 3rd search alpha0(A,B) ← ... position(A,D,O) & not_aromatic(O) & small_or_polar(O) & position(A,B,C) & very_hydrophobic(C) & not_aromatic(C) ... etc. (22 literals) alpha1(A,B) ← oct(D,E,F,G,B,H,I,J,K) & alpha0(A,F) & alpha0(A,G). alpha2(A,B) ← oct(C,D,E,F,B,G,H,I,J) & alpha1(A,B) & alpha1(A,G) & alpha1(A,H).
Protein Structure Prediction (3) [Muggleton, 1992] • Final accuracy on the test set: 81% • Best previous result (neural net): 76% • The general-purpose bottom-up ILP system Golem was used. • The experiment was published in the journal Protein Engineering.
Mutagenicity Prediction [Srinivasan, 1995] • Predict mutagenicity (carcinogenicity) of chemical compounds with the general-purpose system Progol [Muggleton] • Examples: compounds, Active / Inactive • Result: a structural alert
Data Mining in Telephony [Zelezny, Stepankova, Zidek 2000] • Discover frequent patterns of operations in an enterprise telephone exchange • Examples: history of calls + related attributes • Result: e.g. the rule (lower case ~ constant) redirection(A,B,C,10) ← day(tuesday,A) & prefix(C,[5,0],2). which covers: redirection([15], [13,14,48], [5,0,0,0,0,0,0,0], 10). redirection([15], [14,18,58], [5,0,9,6,0,1,8,9], 10). redirection([22], [18,50,30], [5,0,0,0,0,0,0,0], 10). redirection([29], [13,35,56], [5,0,0,0,0,0,0,0], 10). redirection([29], [13,57,36], [5,0,0,0,0,0,0,0], 10). • Predicates day, prefix, etc. are in the background knowledge.
Other Applications • Finite element mesh design • Control of dynamical systems • qualitative simulation • Software Engineering • Many more, especially in data mining
Descriptive ILP • Examples are interpretations (models) • e.g. triangle(t,up) & circle(c1) & inside(c1,t) & circle(c2) & right_of(c2,t) & class(positive) is one example • A hypothesis must be true in all examples, e.g. class(positive) ← triangle(X,Y) & circle(Z) & inside(Z,X). • Suited for data mining • finds ALL true hypotheses - a maximum characterisation
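A minimal sketch of the semantics above (not WARMR or any actual descriptive ILP system): an example is a set of ground atoms, and a clause head ← body is true in it if every substitution that satisfies the body also satisfies the head. Atoms are encoded as tuples, with uppercase argument names standing for variables.

```python
from itertools import product

def substitute(atom, theta):
    # ground an atom (name, arg1, ...) under substitution theta
    return (atom[0],) + tuple(theta.get(a, a) for a in atom[1:])

def holds(head, body, interp):
    # uppercase argument names are variables, lowercase are constants
    variables = sorted({a for atom in body for a in atom[1:] if a.isupper()})
    constants = {a for atom in interp for a in atom[1:]}
    # naive model checking: try every grounding of the body variables
    for values in product(constants, repeat=len(variables)):
        theta = dict(zip(variables, values))
        if all(substitute(a, theta) in interp for a in body):
            if substitute(head, theta) not in interp:
                return False        # body satisfied but head violated
    return True
```

On the slide's example interpretation, the clause class(positive) ← triangle(X,Y) & circle(Z) & inside(Z,X) holds; remove class(positive) from the interpretation and it no longer does.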
Descriptive ILP – Application [Zelezny, Stepankova, Zidek / ILP 2000] • Call logging (mixed events) • Examples of single events (sets of actions and their logs), such as: t(time(19,43,48),[1,2],time(19,43,48),e,li,empty,d,empty,empty,ex,[0,6,0,2,3,3,0,5,3,3],empty,anstr([0,0,5,0,0,0]),fe,fe,id(4)). t(time(19,43,48),[1,2],time(19,43,50),e,lb,e(relcause),d,dr,06,ex,[0,6,0,0,0,0,0,0,0,0],empty,anstr([0,0,5,0,0,0]),fe,fe,id(5)). ex_ans([0,6,0,2,3,3,0,5,3,3],[1,2]). hangsup([0,6,0,2,3,3,0,5,3,3]).
Descriptive ILP – Application [Zelezny, Stepankova, Zidek / ILP 2000] • Results • Rules that describe actions in terms of logging records • Such as ex_ans(RNCA1,DN1):- t(D1,IT1,DN1,ET1,e,li,empty,d,EF1,FI1,ex,RNCA1,empty,ANTR1,CO1,DE1,ID1), IT2=ET1, ANTR2=ANTR1, t(D2,IT2,DN2,ET2,e,lb,RC2,d,EF2,FI2,ex,RNCA2,empty,ANTR2,CO2,DE2,ID2), samenum(RNCA1,RNCA2).
Upgrades of Propositional Learners: 1st-order Decision Trees • Upgrades the C4.5 algorithm • E.g. Tilde [Blockeel, De Raedt] • Internal nodes are existential queries, e.g. ?- circle(C1), ?- triangle(T,up) & inside(C1,T), ?- circle(C2) & inside(C1,C2); the leaves carry class(positive) / class(negative) labels
More Upgrades of Propositional Learners • 1st-order association rules • the WARMR system [Dehaspe] • upgrade of Apriori • 1st-order Bayesian Nets • 1st-order Clustering • 1st-order Distance Based Learning [Zelezny / ILP 2001]
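WARMR's level-wise search can be illustrated with its propositional ancestor. The sketch below is plain Apriori over itemsets (WARMR lifts the same candidate-generation-and-pruning idea to first-order queries); the data is invented for illustration.

```python
def apriori(transactions, min_support):
    """Level-wise frequent itemset mining; transactions: list of item sets."""
    items = sorted({i for t in transactions for i in t})

    def frequent_enough(itemset):
        return sum(set(itemset) <= t for t in transactions) >= min_support

    # level 1: frequent single items
    level = [(i,) for i in items if frequent_enough((i,))]
    frequent = list(level)
    while level:
        # generate candidates one item larger by joining pairs from this level
        candidates = sorted({tuple(sorted(set(a) | set(b)))
                             for a in level for b in level
                             if len(set(a) | set(b)) == len(a) + 1})
        # prune candidates below the support threshold
        level = [c for c in candidates if frequent_enough(c)]
        frequent += level
    return frequent
```

The pruning step relies on the anti-monotonicity of support: no superset of an infrequent set can be frequent, which is exactly the property WARMR exploits for first-order queries.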
Concluding Remarks • Advantages of ILP • Theoretical: Turing-equivalent expressive power • Practical: rich but understandable language, integration of background knowledge, MULTI-relational data mining • Problems still to be solved... • efficiency, handling numbers, user interfaces
Find out more • about: • ML and DM literature, sources • our ML and DM group • what we do • how you can participate • etc. http://cyber.felk.cvut.cz/gerstner/machine-learning