17.4 Rule-Based Expert Systems (1/9)
• Expert systems
  • One of the most successful applications of AI: reasoning with facts and rules.
  • "AI programs that achieve expert-level competence in solving problems by bringing to bear a body of knowledge" [Feigenbaum, McCorduck & Nii 1988]
• Expert systems vs. knowledge-based systems
• Rule-based expert systems
  • Often based on reasoning with propositional-logic Horn clauses.
17.4 Rule-Based Expert Systems (2/9)
• Structure of expert systems
  • Knowledge base
    • Consists of predicate-calculus facts and rules about the subject at hand.
  • Inference engine
    • Consists of all the processes that manipulate the knowledge base to deduce information requested by the user.
  • Explanation subsystem
    • Analyzes the structure of the reasoning performed by the system and explains it to the user.
17.4 Rule-Based Expert Systems (3/9)
• Knowledge acquisition subsystem
  • Checks the growing knowledge base for possible inconsistencies and incomplete information.
• User interface
  • Consists of some kind of natural-language processing system or a graphical user interface with menus.
• "Knowledge engineer"
  • Usually a computer scientist with AI training.
  • Works with an expert in the field of application in order to represent the relevant knowledge of the expert in a form that can be entered into the knowledge base.
• (A rough structural sketch of these components follows.)
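• As an illustration of this structure only, the sketch below wires the components named above into Python dataclasses; all class, attribute, and method names here are illustrative assumptions, not taken from the slides or from any particular system.

from dataclasses import dataclass, field

@dataclass
class KnowledgeBase:
    facts: set = field(default_factory=set)    # ground propositions currently believed true
    rules: list = field(default_factory=list)  # (antecedent_atoms, consequent) pairs

@dataclass
class ExpertSystem:
    kb: KnowledgeBase
    proof_log: list = field(default_factory=list)  # reasoning steps kept for the explanation subsystem

    def infer(self, goal):
        # Inference engine: manipulates the knowledge base to deduce `goal`
        # (a backward-chaining version is sketched two slides below).
        ...

    def ask_user(self, proposition):
        # User interface stub: a real system would use dialogue or menus.
        return proposition in self.kb.facts

    def explain(self):
        # Explanation subsystem: replays the recorded reasoning steps.
        return list(self.proof_log)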
17.4 Rule-Based Expert Systems (4/9)
• Example: a loan officer in a bank
  • "Decide whether or not to grant a personal loan to an individual."
  • Facts and rules about the applicant (shown in a figure omitted here).
17.4 Rule-Based Expert Systems (5/9)
• To prove OK
  • The inference engine searches for an AND/OR proof tree using either backward or forward chaining (a minimal backward-chaining sketch follows this slide).
  • AND/OR proof tree
    • Root node: OK
    • Leaf nodes: facts
    • The root and leaves are connected through the rules.
  • Using the preceding rules in backward chaining
    • The user's goal, to establish OK, can be achieved either by proving both BAL and REP or by proving each of COLLAT, PYMT, and REP.
    • Applying the other rules, as shown, results in other sets of nodes to be proved.
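• The sketch below implements backward chaining over propositional Horn rules for this example. Only the two rules concluding OK are taken from the slide above; the set of established facts is a placeholder standing in for whatever the user can confirm.

# Backward chaining over propositional Horn rules, sketched for the loan example.
RULES = {"OK": [["BAL", "REP"], ["COLLAT", "PYMT", "REP"]]}   # OR over rule bodies for each goal
FACTS = {"BAL", "REP"}                                        # hypothetical established facts

def backchain(goal, proof):
    # Try to prove `goal`; on success, append nodes of the AND/OR proof tree to `proof`.
    if goal in FACTS:                        # leaf node: a known fact
        proof.append((goal, "fact"))
        return True
    for body in RULES.get(goal, []):         # OR: any rule whose consequent is `goal`
        subproof = []
        if all(backchain(atom, subproof) for atom in body):   # AND: prove every antecedent
            proof.extend(subproof)
            proof.append((goal, tuple(body)))
            return True
    return False

proof = []
print(backchain("OK", proof))   # True
print(proof)                    # the proof tree, reusable by the explanation subsystem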
17.4 Rule-Based Expert Systems (6/9)
• AND/OR proof tree obtained by backward chaining (figure omitted).
17.4 Rule-Based Expert Systems (7/9)
• Consulting system
  • Attempts to answer a user's query by asking questions about the truth of propositions that the user might know about.
  • Backward chaining through the rules is used to reach askable questions.
  • If a user were to "volunteer" information, bottom-up, forward chaining through the rules could be used in an attempt to connect to the proof tree already built (see the sketch after this slide).
• The ability to give explanations for a conclusion
  • Very important for acceptance of expert-system advice.
  • Proof tree
    • Used to guide the explanation-generation process.
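• A minimal sketch of the forward-chaining case: volunteered facts are propagated bottom-up through the rules until nothing new can be concluded. The rules and facts here are the same placeholders used in the backward-chaining sketch above.

# Forward chaining: propagate volunteered facts bottom-up through Horn rules to a fixpoint.
def forward_chain(facts, rules):
    # rules: list of (antecedent_list, consequent) pairs; returns the deductive closure of `facts`.
    derived, changed = set(facts), True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(a in derived for a in body):
                derived.add(head)            # the rule fires; its conclusion becomes known
                changed = True
    return derived

# Hypothetical use: the user volunteers BAL and REP, and OK follows.
rules = [(["BAL", "REP"], "OK"), (["COLLAT", "PYMT", "REP"], "OK")]
print(forward_chain({"BAL", "REP"}, rules))  # {'BAL', 'REP', 'OK'}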
17.4 Rule-Based Expert Systems (8/9)
• In many applications, the system has access only to uncertain rules, and the user may not be able to answer questions with certainty.
• MYCIN [Shortliffe 1976]: diagnoses bacterial infections.
17.4 Rule-Based Expert Systems (9/9)
• PROSPECTOR [Duda, Gaschnig & Hart 1979; Campbell et al. 1982]
  • Reasons about ore deposits.
• The numbers (.75 and .5 in MYCIN, and 5 and 0.7 in PROSPECTOR) are ways to represent the certainty or strength of a rule.
• These numbers are used by the systems in computing the certainty of conclusions (an illustrative calculation follows).
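• The slides do not give the combination formulas, so the sketch below uses the standard MYCIN-style certainty-factor calculus purely as an illustration of how such rule strengths can propagate to conclusions; the antecedent certainties are invented.

# Illustrative MYCIN-style certainty-factor arithmetic (not taken from the slides).
def rule_cf(rule_strength, antecedent_cfs):
    # CF contributed by one rule: its strength scaled by the weakest antecedent.
    return rule_strength * min(antecedent_cfs)

def combine_cf(cf1, cf2):
    # Combine two positive CFs for the same conclusion obtained from independent rules.
    return cf1 + cf2 * (1.0 - cf1)

cf_a = rule_cf(0.75, [0.9, 0.8])         # a rule of strength .75
cf_b = rule_cf(0.5, [1.0])               # a rule of strength .5
print(round(combine_cf(cf_a, cf_b), 3))  # combined certainty of the shared conclusion: 0.8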
17.5 Rule Learning
• Inductive rule learning
  • Creates new rules about a domain that are not derivable from any previous rules.
  • Ex) neural networks
• Deductive rule learning
  • Enhances the efficiency of a system's performance by deducing additional rules from previously known domain rules and facts.
  • Ex) EBG (explanation-based generalization)
17.5.1 Learning Propositional Calculus Rules (1/9)
• Learn rules from a given training set
  • Seek a set of rules that covers only the positive instances.
  • Positive instance: OK = 1
  • Negative instance: OK = 0
• From the training set, we want to induce rules of the form α1 ∧ α2 ∧ … ∧ αn ⊃ OK, where the αi are atoms describing the applicant (e.g., BAL, APP, RATING, INC).
• We can make a rule more specific by adding an atom to its antecedent, so that it covers fewer instances.
  • Cover: if the antecedent of a rule has value True for an instance in the training set, we say that the rule covers that instance.
• Adding a rule makes the system that uses these rules more general.
• Searching for a set of rules can be computationally difficult; here we use a "greedy" method called separate and conquer.
17.5.1 Learning Propositional Calculus Rules (2/9)
• Separate and conquer
  • First attempt to find a single rule that covers only positive instances:
    • Start with a rule that covers all instances.
    • Gradually make it more specific by adding atoms to its antecedent.
  • Gradually add rules until the entire set of rules covers all, and only, the positive instances.
• Learned rules can be simplified by pruning; such operations and noise-tolerant modifications help minimize the risk of overfitting.
17.5.1 Learning Propositional Calculus Rules (3/9)
• Example: the loan officer in a bank
  • Start with the provisional rule T ⊃ OK (empty antecedent), which covers all instances.
  • Add an atom so that the rule covers fewer negative instances, working toward covering only positive ones.
  • To decide which atom to add, compute for each candidate atom α the ratio rα = n⁺α / nα, where nα is the number of instances covered by the rule after adding α and n⁺α is the number of those that are positive.
17.5.1 Learning Propositional Calculus Rules (4/9)
• Select the atom yielding the largest value of rα. Here that atom is BAL, yielding the provisional rule BAL ⊃ OK.
17.5.1 Learning Propositional Calculus Rules (5/9)
• The rule BAL ⊃ OK covers the positive instances 3, 4, and 7, but it also covers the negative instance 1.
  • So we select another atom to make the rule more specific.
  • We have already decided that the first atom in the antecedent is BAL, so we only need to choose the next atom. We select RATING because its ratio is based on a larger sample, giving BAL ∧ RATING ⊃ OK.
17.5.1 Learning Propositional Calculus Rules (6/9)
• We need more rules to cover positive instance 6. To learn the next rule, eliminate from the table all of the positive instances already covered by the first rule.
17.5.1 Learning Propositional Calculus Rules (7/9)
• Begin the process all over again with the reduced table.
  • Start with the rule T ⊃ OK.
  • APP and INC tie with rα = 0.25; arbitrarily select APP, giving APP ⊃ OK. This rule covers negative instances 1, 8, and 9, so we need another atom in the antecedent.
  • Select RATING, giving APP ∧ RATING ⊃ OK. This rule still covers negative instance 9.
  • Adding one more atom yields a rule that, together with the first rule, covers all and only the positive instances, so we are finished.
17.5.1 Learning Propositional Calculus Rules (8/9)
• Pseudocode of this rule-learning process: the generic separate-and-conquer algorithm (GSCA).
17.5.1 Learning Propositional Calculus Rules (9/9)
• GSCA pseudocode (figure omitted); a Python sketch of the same loop follows.
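• A minimal Python sketch of the GSCA loop described in this subsection, using the positive-coverage ratio rα as the selection heuristic. The toy table at the bottom is invented for illustration and is not the slides' training set.

# Generic separate-and-conquer (GSCA) sketch for propositional rules of the form
#   alpha_1 ^ alpha_2 ^ ... => OK.
# Instances are dicts of 0/1 attribute values plus the label under "OK".
# Assumes some conjunction of attributes separates the positives from the negatives.
def covers(antecedent, inst):
    return all(inst[a] == 1 for a in antecedent)

def learn_rules(instances, attributes):
    rules, remaining = [], list(instances)
    while any(i["OK"] == 1 for i in remaining):           # until every positive is covered
        antecedent, covered = [], list(remaining)
        while any(i["OK"] == 0 for i in covered):         # specialize while negatives are covered
            def ratio(a):                                 # fraction of covered instances that are positive
                c = [i for i in covered if i[a] == 1]
                return sum(i["OK"] for i in c) / len(c) if c else 0.0
            best = max((a for a in attributes if a not in antecedent), key=ratio)
            antecedent.append(best)
            covered = [i for i in covered if covers(antecedent, i)]
        rules.append(antecedent)
        remaining = [i for i in remaining                 # "separate": drop the covered positives
                     if not (i["OK"] == 1 and covers(antecedent, i))]
    return rules

# Invented toy table (not the slides' data): each learned rule is a conjunction implying OK.
data = [{"BAL": 1, "RATING": 1, "APP": 0, "INC": 0, "OK": 1},
        {"BAL": 1, "RATING": 0, "APP": 1, "INC": 0, "OK": 0},
        {"BAL": 0, "RATING": 1, "APP": 1, "INC": 1, "OK": 1},
        {"BAL": 0, "RATING": 0, "APP": 0, "INC": 1, "OK": 0}]
print(learn_rules(data, ["BAL", "RATING", "APP", "INC"]))  # e.g. [['RATING']]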
17.5.2 Learning First-Order Logic Rules (1/10)
• Inductive logic programming (ILP)
  • Concentrates on methods for inductive learning of Horn clauses in first-order predicate calculus (FOPC), and thus of PROLOG programs.
  • FOIL [Quinlan 1990]
• The objective of ILP
  • To learn a program consisting of Horn clauses, each of the form λ :- α1, …, αn, where the αi are atomic formulas that unify with ground atomic facts.
  • The program should evaluate to True when its variables are bound to some set of values known to be in the relation we are trying to learn (a positive instance in the training set).
  • It should evaluate to False when its variables are bound to some set of values known not to be in the relation (a negative instance).
17.5.2 Learning First-Order Logic Rules (2/10)
• We want to cover the positive instances and not cover the negative ones.
• Background knowledge
  • The ground atomic facts with which the αi are to unify.
  • They are given either as subsidiary PROLOG programs, which can be run and evaluated, or explicitly in the form of a list of facts.
• Example: a delivery robot navigating around in a building finds, through experience, that it is easy to go between certain pairs of locations and not so easy to go between certain other pairs.
17.5.2 Learning First-Order Logic Rules (3/10)
• A, B, C: junctions
• All of the other locations: shops
• Junction(x)
  • True when x is a junction.
• Shop(x, y)
  • True when y is a shop connected to junction x.
17.5.2 Learning First-Order Logic Rules (4/10)
• We want a learning program to learn a program Easy(x, y) that covers the positive instances in the training set but not the negative ones.
• Easy(x, y) can use the background subexpressions Junction(x) and Shop(x, y).
• Training set (table omitted)
17.5.2 Learning First-Order Logic Rules (5/10)
• For all of the locations named in the training set, only a few pairs give the value True for Shop (list omitted).
• The following PROLOG program covers all of the positive instances of the training set and none of the negative ones:
  Easy(x, y) :- Junction(x), Junction(y)
             :- Shop(x, y)
             :- Shop(y, x)
17.5.2 Learning First-Order Logic Rules (6/10)
• Learning process: the generalized separate-and-conquer algorithm (GSCA)
  • Start with a program having a single rule with no body.
  • Add literals to the body until the rule covers only (or mainly) positive instances.
  • Add rules in the same way until the program covers all (or most) and only (with few exceptions) positive instances.
17.5.2 Learning First-Order Logic Rules (7/10)
• Practical ILP systems restrict the candidate literals in various ways. Typical allowed additions are
  • Literals used in the background knowledge.
  • Literals whose arguments are a subset of those in the head of the clause.
  • Literals that introduce a new distinct variable different from those in the head of the clause.
  • A literal that equates a variable in the head of the clause with another such variable or with a term mentioned in the background knowledge.
  • A literal that is the same (except for its arguments) as that in the head of the clause.
17.5.2 Learning First-Order Logic Rules (8/10)
• The literals that we might consider adding to a clause are Junction(x), Junction(y), Junction(z), Shop(x, y), Shop(y, x), Shop(x, z), Shop(z, y), and (x = y).
• ILP version of GSCA
  • First, initialize the first clause as Easy(x, y) :-
  • Add Junction(x), so Easy(x, y) :- Junction(x). This clause covers the instances shown in the table of positive and negative instances (table omitted).
  • Include one more literal, Junction(y): Easy(x, y) :- Junction(x), Junction(y)
17.5.2 Learning First-Order Logic Rules (9/10)
• But the program does not yet cover some of the positive instances (list omitted).
• Remove the positive instances covered by Easy(x, y) :- Junction(x), Junction(y) from the training set to form the set used in the next pass through the inner loop.
  • The new set contains all of the negative instances plus the positive instances that are not yet covered.
• The inner loop creates another initial clause, Easy(x, y) :-
  • Adding the literal Shop(x, y) gives Easy(x, y) :- Shop(x, y), which covers no negative instances, so we are finished with another pass through the inner loop.
  • The positive instances covered by this clause are then removed from the set (list omitted).
17.5.2 Learning First-Order Logic Rules (10/10)
• Now we have
  Easy(x, y) :- Junction(x), Junction(y)
             :- Shop(x, y)
• To cover the remaining positive instances (list omitted), add Shop(y, x). Then we have
  Easy(x, y) :- Junction(x), Junction(y)
             :- Shop(x, y)
             :- Shop(y, x)
• This covers only positive instances. A small coverage-check sketch follows.
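• The actual map and training set are figures in the original slides, so the sketch below encodes an invented set of background facts and simply verifies that the three learned clauses cover the hypothetical positive pairs and none of the negative ones.

# Coverage check for the learned clauses of Easy(x, y), using invented background facts.
JUNCTIONS = {"A", "B", "C"}
SHOP = {("A", "S1"), ("B", "S2"), ("C", "S3")}   # Shop(x, y): y is a shop connected to junction x

def junction(x):
    return x in JUNCTIONS

def shop(x, y):
    return (x, y) in SHOP

def easy(x, y):
    # Easy(x, y) :- Junction(x), Junction(y)
    #            :- Shop(x, y)
    #            :- Shop(y, x)
    return (junction(x) and junction(y)) or shop(x, y) or shop(y, x)

positives = [("A", "B"), ("A", "S1"), ("S1", "A")]   # hypothetical training pairs
negatives = [("B", "S1"), ("S2", "S3")]
assert all(easy(x, y) for x, y in positives)         # every positive instance is covered
assert not any(easy(x, y) for x, y in negatives)     # no negative instance is covered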
17.5.3 Explanation-Based Generalization (1/2)
• Example: the blocks world
• General knowledge of the blocks world
  • Rules (formulas omitted)
  • Fact: Green(A)
• We want to prove a goal proposition about block A (formula omitted).
• The proof is very simple.
17.5.3 Explanation-Based Generalization (2/2)
• Explanation: the set of facts used in the proof
  • Ex) the explanation for the goal about A is the fact Green(A).
• From this explanation, we can build a new rule whose antecedent is the generalized explanation.
  • Replacing the constant A by a variable x, we have Green(x).
  • Then we can prove the corresponding goal for any x satisfying Green(x), just as in the case of A.
• Explanation-based generalization (EBG): generalizing the explanation by replacing constants by variables (a minimal sketch follows).
• More rules might slow down the reasoning process, so EBG must be used with care, possibly by keeping information about the utility of the learned rules.
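• The blocks-world formulas are figures in the original slides, so the sketch below shows only the variabilization step on an invented one-step proof: the fact Green(A) explains a conclusion about A, and replacing A by x yields a more general rule.

# Explanation-based generalization on an invented one-step proof.
# Hypothetical domain rule: Green(x) => Nice(x); fact: Green(A); conclusion proved: Nice(A).
explanation = [("Green", "A")]       # facts used in the proof
conclusion = ("Nice", "A")

def generalize(explanation, conclusion, constant, variable="x"):
    # Replace `constant` by `variable` throughout, yielding a new rule:
    #   generalized explanation => generalized conclusion.
    swap = lambda atom: tuple(variable if term == constant else term for term in atom)
    return [swap(f) for f in explanation], swap(conclusion)

body, head = generalize(explanation, conclusion, "A")
print(body, "=>", head)   # [('Green', 'x')] => ('Nice', 'x')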
Additional Readings (1/4)
• [Levesque & Brachman 1987]
  • Tradeoff between logical expressiveness and the tractability of logical inference
• [Ullman 1989]
  • DATALOG
• [Selman & Kautz 1991]
  • Approximate theories: Horn greatest lower bound, Horn least upper bound
• [Kautz, Kearns & Selman 1993]
  • Characteristic models
Additional Readings (2/4)
• [Roussel 1975, Colmerauer 1973]
  • PROLOG interpreters
• [Warren, Pereira & Pereira 1977]
  • Development of an efficient interpreter
• [Davis 1980]
  • AO* algorithm for searching AND/OR graphs
• [Selman & Levesque 1990]
  • Determining a minimum ATMS label: an NP-complete problem
Additional Readings (3/4)
• [Kautz, Kearns & Selman 1993]
  • TMS calculation based on characteristic models
• [Doyle 1979, de Kleer 1986a, de Kleer 1986b, de Kleer 1986c, Forbus & de Kleer 1993, Shoham 1994]
  • Other results on TMSs
• [Bobrow, Mittal & Stefik 1986], [Stefik 1995]
  • Construction of expert systems
Additional Readings (4/4)
• [McDermott 1982]
  • Examples of expert systems
• [Leonard-Barton 1987]
  • History and usage of DEC's expert system
• [Kautz & Selman 1992], [Muggleton & Buntine 1988]
  • Predicate invention
• [Muggleton, King & Sternberg 1992]
  • Protein secondary-structure prediction by GOLEM

(c) 2000-2002 SNU CSE Biointelligence Lab