IT/CS 811 Principles of Machine Learning and Inference

Case-based reasoning and learning. Prof. Gheorghe Tecuci, Learning Agents Laboratory, Computer Science Department, George Mason University

Overview Introduction Protos: A case-based reasoning and learning system Knowledge representation and organization Learning Recommended reading

Case-based reasoning • Case-Based Reasoning (CBR) is a name given to a reasoning method that uses specific past experiences rather than a corpus of general knowledge. • It is a form of problem solving by analogy in which a new problem is solved by recognizing its similarity to a specific known problem, then transferring the solution of the known problem to the new one. • CBR systems consult their memory of previous episodes to help address their current task, which could be: • planning of a meal, • classifying the disease of a patient, • designing a circuit, etc.

Classification tasks Classification is assigning a given input to one of the categories in a pre-enumerated list. Many case-based reasoning systems perform classification tasks. What is a classification?

Case-based reasoning for classification tasks Case-based reasoning for classification is a kind of instance-based learning, where the instances have a more complex (structural) description. In CBR systems, a concept ci is represented extensionally as a collection of examples (called exemplars or cases) ci = {ei1, ei2, ...}. Then ‘a’ belongs to the concept ci if ‘a’ is similar to an element eij of ci, and this similarity is greater than the similarity between ‘a’ and any other concept exemplar.
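The membership rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a case is modeled as a flat set of feature strings, and similarity is measured with Jaccard overlap. Neither choice comes from the lecture; real CBR systems use richer structural descriptions and knowledge-based similarity. The names `jaccard`, `classify`, and the `concepts` table are hypothetical.

```python
from typing import Dict, List, Optional, Set

Case = Set[str]  # assumption: a case is a flat set of feature strings


def jaccard(a: Case, b: Case) -> float:
    """Overlap similarity between two feature sets (an assumed, simple measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(new_case: Case, concepts: Dict[str, List[Case]]) -> Optional[str]:
    """Return the concept that owns the exemplar most similar to new_case."""
    best_concept, best_score = None, -1.0
    for concept, exemplars in concepts.items():
        for exemplar in exemplars:
            score = jaccard(new_case, exemplar)
            if score > best_score:
                best_concept, best_score = concept, score
    return best_concept


# Hypothetical concepts, each represented extensionally by its exemplars.
concepts = {
    "chair": [{"seat", "backrest", "legs(4)"}, {"seat", "backrest", "pedestal"}],
    "stool": [{"seat", "legs(3)"}],
}
print(classify({"seat", "backrest", "legs(4)", "armrests"}, concepts))
```

The new case is assigned to the concept of its single closest exemplar, so adding one well-chosen exemplar is enough to change future classifications.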

The PROTOS system Protos is a case-based problem solving and learning system for heuristic classification tasks. The main features of the system will be presented in the context of a task for the classification of hearing disorders. In Protos, a concept ci is represented extensionally as a collection of examples (called exemplars or cases): ci = {ei1, ei2, ...}. Classifying an input NewCase involves searching for a concept exemplar ejk that strongly matches NewCase. If such an exemplar is found then Protos asserts that NewCase belongs to the concept cj (the concept whose exemplar is ejk).

The classification and learning algorithm
Input: a set of exemplar-based categories C = {c1, c2, ..., cn} and a case (NewCase) to classify.
REPEAT
    Classify: Find an exemplar of ci ∈ C that strongly matches NewCase and classify NewCase as ci. Explain the classification.
    Learn: If the expert disagrees with the classification or explanation then acquire classification and explanation knowledge and adjust C so that NewCase is correctly classified and explained.
UNTIL the expert approves the classification and explanation.

The PROTOS system: explaining a classification C is the set of all concepts recognized by the system, each concept being represented extensionally as a set of representative exemplars. How could one explain the classification of a case to a concept? Explaining the classification involves showing the line of reasoning used during matching. Which would be a simple type of explanation?

Explaining a classification (cont.) Which would be a simple type of explanation? The simplest explanation is a list of the matched features of the case and the exemplar. Which would be a more detailed explanation? A more detailed explanation may include justifications of the flexible matches performed, as in the case of classifying chairs: "'pedestal' was matched with 'legs(4)' because both are specializations of seat support" or "'seat' was matched with 'backrest' because seat enables 'hold(person)' and backrest enables 'hold(person)'".

The PROTOS system: learning When would a CBR system like PROTOS need to learn? When it makes mistakes. What kind of mistakes could PROTOS make? Errors of classification and errors of explanation. How could it learn?

The PROTOS system: learning How could it learn? Adjust the categories so that the case will be properly classified and explained. Which is a simple way to assure that the case will be correctly classified in the future? Add the case to the correct category as a new exemplar.

Knowledge representation and organization In Protos, the exemplars and the cases to be classified are represented as collections of features. The description of a case (or exemplar) may be incomplete, in the sense that it does not include some of the features present in other case descriptions. Also, the features with which cases are described may not directly indicate category membership. Therefore, one has to make inferences.

Sample case description
Case to be classified as belonging to one of the categories {normal_ear, cochlear_noise, cochlear_age, otitis_media, ...}:
NewCase:
    sensorineural: mild
    notch_at_4K
    history: noise
    speech: normal
    oc_acoustic_reflex: normal
    oi_acoustic_reflex: elevated
    i_acoustic_reflex: normal
    c_acoustic_reflex: normal
    static: normal
    tympanogram: a
    air: normal

Organization of the exemplars and concepts

Remindings and Censors Remindings associate features with categories or particular exemplars. Such associations provide Protos with hypotheses during classification, which restrict its search for a matching exemplar. For example, "air: normal" would be a reminding to the category "normal_ear". Remindings are compiled from explanations of the relevance of features to categories or exemplars. A reminding has a strength that estimates the conditional probability p(category|feature) or p(exemplar|feature). Censors are negative remindings. A censor is a feature that tends to rule out a classification. For example, "temperature: fever" would be a censor for the category "healthy_patient".
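To make the quantity p(category|feature) concrete, the sketch below estimates it as a relative frequency over a handful of labeled cases. This is an assumption made purely for illustration: Protos compiles remindings from explanations rather than from counts, and the function name `reminding_strengths` and the sample `cases` are hypothetical, not data from the lecture.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def reminding_strengths(
    cases: List[Tuple[Set[str], str]], feature: str
) -> Dict[str, float]:
    """Estimate p(category | feature) as a relative frequency over labeled cases."""
    counts = Counter(cat for feats, cat in cases if feature in feats)
    total = sum(counts.values())
    if total == 0:
        return {}  # the feature was never observed, so it reminds of nothing
    return {cat: n / total for cat, n in counts.items()}


# Hypothetical labeled cases: (feature set, category).
cases = [
    ({"air: normal", "speech: normal"}, "normal_ear"),
    ({"air: normal", "static: normal"}, "normal_ear"),
    ({"air: normal", "history: noise"}, "cochlear_noise"),
    ({"history: noise", "notch_at_4K"}, "cochlear_noise"),
]
strengths = reminding_strengths(cases, "air: normal")
print(strengths)
```

A feature whose estimate for some category is close to zero behaves like a censor for that category in this simplified view.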

Prototypicality and difference links Prototypicality ratings provide a partial ordering on exemplars within a category. Exemplars of a category which have the highest family resemblance (i.e. are most similar to other members of the category) are most prototypical. A difference link connects two exemplars (in the same or different categories) and records important featural differences between them which may suggest alternate classifications and better exemplars for use during classification.
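Family resemblance can be illustrated by scoring each exemplar with its average similarity to the other exemplars of the same category. The sketch assumes set-valued cases and Jaccard overlap as the similarity measure; both are simplifications chosen here, and the names `overlap`, `family_resemblance`, and the `chairs` data are hypothetical.

```python
from typing import Dict, Set


def overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two feature sets (an assumed similarity measure)."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def family_resemblance(exemplars: Dict[str, Set[str]]) -> Dict[str, float]:
    """Average similarity of each exemplar to the other exemplars of its category."""
    scores = {}
    for name, feats in exemplars.items():
        others = [f for n, f in exemplars.items() if n != name]
        if others:
            scores[name] = sum(overlap(feats, o) for o in others) / len(others)
        else:
            scores[name] = 0.0  # a lone exemplar has nothing to resemble
    return scores


# Hypothetical exemplars of a single category.
chairs = {
    "kitchen_chair": {"seat", "backrest", "legs(4)"},
    "office_chair": {"seat", "backrest", "pedestal", "wheels"},
    "armchair": {"seat", "backrest", "legs(4)", "armrests"},
}
scores = family_resemblance(chairs)
most_prototypical = max(scores, key=scores.get)
print(most_prototypical)
```

Sorting the exemplars by this score gives one possible ordering to try during matching, most prototypical first.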

Case classification Hypothesize classifications based on the case's features by using remindings and censors. The remindings and censors associated with the features of a new case are combined to produce an ordered list of possible classifications. Attempt to confirm a hypothesis by matching the new case with prototypical exemplars. A process of knowledge-based pattern matching determines the similarity of the case and each exemplar. It uses previously acquired domain knowledge to explain how features of the case provide the same evidence as features of the exemplar. Overall similarity of the two cases is asserted by evaluating the quality of the resulting explanation and the importance of unmatched features. If a match is imperfect, Protos searches for a more similar exemplar by traversing difference links associated with the current exemplar. If the match is strong (i.e., adequately explained), it is presented to the user for approval and discussion. If it is weak, Protos considers other hypotheses and exemplars. It reports failure if its hypotheses are exhausted without finding an adequate match.
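The first step, combining remindings and censors into an ordered list of hypotheses, might be sketched as follows. The additive combination rule, the numeric strengths, and the names `REMINDINGS` and `hypothesize` are all assumptions made for illustration; the lecture does not specify how Protos combines strengths.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical strengths: feature -> {category: strength}.
# Negative values play the role of censors. All numbers are invented.
REMINDINGS: Dict[str, Dict[str, float]] = {
    "air: normal": {"normal_ear": 0.6},
    "speech: normal": {"normal_ear": 0.4, "cochlear_age": 0.2},
    "sensorineural: mild": {"cochlear_age": 0.5},
    "pulse: normal": {"healthy_patient": 0.5},
    "temperature: fever": {"healthy_patient": -0.9},  # censor
}


def hypothesize(features: Iterable[str]) -> List[Tuple[str, float]]:
    """Sum reminding and censor strengths per category and rank the survivors."""
    totals: Dict[str, float] = defaultdict(float)
    for feature in features:
        for category, strength in REMINDINGS.get(feature, {}).items():
            totals[category] += strength
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    # Categories whose combined evidence is not positive are ruled out.
    return [(cat, s) for cat, s in ranked if s > 0]


print(hypothesize(["air: normal", "speech: normal", "sensorineural: mild"]))
print(hypothesize(["temperature: fever", "pulse: normal"]))
```

Each surviving hypothesis would then be checked by matching the case against exemplars of that category, strongest hypothesis first.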

Use of remindings to suggest classifications of input Some of the features of NewCase are remindings to two possible diagnoses, "normal_ear" and "cochlear_age".

Matching of the input with an exemplar
When the individual remindings are combined, "normal_ear" is the strongest hypothesis. Protos retrieves the most prototypical exemplar of "normal_ear" and attempts to match it to the NewCase to confirm the hypothesis:

NewCase                                         Case p8447L of Normal_Ear
sensorineural: mild
notch_at_4K
history: noise
speech: normal ------------------------------- speech: normal
oc_acoustic_reflex: normal ------------------- oc_acoustic_reflex: normal
oi_acoustic_reflex: elevated ----------------- oi_acoustic_reflex: elevated
i_acoustic_reflex: normal -------------------- i_acoustic_reflex: normal
c_acoustic_reflex: normal -------------------- c_acoustic_reflex: normal
static: normal ------------------------------- static: normal
tympanogram: a ------------------------------- tympanogram: a
air: normal ---------------------------------- air: normal
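The match between NewCase and exemplar p8447L can be reproduced with plain set operations. The sketch assumes exact string equality between features, which ignores the knowledge-based (flexible) matching Protos actually performs; the function name `match` and the result keys are hypothetical.

```python
from typing import Dict, Set


def match(new_case: Set[str], exemplar: Set[str]) -> Dict[str, Set[str]]:
    """Split the features of two cases into matched and unmatched groups."""
    return {
        "matched": new_case & exemplar,
        "unmatched_in_case": new_case - exemplar,
        "unmatched_in_exemplar": exemplar - new_case,
    }


# Features of exemplar p8447L of normal_ear, as listed in the lecture.
p8447L = {
    "speech: normal", "oc_acoustic_reflex: normal",
    "oi_acoustic_reflex: elevated", "i_acoustic_reflex: normal",
    "c_acoustic_reflex: normal", "static: normal",
    "tympanogram: a", "air: normal",
}
# NewCase has every exemplar feature plus three more.
new_case = p8447L | {"sensorineural: mild", "notch_at_4K", "history: noise"}

result = match(new_case, p8447L)
print(sorted(result["unmatched_in_case"]))
print(len(result["unmatched_in_exemplar"]))
```

Every feature of the exemplar is matched, which is why the match looks strong, while the three unmatched features of NewCase are exactly the ones the teacher is later asked about.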

Protos-teacher dialog Protos believes the match to be strong since all the exemplar's features are matched. However, the teacher rejects the classification of the NewCase as a "normal_ear". Dialog with the user helps Protos to improve its knowledge. Protos asks about the features of the NewCase that were not matched by the exemplar and is told that all are incompatible with "normal_ear". How could this information be used? This information will be used to define a difference link associated with the "normal_ear" exemplar, which will point toward the correct (but not yet known) classification of the NewCase. Protos also asks whether the exemplar has additional features that discriminate it from NewCase, but the teacher does not identify any.

Protos-teacher dialog (cont.) Protos then tries to confirm its second hypothesis, "cochlear_age", but it fails to find a good match with any exemplar. This means that Protos exhausted all the hypotheses. What can be done in this situation? Protos asks the user to classify the NewCase and it is told that the classification is "cochlear_noise". Since the system has no exemplar of this category, NewCase is retained as an exemplar.

Protos-teacher dialog (cont.) Dialog with the user helps Protos to acquire general knowledge of "cochlear_noise". Protos asks the teacher to explain the relevance of each case feature to the classification and receives explanations such as: "history:noise is required by cochlear_noise". How could this information be used? The relationship "is required by" tells Protos to define "history:noise" as a strong reminding to "cochlear_noise".

Protos-teacher dialog (cont.) Another explanation is: "notch_at_4k is usually caused by cochlear_noise". How could this information be used? The relationship "is usually caused by" tells Protos to define "notch_at_4k" as a less strong reminding to "cochlear_noise". Protos provides a set of relationships that the user may use in explanations. Protos also installs a difference link between the exemplar p8447L of "normal_ear" and the NewCase and annotates it with the features of NewCase that the teacher previously stated are incompatible with "normal_ear".
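How a teacher's explanation could be turned into a reminding is sketched below. Only the two relationships quoted in the lecture are covered, and the qualitative labels "strong" and "less strong" are taken from the text. The parsing approach and the names `RELATION_STRENGTH` and `compile_reminding` are hypothetical; they do not describe the explanation language Protos actually uses.

```python
from typing import Tuple

# Relationship phrase -> qualitative reminding strength (labels from the lecture).
RELATION_STRENGTH = {
    "is required by": "strong",
    "is usually caused by": "less strong",
}


def compile_reminding(explanation: str) -> Tuple[str, str, str]:
    """Parse '<feature> <relationship> <category>' into (feature, category, strength)."""
    for relation, strength in RELATION_STRENGTH.items():
        marker = f" {relation} "
        if marker in explanation:
            feature, category = explanation.split(marker, 1)
            return feature.strip(), category.strip(), strength
    raise ValueError(f"no known relationship in: {explanation!r}")


print(compile_reminding("history:noise is required by cochlear_noise"))
print(compile_reminding("notch_at_4k is usually caused by cochlear_noise"))
```

Keeping the relationship vocabulary in one table makes it easy to see which phrases the teacher may use and what each one implies.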

The role of explanations in Protos What role do explanations play in learning? First, explanations describe the relevance of exemplar features to categories. From such explanations Protos extracts remindings and assesses their strength. Such an explanation is: "history:noise is required by cochlear_noise". Second, explanations describe how different features provide equivalent evidence for classification. Such explanations provide the knowledge needed to match features that are not identical. Examples of such explanations are: "notch_at_4k is definitionally equivalent to notch_4k" or "if the category is cochlear_noise then c_acoustic_reflex: normal is sometimes interchangeable with c_acoustic_reflex: elevated".
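The second kind of explanation, featural equivalence, can be sketched as a small lookup used during matching. The two entries come from the examples above; the table layout and the names `EQUIVALENCES` and `features_match` are hypothetical. One simplification is labeled loudly: "sometimes interchangeable" is treated here as always interchangeable within the stated category.

```python
from typing import FrozenSet, List, Optional, Tuple

# (pair of equivalent features, category in which the equivalence holds or None).
# None means the equivalence holds regardless of category.
EQUIVALENCES: List[Tuple[FrozenSet[str], Optional[str]]] = [
    (frozenset({"notch_at_4k", "notch_4k"}), None),
    (
        frozenset({"c_acoustic_reflex: normal", "c_acoustic_reflex: elevated"}),
        "cochlear_noise",
    ),
]


def features_match(a: str, b: str, category: str) -> bool:
    """True if a and b are identical or explained as equivalent for this category."""
    if a == b:
        return True
    for pair, condition in EQUIVALENCES:
        if {a, b} == pair and condition in (None, category):
            return True
    return False


print(features_match("notch_at_4k", "notch_4k", "normal_ear"))
print(features_match("c_acoustic_reflex: normal",
                     "c_acoustic_reflex: elevated", "normal_ear"))
```

The category argument matters: the acoustic reflex values are interchangeable only when the hypothesis under test is "cochlear_noise".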

The learning algorithm
GIVEN: a new case
FIND: a classification of the case and an explanation of the classification

Search for an exemplar that matches the new case
IF not found
THEN {classification failure}
    Ask teacher for classification
    Acquire explanations relating features to classification
    Compile remindings
    Retain case as an exemplar
ELSE IF the teacher disapproves
THEN {discrimination failure}
    Reassess remindings
    Discuss featural matches with the teacher
    Ask for discriminating features
    Remember unmatched features to add difference link
ELSE {classification is correct}
    Increase exemplar's prototypicality rating
    IF match is incompletely explained
    THEN {explanation failure}
        Ask the teacher for explanation of featural equivalence
        IF not given
        THEN Retain case as exemplar
    ELSE {processing was successful}
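The three main branches of this algorithm can be sketched in Python. This is a heavily simplified illustration, not Protos itself: the teacher is modeled as a known correct label, the matcher is a toy subset test, and the acquisition of explanations, the compilation of remindings, and the explanation-failure branch are omitted. The names `Exemplar`, `find_match`, and `learn` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Exemplar:
    features: Set[str]
    category: str
    prototypicality: int = 0
    differences: List[Set[str]] = field(default_factory=list)


def find_match(case: Set[str], memory: List[Exemplar]) -> Optional[Exemplar]:
    """Toy matcher: an exemplar matches when all of its features occur in the case."""
    for exemplar in memory:
        if exemplar.features <= case:
            return exemplar
    return None


def learn(case: Set[str], memory: List[Exemplar], teacher_label: str) -> str:
    """Process one case and update memory; returns the name of the branch taken."""
    exemplar = find_match(case, memory)
    if exemplar is None:
        # Classification failure: retain the case as a new exemplar.
        memory.append(Exemplar(set(case), teacher_label))
        return "classification failure"
    if exemplar.category != teacher_label:
        # Discrimination failure: remember unmatched features for a difference link.
        exemplar.differences.append(case - exemplar.features)
        return "discrimination failure"
    # Classification is correct: the exemplar proved useful.
    exemplar.prototypicality += 1
    return "classification correct"


memory = [Exemplar({"speech: normal", "air: normal"}, "normal_ear")]
case = {"speech: normal", "air: normal", "history: noise"}
print(learn(case, memory, "cochlear_noise"))
print(memory[0].differences)
```

In the hearing-disorder example, the first call mirrors the rejected "normal_ear" match: the unmatched feature is recorded on the exemplar so that it can later annotate a difference link.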

Exercise At a high level of generality, CBR systems can be described as performing the following 4-step process: 1. Retrieve the most similar case. 2. Use the case to produce a tentative solution of the input problem. 3. Revise the proposed solution. 4. Learn from this experience to improve performance in the future. Explain how each of these steps is performed in Protos.

Exercise What is the difference between instance-based learning, on one hand, and case-based reasoning and learning, on the other hand? What are the relative characteristic features of instance-based learning? • Instance-based learning: • is a special (simplified) type of case-based reasoning for classification tasks; • uses a simple (feature-vector) representation of the exemplars; • compensates for the lack of guidance from general domain knowledge by using a large number of examples. What are the relative characteristic features of CBR? • Case-based reasoning and learning: • is used for other tasks besides classification; • in general, a case has a complex structure, not just a feature vector; • a retrieved case is generally modified when applied to a new problem; • utilizes general domain knowledge.

Exercise What is the difference between case-based reasoning, on one hand, and analogical reasoning, on the other hand? What are the relative characteristic features of case-based reasoning? All the cases are from the same domain. Therefore it does not need to deal with the ACCESS problem. It is generally regarded as a special type of analogical reasoning. What are the relative characteristic features of analogical reasoning? The sources and the target are generally from different domains.

Recommended reading
Porter B.W., Bareiss R., Holte R.C., Concept Learning and Heuristic Classification in Weak-Theory Domains, in Readings in Knowledge Acquisition and Learning, Morgan Kaufmann, 1992.
Bareiss R., Porter B.W., Murray K.S., Supporting Start-to-Finish Development of Knowledge Bases, Machine Learning, 4, 259-283, 1989.
Bareiss R., Exemplar-Based Knowledge Acquisition, Academic Press, 1989.
Aamodt A., Knowledge Acquisition and learning by experience – the role of case-specific knowledge, in Tecuci G. and Kodratoff Y. (eds.), Machine Learning and Knowledge Acquisition: Integrated Approaches, pp. 197-245, Academic Press, 1995.
Sycara K., Miyashita K., Learning control knowledge through case-based acquisition of user optimization preferences in ill-structured domains, in Tecuci G. and Kodratoff Y. (eds.), Machine Learning and Knowledge Acquisition: Integrated Approaches, pp. 247-275, Academic Press, 1995.
