Addressing Perceptions of Case-Based Reasoning David W. Aha Head, Adaptive Systems Section Navy Center for Applied Research in AI Naval Research Laboratory, Code 5514 Washington, DC david.aha@nrl.navy.mil Invited Talk 2007 International Conference on Case-Based Reasoning 13 August 2007 Belfast, Northern Ireland
Goals of this presentation • Raise awareness on how to assess CBR R&D methods • Assess CBR R&D methods we’re publishing • Relate CBR’s R&D methods to those used in AI • Beg for your forgiveness? Case-Based Reasoning
Outline • Perceptions • Objectives • Survey • Findings • Interpretation
Outline • Perceptions • Story: Gnats, envy, & self-doubt • Quest • Objectives • Survey • Findings • Interpretation
What perceptions of case-based reasoning (CBR) exist? • Among active CBR researchers/practitioners • Among others
My perception Artificial Intelligence … Case-Based Reasoning …
Gnats, envy, & self-doubt Gnat UK Gnat Observation: CBR perceived differently by others
Gnats • “In CBR (Pal & Shiu, 2004; Kolodner, 1993) expertise is embodied in a library of past cases… <long, accurate description of CBR> The major problem with CBR is that it lacks a sound theoretical framework for its application and has only achieved limited success.” - Anonymous senior AI researcher/proposer, 2005 • “Case based reasoning is often limited to surface features that may not be relevant to the operational military situation. (There is a need for deeper underlying reasoning, including analogical reasoning.)” - Anonymous ONR Program Manager, 2007
Gnats • Artificial Intelligence: CBR not taught? • AIMA (Russell & Norvig, 2002-) • 90% market share (1000+ universities, 91 countries) • CBR: Not discussed • IBL (3 pages, under Statistical Learning) • ML: Is the prevailing view that CBR = instance-based learning? • No (e.g., 61% of papers in ECCBR-06 not related to ML) • Yet there is a relationship • e.g., “CBR is a technique within the field of machine learning…” (Beltrán-Ferruz et al., ECCBR-06)
Gnats Why isn’t some AI-related CBR research published at IC/ECCBR? • Computational analogy • “CBR systems…tend to use only minimal first-principles reasoning…[and] rely on feature-based descriptions…[or] use domain-specific and task-specific similarity metrics. This can be fine for a specific application, but being able to exploit similarity computations that are more like what people do could make such systems…more understandable to their human partners.” (Forbus et al., IAAI-02) • Episodic memory • “Episodic memory can be thought of as the mother-of-all CBR problems – how to store and retrieve cases about everything relevant in an entity’s existence. Most CBR research has avoided these issues.” (Nuxoll & Laird, ICCM-04)
Gnats • Some stereotypical perceptions of CBR • How could any misperceptions be addressed?
Not interested in giving you yet another content survey We have existing surveys of CBR (e.g., KER 2005 special issue)
Envy • Am I (unnecessarily) wishing for something? • Formal foundations envy? • e.g., Bayesian, first-order logic, decision theory, COLT, … • But we have this: • e.g., Cover & Hart, 1967; Richter, FLAIRS-07; Richter & Aamodt, 2005 KER • And we’re the ultimate chameleons, even within AI • Methodological approach envy? • e.g., Experimental study of ML (Langley & Kibler, 1991), Crafting papers on ML (Langley, ICML-00), … • Possibly: • We haven’t received much proselytizing on this…yet • My awareness of these issues has increased; worth reviewing
Envy: Crafting papers (e.g., on ML) (Langley, ICML-00) • Content • Evaluation strategy • Communication • Paper content recommendations • State the research goals and evaluation criteria • Specify the component (e.g., learning) & overall performance task • Describe representation and organization of knowledge & data • Explain the system components (if any) • Evaluate the approach • Empirical, theoretical, psychological, novel functionality • Describe related work • Explain similarities/differences with your work • State the limitations • Propose solutions
Self-Doubt Summary questions • How should (royal) we respond to possible misperceptions of CBR? • i.e., Other than to survey the field’s contents and its foundations • Why are some folks ignoring CBR? • How can we attract them? • Does this concern our research methodologies (and/or their communication) rather than our research focus? Proposal: Examine our research methodologies Realization: This requires a framework for investigation
Quest: Identify, characterize, & compare CBR research methods Don Quixote (Scott Gustafson)
Outline • Perceptions • Objectives • Questions • Conjectures/Hypotheses • Survey • Findings • Interpretation
Questions • How should we describe CBR to others? • i.e., in the context of AI • What R&D methodologies are we using? • Does CBR R&D differ from AI R&D?
Conjectures/Hypotheses • AI research is dominated by two methodologies (Cohen, 1991) • Model-centered (neat) (i.e., proving theorems on formal models) • System-centered (scruffy) • CBR research is not (currently) dominated by both • Dominated only by system-centered papers, which often lack models for deriving claims, generating predictions, and explaining behavior • CBR research suffers from similar methodological problems • Model- and system-centered papers differ in whether they conduct evaluations, assess performance, and describe expectations • The designations of CBR conference publications are distinguished by their research methodologies • Oral vs. poster presentations • Best paper nominees from others
Outline • Perceptions • Objectives • Survey • Case base • Retrieval • Reuse • Revision • Findings • Interpretation
Case Base: Frameworks for assessing AI R&D Methods • Case #1: Cohen, Paul R. (1991). A Survey of the Eighth National Conference on Artificial Intelligence: Pulling together or pulling apart? AI Magazine, 12(1), 16-41. • Summary of (Cohen, 1991) • Read 150 Papers! (can you imagine?) • Conclusion: AI research follows two incomplete, complementary methodologies • Proposes: MAD (Modelling, Analysis, & Design) mixed methodology • Recommendation: Make this required reading for AI researchers • [Photos: Paul R. Cohen (circa 1991) and Paul R. Cohen (circa 2007)]
(Cohen, 1991): 40 citations (Google Scholar, 8/1/07) • Many of our suggestions are similar to the excellent points made by Cohen (1991) in his discussion of AI, but they seem worth instantiating for the field of machine learning (Langley & Kibler, 1991 “Experimental Study of ML”) • There are two ways in which the fields proceed. One is through the development and synthesis of models of aspects of perception, intelligence, or action, and the other is through the construction of demonstration systems (Brooks, 1991 Science). • As Cohen (1991) demonstrated in his analysis of the papers presented at AAAI90, we are, as a discipline, just learning how to perform real, systematic experimentation. One hears a lot of talk about AI as an experimental science, but typically the “experiments” amount merely to writing a computer program that is supposed to validate some hypothesis by its very existence. (Pollock, 1992 Artificial Intelligence) • The importance of this link has been highlighted by several researchers, some even going so far as to state that AI will not advance as a science until the gap between those who construct models and those who build systems is closed. (Jennings, 1995 Artificial Intelligence) • Cohen (1991) discovered that only 43% of the papers that described implemented systems report any kind of analysis of their contributions. Even of the papers that do describe evaluatory experiments, very few go beyond evaluating the programs to analyzing the scientific claims that the programs were written to demonstrate. (Ram & Jones, 1995 Philosophical Psychology) • Methodological issues are by no means resolved (Cohen, 1991), but they are much discussed and a consensus is emerging on the importance of combining theoretical and empirical investigations. (Bundy, 1998 book) • As Cohen (1991) points out, most research papers in AI, or at least at an AAAI conference, exploit benchmark problems; yet few of them relate the benchmarks to target tasks. (Howe & Dahlman, 2002 JAIR)
Retrieval • My query: • Identify R&D methodologies being used in CBR • Compare results with general AI and other AI subfields • 1. Retrieve: Case #1 (Cohen, 1991), a framework for assessing AI R&D methods • Cohen, Paul R. (1991). A Survey of the Eighth National Conference on Artificial Intelligence: Pulling together or pulling apart? AI Magazine, 12(1), 16-41. • Develop and apply framework for analyzing AI R&D methodologies • Identify R&D methodologies being used • Propose novel R&D methodology (MAD)
MAD Framework (Cohen, 1991) Note: It does not (completely) eliminate subjective assessments!
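Reuse • [Diagram: 1. Retrieve: My Query retrieves (Cohen, 1991) and its MAD Framework from the Case Base. 2. Reuse: the retrieved case (Hypotheses, AAAI-90 Data, Metrics, MAD Framework, Analyze AAAI-90, Results, MAD Methodology) is adapted into Adapted Hypotheses, ECCBR-06 Data, Metrics, MAD Framework, Analyze ECCBR-06, and Results for the ICCBR Audience (at lunch)]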
Outline • Perceptions • Objectives • Survey • Findings • Results • Analysis & Patterns • Followup • Interpretation
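Categorizing Papers: AAAI-90 (Cohen, 1991) • Papers: Models (3 4), Algs (5 6), Systems (7 8) • Venn diagram counts • Models and/or Algs without Systems: 25, 43, 36 • Models and/or Algs overlapping Systems: 4, 1, 3 • Systems only: 37 • AAAI-90 Categories • Model-Centered: (M ∪ A) − S • Hybrid: (M ∪ A) ∩ S • System-Centered: S − (M ∪ A)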
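Categorizing Papers: ECCBR-06 • Papers: Models (3 4), Algs (5 6), Systems (7 8) • Venn diagram counts • Models and/or Algs without Systems: 1, 5, 7 • Models and/or Algs overlapping Systems: 3, 0, 6 • Systems only: 13 • ECCBR-06 Categories • Model-Centered: (M ∪ A) − S • Hybrid: (M ∪ A) ∩ S • System-Centered: S − (M ∪ A)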
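The category definitions on the two categorization slides can be read as set operations over three flags per paper: does it present a model, an algorithm, a system? The sketch below assumes that reading (model-centered = M or A without S, hybrid = M or A together with S, system-centered = S alone). The function name `mad_class`, the dictionary layout, and the assignment of individual counts to specific Venn regions inside each group are assumptions made for illustration; the class totals do not depend on that assignment.

```python
from collections import Counter


def mad_class(has_model, has_alg, has_system):
    """Map a paper's (model, algorithm, system) flags to a MAD class."""
    theory = has_model or has_alg  # membership in M ∪ A
    if theory and not has_system:
        return "model-centered"
    if theory and has_system:
        return "hybrid"
    if has_system:
        return "system-centered"
    return "unclassified"


# Keys are (has_model, has_alg, has_system); values are paper counts.
# Assumption: which count belongs to which region within a group is
# inferred from the order of the numbers on the slides.
AAAI_90 = {
    (True, False, False): 25,
    (False, True, False): 43,
    (True, True, False): 36,
    (True, False, True): 4,
    (True, True, True): 1,
    (False, True, True): 3,
    (False, False, True): 37,
}
ECCBR_06 = {
    (True, False, False): 1,
    (False, True, False): 5,
    (True, True, False): 7,
    (True, False, True): 3,
    (True, True, True): 0,
    (False, True, True): 6,
    (False, False, True): 13,
}


def class_totals(regions):
    """Sum the region counts into the three MAD classes."""
    totals = Counter()
    for flags, count in regions.items():
        totals[mad_class(*flags)] += count
    return dict(totals)


print("AAAI-90:", class_totals(AAAI_90))
print("ECCBR-06:", class_totals(ECCBR_06))
```

Summing this way gives 104 model-centered, 8 hybrid, and 37 system-centered papers for AAAI-90, against 13, 9, and 13 for ECCBR-06.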
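Comparing MAD Categorizations of Papers: AAAI-90 vs. ECCBR-06 • Summary labels as shown on the slide: 50% M, 25% M, 6% Hybrid, 25% Hybrid • AAAI-90 Venn regions, count (share) • Models and/or Algs without Systems: 25 (17%), 43 (29%), 36 (24%) • Overlapping Systems: 4 (3%), 1 (1%), 3 (2%) • Systems only: 37 (25%) • ECCBR-06 Venn regions, count (share) • Models and/or Algs without Systems: 1 (3%), 5 (14%), 7 (19%) • Overlapping Systems: 3 (8%), 0 (0%), 6 (17%) • Systems only: 13 (36%)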
Results: ECCBR-06 Contingency table
Analysis: Comparing ECCBR-06 with AAAI-90 • Could this distribution of M-C, Hybrid, and S-C methodologies have arisen by chance, or does it reflect a real difference between ECCBR and AAAI? • χ2(2)=19.0, p<0.0001 • χ2(6)=24.1, p<0.006 • The ECCBR/AAAI distinction is not independent of the research methodology class
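The two-degree-of-freedom statistic can be reproduced with a plain Pearson chi-square test of independence. The sketch below is a minimal pure-Python version; the function name `chi_square_independence` is hypothetical, and the class counts are an assumption obtained by summing the Venn regions on the categorization slides (AAAI-90: 104 model-centered, 8 hybrid, 37 system-centered; ECCBR-06: 13, 9, 13).

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if conference and class were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: AAAI-90, ECCBR-06. Columns: model-centered, hybrid, system-centered.
observed = [
    [104, 8, 37],
    [13, 9, 13],
]
stat, df = chi_square_independence(observed)
print(f"chi2({df}) = {stat:.1f}")
```

Under these assumed counts the statistic comes out at roughly 19.0 with 2 degrees of freedom, in line with the value on the slide.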
Analysis: Examining ECCBR-06 • χ2(2)=7.0, p<0.003: The methodological choice of whether an evaluation was conducted is not independent of the paper’s class. • Model-centered and hybrid papers include evaluations significantly more frequently than do system-centered papers. • χ2(2)=0.4, p>0.8: Unlike AAAI-90, the methodological choice of an example is independent of the paper’s class.
Analysis: Examining ECCBR-06 (cont.) • χ2(2)=9.9, p<0.007: Like AAAI-90, model-centered and hybrid papers are more likely than system-centered papers to include (any type of) performance assessment. • χ2(2)=0.6, p>0.7: Surprisingly, and unlike AAAI-90, model-centered and hybrid papers do not provide (any types of) expectations more frequently than do system-centered papers. • Perhaps this warrants a follow-up analysis • Perhaps system-centered researchers make predictions not derived from models, which would be dangerous, or perhaps they are simply not stating the models, which is more likely.
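For one and two degrees of freedom the chi-square tail probability has a closed form, so a statistic of this kind can be turned into a p-value without a statistics package. The sketch below uses only the standard library; the function name `chi2_p_value` is hypothetical, and it deliberately covers only df = 1 and df = 2. Because the slides report rounded statistics, values computed this way may differ slightly from the quoted bounds.

```python
import math


def chi2_p_value(stat, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or 2."""
    if stat < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    if df == 1:
        # P(Z^2 > x) for a standard normal Z equals erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # Chi-square with 2 df is exponential with mean 2.
        return math.exp(-stat / 2.0)
    raise NotImplementedError("only df = 1 and df = 2 are handled in this sketch")


for stat, df in [(0.4, 2), (0.6, 2), (6.0, 2), (0.28, 1)]:
    print(f"chi2({df}) = {stat}: p = {chi2_p_value(stat, df):.4f}")
```

A general-purpose routine would need the regularized incomplete gamma function for other degrees of freedom; the two special cases are enough for the 2×2 and 2×3 tables discussed here.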
Patterns: Comparing ECCBR-06 with AAAI-90 Generous? Generous! Naïve
ECCBR-06: Distinguishing Papers from Posters “Now we tread on hallowed ground” - Anon Hypothesis: Reviewers are human and subjective. While there’s probably a trend that oral presentations show more “maturity” than do posters, exceptions exist and this trend is probably not significant. • Results of analysis: I was wrong… • …assuming the presentation/use of models is indicative of a paper’s level of maturity
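ECCBR-06: Oral Papers (22) vs. Posters (13) • Oral papers: 50% M, 14% Hybrid, 36% Systems • Venn regions, count (share): Models and/or Algs without Systems: 1 (5%), 5 (23%), 5 (23%); overlapping Systems: 2 (9%), 0 (0%), 1 (5%); Systems only: 8 (36%) • Posters: 15% M, 46% Hybrid, 38% Systems • Venn regions, count (share): Models and/or Algs without Systems: 0 (0%), 0 (0%), 2 (15%); overlapping Systems: 1 (8%), 0 (0%), 5 (38%); Systems only: 5 (38%)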
ECCBR-06: Distinguishing Papers from Posters • χ2(5)=9.3, p<0.1 • χ2(2)=6.0, p<0.05 • The poster/paper designation of an accepted paper at ECCBR-06 was not independent of the paper’s class. • Tentative conclusion: If you want your accepted paper to be an oral presentation, then present your work in the context of a model. • So: Will you think about this, and want to learn more? But maybe you are unconvinced…
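The class-level test for orals versus posters can be retraced from the percentages on the "Oral Papers (22) Posters (13)" slide. The sketch below first rounds those percentages back into paper counts and then lists how much each cell contributes to the chi-square statistic. The helper names are hypothetical, and reading the first percentage of each pair as the oral value and the second as the poster value is an assumption.

```python
def counts_from_percentages(total, percentages):
    """Recover integer counts from rounded percentages of a known total."""
    counts = [round(total * p / 100) for p in percentages]
    if sum(counts) != total:
        raise ValueError("rounded counts do not add up to the stated total")
    return counts


def cell_contributions(table):
    """Per-cell (observed - expected)^2 / expected terms of Pearson's chi-square."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [
        [
            (obs - row_totals[i] * col_totals[j] / n) ** 2
            / (row_totals[i] * col_totals[j] / n)
            for j, obs in enumerate(row)
        ]
        for i, row in enumerate(table)
    ]


# Columns: model-centered, hybrid, system-centered.
oral = counts_from_percentages(22, [50, 14, 36])
posters = counts_from_percentages(13, [15, 46, 38])

contributions = cell_contributions([oral, posters])
stat = sum(sum(row) for row in contributions)

print("oral counts:", oral, "poster counts:", posters)
for label, row in zip(("oral", "poster"), contributions):
    print(label, [round(value, 2) for value in row])
print(f"chi2(2) = {stat:.1f}")
```

With these reconstructed counts the model-centered and hybrid columns account for almost all of the statistic, while the system-centered column contributes almost nothing, which fits the tentative conclusion about presenting work in the context of a model.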
ECCBR-06: Distinguishing Papers from Posters • Nothing else (so far) distinguishes papers from posters • χ2(1)=0.02, p>0.9 • χ2(1)=0.05, p>0.8 • χ2(1)=0.28, p>0.5 • χ2(1)=0.44, p>0.5
ECCBR-06: Distinguishing Best Paper Nominees? • But there were only 5 • Future work: Analyze after adding 9 ICCBR-07 nominees
Summary Hypotheses revisited • Unlike AAAI-90, CBR research is not dominated by both model-centered and system-centered methodologies • Dominated only by system-centered papers • CBR research suffers from similar methodological problems • Model- and system-centered papers differ in whether they: • Conduct evaluations • Assess performance • Describe expectations • The class of a paper in the MAD framework distinguishes • Oral vs. poster presentations • Best paper nominees from others
Outline • Perceptions • Objectives • Survey • Findings • Interpretation • A new case • Caveats • Next steps
A new case…hopefully • [Diagram: the CBR cycle applied to this talk. Query: How can we assess CBR R&D Methodologies? 1. Retrieve: (Cohen, 1991) from the Case Base (Frameworks for assessing AI R&D Methods), with its MAD Framework, AAAI-90 analysis, and MAD mixed methodology. 2. Reuse: MAD Framework applied to ECCBR-06, giving Today’s Results. 3. Revise. 4. Retain: (Aha, 2007?)]
Many of Cohen’s (1991) Points for AI Apply to CBR • A goal of AI research is to develop science & technology to support the design and analysis of intelligent systems • Model- and system-centered methods are complementary • Model-centered researchers typically develop algorithms for simpler problems, but with deeper analysis, expectations, and demos • System-centered researchers typically build large systems to solve realistic problems, but w/o explicit expectations, analyses or demos • See the MAD methodology (Cohen, 1991) • Models are used to derive hypotheses & expectations • Few systems merit attention on the basis of existence alone • It is impossible to evaluate a system without predictions • Creating benchmarks will not fix AI’s methodological problems
Caveat #1: No cases for ECCBR-90, AAAI-06, etc. ! ? AAAI-90 ECCBR-06 Note: CBRW-91’s R&D methods differ greatly from ECCBR-06’s