Justification/Explanation Evaluation Breakout Session
Stefano Bertolo, Richard Fikes
AQUAINT PI Meeting, Monterey, California, June 11-13, 2002
6/12/02

Straw Man Proposal
• General Evaluation Principles
• Required Characteristics
• Desirable Characteristics

General Evaluation Principles
• Scope of the evaluation
  • Not evaluating precision and recall
  • Are evaluating the quality of the justification(s) the system provides in support of the answer(s) it has returned for a given question
• Independence of correctness and justification
  • Justification(s) will be evaluated whether or not the answer it/they justify is correct
  • Reward reasonable justifications for an incorrect answer
  • Penalize unreasonable or unhelpful justifications for a correct answer

Straw Man Proposal
• General Evaluation Principles
  • Scope of the evaluation
  • Independence of correctness and justification
• Required Characteristics
• Desirable Characteristics

Required Characteristics
• Accountability
  • Justifications must be able to identify the sources on which they depend
  • If a justification has multiple "steps" (where the meaning of "step" is system-dependent), the justification will need to identify the source(s) on which each step depends
  • A system will be penalized for each justification step that does not identify the source on which it depends
• Understandability of justifications
  • The justification(s) that the system ranks as the most intelligible must be pronounced understandable by a panel of human scorers
  • The modality of the presentation is left undetermined and need not be fluent English

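The per-step accountability penalty above could be tallied along the following lines. This is only a minimal sketch: the `Step` record, the `accountability_penalty` function, and the unit penalty per unsourced step are illustrative assumptions, since the proposal fixes neither a data format nor a penalty size.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One justification step (hypothetical record; "step" is system-dependent)."""
    text: str
    sources: List[str] = field(default_factory=list)


def unsourced_steps(steps: List[Step]) -> List[Step]:
    """Return the steps that name no source at all."""
    return [s for s in steps if not s.sources]


def accountability_penalty(steps: List[Step], penalty_per_step: float = 1.0) -> float:
    """Charge an assumed fixed penalty for every step without a source."""
    return penalty_per_step * len(unsourced_steps(steps))


justification = [
    Step("Document A states the claim", ["doc-A"]),
    Step("Therefore the answer follows"),  # no source given
    Step("Document B corroborates", ["doc-B"]),
]
print(accountability_penalty(justification))  # 1.0
```

A linear penalty is just one option; an evaluation could equally weight steps by their importance to the overall justification.
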
Required Characteristics
• Meaningful ranking of justifications
  • Present justifications in an order that the user would find appropriate with respect to a prespecified criterion
  • If J1 is presented before J2, the user should agree that, for the chosen criterion:
    • Confidence: J1 encodes evidence that is at least as reliable as that encoded by J2
    • Conciseness: J1 is at least as concise as J2
    • Intelligibility: J1 is at least as easy to follow as J2

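One way to check the ordering condition above is to have scorers rate each justification on the chosen criterion and then look for pairs presented in the wrong order. The sketch below assumes numeric ratings where higher is better; the function name `ranking_violations` and the rating scale are hypothetical, not part of the proposal.

```python
from itertools import combinations
from typing import List, Tuple


def ranking_violations(scores: List[float]) -> List[Tuple[int, int]]:
    """Find pairs (i, j), i before j, where the earlier justification rates worse.

    scores[i] is the scorers' rating (higher is better, an assumption) of the
    justification presented at position i. Ties are allowed, matching the
    "at least as" wording of the criterion.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(scores)), 2)
        if scores[i] < scores[j]
    ]


# Ratings in presentation order: position 1 is rated below position 2.
print(ranking_violations([0.9, 0.7, 0.8, 0.7]))  # [(1, 2)]
```

For a criterion where smaller is better, such as length as a proxy for conciseness, the ratings can be negated before the check.
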
Straw Man Proposal
• General Evaluation Principles
  • Scope of the evaluation
  • Independence of correctness and justification
• Required Characteristics
  • Accountability
  • Meaningful ranking of justifications
  • Understandability of justifications
• Desirable Characteristics

Desirable Characteristics
• If a justification has a satisfactory score on the required characteristics, it will be rewarded for various desirable characteristics
• Natural language presentation: reward presenting justifications in a natural language
• Justification clustering: reward partitioning justifications into clusters with well-defined semantics that are clearly explained to the user
• Justification persistence: reward justifications that can be saved and inspected off-line with no loss of information
• Agent-accessible API: reward justifications being accessible to software agents via an API

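The persistence criterion ("no loss of information") suggests a simple round-trip test: serialize a justification, read it back, and compare. The sketch below uses JSON purely as an assumed storage format; the dictionary layout and the helper names `to_json`, `from_json`, and `round_trips` are illustrative and not prescribed by the proposal.

```python
import json


def to_json(justification: dict) -> str:
    """Serialize a justification to text that can be stored off-line."""
    return json.dumps(justification, sort_keys=True)


def from_json(text: str) -> dict:
    """Load a stored justification, e.g. for a software agent to inspect."""
    return json.loads(text)


def round_trips(justification: dict) -> bool:
    """True if saving and reloading reproduces the justification exactly."""
    return from_json(to_json(justification)) == justification


# Hypothetical justification record with per-step sources.
example = {
    "question": "Q1",
    "steps": [
        {"text": "Document A states the claim", "sources": ["doc-A"]},
        {"text": "Document B corroborates", "sources": ["doc-B"]},
    ],
}
print(round_trips(example))  # True
```

A check like this flags representations that silently change on the way to disk; for instance, a Python tuple comes back from JSON as a list, so a record containing tuples would fail the comparison.
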
Straw Man Proposal
• General Evaluation Principles
  • Scope of the evaluation
  • Independence of correctness and justification
• Required Characteristics
  • Accountability
  • Meaningful ranking of justifications
  • Understandability of justifications
• Desirable Characteristics
  • Natural language presentation
  • Justification clustering
  • Justification persistence
  • Agent-accessible API