Fully-Automated Reading
Bettina Schimanski
Wednesday, June 28, 2006
Dept. of Cognitive Science, Dept. of Computer Science
Rensselaer AI & Reasoning (RAIR) Lab
Rensselaer Polytechnic Institute (RPI), Troy, NY 12180
Motivation • Turning to written text and diagrams to learn isn't considered learning as it has been rigorously studied in computer science, cognitive science, and AI. In these disciplines, to learn is almost invariably to produce an underlying function f on the basis of a restricted set of input-output pairs. • Yet learning by reading (LBR) underpins much of everyday human learning, e.g. educational systems, job training, IRS tax form instructions, and product manuals.
Poised-For Knowledge We do not consider the techniques of Latent Semantic Analysis (LSA), text extraction, etc. to be learning by reading, because these techniques do not generate poised-for knowledge. Poised-for knowledge, or simply p.f. knowledge, is knowledge poised for the semantically correct generation of output that would provide overwhelming psychometric evidence that deep and durable learning has taken place.
Learning • P.F. Learning is described by a continuum ranging from shallow to deep knowledge acquisition. • Shallow Learning: • Absorption of the semantic content explicitly present in the surface structure and form of the medium. • Deep Learning: • Reflective contemplation of semantic content with respect to prior knowledge, experience, and beliefs as well as imaginative hypothetical projections.
Objective • Our objective is to construct a proof-of-concept LBR process capable of true p.f. learning – i.e. one that produces as output p.f. knowledge. • Specifically, we will prototype an extension to the Slate system that allows a human or machine reasoner to ‘crack’ Intelligence Analysis scenarios, such as Well-Logger #1 (Bringsjord), directly from English source material.
How does one get information into Slate? • Fully-automated reading of files: • Slate will be able to read all case studies in its internal library • Approach: • Phase 1: Natural English → Controlled English • Phase 2: Controlled English → DRS • Phase 3: DRS → Multi-Sorted Logic
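The three phases compose into a single reading pipeline. Below is a minimal Python sketch of that composition; the function names (to_controlled_english, parse_to_drs, drs_to_msl) are hypothetical stand-ins for expository purposes, not Slate's actual interfaces.

def to_controlled_english(natural_text: str) -> str:
    """Phase 1: rewrite full English as controlled English (currently a manual step)."""
    raise NotImplementedError("performed by a human author or an external tool")

def parse_to_drs(controlled_text: str):
    """Phase 2: parse controlled English into a Discourse Representation Structure."""
    raise NotImplementedError("delegated to an ACE-style parser")

def drs_to_msl(drs):
    """Phase 3: translate the DRS into multi-sorted logic for Slate's knowledge base."""
    raise NotImplementedError("see the translation sketch later in this deck")

def read(natural_text: str):
    """Natural English -> Controlled English -> DRS -> Multi-Sorted Logic."""
    return drs_to_msl(parse_to_drs(to_controlled_english(natural_text)))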
“Restricted” Languages Controlled Languages: A controlled language (CL) is simply a restricted subset of a full or natural language; e.g. controlled English is a subset of full English. Logically Controlled Languages: A logically controlled language (LCL) is an unambiguous controlled language that is translatable into logic (e.g. PENG, ACE, PNL, …).
ACE (Attempto Controlled English) • University of Zurich (Fuchs et al.) • Logically Controlled English (LCL) • Vocabulary consists of a closed set of reserved function words and an open set of user-defined content words • Grammar is context-free • Principles of Interpretation deterministically disambiguate otherwise ambiguous phrases • Direct translation into Discourse Representation Structures
“Learning by Reading” Process
Phase 1: Natural English → Controlled English Is manual transcription reasonable? • The 20+ CLs in commercial use (Allen & Barthe) are evidence that manual authoring in a CL is viable at scale. • Techniques for automated conversion from full English to controlled English are being investigated elsewhere (Mollá & Schwitter). WordNet has previously been used as the lexicon for CELT, an ACE-like CL (Pease et al.).
Phase 2: Controlled English → DRS • Discourse Representation Structures are part of Discourse Representation Theory (Kamp), a linguistic theory of meaning via additive semantic contribution. • DRS is a syntactic variant of first-order logic (FOL) for the resolution of unbounded anaphora. • A DRS is a structure ((referents), (conditions))
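To illustrate the ((referents), (conditions)) structure, here is a toy Python representation of a DRS together with the additive merge that Discourse Representation Theory relies on; the class and the sentence strings (taken from the example on the next slides) are an expository sketch, not the data structure used by ACE or Slate.

from dataclasses import dataclass

@dataclass
class DRS:
    """A DRS as a pair ((referents), (conditions))."""
    referents: tuple = ()   # discourse referents, e.g. ('A', 'B')
    conditions: tuple = ()  # conditions over them, e.g. ('John(A)', 'talk(A, B)')

    def merge(self, other: "DRS") -> "DRS":
        """Additive semantic contribution: a new sentence extends both the
        referent list and the condition list of the discourse so far."""
        return DRS(self.referents + other.referents,
                   self.conditions + other.conditions)

# "John talks to Mary."
s1 = DRS(('A', 'B'), ('John(A)', 'Mary(B)', 'talk(A, B)'))
# "He smiles at her." (the pronouns are resolved by the equality conditions)
s2 = DRS(('C', 'D'), ('C = A', 'D = B', 'smile(C, D)'))
print(s1.merge(s2))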
DRS Example • ACE text input: • John talks to Mary. • He smiles at her. • She does not smile at him.
“John talks to Mary.” ((A, B), (John(A), Mary(B), talk(A, B))) “He smiles at her.” ((A, B, C, D), (John(A), Mary(B), talk(A, B), C=A, D=B, smile(C, D))) “She does not smile at him.” ((A, B, C, D), (John(A), Mary(B), talk(A, B), C=A, D=B, smile(C, D), ¬((E, F), (E=B, F=A, smile(E, F)))))
Phase 3: DRS → Multi-Sorted Logic • Translation from DRS to MSL is akin to translation to FOL (Blackburn & Bos). • The extended DRSs of ACE are encumbered by a rarefied encoding scheme and a micro-ontology. Straightforward translation would inject that encoding/ontology into Slate’s KB. • Must map from ACE’s ontology to others (e.g. PSL)
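To make the "akin to translation to FOL" point concrete, here is a rough Python rendering of the standard Blackburn & Bos style translation, written over bare (referents, conditions) pairs. It ignores sorts, ACE's extended DRS encoding, and its micro-ontology, so it is only an approximation of what the actual Phase 3 code has to do.

def drs_to_fol(drs) -> str:
    """Translate a DRS (referents, conditions) into a FOL string: the referents
    become existentially quantified variables scoping over the conjunction of
    the translated conditions."""
    refs, conds = drs
    body = " & ".join(cond_to_fol(c) for c in conds)
    for x in reversed(refs):
        body = f"exists {x} ({body})"
    return body

def cond_to_fol(cond) -> str:
    """Atomic conditions (plain strings) pass through; negated and
    conditional sub-DRSs recurse."""
    if isinstance(cond, str):
        return cond
    tag, *args = cond
    if tag == "not":                      # ('not', K)
        return f"-({drs_to_fol(args[0])})"
    if tag == "implies":                  # ('implies', K1, K2)
        (refs, conds), k2 = args
        body = " & ".join(cond_to_fol(c) for c in conds)
        fol = f"({body}) -> ({drs_to_fol(k2)})"
        for x in reversed(refs):          # K1's referents become universals
            fol = f"forall {x} ({fol})"
        return fol
    raise ValueError(f"unrecognized condition: {cond!r}")

# The three-sentence example from the DRS slides above:
story = (("A", "B", "C", "D"),
         ("John(A)", "Mary(B)", "talk(A, B)", "C = A", "D = B", "smile(C, D)",
          ("not", (("E", "F"), ("E = B", "F = A", "smile(E, F)")))))
print(drs_to_fol(story))

The 'implies' case is what turns a conditional premise, such as the one worked through on the following slides, into a universally quantified FOL formula.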
Well-Logger #1 • In this factually based Intelligence Analysis scenario about the potential possession of radiological bombs by terrorists, the analyst is given: • 14 premises, explicitly set off for the analyst • 1 table containing necessary information • From this material, he/she must determine and justify which one of 12 possible conclusions follows from the given information.
The Simplest Premise: • “If a person X has some sufficient iridium then X has some raw material.”
Parse Tree • “If a person X has some sufficient iridium then X has some raw material.” [Parse tree diagram; nonterminals include <specification>, <cond_s>, <s>, <np>, <vp>, <det>, <nbar>, <vbar>, <var>, <n>, <appos_coord>, <v>, <appos>, <adj>]
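For readers who want to replay the parse, the fragment below is a toy context-free grammar, written with NLTK, that covers just this one premise using roughly the nonterminals listed above. It is an illustrative reconstruction only, far smaller than ACE's real grammar, and it gives 'If', 'then', and the period their own preterminals for simplicity.

import nltk

grammar = nltk.CFG.fromstring("""
specification -> cond_s period
cond_s -> if_word s then_word s
s -> np vp
np -> det nbar | var
nbar -> n appos_coord | adj nbar | n
appos_coord -> appos
appos -> var
vp -> vbar
vbar -> v np
det -> 'a' | 'some'
n -> 'person' | 'iridium' | 'material'
adj -> 'sufficient' | 'raw'
var -> 'X'
v -> 'has'
if_word -> 'If'
then_word -> 'then'
period -> '.'
""")

tokens = "If a person X has some sufficient iridium then X has some raw material .".split()
for tree in nltk.ChartParser(grammar).parse(tokens):
    tree.pretty_print()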
Raw DRS • “If a person X has some sufficient iridium then X has some raw material.”
FOL ∀A, B ((person(A) ∧ material(B) ∧ iridium(B) ∧ sufficient(B) ∧ have(A, B)) → ∃C (material(C) ∧ raw(C) ∧ have(A, C))) “If a person X has some sufficient iridium then X has some raw material.”
MSL ∀A:people, B:material ((iridium(B) ∧ sufficient(B) ∧ have(A, B)) → ∃C:material (raw(C) ∧ have(A, C))) “If a person X has some sufficient iridium then X has some raw material.”
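One standard way to relate the MSL formula to the FOL on the previous slide is relativization of the sorted quantifiers, the textbook reduction of many-sorted to unsorted logic (not necessarily how Slate encodes sorts internally):

∀x:s φ ≡ ∀x (s(x) → φ)    and    ∃x:s φ ≡ ∃x (s(x) ∧ φ)

Reading the sort A:people as the predicate person(A), and B:material, C:material as material(B), material(C), the relativized MSL formula is logically equivalent to the FOL formula on the previous slide.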
Implementation: What is left to be done? • Completed: • Phases 1 & 2: English → ACE → DRS • Part of Phase 3: FOL → MSL • In Progress: • Part of Phase 3: DRS → FOL
View Reading Demo