An Interactive Dialogue System for Knowledge Acquisition in Cyc Michael Witbrock 2003-08-09 IJCAI 2003
Cyc: Large Scale Knowledge Base • >3,000 Microtheories • >1,600,000 Assertions • >200,000 Rules • >30,000 Relations • >120,000 Concepts
How did CYC get this far? • Large effort to date: 19 realtime years (1984 to 2003), substantial investment • Codify & enter each piece of knowledge, by hand [Figure: rate of learning plotted against amount known, up to the frontier of human knowledge; labeled phases: codify & enter by hand, learning via natural language, learning by discovery]
Analogy Developer (similarTo RingoStarr JerryAllison) (isa JerryAllison FamousHuman) (mostNotableIsa JerryAllison (PlayerOfInstrumentFn DrumInstrument)) (hasMembers TheBeatles-MusicGroup JerryAllison) (authorOfSong JerryAllison OctopussGarden-TheSong)
Phrase Disambiguator (from-UnderspecifiedLocation JerryAllison Hillsboro) (residesInRegion JerryAllison Hillsboro) (originallyFromRegion JerryAllison Hillsboro)
Unknown Term (isa PeggySue Song-CW)
Precision Suggestor (isa JerryAllison (PlayerOfInstrumentFn DrumInstrument)) (occupation JerryAllison (PlayerOfInstrumentFn DrumInstrument))
Concept Refinement Interview (playsInstrumentInMusicalGroup JerryAllison ?EXISTING-OBJECT-TYPE ?MUSICAL-PERFORMANCE-ORGANIZATION)
Concept Refinement Interview: Why I asked (aunts PeggySue-1 ?PERSON) (wife JerryAllison PeggySue-1) (implies (and (aunts ?Z ?Y) (wife ?X ?Y)) (uncles ?Z ?X))
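The "Why I asked" slide pairs a question with the rule that makes its answer useful. As a rough illustration of how one such rule can fire once its antecedents are matched, here is a minimal Python sketch of single-step forward chaining. It is not Cyc's inference engine. The helper names (`is_var`, `match`, `apply_rule`) and the individual `HypotheticalNiece` are assumptions made for the example; the `aunts` fact is invented and placed so that its second argument matches the second argument of the `wife` fact, as the rule's shared `?Y` requires.

```python
def is_var(term):
    # CycL-style variables start with "?".
    return isinstance(term, str) and term.startswith("?")

def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result

def apply_rule(antecedents, consequent, facts):
    """Return every consequent instance derivable from facts in one step."""
    envs = [{}]
    for literal in antecedents:
        envs = [b2 for b in envs for fact in facts
                if (b2 := match(literal, fact, b)) is not None]
    return {tuple(b.get(t, t) for t in consequent) for b in envs}

# The rule from the slide, written as tuples.
RULE_IF = [("aunts", "?Z", "?Y"), ("wife", "?X", "?Y")]
RULE_THEN = ("uncles", "?Z", "?X")

facts = {
    ("wife", "JerryAllison", "PeggySue-1"),        # from the slide
    ("aunts", "HypotheticalNiece", "PeggySue-1"),  # invented for illustration
}

derived = apply_rule(RULE_IF, RULE_THEN, facts)
print(derived)
```

Without an `aunts` fact the rule derives nothing, which is one way to read why the interview asks the question in the first place.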
Concept Refinement Interview: Induction (isa JerryAllison ?NATIONALITY)
Mixed-Initiative Dialogue [Diagram labels: User Interaction Agenda; Statement, Question, Command; Knowledge Base; KE Rules; Deductions; Inductions]
Progress in Knowledge Entry Efficiency [Chart: Assertions per Hour (axis 10 to 50) for Manual KE, Feasibility Study 2000-10, RKF Year 1 2001-08, RKF Year 2 2002-09]
Where do we want to go next?
• Never make the user wait
• Support both rapid and diligent parsing
• Improve ambiguity resolution
• Deferred resolution by ambiguity tolerance
• Anaphor resolution
• Focus-based resolution
• Transparent, correctable eager resolution
• Conversational goals
Design Goal [Interface mock-up. Panels: Interlocutor; History / Status; Text / Template Entry; Suggestions for Efficient Knowledge Entry; Graphical View of Added Knowledge. Example questions shown: What is “The World Health Organization”? What is “Severe Acute Respiratory Syndrome”? By what medium was the announcement made? What type of pathogen was the new pathogen?]
Parse Pipeline: Natural Language → Syntactic Parser → Parse Tree → Semantic Parser → Underspecified Semantic Formula → Reformulator → Assertible Semantic Formula → Knowledge Base
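The three stages of the parse pipeline can be pictured as composed functions. The Python sketch below is a toy under stated assumptions, not Cyc's parser: every function name, the one-pattern grammar ("<Subject> is from <Place>"), and the `choose` callback standing in for the user's disambiguation choice are hypothetical. Only the predicate names come from the Phrase Disambiguator slide.

```python
from typing import Callable, Dict, List, Tuple

Formula = Tuple[str, ...]

def syntactic_parse(sentence: str) -> Dict[str, str]:
    # Toy syntactic parser: handles only "<Subject> is from <Place>".
    subject, place = sentence.rstrip(".").split(" is from ")
    return {"subject": subject, "prep": "from", "object": place}

def semantic_parse(tree: Dict[str, str]) -> Formula:
    # Map the parse tree onto an underspecified semantic formula.
    return ("from-UnderspecifiedLocation", tree["subject"], tree["object"])

# Candidate refinements, taken from the Phrase Disambiguator slide.
REFINEMENTS: Dict[str, List[str]] = {
    "from-UnderspecifiedLocation": ["residesInRegion", "originallyFromRegion"],
}

def reformulate(formula: Formula,
                choose: Callable[[List[str]], str]) -> Formula:
    # Replace an underspecified predicate with the chosen assertible one.
    pred, *args = formula
    options = REFINEMENTS.get(pred, [pred])
    return (choose(options), *args)

def run_pipeline(sentence: str, kb: List[Formula],
                 choose: Callable[[List[str]], str]) -> Formula:
    formula = reformulate(semantic_parse(syntactic_parse(sentence)), choose)
    kb.append(formula)  # stand-in for asserting into the knowledge base
    return formula

kb: List[Formula] = []
# Here the "user" always picks the last candidate offered.
result = run_pipeline("JerryAllison is from Hillsboro.", kb,
                      lambda opts: opts[-1])
print(result)
```

Keeping the disambiguation choice as a callback leaves room for either an eager default or a deferred question to the user, which are the two styles of resolution the later slides mention.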
Advanced parsing and discourse modeling
• Interaction with Cyc should be a conversation that the user controls. Cyc’s responses should be:
• Relevant – Track thread and overall goals
• Correctable – Detect, examine, correct
• Learned – Do better next time
• Move to declarative representation and use inference more
• Parsing and resolution rules are more transparent, flexible, and learnable
• Deferred and inter-sentential resolution with ambiguity tolerance
• User/Cyc relationship as teacher/student (or student/teacher)