Searching for Common Sense: Populating Cyc from the Web Presented by Yu-Chung Shen 2007/05/03
Introduction • In the last twenty years, over 3 million facts and rules have been entered manually into the Cyc knowledge base by ontologists. • Shouldn’t there be a better way? • Approach: automate the process of gathering and verifying facts from the World Wide Web.
Knowledge Acquisition from the WWW • Gathering information from the web proceeds in six stages: • Choosing queries • Searching (Google) • Parsing results • KB consistency checking • Google verification • Reviewing and asserting
Choosing Queries and Generating Search Strings • Queries are limited to a set of 134 binary predicates. • Search strings are generated from templates. • Example: [figure not preserved]
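A minimal sketch of template-based search-string generation, assuming one small template table per predicate. The template wordings and the helper name `generate_search_strings` are hypothetical and chosen only for illustration; the predicate `foundingAgent` is the one that appears later in these slides.

```python
# Hypothetical templates keyed by predicate name. The wording of each
# template is an assumption, not the set used in the presented system.
TEMPLATES = {
    "foundingAgent": ["{name} founder", "{name} was founded by"],
}


def generate_search_strings(predicate, name):
    """Fill every template registered for `predicate` with the English
    name of the known argument. Unknown predicates yield an empty list."""
    return [t.format(name=name) for t in TEMPLATES.get(predicate, [])]


strings = generate_search_strings("foundingAgent", "Palestine Islamic Jihad")
for s in strings:
    print(s)
```

Keeping the templates in a plain table means that covering another predicate only requires adding an entry, not new code.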
Parsing Search Results into CycL Sentences • Example: [figure not preserved]
Checking Cyc KB Consistency • Facts that are redundant or contradictory are discarded via inference. • Example fact: (foundingAgent PalestineIslamicJihad AugusteRodin) • Cyc knows that AugusteRodin died in 1917. • Cyc knows that PIJ was founded in 1989. • The fact is contradictory, since the proposed founder died long before the organization was founded, so it is discarded.
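The check in this example can be sketched as a single hard-coded rule. This is only an illustration: Cyc reaches the conclusion through general inference over its KB, whereas the dictionaries and the function `is_consistent_founding` below are hypothetical stand-ins that hold just the two facts quoted on the slide.

```python
# Toy stand-ins for facts already in the KB (years as stated on the slide).
DEATH_YEAR = {"AugusteRodin": 1917}
FOUNDING_YEAR = {"PalestineIslamicJihad": 1989}


def is_consistent_founding(org, agent):
    """Return False when the agent is known to have died before the
    organization was founded; return True when nothing contradicts it."""
    died = DEATH_YEAR.get(agent)
    founded = FOUNDING_YEAR.get(org)
    if died is None or founded is None:
        return True  # no known dates, so no contradiction can be shown
    return died >= founded


candidate = ("foundingAgent", "PalestineIslamicJihad", "AugusteRodin")
keep = is_consistent_founding(candidate[1], candidate[2])
print("keep" if keep else "discard")
```

Note that a missing date is treated as "not contradicted", so this stage can only reject candidates; it cannot confirm them.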
Google Verification • Guards against parser errors. • Example: New fact: (foundingAgent PalestineIslamicJihad xasdawqeqw) • Search string: "PIJ founder xasdawqeqw"
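One way to picture the verification step is as a hit-count test on the generated search string. The sketch below assumes that a fact counts as verified when the search returns at least `min_hits` results; that threshold, the function names, and the stubbed counts are all assumptions made for illustration, and no real search API is called.

```python
from typing import Callable


def verify_fact(search_string: str,
                hit_count: Callable[[str], int],
                min_hits: int = 1) -> bool:
    """Treat a candidate fact as verified if its search string gets at
    least `min_hits` results (assumed criterion, not from the slides)."""
    return hit_count(search_string) >= min_hits


# Stub standing in for a search engine. The counts are invented.
FAKE_COUNTS = {"example query with results": 42}


def fake_hit_count(query: str) -> int:
    return FAKE_COUNTS.get(query, 0)  # unknown strings get zero hits


verified = verify_fact("PIJ founder xasdawqeqw", fake_hit_count)
print("verified" if verified else "unverified")
```

Passing the search function in as a parameter keeps the decision logic testable without network access.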
Review and Assertion • Learned sentences are reviewed by a human curator. • Sentences judged correct are asserted into the Cyc knowledge base.
Experimental Results • The majority of the searches expended, about 80%, were performed in the verification phase. The results were as follows: [table not preserved] (GAFs: ground atomic formulas, i.e. atomic sentences in the Cyc KB.)
Experimental Results • A human reviewer then went through the verified GAFs, and a sample of 53 of the unverified GAFs, and determined their actual correctness.
Conclusions • The work is immediately useful as a tool that makes human knowledge entry faster, easier, and more effective. • The longer-term hope is to provide Cyc with a mechanism to truly acquire knowledge by learning. • Q&A