Cyc Patrick McCauley CMSC 691S - Semantic Web Spring 2009
Why do we need Cyc? • Evolution of rule-based expert systems • MYCIN - diagnosis of blood infections • DENDRAL - chemical analysis • Rule-based systems are brittle • Cannot detect typos • Only work within a specific domain • Brittleness due to lack of “common sense”
What is Cyc? • All apps can benefit from common sense • Begun in 1984 by Doug Lenat • Initial goals: • KB and Ontology Building (pump priming) • NL Understanding / Interactive Dialog
Vocab 101 • Knowledge - underlying heuristics that allow us to reason • Data - Facts or statements about specific items in the world
Where to start? • How much does a system need to know in order to be useful? • What kinds of knowledge are necessary? • How should this knowledge be represented?
Priming the Pump • Need to encode basic, common sense knowledge “representing human consensus reality” - insulting to state these facts to another person • E.g. “You have ten fingers.” • Assumes ten is a number and that a person has a specific number of fingers. • E.g. “Cardinals are red.” • Assumes that cardinals are a type of bird and that birds have feathers which, in this case, are red. Also assumes “red” is a color.
As data grows, so do inconsistencies • Too much data gives rise to inconsistencies • Microtheory • Internally consistent data module • Explicitly represented logical context • Cyc knows or is told which MTs should be used to solve a problem
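The microtheory idea above can be sketched in a few lines of Python. This is only an illustration, not Cyc's implementation: the class `KnowledgeBase`, the microtheory names `EverydayWorldMt` and `VampireFictionMt`, and the predicate `drinksBlood` are all hypothetical. Each assertion lives in one named context, and a query only consults the contexts it is told to use.

```python
class KnowledgeBase:
    """Toy KB in which every assertion is stored inside a named microtheory."""

    def __init__(self):
        self._by_mt = {}  # microtheory name -> set of facts

    def assert_in(self, mt, fact):
        """Record a fact inside a single microtheory."""
        self._by_mt.setdefault(mt, set()).add(fact)

    def holds(self, fact, mts):
        """True if the fact is asserted in any of the microtheories consulted."""
        return any(fact in self._by_mt.get(mt, set()) for mt in mts)


kb = KnowledgeBase()
# Hypothetical contexts and facts, chosen only for illustration.
kb.assert_in("EverydayWorldMt", ("isa", "Dracula", "FictionalCharacter"))
kb.assert_in("VampireFictionMt", ("drinksBlood", "Dracula"))

fact = ("drinksBlood", "Dracula")
in_fiction = kb.holds(fact, ["VampireFictionMt"])
in_everyday = kb.holds(fact, ["EverydayWorldMt"])
```

Because the caller chooses which microtheories to consult, the fictional claim never leaks into everyday reasoning, and each context can stay internally consistent.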
How to represent knowledge? • CycL - augmented first-order predicate calculus (FOPC) • Each assertion in the KB carries a “truth value” • Monotonically false • Default false • Unknown • Default true • Monotonically true
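The five truth values can be modeled as an ordered enumeration. The sketch below is an assumption-laden illustration: the `combine` function and its policy (a monotonic value beats a default, two conflicting defaults cancel to unknown) are invented here to show why the default/monotonic distinction matters, and are not taken from Cyc's actual argumentation machinery.

```python
from enum import IntEnum


class TruthValue(IntEnum):
    """The five truth values from the slide; magnitude encodes strength."""
    MONOTONICALLY_FALSE = -2
    DEFAULT_FALSE = -1
    UNKNOWN = 0
    DEFAULT_TRUE = 1
    MONOTONICALLY_TRUE = 2


def combine(a, b):
    """Hypothetical policy for merging two truth values of one assertion."""
    if abs(a) > abs(b):
        return a  # the stronger value wins
    if abs(b) > abs(a):
        return b
    if a == b:
        return a  # both agree
    if abs(a) == 2:
        # Two monotonic values that disagree: a genuine contradiction.
        raise ValueError("contradictory monotonic assertions")
    return TruthValue.UNKNOWN  # conflicting defaults cancel out


# "Birds fly" holds by default, but "penguins do not fly" holds monotonically.
penguin_flies = combine(TruthValue.DEFAULT_TRUE, TruthValue.MONOTONICALLY_FALSE)
```

Under this assumed policy a default can always be overridden by an exception, while a monotonic assertion can never be overridden.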
What about external data? • SKSI - Semantic Knowledge Source Integration • CycL used to describe external DB columns
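To make "describing external DB columns" concrete, here is a rough Python sketch using an in-memory SQLite table. It does not use the real SKSI machinery: the table `country`, the `COLUMN_MAPPING` structure, and the function `rows_as_assertions` are hypothetical, and the output strings merely imitate CycL syntax. The idea is that once columns are tied to a predicate's argument positions, each row can be read as an assertion.

```python
import sqlite3

# A stand-in for an external database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, capital TEXT)")
conn.execute("INSERT INTO country VALUES ('France', 'Paris')")

# Hypothetical description of the table: which columns fill which
# argument positions of which predicate.
COLUMN_MAPPING = {
    "table": "country",
    "predicate": "capitalCity",
    "arg_columns": ("name", "capital"),
}


def rows_as_assertions(connection, mapping):
    """Yield one CycL-style assertion string per row of the mapped table."""
    columns = ", ".join(mapping["arg_columns"])
    query = f"SELECT {columns} FROM {mapping['table']}"
    for row in connection.execute(query):
        args = " ".join(f"#${value}" for value in row)
        yield f"(#${mapping['predicate']} {args})"


assertions = list(rows_as_assertions(conn, COLUMN_MAPPING))
```

The data stays in the external database; only the description of its columns has to be written down once.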
Natural Language Processing • Extremely difficult since human speech/language is ridiculously complex • Written text often violates proper grammar, but its meaning is understood by humans • Fred saw the plane flying over Zurich. • Fred saw the mountains flying over Zurich.
CycNL to the rescue! • Lexicon - “contains syntactic and semantic information about English words” • Relationships between English words and Cyc constants are stored
CycNL - Syntactic Parser • Uses a phrase-structure grammar, context free rules • Builds multiple tree structures for each phrase/sentence • However, some trees do not make “syntactic” sense
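The point about multiple trees can be illustrated with a tiny context-free parser in Python. The grammar, lexicon, and function name `all_parses` below are toy assumptions made up for this sketch and are not CycNL's actual rules. The parser enumerates every tree the rules allow for the two Zurich sentences.

```python
from functools import lru_cache

# Toy phrase-structure rules (hypothetical, for illustration only).
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Name",), ("Det", "N"), ("NP", "GerP")],
    "VP": [("V", "NP"), ("VP", "GerP")],
    "GerP": [("Ger", "PP")],
    "PP": [("P", "NP")],
}
LEXICON = {
    "Fred": "Name", "Zurich": "Name", "saw": "V", "the": "Det",
    "plane": "N", "mountains": "N", "flying": "Ger", "over": "P",
}


def all_parses(sentence, start="S"):
    """Return every parse tree the toy grammar allows for the sentence."""
    tokens = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def parse(symbol, i, j):
        trees = []
        # A single word whose lexical category matches the symbol.
        if j - i == 1 and LEXICON.get(tokens[i]) == symbol:
            trees.append((symbol, tokens[i]))
        for rule in GRAMMAR.get(symbol, ()):
            if len(rule) == 1:
                trees.extend((symbol, sub) for sub in parse(rule[0], i, j))
            else:
                left_sym, right_sym = rule
                for k in range(i + 1, j):  # every split point of the span
                    for left in parse(left_sym, i, k):
                        for right in parse(right_sym, k, j):
                            trees.append((symbol, left, right))
        return tuple(trees)

    return parse(start, 0, len(tokens))


plane_trees = all_parses("Fred saw the plane flying over Zurich")
mountain_trees = all_parses("Fred saw the mountains flying over Zurich")
```

Both sentences receive the same number of trees: one attaches "flying over Zurich" to the noun phrase and one to the verb phrase. Syntax alone cannot tell that mountains do not fly, which is why a later semantic stage is needed.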
CycNL - Semantic Interpreter • Transforms results into CycL formulas • Result is “pure” CycL
How is this useful to Humans? • Ambient Research Assistant • flexibility and ease of communication are key • Must be capable of “learning” • Deciding what facts to learn • Learning those facts • Learning of rules • Generalizing rules • Testing and revision
Benefits of Assistant • Capable of searching much faster than humans • Availability - supersedes the 9-to-5
“Truly Intelligent” Assistant • Plan Recognition • Learning • NL
Acknowledgements • CYC Website • http://www.cyc.com/ • CYC: A Large-Scale Investment in Knowledge Infrastructure • http://www.csee.umbc.edu/691s/papers/cyc95.pdf • Mapping Ontologies into Cyc • http://www.csee.umbc.edu/691s/papers/mapping-ontologies-into-cyc_v31.pdf • Common Sense Reasoning – From Cyc to Intelligent Assistant • http://www.csee.umbc.edu/691s/papers/FromCycToIntelligentAssistant-IJHCS-LNAI3864.pdf
CycL is Cyc’s language • "Bill Clinton belongs to the collection of U.S. presidents": (#$isa #$BillClinton #$UnitedStatesPresident) • "All trees are plants": (#$genls #$Tree-ThePlant #$Plant) • "Paris is the capital of France": (#$capitalCity #$France #$Paris) • A fact about sets: (#$implies (#$and (#$isa ?OBJ ?SUBSET) (#$genls ?SUBSET ?SUPERSET)) (#$isa ?OBJ ?SUPERSET))
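The "fact about sets" rule can be tried out with a small forward-chaining sketch in Python. This is not how Cyc's inference engine works internally; the function `infer_isa` is hypothetical, as are the individual `OakInMyYard` and the collection `Person`. It simply applies the rule (isa ?OBJ ?SUBSET) and (genls ?SUBSET ?SUPERSET) implies (isa ?OBJ ?SUPERSET) until nothing new can be derived.

```python
def infer_isa(isa_facts, genls_facts):
    """Apply the isa/genls rule repeatedly until a fixed point is reached."""
    derived = set(isa_facts)
    changed = True
    while changed:
        changed = False
        for obj, subset in list(derived):
            for sub, superset in genls_facts:
                if sub == subset and (obj, superset) not in derived:
                    derived.add((obj, superset))
                    changed = True
    return derived


# (object, collection) pairs; OakInMyYard is a made-up individual.
isa_facts = {
    ("BillClinton", "UnitedStatesPresident"),
    ("OakInMyYard", "Tree-ThePlant"),
}
# (subcollection, supercollection) pairs; the Person link is assumed.
genls_facts = {
    ("Tree-ThePlant", "Plant"),
    ("UnitedStatesPresident", "Person"),
}

all_isa = infer_isa(isa_facts, genls_facts)
```

After the loop finishes, `all_isa` contains the original assertions plus the derived memberships, such as the oak being a plant.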