
CYC: A Large-Scale Investment in Knowledge Infrastructure

Douglas B. Lenat. Presenter: Cristina Nicolae.



Presentation Transcript


  1. CYC: A Large-Scale Investment in Knowledge Infrastructure
     • Douglas B. Lenat
     • Presenter: Cristina Nicolae

  2. CYC
     • An expert system that encodes, as axioms, knowledge of everyday objects and actions, such as:
       • You have to be awake to eat.
       • You can usually see people’s noses, but not their hearts.
       • Given two professions, either one is a specialization of the other or else they are likely to be independent of one another.
       • You cannot remember events that have not happened yet.
       • If you cut a lump of peanut butter in half, each half is also a lump of peanut butter; but if you cut a table in half, neither half is a table.
     • These assertions embody knowledge the CYC authors safely assume is already known about the world.

  3. CYC
     • Makes tasks such as coreference resolution easy:
       • “The police arrested the demonstrators because they feared violence”
       • vs. “The police arrested the demonstrators because they advocated violence”
     • or word sense disambiguation:
       • “The box is in the pen”
       • vs. “The pen is in the box”
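
The pen/box example can be sketched as a toy sense-selection routine. This is only an illustration of the idea that one piece of common sense (a container is larger than what it holds) picks the right sense of "pen"; it is not CYC's representation or inference method. The sense labels, the sizes in `TYPICAL_SIZE_M`, and the function `plausible_readings` are all assumptions made up for this sketch.

```python
# Hypothetical typical sizes in metres (illustrative values, not CYC data).
TYPICAL_SIZE_M = {
    "box": 0.5,
    "pen/writing-instrument": 0.15,
    "pen/enclosure": 3.0,
}

# Hypothetical sense inventory for the two words in the example.
SENSES = {
    "box": ["box"],
    "pen": ["pen/writing-instrument", "pen/enclosure"],
}


def plausible_readings(contained, container):
    """Return (inner, outer) sense pairs where the container is the larger object."""
    readings = []
    for inner in SENSES[contained]:
        for outer in SENSES[container]:
            # Common-sense constraint: a container is bigger than its contents.
            if TYPICAL_SIZE_M[outer] > TYPICAL_SIZE_M[inner]:
                readings.append((inner, outer))
    return readings


# "The box is in the pen" vs. "The pen is in the box"
box_in_pen = plausible_readings("box", "pen")
pen_in_box = plausible_readings("pen", "box")
print(box_in_pen)
print(pen_in_box)
```

Under these assumed sizes, only the enclosure sense survives for "The box is in the pen" and only the writing-instrument sense for "The pen is in the box".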

  4. CYC – Codifying Common Sense

  5. Issues and Lessons Learned
     • Assertions are true only by default (and in certain contexts).
       • You can usually see people’s noses, but not their hearts.
       • But a heart can be seen during surgery.
     • How likely is it for assertions to be true?
       • We don’t know the probabilities precisely (dozens of people, hundreds of thousands of rules) → avoid numeric certainty factors.
       • Each assertion is true by default, and there are additional meta-assertions: Assertion A is less likely than assertion B.
     • Use first-order predicate calculus with a series of second-order extensions (instead of a frame-and-slot language).
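
A minimal sketch of the two ideas on this slide: assertions that hold by default but can be overridden in a context, and comparative meta-assertions ("A is less likely than B") in place of numeric certainty factors. The class `DefaultKB`, its fields, and the assertion names are hypothetical and chosen for illustration; CYC itself uses a predicate-calculus language, not Python dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class DefaultKB:
    # (subject, property) -> default truth value
    defaults: dict = field(default_factory=dict)
    # (subject, property, context) -> truth value overriding the default
    exceptions: dict = field(default_factory=dict)
    # (a, b) means "assertion a is less likely than assertion b"
    less_likely: set = field(default_factory=set)

    def holds(self, subject, prop, context=None):
        """Use a context-specific exception if one exists, else the default."""
        override = self.exceptions.get((subject, prop, context))
        if override is not None:
            return override
        return self.defaults.get((subject, prop))

    def is_less_likely(self, a, b, _seen=None):
        """Follow 'less likely than' links transitively; no numbers involved."""
        seen = _seen if _seen is not None else set()
        if (a, b) in self.less_likely:
            return True
        seen.add(a)
        return any(
            self.is_less_likely(mid, b, seen)
            for (lo, mid) in self.less_likely
            if lo == a and mid not in seen
        )


kb = DefaultKB()
kb.defaults[("nose", "visible")] = True
kb.defaults[("heart", "visible")] = False
kb.exceptions[("heart", "visible", "surgery")] = True

# Hypothetical assertion names, ordered only relative to each other.
kb.less_likely.add(("A", "B"))
kb.less_likely.add(("B", "C"))

print(kb.holds("heart", "visible"))
print(kb.holds("heart", "visible", context="surgery"))
print(kb.is_less_likely("A", "C"))
```

The ordering is purely relative: nothing in the sketch needs a probability, which mirrors the slide's point that precise numbers are not available when many people write very many rules.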

  6. CYC – Numbers
     • 10^6 general assertions in CYC’s knowledge base
     • 10^5 atomic terms (basic concepts) in the vocabulary
     • The exact numbers are not important:

  7. Commercial Applications (1/2)
     • Information retrieval
       • detailed user models (hobbies, job, family status, values, personality, ...)
       • integrating heterogeneous external information sources, so that users can find information without knowing how it is stored
       • examining retrieved data, recognizing inconsistencies, contradictions with other sources, violations of common sense
     • Word processing
       • catching words misspelled as other valid words
       • grammar checking
       • content checking, possible in the future (“Later, we will address…”)
       • fleshing out incomplete (even outlined) sentences and incomplete bibliographic references

  8. Commercial Applications (2/2)
     • Simulations
       • greater fidelity of behavior of simulated agents
       • role-playing games: computer characters have hobbies, jobs, social cliques, chores, memories, factual knowledge; they change moods
     • Speech recognition and NLU
       • final “sanity check” on the transcribed sentence
       • generate captions for email messages based on an understanding of the message body

  9. Vaughan Pratt’s CYC Report – 1994
     • CYC demos
       • consistency check of relational databases from different sources
         • peaceful/violent, communist/capitalist, date of birth
       • retrieving online images by caption
         • “someone relaxing” → 3 men in beachwear holding surfboards
         • “someone at risk for skin cancer” → girl reclining on a beach
         • non-monotonicity (reclining on beach → beach umbrella → umbrella broken → cloudy), but the image database doesn’t contain any examples
         • “a tree” did not retrieve “A girl with presents in front of a Christmas tree” (since fixed)
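
The database consistency demo can be pictured with a small sketch: two sources describe the same entities, and background knowledge says that certain labels (peaceful/violent, communist/capitalist) cannot both apply. How the actual demo was implemented is not described on the slide, so the data layout, the entity names, and the function `find_conflicts` below are assumptions for illustration only.

```python
# Background knowledge: pairs of labels that should not apply to the same entity.
MUTUALLY_EXCLUSIVE = [("peaceful", "violent"), ("communist", "capitalist")]


def find_conflicts(source_a, source_b):
    """Return (entity, label_1, label_2) wherever the two sources, taken together,
    assign mutually exclusive labels to the same entity."""
    conflicts = []
    for entity in source_a.keys() & source_b.keys():
        labels = source_a[entity] | source_b[entity]
        for first, second in MUTUALLY_EXCLUSIVE:
            if first in labels and second in labels:
                conflicts.append((entity, first, second))
    return sorted(conflicts)


# Hypothetical records from two independently maintained databases.
db_one = {"GroupX": {"peaceful", "capitalist"}, "GroupY": {"communist"}}
db_two = {"GroupX": {"violent"}, "GroupY": {"communist", "peaceful"}}

conflicts = find_conflicts(db_one, db_two)
print(conflicts)
```

Neither database is inconsistent on its own here; the conflict for the hypothetical `GroupX` only appears once the sources are combined with the exclusivity knowledge.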

  10. CYC Report (contd)
      • Other aspects of CYC
        • CYC’s knowledge is expressed as axioms (currently half a million), obtained manually
        • staff = 22 individuals
        • 22 axioms per person per day

  11. CYC Report (contd) – Other experiments
      • CYC → bread is food, bread is edible stuff; but even after CYC was told that bread is not drink, it still didn’t return that result on a subsequent query
      • CYC - people don’t need food (because there is no axiom that goes from lack of food to death)
      • CYC → PlanetEarth is bigger than PlanetVenus, but it doesn’t know the exact size of the Earth
      • CYC → Earth has a sky, but it doesn’t know what color it is
      • CYC → cost of a car is between $6K and $80K; nothing else

  12. CYC Report (contd) – Conclusions
      • The bulk of the tester’s questions were well beyond CYC’s present grasp.
      • Expectations about the number of answerable questions and about general knowledge were disappointed.
      • But the impression was that CYC is well along the path to having comprehensive general knowledge. What is lacking: a quantitative measure of how far along.
