Linguistic Knowledge Representation

Scott Farrar, Department of Linguistics

Overview of the Research • goal: to visualize scenes described by natural language

Example
• input: “John’s hand is in his pocket.”
• output: propositional description of scene
    (exists ?b background)
    (exists ?p person)
    (exists ?s shirt)
    (in ?p ?b)
    (vertical ?p)
    (wears ?s ?p)
    (contains (pocket ?c) (hand ?p))

Overview of the Research • goal: to visualize scenes described by natural language • subgoal: to represent the meaning of natural language on a computer • subgoal: to access the visual information about the entities in the domain • focus: deal with lexical ambiguity

Lexical Ambiguity
When two words have the same form:
• The book is on the edge of the table. [area]
  The edge of the table is sharp. [line]
• The park is five blocks away.
  Kids like to play with blocks.
• Drums are fine instruments.
  Oil is stored in 55-gallon drums.
• The middle (of the bench) is wet. [center-part]
  Put the pan in the middle (between the bowls). [space-between]
• To vote “YES” check the upper box. [2d]
  Put your hand in the box. [3d]

Problems to Overcome • Specifying the relationship between linguistic and other forms of knowledge.

Knowledge Sources

Example
L, CS, V  John’s hand is in his pocket.
L  John owns the hand.
CS  The hand is physically attached to John.
L, CS  The hand is physically contained in the pocket, not the other way around.
V  A hand is smaller than a pocket.
CS  John’s hand is not in Bill’s pocket.
CS  John wants his hand to be in his pocket.
L  This event is occurring now.
L, CS, V  Hand is a body-part, not a person.
L, CS, V  A pocket is a container in clothing.

Problems to Overcome • Specifying the relationship between linguistic and other forms of knowledge. • Dealing with ambiguity and other issues of natural language processing (NLP).

Natural Language Processing
• sentence: “The king gave the people bread.”
• knowledge sources: lexicon, grammar (syntax, semantics), other knowledge
• syntax: The/DT king/N gave/VB the/DT people/N bread/N
  (the king) (gave) (the people) (bread)
• semantics: give: tense: past, agent: the king, recipient: the people, theme: bread
• other knowledge: The people have bread. The people ate the bread.

What is Meaning? • Symbols, Representation, Extensions [010011101] moon

What is Meaning? • Meaning derives from words. • Meaning derives from structure. • Meaning derives from context. Focus on the Lexicon

The Computational Lexicon • form—what data structures comprise the lexicon? • organization—how are the data structures organized? • content—what information is contained in the data structures?

Form of the Lexicon • Features—Katz and Fodor (1963): knowledge is a conjunction of features (monadic predicates) bachelor (x)→ unmarried (x) & male (x) & young (x)
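The feature view above can be sketched in a few lines of Python. This is only an illustration: the `FEATURES` table and the `satisfies` helper are hypothetical names, and treating a conjunction of monadic predicates as a set of feature labels is an assumption made for brevity.

```python
# Hypothetical feature lexicon: each word maps to the set of
# monadic features that must all hold of an entity.
FEATURES = {
    "bachelor": {"unmarried", "male", "young"},
}

def satisfies(word, entity_features):
    """Return True if the entity has every feature the word requires."""
    return FEATURES[word] <= set(entity_features)

# An invented entity description, used only for the demonstration.
john = {"unmarried", "male", "young", "human"}
print(satisfies("bachelor", john))
```

Because the entry is a conjunction, dropping any one feature from the entity makes the check fail.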

Form of the Lexicon
• Frames (Minsky, 1975): knowledge is organized around concepts
    give: <agent Person> <recipient Person> <theme PhysicalObject>
    (each pair is <slot value>)
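A frame for "give" might be represented as below. This is a minimal sketch, not Minsky's formalism: `Slot`, `fill`, and the `ENTITY_TYPES` table are hypothetical, and the type assignments for the fillers are assumed for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Slot:
    name: str        # slot name, e.g. "agent"
    value_type: str  # type required of the filler, e.g. "Person"

# The "give" frame with the three slots shown on the slide.
GIVE = (
    Slot("agent", "Person"),
    Slot("recipient", "Person"),
    Slot("theme", "PhysicalObject"),
)

# Assumed type assignments for the fillers (not from the source).
ENTITY_TYPES = {
    "the king": "Person",
    "the people": "Person",
    "bread": "PhysicalObject",
}

def fill(frame, fillers):
    """Bind each slot to its filler, rejecting fillers of the wrong type."""
    bound = {}
    for slot in frame:
        filler = fillers[slot.name]
        if ENTITY_TYPES.get(filler) != slot.value_type:
            raise TypeError(f"{filler!r} cannot fill slot {slot.name!r}")
        bound[slot.name] = filler
    return bound

event = fill(GIVE, {"agent": "the king",
                    "recipient": "the people",
                    "theme": "bread"})
print(event)
```

The type check is what lets a frame reject readings such as bread acting as the agent of giving.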

KL-ONE • Brachman 1979 • Description logic (subsumption, classification) • Distinguishes between conceptual and instance knowledge • Inheritance • No defaults • KRYPTON, LOOM, CLASSIC

KRYPTON • Brachman, Fikes, and Levesque (1983) • Combines frame language with power of a logic or programming language • Terminology (T-box) – hierarchy of concepts • Assertions (A-box) – descriptors for individual objects

Organization of the Lexicon
• Semantic network (Quillian, 1966): knowledge is interconnected
    animal has-part lungs
    bird Sub animal
    bird has-part feathers
    tweety Inst bird
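One way to see what the links buy is to walk them. The sketch below assumes the network on this slide reads as: bird is a subclass (Sub) of animal, tweety is an instance (Inst) of bird, animal has-part lungs, and bird has-part feathers. The dictionaries and the `parts_of` function are hypothetical names.

```python
# Assumed reading of the slide's network.
SUB = {"bird": "animal"}                    # subclass links
INST = {"tweety": "bird"}                   # instance links
HAS_PART = {"animal": {"lungs"}, "bird": {"feathers"}}

def parts_of(node):
    """Collect has-part values by following Inst and then Sub links upward."""
    parts = set()
    current = INST.get(node, node)  # start at the class of an instance
    while current is not None:
        parts |= HAS_PART.get(current, set())
        current = SUB.get(current)  # None once the top is reached
    return parts

print(sorted(parts_of("tweety")))
```

Nothing is stored on tweety directly; both parts are inherited through the links.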

Hierarchy of Concepts Artifact Machine Tool Motorized-Machine Non-motorized-Machine … … automobile drill loom windmill

Problems to Overcome • Specifying the relationship between linguistic and other forms of knowledge. • Dealing with ambiguity and other issues of natural language processing (NLP). • Determining what role visual knowledge of objects and events has in the disambiguation process.

Content of the Lexicon dog Purely linguistic: “dog”, [da:g] dog collar not *collar dog dog+PL = dogs not *doges

Content of the Lexicon dog Commonsense: has-part (dog, tail) makes-noise (dog, “bark”) disjoint (dog, cat) likes (Scott, terrier)

Content of the Lexicon dog Visual: shape: size: color: texture:

WordNet (see Demo)

Underlying Problem • concepts—hierarchical, definitional, dictionary-like • instances—factual, encyclopedia-like meaning is distributed and integrated

Goals of the Present Research • Construct a visual scene from natural language. • focus on lexical disambiguation • access and use the visual knowledge associated with lexical items • argue for an integrated approach to the design of the lexicon

Formalization of the Problem
• input: a list of well-formed English utterances U, where U={u1,u2,u3,…,un}, |U|≥1, and U can be interpreted as a complete visual scene. Each member of U, ui, is a list of strings {s1,s2,…,sn}, where |ui|≥1.
    {John is standing on the bridge. John has his hands in his pockets. John is wearing a cap.}

Formalization of the Problem (cont.)
• output: VU, such that VU is a visual scene based on U consisting of a 3-tuple <I, O, R>, where I is a set of icons, O is a set of orientations for the icons, and R is a set of relations between the icons.
    (exists ?b background)
    (in ?p ?b)
    (exists ?p person)
    (vertical ?p)
    (exists ?s shirt)
    (wears ?s ?p)
    (contains (pocket ?c) (hand ?p))
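The 3-tuple <I, O, R> could be held in a small data structure like the one below. This is a sketch under a stated assumption: the slide does not say how propositions are assigned to I, O, and R, so the rule used here (exists goes to I, one-argument predicates go to O, everything else goes to R) is a guess made for illustration. `Scene` and `build_scene` are hypothetical names.

```python
from typing import NamedTuple

class Scene(NamedTuple):
    icons: frozenset         # I
    orientations: frozenset  # O
    relations: frozenset     # R

def build_scene(propositions):
    """Sort propositions into I, O, R using the assumed rule above."""
    icons, orientations, relations = set(), set(), set()
    for prop in propositions:
        predicate, *args = prop
        if predicate == "exists":
            icons.add(prop)
        elif len(args) == 1:
            orientations.add(prop)
        else:
            relations.add(prop)
    return Scene(frozenset(icons), frozenset(orientations),
                 frozenset(relations))

# The propositions from the slide, written as nested tuples.
PROPS = [
    ("exists", "?b", "background"),
    ("in", "?p", "?b"),
    ("exists", "?p", "person"),
    ("vertical", "?p"),
    ("exists", "?s", "shirt"),
    ("wears", "?s", "?p"),
    ("contains", ("pocket", "?c"), ("hand", "?p")),
]

scene = build_scene(PROPS)
print(len(scene.icons), len(scene.orientations), len(scene.relations))
```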

Required Knowledge Components • A grammar G consisting of <L,…>, where L (the lexicon) is a set of lexical knowledge structures {w1,w2,…,wn} such that si in ui is orthoForm(wi). orthoForm() is a function that maps a lexical knowledge structure w to a string s, the orthographic form of w.

Formalization of Lexical Ambiguity A word wm in L is lexically ambiguous if there is some other word wn in L, with wn ≠ wm, such that orthoForm(wm)=orthoForm(wn).
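The definition translates directly into a check over the lexicon: group entries by orthographic form and keep the forms shared by more than one entry. In this sketch `LexEntry`, `ortho_form`, and `ambiguous_forms` are hypothetical names, and the sense labels are borrowed from the earlier ambiguity examples.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class LexEntry:
    ortho: str  # orthographic form
    sense: str  # sense label, for illustration only

def ortho_form(w):
    """Map a lexical knowledge structure to its orthographic form."""
    return w.ortho

def ambiguous_forms(lexicon):
    """Return the forms shared by two or more distinct lexicon entries."""
    by_form = defaultdict(list)
    for w in lexicon:
        by_form[ortho_form(w)].append(w)
    return {form: ws for form, ws in by_form.items() if len(ws) > 1}

LEXICON = [
    LexEntry("edge", "area"),
    LexEntry("edge", "line"),
    LexEntry("box", "2d"),
    LexEntry("box", "3d"),
    LexEntry("pocket", "container"),
]

print(sorted(ambiguous_forms(LEXICON)))
```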

Required Knowledge Components (cont.) • An ontology Ont consisting of <C, R> where C is the set of concepts in the domain and R is the set of relations over C. • A knowledge base KB consisting of <A, Ont> such that A is a set of assertions {a1, a2,…,an} about entities in the ontology Ont.
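The two tuples can be mirrored as plain data structures. The following is a sketch only: the class names, the `ill_formed` method, and the encoding of an assertion as a (relation, argument, ...) tuple are assumptions, and the sample concepts are taken from the dog examples earlier in the talk.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ontology:
    concepts: frozenset   # C
    relations: frozenset  # R

@dataclass(frozen=True)
class KnowledgeBase:
    assertions: tuple     # A, each assertion a (relation, arg, ...) tuple
    ontology: Ontology    # Ont

    def ill_formed(self):
        """List assertions that use a relation or concept missing from Ont."""
        bad = []
        for relation, *args in self.assertions:
            known_args = all(a in self.ontology.concepts for a in args)
            if relation not in self.ontology.relations or not known_args:
                bad.append((relation, *args))
        return bad

ont = Ontology(frozenset({"dog", "cat", "tail"}),
               frozenset({"has-part", "disjoint"}))
kb = KnowledgeBase((("has-part", "dog", "tail"),
                    ("disjoint", "dog", "cat"),
                    ("likes", "Scott", "terrier")), ont)
print(kb.ill_formed())
```

Here the likes assertion is flagged because neither the relation nor its arguments were declared in this small ontology.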

A Concept in the Ontology
(subclass Substance SelfConnectedObject)
(documentation Substance "An &%Object in which every part is similar to every other in every relevant respect.")
(=>
    (and
        (subclass ?OBJECTTYPE Substance)
        (instance ?OBJECT ?OBJECTTYPE)
        (part ?PART ?OBJECT))
    (instance ?PART ?OBJECTTYPE))
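The implication above says that any part of an instance of a Substance subclass is itself an instance of that subclass. A naive forward-chaining sketch of that one rule follows; the function name and the sample facts (Water, puddle1, and so on) are invented for illustration and are not from the ontology.

```python
def apply_substance_rule(subclass, instance, part):
    """Derive instance facts with the Substance rule until nothing changes.

    subclass: set of (child_type, parent_type) pairs
    instance: set of (object, type) pairs
    part:     set of (part, whole) pairs
    """
    derived = set(instance)
    changed = True
    while changed:
        changed = False
        for p, whole in part:
            for obj, obj_type in list(derived):
                if (obj == whole
                        and (obj_type, "Substance") in subclass
                        and (p, obj_type) not in derived):
                    derived.add((p, obj_type))
                    changed = True
    return derived

# Invented facts: Water is a Substance, Automobile is not.
subclass = {("Water", "Substance")}
instance = {("puddle1", "Water"), ("car1", "Automobile")}
part = {("drop1", "puddle1"), ("droplet1", "drop1"), ("wheel1", "car1")}

facts = apply_substance_rule(subclass, instance, part)
print(sorted(facts))
```

The rule applies transitively to parts of parts, but it never fires for car1, because Automobile is not declared a subclass of Substance.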

An Assertion in the KB
(=> (forall ?x bridge)
    (exists ?y ?z location)
    (connects ?x ?y ?z))
(=> (forall ?x bridge)
    (size ?x large))

Visual Assertions in KB (yet to be completed) • Shape • Size • Color • Orientation • Texture

Conclusion • Word sense disambiguation is a challenge to NLP. • NLP can benefit from a knowledge-rich approach. • A combination of visual and other commonsense assertions can enrich the lexicon. • An enriched lexicon is required for visualization of scenes described by NL.
