Explore the goal of visualizing scenes through natural language input and overcoming lexical ambiguity, with a focus on linguistic knowledge representation and dealing with issues of natural language processing.
Linguistic Knowledge Representation Scott Farrar Department of Linguistics
Overview of the Research • goal: to visualize scenes described by natural language
Example • input: “John’s hand is in his pocket.” • output: propositional description of scene (exists ?b background) (exists ?p person) (exists ?s shirt) (in ?p ?b) (vertical ?p) (wears ?s ?p) (contains (pocket ?c) (hand ?p))
Overview of the Research • goal: to visualize scenes described by natural language • subgoal: to represent the meaning of natural language on a computer • subgoal: to access the visual information about the entities in the domain • focus: deal with lexical ambiguity
Lexical Ambiguity When two words have the same form: The book is on the edge of the table. [area] The edge of the table is sharp. [line] The park is five blocks away. Kids like to play with blocks. Drums are fine instruments. Oil is stored in 55-gallon drums. The middle (of the bench) is wet. [center-part] Put the pan in the middle (between the bowls). [space-between] To vote “YES” check the upper box. [2d] Put your hand in the box. [3d]
Problems to Overcome • Specifying the relationship between linguistic and other forms of knowledge.
Example L, CS, V John’s hand is in his pocket. L John owns the hand. CS The hand is physically attached to John. L, CS The hand is physically contained in the pocket, not the other way around. V A hand is smaller than a pocket. CS John’s hand is not in Bill’s pocket. CS John wants his hand to be in his pocket. L This event is occurring now. L, CS, V Hand is a body-part, not a person. L, CS, V A pocket is a container in clothing.
Problems to Overcome • Specifying the relationship between linguistic and other forms of knowledge. • Dealing with ambiguity and other issues of natural language processing (NLP).
Natural Language Processing • input: “The king gave the people bread.” • grammar: lexicon, syntax, semantics • tags: The/DT king/N gave/VB the/DT people/N bread/N • phrases: (the king) (gave) (the people) (bread) • semantics: give: tense: past; agent: the king; recipient: the people; theme: bread • other knowledge: The people have bread. The people ate the bread.
What is Meaning? • Symbols, Representation, Extensions [010011101] moon
What is Meaning? • Meaning derives from words. • Meaning derives from structure. • Meaning derives from context. Focus on the Lexicon
The Computational Lexicon • form—what data structures comprise the lexicon? • organization—how are the data structures organized? • content—what information is contained in the data structures?
Form of the Lexicon • Features—Katz and Fodor (1963): knowledge is a conjunction of features (monadic predicates) bachelor (x)→ unmarried (x) & male (x) & young (x)
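The feature view can be sketched in a few lines of Python. This is a minimal illustration, not the representation used in the research: it assumes an entity is just a set of feature names, and it treats the conjunction as a definition, which is stronger than the one-way implication on the slide. The names `FEATURE_DEFS` and `satisfies` are hypothetical.

```python
# Hypothetical feature lexicon: each word is defined as a conjunction of
# monadic features, following the bachelor example above.
FEATURE_DEFS = {
    "bachelor": {"unmarried", "male", "young"},
}


def satisfies(word, entity_features):
    """Return True if the entity has every feature the word requires."""
    return FEATURE_DEFS[word] <= set(entity_features)


# An entity with extra features still qualifies; one missing feature does not.
print(satisfies("bachelor", {"unmarried", "male", "young", "tall"}))
print(satisfies("bachelor", {"married", "male", "young"}))
```

Because the definition is a flat conjunction, there is no place to state relations between entities, which is one motivation for the frame and network representations on the following slides.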
Form of the Lexicon • Frames—Minsky (1975): knowledge is organized around concepts give: <agent Person> <recipient Person> <theme PhysicalObject> slots values
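A frame can be sketched as a named structure whose slots carry type restrictions. The following is an illustrative sketch only: the `Frame` class, its `fill` method, and the `types` lookup table are assumptions made for this example, and type checking is by exact match rather than by subsumption.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A concept with named slots, each restricted to a value type."""
    name: str
    slots: dict = field(default_factory=dict)  # slot name -> required type

    def fill(self, fillers, types):
        """Check fillers against the slot restrictions and return an instance.

        `types` maps each filler to its (assumed) type.
        """
        instance = {}
        for slot, required in self.slots.items():
            value = fillers[slot]
            if types[value] != required:
                raise TypeError(f"{slot}: {value!r} is not a {required}")
            instance[slot] = value
        return instance


# The give frame from the slide: slots on the left, value types on the right.
give = Frame("give", {"agent": "Person",
                      "recipient": "Person",
                      "theme": "PhysicalObject"})

# Hypothetical type assignments for the fillers in the example sentence.
types = {"the king": "Person",
         "the people": "Person",
         "bread": "PhysicalObject"}

event = give.fill({"agent": "the king",
                   "recipient": "the people",
                   "theme": "bread"}, types)
print(event)
```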
KL-ONE • Brachman 1979 • Description logic (subsumption, classification) • Distinguishes between conceptual and instance knowledge • Inheritance • No defaults • KRYPTON, LOOM, CLASSIC
KRYPTON • Brachman, Fikes, and Levesque (1983) • Combines frame language with power of a logic or programming language • Terminology (T-box) – hierarchy of concepts • Assertions (A-box) – descriptors for individual objects
Organization of the Lexicon • Semantic network, Quillian (1966): knowledge is interconnected • animal has-part lungs • bird Sub animal • bird has-part feathers • tweety Inst bird
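Inheritance in such a network can be sketched as a walk along the links. This example assumes the diagram reads as four labeled edges (animal has-part lungs, bird Sub animal, bird has-part feathers, tweety Inst bird); the `EDGES` list and the `parts_of` function are hypothetical names for illustration.

```python
# Edges as read from the slide's diagram: (source, relation, target).
EDGES = [
    ("animal", "has-part", "lungs"),
    ("bird", "Sub", "animal"),
    ("bird", "has-part", "feathers"),
    ("tweety", "Inst", "bird"),
]


def parts_of(node):
    """Collect has-part values for a node, inheriting along Inst and Sub links."""
    parts, seen, frontier = [], set(), [node]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for source, relation, target in EDGES:
            if source != current:
                continue
            if relation == "has-part":
                parts.append(target)
            elif relation in ("Inst", "Sub"):
                frontier.append(target)  # climb toward more general nodes
    return parts


print(parts_of("tweety"))
```

Nothing is stored on tweety directly; both feathers and lungs are reached by following the Inst link to bird and the Sub link to animal.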
Hierarchy of Concepts [tree diagram: Artifact at the root; inner nodes Machine, Tool, Motorized-Machine, Non-motorized-Machine, …; leaves automobile, drill, loom, windmill]
Problems to Overcome • Specifying the relationship between linguistic and other forms of knowledge. • Dealing with ambiguity and other issues of natural language processing (NLP). • Determining what role visual knowledge of objects and events has in the disambiguation process.
Content of the Lexicon dog Purely linguistic: “dog”, [da:g] dog collar not *collar dog dog+PL = dogs not *doges
Content of the Lexicon dog Commonsense: has-part (dog, tail) makes-noise (dog, “bark”) disjoint (dog, cat) likes (Scott, terrier)
Content of the Lexicon dog Visual: shape: size: color: texture:
WordNet (see Demo)
Underlying Problem • concepts—hierarchical, definitional, dictionary-like • instances—factual, encyclopedia-like meaning is distributed and integrated
Goals of the Present Research • Construct a visual scene from natural language. • focus on lexical disambiguation • access and use the visual knowledge associated with lexical items • argue for an integrated approach to the design of the lexicon
Formalization of the Problem • input: a list of well-formed English utterances $U$, where $U = \{u_1, u_2, u_3, \dots, u_n\}$, $|U| \geq 1$, and $U$ can be interpreted as a complete visual scene. Each member of $U$, $u_i$, is a list of strings $\{s_1, s_2, \dots, s_n\}$, where $|u_i| \geq 1$. Example: {John is standing on the bridge. John has his hands in his pockets. John is wearing a cap.}
Formalization of the Problem (cont.) • output: $V_U$, such that $V_U$ is a visual scene based on $U$ consisting of a 3-tuple $\langle I, O, R \rangle$ where $I$ is a set of icons, $O$ is a set of orientations for the icons, and $R$ is a set of relations between the icons. Example: (exists ?b background) (exists ?p person) (exists ?s shirt) (in ?p ?b) (vertical ?p) (wears ?s ?p) (contains (pocket ?c) (hand ?p))
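The input and output can be written down as plain data structures. This is a sketch under stated assumptions: the `VisualScene` class, the `is_well_formed` check, and the encoding of relations as (name, icon, icon) triples are hypothetical, and the scene is built by hand from the example propositions rather than derived from the utterances. The pocket and hand proposition is left out because the example declares no icons for them.

```python
from dataclasses import dataclass


@dataclass
class VisualScene:
    """The 3-tuple <I, O, R> from the formalization."""
    icons: set          # I: icon names
    orientations: dict  # O: icon -> orientation
    relations: set      # R: (relation, icon, icon) triples


def is_well_formed(utterances, scene):
    """Check |U| >= 1, |u_i| >= 1, and that O and R only mention icons in I."""
    if not utterances or any(len(u) < 1 for u in utterances):
        return False
    if not set(scene.orientations) <= scene.icons:
        return False
    return all(a in scene.icons and b in scene.icons
               for _, a, b in scene.relations)


# U: a non-empty list of utterances, each a non-empty list of strings.
U = [
    "John is standing on the bridge.".split(),
    "John has his hands in his pockets.".split(),
    "John is wearing a cap.".split(),
]

# Hand-built scene mirroring the propositional example.
scene = VisualScene(
    icons={"background", "person", "shirt"},
    orientations={"person": "vertical"},
    relations={("in", "person", "background"), ("wears", "shirt", "person")},
)

print(is_well_formed(U, scene))
```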
Required Knowledge Components • A grammar $G$ consisting of $\langle L, \dots \rangle$, where $L$ (the lexicon) is a set of lexical knowledge structures $\{w_1, w_2, \dots, w_n\}$ such that $s_i$ in $u_i$ is orthoForm($w_i$). orthoForm() is a function that maps a lexical knowledge structure $w$ to a string $s$, the orthographic form of $w$.
Formalization of Lexical Ambiguity A word $w_m$ in $L$ is lexically ambiguous if there is some other word $w_n$ in $L$ such that orthoForm($w_m$) = orthoForm($w_n$).
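The definition can be turned into a simple check: group lexicon entries by orthographic form and keep the forms shared by more than one entry. The sketch below is illustrative; the `LexicalEntry` class and `ambiguous_forms` function are hypothetical, and the sense labels are borrowed from the bracketed tags in the lexical ambiguity examples.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    form: str   # orthographic form
    sense: str  # sense label


def ortho_form(w):
    """Map a lexical knowledge structure to its orthographic form."""
    return w.form


def ambiguous_forms(lexicon):
    """Return the forms shared by more than one entry, with those entries."""
    by_form = defaultdict(list)
    for w in lexicon:
        by_form[ortho_form(w)].append(w)
    return {form: ws for form, ws in by_form.items() if len(ws) > 1}


# A toy lexicon L; only "edge" and "box" have two entries with the same form.
L = [
    LexicalEntry("edge", "area"),
    LexicalEntry("edge", "line"),
    LexicalEntry("box", "2d"),
    LexicalEntry("box", "3d"),
    LexicalEntry("pocket", "container"),
]

for form, entries in ambiguous_forms(L).items():
    print(form, [w.sense for w in entries])
```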
Required Knowledge Components (cont.) • An ontology $Ont$ consisting of $\langle C, R \rangle$ where $C$ is the set of concepts in the domain and $R$ is the set of relations over $C$. • A knowledge base $KB$ consisting of $\langle A, Ont \rangle$ such that $A$ is a set of assertions $\{a_1, a_2, \dots, a_n\}$ about entities in the ontology $Ont$.
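One way to picture the two components is a knowledge base that only accepts assertions phrased in the ontology's vocabulary. This is a rough sketch, not the system's design: the `Ontology` and `KnowledgeBase` classes and the `tell` method are hypothetical, and the sample assertions reuse the commonsense facts about dog from the earlier slide.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    concepts: set   # C: concepts in the domain
    relations: set  # R: names of relations over C


@dataclass
class KnowledgeBase:
    ontology: Ontology                              # Ont
    assertions: list = field(default_factory=list)  # A

    def tell(self, relation, *args):
        """Add an assertion only if it uses vocabulary from the ontology."""
        if relation not in self.ontology.relations:
            raise ValueError(f"unknown relation: {relation}")
        unknown = [a for a in args if a not in self.ontology.concepts]
        if unknown:
            raise ValueError(f"unknown concepts: {unknown}")
        self.assertions.append((relation, *args))


ont = Ontology(concepts={"dog", "cat", "tail"},
               relations={"has-part", "disjoint"})
kb = KnowledgeBase(ont)
kb.tell("has-part", "dog", "tail")
kb.tell("disjoint", "dog", "cat")
print(kb.assertions)
```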
A Concept in the Ontology (subclass Substance SelfConnectedObject) (documentation Substance "An &%Object in which every part is similar to every other in every relevant respect.") (=> (and (subclass ?OBJECTTYPE Substance) (instance ?OBJECT ?OBJECTTYPE) (part ?PART ?OBJECT)) (instance ?PART ?OBJECTTYPE))
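The implication in this axiom can be read as a forward inference rule: every part of an instance of a Substance subclass is itself an instance of that subclass. The sketch below applies the rule to a fixed point over sets of pairs. It is an illustration only: the function name, the pair encoding, and the Water, puddle1, drop1, and droplet1 facts are hypothetical, and only direct subclasses of Substance are considered.

```python
def infer_part_instances(subclass, instance, part):
    """Apply the Substance rule until no new instance facts appear.

    subclass: set of (child, parent) pairs
    instance: set of (object, type) pairs
    part:     set of (part, whole) pairs
    Returns the enlarged set of (object, type) facts.
    """
    facts = set(instance)
    changed = True
    while changed:
        changed = False
        for obj, obj_type in list(facts):
            # Rule premise 1: the type must be a (direct) subclass of Substance.
            if (obj_type, "Substance") not in subclass:
                continue
            # Rule premises 2 and 3: obj is an instance, and p is a part of obj.
            for p, whole in part:
                if whole == obj and (p, obj_type) not in facts:
                    facts.add((p, obj_type))
                    changed = True
    return facts


# Hypothetical facts for illustration.
subclass = {("Water", "Substance")}
instance = {("puddle1", "Water"), ("car1", "Car")}
part = {("drop1", "puddle1"), ("droplet1", "drop1"), ("wheel1", "car1")}

derived = infer_part_instances(subclass, instance, part)
print(sorted(derived))
```

The rule fires transitively for water (a droplet of a drop of a puddle is still water) but not for the car, whose wheel is not itself a car.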
An Assertion in the KB (=> (forall ?x bridge) (exists ?y ?z location) (connects ?x ?y ?z)) (=> (forall ?x bridge) (size ?x large))
Visual Assertions in KB (yet to be completed) • Shape • Size • Color • Orientation • Texture
Conclusion • Word sense disambiguation is a challenge for NLP. • NLP can benefit from a knowledge-rich approach. • A combination of visual and other commonsense assertions can enrich the lexicon. • An enriched lexicon is required for visualization of scenes described by NL.