TextNet – A Text-Based Intelligent System

TextNet – A Text-Based Intelligent System. Sanda Harabagiu and Dan Moldovan, as (mis-)interpreted by Peter Clark. Introduction. Overall goal: Given a sentence/paragraph, create a representation of the unstated, extra knowledge (“context”) which it suggests.


Presentation Transcript


  1. TextNet – A Text-Based Intelligent System • Sanda Harabagiu and Dan Moldovan, as (mis-)interpreted by Peter Clark

  2. Introduction • Overall goal: • Given a sentence/paragraph, create a representation of the unstated, extra knowledge (“context”) which it suggests. • Input: sentence graph; Output: bigger, richer graph • Purpose: Question-answering etc. (?) • Sources of this extra knowledge: • (Extended) WordNet • the Internet

  3. WordNet • Organized around concepts (“synsets”), not words • Contains: • ~100k concepts (“synsets”) • ~350k connections (14 types) • English definitions (“glosses”) for most synsets • Example: {“athletic game”} 132132: “Game involving athletic activity.” –isa– {“tennis”, “lawn tennis”} 433243: “A game played with rackets by two or four players who hit the ball over a net that divides the court.”

  4. WordNet • Organized around concepts (“synsets”), not words • Contains: • ~100k concepts (“synsets”) • ~350k connections (14 types) • English definitions (“glosses”) for most synsets • Example: {“athletic game”}: “Game involving athletic activity.” (node: athletic game) –isa– {“tennis”, “lawn tennis”}: “A game played with rackets by two or four players who hit the ball over a net that divides the court.” (node: tennis)
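
The synset organization described on the two WordNet slides can be sketched as a small data structure. This is a minimal illustration only: the `Synset` class, its field names, and the `hypernyms` helper are hypothetical and are not WordNet's actual storage format or API. The two example synsets and their identifiers are taken from the slide.

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    """Hypothetical container for one concept: its words, gloss, and typed links."""
    synset_id: int
    words: tuple
    gloss: str
    # relation name -> list of target synset ids (e.g. "isa")
    relations: dict = field(default_factory=dict)

athletic_game = Synset(
    132132,
    ("athletic game",),
    "Game involving athletic activity.",
)
tennis = Synset(
    433243,
    ("tennis", "lawn tennis"),
    "A game played with rackets by two or four players who hit the ball "
    "over a net that divides the court.",
    {"isa": [132132]},  # tennis isa athletic game
)

# Index by id so relation targets can be resolved to synsets.
index = {s.synset_id: s for s in (athletic_game, tennis)}

def hypernyms(synset, index):
    """Follow the 'isa' links of a synset one step up the hierarchy."""
    return [index[i] for i in synset.relations.get("isa", [])]
```

Note that the words hang off the concept rather than the other way round, which is the "concepts, not words" point on the slide.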

  5. Extended WordNet • Disambiguate and transform glosses into network representations. • Example: “Tennis court: A court in which tennis is played.” [Diagram labels: def, location-of, tennis court, court, play, object, tennis, {“tennis”, “lawn tennis”}]

  6. Extended WordNet • Disambiguate and transform glosses into network representations. • Example: “Serve: A stroke in tennis that puts the ball in play.” [Diagram labels: def, agent, serve, stroke, put, object, manner, context, tennis, ball, play]

  7. Extended WordNet • Resulting structure is no longer just a big graph [Diagram: two layers linked by def edges (e.g., ball –def– ball). Original WordNet: “raw” concepts (isa hierarchy, other relations). Processed Glossary Definitions: concepts in context (particular subtypes/situations for concepts).]

  8. “The kid hit the ball very hard.” [Graph: hit –agent→ kid; hit –object→ ball; hit –manner→ hard] Part I: Adding Relevant, Contextual Knowledge from WordNet

  9. “The kid hit the ball very hard.” [Graph: hit –agent→ kid; hit –object→ ball; hit –manner→ hard] “Inference Extraction” • Goals: • provide supplementary information about a sentence • explain relation between sentences • Approach: • Deductive inference (e.g., “snore –entails sleep”) • Find and add information into the sentence representation • Challenge: • Many possible connections

  10. Path-finding To find path(s) between A and B: • use spreading activation/marker passing: • place markers at A and B • propagate markers to neighboring nodes • at quiescence, look for marker collisions • “Propagation rules” determine when to propagate • “asymmetric and transitive relations are more useful” • “going up the isa hierarchy allows hierarchical deductions” • “the same is true for relations such as entail and causation. For example, if a man is snoring, then he is sleeping, and further he is temporarily unconscious.”
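
The marker-passing steps on the path-finding slide can be sketched as two breadth-first spreads that are then checked for collisions. This is a minimal sketch under stated assumptions, not the TextNet implementation: the toy `EDGES` list, the `^-1` notation for traversing a relation backwards, and the use of a fixed `max_depth` in place of true quiescence are all illustrative choices, and the slide's propagation rules (which relations to follow) are not modeled.

```python
from collections import deque

# Hypothetical toy graph of (source, relation, target) edges.
EDGES = [
    ("snore", "entails", "sleep"),
    ("sleep", "isa", "unconscious"),
    ("kid", "isa", "person"),
    ("player", "isa", "person"),
    ("player", "agent-of", "hit"),
    ("hit", "object", "ball"),
]

def neighbors(edges):
    """Build an adjacency map that allows traversal in both directions."""
    adj = {}
    for src, rel, dst in edges:
        adj.setdefault(src, []).append((rel, dst))
        adj.setdefault(dst, []).append((rel + "^-1", src))
    return adj

def spread(adj, start, max_depth):
    """Propagate a marker outward from start; return node -> path from start."""
    markers = {start: [start]}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # stop spreading at the depth limit
        for rel, nxt in adj.get(node, []):
            if nxt not in markers:
                markers[nxt] = markers[node] + [rel, nxt]
                frontier.append((nxt, depth + 1))
    return markers

def find_collisions(edges, a, b, max_depth=2):
    """Return nodes reached by both markers, with the path from each side."""
    adj = neighbors(edges)
    from_a = spread(adj, a, max_depth)
    from_b = spread(adj, b, max_depth)
    return {n: (from_a[n], from_b[n]) for n in from_a if n in from_b}
```

With this toy graph, `find_collisions(EDGES, "kid", "ball")` finds a single collision at `player`: the marker from `kid` arrives via `person`, and the marker from `ball` arrives via `hit`. A filter on which relations may be followed would be the natural place to add propagation rules.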

  11. “The kid hit the ball very hard.” [Graph: hit –agent→ kid; hit –object→ ball; hit –manner→ hard] • Find connections which “explain” these relations [Diagram labels, first path: within context of tennis, within context of ball, context, agent, isa, isa, object-of, hit, game, play, player, person, kid. Second path: within context of tennis, within context of ball, agent, agent-of, object, context, object-of, hit, game, play, player, hit, ball]

  12. “The kid hit the ball very hard.” [Graph: hit –agent→ kid; hit –object→ ball; hit –manner→ hard] • Find connections which “explain” these relations [Diagram labels, first path: within context of return, within context of drive, manner-of, gloss (“isa”), gloss (“isa”), context, hard, return, stroke, tennis. Second path: within context of tennis, agent, agent-of, object-of, game, play, player, hit]

  13. Inter-sentential Global Context • Find connections between “local contexts” • S1: The kid hit the ball very hard. • S2: It landed almost always near the baseline. [Diagram labels, first path: within context of move, isa, gloss (“isa”), object, isa, hit, move, change location. Second path: within context of destination, within context of arrive, gloss (“isa”), object, gloss (“isa”), isa, place, destination, reach, arrive, land]

  14. Part II: Adding Contextual Knowledge from the Internet

  15. Is WordNet (or a dictionary) sufficient to fully build the context? “GPS systems are used for hiking.” • QN: Can we relate “GPS” and “hiking” using a dictionary? • From Oxford Dictionary: • “GPS: a navigation system” • “Hiking: long walk in the countryside taken for pleasure” • “Walk: place or track or route for foot passengers” • “Route: course or way taken from starting point to destination” • But: • Missing knowledge that hiking involves following/navigating a particular trail, as opposed to just wandering aimlessly

  16. Finding and Adding Extra, Contextual Knowledge from the Internet • WordNet doesn’t contain all the background knowledge • So can we add extra knowledge using other texts too? • run-time, extra elaboration of current graph • further expansion of WordNet? • Approach: • Start with some initial “seed” text • Retrieve paragraphs containing relevant words • Elaborate their “local and global contexts” • Determine relevance using a similarity measure • Select “the most appropriate new context” • Add its graph (or parts of it?) to the original graph

  17. Finding Relevant Documents • Two problems: • Discovery: Which keywords to search with? • use words in the original seed text, or closely related words • e.g., “play AND (tennis OR ball OR baseline) AND hit” • Quality: How relevant are the results? • measure the degree of overlap of graphs for seed and new texts • Lexical ambiguity is a root problem • Disambiguation by assuming new words belong to same/close synsets as in the original query (dubious!)
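
The two problems on the document-finding slide (building a keyword query and scoring how well a retrieved text's graph overlaps the seed graph) can be sketched as follows. This is a sketch under stated assumptions: `build_query` and `graph_overlap` are hypothetical helper names, and Jaccard similarity over labeled edges is a stand-in, since the slides do not say which overlap measure was used.

```python
def build_query(terms):
    """Join terms with AND; a list of alternatives becomes an OR group."""
    parts = []
    for term in terms:
        if isinstance(term, str):
            parts.append(term)
        else:
            parts.append("(" + " OR ".join(term) + ")")
    return " AND ".join(parts)

def graph_overlap(seed_edges, candidate_edges):
    """Jaccard overlap of two graphs given as (source, relation, target) edges.

    Assumption: the slides only say 'degree of overlap'; Jaccard is one
    simple way to quantify it.
    """
    seed, cand = set(seed_edges), set(candidate_edges)
    if not seed and not cand:
        return 0.0
    return len(seed & cand) / len(seed | cand)

# Reproduces the example query from the slide.
query = build_query(["play", ["tennis", "ball", "baseline"], "hit"])

# Toy graphs: one shared edge out of three distinct edges.
seed_graph = [("hit", "agent", "kid"), ("hit", "object", "ball")]
new_graph = [("hit", "agent", "player"), ("hit", "object", "ball")]
score = graph_overlap(seed_graph, new_graph)
```

Exact edge matching is deliberately strict here. The slide's point about lexical ambiguity suggests that a practical measure would compare synsets rather than surface words, which this sketch does not attempt.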

  18. A Real Example… • Text: about a player who gets tendinitis from hitting the ball too hard • Build initial graph of sentences (but info missing) • Look for additional information on Internet • try multiple queries • select the best result (= graph most coherent with original text) • layer this graph on top of the original text graph • Original text + WordNet: • hit –isa affect isa– injure –result injury • hit –purpose land –location backline • Internet text: • backline –result ace • WordNet: • ace –isa serve –attr unreachable –purpose win • Hence (!): • “Winning is the motivation for actions causing tennis injuries”

  19. Summary • Interesting, ambitious • Right idea (used by others too) • Didn’t work (?); no further publications on TextNet • Critical details not clear from the paper • Problem ≠ finding good connections, rather = avoiding finding bad connections
