Automatic Methods to Detect the Compositionality of Multiwords

rosesims
Presentation Transcript


  1. Automatic Methods to Detect the Compositionality of Multiwords • Keywords: multiwords, (non-)compositionality, idioms, collocations, pragmatic, semantic, syntactic

  2. Outline • What we want to cover • Why we do it • A survey of current methods • Approaches to evaluation • Comparison of some of the results • Conclusions • Directions for the future

  3. Compositionality, non-compositionality and decomposability • Compositionality: the meaning of the phrase is a function of the meaning of the parts • Non-compositionality: the meaning of the phrase is not a function of the meaning of the parts • Decomposability: the meaning of the phrase can be ascribed to its parts • Idiosyncratic: spill the beans, let the cat out of the bag • Simple: traffic light, car park

  4. Examples: frying pan, car park • one brick short of a load, one slice short of a loaf, one pear short of a fruit salad • Correlation (or confusion) of compositionality: • with productivity • with statistical frequency of occurrence

  5. Motivation • Any requirement for semantic interpretation will require handling of non-compositional multiwords in order to arrive at the correct interpretation • e.g. “She kicked the bucket” • Associated syntactic behaviour is needed for parsing • e.g. “blow up the houses of parliament” • Important for lexical acquisition • e.g. “eat hot dog” • Associated non-productive and syntactic behaviour important for generation • e.g. “Wine and dine”

  6. Methods: the main categories • Statistical: p(see, red) / (p(see) p(red)) • Translations: see red <-> aberrear • Dictionaries: listings, semantic codes and semantic relationships • Substitutions: see red, see yellow, see blue • Distributional: see: look perceive gaze… • red: yellow orange blue…

  7. Statistical Methods • Statistical measures • e.g. pointwise mutual information • Venkatapathy and Joshi (2006): useful for alignment • Syntactic flexibility • Fazly and Stevenson (2006) (verb+noun compounds) • idiomatic nature reflected in passivization, determiner type and pluralization
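The pointwise mutual information measure named above, p(see, red) / (p(see) p(red)) taken in log form, can be sketched as follows. This is a minimal illustration only: the verb-object counts are invented toy data, and the function name `pmi` is a hypothetical helper, not code from any of the cited papers.

```python
import math
from collections import Counter

# Toy (verb, object) pairs; the counts are invented for illustration.
pairs = [("see", "red")] * 4 + [("see", "film")] * 4 + [("paint", "wall")] * 8

pair_counts = Counter(pairs)
verb_counts = Counter(v for v, _ in pairs)
obj_counts = Counter(o for _, o in pairs)
total = len(pairs)


def pmi(verb, obj):
    """log2( p(verb, obj) / (p(verb) * p(obj)) ), using relative frequencies.

    Assumes the pair was observed at least once; an unseen pair would need
    smoothing, otherwise log2(0) raises a ValueError.
    """
    p_joint = pair_counts[(verb, obj)] / total
    p_verb = verb_counts[verb] / total
    p_obj = obj_counts[obj] / total
    return math.log2(p_joint / (p_verb * p_obj))


print(pmi("see", "red"))
```

A high score only says that the two words co-occur more often than chance would predict, which is why the later slides treat PMI as correlating with, rather than measuring, non-compositionality.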

  8. Translations • Melamed (1997) “non-compositional compounds”: statistical comparison of translation models i) with concatenated words ii) with separate words • Mukerjee et al. (2006) Hindi-English parallel corpora used for detecting Hindi complex predicates • Venkatapathy and Joshi (2006) compositionality (PMI) used for alignment • Translations from one ↔ many are not necessarily non-compositional • e.g. swimming pool (piscine), video tape (video) • Nevertheless, very useful to find collocations for a language pair • Villada Moirón and Tiedemann (2006) diversity of translations for an expression; overlap between the meaning of the expression from its translations and the meanings of its component words

  9. Substitution Methods • Pearce (2001): anti-collocations using WordNet synonyms • e.g. “emotional baggage” vs “emotional luggage” • Lin (1999): significant difference in PMI (95% level) between the phrase and the phrase with a close substitute. Close substitutes found from an automatically generated thesaurus (Lin, 1998) • e.g. see: gaze, look, perceive… • Lexical fixedness: Fazly and Stevenson (2006) (verb+noun compounds), as Lin (1999) but using the difference between the PMI of the target and the average PMI of the set of substitutes
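The lexical fixedness idea can be sketched as the plain difference the slide describes: PMI of the target minus the average PMI of its substitutes. The PMI values below are invented, `fixedness` is a hypothetical helper name, and the published measure may normalise this difference further.

```python
from statistics import mean

# Invented PMI scores for a target phrase and its substitution variants.
pmi_scores = {
    ("see", "red"): 5.0,     # target
    ("see", "yellow"): 1.0,  # substitute
    ("see", "blue"): 2.0,    # substitute
}


def fixedness(target, substitutes, scores):
    """PMI of the target minus the mean PMI of its substitutes.

    A large positive value suggests the target resists substitution,
    i.e. it is lexically fixed.
    """
    return scores[target] - mean(scores[s] for s in substitutes)


score = fixedness(("see", "red"),
                  [("see", "yellow"), ("see", "blue")],
                  pmi_scores)
print(score)
```

In practice the substitutes would come from a thesaurus (WordNet synonyms or automatically acquired neighbours) rather than a hand-written list.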

  10. Dictionary methods • Hashimoto et al. (2006): recognition of idiomatic tokens in a Japanese corpus using syntactic evidence and information in an idiom dictionary • Baldwin et al. (2003): using hierarchical information in WordNet to model decomposability for evaluation • Piao et al. (2006): lexical resource (Lancaster Semantic Lexicon) used to compare the meaning of a listed multiword to that of its component words. Semantic distance measured using the semantic tags given in the lexicon

  11. Substitution Methods Contd… • What is being captured? • Bannard et al. (2003) and Baldwin et al. (2003) argue that these methods capture non-productivity (simple decomposable collocations) • NB Pearce (2001) is explicitly targeting collocations rather than compositionality • Fazly and Stevenson (2006) acknowledge that the relationship between compositionality and lexical fixedness is only partial, but it exists nevertheless

  12. Selectional Preference Models • Bannard (2002): verb particle data, eat up <object> vs eat <object> • Models acquired using corpus data and WordNet (Li and Abe, 1995) • Current work (McCarthy): “prototypical selectional preference models” acquired using corpus data and an automatically generated thesaurus (Lin, 1998; see later) • e.g. drink <object> vs drink tea • e.g. throw <object> vs throw light

  13. Distributional Approaches: Latent Semantic Analysis

  14. Distributional Approaches: Latent Semantic Analysis

  15. Distributional Approaches: Thesaurus creation • Example: dog, hot and “hot dog” • Corpus contexts: feed the dog, keep dogs, keep cats, stroke cats, feed the horse • hot water, cold water, hot milk, warm milk, boiling milk, hot weather • eat the sandwich, eat the hot dog, cook the hot dog, serve the burger • Resulting thesaurus entries: dog: cat animal pet horse… • hot: cold warm boiling mild… • “hot dog”: hamburger sandwich pizza

  16. Distributional Approaches • Schone and Jurafsky (2001): LSA, weighted sum of vectors for component words compared to MWE candidate • Baldwin et al. (2003): decomposability (simple vs non or idiosyncratic) of noun-noun compounds and verb particle constructions. Compared vectors of constituent words in isolation • Bannard et al. (2003): compare LSA with Lin (1999) on verb particle constructions • Katz and Giesbrecht (2006): token analysis for 1 example, “ins Wasser fallen”. Compare literal and compositional vectors for this example. Type-based experiment with composed vectors where constituent words have occurred in isolation
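The vector comparison behind these approaches can be sketched as follows: compose the vectors of the component words as a weighted sum, then take the cosine with the vector observed for the phrase itself. The four context dimensions and all vector values are invented toy data, and `composed` and `cosine` are hypothetical helper names; real systems would use LSA-reduced co-occurrence vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def composed(vectors, weights):
    """Weighted sum of the component word vectors."""
    return [sum(w * vec[i] for vec, w in zip(vectors, weights))
            for i in range(len(vectors[0]))]


# Invented context dimensions: [animal, temperature, food, liquid]
vec = {
    "dog": [1.0, 0.0, 0.0, 0.0],
    "hot": [0.0, 1.0, 0.0, 0.0],
    "water": [0.0, 0.0, 0.0, 1.0],
    "hot dog": [0.0, 0.0, 1.0, 0.0],    # occurs in food contexts
    "hot water": [0.0, 1.0, 0.0, 1.0],  # occurs where hot and water occur
}

sim_hot_dog = cosine(composed([vec["hot"], vec["dog"]], [1.0, 1.0]), vec["hot dog"])
sim_hot_water = cosine(composed([vec["hot"], vec["water"]], [1.0, 1.0]), vec["hot water"])
print(sim_hot_dog, sim_hot_water)
```

On this toy data the composed vector for "hot" + "dog" shares no contexts with "hot dog", while "hot" + "water" lines up with "hot water"; a low similarity is read as a signal of non-compositionality.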

  17. Distributional Methods • [Figure: neighbour sets containing clamber up, climb up, slither down, walk down, creep down and walk, jump, go up] • McCarthy et al. (2003) look at the overlap of similar words (neighbours) in a distributional thesaurus for a verb (e.g. climb) compared with those of the verb and particle construction (e.g. climb down) • Various other measures, including the number of neighbours in the phrasal set with the same particle (minus the number having the same particle in the simplex verb neighbours)
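The two neighbour-based measures described above can be sketched like this. The neighbour lists are invented toy data loosely based on the words on the slide, and both function names are hypothetical; the original work used ranked thesaurus output and several further variants.

```python
def neighbour_overlap(simplex_neighbours, phrasal_neighbours):
    """Number of neighbours shared by the simplex verb and the phrasal verb."""
    return len(set(simplex_neighbours) & set(phrasal_neighbours))


def same_particle_score(particle, phrasal_neighbours, simplex_neighbours):
    """Phrasal neighbours ending in the particle, minus simplex neighbours
    ending in the same particle."""
    def count(neighbours):
        return sum(1 for n in neighbours if n.split()[-1] == particle)
    return count(phrasal_neighbours) - count(simplex_neighbours)


# Invented neighbour lists for "climb" and "climb down".
climb = ["walk", "jump", "go up", "clamber up"]
climb_down = ["clamber up", "climb up", "slither down", "walk down", "creep down"]

overlap = neighbour_overlap(climb, climb_down)
particle_score = same_particle_score("down", climb_down, climb)
print(overlap, particle_score)
```

A larger overlap suggests the phrasal verb behaves distributionally like its simplex verb, which is taken as evidence that it is more compositional.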

  18. Combining approaches • Venkatapathy and Joshi (2005) • frequency • PMI • substitution based on Lin (1999) • distributed frequency of object, • distributed frequency of object with dissimilar verbs • LSA similarity of V-O with verbal form of O • LSA dissimilarity of V-O with V • All combined with SVM ranking

  19. Method: Selectional Preferences using distributional thesaurus (McCarthy) • Is the argument prototypical for this predicate and argument relationship? • E.g. eat my hat • Like substitution methods, but not explicitly looking for a substitute • Verb + direct objects • e.g. eat {meal 5, dinner 5, tea 6, lunch 10, food 6, sandwich 3, duck 1, cheese 2, hat 3} • food: sandwich, cheese, meat, duck… • meal: dinner, lunch, tea, supper… • clothing: shirt, belt, hat, trousers…

  20. Methods for evaluation: token based • token based: • Hashimoto et al (2006) 300 example sentences of 100 idioms, Information from dictionary for discrimination • Katz and Giesbrecht (2006) 67 occurrences of 1 idiom (ins Wasser fallen) literal and idiomatic readings have orthogonal LSA vectors Compare individual token vectors to these

  21. Methods for evaluation: type based • Dictionary • Schone and Jurafsky (2001), Fazly and Stevenson (2006) • Using is-a links (hyponymy) • Baldwin et al. (2003), WordNet • Manual verification • Lin (1999) • Web as validation • Villavicencio (2005) • Hayes et al. (2005) • Compositionality judgements • Contribution from constituents (Bannard, 2002; Bannard et al., 2003) • Along a continuum (McCarthy et al., 2003; Venkatapathy and Joshi, 2005)

  22. Some results: Compositionality Judgements on a Continuum • McCarthy et al. (2003): 111 phrasal verb versus verb constructions, scale 0-10 • 3 native English speakers, highly significant Kendall coefficient of concordance • Venkatapathy and Joshi (2005): 765 verb object pairs, scale 1-6 • 2 fluent English speakers, Spearman's rank correlation coefficient • Good level of agreement • Example items: carry out, cloud over, climb up; change hands, take interest, announce plan
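Evaluation against judgements on a continuum usually comes down to a rank correlation between human scores and a system's scores. The sketch below computes Spearman's rank correlation coefficient as the Pearson correlation of average ranks; the human and system scores are invented, and the function names are hypothetical helpers.

```python
import math


def ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(ranks(x), ranks(y))


# Invented scores for four hypothetical items.
human = [9, 7, 2, 5]            # e.g. judgements on a 0-10 scale
system = [0.8, 0.9, 0.1, 0.4]   # e.g. a similarity-based measure

rho = spearman(human, system)
print(round(rho, 3))
```

The same function can be used to check agreement between two annotators before comparing either of them to a system.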

  23. Results McCarthy et al. datasets

  24. Results McCarthy et al. datasets

  25. Correlation of McCarthy et al (2003) human rankings with statistics and dictionaries

  26. Correlation of measures with man-made resources (Mann Whitney Z scores)

  27. Results with Venkatapathy and Joshi (2005) dataset

  28. Conclusions • Purpose of task should match method and evaluation • Evaluation is tricky • Decisions are not clear cut • Statistical measures and substitution methods may be useful, though capturing behaviour that correlates with compositionality • Distributional approaches promising for languages without resources • Selectional preferences may add useful information, alongside other measures

  29. Future • Address tokens as well as types • Tokens on a continuum • Error analysis • Separating non-decomposable from idiosyncratically decomposable • Detecting what multiwords mean, distributional approaches might be promising in this respect • kick the bucket --- die • share datasets!!!

  30. References • Baldwin, Timothy, Colin Bannard, Takaaki Tanaka and Dominic Widdows (2003) An Empirical Model of Multiword Expression Decomposability. In Proceedings of the ACL Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan, pp. 89–96. • Bannard, Colin (2002) Statistical Techniques for Automatically Inferring the Semantics of Verb-Particle Constructions. LinGO Working Paper No. 2002-06. http://lingo.stanford.edu/pubs/WP-2002-06.pdf • Bannard, Colin, Timothy Baldwin and Alex Lascarides (2003) A Statistical Approach to the Semantics of Verb-Particles. In Proceedings of the ACL Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan, pp. 65–72. • Fazly, Afsaneh and Suzanne Stevenson (2006) Automatically constructing a lexicon of verb phrase idiomatic combinations. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Trento, Italy, pp. 337–344. • Hayes, Jer, Nuno Seco and Tony Veale (2005) Creative discovery in the lexical validation gap. Computer Speech and Language, 19(4):513–523. • Hashimoto, Chikara, Satoshi Sato and Takehito Utsuro (2006) Japanese Idiom Recognition: Drawing a Line between Literal and Idiomatic Meanings. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, Sydney, Australia, pp. 353–360. • Katz, Graham and Eugenie Giesbrecht (2006) Automatic Identification of Non-Compositional Multi-Word Expressions using Latent Semantic Analysis. In Proceedings of the ACL Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, Sydney, Australia. • Lin, Dekang (1998) Automatic Retrieval and Clustering of Similar Words. In Proceedings of the 17th International Conference on Computational Linguistics and the 36th Annual Meeting of the Association for Computational Linguistics, Montreal, Canada. • Lin, Dekang (1999) Automatic Identification of Non-Compositional Phrases. In Proceedings of ACL-99, University of Maryland, College Park, Maryland, pp. 317–324. • Melamed, I. Dan (1997) Automatic Discovery of Non-Compositional Compounds in Parallel Data. In Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing (EMNLP), Providence, RI.

  31. References continued • McCarthy, Diana, Bill Keller and John Carroll (2003) Detecting a Continuum of Compositionality in Phrasal Verbs. In Proceedings of the ACL-SIGLEX Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan. • Mukerjee, Amitabha, Ankit Soni and Achla M Raina (2006) Detecting Complex Predicates in Hindi using POS Projection across Parallel Corpora. In Proceedings of the ACL Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, Sydney, Australia, pp. 28–35. • Pearce, Darren (2001) Synonymy in Collocation Extraction. In WordNet and Other Lexical Resources: Applications, Extensions and Customizations (NAACL 2001 Workshop), Carnegie Mellon University, Pittsburgh, June 2001, pp. 41–46. • Piao, Scott S.L., Paul Rayson, Olga Mudraya, Andrew Wilson and Roger Garside (2006) Measuring MWE Compositionality Using Semantic Annotation. In Proceedings of the ACL Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, Sydney, Australia, pp. 28–35. • Schone, Patrick and Daniel Jurafsky (2001) Is Knowledge-Free Induction of Multiword Unit Dictionary Headwords a Solved Problem? In Proceedings of Empirical Methods in Natural Language Processing, Pittsburgh, PA. • Venkatapathy, Sriram and Aravind K. Joshi (2005) Measuring the relative compositionality of verb-noun (V-N) collocations by integrating features. In Proceedings of HLT/EMNLP, Vancouver. • Villada Moirón, Begoña and Joerg Tiedemann (2006) Identifying idiomatic expressions using automatic word-alignment. In Proceedings of the EACL Workshop on Multiword Expressions in a Multilingual Context, Trento, Italy. • Villavicencio, A. (2005) The availability of verb-particle constructions in lexical resources: How much is enough? Computer Speech and Language, 19(4).
