
How much do word embeddings encode about syntax?

Presentation Transcript


  1. How much do word embeddings encode about syntax? Jacob Andreas and Dan Klein, UC Berkeley

  2. Everybody loves word embeddings [figure: determiners clustered together in embedding space: few, most, that, the, each, a, this, every] [Collobert 2011, Mikolov 2013, Freitag 2004, Schuetze 1995, Turian 2010]

  3. What might embeddings bring? [example sentence: “Cathleen complained about the magazine’s shoddy editorial quality.” annotated with Mary (for the unseen word Cathleen) and average, executive (neighbors of editorial)]

  4. Three hypotheses • Vocabulary expansion (good for OOV words) • Statistic pooling (good for medium-frequency words) • Embedding structure (good for features) [slide examples: Cathleen/Mary; average/editorial/executive; tense/transitivity]

  5. Vocabulary expansion: embeddings help handling of out-of-vocabulary words [slide example: Cathleen, Mary]

  6. Vocabulary expansion [figure: embedding space with the names Cathleen, Mary, John, Pierre clustered together, apart from the adjectives yellow, enormous, hungry]

  7. Vocabulary expansion [figure: in “Cathleen complained about the magazine’s shoddy editorial quality.”, the OOV word Cathleen is mapped to its embedding neighbor Mary]
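The vocabulary-expansion idea can be sketched in a few lines: replace an OOV word with its nearest in-vocabulary neighbor in embedding space, and let the parser reuse that neighbor's statistics. A minimal sketch with toy vectors (all embedding values below are invented for illustration, not the talk's actual vectors):

```python
import numpy as np

# Toy in-vocabulary embeddings; a real system would load pretrained
# vectors (e.g. Collobert & Weston or word2vec). Values are invented.
EMBEDDINGS = {
    "Mary":   np.array([0.9, 0.1, 0.0]),
    "John":   np.array([0.7, 0.3, 0.2]),
    "yellow": np.array([0.0, 0.9, 0.3]),
}

def nearest_in_vocab(oov_vector, embeddings):
    """Return the in-vocabulary word whose embedding is closest
    (by cosine similarity) to the OOV word's vector."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(embeddings, key=lambda w: cos(oov_vector, embeddings[w]))

# "Cathleen" is unseen in training but has an embedding near "Mary",
# so the parser can treat it as "Mary" in the example sentence.
cathleen = np.array([0.85, 0.15, 0.05])
print(nearest_in_vocab(cathleen, EMBEDDINGS))  # -> Mary
```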

  8. Vocab. expansion results [chart: +OOV vs. Baseline]

  9. Vocab. expansion results (300 sentences) [chart: +OOV vs. Baseline]

  10. Statistic pooling hypothesis: embeddings help handling of medium-frequency words [slide example: average, editorial, executive]

  11. Statistic pooling [observed tag sets: editorial {NN}, executive {NN, JJ}, giant {JJ}, kind {NN}, average {NN, JJ}]

  12. Statistic pooling [after pooling: editorial {NN, JJ}, executive {NN, JJ}, giant {JJ, NN}, kind {NN, JJ}, average {NN, JJ}]

  13. Statistic pooling [figure: editorial, observed only with tag NN, is linked to embedding neighbors executive {NN, JJ} and average {NN, JJ}, from which its missing tag statistics can be pooled]
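Slides 11–13 can be summarized in code: a medium-frequency word like editorial, seen with only one tag, borrows tag statistics from its embedding neighbors. A hedged sketch (the counts, neighbor lists, and the weight alpha are all illustrative, not the talk's actual model):

```python
from collections import Counter

# Illustrative training counts: "editorial" was observed only as a
# noun, while its embedding neighbors were seen as both NN and JJ.
TAG_COUNTS = {
    "editorial": Counter({"NN": 3}),
    "executive": Counter({"NN": 5, "JJ": 4}),
    "average":   Counter({"NN": 6, "JJ": 5}),
    "giant":     Counter({"JJ": 7}),
}

# Hypothetical precomputed embedding neighbors; a real system would
# find these by similarity search in embedding space.
NEIGHBORS = {"editorial": ["executive", "average"]}

def pooled_tags(word, alpha=0.5):
    """Smooth a word's tag counts with alpha-weighted neighbor counts."""
    pooled = Counter(TAG_COUNTS[word])
    for n in NEIGHBORS.get(word, []):
        for tag, count in TAG_COUNTS[n].items():
            pooled[tag] += alpha * count
    return pooled

# After pooling, "editorial" has nonzero mass on JJ as well as NN.
print(pooled_tags("editorial"))  # Counter({'NN': 8.5, 'JJ': 4.5})
```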

  14. Statistic pooling results [chart: +Pooling vs. Baseline]

  15. Statistic pooling results (300 sentences) [chart: +Pooling vs. Baseline]

  16. Embedding structure hypothesis: the organization of the embedding space directly encodes useful features [slide example: tense, transitivity]

  17. Embedding structure [figure: verb embeddings with a “tense” direction separating past-tense forms (dined, vanished, devoured, assassinated) from -ing forms (dining, vanishing, devouring, assassinating), and a “transitivity” direction separating verb classes; projections onto such directions yield features like VBD for dined] [Huang 2011]
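The structure hypothesis treats directions in embedding space as features. One way to sketch it, assuming a “tense” direction estimated from a few past/-ing pairs in the spirit of Mikolov-style vector offsets (the 2-d vectors below are invented for illustration):

```python
import numpy as np

# Toy 2-d verb embeddings (invented values, for illustration only).
EMB = {
    "dined":     np.array([0.7, 0.2]),
    "dining":    np.array([0.7, 0.8]),
    "devoured":  np.array([0.3, 0.1]),
    "devouring": np.array([0.3, 0.7]),
}

# Estimate a "tense" direction as the mean offset from the -ing form
# to the past-tense form over a few verb pairs.
TENSE_DIR = np.mean([EMB["dined"] - EMB["dining"],
                     EMB["devoured"] - EMB["devouring"]], axis=0)

def tense_feature(word):
    """Real-valued lexical feature: projection onto the tense direction.
    Past-tense forms should score higher than -ing forms."""
    return float(EMB[word] @ TENSE_DIR)

print(tense_feature("dined") > tense_feature("dining"))        # True
print(tense_feature("devoured") > tense_feature("devouring"))  # True
```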

  18. Embedding structure results [chart: +Features vs. Baseline]

  19. Embedding structure results (300 sentences) [chart: +Features vs. Baseline]

  20. To summarize (300 sentences)

  21. Combined results [chart: Baseline vs. +OOV +Pooling]

  22. Combined results (300 sentences) [chart: Baseline vs. +OOV +Pooling]

  23. What about… • Domain adaptation? (no significant gain) • French? (no significant gain) • Other kinds of embeddings? (no significant gain)

  24. Why didn’t it work? • Context clues often provide enough information to reason around words with incomplete / incorrect statistics • Parser already has robust OOV and small-count models • Sometimes “help” from embeddings is worse than nothing [example of unhelpful neighbors: bifurcate, Soap, homered, Paschi, tuning, unrecognized]

  25. What about other parsers? • Dependency parsers (continuous repr. as syntactic abstraction) • Neural networks (continuous repr. as structural requirement) [Henderson 2004, Socher 2013, Koo 2008, Bansal 2014]

  26. Conclusion • Embeddings provide no apparent benefit to a state-of-the-art parser for: OOV handling, parameter pooling, or lexicon features • Code online at http://cs.berkeley.edu/~jda
