
Dependency-Based Word Embeddings


Presentation Transcript


  1. Dependency-Based Word Embeddings Omer Levy Yoav Goldberg Bar-Ilan University Israel

  2. Neural Embeddings • Dense vectors • Each dimension is a latent feature • word2vec (Mikolov et al., 2013) • State-of-the-Art: Skip-Gram with Negative Sampling • “Linguistic Regularities”: king − man + woman ≈ queen (see “Linguistic Regularities in Sparse and Explicit Word Representations”, Friday, 2:00 PM, CoNLL 2014)
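For readers unfamiliar with the objective, here is a minimal sketch of a single SGNS update in Python, assuming two embedding matrices W (words) and C (contexts) and pre-sampled negative contexts; it illustrates the training rule, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_update(W, C, w, c_pos, c_negs, lr=0.025):
    """One stochastic update of Skip-Gram with Negative Sampling.
    W: (V, d) word matrix; C: (V, d) context matrix;
    w: word index; c_pos: observed context index;
    c_negs: indices of sampled negative contexts."""
    w_grad = np.zeros_like(W[w])
    for c, label in [(c_pos, 1.0)] + [(c, 0.0) for c in c_negs]:
        g = lr * (label - sigmoid(W[w].dot(C[c])))  # gradient scale
        w_grad += g * C[c]   # accumulate the word-vector gradient
        C[c] += g * W[w]     # update the context vector in place
    W[w] += w_grad
```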

  3. Our Main Contribution: Generalizing Skip-Gram with Negative Sampling

  4. Skip-Gram with Negative Sampling v2.0 • Original implementation assumes bag-of-words contexts • We generalize to arbitrary contexts • Dependency contexts create qualitatively different word embeddings • Provide a new tool for linguistically analyzing embeddings
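The key observation is that SGNS only ever consumes (word, context) pairs, so the pair extractor is a pluggable component. A minimal sketch of that interface (the names PairExtractor and extract_pairs are ours, for illustration):

```python
from typing import Callable, Iterable, Iterator, Tuple

# Any function mapping a sentence to (word, context) pairs will do:
# window-based, dependency-based, or anything else.
PairExtractor = Callable[[str], Iterable[Tuple[str, str]]]

def extract_pairs(corpus: Iterable[str],
                  extractor: PairExtractor) -> Iterator[Tuple[str, str]]:
    """Feed any pair extractor into the same SGNS learner."""
    for sentence in corpus:
        yield from extractor(sentence)
```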

  5. Context Types

  6. Australian scientist discovers star with telescope Example

  7. Australian scientist discovers star with telescope Target Word

  8. Australian scientist discovers star with telescope Bag of Words (BoW) Context

  9. Australian scientist discovers star with telescope Bag of Words (BoW) Context

  10. Australian scientist discovers star with telescope Bag of Words (BoW) Context
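A bag-of-words context is just the k tokens on either side of the target. A small sketch on the running example, assuming a window of k = 2:

```python
def bow_contexts(tokens, k=2):
    """Yield (word, context) pairs using a symmetric window of size k."""
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - k), min(len(tokens), i + k + 1)
        for j in range(lo, hi):
            if j != i:
                yield (word, tokens[j])

sentence = "Australian scientist discovers star with telescope".split()
pairs = list(bow_contexts(sentence, k=2))
# pairs for the target "star":
# ('star', 'scientist'), ('star', 'discovers'), ('star', 'with'), ('star', 'telescope')
```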

  11. Australian scientist discovers star with telescope Syntactic Dependency Context

  12. Australian scientist discovers star with telescope Syntactic Dependency Context: nsubj(discovers, scientist), dobj(discovers, star), prep_with(discovers, telescope)

  13. Australian scientist discovers star with telescope Syntactic Dependency Context: nsubj(discovers, scientist), dobj(discovers, star), prep_with(discovers, telescope)
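The paper extracts these contexts from Stanford dependency parses with prepositions collapsed (hence prep_with). A rough sketch of the same idea using spaCy instead, whose label set differs and which omits the collapsing step:

```python
import spacy  # assumes an English model, e.g. en_core_web_sm, is installed

nlp = spacy.load("en_core_web_sm")

def dep_contexts(sentence):
    """Yield (word, context) pairs from dependency arcs. Each arc
    head --rel--> modifier contributes (head, modifier/rel) and the
    inverse (modifier, head/rel-1), as in Levy & Goldberg (2014)."""
    for tok in nlp(sentence):
        if tok.dep_ == "ROOT":
            continue
        yield (tok.head.text, f"{tok.text}/{tok.dep_}")
        yield (tok.text, f"{tok.head.text}/{tok.dep_}-1")

pairs = list(dep_contexts("Australian scientist discovers star with telescope"))
# e.g. ('discovers', 'star/dobj') and ('star', 'discovers/dobj-1')
```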

  14. Generalizing Skip-Gram with Negative Sampling

  15. How does Skip-Gram work? • Skip-gram represents each word as a vector • Skip-gram represents each context word as a different vector • Same word has 2 different embeddings (as “word”, as “context”)

  16. How does Skip-Gram work? Text → Bag of Words Context → Word-Context Pairs → Learning

  17. How does Skip-Gram work? Text → Bag of Words Contexts → Word-Context Pairs → Learning

  18. Our Modification Text → Arbitrary Contexts → Word-Context Pairs → Learning

  19. Our Modification Modified word2vec publicly available! Text → Arbitrary Contexts → Word-Context Pairs → Learning

  20. Our Modification: Example Text → Syntactic Contexts → Word-Context Pairs → Learning

  21. Our Modification: Example Text (Wikipedia) → Syntactic Contexts → Word-Context Pairs → Learning

  22. Our Modification: Example Text (Wikipedia) → Syntactic Contexts (Stanford Dependencies) → Word-Context Pairs → Learning
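Once contexts are extracted, training reduces to feeding (word, context) pairs into the modified word2vec (released by the authors as word2vecf). A sketch of producing such input with the dep_contexts sketch above, assuming a plain-text "word context" one-pair-per-line format (an assumption worth verifying against the released tool):

```python
# Assumption: the trainer reads plain-text lines of "word context".
with open("pairs.txt", "w", encoding="utf-8") as f:
    for word, ctx in dep_contexts("Australian scientist discovers star with telescope"):
        f.write(f"{word} {ctx}\n")
```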

  23. What is the effect of different context types?

  24. What is the effect of different context types? • Thoroughly studied in explicit (distributional) representations • Lin (1998), Padó and Lapata (2007), and many others… General conclusion: • Bag-of-words contexts induce topical similarities • Dependency contexts induce functional similarities: words that share the same semantic type (cohyponyms) • Does this hold for embeddings as well?

  25. Embedding Similarity with Different Contexts • BoW neighbors: related to Harry Potter • Dependency neighbors: schools

  26. Embedding Similarity with Different Contexts • BoW neighbors: related to computability • Dependency neighbors: scientists

  27. Embedding Similarity with Different Contexts (Online Demo!) • BoW neighbors: related to dance • Dependency neighbors: gerunds

  28. Embedding Similarity with Different Contexts • Dependency-based embeddings have more functional similarities • This phenomenon goes beyond these examples • Quantitative Analysis (in the paper)
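One way to try this comparison yourself, assuming gensim is installed and two pre-trained vector files in word2vec text format; the filenames below are hypothetical placeholders:

```python
from gensim.models import KeyedVectors

bow = KeyedVectors.load_word2vec_format("bow5.words", binary=False)   # hypothetical file
deps = KeyedVectors.load_word2vec_format("deps.words", binary=False)  # hypothetical file

# BoW neighbors tend to be topical; dependency neighbors functional.
print(bow.most_similar("hogwarts", topn=5))
print(deps.most_similar("hogwarts", topn=5))
```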

  29. Quantitative Analysis Dependency-based embeddings have more functional similarities (chart comparing Dependencies, BoW (k=2), and BoW (k=5))

  30. Why do dependencies induce functional similarities?

  31. Dependency Contexts & Functional Similarity • Thoroughly studied in explicit representations (distributional) • Lin (1998), Padó and Lapata (2007), and many others… • In explicit representations, we can look at the features and analyze • But embeddings are a black box! • Dimensions are latent and don’t necessarily have any meaning

  32. Analyzing Embeddings

  33. Peeking into Skip-Gram’s Black Box • Skip-Gram allows a peek… • Contexts are embedded in the same space! • Given a word w, find the contexts c it “activates” most, i.e. those maximizing the dot product w · c
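A minimal sketch of that peek, assuming the learned word and context matrices are available after training:

```python
import numpy as np

def top_contexts(word_vec, C, context_vocab, k=10):
    """Return the k contexts with the highest activation (dot product)
    against word_vec. C is the (num_contexts, d) context matrix;
    context_vocab maps row index -> context string."""
    scores = C @ word_vec
    best = np.argsort(-scores)[:k]
    return [(context_vocab[i], float(scores[i])) for i in best]
```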

  34.–36. Associated Contexts (tables of the highest-activation contexts for example words, shown across three slides)

  37. Analyzing Embeddings • We found a way to linguistically analyze embeddings • Together with the ability to engineer contexts… • …we now have the tools to create task-tailored embeddings!

  38. Conclusion

  39. Conclusion • Generalized Skip-Gram with Negative Sampling to arbitrary contexts • Different contexts induce different similarities • Suggest a way to peek inside the black box of embeddings • Code, demo, and word vectors available from our websites • Make linguistically-motivated, task-tailored embeddings today! Thank you for listening :)
