Dependency-Based Word Embeddings
Omer Levy, Yoav Goldberg
Bar-Ilan University, Israel
Neural Embeddings
• Dense vectors
• Each dimension is a latent feature
• word2vec (Mikolov et al., 2013)
• State-of-the-Art: Skip-Gram with Negative Sampling
• "Linguistic Regularities": king - man + woman ≈ queen
See also: Linguistic Regularities in Sparse and Explicit Word Representations, Friday, 2:00 PM, CoNLL 2014
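For readers following along in code, here is a minimal sketch of the king - man + woman ≈ queen regularity using gensim's KeyedVectors; the vectors file path is a placeholder for any pretrained embeddings in word2vec text format.

```python
from gensim.models import KeyedVectors

# Hypothetical path: any pretrained embeddings in word2vec text format will do.
vecs = KeyedVectors.load_word2vec_format("vectors.txt", binary=False)

# king - man + woman ~= queen: the analogy is answered by vector arithmetic
# followed by a cosine-similarity search over the vocabulary.
print(vecs.most_similar(positive=["king", "woman"], negative=["man"], topn=5))
```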
Our Main Contribution: Generalizing Skip-Gram with Negative Sampling
Skip-Gram with Negative Sampling v2.0
• Original implementation assumes bag-of-words contexts
• We generalize to arbitrary contexts
• Dependency contexts create qualitatively different word embeddings
• Provide a new tool for linguistically analyzing embeddings
Example sentence: Australian scientist discovers star with telescope
Target word: discovers
• Bag-of-Words (BoW) context: the words within a fixed window around the target (e.g., scientist, star, with, telescope)
• Syntactic dependency context: the words connected to the target by a dependency arc, labeled with the relation: scientist (nsubj), star (dobj), telescope (prep_with)
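To make the dependency-context idea concrete, here is a small sketch that extracts labeled dependency contexts from the example sentence. It uses spaCy as a convenient stand-in parser; the talk itself uses Stanford Dependencies, and the paper additionally collapses prepositions into contexts like prep_with.

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Australian scientist discovers star with telescope")

def dependency_contexts(token):
    """Dependency contexts of a token: its modifiers as (child, label)
    and its head as (head, label^-1), as in the generalized skip-gram setup."""
    contexts = [(child.text, child.dep_) for child in token.children]
    if token.head is not token:               # the root is its own head in spaCy
        contexts.append((token.head.text, token.dep_ + "^-1"))
    return contexts

for token in doc:
    print(token.text, dependency_contexts(token))
# For "discovers" this yields roughly (scientist, nsubj), (star, dobj), (with, prep);
# the paper further collapses the preposition into (telescope, prep_with).
```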
How does Skip-Gram work?
• Skip-Gram represents each word as a vector
• Skip-Gram represents each context word as a different vector
• The same word has 2 different embeddings (one as a "word", one as a "context")
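A minimal numpy sketch of what "two different embeddings" means in practice: one matrix of word vectors and one of context vectors, updated jointly by a negative-sampling step. This illustrates the objective only; it is not the authors' implementation, and the ids and hyperparameters are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 10_000, 100
W = (rng.random((vocab_size, dim)) - 0.5) / dim   # word embeddings
C = np.zeros((vocab_size, dim))                   # context embeddings (a separate table)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(word, context, negatives, lr=0.025):
    """One stochastic update for an observed (word, context) pair plus negative samples."""
    w = W[word]
    w_update = np.zeros_like(w)
    for c, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        score = sigmoid(w @ C[c])     # predicted probability that (word, c) was observed
        grad = lr * (label - score)
        w_update += grad * C[c]
        C[c] += grad * w
    W[word] += w_update

sgns_step(word=3, context=17, negatives=[42, 7, 1000])
```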
How does Skip-Gram work?
Pipeline: Text → Bag-of-Words Contexts → Word-Context Pairs → Learning
Our Modification
Pipeline: Text → Arbitrary Contexts → Word-Context Pairs → Learning
Modified word2vec publicly available!
Our Modification: Example
Pipeline: Text (Wikipedia) → Syntactic Contexts (Stanford Dependencies) → Word-Context Pairs → Learning
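The pipeline is deliberately modular: anything that emits (word, context) pairs can feed the learner. A hedged sketch follows, assuming a simple one-pair-per-line text format (the released modified word2vec may expect a different format); bow_contexts is just one pluggable extractor and can be swapped for a dependency-based one.

```python
def bow_contexts(tokens, i, window=2):
    """Bag-of-words contexts: the tokens within a fixed window around position i."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return [tokens[j] for j in range(lo, hi) if j != i]

def write_pairs(sentences, context_fn, path="pairs.txt"):
    """Text -> contexts -> (word, context) pairs, one pair per line."""
    with open(path, "w", encoding="utf-8") as out:
        for sentence in sentences:
            tokens = sentence.lower().split()
            for i, word in enumerate(tokens):
                for ctx in context_fn(tokens, i):
                    out.write(f"{word} {ctx}\n")

write_pairs(["Australian scientist discovers star with telescope"], bow_contexts)
# Swapping context_fn for a dependency-based extractor yields dependency embeddings.
```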
What is the effect of different context types?
• Thoroughly studied in explicit (distributional) representations: Lin (1998), Padó and Lapata (2007), and many others…
General conclusion:
• Bag-of-words contexts induce topical similarities
• Dependency contexts induce functional similarities: words that share the same semantic type (cohyponyms)
• Does this hold for embeddings as well?
Embedding Similarity with Different Contexts (Online Demo!)
• BoW neighbors tend to be topically related (e.g., related to Harry Potter); dependency neighbors are functionally similar (e.g., schools)
• BoW: related to computability; dependencies: scientists
• BoW: related to dance; dependencies: gerunds
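For readers who want to reproduce the qualitative comparison, a sketch along these lines works with any two sets of vectors in word2vec text format, one trained with BoW contexts and one with dependency contexts. The file names are placeholders, and the query words are taken from the paper's examples (Hogwarts for Harry Potter, Turing for computability, dancing for gerunds).

```python
from gensim.models import KeyedVectors

# Placeholder file names for embeddings trained with the two context types.
bow = KeyedVectors.load_word2vec_format("bow5.words.txt", binary=False)
deps = KeyedVectors.load_word2vec_format("deps.words.txt", binary=False)

for query in ["hogwarts", "turing", "dancing"]:
    print(query)
    print("  BoW :", [w for w, _ in bow.most_similar(query, topn=5)])
    print("  DEPS:", [w for w, _ in deps.most_similar(query, topn=5)])
```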
Embedding Similarity with Different Contexts
• Dependency-based embeddings have more functional similarities
• This phenomenon goes beyond these examples
• Quantitative analysis (in the paper)
Quantitative Analysis
Dependency-based embeddings have more functional similarities
[Chart comparing Dependencies, BoW (k=2), and BoW (k=5)]
Dependency Contexts & Functional Similarity
• Thoroughly studied in explicit (distributional) representations: Lin (1998), Padó and Lapata (2007), and many others…
• In explicit representations, we can look at the features and analyze
• But embeddings are a black box!
• Dimensions are latent and don't necessarily have any meaning
Peeking into Skip-Gram's Black Box
• Skip-Gram allows a peek…
• Contexts are embedded in the same space as the words!
• Given a word w, find the contexts c it "activates" most: the contexts with the highest dot product w · c
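A sketch of that peek, assuming you kept both the word matrix and the context matrix from training (the variable names here are hypothetical): the contexts a word activates most are simply those with the highest dot product against its word vector.

```python
import numpy as np

def top_contexts(word_vec, ctx_matrix, ctx_vocab, k=10):
    """Return the k contexts with the highest dot product against a word's vector.

    word_vec:   the word's embedding (a row of the word matrix)
    ctx_matrix: all context embeddings, one row per context
    ctx_vocab:  the context string for each row of ctx_matrix
    """
    scores = ctx_matrix @ word_vec
    best = np.argsort(-scores)[:k]
    return [(ctx_vocab[i], float(scores[i])) for i in best]
```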
Analyzing Embeddings
• We found a way to linguistically analyze embeddings
• Together with the ability to engineer contexts…
• …we now have the tools to create task-tailored embeddings!
Conclusion
• Generalized Skip-Gram with Negative Sampling to arbitrary contexts
• Different contexts induce different similarities
• Suggest a way to peek inside the black box of embeddings
• Code, demo, and word vectors available from our websites
• Make linguistically-motivated, task-tailored embeddings today!
Thank you for listening :)