CLSW2013, Zhengzhou, Henan
Hedge Detection with Latent Features
SU Qi (sukia@pku.edu.cn)
May 12, 2013
1. Introduction
• The importance of information credibility
• Hedge
  • Hedges are "words whose job is to make things fuzzier or less fuzzy" [Lakoff, 1972]
  • Hedges weaken or intensify the speaker's commitment to a proposition
  • Some linguists narrow the term down to its detensifying (weakening) use only
• The CoNLL-2010 shared task on hedge detection
  • Detecting hedges and their scopes
1. Introduction
• Examples
  <sentence id="S3.7" certainty="uncertain">It is <ccue>possible</ccue> that false allegations may be over-represented, because many true victims of child sexual abuse never tell anyone at all about what happened.</sentence>
  <sentence id="S3.11" certainty="uncertain"><ccue>Some studies</ccue> break down the level of false allegations by the age of the child.</sentence>
  <sentence id="S3.19" certainty="uncertain"><ccue>It is suggested</ccue> that parents have consistently underestimated the seriousness of their child's distress when compared to accounts of their own children.</sentence>
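A minimal sketch of how such annotations can be read. The <doc> wrapper element is an assumption added only so the snippet is well-formed XML; <sentence>, <ccue>, and the certainty attribute follow the examples above.

```python
import xml.etree.ElementTree as ET

# One annotated sentence in the style shown above; the <doc> wrapper
# is assumed so that the fragment parses as a single XML document.
xml_text = """
<doc>
  <sentence id="S3.7" certainty="uncertain">It is <ccue>possible</ccue>
  that false allegations may be over-represented.</sentence>
</doc>
"""

root = ET.fromstring(xml_text)
for sent in root.iter("sentence"):
    cues = [c.text for c in sent.iter("ccue")]          # inline hedge cues
    text = " ".join("".join(sent.itertext()).split())   # full sentence text
    print(sent.get("id"), sent.get("certainty"), cues)
    print(text)
```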
1. Introduction
• Previous approaches
  • Sequence labeling models, e.g. conditional random fields and SVM-HMM
  • Binary classification
  • Shallow features (e.g. word, lemma, POS tags, etc.)
• The complication of hedge detection is that the same word types occasionally have different, non-hedging uses, so shallow features can only marginally improve the accuracy of a bag-of-words representation
• Typical hedge cues: auxiliaries (may, might), hedging verbs (suggest, question), adjectives (probable, possible), adverbs (likely), conjunctions (or, and, either…or), nouns (speculation), etc.
2. The Main Points in This Paper
• Basic assumption: high-level (latent) features work better for sequence labeling
  • Projecting words into a lower-dimensional latent space improves generalizability to unseen items and helps disambiguate some ambiguous items
3. Our Work
• We perform LDA training and inference by Gibbs sampling, then train the CRF model with topic IDs added as external features (a sketch of this pipeline follows below).
• As an unsupervised model, LDA allows us to train and infer on an unlabeled dataset, thus relaxing the restriction to the labeled dataset required for CRF training.
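A minimal sketch of that pipeline, under two stated assumptions: gensim stands in for the Gibbs-sampling LDA (gensim's LdaModel uses variational inference instead), and a token's topic ID is approximated by its word type's most probable topic. sklearn-crfsuite plays the role of the CRF; all data below is toy data.

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel
import sklearn_crfsuite

# Tokenized sentences; LDA needs no hedge labels, so this set could be
# much larger than the labeled CRF training set.
sents = [["it", "is", "possible", "that", "allegations", "are", "overrepresented"],
         ["some", "studies", "suggest", "otherwise"]]
dictionary = Dictionary(sents)
bows = [dictionary.doc2bow(s) for s in sents]
lda = LdaModel(bows, id2word=dictionary, num_topics=10)

def topic_id(word):
    """Most probable topic of a word type (approximate token-level topic)."""
    wid = dictionary.token2id.get(word)
    if wid is None:
        return "UNK"
    topics = lda.get_term_topics(wid, minimum_probability=0.0)
    return max(topics, key=lambda t: t[1])[0] if topics else "UNK"

def features(sent, i):
    # The token itself plus its LDA topic ID as an external feature.
    return {"token": sent[i], "topic": str(topic_id(sent[i]))}

# Labeled portion only: B/I/O tags marking hedge cues.
X = [[features(s, i) for i in range(len(s))] for s in sents]
y = [["O", "O", "B-cue", "O", "O", "O", "O"],
     ["B-cue", "I-cue", "B-cue", "O"]]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, y)
print(crf.predict(X)[0])
```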
4. Corpus and Experiments
• Corpus: biological scientific articles
• Three different levels of feature sets (a feature-extraction sketch follows below)
  • Level 1: the token; whether the token is a potential hedge cue (i.e., it occurs in the pre-extracted hedge cue list) or part of a hedge cue; its context within a [-2, 2] window
  • Level 2: lemma; part-of-speech tag; whether the token belongs to a chunk; whether it is a named entity (obtained with the GENIA tagger)
  • Level 3: topic ID (inferred by the LDA model)
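A sketch of a three-level feature function for one token. The per-token record layout and its field names are hypothetical stand-ins for however the GENIA tagger output and LDA topic assignments are stored, and the cue list is a tiny illustrative sample, not the pre-extracted list from the paper.

```python
# Hypothetical per-token record: surface form plus GENIA tagger output
# (lemma, POS, chunk, named entity) and an LDA topic ID. Field names
# are illustrative, not the tagger's actual output format.
CUE_LIST = {"possible", "suggest", "may", "likely"}  # sample cues only

def token_features(sent, i):
    t = sent[i]
    feats = {
        # Level 1: token, cue-list membership, [-2, 2] context window
        "token": t["word"].lower(),
        "is_cue": t["word"].lower() in CUE_LIST,
    }
    for d in (-2, -1, 1, 2):
        j = i + d
        feats["token[%+d]" % d] = (
            sent[j]["word"].lower() if 0 <= j < len(sent) else "<PAD>")
    # Level 2: lemma, POS, chunk and named-entity tags (GENIA tagger)
    feats.update({"lemma": t["lemma"], "pos": t["pos"],
                  "chunk": t["chunk"], "ne": t["ne"]})
    # Level 3: topic ID inferred by the LDA model
    feats["topic"] = t["topic"]
    return feats

sent = [
    {"word": "It", "lemma": "it", "pos": "PRP", "chunk": "B-NP", "ne": "O", "topic": "T7"},
    {"word": "is", "lemma": "be", "pos": "VBZ", "chunk": "B-VP", "ne": "O", "topic": "T7"},
    {"word": "possible", "lemma": "possible", "pos": "JJ", "chunk": "B-ADJP", "ne": "O", "topic": "T3"},
]
print(token_features(sent, 2))
```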
5. Analysis and Conclusion
• Hedges form a relatively "closed" set
• A significant improvement can be found between the baselines and all the other experimental settings
• Sequence labeling significantly outperforms both naive methods
• The topics generated by LDA are effective features
• Our work suggests a promising research direction: incorporating topical information into hedge detection
Thank you! sukia@pku.edu.cn