Term Burstiness in WSD and Pseudo Relevance Feedback
Atelach Alemu Argaw, March 2006
Burstiness Model (Sarkar et al.)
• Model gaps between term occurrences (not term occurrence counts)
• Mixture of exponential distributions
• Model the amount of time until a specific event occurs
• Between-burst (1/λ1, or λ1')
• Within-burst (1/λ2, or λ2')
• Reference: Sarkar, Avik; De Roeck, Anne; Garthwaite, Paul H. Term Re-occurrence Measures for Analyzing Style. In Proceedings of the SIGIR 2005 Workshop on Stylistic Analysis of Text for Information Access, 2005.
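As a concrete illustration of the mixture, here is a minimal sketch of the gap density in Python. The parameter names (p for the within-burst mixing weight, lam1 and lam2 for the two rates) are this sketch's own choices, not taken from the slides.

import math

def gap_density(gap, p, lam1, lam2):
    """Two-component exponential mixture over gaps between term occurrences.

    With probability p the gap is a short within-burst gap (rate lam2),
    otherwise a long between-burst gap (rate lam1). Illustrative sketch only.
    """
    between = lam1 * math.exp(-lam1 * gap)  # between-burst component, mean 1/lam1
    within = lam2 * math.exp(-lam2 * gap)   # within-burst component, mean 1/lam2
    return (1.0 - p) * between + p * within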
Burstiness Model (Sarkar et al.)
• First occurrence of a term in a document: handled as a special case
• No occurrence in a document: handled by censoring
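One way to read these two cases, sketched under the assumptions that the first-occurrence gap is measured from the start of the document and governed by the between-burst rate, and that non-occurrence is censored at the document length; Sarkar et al. give the exact treatment.

import math

def first_occurrence_loglik(position, lam1):
    """Gap from the start of the document to the first occurrence,
    modelled here with the between-burst rate lam1 (simplifying assumption)."""
    return math.log(lam1) - lam1 * position

def censored_loglik(doc_length, lam1):
    """Document in which the term never occurs: we only know the
    first-occurrence gap exceeds the document length, so the contribution
    is the survival function exp(-lam1 * doc_length)."""
    return -lam1 * doc_length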
Burstiness Model
• Bayesian parameter estimation
• posterior ∝ prior × likelihood
• P(θ | D) ∝ P(θ) × P(D | θ)
• Choose an uninformative prior
• Estimate the posterior using Gibbs sampling (MCMC): repeatedly draw random samples and use the sample values to estimate the posterior
• WinBUGS software
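The slides use WinBUGS; as a stand-in, here is a toy Gibbs sampler for the two-component exponential mixture with conjugate Gamma priors on the rates and a Beta prior on the mixing weight. It is only a sketch of the estimation idea, not the authors' model: censoring and first occurrences are ignored, and label switching is not handled.

import numpy as np

def gibbs_exponential_mixture(gaps, iters=6000, burn_in=1000, seed=0):
    """Toy Gibbs sampler for a two-component exponential mixture over gaps."""
    rng = np.random.default_rng(seed)
    x = np.asarray(gaps, dtype=float)
    n = len(x)
    a, b = 1.0, 1.0                  # weakly informative Gamma(a, b) prior on rates
    lam = np.array([0.01, 1.0])      # initial between-burst and within-burst rates
    p = 0.5                          # initial probability of a within-burst gap
    kept = []
    for t in range(iters):
        # 1. sample the latent component label of each gap
        w1 = (1.0 - p) * lam[0] * np.exp(-lam[0] * x)
        w2 = p * lam[1] * np.exp(-lam[1] * x)
        z = rng.random(n) < w2 / (w1 + w2 + 1e-300)   # True -> within-burst
        n2 = z.sum()
        # 2. sample the rates from their Gamma full conditionals
        lam[0] = rng.gamma(a + n - n2, 1.0 / (b + x[~z].sum()))
        lam[1] = rng.gamma(a + n2, 1.0 / (b + x[z].sum()))
        # 3. sample the mixing weight from its Beta full conditional
        p = rng.beta(1.0 + n2, 1.0 + n - n2)
        if t >= burn_in:
            kept.append((lam[0], lam[1], p))
    return np.array(kept)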
Parameter estimates (Sarkar et al.)
λ1' = 1/λ1
• The mean of the exponential distribution with parameter λ1
• Rarity of a term in the corpus: the average gap at which the term occurs if it has not occurred recently
λ2' = 1/λ2
• The rate of re-occurrence of a term given that it has occurred recently
• Within-document burstiness
p1
• Probability of a term occurring at rate λ1'
p2
• Probability of a term occurring at rate λ2'
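Assuming posterior draws like those produced by the sketch sampler above (columns lam1, lam2, p), the slide's four quantities could be summarised as follows; the helper name and the use of posterior means are this sketch's own choices.

import numpy as np

def summarize(samples):
    """samples: array of shape (n_draws, 3) with columns (lam1, lam2, p)."""
    lam1, lam2, p = np.asarray(samples).mean(axis=0)  # posterior means
    return {
        "lambda1_prime": 1.0 / lam1,  # average gap when the term has not occurred recently
        "lambda2_prime": 1.0 / lam2,  # within-document burstiness (short re-occurrence gaps)
        "p1": 1.0 - p,                # probability of occurring at rate lambda1'
        "p2": p,                      # probability of occurring at rate lambda2'
    }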
Burstiness Model (Sarkar et al.)
Word behaviours
• Small λ1', small λ2': frequently occurring function word
• Large λ1', small λ2': bursty content word
• Small λ1', large λ2': frequent but well-spaced function word
• Large λ1', large λ2': infrequent, scattered function word
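A small helper that maps an estimated (λ1', λ2') pair onto the four behaviours above; the cutoff separating "small" from "large" is purely hypothetical, since the slides give no thresholds.

def word_behaviour(lam1_prime, lam2_prime, cutoff=50.0):
    """Classify a term by its estimated mean gaps; cutoff is illustrative."""
    small1 = lam1_prime < cutoff
    small2 = lam2_prime < cutoff
    if small1 and small2:
        return "frequently occurring function word"
    if not small1 and small2:
        return "bursty content word"
    if small1 and not small2:
        return "frequent but well-spaced function word"
    return "infrequent, scattered function word"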
Test run
Data
• Europarl
• English: 164K (morphology, POS)
• Swedish: 130K
• Converted to numeric format
Pilot run
• 1,000 iterations burn-in
• a further 5,000 iterations for the estimates
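"Converted to numeric format" presumably means turning each document's token positions for a term into gap sequences. One plausible sketch, with the first gap measured from the start of the document and empty documents reported as censored lengths:

def term_gaps(positions, doc_length):
    """positions: sorted 0-based token positions of the term in one document.

    Returns (gaps, censored_length): censored_length is the document length
    when the term never occurs, otherwise None. This is only one plausible
    reading of the slide's "converted to numeric format".
    """
    if not positions:
        return [], doc_length
    gaps = [positions[0] + 1]                                  # gap to first occurrence
    gaps += [b - a for a, b in zip(positions, positions[1:])]  # gaps between occurrences
    return gaps, None

The pilot settings on the slide would then correspond to running the sketch sampler as gibbs_exponential_mixture(gaps, iters=6000, burn_in=1000), discarding the first 1,000 draws and keeping 5,000 for the estimates.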
Discussion Points
• Convergence
• Inclusion of POS and morphological analysis vs. more data
• How could context information be included?
• Does the data have to be parallel?
• WSD vs. topicality and pseudo relevance feedback