TopicXP: Exploring Topics in Source Code using Latent Dirichlet Allocation. Trevor Savage, Bogdan Dit, Malcom Gethers and Denys Poshyvanyk. 26th IEEE International Conference on Software Maintenance, Timişoara, Romania, September 16, 2010. Latent Dirichlet Allocation (LDA).
TopicXP: Exploring Topics in Source Code using Latent Dirichlet Allocation • Trevor Savage, Bogdan Dit, Malcom Gethers and Denys Poshyvanyk • 26th IEEE International Conference on Software Maintenance, Timişoara, Romania, September 16, 2010
Latent Dirichlet Allocation (LDA) • Probabilistic topic model (Latent Dirichlet Allocation, LDA [Blei’03]) • Models each document as a mixture of topics
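To make "a document is a mixture of topics" concrete, here is a minimal sketch of the LDA generative view, using only the Python standard library. It is an illustration, not the TopicXP implementation: the two topics, their vocabularies, the `alpha` value and the function names are all hypothetical, and each topic is assumed to be a fixed word distribution rather than something learned from source code.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw topic proportions from a Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, n_words, rng):
    """Generate one document as a mixture of topics (LDA's generative story).

    topics: list of dicts mapping word -> probability (hypothetical, fixed).
    alpha:  symmetric Dirichlet parameter (assumed value, not from the slides).
    """
    # Per-document topic mixture.
    theta = sample_dirichlet([alpha] * len(topics), rng)
    words = []
    for _ in range(n_words):
        # Pick a topic for this word according to the document's mixture.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        # Pick a word according to the chosen topic's word distribution.
        vocab, probs = zip(*topics[k].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

# Hypothetical topics, loosely resembling identifiers found in source code.
example_topics = [
    {"file": 0.5, "read": 0.3, "write": 0.2},
    {"button": 0.5, "click": 0.3, "window": 0.2},
]

if __name__ == "__main__":
    theta, words = generate_document(example_topics, 0.5, 10, random.Random(0))
    print(theta, words)
```

Fitting LDA runs this story in reverse: given only the words, it estimates the topic mixture of each document and the word distribution of each topic.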
Maximal Weighted Entropy (MWE) • Occupancy(tj) captures the average probability of topic tj • Distribution(tj) captures the distribution of tj, measured using information entropy • MWE(Cj) = max(Occupancy(tj) × Distribution(tj)), with the maximum taken over the topics tj
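The slide gives only the outline of MWE, so the sketch below fills in the details with labeled assumptions: the input is a hypothetical matrix `theta` of topic probabilities with one row per unit being compared (for example, per method of a class), Occupancy is the plain mean of a topic's column, and Distribution is the entropy of that column after normalizing it to sum to 1, divided by the log of the number of rows so it lies in [0, 1]. The exact normalization used by TopicXP may differ.

```python
import math

def occupancy(theta, t):
    """Average probability of topic t across all rows (assumed: plain mean)."""
    return sum(row[t] for row in theta) / len(theta)

def distribution(theta, t):
    """Normalized entropy of topic t's spread across rows, in [0, 1].

    Assumption: the column is rescaled to sum to 1 and the entropy is
    divided by log(number of rows); the slides do not state this detail.
    """
    n = len(theta)
    total = sum(row[t] for row in theta)
    if total == 0 or n < 2:
        return 0.0
    entropy = 0.0
    for row in theta:
        q = row[t] / total
        if q > 0:
            entropy -= q * math.log(q)
    return entropy / math.log(n)

def mwe(theta):
    """Maximal Weighted Entropy: max over topics of Occupancy x Distribution."""
    n_topics = len(theta[0])
    return max(occupancy(theta, t) * distribution(theta, t)
               for t in range(n_topics))

if __name__ == "__main__":
    # Hypothetical rows: every row is dominated by topic 0.
    shared = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
    # Hypothetical rows: each row uses a different topic exclusively.
    split = [[1.0, 0.0], [0.0, 1.0]]
    print(mwe(shared), mwe(split))
```

Under these assumptions, a topic that is both probable and evenly spread across all rows yields a high score, while topics that each appear in a single row have zero entropy and contribute nothing.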
Thank you. Questions? SEMERU @ William and Mary http://www.cs.wm.edu/semeru/TopicXP