Applying Text Classification in Conference Management: Some Lessons Learned
Andreas Pesenhofer, Helmut Berger, Michael Dittenbach, Andreas Rauber
Overview
• Conference Management Systems
• Classification & Clustering
• Case Studies
  • ECDL 2005
  • ECR
• Conclusions
Conference Management Systems
• Set of tools to support the conference workflow
• Basic support for paper submission & review collection
• Many tasks for further automation
  • Selection of the program committee
  • Topic assignment of submissions
  • Paper-to-reviewer assignment
  • Support in review generation
  • Poster arrangement
  • Post-conference access to papers
Classification & Clustering
• Topic assignment of submissions
  • Problem: authors are uncertain about the precise topic assignment (conference terminology)
  • Solution: support by automatic assignment
  • Method: automatic text classification (ATC) based on abstracts
• Poster arrangement & post-conference access to papers
  • Problem: topic-based arrangement
  • Solution: clustering
  • Method: self-organizing map (SOM) & Mnemonic SOM
ATC for topic assignment
• Train model based on previous conferences
• Abstract submission
• Automatic assignment
• Confirmation
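The four steps above can be sketched roughly as follows. The slides do not name the classifier, so the multinomial naive Bayes model here is an assumption chosen for brevity. The class name `NaiveBayesTopicModel`, the toy abstracts and the topic labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return cleaned.split()


class NaiveBayesTopicModel:
    """Multinomial naive Bayes with add-one smoothing (an assumed classifier)."""

    def fit(self, abstracts, topics):
        # Step 1: train on abstracts and topics of previous conferences.
        self.doc_counts = Counter(topics)
        self.term_counts = defaultdict(Counter)
        for text, topic in zip(abstracts, topics):
            self.term_counts[topic].update(tokenize(text))
        self.vocabulary = {t for c in self.term_counts.values() for t in c}
        return self

    def rank_topics(self, abstract):
        """Step 3: return topics sorted from most to least likely."""
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(abstract) if t in self.vocabulary]
        scores = {}
        for topic, n_docs in self.doc_counts.items():
            counts = self.term_counts[topic]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            scores[topic] = score
        return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data standing in for abstracts of previous conferences.
previous_abstracts = [
    "metadata harvesting for digital library interoperability",
    "metadata schema mapping across library repositories",
    "user study of search interfaces and query behaviour",
    "evaluation of interface usability with user feedback",
]
previous_topics = ["metadata", "metadata", "users", "users"]

model = NaiveBayesTopicModel().fit(previous_abstracts, previous_topics)

# Step 2: a newly submitted abstract arrives; step 3: topics are suggested.
suggestions = model.rank_topics("a new metadata schema for repositories")
suggested_topic = suggestions[0]
```

Returning a ranked list rather than a single label leaves room for the confirmation step: the first suggestion can be shown as the default and the remaining topics offered as alternatives.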
Clustering for organization
• Arrange posters thematically
• Non-rectangular SOMs reflecting the conference site
• Mnemonic SOMs simplify post-conference paper access
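As a rough illustration of a non-rectangular map, the sketch below trains a small SOM whose units exist only on the cells of a masked grid. The floor plan, the three-dimensional poster vectors and all training parameters are hypothetical. One simplification to note: neighbourhood distances are plain Euclidean distances on the grid, measured straight across any gaps in the shape.

```python
import math
import random

# Hypothetical floor plan: "#" marks a grid cell where a poster board can stand.
FLOOR_PLAN = [
    "####",
    "#..#",
    "####",
]
UNITS = [(r, c) for r, row in enumerate(FLOOR_PLAN)
         for c, cell in enumerate(row) if cell == "#"]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_unit(weights, vector):
    """Return the map unit whose weight vector is closest to the input."""
    return min(UNITS, key=lambda u: squared_distance(weights[u], vector))


def train_som(vectors, epochs=50, seed=0):
    """Online SOM training restricted to the units inside the floor plan."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {u: [rng.random() for _ in range(dim)] for u in UNITS}
    for epoch in range(epochs):
        progress = epoch / epochs
        rate = 0.5 * (1.0 - progress)            # learning rate decays
        radius = 0.5 + 2.0 * (1.0 - progress)    # neighbourhood shrinks
        for vector in rng.sample(vectors, len(vectors)):
            winner = best_unit(weights, vector)
            for unit in UNITS:
                grid_dist2 = squared_distance(unit, winner)
                influence = math.exp(-grid_dist2 / (2 * radius ** 2))
                w = weights[unit]
                for i in range(dim):
                    w[i] += rate * influence * (vector[i] - w[i])
    return weights


# Hypothetical term-weight vectors for four posters.
posters = {
    "P1": [1.0, 0.0, 0.0],
    "P2": [0.9, 0.1, 0.0],
    "P3": [0.0, 1.0, 0.0],
    "P4": [0.0, 0.0, 1.0],
}
weights = train_som(list(posters.values()))
placement = {name: best_unit(weights, vec) for name, vec in posters.items()}
```

Each poster is placed at the grid position of its best-matching unit, so every position falls on a cell that exists in the floor plan.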
ECDL 2005 – ATC data
• English abstracts of previous ECDL conferences
• Seven categories defined from the topics of the conference call
• Pre-processing: removal of all numbers, punctuation marks and special characters; transformation to lower case
• tf-idf weighting
• 4,141 unique terms
• Information gain (IG): 3,460 top-ranked terms, average accuracy over all categories of 58.60%
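The pre-processing and weighting steps listed above might look like the following minimal sketch. The slides do not state which tf-idf variant was used, so the plain `tf * log(N / df)` form is an assumption. The function names and sample abstracts are hypothetical.

```python
import math
from collections import Counter


def preprocess(text):
    """Drop digits, punctuation and special characters, then lower-case."""
    kept = "".join(ch if ch.isalpha() or ch.isspace() else " " for ch in text)
    return kept.lower().split()


def tfidf_vectors(documents):
    """Return ({term: weight} per document, sorted vocabulary).

    Weights use tf * log(N / df), one common tf-idf variant (assumed here).
    """
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors, sorted(doc_freq)


# Hypothetical abstracts, for illustration only.
abstracts = [
    "Digital libraries: 25 years of metadata!",
    "Metadata quality in digital archives (2005).",
    "User interfaces for music retrieval.",
]
vectors, vocabulary = tfidf_vectors(abstracts)
```

The length of `vocabulary` corresponds to the "unique terms" figure on the slide. With this variant, a term that occurs in every document receives weight zero.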
ECDL 2005 – SOM data
• Poster and paper organization:
  • Full text of accepted posters of ECDL 2005
  • Term selection based on minimal word length and document frequencies
  • 30 posters, 569 terms
• Post-conference access:
  • 71 papers and posters, 5,654 terms
ECR – Data
• Abstracts of the ECR: European Congress of Radiology
• Training set: ECR 2003 & 2004, 1,952 documents
• Test set: ECR 2005, 924 documents
• Same pre-processing steps as for the ECDL data
• Resulting in 14,887 unique terms
• Information gain (IG): 5,720 top-ranked terms, average accuracy over all categories of 73.57%
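Ranking terms by information gain and keeping the top-ranked ones can be sketched as below. This assumes the common formulation in which each term is scored by how much its presence or absence in a document reduces the entropy of the category distribution. The slides do not spell out the exact variant, and the tokenized documents and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def information_gain(docs, labels):
    """Score each term by the class-entropy reduction from its presence."""
    n_docs = len(docs)
    class_counts = Counter(labels)
    base = entropy(list(class_counts.values()))
    present = {}  # term -> Counter of labels of documents containing it
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            present.setdefault(term, Counter())[label] += 1
    gains = {}
    for term, with_term in present.items():
        without_term = class_counts - with_term
        n_with = sum(with_term.values())
        n_without = n_docs - n_with
        conditional = (n_with / n_docs) * entropy(list(with_term.values()))
        if n_without:
            conditional += (n_without / n_docs) * entropy(
                list(without_term.values()))
        gains[term] = base - conditional
    return gains


def top_terms(docs, labels, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    gains = information_gain(docs, labels)
    return sorted(gains, key=lambda t: (-gains[t], t))[:k]


# Hypothetical tokenized training abstracts with their categories.
docs = [
    ["chest", "ct", "imaging"],
    ["ct", "lung", "nodule"],
    ["mri", "brain", "imaging"],
    ["brain", "mri", "tumour"],
]
labels = ["chest", "chest", "neuro", "neuro"]
gains = information_gain(docs, labels)
selected = top_terms(docs, labels, 3)
```

In this toy set, "imaging" occurs equally often in both categories and therefore scores zero, while "ct", "mri" and "brain" each separate the two categories perfectly. In a train/test setup like the one above, the scores would be computed on the training set only.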
Conclusions
• Classification quality grows with the amount of training documents
• Structure of the classes matters (overlapping?)
• The bulk of submissions can be dealt with automatically
• May be used for session assignment
• Arrange posters & papers thematically
  • Easy to memorize & find
Questions?
E-Commerce Competence Center
Donau-City-Strasse 1, 1220 Vienna, Austria
Phone: +43/1/522 71 71-20
Fax: +43/1/522 71 71-71
Internet: http://www.ec3.at/
E-Mail: office@ec3.at