UMASS-Amherst at TDT 2004: Unsupervised and Supervised Tracking. Hema Raghavan
Outline • Create a training corpus • Unsupervised tracking • Supervised Tracking • Discussion
Creating a training corpus • For tracking • 50% of topics are English • 50% are multilingual • Created a training corpus (supervised and unsupervised) • 30 topics from TDT4 • 50% of stories with primarily English topics • 50% multilingual stories
Unsupervised Tracking: Ideas • Models • Vector Space • Relevance Models • Adaptation • Native Language comparisons
Unsupervised Tracking Models • Vector Space • TF-IDF • IDF is incremental • Relevance Models • State of the art, high performance system • Adaptation
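The vector-space model with incremental IDF and adaptation can be sketched roughly as follows. This is a minimal illustration, not the UMASS system: the class name, the smoothed IDF formula, the centroid-style adaptation, and both thresholds are assumptions made for the example.

```python
import math
from collections import Counter


class IncrementalTfidf:
    """TF-IDF weighting whose document frequencies grow as stories arrive."""

    def __init__(self):
        self.num_docs = 0
        self.doc_freq = Counter()

    def observe(self, tokens):
        """Update corpus statistics with one newly arrived story."""
        self.num_docs += 1
        self.doc_freq.update(set(tokens))

    def vector(self, tokens):
        """Weight a story using only the statistics seen so far.

        The smoothed IDF below is an assumed form, chosen so that every
        weight stays positive; the slides do not give the exact formula.
        """
        tf = Counter(tokens)
        return {
            term: count * math.log((self.num_docs + 1) / (self.doc_freq[term] + 0.5))
            for term, count in tf.items()
        }


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def track_unsupervised(model, topic, stories, yes_thresh, adapt_thresh):
    """Emit YES/NO per story; fold confident stories back into the topic.

    Adaptation here simply adds the story vector to the topic vector
    (an assumption) whenever the score reaches adapt_thresh.
    """
    decisions = []
    for tokens in stories:
        model.observe(tokens)          # IDF is updated before scoring
        vec = model.vector(tokens)
        score = cosine(topic, vec)
        decisions.append(score >= yes_thresh)
        if score >= adapt_thresh:
            for term, weight in vec.items():
                topic[term] = topic.get(term, 0.0) + weight
    return decisions
```

Because the IDF statistics are updated before each story is scored, no information from later stories leaks into earlier decisions, which is the point of keeping IDF incremental.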
Native Language Hypothesis • TDT tasks involve comparisons of models: • Story link detection: sim(Si, Sj) • Topic tracking: sim(Si, Tj) • It is more effective to measure similarity between models in the original language of the stories, than after machine translation into English • Quality of translation • Differences in score distributions • Trivially obvious? Hard to demonstrate in tracking
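One way to picture the native-language comparison: keep one topic model per language and score each story against the model in its own language, falling back to the machine-translated English text only when no native model exists. The data layout, the language codes, and the fallback rule are assumptions for this sketch.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def native_similarity(story, topic_models):
    """Score a story against the topic model in the story's own language.

    story: dict with 'lang', 'native' (original tokens) and 'english'
           (machine-translated tokens); a hypothetical layout.
    topic_models: language code -> Counter built from on-topic stories.
    """
    lang = story["lang"]
    if lang in topic_models:
        # Native comparison: no translation noise on either side.
        return cosine(Counter(story["native"]), topic_models[lang])
    # Assumed fallback: compare the English translation to the English model.
    return cosine(Counter(story["english"]), topic_models["eng"])
```

Scores produced by different language models may follow different distributions, so in practice a per-language threshold or score normalization would probably be needed as well.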
Submitted Runs • TF-IDF (UMASS4) • TF-IDF + adaptation (UMASS1) • TF-IDF + adaptation + native models (UMASS2) • Relevance Models + adaptation (UMASS5) • All submissions for primary evaluation condition.
Supervised Tracking • Creating a newswire-only training corpus • Ideas • Models • Vector Space • Relevance Models • Native Language comparisons • Incremental Thresholds • Negative Feedback
Incremental Thresholds • Utility [formula missing in source] • Relevance judgments for both Hits and False-Alarms • Increment the YES/NO threshold by [step size missing in source] when Utility falls below zero.
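A sketch of the incremental-threshold idea, under stated assumptions: utility is taken to be linear (+2 per hit, -1 per false alarm, in the style of the TREC filtering utility), the step size is a free parameter, and the threshold is raised after every judged story that leaves utility below zero. None of these specifics come from the slide.

```python
def track_with_incremental_threshold(scores, labels, threshold, step,
                                     hit_gain=2.0, fa_penalty=1.0):
    """Supervised tracking loop that tightens the YES/NO threshold.

    scores: similarity of each incoming story to the topic.
    labels: True if the story is on topic (revealed only after a YES).
    hit_gain, fa_penalty, step: assumed values, not from the slides.
    """
    utility = 0.0
    decisions = []
    for score, on_topic in zip(scores, labels):
        say_yes = score >= threshold
        decisions.append(say_yes)
        if say_yes:
            # A relevance judgment arrives for every delivered story,
            # whether it turns out to be a hit or a false alarm.
            utility += hit_gain if on_topic else -fa_penalty
            if utility < 0:
                threshold += step   # become more conservative
    return decisions, threshold, utility
```

For example, with a starting threshold of 0.25 and a step of 0.25, two false alarms raise the threshold to 0.75 before a later hit brings utility back to zero.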
Negative Feedback • Relevance judgments for both Hits and False-Alarms • [update formula missing in source] for a hit. • [update formula missing in source] for a false alarm.
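Since the update formulas are not preserved, the following is only a Rocchio-style stand-in for negative feedback: a hit adds the story vector to the topic model, a false alarm subtracts a scaled copy. The function name, the weights `alpha` and `beta`, and the clipping of non-positive weights are all assumptions.

```python
def apply_feedback(topic, story, on_topic, alpha=1.0, beta=0.5):
    """Return an updated topic vector after one relevance judgment.

    topic, story: sparse term-weight dictionaries.
    on_topic: True for a hit, False for a false alarm.
    alpha, beta: assumed feedback weights (Rocchio-style).
    """
    scale = alpha if on_topic else -beta
    updated = dict(topic)
    for term, weight in story.items():
        updated[term] = updated.get(term, 0.0) + scale * weight
    # Drop terms pushed to zero or below, a common Rocchio convention.
    return {term: w for term, w in updated.items() if w > 0.0}
```

A hit pulls the model toward the story's vocabulary, while a false alarm pushes it away from terms that caused the wrong YES decision.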
Submitted Runs • Rel. Models (UMASS-2) • Optimized for TDT cost • Rel. Models + inc. thresholds (UMASS-1) • TF-IDF + adaptation + neg. feedback + inc. thresholds (UMASS-3) • TF-IDF + adaptation + native models (UMASS-4) • TF-IDF + adaptation + native models + neg. feedback + inc. thresholds (UMASS-7) • Optimized for T11SU
Supervised Tracking Results • Cost: 0.0467
Results and Discussion • Supervision clearly helps. • Relevance models: a clear winner. • Negative feedback helps. • The training set did not reflect the test set very well. • Min-cost versus T11SU
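To make the min-cost versus T11SU contrast concrete, here is a sketch of the two measures as they are usually defined in the TDT and TREC-11 filtering evaluations. The constants (C_miss = 1, C_fa = 0.1, P_target = 0.02, MinNU = -0.5) are the commonly cited defaults and should be checked against the official evaluation plans; the function names are made up for this example.

```python
def normalized_detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """TDT-style detection cost, normalized by the best trivial system.

    Lower is better; a system that always says NO scores 1.0.
    """
    cost = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    return cost / min(c_miss * p_target, c_fa * (1.0 - p_target))


def t11su(hits, false_alarms, total_on_topic, min_nu=-0.5):
    """TREC-11 style scaled utility for one topic.

    Higher is better; the result lies in [0, 1].
    """
    if total_on_topic <= 0:
        raise ValueError("total_on_topic must be positive")
    # Linear utility (2 per hit, -1 per false alarm) over its maximum.
    t11nu = (2 * hits - false_alarms) / (2 * total_on_topic)
    return (max(t11nu, min_nu) - min_nu) / (1.0 - min_nu)
```

The cost measure weights misses heavily relative to false alarms, whereas T11SU penalizes each false alarm directly and saturates at zero, so a threshold tuned for one measure is generally not optimal for the other.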
Future Work • Exploration-exploitation trade-off. • What about feedback that is less on demand? • More realistic • Can add costs for judgments. • What about feedback as in the HARD task (clarification forms)?