Evaluating Vector Space Models using "Högskoleprovet"
Poster for: GSLT Internal Conference 2004
Leif Grönqvist, 23 October, GSLT & MSI@VxU
Experiment Setup
• Training data
  • Newspaper texts
  • Size: about 10 million tokens (?)
• Test data (HP200)
  • ORD (the vocabulary section) from Högskoleprovet (the Swedish Scholastic Aptitude Test), 5 years (200 questions)
  • Example: ansats ("attempt, approach")
    • A. sammanfattning ("summary")
    • B. syfte ("purpose")
    • C. fortsättning ("continuation")
    • D. försök ("attempt")
    • E. granskning ("review")
Basic Approach
• Calculate a vector space model from the training data (I will use SVD for dimensionality reduction)
• For each question:
  • Calculate vectors for the question word and for each alternative
  • Select the alternative whose vector is closest to the question word vector
• The same procedure can be used to evaluate different vector space models!
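The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the setup used for the poster: the four-document corpus is invented, the function names (build_word_vectors, cosine, answer) are hypothetical, the matrix is a plain term-document count matrix, and cosine similarity stands in for "closest".

```python
import numpy as np

def build_word_vectors(docs, k=3):
    """Return {word: k-dimensional vector} from a term-document count matrix."""
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for w in doc:
            counts[index[w], j] += 1.0
    # Truncated SVD: keep only the k largest singular values.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    reduced = u[:, :k] * s[:k]
    return {w: reduced[i] for w, i in index.items()}

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def answer(vectors, question, alternatives):
    """Pick the alternative whose vector is closest to the question word."""
    q = vectors[question]
    return max(alternatives, key=lambda alt: cosine(q, vectors[alt]))

# Toy corpus (invented for illustration): each inner list is one document.
docs = [
    ["ansats", "försök"],
    ["ansats", "försök"],
    ["syfte", "granskning"],
    ["sammanfattning", "fortsättning"],
]
vectors = build_word_vectors(docs, k=3)
alternatives = ["sammanfattning", "syfte", "fortsättning", "försök", "granskning"]
choice = answer(vectors, "ansats", alternatives)
print(choice)
```

The toy corpus is rigged so that "försök" always co-occurs with "ansats", so that alternative is selected here; with real newspaper text the outcome depends on the corpus, the weighting scheme, and the number of dimensions kept.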
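But: The tests contain phrases!
• Example: psykoprofylax ("psychoprophylaxis")
  • A. återspegling av känslolivet i t.ex. gester och kroppshållning ("reflection of one's emotional life in, e.g., gestures and posture")
  • B. förmåga att med tankekraft sätta föremål i rörelse ("ability to set objects in motion by the power of thought")
  • C. förmåga att se in i framtiden ("ability to see into the future")
  • D. metod för att förhindra oro och smärta vid t.ex. förlossning ("method for preventing anxiety and pain during, e.g., childbirth")
  • E. mätning av mentala prestationer, förmågor och personlighetsdrag ("measurement of mental performance, abilities and personality traits")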
Main problem
• Try to build a vector space that handles phrases
• Ordinary LSI:
  • The vector corresponding to a word A is the "meaning" of A
  • meaning(A B C) = meaning(A) + meaning(B) + meaning(C)
• But how could we then know that:
  • reda av = göra soppa eller sås tjockare ("to make soup or sauce thicker")
  • (that meaning of reda is very rare)
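The additive rule meaning(A B C) = meaning(A) + meaning(B) + meaning(C) can be written down directly. The sketch below is an illustration only: the two-dimensional vectors are made-up numbers, and phrase_vector is a hypothetical helper name. It also shows one limitation of plain addition, namely that word order is ignored.

```python
import numpy as np

def phrase_vector(vectors, phrase):
    """Additive composition: sum the vectors of the known words in a phrase.

    Returns None when no word of the phrase has a vector.
    """
    known = [vectors[w] for w in phrase.split() if w in vectors]
    if not known:
        return None
    return np.sum(known, axis=0)

# Made-up 2-dimensional word vectors, for illustration only.
vectors = {
    "göra": np.array([1.0, 0.0]),
    "soppa": np.array([0.0, 1.0]),
    "tjockare": np.array([0.5, 0.5]),
}

forward = phrase_vector(vectors, "göra soppa tjockare")
shuffled = phrase_vector(vectors, "tjockare soppa göra")
print(forward, shuffled)
```

Both calls give the same vector, because a sum does not depend on the order of its terms. A word sense that is rare in the training data contributes little to the word's single vector, which is the difficulty the slide points at.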
Improvements(?)
Both have to be done during tokenization:
• Improvement 1: add tuples up to length n
  • "president Bill Clinton" → president, Bill, Clinton, president_Bill, Bill_Clinton, president_Bill_Clinton
• Dependency improvement:
  • Run the MALT parser
  • Create and include tuples according to the dependencies
  • "president Bill Clinton" → president, Bill, Clinton, Bill_Clinton, president_Bill_Clinton
• Ultimate improvement: combine them?
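Improvement 1 is simple enough to sketch. The function below is a hypothetical helper (the name add_tuples and the underscore joiner are taken from the slide's example, nothing else is from the poster) that emits all contiguous tuples up to length n. The dependency-based variant is not sketched, since it depends on the parser output.

```python
def add_tuples(tokens, n=3):
    """Return the tokens plus all contiguous tuples of length 2..n, joined by '_'."""
    out = []
    for length in range(1, n + 1):
        for start in range(len(tokens) - length + 1):
            out.append("_".join(tokens[start:start + length]))
    return out

expanded = add_tuples(["president", "Bill", "Clinton"], n=3)
print(expanded)
```

For the slide's example this yields president, Bill, Clinton, president_Bill, Bill_Clinton and president_Bill_Clinton, in that order. Note that the vocabulary grows quickly with n, which matters for the size of the matrix handed to the SVD.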
No results yet, but more problems
• HP200 contains 214 distinct words
  • 10 are not present in the training data (åsiktsmässig, porslinsmålning, humus, igångsättning, fröställning, …)
  • 30 have a frequency between 1 and 4
• If we don't know the question word we have to guess, so the result equals the baseline
• If we don't know some of the alternatives, should we ever guess an unknown alternative?
• Maybe compound analysis is needed…
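A coverage check of this kind can be sketched as follows. This is an assumed reconstruction, not the script behind the figures above: coverage_report is a hypothetical name, the token list is a toy stand-in for the newspaper corpus, and the threshold of 4 mirrors the "between 1 and 4" band on the slide.

```python
from collections import Counter

def coverage_report(training_tokens, test_words, rare_max=4):
    """Split the distinct test words into missing and rare (1..rare_max) ones."""
    freq = Counter(training_tokens)
    distinct = set(test_words)
    missing = sorted(w for w in distinct if freq[w] == 0)
    rare = sorted(w for w in distinct if 1 <= freq[w] <= rare_max)
    return missing, rare

# Toy stand-in for the training corpus (invented counts).
training_tokens = ["ansats"] * 10 + ["syfte"] * 3
test_words = ["ansats", "syfte", "humus"]

missing, rare = coverage_report(training_tokens, test_words)
print("missing:", missing)
print("rare:", rare)
```

Words in the missing list have no vector at all, so a question built on one of them can only be answered by guessing; words in the rare list do get a vector, but one estimated from very few occurrences.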
Cooperation with SICS
• Magnus Sahlgren will try the same experiment using RI (Random Indexing) and other training data
• We will then try the tuned systems with the same training data
• Compared to TOEFL, ORD200 seems to be much harder
  • Many phrases
  • More difficult and old-fashioned words
  • Easier to find good training data for English
• Main goal: to use ORD200 for evaluation