This deck covers MLR issues in international markets such as Japan, China, and Germany: doing more with less data, possibly blending different languages, new or different features, and query types that differ from English. Metrics designed for English may not suit Japanese MLR development, where features such as Query Word Length and phonetic URL match are important. Germany looks 10-20% ahead of Google by DCG; China has much more spam, while Japan has much less. Planned features include vcano match and matching segmented chunks.
International/JP MLR Issues
• Have to do more with less data
• Blending different languages?
• Can’t necessarily filter adult content
• May need new/different features
• Different types of queries: English, bracket, phrase, etc.
• Metrics designed for English
• China has lots more spam
• Japan has much less spam
• Germany looks 10-20% ahead of Google by DCG
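As a rough illustration of what "ahead by DCG" means, the sketch below computes discounted cumulative gain for two ranked result lists and the relative lead of one over the other. The slides do not say which DCG variant, cutoff, or judgment scale was used; the log2 discount, the cutoff `k`, and the function names `dcg` and `relative_dcg_lead` are assumptions made for this example only.

```python
import math


def dcg(gains, k=None):
    """Discounted cumulative gain of a ranked list of relevance gains.

    Assumes the common log2 discount: gain at rank r is divided by log2(r + 1).
    """
    if k is not None:
        gains = gains[:k]
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def relative_dcg_lead(ours, theirs, k=5):
    """Fractional DCG lead of `ours` over `theirs` (0.10 means 10% ahead)."""
    base = dcg(theirs, k)
    if base == 0:
        raise ValueError("baseline DCG is zero; relative lead is undefined")
    return (dcg(ours, k) - base) / base


# Hypothetical editorial judgments (higher is better) for one query.
our_results = [3, 3, 2]
their_results = [3, 2, 2]
print(f"lead: {relative_dcg_lead(our_results, their_results):.1%}")
```

In practice the lead would be averaged over a large query sample rather than read off a single query.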
Different features important for JP
• http://internal.inktomi.com/~lukeb/FeatureImportance.html
• “Linkflux”
• How soon the word appears in the document
• Is the first word in query in the title
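Two of the features listed above are simple enough to sketch. The code below is one possible reading, not the production definition: it assumes the query, title, and document have already been segmented into tokens (which Japanese text requires), and the function names and the normalization to the range 0 to 1 are hypothetical choices made for this example.

```python
def first_occurrence_position(word, doc_tokens):
    """How soon `word` appears in the document.

    Returns the index of the first match divided by the document length,
    so 0.0 means the very first token. Returns 1.0 if the word is absent
    or the document is empty.
    """
    target = word.lower()
    for i, token in enumerate(doc_tokens):
        if token.lower() == target:
            return i / len(doc_tokens)
    return 1.0


def first_query_word_in_title(query_tokens, title_tokens):
    """True if the first query word occurs anywhere in the title."""
    if not query_tokens:
        return False
    title_set = {t.lower() for t in title_tokens}
    return query_tokens[0].lower() in title_set


# Hypothetical pre-segmented inputs.
query = ["東京", "ホテル"]
title = ["東京", "の", "ホテル", "予約"]
body = ["格安", "ホテル", "を", "東京", "で", "探す"]
print(first_query_word_in_title(query, title))
print(first_occurrence_position("東京", body))
```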
New features for JP
• Query Word Length very important
• Query type important
• Phonetic URL match
• Future:
• vcano match
• Matching segmented chunks
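A minimal sketch of the first two features follows. The slides do not define them precisely, so this assumes that Query Word Length means the character length of each segmented query word and that query type is one of the categories named earlier (English, bracket, phrase). The bracket set, the `"other"` fallback, and both function names are illustrative assumptions.

```python
# Bracket characters treated as marking a "bracket" query (an assumed set).
BRACKETS = "「」『』【】[]()（）"


def query_word_lengths(query_tokens):
    """Character length of each pre-segmented query word."""
    return [len(token) for token in query_tokens]


def query_type(query):
    """Classify a raw query string into a coarse type.

    Checks run in order: bracket, phrase (wrapped in double quotes),
    english (ASCII only), then "other" for everything else.
    """
    q = query.strip()
    if any(ch in BRACKETS for ch in q):
        return "bracket"
    if len(q) >= 2 and q[0] == '"' and q[-1] == '"':
        return "phrase"
    if q and q.isascii():
        return "english"
    return "other"


for raw in ["「東京」 ホテル", '"new york"', "hotel tokyo", "東京 ホテル"]:
    print(raw, "->", query_type(raw))
```

Both values could then be fed to the ranker as categorical or numeric inputs alongside the existing features.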