A Study of Web Logs for Personalizing the MultiLingual Information Access to The European Library
Outline • The European Library • Web Logs • MultiLingual Information Access (MLIA) • Web Log Analysis • Geographic provenance • Collections usage • Discussion • Conclusion and Future work
The European Library http://www.theeuropeanlibrary.org/ • Provides access to most of the European national digital libraries • Many different languages • Resources can be either digital or bibliographical (books, posters, maps, sound recordings, videos, etc.).
The European Library • The European Library collections are suitable for Learning/E-Learning • Quality / Reliability • How the analysis of web logs can help in implementing MLIA • Questions we posed at the beginning of our work: • How can E-learning systems deal with collections in different languages? • Are users interested in multilingual/cross-collections instruments? • What are their preferences? • Are they influenced by language?
The European Library • User interaction mainly on the client side • Web Server logs
HTTP Log: structure of the data • W3C Extended Log File Format 2007-01-01 00:04:05 192.87.31.35 GET /Index.html - 80 - 70. *.*.* Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+5.1;+.NET+CLR+1.1.4322) - http://www.google.com/search?sourceid=navclient&gfns=1&ie=UTF-8&rls=PCTA,PCTA:2006-33,PCTA:en&q=read+tokyopop+books+online cTargets=collections:a0000,collections:a0037,collections:a0200,collections:a0141,collections:a0010,collections:a0035,collections:a0086,collections:a0132,collections:a0067,collections:a0001,collections:a0062,collections:a0130,collections:a0163,collections:a0211,collections:a0194,collections:a0075,collections:a0073,collections:a0066;+TELSESSID=d551tvd9legbq3rh4l23rjkgh7;+AreCookiesEnabled=889;+cTargetsThemes=theme0 0 0 381 535 203
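The cookie field of the sample line carries the session identifier (TELSESSID) and the selected collections (cTargets). A minimal sketch of how that field could be split follows. It assumes that "+" stands for a space in the logged value, as it appears to in the sample; the function names are hypothetical and the cookie string is shortened from the sample line.

```python
def parse_cookie_field(cookie_field: str) -> dict:
    """Split a logged cookie string into a name -> value mapping."""
    cookies = {}
    # Assumption: spaces are logged as '+', so pairs look like 'a=1;+b=2'.
    for pair in cookie_field.replace("+", " ").split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip fragments that are not name=value pairs
            cookies[name] = value
    return cookies


def selected_collections(cookies: dict) -> list:
    """Return the collection codes listed in the cTargets cookie."""
    targets = cookies.get("cTargets", "")
    return [
        target.split(":", 1)[1]
        for target in targets.split(",")
        if target.startswith("collections:")
    ]


# Shortened version of the cookie field from the sample log line.
sample = (
    "cTargets=collections:a0000,collections:a0037;"
    "+TELSESSID=d551tvd9legbq3rh4l23rjkgh7;"
    "+AreCookiesEnabled=889;+cTargetsThemes=theme0"
)
cookies = parse_cookie_field(sample)
print(cookies["TELSESSID"])
print(selected_collections(cookies))
```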
Log analysis • October 1st 2006 to April 30th 2007 • 22,458,350 requests • 209,900 different sessions reconstructed using cookies [Diagram labels: On demand query, Logs, Log processing]
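Reconstructing sessions from cookies can be sketched as grouping log lines by their TELSESSID value. This is an illustration under stated assumptions, not the processing pipeline used in the study: the function name is hypothetical, the example lines are invented placeholders, and requests without a session cookie are simply skipped.

```python
import re
from collections import defaultdict

# Assumption: the session identifier is an alphanumeric token after 'TELSESSID='.
SESSION_RE = re.compile(r"TELSESSID=([A-Za-z0-9]+)")


def group_by_session(lines):
    """Group raw log lines by the TELSESSID cookie they carry."""
    sessions = defaultdict(list)
    for line in lines:
        match = SESSION_RE.search(line)
        if match:  # requests without the cookie cannot be assigned to a session
            sessions[match.group(1)].append(line)
    return dict(sessions)


# Invented placeholder lines, reduced to the parts that matter here.
example_lines = [
    "2007-01-01 00:04:05 GET /Index.html TELSESSID=abc123;+AreCookiesEnabled=889",
    "2007-01-01 00:04:09 GET /search.html TELSESSID=abc123;+AreCookiesEnabled=889",
    "2007-01-01 00:05:00 GET /Index.html TELSESSID=xyz789",
    "2007-01-01 00:06:00 GET /robots.txt -",
]
sessions = group_by_session(example_lines)
for session_id, requests in sessions.items():
    print(session_id, len(requests))
```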
MLIA • MultiLingual Information Access (MLIA) • “possibility for the users of the system to access and search the federated libraries in a personalized way that can allow them to access the collections of documents in their mother tongue and in other preferred languages” • Issues: • Interaction happens outside the system • Logs contain mainly navigational and browsing activity • No control over the queries sent • MLIA requires modifications to both TEL and the digital libraries' services • Control over the central index
MLIA • Isolated Query Translation • Translation and Retrieval are separated • Pseudo-translation • Central index translated [Diagram labels: Translation Component, Retrieval Component, Index (repeated)]
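The two approaches can be contrasted with a toy sketch. It reflects only one reading of the slide: in isolated query translation the query is translated first and then sent to each index, while in pseudo-translation the index terms themselves are translated ahead of time into a central index. The dictionary, indexes, record identifiers, and function names are all invented for illustration.

```python
# Toy bilingual dictionary (invented); a real system would use a translation component.
TRANSLATIONS = {
    ("en", "it"): {"library": "biblioteca"},
    ("en", "fr"): {"library": "bibliothèque"},
}

# One toy index per collection language: term -> record identifiers (invented).
INDEXES = {
    "it": {"biblioteca": ["it-1"]},
    "fr": {"bibliothèque": ["fr-7"]},
}


def translate(term, source, target):
    """Look up a term in the toy dictionary; fall back to the term itself."""
    if source == target:
        return term
    return TRANSLATIONS.get((source, target), {}).get(term, term)


def isolated_query_translation(term, user_lang):
    """Translate the query per collection, then retrieve from each index."""
    results = []
    for lang, index in INDEXES.items():
        results.extend(index.get(translate(term, user_lang, lang), []))
    return results


def build_pseudo_translated_index(user_lang):
    """Translate index terms into the user's language to build a central index."""
    central = {}
    for lang, index in INDEXES.items():
        # Reverse the dictionary so collection terms map back to user-language terms.
        reverse = {v: k for k, v in TRANSLATIONS.get((user_lang, lang), {}).items()}
        for term, records in index.items():
            central.setdefault(reverse.get(term, term), []).extend(records)
    return central


print(isolated_query_translation("library", "en"))
print(build_pseudo_translated_index("en").get("library", []))
```

In this toy setting both routes return the same records; the difference lies in where translation happens (at query time versus at indexing time), which is why the second route needs control over the central index.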
MLIA • Language to language context (+400 language resources) • Pivot language • Does the user like the query translation approach? • Poor interaction versus rich interaction • When should we prefer direct translation? • User geographical distribution • Collection usage • Language to language preferences • Promoting usage
Collections usage • Collection selection • Default list • First time user
Conclusions • How can E-learning systems deal with collections in different languages? • Isolated query translation versus pseudo-translation • Are users interested in multilingual/cross-collections instruments? • Data showed that there is a demand for multilingual contents and users are interested in more multilingual functionalities • What are their preferences? • The achieved results allowed The European Library to better understand users' preferences about multilingual resources (user distribution, collection selection, …) • Are they influenced by language? • There is a correlation between language and user behavior
Future work: new questions • Do users navigate the portal displaying data in their mother tongue or do they prefer to use the default language (English)? • 80% use the portal as it is • Do English users search only on English collections? • Query language not present in HTTP logs • Action logs • 53,475 sessions, 15,674 advanced searches, 783 searches based on language • Most: ENG, FRE, GER, ITA
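The action-log counts can be put in proportion with a short calculation. This sketch assumes the 783 language-based searches are a subset of the 15,674 advanced searches, which the slide does not state explicitly; the variable and function names are illustrative.

```python
# Counts taken from the action-log bullet above.
sessions = 53475
advanced_searches = 15674
language_searches = 783


def percentage(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Assumption: language-based searches are counted among the advanced searches.
language_share = percentage(language_searches, advanced_searches)

# Average number of advanced searches per session (not a share of sessions).
advanced_per_session = advanced_searches / sessions

print(f"Language-based share of advanced searches: {language_share:.1f}%")
print(f"Advanced searches per session: {advanced_per_session:.2f}")
```

Under that assumption, language-based searches make up roughly 5% of advanced searches, so language restriction is a rarely used feature in these logs.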