Collaborative Query Previews in Digital Libraries Lin Fu, Dion Goh, Schubert Foo Division of Information Studies School of Communication and Information Nanyang Technological University
Presentation Overview • Background • Query Previews and Collaborative Filtering • Collaborative Query Previews (CQPs) • System Design and Implementation • Advantages of the System • Future work
Background • Information Overload: • World Wide Web • Digital libraries • Information Seeking: • Information seeking is a broad term encompassing the ways individuals articulate their information needs, seek, evaluate, select and use information (Lokman & Stephanie, 2001) • Collaboration and communication are important • Pre-Query Information (PQI) • Information needs • Information system • Knowledge of the collection
Use of PQI in Information Retrieval Information Needs Information Systems Physical Collections Pre-Query Information Target Information Information Systems Digital Library Query Collection Knowledge Structure of the Collection Domain knowledge
Example of Collection Knowledge • Suppose a user wants to search for a paper on overview-detail style interfaces but does not know the title and is also a novice in this field. • The user enters “interface” or “overview, detail” as the query. However, nothing in the top 50 results rings a bell. • Someone else searching for the same paper might remember its name clearly (“Reading of Electronic Documents: The Usability of Linear, Fisheye, and Overview+Detail Interfaces”). This person knows that using “fisheye, overview, detail” as the query keywords will yield a good result.
Concept 1: Query Previews • Definition: • Query previews provide an overview of the data distribution in a data collection (Greene et al., 1999). • Overviews are represented as aggregate information on attributes of the collection, known as summary data. • The summary data is displayed using visualization techniques such as histograms and timelines.
Advantages of Query Previews: • Reduce queries with zero or a very large number of hits. • Prevent the retrieval of undesired records. • Represent statistical information about the database visually.
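To make the idea of summary data concrete, here is a minimal sketch of how a query preview could be computed. It is not taken from the presentation: the record fields (`year`, `type`) and the helper names `summary_data`, `preview_hits` and `text_histogram` are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical collection records; field names are illustrative assumptions.
records = [
    {"year": 1999, "type": "photo"},
    {"year": 1999, "type": "article"},
    {"year": 2001, "type": "photo"},
    {"year": 2002, "type": "photo"},
]

def summary_data(records, attribute):
    """Aggregate the collection: count records per value of one attribute."""
    return Counter(r[attribute] for r in records)

def preview_hits(records, **constraints):
    """Count records matching every constraint, so the user sees the hit
    count before committing to the full query."""
    return sum(
        all(r.get(key) == value for key, value in constraints.items())
        for r in records
    )

def text_histogram(counts):
    """Render the counts as a simple text histogram, one bar per value."""
    return [f"{value}: {'#' * n} ({n})" for value, n in sorted(counts.items())]

by_year = summary_data(records, "year")
for line in text_histogram(by_year):
    print(line)

# A preview of zero hits warns the user before the query is submitted.
print(preview_hits(records, year=2000))
```

A preview count of zero (as for `year=2000` here) is the signal that lets a user adjust the query instead of retrieving an empty result list.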
Concept 2: Collaborative Filtering • Definition: • Collaborative filtering is a technique for recommending items to a user based on similarities between the past behavior of the user and that of like-minded people (Chun & Hong, 2001) • Examples: • Tapestry: a system that can filter information according to other users’ annotations (Goldberg, Nichols, Oki & Terry, 1992) • GroupLens: a recommender system using user ratings of documents (Resnick, Courtiat & Villemur, 2001)
Advantages of Collaborative Filtering • Use the community for knowledge sharing. • Select high quality items from a large information stream.
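The definition above can be illustrated with a small user-based recommender. This is a generic sketch under stated assumptions, not the algorithm used by Tapestry or GroupLens: the ratings data, the choice of cosine similarity, and the function names `cosine` and `recommend` are all hypothetical.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}. Purely illustrative data.
ratings = {
    "alice": {"doc1": 5, "doc2": 3, "doc3": 4},
    "bob": {"doc1": 4, "doc2": 3, "doc4": 5},
    "carol": {"doc2": 1, "doc3": 2, "doc5": 5},
}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=3):
    """Score items the user has not rated by the similarity-weighted
    ratings of other users, and return the best-scoring item names."""
    own = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        similarity = cosine(own, theirs)
        for item, rating in theirs.items():
            if item not in own:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("alice", ratings))
```

In this toy data, bob's ratings overlap with alice's more closely than carol's do, so the item only bob has rated ranks above the item only carol has rated.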
Limitations of Existing Techniques • Query Previews: • Lack of support for communication and collaboration. • Collaborative Filtering: • Lack of support for gathering PQI.
Collaborative Query Previews (CQPs) • CQP is an integrated approach to augment information seeking by supporting collaboration and communication during the process of gathering PQI. • CQPs generate an overview of a data collection through a set of aggregate information. • CQPs introduce a collaborative aspect by providing recommendations of queries.
Collaborative Query Previews (CQPs) • Direct Previews of the Data Collection: • Through the aggregate information on selected attributes, users can get familiar with the structure of the database. • Recommendation of Queries: • Through collaborative filtering techniques, CQPs recommend related queries previously executed by other users to help the current user make better sense of how the document collection met past information needs that coincide with the present information need.
Design and Implementation • Introduction: • ZWE (Zoomable Work Environment) provides an integrated platform for supporting a variety of scholarly tasks including browsing, querying, organizing and annotating of information resources (Goh, Fu & Foo, 2002) using a spatial metaphor. • ZWE supports the entire process of information seeking by incorporating CQPs.
Design and Implementation Artifacts (photos, metadata, annotations) Recommended queries Browsing tree Query previews Popup menu Result lists Tabs Query area Work area
Design and Implementation Zoomable Work Environment User Management Browsing Authoring Searching Query Previews Display Recommendation Feature Extraction Metadata Repository Multimedia Repository Past Queries Repository User Profiles Repository
Design and Implementation • JAZZ: a Zoomable User Interface (ZUI) API that allows developers to quickly and easily build zoomable information spaces.
Design and Implementation • Tamino XML Server: a platform to build an XML based information retrieval system. [Diagram labels: Tamino Manager, Database, Schema Editor, Schema, Interactive Tools, XML, X-Query Tools]
Design and Implementation • For the query recommendation module, we proposed a hybrid approach (Fu, Goh & Foo, 2003a, 2003b) to cluster past queries, and we apply the resulting algorithms to find past queries similar to a given query. • Experiments show that our hybrid algorithm outperforms the existing query clustering approach.
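The slides do not give the details of the hybrid algorithm, so the following is only an illustrative sketch of what a hybrid query similarity might look like, assuming it blends a content-based measure (shared query terms) with a result-based measure (shared result documents). The Jaccard measure, the weight `alpha`, the threshold, the query log and all function names are assumptions, not the authors' published method.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical log of past queries mapped to the document IDs they returned.
past_queries = {
    "fisheye overview detail": {"d1", "d2", "d3"},
    "overview detail interface": {"d1", "d7"},
    "xml database schema": {"d8", "d9"},
}

def hybrid_similarity(query_a, results_a, query_b, results_b, alpha=0.5):
    """Blend term overlap (content-based) with result overlap (result-based).
    alpha is an assumed weighting, not a value taken from the paper."""
    content = jaccard(query_a.split(), query_b.split())
    result = jaccard(results_a, results_b)
    return alpha * content + (1 - alpha) * result

def similar_queries(query, results, log, threshold=0.3):
    """Return past queries whose hybrid similarity meets the threshold,
    ordered from most to least similar."""
    scored = [
        (hybrid_similarity(query, results, past, past_results), past)
        for past, past_results in log.items()
        if past != query
    ]
    return [past for score, past in sorted(scored, reverse=True) if score >= threshold]

print(similar_queries("overview detail", {"d1", "d2"}, past_queries))
```

With this toy log, a user who typed only "overview detail" would be shown the more specific past query containing "fisheye", which mirrors the collection-knowledge example from earlier in the deck.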
Advantages of Proposed System • Integrated work environment: more interactive, zoomable. Multifaceted information artifacts. Generic framework. • CQPs support the information seeking process from two perspectives: • From direct previews of the data collection. • From queries issued previously by others.
Future Work • With the initial prototype developed, the next phase of this work will focus on the evaluation of CQPs by users of the digital library. • Research is also continuing on improving query clustering by further investigating hybrid approaches that combine content-based, feedback-based and result-based techniques.