Capturing User Contexts: Dynamic Profiling for Information Seeking Tasks Roman Y. Shtykh Waseda University, Japan
Information Need as a Driving Force of Human Information Behaviour • Recognition of one’s knowledge inadequacy to satisfy a particular goal (Case, 2002) • “consciously identified gap” in one’s knowledge (Ingwersen and Jarvelin, 2005) How can a system be user-centric and sufficiently satisfy the user’s information need without knowing what that need is?
Context • An information need emerges in one’s individual context, and both the context and the information need evolve over time • The information behaviours undertaken to satisfy the information need, which lead to the selection of an information object, take place in that same context
BESS (BEtter Search and Sharing): a framework for collaborative information seeking and sharing. It uses uniform relevance feedback to infer user interests as they change over time, and uses knowledge of those interests • to better satisfy seeking intents by providing information that is likely to match inferred user interests – PERSONALIZED SEEKING • to evaluate shared information based on the user’s interests (expertise) – PERSONALIZED SHARING
Profile Structure l – layer, k – concept number
Concept Formation with H2S2D (High Similarity Data-Driven) Clustering • Online incremental clustering method for sequential relevance feedback data. • Based on the peculiarities of a user’s seeking behavior (see the ASSUMPTIONS slides that follow).
Assumptions • When a user searches, he/she usually looks at (clicks on, focuses attention on, etc.) several documents (links or other objects) until the most relevant one is found. Most of these documents are potentially similar to one another to some extent and can give an indication of a particular user interest. • Even if some similar documents have not yet arrived in sequence, there are documents related to persistent user interests, and repeated searches on these interests are likely to occur. In these cases a user clicks either on links he/she found before or on links leading to documents highly similar to those found before.
Assumptions (1): Relevance Feedback • Feedback is sequentially incoming uniform data S = S1S2…Sn… containing subsequences of n (more than one) or more highly similar items linked by a particular information need. Such subsequences can be considered potentially new semantic clusters (concepts).
Assumptions (2): Relevance Feedback • Items that do not arrive in high-similarity subsequences are still considered potentially related to user interests. Since they are not yet of much use for profiles, they are put into a candidate pool, to be retrieved and used for concept formation later, when a subsequence of similar feedback items is observed.
Assumptions: User Study. Assumption 1: subsequence percentage (12 users, two weeks). [Figure: results for Sth = 0.05 and Sth = 0.1]
Assumptions: User Study. Assumption 2: percentage of re-accessed and all inter-similar documents
H2S2D Algorithm (1) • Online incremental • Unsupervised Key features: • a new cluster definition relies upon sequential characteristics of relevance feedback; • assignment of an incoming data item is delayed if there is no similar enough cluster, and performed when such a cluster is created.
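The two key features above can be illustrated with a minimal sketch. It is not the H2S2D implementation from the slides: the bag-of-words vectors, the cosine similarity, the class name `SequenceClusterer`, and the reading of `s_th` as a minimum similarity are all assumptions made for illustration. A new cluster is created only when `n` consecutive feedback items are similar to each other; any other item waits in a candidate pool and is assigned once a similar enough cluster appears.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class SequenceClusterer:
    """Hypothetical sketch of sequence-driven online clustering."""

    def __init__(self, s_th=0.3, n=2):
        self.s_th = s_th        # similarity threshold (assumed meaning)
        self.n = n              # subsequence length that triggers a new cluster
        self.clusters = []      # one summed term vector per concept
        self.candidates = []    # delayed items waiting for a matching cluster
        self.run = []           # current run of consecutive similar items

    def _best_cluster(self, item):
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.clusters):
            sim = cosine(item, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        return best if best_sim >= self.s_th else None

    def add(self, text):
        """Process one feedback item; return its cluster index or None if delayed."""
        item = Counter(text.lower().split())
        idx = self._best_cluster(item)
        if idx is not None:
            # A similar enough cluster exists: assign immediately.
            self.clusters[idx].update(item)
            self.candidates.extend(self.run)
            self.run = []
            return idx
        if self.run and cosine(item, self.run[-1]) >= self.s_th:
            self.run.append(item)
        else:
            # The run was broken: its items become candidates.
            self.candidates.extend(self.run)
            self.run = [item]
        if len(self.run) < self.n:
            return None  # assignment is delayed
        # A high-similarity subsequence was observed: create a new cluster.
        centroid = Counter()
        for member in self.run:
            centroid.update(member)
        self.run = []
        # Retrieve delayed candidates that match the new cluster.
        remaining = []
        for cand in self.candidates:
            if cosine(cand, centroid) >= self.s_th:
                centroid.update(cand)
            else:
                remaining.append(cand)
        self.candidates = remaining
        self.clusters.append(centroid)
        return len(self.clusters) - 1
```

In this sketch an isolated item never creates a cluster on its own, which mirrors the delayed-assignment idea: it stays in the pool until a later subsequence on the same interest forms a concept it can join.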
H2S2D Algorithm: Evaluation Results (Reuters collection). Number of clusters created after n items are processed
H2S2D Algorithm: Evaluation Results (Reuters collection). Ratio of candidate items to the number of processed items
Interest-change-driven Profile Construction Construction criteria: • Recency • Frequency • Persistency
Profile Structure l – layer, k – concept number
Profile Construction: Session Layer. The most recently created or updated concept from Ca = {Ca1, …, Can} of user a. (Recency)
Profile Construction: Short-term Layer. The m most frequently updated and used concepts, chosen in turn from the r most recent (top) concepts in the concept recency list. (Recency and Frequency)
Profile Construction: Long-term Layer. Derived from the concepts of the short-term layer that were most frequently observed as components of the short-term layer. (Persistency)
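The three layers can be sketched as follows. This is an illustrative reading of the slides rather than the BESS implementation: the class `LayeredProfile`, its method names, and the parameter `p` (how many long-term concepts to keep) are hypothetical, while `r` and `m` follow the short-term layer description above. Ties in frequency are broken by recency, which is also an assumption.

```python
from collections import Counter


class LayeredProfile:
    """Hypothetical sketch of a session / short-term / long-term profile."""

    def __init__(self, r=5, m=2, p=2):
        self.r = r                        # size of the recency window
        self.m = m                        # concepts kept in the short-term layer
        self.p = p                        # concepts kept in the long-term layer (assumed)
        self.recency = []                 # concept ids, most recent first
        self.update_counts = Counter()    # how often each concept was updated
        self.short_term_hits = Counter()  # how often each concept was in the short-term layer

    def touch(self, concept_id):
        """Record that a concept was created or updated by new feedback."""
        if concept_id in self.recency:
            self.recency.remove(concept_id)
        self.recency.insert(0, concept_id)
        self.update_counts[concept_id] += 1
        # Persistency: count each appearance in the short-term layer.
        for c in self.short_term():
            self.short_term_hits[c] += 1

    def session(self):
        """Session layer: the most recently touched concept (recency)."""
        return self.recency[0] if self.recency else None

    def short_term(self):
        """Short-term layer: the m most updated among the r most recent concepts."""
        top = self.recency[: self.r]
        # sorted() is stable, so equal counts keep their recency order.
        return sorted(top, key=lambda c: -self.update_counts[c])[: self.m]

    def long_term(self):
        """Long-term layer: concepts seen most often in the short-term layer."""
        return [c for c, _ in self.short_term_hits.most_common(self.p)]


profile = LayeredProfile(r=3, m=2, p=1)
for concept in ["a", "a", "b", "c", "d"]:
    profile.touch(concept)
print(profile.session(), profile.short_term(), profile.long_term())
```

In this run concept "a" has dropped out of the recency window, so it no longer appears in the short-term layer, yet it remains the long-term concept because it was a short-term component most often.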
Profile Construction: An example (1). Short-term layer
Profile Construction: An example (2). Long-term layer
Conclusions (1) * To implement user-centric services, knowledge about a user’s information need (IN) is required. * The IN itself cannot be captured (at least with today’s advances in the human sciences). * An attempt can be made to obtain “fragmentary” context in order to further facilitate a user’s activities.
Conclusions (2) * We proposed a multi-layered user modelling approach to dynamically organise and update a user’s contextual information according to its volatility and persistency characteristics. * We proposed the Similarity Sequence Data-Driven (H2S2D) clustering algorithm for concept construction. In spite of its relative simplicity, the H2S2D method has demonstrated reasonably good clustering results in terms of accuracy and precision, and has proved suitable for fast real-time relevance feedback processing that keeps concepts up to date.