Detecting, Assessing and Monitoring Relevant Topics in Virtual Information Environments Jörg Ontrup, Helge Ritter, Sören W. Scholz, and Ralf Wagner TKDE, Vol. 21, No. 3, 2009, pp. 415-427. Presenter: Wei-Shen Tai 2009/4/8
Outline • Introduction • Managerial information seeking • Methods • Hierarchically growing hyperbolic self-organizing maps • Information foraging theory • Assessment of association rules and statistical testing for changes • Performance evaluation • Usability evaluation • Discussion and conclusions • Comments
Motivation • Environmental scanning (ES) activities are hampered by information overload. • The overload is caused by the dramatic increase in relevant documents and messages being emitted. • Managers need efficient ways to understand their business environment and to integrate this understanding into their planning and decision-making processes.
Objective • Automated ES systems • Support the limited information-processing capacity of humans. • Facilitate sensitive and context-dependent reductions of the information overload.
Managerial information seeking • Situation awareness • Once a manager identifies a topic relevant to his or her business decisions, he or she is interested in precise information and, particularly, in changes in the relations between facts. • Application domain • Example of 2,314 documents obtained from the Internet-based hospitality industry newsletter, ehotelier.com.
Hierarchically growing hyperbolic SOM • Hierarchically Growing Hyperbolic SOM (H2SOM) • A node's quantization error (QE) serves as the growth criterion. If a node's QE exceeds a given threshold Θ_QE, that node is expanded.
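The growth criterion above can be illustrated with a minimal sketch. It is not the H2SOM implementation: it assumes Euclidean distance, defines a node's QE as the summed distance to the samples it wins, and ignores the hyperbolic lattice entirely. The names `quantization_errors`, `nodes_to_expand`, and `theta_qe` are hypothetical.

```python
import numpy as np

def quantization_errors(data, prototypes):
    """Per-node QE: summed distance between a prototype and the samples it wins.

    Assumption: Euclidean distance; the original system may use another measure.
    """
    # Pairwise distances, shape (n_samples, n_nodes).
    dists = np.linalg.norm(data[:, None, :] - prototypes[None, :, :], axis=2)
    winners = dists.argmin(axis=1)  # best-matching node for each sample
    qe = np.zeros(len(prototypes))
    # Accumulate each sample's distance onto its winning node.
    np.add.at(qe, winners, dists[np.arange(len(data)), winners])
    return qe

def nodes_to_expand(data, prototypes, theta_qe):
    """Indices of nodes whose QE exceeds the growth threshold."""
    qe = quantization_errors(data, prototypes)
    return [i for i, err in enumerate(qe) if err > theta_qe]

# Toy data: two samples near the first prototype, one exactly on the second.
data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
prototypes = np.array([[0.0, 0.0], [10.0, 10.0]])
print(nodes_to_expand(data, prototypes, theta_qe=0.5))
```

With a threshold of zero, every node that does not represent its samples perfectly is marked for expansion, so some other stopping rule, such as a maximum depth, is needed.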
Hierarchical Document Organization • Labeling • Each node is labeled with the terms that correspond to the largest values in its prototype vector. • Interactive message-level display • Each node represents a subset of messages, which can be displayed via "drill down".
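The labeling step can be sketched as follows, assuming each prototype vector holds one weight per vocabulary term. The function `label_node`, the parameter `n_labels`, and the toy vocabulary are hypothetical and only illustrate picking the highest-weighted terms.

```python
import numpy as np

def label_node(prototype, vocabulary, n_labels=3):
    """Return the terms whose prototype weights are largest, in descending order."""
    top = np.argsort(prototype)[::-1][:n_labels]  # indices of the largest weights
    return [vocabulary[i] for i in top]

# Toy vocabulary and prototype vector (illustrative values only).
vocabulary = ["hotel", "bali", "war", "chain"]
prototype = np.array([0.1, 0.7, 0.0, 0.4])
print(label_node(prototype, vocabulary, n_labels=2))  # ['bali', 'chain']
```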
Topic Detection in Document Streams • Time-dependent activation potential • A distinct peak dominates the message landscape.
Information foraging theory (IFT) • Information scent • ghi is appraised by means of its relevance in the actual context. Ak is the relevance of a term k, obtained via Bayesian prediction. • Information diet • B is the total time spent searching for this information; T is the total time spent extracting and handling the relevant information.
Assessment of association rules and statistical testing for changes • Lift and interestingness • Statistical testing with the measures of interestingness • For rule 1 (hotel chain reports), χ²(A → C) = 11.82. In contrast, for rule 2 (Bali attacks), χ²(A → C) = 53.10, and for rule 3 (Iraq war), χ²(A → C) = 65.63.
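As a rough illustration of how lift and a χ² statistic can be computed for a rule A → C, the sketch below uses the standard definitions on a 2×2 contingency table of document counts. It is not necessarily the exact procedure used in the paper; the function `rule_measures` and the counts are hypothetical and unrelated to the values reported above.

```python
def rule_measures(n, n_a, n_c, n_ac):
    """Support, confidence, lift and Pearson chi-square for a rule A -> C.

    n    : total number of documents
    n_a  : documents containing A
    n_c  : documents containing C
    n_ac : documents containing both A and C
    """
    support = n_ac / n
    confidence = n_ac / n_a
    lift = confidence / (n_c / n)  # > 1 means A and C co-occur more often than by chance

    # Observed 2x2 table: rows = A present/absent, columns = C present/absent.
    observed = [[n_ac, n_a - n_ac],
                [n_c - n_ac, n - n_a - n_c + n_ac]]
    row_totals = [n_a, n - n_a]
    col_totals = [n_c, n - n_c]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return support, confidence, lift, chi2

# Hypothetical counts: 100 documents, 20 with A, 20 with C, 10 with both.
support, confidence, lift, chi2 = rule_measures(100, 20, 20, 10)
print(support, confidence, lift, chi2)
```

With one degree of freedom, a χ² value above 3.84 indicates dependence between A and C at the 5% significance level.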
Performance evaluation • Fast tree search capability of the H2SOM • Usability evaluation • For both tasks, the degree of completion is equal or significantly lower for subjects using the standard tree browser.
Discussion and conclusions • An intelligent system for supporting ES process • Discovery of new information • H2SOM and an interactive visual exploration. • Expansion of knowledge • IFT to digest relevant information sources. • Monitoring of already identified topics • More precise assessment of changes in the document stream.
Comments • Advantage • This hybrid intelligent system provides an interactive information exploration tool via a visual interface. • It can be integrated into the discovery, expansion, and monitoring concepts in the cognitive phases of ES. • Drawback • Choosing the branching factor n_b on aesthetic grounds is not persuasive. • The growth threshold Θ_QE was set to zero, while the expansion of the network was limited to a depth of five hierarchy levels. • Application • Information discovery, organization, and maintenance.