A Concept Space Approach to Semantic Exchange Tobun Dorbin Ng Dissertation Defense April 19, 2000
Outline • Introduction • Literature Review • Research Questions & Methodologies • Concept Space Consultation • Concept Space Generation • Conclusions
Objective • To investigate the use of information technologies that clarify meaning and help users elaborate their information needs by providing library-specific knowledge during the information-seeking process. Introduction
Questions & Problems • Does a query truly represent user information need? • Can these knowledge sources adequately serve users’ information needs? [Diagram labels: Users; Query; Document Set; Information Retrieval Systems (Keyword Search, Inverted Index, Summarization, Visualization); Browsing Classifications; Search for Documents; Knowledge Spaces; Concept Spaces; Category Spaces; Knowledge Discovery (Concept Association, Cluster Analysis); Distributed, Heterogeneous Database Collections; Text; Image; Video] Introduction
Goal • To adopt a user-centric and interactive approach to helping users elaborate their information needs with library-specific knowledge and simultaneously gain insight into a library’s offerings related to their information needs. Introduction
Research Issues • Interactive Consultation with Knowledge Sources • Automatic Generation of Semantic-bearing Knowledge Sources from Corresponding Libraries Introduction
Static Nature of Knowledge in Library Collection • Characterizing Document Objects • Characterizing Global Knowledge in Document Collections • Grand Coverage • Knowledge of Knowledge • Revealing Knowledge in Neighborhood • Contextual Information Literature Review
Dynamic Nature of User Information Need • Expressing User Need • Information Need • Dynamic, not directly observable or symbolized • Indeterminism • Opportunism • Vocabulary Problem • Recognition with Contextual Information • Key Word In Context, Relevance Feedback Literature Review
Perceiving Knowledge • What is the user’s perspective of knowledge? • How does a user perceive retrieved or derived knowledge? • Computing Relevance? Literature Review
Structure & Context: Aids To Perceive Knowledge • Structureless and Contextless • Document List • Structural but Contextless • Dynamic Clustering • Structural and Contextual • Path to the Knowledge Literature Review
Research Questions • Can knowledge sources be used to help users express their information needs? [Diagram labels: Users; Information Need; Context-rich Query; Context-coherent Document Set; Vocabulary & Context; Concept Consultation Systems (Concept Exploration, Branch-and-bound Search, Hopfield Net Activation); Information Retrieval Systems (Keyword Search, Inverted Index, Summarization, Visualization); Browsing Classifications; Search for Related Concepts; Search for Documents; Knowledge Spaces; Concept Spaces; Category Spaces; Knowledge Discovery (Concept Association, Cluster Analysis); Distributed, Heterogeneous Database Collections; Text; Image; Video] Research Questions & Methodologies
Research Methodologies • Systems Development Approach • Experimental Design Research Questions & Methodologies
Concept Space Consultation • Algorithmic Concept Exploration • Large Networks of Knowledge • Man-made Thesauri: LCSH & ACM CRCS • Concept Spaces • Spreading Activation • Traversing a set of Knowledge Networks automatically and suggesting a set of most relevant concepts Concept Space Consultation
Research Questions 1&2 • Would the automatic concept exploration process be able to help users identify more relevant concepts? • Would such a process be able to perform more efficient exploration of a concept space than the conventional manual browsing method? Concept Space Consultation
Research Question 3 • If so, which algorithmic method, the symbolic branch-and-bound algorithm or the neural network-based Hopfield net algorithm, is better at gathering relevant concepts from knowledge sources? Concept Space Consultation
Research Questions 4&5 • Would the concept space consultation process provide a semantic medium that reduces the cognitive demand on users when they elaborate their information needs? • Would the concept exploration process be able to help users find more relevant documents? Concept Space Consultation
Two Algorithms for Spreading Activation • Branch-and-bound Algorithm • Semantic Net Based: “Optimal” Search • Hopfield Net Algorithm • Neural Net Based: Parallel Relaxation Search • Spreading Activation Process • Activation, Weight Computation, Iteration • Stopping Condition Concept Space Consultation
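The two spreading activation strategies named on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names `branch_and_bound` and `hopfield_activate`, the nested-dict network format, the sigmoid parameters `theta_bias` and `theta_scale`, the convergence threshold `epsilon`, and the choice to clamp seed concepts at full activation are all hypothetical.

```python
import heapq
import math


def branch_and_bound(weights, seeds, limit=5):
    """Best-first expansion: always expand the most activated concept next.

    weights: {concept: {neighbor: link weight in [0, 1]}} (assumed format).
    Returns up to `limit` (concept, activation) pairs, strongest first.
    """
    best = {s: 1.0 for s in seeds}
    heap = [(-1.0, s) for s in sorted(seeds)]
    heapq.heapify(heap)
    output, done = [], set()
    while heap and len(output) < limit:          # stopping condition
        neg_score, node = heapq.heappop(heap)
        if node in done:
            continue                             # stale queue entry
        done.add(node)
        output.append((node, -neg_score))
        for nbr, w in weights.get(node, {}).items():
            score = -neg_score * w               # weight computation
            if nbr not in done and score > best.get(nbr, 0.0):
                best[nbr] = score
                heapq.heappush(heap, (-score, nbr))
    return output


def hopfield_activate(weights, seeds, theta_bias=0.5, theta_scale=0.1,
                      epsilon=1e-3, max_iter=50):
    """Parallel relaxation: update every concept at once until stable."""
    nodes = set(weights) | set(seeds)
    for nbrs in weights.values():
        nodes |= set(nbrs)
    act = {n: (1.0 if n in seeds else 0.0) for n in nodes}
    for _ in range(max_iter):                    # iteration
        new = {}
        for j in nodes:
            if j in seeds:
                new[j] = 1.0                     # assumption: seeds stay clamped
                continue
            net = sum(weights.get(i, {}).get(j, 0.0) * act[i] for i in nodes)
            new[j] = 1.0 / (1.0 + math.exp(-(net - theta_bias) / theta_scale))
        change = sum(abs(new[n] - act[n]) for n in nodes)
        act = new
        if change < epsilon:                     # stopping condition
            break
    return act


if __name__ == "__main__":
    toy_net = {"a": {"b": 0.9, "c": 0.1}, "b": {"d": 0.8}}  # made-up network
    print(branch_and_bound(toy_net, {"a"}))
    print(hopfield_activate(toy_net, {"a"}))
```

The branch-and-bound sketch expands one concept at a time in order of activation, while the Hopfield sketch recomputes all activations in parallel and stops once the total change falls below the threshold.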
User Evaluation • 3 Subjects, 6 Tasks, 3 Phases • Phase 1: Identify subject areas • Phase 2: Find other topics using spreading activation & manual browsing • Phase 3: Document evaluation Concept Space Consultation
Findings: Concepts • Manual browsing achieved higher recall but lower term precision than the algorithmic systems. • Manual browsing was also a much more laborious and cognitively demanding process. • When using the algorithms, subjects reviewed the suggested terms more slowly and treated them more seriously and carefully than when performing manual browsing. Concept Space Consultation
Findings: Documents • No significant differences (in document recall and precision) were observed between the relevant documents suggested by the algorithms and those generated via the manual browsing process. • Each approach could contribute to a larger set of relevant documents for users. • The essential differences between the two approaches were the time spent and the cognitive effort required. Concept Space Consultation
Publications • Chen, H., Lynch, K. J., Basu, K., and Ng, T. D. “Generating, Integrating, and Activating Thesauri for Concept-Based Document Retrieval,” IEEE Expert, Special Series on Artificial Intelligence in Text-Based Information Systems 8(2):25-34 (1993). • Chen, H. and Ng, T. D. “An Algorithmic Approach to Concept Exploration in a Large Knowledge Network (Automatic Thesaurus Consultation): Symbolic Branch-and-bound Search vs. Connectionist Hopfield Net Activation,” Journal of the American Society for Information Science 46(5):348-369 (1995). Concept Space Consultation
Concept Space Generation • Automatic Generation of Large-scale Concept Spaces • Feasibility and Scalability Issues of Large-scale Concept Space Generation • Domain Knowledge • Computing Resources Concept Space Generation
Research Question 1 • With regard to computing scalability, would the technique of computer generation of concept spaces be applicable to very large textual databases? Concept Space Generation
Research Question 2 • With regard to domain-specific knowledge scalability, would automatic concept space generation create satisfactory domain-specific concept associations from the corresponding textual databases? Concept Space Generation
Research Question 3 • How does the quality of concept associations in a concept space generated from a very large textual database compare with that of a man-made domain-specific thesaurus? Concept Space Generation
Concept Space Techniques • Document & Object List Collection • Object Filtering • Automatic Indexing • Co-occurrence Analysis • Parallel Supercomputing to Laptop Computing • Large to Small Collections Concept Space Generation
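The co-occurrence analysis step can be illustrated with a small sketch that turns already-indexed documents into asymmetric term-to-term association weights. This is a simplified variant written for illustration, not the exact weighting used in the cited papers: the function name `cooccurrence_weights`, the `+ 1` smoothing inside the logarithms, and the toy documents are all assumptions.

```python
import math
from collections import Counter
from itertools import permutations


def cooccurrence_weights(docs):
    """docs: list of term lists. Returns {(tj, tk): weight of tj -> tk}.

    The weight is asymmetric: how strongly tj suggests tk is not
    necessarily how strongly tk suggests tj.
    """
    n = len(docs)
    tfs = [Counter(doc) for doc in docs]             # term frequency per document
    df = Counter(t for tf in tfs for t in tf)        # document frequency per term
    df_pair = Counter()                              # documents containing both terms
    for tf in tfs:
        for pair in permutations(sorted(tf), 2):
            df_pair[pair] += 1

    weights = {}
    for (tj, tk), pair_count in df_pair.items():
        # Combined weight of tj and tk over documents where both occur.
        joint = sum(min(tf[tj], tf[tk]) * math.log(n / pair_count + 1)
                    for tf in tfs if tj in tf and tk in tf)
        # Weight of tj alone over documents where it occurs.
        single = sum(tf[tj] * math.log(n / df[tj] + 1)
                     for tf in tfs if tj in tf)
        # Penalize very common target terms (assumed weighting factor).
        penalty = math.log(n / df[tk] + 1) / math.log(n + 1)
        weights[(tj, tk)] = (joint / single) * penalty
    return weights


if __name__ == "__main__":
    toy_docs = [                                     # made-up indexed documents
        ["neural", "network", "hopfield"],
        ["neural", "network"],
        ["neural", "retrieval"],
        ["network", "hopfield"],
    ]
    for pair, w in sorted(cooccurrence_weights(toy_docs).items()):
        print(pair, round(w, 3))
```

The resulting dictionary can be read as a weighted, directed concept network of the kind a consultation algorithm would traverse.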
User Evaluation • 10 Subjects, 23 Tasks • Recall & Recognition Phases • Findings: • Concept space has higher concept recall • INSPEC thesaurus has higher concept precision • Concept space complements man-made thesaurus Concept Space Generation
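Concept recall and concept precision, as compared on this slide, can be computed from a set of suggested terms and a set of terms judged relevant. The sketch below assumes the standard set-based definitions; the function name `concept_recall_precision` and the term sets are hypothetical and are not data from the study.

```python
def concept_recall_precision(suggested, relevant):
    """Return (recall, precision) for one source of suggested concepts."""
    suggested, relevant = set(suggested), set(relevant)
    hits = suggested & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(suggested) if suggested else 0.0
    return recall, precision


if __name__ == "__main__":
    # Made-up judgments for one task, for illustration only.
    relevant = {"t1", "t2", "t3", "t4"}
    broad_source = {"t1", "t2", "t3", "x1", "x2", "x3"}   # many suggestions
    narrow_source = {"t1", "t2"}                          # few suggestions
    print(concept_recall_precision(broad_source, relevant))
    print(concept_recall_precision(narrow_source, relevant))
```

A source that suggests many terms tends to score higher on recall and lower on precision, which is the pattern of trade-off the findings describe.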
Publications • Chen, H., Schatz, B. R., Ng, T. D., Martinez, J., Kirchhoff, A., and Lin, C. “A Parallel Computing Approach to Creating Engineering Concept Spaces for Semantic Retrieval: The Illinois Digital Library Initiative Project,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Special Section on Digital Libraries: Representation and Retrieval 18(8):771-782 (1996). • Chen, H., Martinez, J., Ng, T. D., and Schatz, B. “A Concept Space Approach to Addressing the Vocabulary Problem in Scientific Information Retrieval: An Experiment on the Worm Community System,” Journal of the American Society for Information Science 48(1):17-31 (1997). • Houston, A. L., Chen, H., Hubbard, S. M., Schatz, B. R., Ng, T. D., Sewell, R. R., and Tolle, K. M. “Medical Data Mining on the Internet: Research on a Cancer Information System,” Artificial Intelligence Review 13(5/6):437-466 (1999). Concept Space Generation
Corpuses & Applications • INSPEC, CSQuest http://ai.bpa.arizona.edu/cgi-bin/mcsquest • CancerLit, Cancer Space http://ai20.bpa.arizona.edu/cgi-bin/cancerlit/cn • Webpages, ET-Space http://ai.bpa.arizona.edu/cgi-bin/tng/ETSpace • GeoRef & Petroleum Abstracts, GIS Space http://ai10.bpa.arizona.edu/gis/ • Law Enforcement, COPLINK Concept Space • DARPA ITO Project Summary Collection http://ai6.bpa.arizona.edu/cgi-bin/tng/Psum • CNN News http://processc.inf.cs.cmu.edu/tng/inf/ Concept Space Generation
Conclusions • Context-specific Concept Space Consultation • Concept Space As Semantic Exchange Medium Conclusions
Lessons Learned • Both concept space consultation and generation work • “Strategic” use of knowledge sources • Concept Space Technique is scalable conceptually and computationally • Insight into potentially retrieved documents Conclusions
Future Directions • Performing Summarization • Semantic Protocol for Machine Communication • Multimedia Concept Association • Context Analysis with • Metric Clusters: “distance” information • Scalar Clusters: neighboring concepts of two target concepts used to compute their similarity Conclusions
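The two cluster types listed under context analysis can be sketched as follows. This is one common textbook-style reading of the terms, offered as an assumption rather than the author's planned method: a metric correlation sums inverse word distances between occurrences of two terms, and a scalar similarity compares the neighborhood vectors of two concepts by cosine. The function names and toy inputs are hypothetical.

```python
import math


def metric_correlation(tokens, u, v):
    """Sum of 1/distance over all occurrence pairs of u and v in one text."""
    pos_u = [i for i, t in enumerate(tokens) if t == u]
    pos_v = [i for i, t in enumerate(tokens) if t == v]
    return sum(1.0 / abs(i - j) for i in pos_u for j in pos_v if i != j)


def scalar_similarity(assoc, u, v):
    """Cosine similarity of the association vectors (neighborhoods) of u and v.

    assoc: {concept: {neighbor: association weight}} (assumed format).
    """
    vec_u, vec_v = assoc.get(u, {}), assoc.get(v, {})
    dot = sum(w * vec_v.get(k, 0.0) for k, w in vec_u.items())
    norm_u = math.sqrt(sum(w * w for w in vec_u.values()))
    norm_v = math.sqrt(sum(w * w for w in vec_v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    text = "concept space supports concept association".split()
    print(metric_correlation(text, "concept", "space"))

    toy_assoc = {                                   # made-up neighborhoods
        "car": {"engine": 0.8, "road": 0.6},
        "automobile": {"engine": 0.8, "road": 0.6},
        "banana": {"fruit": 1.0},
    }
    print(scalar_similarity(toy_assoc, "car", "automobile"))
    print(scalar_similarity(toy_assoc, "car", "banana"))
```

Under this reading, two concepts that never co-occur can still be judged similar when they share the same neighbors, which is what distinguishes the scalar measure from direct co-occurrence.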