Automatic Extraction of Knowledge Structures in Specialist Domains

Mariam TARIQ
Information Extraction and Multimedia Group
Dept. of Computing, School of Electronics and Physical Sciences, University of Surrey

If the knowledge of a domain can be expressed in its specialist language, it follows that the specialist language can be said to be a reflection of the ontology of the domain, which determines the various categories that exist and their interrelationships. I am investigating whether a method can be outlined for the automatic extraction of such categories and relationships given a representative text corpus of any arbitrary specialist domain.

These knowledge structures are derived from texts through the use of well-established NLP techniques that exploit the fact that specialist languages tend to have a profusion of nouns and compound nouns, which relate to certain named objects, events, and actions key to the domain, and that the interrelationships between these named objects are manifested in the language through the use of certain lexical semantic patterns.

The ontology derived from the forensic science corpus will be used by SOCIS for query expansion.

Method and Implementation

[Process diagram. Box labels: TEXT CORPUS; CANDIDATE TERMS; CORPUS ANALYSIS; LEXICAL SEMANTIC PATTERNS, COLLOCATIONS; TERM BASE; USER INTERACTION; CONCEPTUAL MODELLING; INTERMEDIATE REPRESENTATION; FORMALIZATION; CANDIDATE ONTOLOGY]

Publications

Khurshid Ahmad, Bogdan Vrusias & Mariam Tariq, “Co-operative Neural Networks and Integrated Classification,” IJCNN 2002 Proceedings, Honolulu, Hawaii, May 2002.

Andrew Hippisley, Mariam Tariq & David Cheng, “Hierarchical Data & the Derivational Relationship Between Words,” IRCS Workshop Proceedings, pp. 125-133, December 2001.

“The Forensic Science Corpus: A Preliminary Analysis,” “Image Retrieval using Object-Relational Technology,” “Ontology: A Review of Current Trends in Design & Development,” Technical Reports, Dept. of Computing, UniS.
[Diagram label: UPPER-LEVEL / OTHER ONTOLOGY]

This work is funded by the EPSRC project SOCIS (Scene of Crime Information System).
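The two corpus-analysis ideas in the abstract, compound nouns as candidate terms and lexical semantic patterns as cues for relationships, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the poster does not state which patterns or filters are used, so the "such as" and "is a kind of" patterns (in the style of Hearst patterns), the stopword filter standing in for part-of-speech tagging, and the names candidate_terms and extract_relations are all hypothetical.

```python
import re
from collections import Counter

# Assumption: a small stopword list stands in for real part-of-speech tagging.
STOPWORDS = {
    "a", "an", "the", "of", "and", "or", "is", "are", "was", "were", "in",
    "on", "at", "to", "by", "for", "with", "such", "as", "other", "kind", "type",
}


def sentences(text):
    """Lowercase the text and split it into rough sentences on . ! ?"""
    return [s.strip() for s in re.split(r"[.!?]+", text.lower()) if s.strip()]


def candidate_terms(text, min_count=2):
    """Return adjacent word pairs without stopwords that recur in the text.

    This is a crude proxy for compound-noun candidate terms.
    """
    counts = Counter()
    for sent in sentences(text):
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", sent)
        for first, second in zip(tokens, tokens[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                counts[f"{first} {second}"] += 1
    return {term: n for term, n in counts.items() if n >= min_count}


# A term is approximated as one or two words (hypothetical simplification).
TERM = r"(\w+(?: \w+)?)"

# Each entry: (compiled pattern, True if the hyponym is the first group).
PATTERNS = [
    (re.compile(TERM + r" such as " + TERM), False),
    (re.compile(TERM + r" is an? (?:kind|type) of " + TERM), True),
]


def trim(term):
    """Drop stopwords from both ends of a matched term."""
    words = term.split()
    while words and words[0] in STOPWORDS:
        words.pop(0)
    while words and words[-1] in STOPWORDS:
        words.pop()
    return " ".join(words)


def extract_relations(text):
    """Return (hyponym, hypernym) pairs found by the illustrative patterns."""
    relations = []
    for sent in sentences(text):
        for pattern, hyponym_first in PATTERNS:
            for match in pattern.finditer(sent):
                left, right = trim(match.group(1)), trim(match.group(2))
                if not left or not right:
                    continue
                relations.append((left, right) if hyponym_first else (right, left))
    return relations


if __name__ == "__main__":
    sample = (
        "Trace evidence such as glass fragments is collected at the scene. "
        "A latent fingerprint is a kind of trace evidence."
    )
    print(candidate_terms(sample, min_count=1))
    print(extract_relations(sample))
```

In a pipeline like the one the diagram labels suggest, recurring pairs would feed a term base and the extracted pairs would be proposed to the user as candidate category relationships; a realistic system would replace the stopword filter with proper linguistic analysis and use a much larger pattern set.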