Visualizing Ontology Components through Self-Organizing Maps • Advisor: Dr. Hsu • Reporter: Wen-Hsiang Hu • Authors: D Elliman and JRG Pulido • 2002 IEEE, Proceedings of the Sixth International Conference on Information Visualisation
Outline • Motivation • Objective • Introduction • Methods (SOM -> Ontology) • Results • Conclusions • Personal Opinion
Motivation • Some sites are an overgrown wilderness in which it is difficult to find anything of interest, even if it is known to be hidden there somewhere. • It would be useful to be able to construct some representation of the information in the site.
Objective • We describe an approach for constructing an ontology for such web sites.
Methods • Our system will organize the knowledge extracted (figure 1) from a digital archive, e.g. a digital library, or the web itself, as follows: • Set of objects (entities) • Set of functions (is-a) • Set of relations (has, part-of )
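The three component sets listed above could be held in a structure like the following. This is only a minimal sketch: the class name `Ontology`, its method names, and the example entities are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical container for the three component sets named on the slide."""
    objects: set = field(default_factory=set)      # entities
    is_a: dict = field(default_factory=dict)       # child -> parent
    relations: set = field(default_factory=set)    # (subject, relation, object)

    def add_is_a(self, child, parent):
        """Record that `child` is-a `parent`, registering both as objects."""
        self.objects.update((child, parent))
        self.is_a[child] = parent

    def add_relation(self, subject, relation, obj):
        """Record a has / part-of relation between two objects."""
        if relation not in ("has", "part-of"):
            raise ValueError(f"unsupported relation: {relation}")
        self.objects.update((subject, obj))
        self.relations.add((subject, relation, obj))

    def ancestors(self, entity):
        """Follow is-a links upward; stops if a cycle is encountered."""
        chain = []
        while entity in self.is_a and self.is_a[entity] not in chain:
            entity = self.is_a[entity]
            chain.append(entity)
        return chain


# Illustrative entities only (assumed, not taken from the paper's data).
onto = Ontology()
onto.add_is_a("lecturer", "person")
onto.add_relation("school", "has", "lecturer")
```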
The Algorithm (1/2) The major steps of our approach are as follows: • Retrieve hyperlinks from a predefined digital archive. For our analysis we retrieved only local ones, e.g. those under cs.nott.ac.uk. • Preprocess each hyperlinked file. For each file, the following is done: • Remove HTML tags, e.g. <html>, </html>. • Remove common words that carry little information by using a stoplist, e.g. the, by, but, and the like. • For each remaining word, the following is done: • i. Its stem is obtained, e.g. play is the stem of the words plays, playing, played. • ii. A weight is assigned to it using tf × idf. • iii. A vector space (one weight per stem) is created for each file.
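The preprocessing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the stoplist contents, the toy suffix-stripping stemmer, and the plain `tf * log(N / df)` weighting are all stand-ins chosen for brevity, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed stoplist; the slides only give "the", "by", "but" as examples.
STOPLIST = {"the", "by", "but", "and", "a", "of", "is", "in", "to"}


def strip_tags(html):
    """Remove HTML tags such as <html> and </html>."""
    return re.sub(r"<[^>]+>", " ", html)


def naive_stem(word):
    """Toy suffix stripper (assumption); a real system would use a proper stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(html):
    """Strip tags, lowercase, drop stoplist words, and stem what remains."""
    words = re.findall(r"[a-z]+", strip_tags(html).lower())
    return [naive_stem(w) for w in words if w not in STOPLIST]


def tfidf_vectors(docs):
    """Return (vocabulary, one tf x idf weight vector per document)."""
    token_lists = [preprocess(d) for d in docs]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    n_docs = len(token_lists)
    # Document frequency: in how many files each stem occurs.
    df = Counter(t for tokens in token_lists for t in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors
```

With this weighting, a stem that occurs in every file gets weight 0, which matches the intent of pruning words that carry little information.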
The Algorithm (2/2) • Produce a document space • By using the vector spaces of the previous step, a document space is created. • Construct the SOM • The document space is used to train the SOM. • Create the Ontology • Once the SOM is trained, ontology components can be visualized. This produces results that are often surprisingly close to the user's intuitive expectation.
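To make the SOM construction step concrete, here is a small pure-Python sketch of a rectangular self-organizing map trained on document vectors. The slides do not give training details, so the Gaussian neighbourhood, the linearly decaying learning rate and radius, and every parameter default (`epochs`, `lr0`, `seed`) are assumptions for illustration; only the 4x4 grid size is taken from the animal example in the results.

```python
import math
import random


def best_matching_unit(grid, v):
    """Return (row, col) of the unit whose weights are closest to v."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(v, w))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(vectors, rows=4, cols=4, epochs=50, lr0=0.5, seed=0):
    """Train a small SOM; returns grid[r][c], a weight vector per unit."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            br, bc = best_matching_unit(grid, v)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma ** 2))
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (v[i] - w[i])  # pull unit toward sample
    return grid
```

After training, calling `best_matching_unit(grid, v)` for each document vector gives its cell on the map; similar vectors tend to land on the same or neighbouring cells, which is what the colored areas in the results visualize.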
Results-Classifying Animals (1/2) • The animal dataset (table 1) is presented by means of an HTML page. Our approach uses a 4x4 SOM and presents the same data using colored areas. [Table 1: entities and their attributes]
Results-Classifying Animals (2/2) • Small animals with feathers, big animals with hooves, and animals with four legs and hair are each clustered together.
Results-Classifying Digital Archives (1/2) • For our second analysis, a set of web pages of the Computer Science Department at Nottingham University was used. • We can readily identify people within the domain, their roles, modules that are taught, and research interests of the members of the school. • Terms like ieee, confer(ence), proceed(ings), workshop, journal, spring(er), and even the location of the school (wollaton, jubil(e), campu(s), nottingham), and how to reach it (driv(e), rout(e), map, direc(tion), guid(e)) are clustered together. • Further subcategories are also visualized; for instance, within the Image Processing and Interpretation Research Group we found terms like text, vision, ai, colour, recognition, image grouped (figure 4).
Conclusion • An ontology is a form of knowledge representation that can be used to give a sense of order to unstructured digital sources. • Ontology components can be visualized through self-organizing maps.
Personal Opinion • Application • Apply ontologies to information retrieval (IR)