Towards Semantic Web: An Attribute-Driven Algorithm to Identifying an Ontology Associated with a Given Web Page Dan Su Department of Computer Science Brigham Young University
Motivation • Semantic Web: enriching the current web manually would be laborious, tedious, and error-prone • Ontology: captures the semantics of information from various sources and outputs a concise description; the number of different ontologies is increasing
Given a web page, how can we identify which ontology in the ontology library is associated with it?
Ontology Library • Ontologies in an ontology library system need to be classified in order to facilitate searching, managing, and reusing them
Text Categorization • Assign a Boolean value to each pair $\langle d_j, c_i \rangle \in D \times C$, where $D$ is the set of documents and $C$ is the set of categories • Traditional approaches • Advantage of traditional approaches • Disadvantage of traditional approaches
Why not direct matching? • Direct matching • Reasons for discarding direct matching: 1. The increasingly large number of ontologies 2. It ignores the weights of different attributes
Thesis Statement • Focus on identifying the ontology associated with a given web page from an ontology library, based on attribute similarity computation • Discuss the feasibility of an ontology-based machine learning approach
Assumptions about the Ontology Library • Unique identification: a unique URI or a unique name • Lifetime: the validity period of the current ontology version • Domain concept and its associated attributes
Automatic Construction of Training Corpus • Original page: <html> <p> Honda MH-2 1994</p> ……… </html> • Annotated page: <html> <ontology car.ontology> </ontology> <p><attri make> Honda </attri> <attri model>MH-2</attri><attri year> 1994</attri></p> </html>
Attribute Similarity Computation • Attribute Recognizer • Input page: <html> <p> Honda price $9000, model…</p> ……… </html> • Recognized attribute: make
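One way to picture the attribute recognizer is as a set of per-attribute patterns applied to the visible text of a page. The sketch below is a hypothetical illustration only, not the recognizer used in this work: the `RECOGNIZERS` table, its regular expressions, and the functions `strip_tags` and `recognize_attributes` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical recognizers for a car ontology: attribute name -> pattern.
# These patterns are assumptions for illustration, not taken from the slides.
RECOGNIZERS = {
    "make": re.compile(r"\b(Honda|Toyota|Ford)\b", re.IGNORECASE),
    "price": re.compile(r"\$\s?\d[\d,]*"),
    "year": re.compile(r"\b(19|20)\d{2}\b"),
}


def strip_tags(html):
    """Crudely remove markup so the patterns run on visible text only."""
    return re.sub(r"<[^>]+>", " ", html)


def recognize_attributes(html):
    """Count how many times each attribute is recognized in the page."""
    text = strip_tags(html)
    counts = Counter()
    for name, pattern in RECOGNIZERS.items():
        counts[name] = sum(1 for _ in pattern.finditer(text))
    return counts


page = "<html><p>Honda price $9000, model MH-2 1994</p></html>"
counts = recognize_attributes(page)
print(dict(counts))
```

The resulting per-attribute counts are the kind of input a frequency-based weighting scheme could consume.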
TFIDF-like computation • $O_j = (w_{j1}, w_{j2}, \ldots, w_{jn})$ • $w_{jk} = af_{jk} \times idf_{jk}$
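The weighting on this slide can be sketched in a few lines. The code below is a minimal illustration under several assumptions that the slides do not confirm: `af` is read as an attribute frequency (by analogy with term frequency), the inverse frequency is computed per attribute across the whole library with a smoothed logarithm, and pages are matched to ontologies by cosine similarity. The toy `library`, the `page_counts`, and all function names are hypothetical.

```python
import math


def idf(attribute, library):
    """Smoothed inverse frequency of an attribute across the ontology library."""
    containing = sum(1 for attrs in library.values() if attribute in attrs)
    return math.log((1 + len(library)) / (1 + containing)) + 1.0


def weight_vector(attr_freq, vocabulary, library):
    """Build (w_1, ..., w_n) with w_k = af_k * idf_k over a fixed vocabulary."""
    return [attr_freq.get(a, 0) * idf(a, library) for a in vocabulary]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Toy library (assumed data): ontology name -> attribute frequencies.
library = {
    "car": {"make": 3, "model": 2, "year": 2, "price": 1},
    "apartment": {"price": 2, "bedrooms": 3, "address": 1},
}
vocabulary = sorted({a for attrs in library.values() for a in attrs})

# Attribute counts recognized in a hypothetical web page.
page_counts = {"make": 1, "model": 1, "price": 1}
page_vec = weight_vector(page_counts, vocabulary, library)

scores = {
    name: cosine(page_vec, weight_vector(freqs, vocabulary, library))
    for name, freqs in library.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

Because `price` occurs in both toy ontologies it receives a lower inverse-frequency weight than `make` or `model`, which is the effect the weighting is meant to capture.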
Evaluation • Precision and recall • Compare our results with those obtained using a pure TFIDF algorithm
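As a reminder of how the two evaluation measures are computed, the sketch below scores ontology assignments per ontology. The page identifiers, the labels, and the `precision_recall` function are hypothetical; the slides do not specify the evaluation setup at this level of detail.

```python
def precision_recall(predicted, actual, ontology):
    """Precision and recall for one ontology, given page -> ontology mappings."""
    tp = sum(1 for page, o in predicted.items()
             if o == ontology and actual.get(page) == ontology)
    fp = sum(1 for page, o in predicted.items()
             if o == ontology and actual.get(page) != ontology)
    fn = sum(1 for page, o in actual.items()
             if o == ontology and predicted.get(page) != ontology)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Assumed toy data: p5 is a car page that the system failed to label.
predicted = {"p1": "car", "p2": "car", "p3": "apartment", "p4": "car"}
actual = {"p1": "car", "p2": "apartment", "p3": "apartment",
          "p4": "car", "p5": "car"}

p, r = precision_recall(predicted, actual, "car")
print(p, r)
```

Running the same function on the output of a baseline and of the attribute-driven approach would give directly comparable figures.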
Contribution • Automatically identifies the ontology associated with a given web page • Advances the transformation from the current web to a semantic web • Can be extended to the field of text categorization