
Ontology–based author profiling of documents

Jan De Bo, Mustafa Jarrar, Ben Majer, Robert Meersman. VUB STARLab, Vrije Universiteit Brussel. LREC 2002, Event Modelling for Multilingual Document Linking.



  1. STARLab: Ontology–based author profiling of documents
     Jan De Bo, Mustafa Jarrar, Ben Majer, Robert Meersman
     VUB STARLab, Vrije Universiteit Brussel
     LREC 2002, Event Modelling for Multilingual Document Linking

  2. 1. Introduction and Motivation
     • Regular search engines return a huge amount of information
     • A semantics-based filtering mechanism may help
     • Ontologies add more semantics to the knowledge representations used by IR systems
       • + enables reasoning
       • + improves recall by query expansion
       • + improves precision
       • - expensive task, low scalability

  3. 2. Using Ontology with NLPs
     • Definition of an ontology
       • An ontology is a shared agreement about a conceptualization
       • An ontology is more than a taxonomy or classification of terms, since it includes semantic relations between terms (ORM)
     • Information filtering systems based on ontologies assist the user by filtering the data stream and delivering more relevant information

  4. 3. Profiling system
     • A user profile
       • is a specification of a query on the ontology
       • enables users to specify their interests and thereby express what kind of documents they want
       • within NAMIC, should return exactly the news items of interest to the journalist-user

  5. 3 (continued)
     • We distinguish two different filtering systems:
       • Cognitive filtering systems (content-based)
       • Social filtering systems (collaborative filtering)
     • Defining a user profile (UP) implies specifying a query language
     • A query is a logical combination of concepts and binary relations from the ontology
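The idea of a profile as a logical combination of concepts and binary relations can be sketched as a small expression tree. This is a minimal illustration, not the query language of the tool described here: the class names (`Concept`, `Relation`, `And`, `Or`, `Not`), the `Annotation` structure, and the example concepts are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Annotation:
    """Ontology-level description of one document (hypothetical structure)."""
    concepts: FrozenSet[str]
    relations: FrozenSet[Tuple[str, str, str]]  # (subject, relation, object)


@dataclass(frozen=True)
class Concept:
    name: str


@dataclass(frozen=True)
class Relation:
    subject: str
    name: str
    obj: str


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Not:
    operand: "Query"


Query = Union[Concept, Relation, And, Or, Not]


def matches(query: Query, doc: Annotation) -> bool:
    """Evaluate a profile query against one annotated document."""
    if isinstance(query, Concept):
        return query.name in doc.concepts
    if isinstance(query, Relation):
        return (query.subject, query.name, query.obj) in doc.relations
    if isinstance(query, And):
        return matches(query.left, doc) and matches(query.right, doc)
    if isinstance(query, Or):
        return matches(query.left, doc) or matches(query.right, doc)
    if isinstance(query, Not):
        return not matches(query.operand, doc)
    raise TypeError(f"unsupported query node: {query!r}")


# Illustrative profile: company acquisitions, excluding sports news.
profile = And(Relation("Company", "acquires", "Company"), Not(Concept("Sport")))

acquisition = frozenset({("Company", "acquires", "Company")})
doc_a = Annotation(frozenset({"Company"}), acquisition)
doc_b = Annotation(frozenset({"Company", "Sport"}), acquisition)
```

Here `matches(profile, doc_a)` holds while `matches(profile, doc_b)` does not, because the second document is also annotated with the excluded concept.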

  6. 3 (continued)
     • Benefits in IR of using ontology-based user profiles
       • Improvement of recall
         • By query expansion (a user implicitly selects all concepts from the ontology that inherit from the selected concept)
         • The query needs to be expanded with relevant terms and meaningful relations from the ontology
       • Improvement of precision
         • Through disambiguation of terms (context)
         • Ability to navigate through the ontology to select more specific concepts
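Query expansion over the inheritance hierarchy can be sketched as a traversal that collects every concept inheriting, directly or indirectly, from the selected one. The hierarchy `IS_A` and its concept names are invented for illustration and are not taken from the NAMIC ontology.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical toy hierarchy: each concept maps to the concept it inherits from.
IS_A: Dict[str, str] = {
    "Election": "PoliticalEvent",
    "Referendum": "PoliticalEvent",
    "PoliticalEvent": "Event",
    "Merger": "BusinessEvent",
    "BusinessEvent": "Event",
}


def expand(concept: str, is_a: Dict[str, str]) -> Set[str]:
    """Return the concept plus all concepts that inherit from it."""
    # Invert the child -> parent map so we can walk downwards.
    children: Dict[str, List[str]] = {}
    for child, parent in is_a.items():
        children.setdefault(parent, []).append(child)

    expanded = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in expanded:
                expanded.add(child)
                queue.append(child)
    return expanded


print(sorted(expand("PoliticalEvent", IS_A)))
# prints ['Election', 'PoliticalEvent', 'Referendum']
```

Selecting a general concept therefore matches documents annotated with any of its specializations, which is where the recall gain comes from. Selecting a leaf such as `"Election"` leaves the query unchanged.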

  7. 3 (continued)
     • The ontology is separated from the objective representations used by the NLPs
     • Users don’t have to be aware of the different objective representations of the NLPs
     • Once the ontology is built, the NLPs have to adapt their objective representations to it
     • A query on the ontology interacts independently with the objective representations of the NLPs

  8. 3 (continued)
     • A UP makes it possible to specify language-independent queries and retrieve related documents in all languages
     • UPs don’t have to be static; they can be dynamically adapted to the user’s current needs
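One way to picture language-independent querying: documents in each language are mapped to shared ontology concepts, and the profile is matched against concepts rather than words. The lexicon, documents, and concept names below are made up for the sketch. In the system described here this mapping would be produced by the natural language processors, not by a lookup table.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical lexicon: (language, term) -> ontology concept.
LEXICON: Dict[Tuple[str, str], str] = {
    ("en", "election"): "Election",
    ("es", "elecciones"): "Election",
    ("it", "elezioni"): "Election",
    ("en", "strike"): "Strike",
    ("it", "sciopero"): "Strike",
}

# Hypothetical documents, already tokenized.
DOCS: List[dict] = [
    {"id": "d1", "lang": "en", "terms": ["election", "results"]},
    {"id": "d2", "lang": "es", "terms": ["elecciones", "gobierno"]},
    {"id": "d3", "lang": "it", "terms": ["sciopero", "treni"]},
]


def concepts_of(doc: dict, lexicon: Dict[Tuple[str, str], str]) -> Set[str]:
    """Map a document's terms to ontology concepts, ignoring unknown terms."""
    return {
        lexicon[(doc["lang"], term)]
        for term in doc["terms"]
        if (doc["lang"], term) in lexicon
    }


def retrieve(concept: str, docs: List[dict],
             lexicon: Dict[Tuple[str, str], str]) -> List[str]:
    """Return ids of documents, in any language, annotated with the concept."""
    return [doc["id"] for doc in docs if concept in concepts_of(doc, lexicon)]


print(retrieve("Election", DOCS, LEXICON))  # prints ['d1', 'd2']
```

A single query on the concept `Election` returns both the English and the Spanish document, without the user naming a term in either language.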

  9. 4. Implementation
     • To satisfy the requirements mentioned above, we developed a tool with the following architecture:
       [Figure: tool architecture; not reproduced in this transcript]

  10. 4 (continued)
     • The process of ontology engineering begins with the development of the ontology base
     • The ontology base contains concepts based upon the natural language processors’ objective representations
     • The following resources were considered for incorporation into the NAMIC ontology:
       • IPTC category system
       • EWN (EuroWordNet) base concepts
       • Named Entity List
       • Event types

  11. 4 (continued)
     • To integrate the NLPs’ objective representations of these different resources into the ontology, an alignment process is needed
     • Categories, events and named entities are aligned with EWN base concepts
     • By definition, aligning different ontologies with one another is required to obtain agreement between their concepts
     • It is as yet unrealistic to expect that the alignment task can be performed automatically
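Since alignment is described as a task that cannot yet be automated, a plausible minimal representation is a hand-curated table from resource items to base concepts, plus a check that lists items still awaiting a human decision. All labels and the `base:` identifiers below are placeholders, not real IPTC codes or EWN identifiers.

```python
from typing import Dict, List, Optional, Tuple

Item = Tuple[str, str]  # (resource kind, label), e.g. ("category", "politics")

# Hypothetical, manually curated alignment table.
ALIGNMENT: Dict[Item, str] = {
    ("category", "politics"): "base:political_activity",
    ("event", "election"): "base:political_activity",
    ("named_entity", "company"): "base:organization",
}


def aligned_concept(item: Item, alignment: Dict[Item, str]) -> Optional[str]:
    """Look up the base concept an item was aligned with, if any."""
    return alignment.get(item)


def needs_review(items: List[Item], alignment: Dict[Item, str]) -> List[Item]:
    """List the items that have no alignment yet and need a human decision."""
    return [item for item in items if item not in alignment]


incoming = [("category", "politics"), ("event", "strike")]
print(needs_review(incoming, ALIGNMENT))  # prints [('event', 'strike')]
```

Note that a category and an event type can map to the same base concept, which is how agreement between the different resources is expressed in this sketch.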

  12. 5. Future work
     • Developing a more sophisticated conceptual query language to specify queries on the ontology
