Hermes: a Semantic Web-Based News Decision Support System*
Flavius Frasincar
frasincar@few.eur.nl
Erasmus University Rotterdam
* Joint work with Jethro Borsje and Leonard Levering
SAC WT 2008
Contents
• Motivation
• Hermes Framework:
  • News Classification
  • Query Formulation
  • Results Presentation
• Hermes News Portal:
  • An example
• Conclusions
• Future Work
Motivation
• Large quantity of news on the Web:
  • Difficult to find the items of interest
• Limited annotation of RSS feeds:
  • Broad categories (business, cars, entertainment, etc.)
• News messages have a big impact on stock prices
• Google Finance shows direct news that pertains to a certain portfolio:
  • Indirect news (e.g., about competitors of Google such as Microsoft) is not presented
  • Not possible to ask (time-related) queries about news
Hermes Framework
• Input:
  • News items from RSS feeds
  • Domain ontology linked to a semantic lexicon (e.g., WordNet)
• Output:
  • News items relevant for a particular user
• Three steps:
  • News Classification:
    • Relate news items to ontology concepts
  • Query Formulation:
    • Allow the user to express their concepts of interest
  • Results Presentation:
    • Present the news items that match the user’s concepts of interest
1. News Classification
• Concept defined in the ontology (class or individual)
• Multiple lexical representations for the same concept:
  • Ontology synonyms (e.g., New York → New York, Big Apple)
  • Semantic lexicon synonyms (e.g., buy → acquire)
• Concepts without subclasses or instances:
  • Semantic lexicon hyponyms (e.g., company → dot-com)
• Look up ontology concepts in news items
• Heuristic: at least three hits (concepts) in a news item
• Work in progress: use a word sense disambiguation algorithm (e.g., SSI, GAMBL)
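The lookup step and the three-hit heuristic can be sketched as follows. This is a minimal illustration, not the Hermes implementation: the lexicon contents, the function names, and the reading of "three hits" as three distinct concepts are all assumptions, and no word sense disambiguation is attempted.

```python
import re
from typing import Dict, List, Set

# Hypothetical concept -> lexical representations map (ontology synonyms
# plus semantic-lexicon synonyms); the entries are illustrative only.
LEXICAL_REPRESENTATIONS: Dict[str, List[str]] = {
    "Google": ["Google"],
    "Microsoft": ["Microsoft"],
    "NewYork": ["New York", "Big Apple"],
    "Buy": ["buy", "acquire"],
}

# Heuristic from the slide: at least three hits in a news item.
# Assumption: a "hit" is counted once per distinct concept.
MIN_HITS = 3


def find_concepts(text: str, lexicon: Dict[str, List[str]]) -> Set[str]:
    """Return the concepts that have at least one lexical form in the text."""
    found: Set[str] = set()
    for concept, forms in lexicon.items():
        for form in forms:
            # Whole-word, case-insensitive match; no stemming.
            if re.search(r"\b" + re.escape(form) + r"\b", text, re.IGNORECASE):
                found.add(concept)
                break
    return found


def classify(text: str,
             lexicon: Dict[str, List[str]] = LEXICAL_REPRESENTATIONS,
             min_hits: int = MIN_HITS) -> Set[str]:
    """Return the matched concepts, or an empty set below the threshold."""
    concepts = find_concepts(text, lexicon)
    return concepts if len(concepts) >= min_hits else set()
```

Under these assumptions, a sentence mentioning Google, "acquire", and "Big Apple" is related to three concepts and is kept, while an item that mentions only two concepts is discarded.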
1. News Classification
• The news classification process:
2. Query Formulation
• Present the domain knowledge as a directed labeled multigraph:
  • with the additional constraint that arcs between the same two nodes are not allowed to share the same label
• User selects the concepts of interest in the original graph (e.g., Google)
• User is able to add to the selection concepts that are related to the concepts of interest via a certain relation (e.g., hasCompetitors: Microsoft, eBay, and Yahoo)
• The selected concepts are presented in a separate graph (called the search graph)
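One way to picture the graph model is a set of (source, label, target) triples: several arcs may connect the same two nodes, but never with the same label. The sketch below is an assumption-laden illustration; `ConceptGraph`, `add_arc`, `related`, and `build_search_graph` are hypothetical names, not part of Hermes.

```python
from typing import Set, Tuple

Arc = Tuple[str, str, str]  # (source concept, relation label, target concept)


class ConceptGraph:
    """Directed labeled multigraph with unique labels per node pair."""

    def __init__(self) -> None:
        self.arcs: Set[Arc] = set()

    def add_arc(self, source: str, label: str, target: str) -> None:
        arc = (source, label, target)
        # Enforce the constraint: two arcs between the same nodes
        # may not share a label.
        if arc in self.arcs:
            raise ValueError("duplicate arc: %s -%s-> %s" % arc)
        self.arcs.add(arc)

    def related(self, source: str, label: str) -> Set[str]:
        """Concepts reachable from source through one arc with this label."""
        return {t for (s, l, t) in self.arcs if s == source and l == label}


def build_search_graph(graph: ConceptGraph, selected: str, relation: str) -> Set[str]:
    """The selected concept plus the concepts related to it via the relation."""
    return {selected} | graph.related(selected, relation)
```

With the slide's example, selecting Google and expanding along hasCompetitors would yield a search graph containing Google, Microsoft, eBay, and Yahoo.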
2. Query Formulation
• News items are time-stamped
• User is able to specify that only news in a certain time interval should be retrieved
• Time constraints:
  • Last hour
  • Last day
  • Last year
  • [2007-03-01T00:00:00.000+00:01, 2007-05-31T00:00:00.000+00:01]
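The preset time constraints can be read as intervals that end at the current moment. The following sketch assumes closed intervals, approximates "last year" as 365 days, and represents news items as hypothetical (title, timestamp) pairs; none of these choices are confirmed by the slides.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

# Preset windows from the slide. Assumption: "last year" means 365 days.
PRESETS = {
    "last hour": timedelta(hours=1),
    "last day": timedelta(days=1),
    "last year": timedelta(days=365),
}

NewsItem = Tuple[str, datetime]  # hypothetical shape: (title, timestamp)


def preset_interval(name: str, now: datetime) -> Tuple[datetime, datetime]:
    """Translate a preset name into an explicit (start, end) interval."""
    return (now - PRESETS[name], now)


def in_interval(items: Iterable[NewsItem],
                start: datetime, end: datetime) -> List[NewsItem]:
    """Keep the items whose timestamp lies in the closed interval [start, end]."""
    return [(title, ts) for title, ts in items if start <= ts <= end]
```

An explicit interval such as the one on the slide can be passed straight to `in_interval`, so presets and user-defined intervals share one filtering path.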
3. Results Presentation
• Return news items that match a query
• Present the concepts involved in the query
• For each news item show a summary:
  • Title
  • Source
  • Date
  • A few lines from the news item
• Emphasize the hits (found concepts from the ontology) in the retrieved news items
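Emphasizing the hits amounts to marking every occurrence of a found lexical form in the displayed lines. A minimal sketch, assuming plain-text output and an asterisk as the marker (the real portal presumably uses visual highlighting); `emphasize_hits` is a hypothetical helper name.

```python
import re
from typing import Iterable


def emphasize_hits(snippet: str, hits: Iterable[str], marker: str = "*") -> str:
    """Wrap each whole-word occurrence of a hit in the marker."""
    # Longest forms first, so multi-word forms win over shorter overlaps.
    forms = sorted(set(hits), key=len, reverse=True)
    if not forms:
        return snippet
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(form) for form in forms) + r")\b",
        re.IGNORECASE,
    )
    # Keep the original casing of the matched text.
    return pattern.sub(lambda m: marker + m.group(0) + marker, snippet)
```

The same function could be applied to the title and to the few lines shown in each summary.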
Hermes News Portal
• Hermes News Portal (HNP) is an implementation of the Hermes framework
• Implementation language: Java
• Ontology representation language: OWL
• Semantic lexicon: WordNet
• Graph visualization: Prefuse
• Query language: SPARQL
  • SPARQL extended with custom time functions (e.g., currentDate(), currentTime(), etc.)
An Example
• Query: Which news items from the past three months are relevant to Google?
News Classification
• Conceptual graph:
2. Query Formulation
• Concepts selection:
2. Query Formulation
• Conceptual graph (legend):
  • Individuals
  • Classes
  • Selected concepts
  • Concepts related to the selected node
  • Concepts from keyword search
2. Query Formulation
• Search graph:
2. Query Formulation
• SPARQL query:
  PREFIX hermes: <http://hermes-news.org/news.owl#>
  SELECT ?title
  WHERE {
    ?news hermes:title ?title .
    ?news hermes:time ?date .
    ?news hermes:relation ?relation .
    ?relation hermes:relatedTo hermes:Google .
    FILTER ( ?date > "2007-03-01T00:00:00.000+00:01" &&
             ?date < "2007-05-31T00:00:00.000+00:01" )
  }
2. Query Formulation
• Custom time functions:
2. Query Formulation
• Extended SPARQL query:
  PREFIX hermes: <http://hermes-news.org/news.owl#>
  SELECT ?title
  WHERE {
    ?news hermes:title ?title .
    ?news hermes:time ?date .
    ?news hermes:relation ?relation .
    ?relation hermes:relatedTo hermes:Google .
    FILTER ( ?date > hermes:dateTime-substract(hermes:now(), P0Y3M) &&
             ?date < hermes:now() )
  }
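The extended query relies on subtracting a duration such as P0Y3M from the current time. How Hermes implements this is not shown on the slides; the sketch below is one plausible reading, written in Python for illustration. It handles only the year and month parts of an ISO 8601 duration and clamps the day of the month when the target month is shorter. The function name `subtract_duration` is hypothetical.

```python
import calendar
import re
from datetime import datetime

# Assumption: only year/month durations such as "P0Y3M" need to be handled.
_DURATION = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?$")


def subtract_duration(moment: datetime, duration: str) -> datetime:
    """Return moment minus a year/month duration, clamping the day."""
    match = _DURATION.match(duration)
    if match is None or duration == "P":
        raise ValueError("unsupported duration: %r" % duration)
    years = int(match.group(1) or 0)
    months = int(match.group(2) or 0)
    # Work in absolute months so that borrowing across years is trivial.
    total = moment.year * 12 + (moment.month - 1) - (years * 12 + months)
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day, e.g. 31 May minus three months becomes 28 February.
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)
```

For a query issued on 31 May 2007, this gives 28 February 2007 as the lower bound, which differs slightly from the fixed interval shown earlier (starting 1 March 2007); the choice of clamping rule is part of the assumption.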
3. Results Presentation
Conclusions
• Hermes Framework: presents news items that match the user's interests
• Hermes Framework:
  • News Classification
  • Query Formulation
  • Results Presentation
• Hermes News Portal (HNP): an implementation of the Hermes framework
• HNP based on:
  • WordNet semantic lexicon, OWL ontology, (extended) SPARQL queries, Prefuse visualization
Future Work
• Word Sense Disambiguation:
  • SSI
  • GAMBL
• Ontology updates:
  • Learning from news items
  • Check if the extracted information obeys the ontology axioms:
    • Faulty extraction
    • Ontology axioms update
• Simplify the query interface:
  • Allow users to ask English queries from a limited vocabulary
• Evaluate the tool outside the university lab