Hermes system helps find relevant news on the web using semantic technology. Classifies news, forms queries, and presents results based on user preferences. Includes Hermes News Portal implementation details.
Hermes: a Semantic Web-Based News Decision Support System* Flavius Frasincar frasincar@few.eur.nl Erasmus University Rotterdam * Joint work with Jethro Borsje and Leonard Levering SAC WT 2008
Contents • Motivation • Hermes Framework: • News Classification • Query Formulation • Results Presentation • Hermes News Portal: • An example • Conclusions • Future Work
Motivation • Large quantity of news on the Web: • Difficult to find the items of interest • Limited annotation of RSS feeds: • Broad categories (business, cars, entertainment, etc.) • News messages have a big impact on stock prices • Google Finance shows only direct news pertaining to a given portfolio: • Indirect news is not presented (e.g., news about competitors of Google such as Microsoft) • Not possible to ask (time-related) queries about news
Hermes Framework • Input: • News items from RSS feeds • Domain ontology linked to a semantic lexicon (e.g., WordNet) • Output: • News items relevant for a particular user • Three steps: • News Classification: • Relate news items to ontology concepts • Query Formulation: • Allow the user to express their concepts of interest • Results Presentation: • Present the news items that match the user’s concepts of interest
1. News Classification • Concept defined in the ontology (class or individual) • Multiple lexical representations for the same concept: • Ontology synonyms (e.g., New York → New York, Big Apple) • Semantic lexicon synonyms (e.g., buy → acquire) • Concepts without subclasses or instances: • Semantic lexicon hyponyms (e.g., company → dot-com) • Look up ontology concepts in news items • Heuristic: at least three hits (concepts) in a news item • Work in progress: use a word sense disambiguation algorithm (e.g., SSI, GAMBL)
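The lookup step and the three-hit heuristic can be illustrated with a minimal sketch. It is not the Hermes implementation: the tiny lexicon, the concept names, and the helper names `find_hits` and `classify` are hypothetical, and the sketch assumes that "three hits" means three distinct concepts matched by whole-word, case-insensitive lookup.

```python
import re

# Hypothetical mini-lexicon: ontology concept -> lexical representations
# (ontology synonyms and semantic-lexicon synonyms merged into one list).
LEXICAL_REPRESENTATIONS = {
    "Google": ["Google"],
    "Microsoft": ["Microsoft"],
    "NewYork": ["New York", "Big Apple"],
    "Buy": ["buy", "acquire"],
}

MIN_HITS = 3  # heuristic from the slide: at least three hits per news item


def find_hits(text, lexicon=LEXICAL_REPRESENTATIONS):
    """Return the concepts that have at least one representation in the text."""
    hits = set()
    for concept, labels in lexicon.items():
        for label in labels:
            # Whole-word, case-insensitive match (an assumption of this sketch).
            if re.search(r"\b" + re.escape(label) + r"\b", text, re.IGNORECASE):
                hits.add(concept)
                break
    return hits


def classify(text, lexicon=LEXICAL_REPRESENTATIONS, min_hits=MIN_HITS):
    """Relate a news item to its concepts, or to none if it has too few hits."""
    hits = find_hits(text, lexicon)
    return hits if len(hits) >= min_hits else set()
```

With this lexicon, an item mentioning Google, "acquire", and "Big Apple" is related to three concepts, while an item with only two matching concepts is left unclassified.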
1. News Classification • The news classification process:
2. Query Formulation • Present the domain knowledge as a directed labeled multigraph: • with the additional constraint that arcs between the same two nodes are not allowed to share the same label • The user selects the concepts of interest in the original graph (e.g., Google) • The user can add to the selection concepts related to the concepts of interest via a certain relation (e.g., hasCompetitors: Microsoft, eBay, and Yahoo) • The selected concepts are presented in a separate graph (called the search graph)
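A minimal sketch of the graph model and the selection step follows. The class `ConceptGraph` and the function `build_search_graph` are hypothetical names, not part of Hermes; the sketch assumes arcs can be stored as (source, label, target) triples, so that keeping them in a set enforces the constraint that two arcs between the same nodes never share a label.

```python
class ConceptGraph:
    """Directed labeled multigraph stored as a set of (source, label, target) arcs."""

    def __init__(self):
        self._arcs = set()

    def add_arc(self, source, label, target):
        # A set silently ignores a repeated (source, label, target) triple,
        # while arcs with different labels between the same nodes are kept.
        self._arcs.add((source, label, target))

    def arc_count(self):
        return len(self._arcs)

    def related(self, concept, label):
        """Concepts reachable from `concept` via arcs carrying `label`."""
        return sorted(t for (s, l, t) in self._arcs if s == concept and l == label)


def build_search_graph(graph, selected, expansions):
    """Collect the selected concepts plus those reached via chosen relations.

    `expansions` is an iterable of (concept, relation label) pairs.
    """
    search = set(selected)
    for concept, label in expansions:
        search.update(graph.related(concept, label))
    return search


graph = ConceptGraph()
for competitor in ("Microsoft", "eBay", "Yahoo"):
    graph.add_arc("Google", "hasCompetitors", competitor)
graph.add_arc("Google", "hasCompetitors", "Yahoo")  # duplicate, ignored

search_graph = build_search_graph(graph, {"Google"}, [("Google", "hasCompetitors")])
```

Here `search_graph` holds Google together with the three competitors named on the slide.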
2. Query Formulation • News items are time-stamped • The user can specify that only news from a certain time interval should be retrieved • Time constraints: • Last hour • Last day • Last year • [2007-03-01T00:00:00.000+00:01, 2007-05-31T00:00:00.000+00:01 ]
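The relative time constraints can be turned into explicit intervals, as in the sketch below. It is only an illustration: the names `RELATIVE_WINDOWS`, `interval_for`, and `in_interval` are hypothetical, "last year" is approximated as 365 days, and the bounds are treated as exclusive.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from the constraint names on the slide to durations.
RELATIVE_WINDOWS = {
    "last hour": timedelta(hours=1),
    "last day": timedelta(days=1),
    "last year": timedelta(days=365),  # simplification: ignores leap years
}


def interval_for(constraint, now):
    """Return the (start, end) interval for a relative time constraint."""
    return (now - RELATIVE_WINDOWS[constraint], now)


def in_interval(timestamp, interval):
    """True if the news time stamp lies strictly inside the interval."""
    start, end = interval
    return start < timestamp < end


now = datetime(2007, 5, 31, 12, 0, tzinfo=timezone.utc)
last_day = interval_for("last day", now)
```

An explicit interval such as the one on the slide can be passed to `in_interval` directly as a pair of time-zone-aware `datetime` values.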
3. Results Presentation • Return news items that match a query • Present the concepts involved in the query • For each news item, show a summary: • Title • Source • Date • A few lines from the news item • Emphasize the hits (found concepts from the ontology) in the retrieved news items
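As a rough illustration of the summary and the hit emphasis, the sketch below wraps each found label in asterisks and prints title, source, date, and a short snippet. The function names, the dictionary keys, and the asterisk marker are assumptions of this sketch; the portal itself may emphasize hits differently.

```python
import re


def emphasize_hits(text, labels, marker="*"):
    """Wrap every whole-word occurrence of a label in the given marker."""
    if not labels:
        return text
    # Longest labels first, so a multi-word label wins over a shorter one.
    alternatives = sorted(labels, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(label) for label in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: marker + match.group(0) + marker, text)


def format_result(item, labels, snippet_chars=60):
    """Build a three-line summary: title, source and date, snippet."""
    snippet = item["text"][:snippet_chars]
    return "\n".join([
        emphasize_hits(item["title"], labels),
        f'{item["source"]} | {item["date"]}',
        emphasize_hits(snippet, labels),
    ])


# Hypothetical news item used only to exercise the functions.
item = {
    "title": "Google expands",
    "source": "Example Feed",
    "date": "2007-05-01",
    "text": "Google opened an office.",
}
summary = format_result(item, ["Google"])
```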
Hermes News Portal • Hermes News Portal (HNP) is an implementation of the Hermes framework • Implementation language: Java • Ontology representation language: OWL • Semantic lexicon: WordNet • Graph visualization: Prefuse • Query language: SPARQL • SPARQL extended with custom time functions (e.g., currentDate(), currentTime(), etc.)
An Example • Query: Which news items from the past three months are of interest with respect to Google?
News Classification • Conceptual graph:
2. Query Formulation • Concepts selection:
2. Query Formulation • Conceptual graph (figure legend): individuals, classes, selected concepts, concepts related to the selected node, concepts from keyword search
2. Query Formulation • Search graph:
2. Query Formulation • SPARQL query:
PREFIX hermes: <http://hermes-news.org/news.owl#>
SELECT ?title
WHERE {
  ?news hermes:title ?title .
  ?news hermes:time ?date .
  ?news hermes:relation ?relation .
  ?relation hermes:relatedTo hermes:Google .
  FILTER ( ?date > "2007-03-01T00:00:00.000+00:01" &&
           ?date < "2007-05-31T00:00:00.000+00:01" )
}
2. Query Formulation • Custom time functions:
2. Query Formulation • Extended SPARQL query:
PREFIX hermes: <http://hermes-news.org/news.owl#>
SELECT ?title
WHERE {
  ?news hermes:title ?title .
  ?news hermes:time ?date .
  ?news hermes:relation ?relation .
  ?relation hermes:relatedTo hermes:Google .
  FILTER ( ?date > hermes:dateTime-substract(hermes:now(), P0Y3M) &&
           ?date < hermes:now() )
}
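The effect of the filter in the extended query, "later than now minus three months and earlier than now", can be sketched outside SPARQL as follows. This is not the code behind `hermes:dateTime-substract`; the helpers `subtract_months` and `in_last_months` are hypothetical, and clamping the day to the end of a shorter month is an assumption of this sketch.

```python
import calendar
from datetime import datetime


def subtract_months(moment, months):
    """Return `moment` shifted back by a whole number of months.

    The day is clamped to the last day of the target month
    (e.g. 31 May minus three months gives 28 February in 2007).
    """
    total = moment.year * 12 + (moment.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


def in_last_months(timestamp, now, months):
    """Mirror of the FILTER: strictly after (now - months) and strictly before now."""
    return subtract_months(now, months) < timestamp < now


now = datetime(2007, 5, 31)
window_start = subtract_months(now, 3)
```

For the duration `P0Y3M` used in the query, `months` is 3; a duration with a non-zero year part would add 12 months per year.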
3. Results Presentation
Conclusions • Hermes Framework: presents news items that match the user's interests • Hermes Framework: • News Classification • Query Formulation • Results Presentation • Hermes News Portal (HNP): an implementation of the Hermes framework • HNP based on: • WordNet semantic lexicon, OWL ontology, (extended) SPARQL queries, Prefuse visualization
Future Work • Word Sense Disambiguation: • SSI • GAMBL • Ontology updates: • Learning from news items • Check if the extracted information obeys the ontology axioms: • Faulty extraction • Ontology axioms update • Simplify the query interface: • Allow users to pose queries in English using a limited vocabulary • Evaluate the tool outside the university lab