MIA: An Information System for Mobile Users
Bernd Thomas, University Koblenz, AI-Research Group
Outline
• Motivation
• Architecture & Functionality
• Agents, Communication & Distribution
• Information Extraction
• Using Ontological Knowledge
The Mobile Information Agent Project
What is MIA? A multi-agent information system that provides a mobile user with location-based information according to his individual interests.
• Clients: web browser, WAP, PDA + GPS
• Ambition:
  • Online web search and information extraction
  • Location awareness
  • Anytime algorithm
• Uses logic programming (LP): agents + ontology
• Distributed multi-agent system
• Project time: 1.1.2000 - 31.10.2002
[Figure: usage workflow with the steps profile creation, search constraint, search start and logout, re-login]
MIA and its Agents
[Figure: architecture diagram. Clients: PDA/GPS/mobile, WAP, web browser. Agents: server agent, matchmaker, blackboard agent, spider agent, ontology agent, foreign agent, and platform agents on host 1 to host n. Protocols: KQML, HTTP. Data: user interests, GPS DB, blackboard. Edge labels: search, classify, extract, query, request, start, register.]
Agent Communication
• MIA's agent system and communication architecture are oriented toward FIPA.
• MIA's agents use KQML performatives.
• Agent platforms abstract from the machine and provide an environment for agents.
• Other agents can query platform agents for running agents or request the starting of agents.

Example communication session (system startup and search with 3 hosts):

Phase I
  host A -> platform A : start
  platform A -> matchmaker : start
  platform A -> blackboard : start
  platform A -> server : start
Phase II
  host B -> platform B : start
  platform B -> matchmaker : register
  host C -> platform C : start
  platform C -> matchmaker : register
Phase III
  server -> matchmaker : ask for blackboard
  matchmaker -> server : blackboard address
  matchmaker -> server : recommend blackboard
  server -> blackboard : ask for old results
  blackboard -> server : send old results
  blackboard -> matchmaker : subscribe to agent status change
Start search
  server -> matchmaker : ask for spider recommendation
  matchmaker -> platform B : create spider agent
  spider -> platform B : created
  platform B -> matchmaker : send spider address
  platform B -> matchmaker : send spider recommendation
  server -> spider : start spidering topic/city
  server -> spider : send all results for topic/city
  spider -> server : starting to search
  matchmaker -> blackboard : there is a new spider
  blackboard -> spider : send all results for topic/city
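The register and recommend steps of the session can be sketched in a few lines of Python. This is a minimal illustration only: the slides do not say which KQML performatives MIA uses, so the choice of `register`, `recommend-one` and `tell`, the class names `KqmlMessage` and `Matchmaker`, and the address `blackboard@hostA` are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KqmlMessage:
    """A simplified KQML message (only four parameters are modelled)."""
    performative: str
    sender: str
    receiver: str
    content: str

    def render(self) -> str:
        # KQML messages are written as parenthesised keyword lists.
        return (f"({self.performative} :sender {self.sender} "
                f":receiver {self.receiver} :content ({self.content}))")


class Matchmaker:
    """Hypothetical matchmaker: keeps a registry of service providers."""

    def __init__(self):
        self.registry = {}  # service name -> list of agent addresses

    def handle(self, msg):
        if msg.performative == "register":
            # The sender announces that it offers the named service.
            self.registry.setdefault(msg.content, []).append(msg.sender)
            return None
        if msg.performative == "recommend-one":
            # Answer with the first known provider, or nil if there is none.
            providers = self.registry.get(msg.content, [])
            answer = providers[0] if providers else "nil"
            return KqmlMessage("tell", "matchmaker", msg.sender, answer)
        raise ValueError(f"unsupported performative: {msg.performative}")


matchmaker = Matchmaker()
matchmaker.handle(
    KqmlMessage("register", "blackboard@hostA", "matchmaker", "blackboard"))
reply = matchmaker.handle(
    KqmlMessage("recommend-one", "server", "matchmaker", "blackboard"))
print(reply.render())
```

The reply corresponds to the "matchmaker -> server : blackboard address" step of Phase III.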
Agent Distribution Policy
[Figure: trade-off between "much communication, less computation" and "less communication, much computation"]
A distributed MAS has two goals:
• distribute computation among machines
• minimize communication between machines
MIA uses a simple distribution policy:
• platform-agent 1: matchmaker, server and blackboard
• platform-agents 2-n: ontology-agent and spider-agents are equally distributed
Load balancing: MIA does not use automatic load balancing, but new platform-agents can be added while the system is online.
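The policy above could look roughly like the following Python sketch. It assumes that "equally distributed" means round-robin assignment, which the slides do not state; the function name `distribute` and the platform and agent names are hypothetical.

```python
def distribute(platforms, worker_agents):
    """Place core agents on the first platform, workers on the rest.

    Assumption: workers are spread round-robin over platforms 2..n.
    """
    if len(platforms) < 2:
        raise ValueError("need one core platform and at least one worker platform")
    core, workers = platforms[0], platforms[1:]
    placement = {p: [] for p in platforms}
    placement[core] = ["matchmaker", "server", "blackboard"]
    for i, agent in enumerate(worker_agents):
        # Cycle through the worker platforms so the load stays even.
        placement[workers[i % len(workers)]].append(agent)
    return placement


placement = distribute(
    ["platform-1", "platform-2", "platform-3"],
    ["ontology", "spider-1", "spider-2", "spider-3"],
)
for platform, agents in placement.items():
    print(platform, agents)
```

Adding a platform while the system is online then amounts to calling the assignment again with a longer platform list for newly started agents; already running agents stay where they are, which matches the absence of automatic load balancing.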
Information Extraction
MIA uses two modes to extract information from web pages:
• apply offline learned wrappers (synthesized extraction procedures)
  • a set of predefined pages is examined by offline learned wrappers
• online learning of wrappers
  • a wrapper is learned for each page that the spider finds and that is positively classified as containing addresses
  • major problem: absence of examples!
• Both the online and the offline method learn from positive examples only.
• Both methods use LGG (least general generalization) techniques on feature terms to learn.
IE: Offline Wrapper Learning
Wrapper Learning System: for offline learning and integration into the MIA system
• Learning technique:
  • Document representation:
    • logical representation of a DOM tree (set of facts)
    • each node is represented by a feature term
  • Idea:
    • learn relevant features of ancestor and descendant nodes surrounding the nodes relevant for extraction
  • Method:
    • learning from positive examples (subtrees) only
    • LGG on feature terms
    • user-based inductive learning
  • Result: generalized node paths
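To make the LGG step concrete, here is a minimal Python sketch of generalizing feature terms from positive examples. It simplifies heavily: feature terms are modelled as flat dictionaries, differing values are replaced by a wildcard, and features missing from one example are dropped. The slides do not describe the actual term structure, so these choices and the names `lgg`, `generalize` and `matches` are assumptions.

```python
from functools import reduce

ANY = "_"  # wildcard standing for "any value"


def lgg(a, b):
    """Least general generalization of two flat feature terms."""
    out = {}
    for key in a.keys() & b.keys():
        # Keep a value both examples agree on, otherwise generalize it.
        out[key] = a[key] if a[key] == b[key] else ANY
    return out


def generalize(examples):
    """Fold lgg over all positive examples."""
    return reduce(lgg, examples)


def matches(pattern, term):
    """True if the term is an instance of the generalized pattern."""
    return all(
        key in term and (value == ANY or term[key] == value)
        for key, value in pattern.items()
    )


# Two hypothetical DOM nodes marked by the user as relevant.
examples = [
    {"tag": "td", "class": "addr", "depth": 4},
    {"tag": "td", "class": "addr", "depth": 5, "align": "left"},
]
pattern = generalize(examples)
print(pattern)
print(matches(pattern, {"tag": "td", "class": "addr", "depth": 9}))
print(matches(pattern, {"tag": "tr", "class": "addr", "depth": 4}))
```

Because only positive examples are available, the pattern is as specific as the examples allow: it covers every example but generalizes no further than needed.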
IE: Online Wrapper Learning
• Major problem:
  • how to obtain learning examples (example extractions) for unknown pages?
• Idea:
  • use (very strict) address patterns to identify only a few addresses on a page
  • these few matches serve as learning examples
• Document representation:
  • list of tokens (feature terms)
• Method:
  • one-shot learning (generalize in one step on all examples)
  • one wrapper is learned for each page
• Result:
  • generalized feature-term lists used as left and right delimiters for extraction
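The online method can be illustrated with a small Python sketch. It is not MIA's implementation: tokens are plain strings instead of feature terms, the strict pattern `STRICT_ADDRESS` is an invented regular expression, the context width of two tokens is arbitrary, and the sample page with its restaurant names and addresses is made up.

```python
import re

ANY = "*"  # wildcard token in a generalized delimiter

# Hypothetical strict pattern: "<street> <no>, <5-digit zip> <city>".
STRICT_ADDRESS = re.compile(r"^[A-Za-z.]+ \d+, \d{5} [A-Za-z]+$")


def tokenize(html):
    """Split a page into tag tokens and text tokens."""
    return [t.strip() for t in re.findall(r"<[^>]+>|[^<]+", html) if t.strip()]


def generalize(contexts):
    """One-shot generalization: keep a token only if all examples agree."""
    return [col[0] if len(set(col)) == 1 else ANY for col in zip(*contexts)]


def learn_wrapper(tokens, width=2):
    """Learn left and right delimiters from strict-pattern matches."""
    seeds = [i for i, t in enumerate(tokens)
             if STRICT_ADDRESS.match(t) and i >= width and i + width < len(tokens)]
    if not seeds:
        return None
    left = generalize([tokens[i - width:i] for i in seeds])
    right = generalize([tokens[i + 1:i + 1 + width] for i in seeds])
    return left, right


def fits(pattern, window):
    return len(pattern) == len(window) and all(
        p == ANY or p == w for p, w in zip(pattern, window))


def extract(tokens, wrapper):
    """Return every token enclosed by the learned delimiters."""
    left, right = wrapper
    n, m = len(left), len(right)
    return [tokens[i] for i in range(n, len(tokens) - m)
            if fits(left, tokens[i - n:i]) and fits(right, tokens[i + 1:i + 1 + m])]


PAGE = ("<ul>"
        "<li><b>Roma</b><span>Hauptstr. 5, 56068 Koblenz</span></li>"
        "<li><b>Napoli</b><span>Am Markt 3 (Altstadt)</span></li>"
        "<li><i>Capri</i><span>Bahnweg 12, 56070 Koblenz</span></li>"
        "</ul>")

tokens = tokenize(PAGE)
wrapper = learn_wrapper(tokens)
print(wrapper)
print(extract(tokens, wrapper))
```

In this made-up page the strict pattern matches only two of the three addresses, yet the wrapper learned from those two matches also extracts the third, which is the point of the approach.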
IE: Extraction Evaluation
How can the agent verify the quality of its extractions?
• Evaluation for online learned wrappers:
  • "self-supervision": check whether extractions match generalized patterns derivable from the knowledge base
  • semantic cross-check: use the semantics associated with the slots for evaluation
• Evaluation for offline learned wrappers:
  • semantic cross-check
MIA's Ontology
Ontological knowledge is useful for:
• Web spidering: keywords from the user profile may not be sufficient
• Information extraction: check correctness of extractions
Description logic is used to model the ontology for the gastronomy & recreation domains:
• RACER: Renamed ABox and Concept Expression Reasoner (Volker Haarslev, Ralf Möller)
• KrHyper (Peter Baumgartner) [WLP2001]: bottom-up model generation, DL-like language (plus non-monotonic negation, rule-based language)
Ontology
[Figure: partial TBox of MIA's gastronomy ontology]
• Currently covered:
  • gastronomy
  • recreation
• ABox (3800 facts)
• TBox (~ 90 concepts)
Ontology Agent

RACER system

TBOX:
    (implies c_mahlzeit c_essen).
    (equivalent c_speisestaette
        (and c_ort
             (some offers c_mahlzeit)
             (some of_nationality c_nationalitaet))).
    (implies c_fastfood
        (and c_speisestaette (not (some has_service c_service)))).
    (equivalent c_restaurant
        (and c_speisestaette (some has_service c_service))).

ABOX:
    (instance antipasti c_mahlzeit).
    (instance ristorante c_restaurant).

Ontology Agent:
    about(X,Explanation) :-
        racer('instantiators'(X),Concept),
        Explanation = ('instantiators'=Concept).
    about(X,Explanation) :-
        racer('concept-ancestors'(X),Subsumers),
        Explanation = ('concept-ancestors'=Subsumers).
    about(X,Explanation) :-
        racer('concept-descendants'(X),Subsumees),
        Explanation = ('concept-descendants'=Subsumees).

    % meta queries
    related(X,R) :-
        racer('retrieve-individual-fillers'(X,'offered_by'),R1),
        (   assert_answer(X,R1,R)
        ;   (   racer('retrieve-individual-fillers'(R1,'offers_same'),R2),
                assert_answer(X,R2,R) )
        ;   (   racer('retrieve-individual-fillers'(R1,'inv_offers_same'),R2),
                assert_answer(X,R2,R) )
        ).

Ontology Agent queries (info on search topics):
    [eclipse 6]: about(antipasti,X).
    X = instantiators = ['C_MAHLZEIT'] More? (;)
    X = instantiators = ['C_ESSEN'] More? (;)
    X = instantiators = ['C_VERDERBLICH'] More? (;)
    X = instantiators = ['C_PRODUKT'] More? (;)
    X = instantiators = ['C_DING'] More? (;)
    X = instantiators = ['C_FESTSTOFF'] More? (;)

    [eclipse 7]: related_term(antipasti,X).
    X = 'OF_NATIONALITY' = 'ITALIENISCH' More? (;)
    X = 'OFFERED_BY' = 'PIZZERIA' More? (;)

    [eclipse 11]: related(antipasti,X).
    X = pizzeria More? (;)
    X = osteria More? (;)
    X = pasticceria More? (;)
    X = ristorante More? (;)
    X = rosticceria More? (;)
    X = trattoria More? (;)
    X = pizza_zum_mitnehmen More? (;)
    X = antipasti More? (;)
    X = carpaccio More? (;)
    X = cozze More? (;)
    X = maccaroni More? (;)
    X = nudeln More? (;)
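What an `instantiators` query returns can be approximated with a short Python sketch that only follows subsumptions stated directly in the four TBox axioms above. This is a deliberate simplification and not how RACER works: a description logic reasoner also derives subsumptions and instance memberships from concept definitions and role fillers. The names `TOLD_PARENTS`, `concept_ancestors` and `instantiators` are illustrative.

```python
# Told subsumptions read off the TBox axioms shown on the slide
# (the left-hand concept is subsumed by each concept in the list).
TOLD_PARENTS = {
    "c_mahlzeit": ["c_essen"],
    "c_speisestaette": ["c_ort"],
    "c_fastfood": ["c_speisestaette"],
    "c_restaurant": ["c_speisestaette"],
}

# ABox assertions from the slide: individual -> asserted concept.
INSTANCES = {"antipasti": "c_mahlzeit", "ristorante": "c_restaurant"}


def concept_ancestors(concept):
    """All concepts reachable by following told subsumptions upwards."""
    seen, queue = [], list(TOLD_PARENTS.get(concept, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(TOLD_PARENTS.get(current, []))
    return seen


def instantiators(individual):
    """The asserted concept of an individual plus all its ancestors."""
    direct = INSTANCES[individual]
    return [direct] + concept_ancestors(direct)


print(instantiators("antipasti"))
print(instantiators("ristorante"))
```

The slide's own query output lists more concepts for `antipasti` (for example `C_PRODUKT` and `C_DING`) because the full TBox has about 90 concepts, of which only four axioms are shown here.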
Outlook
• Need for cooperation with a telecom provider for automatic user position estimation via cell information of mobile phones
• Ongoing research in information extraction, with good results for HTML/XML documents
• Major problem: online learning of wrappers. MIA uses a very heuristic method ... good ideas needed.
• Ontology-based web spidering ... let us see what the semantic web project offers.
• Left out in this project: sharing search and extraction work among agents
References
• Peter Baumgartner, Ulrich Furbach and Bernd Thomas: Model Based Deduction for Knowledge Representation. 17. WLP - Workshop Logische Programmierung, Technische Universität Dresden, 4-6 September 2002.
• Nicholas Kushmerick and Bernd Thomas: Adaptive Information Extraction: A Core Technology for Information Agents. In: Intelligent Information Agents R&D in Europe: An AgentLink Perspective. Springer, 2002.
• Gerd Beuster, Bernd Thomas and Christian Wolff: Ubiquitous Web Information Agents. Workshop on Artificial Intelligence in Mobile Systems, ECAI'2000, European Conference on Artificial Intelligence, August 22nd 2000, Berlin, Germany.
• Bernd Thomas: Token-Templates and Logic Programs for Intelligent Web Search. Journal of Intelligent Information Systems, Kluwer Academic Publishers, Special Issue: Methodologies for Intelligent Information Systems, Volume 14, Number 2/3, March-June 2000, pp. 241-261.
• Bernd Thomas: Anti-Unification Based Learning of T-Wrappers for Information Extraction. Workshop on Machine Learning for Information Extraction, preceding the Sixteenth National American Conference on Artificial Intelligence (AAAI-99), July 18-19, Orlando, Florida.
Register and Profile Creation
• register as a new user
• specify search topics according to your individual interests
• a user can create multiple search profiles
• the user is not restricted to a single domain
Start Agents and Retrieve Info
• start a search for information on a specific city
• each topic is handled by one spider agent
• agent status is monitored
• extraction results are cached
• relational extraction results can easily be linked to other web services
Restrict the Search
• search can be restricted by additional keywords
Logout and Come Back Later
• the user can start the search and come back later to retrieve his information
• this helps the mobile user to minimize costs
... coming back
WLS: Learning a Wrapper
• Many web services use highly structured web pages, to which machine-learning-based wrapper techniques can be applied successfully
• WLS is a prototypical web interface that helps the MIA administrator learn and add new wrappers
• MIA applies learned wrappers to each page of the associated web domain
WLS: Applying a Wrapper
Monitoring the System