Personal Agents - The impact of the Semantic Web
First presented at AgentLink II, Amsterdam, December 2001
Gunnar AAstrand Grimnes, Alun Preece & Pete Edwards
University of Aberdeen
ggrimnes@csd.abdn.ac.uk
11/4/2002
Machine Learning from Unstructured Data
Challenges:
- Representation of training events
- Data volume
- Noise, redundancy, inconsistencies
- Ill-defined semantics
Tools:
- Statistical analysis of text: TF/IDF, SVD (a minimal TF/IDF sketch follows)
- Naïve Bayes, Nearest Neighbour, Bagging & Boosting
- Dimensionality reduction
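As a concrete illustration of the TF/IDF weighting named above, here is a minimal sketch in Python; the function name and the toy documents are ours, not part of the original experiments.

import math
from collections import Counter

def tfidf(documents):
    """Weight each term in each tokenised document by term frequency * inverse document frequency."""
    n_docs = len(documents)
    document_frequency = Counter()
    for doc in documents:
        document_frequency.update(set(doc))          # count each term once per document
    weighted = []
    for doc in documents:
        term_counts = Counter(doc)
        weighted.append({
            term: (count / len(doc)) * math.log(n_docs / document_frequency[term])
            for term, count in term_counts.items()
        })
    return weighted

# Toy usage: terms occurring in every document get weight 0, rarer terms score higher.
print(tfidf([["semantic", "web", "agents"], ["agents", "profile", "learning"]]))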
The Semantic Web
- Ontologies
- Well-defined syntax: XML
- Well-defined semantics: RDF / DAML+OIL
- Web Services
- Truly flexible autonomous agents assisting users
Semantic Web & Learning
Web agents:
- No more "screen scraping"
- Less human effort required
Recommenders:
- Semantic model of the user
- Semantics for products and services
- New recommender techniques which marry user models and product/service descriptions to deliver meaningful recommendations
Learning profiles:
- Today: sparse vector representation
- Tomorrow: semantically enriched representation
Our Experiment
Aims:
- To investigate how a semantically enriched representation affects profile learning
- To explore methods for mapping semantic markup to training-instance representations
- To explore various performance metrics: accuracy, time to learn a profile, time to use a profile (see the sketch below)
Hypothesis: semantic representation should outperform a simple sparse-vector (text) representation.
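To make the three metrics concrete, here is a minimal evaluation sketch, assuming a classifier object with fit/predict methods and labelled (instance, class) pairs; all names here are ours.

import time

def evaluate(classifier, train, test):
    """Report accuracy, time to learn the profile, and time to use it."""
    X_train, y_train = zip(*train)
    X_test, y_test = zip(*test)

    start = time.perf_counter()
    classifier.fit(X_train, y_train)                 # time to learn the profile
    learn_time = time.perf_counter() - start

    start = time.perf_counter()
    predictions = classifier.predict(X_test)         # time to use the profile
    use_time = time.perf_counter() - start

    accuracy = sum(p == y for p, y in zip(predictions, y_test)) / len(y_test)
    return accuracy, learn_time, use_time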
Datasets
We were unable to find an existing dataset with semantic markup, so we used two datasets:
1. ITTalks: binary classification (like/dislike); 58 instances; classes average 19 and 38 instances; 3200 distinct terms. http://www.ittalks.org
2. CiteSeer papers: 17 classes (subject areas of CS), though we also ran binary classifications; 5066 instances; classes average 298 instances; 400,000 distinct terms, reduced to 1500 by choosing the most significant terms based on TF/IDF (sketched below). http://citeseer.nj.nec.com/directory.html
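The reduction to 1500 terms can be sketched as follows, reusing the per-document TF/IDF weights from the earlier sketch; keeping the terms with the highest weight in any document is our assumption about what "most significant" means here.

def select_top_terms(per_document_weights, k=1500):
    """Keep the k terms with the highest TF/IDF weight in any single document."""
    best_weight = {}
    for weights in per_document_weights:
        for term, weight in weights.items():
            best_weight[term] = max(weight, best_weight.get(term, 0.0))
    return set(sorted(best_weight, key=best_weight.get, reverse=True)[:k])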
Dataset 1 - IT Talks (www.ittalks.org)
Class: Pete - likes, Alun - dislikes, Gunnar - dislikes.

<Talk rdf:parseType="Resource">
  <Title>Bidding Algorithms for Simultaneous Auctions</Title>
  …
  <Abstract>This talk is concerned with computational problems …</Abstract>
  <Speaker rdf:parseType="Resource">
    <Name>Amy Greenwald</Name>
    <Organization>Department of Computer Science Brown University</Organization>
  </Speaker>
  <Host rdf:parseType="Resource">
    <Name>Timothy Finin</Name>
    <Organization>UMBC</Organization>
  </Host>
  ...
Dataset 2 - CiteSeer Papers
Class: Human Computer Interaction
RDF generated from BibTeX:

<?xml version="1.0"?>
<article key="pelachaud96generating">
  <author>Catherine Pelachaud and Norman I. Badler and Mark Steedman</author>
  <title>Generating Facial Expressions for Speech</title>
  <journal>Cognitive Science</journal>
  <volume>20</volume>
  <number>1</number>
  <pages>1-46</pages>
  <year>1996</year>
  <url>citeseer.nj.nec.com/pelachaud94generating.html</url>
</article>
Experimental Methodology
Approaches:
1. Conventional textual representation
2. Treating semantic data as text
3. Mapping semantic markup to attributes
4. Using Inductive Logic Programming (Progol)
Classifiers: Naïve Bayes and k-Nearest Neighbour
Representation: binary term vector
Data pre-processing: stoplist, no stemming, term length >= 3, no numbers (a sketch follows)
TF/IDF
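A minimal sketch of the pre-processing step described above; the length and number rules come from the slide, while the (tiny) stoplist and all names are ours.

import re

STOPLIST = {"the", "a", "an", "and", "of", "for", "with", "this", "that", "is", "to", "in"}  # illustrative only

def preprocess(text):
    """Lower-case, tokenise, drop stoplisted words, tokens shorter than 3 characters and pure numbers. No stemming."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens
            if t not in STOPLIST and len(t) >= 3 and not t.isdigit()]

print(preprocess("Bidding Algorithms for Simultaneous Auctions, 2002"))
# -> ['bidding', 'algorithms', 'simultaneous', 'auctions']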
Utilising Metadata - Approach I

<?xml version="1.0" ?>
<talk>
  <speaker>Gunnar Grimnes</speaker>
  <title>Personal Agents : The impact of Semantic Metadata</title>
  <description>A talk describing some experiments with learning from metadata</description>
</talk>

Binary term vector of the words that appear in the text. For example, if the term vector was made up of these words:
talk, agent, speaker, daml, shop, …
this instance would look like this:
Class: 1,0,1,0,0, ...
(A sketch of this mapping follows.)
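A minimal sketch of Approach I, using the example vocabulary above and the preprocess function from the previous sketch; the names are ours.

VOCABULARY = ["talk", "agent", "speaker", "daml", "shop"]  # the example term vector from the slide

def binary_term_vector(tokens, vocabulary=VOCABULARY):
    """1 if the vocabulary term occurs among the instance's tokens, 0 otherwise."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]

# e.g. binary_term_vector(preprocess(talk_text)) -> [1, 0, 1, 0, 0]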
Utilising Metadata - Approach II

<?xml version="1.0" ?>
<talk>
  <speaker>Gunnar Grimnes</speaker>
  <title>Personal Agents : The impact of Semantic Metadata</title>
  <description>A talk describing some experiments with learning from metadata</description>
</talk>

Each tag maps to an attribute. For example, given a list of all tags like this:
talk, speaker, title, venue, description, date, …
this instance would look like this:
Class: {}, {gunnar, grimnes}, {personalised, customer, services, impact, semantic, web}, {}, {talk, describing, experiments, learning, metadata}, …
We chose to treat some tags differently, such as the ACMTopic given for the IT Talks. (A sketch of this mapping follows.)
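A minimal sketch of Approach II using Python's standard ElementTree parser; the attribute list mirrors the slide, special handling of tags such as ACMTopic is omitted, and the helper names are ours.

import xml.etree.ElementTree as ET

ATTRIBUTES = ["talk", "speaker", "title", "venue", "description", "date"]  # illustrative tag list

def tags_to_attributes(xml_text, attributes=ATTRIBUTES):
    """Map each known tag to the set of pre-processed words found in its text content."""
    root = ET.fromstring(xml_text)
    values = {tag: set() for tag in attributes}
    for element in root.iter():                       # includes the root <talk> element itself
        if element.tag in values and element.text:
            values[element.tag].update(preprocess(element.text))   # preprocess() from the earlier sketch
    return [values[tag] for tag in attributes]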
Initial Results
[Results tables not reproduced here: Dataset 1 - ITTalks; Dataset 2 - CiteSeer papers, using Naïve Bayes; Dataset 2 - CiteSeer papers, using CAIL.]
Inductive Logic Programming - An Introduction to Progol in 3 Minutes
Progol "learns" Prolog predicates given basic guidance, data types, background knowledge and some examples. Progol uses an A* search over the solution space and is guaranteed to return the solution with the best "compression". Progol was developed by Stephen Muggleton at the University of York.
More details about Progol can be found in: S. Muggleton, "Inverse Entailment and Progol", New Generation Computing Journal, Vol. 13, pp. 245-286, 1995.
A more general introduction to ILP and some applications: I. Bratko and S. Muggleton, "Applications of Inductive Logic Programming", Communications of the ACM, 38(11):65-70, 1995.
Progol Example
Given this input:

% Types
person(jane). person(henry). person(sally). person(jim).
person(sam). person(sarah). person(judy).

% Examples
aunt_of(jane,henry).
aunt_of(sally,jim).
aunt_of(judy,jim).

% Background knowledge
parent_of(Parent,Child) :- father_of(Parent,Child).
parent_of(Parent,Child) :- mother_of(Parent,Child).
father_of(sam,henry).
mother_of(sarah,jim).
sister_of(jane,sam).
sister_of(sally,sarah).
sister_of(judy,sarah).

% Guidance
:- modeh(1,aunt_of(+person,+person))?
:- modeb(*,parent_of(-person,+person))?
:- modeb(*,parent_of(+person,-person))?
:- modeb(*,sister_of(+person,-person))?

Progol could learn:
aunt_of(A,B) :- parent_of(C,B), sister_of(A,C).
Progol and Semantic Metadata
Mapping RDF to Prolog predicates can be done in different ways; I chose a fairly naïve mapping to triples:

<techreport rdf:about="back98java">
  <author>Godmar Back and Patrick Tullmann and Leigh Stoller and Wilson C. Hsieh and Jay Lepreau</author>
  <title>Java Operating Systems: Design and Implementation</title>
  <pages>15</pages>
  <url>citeseer.nj.nec.com/back98java.html</url>
</techreport>

triple( type, back98java, '#techreport' ).
triple( author, back98java, 'Godmar Back and Patrick Tullmann and Leigh Stoller...' ).
triple( title, back98java, 'Java Operating Systems: Design and Implementation' ).
triple( pages, back98java, '15' ).
triple( url, back98java, 'citeseer.nj.nec.com/back98java.html' ).

(A sketch of this mapping follows.)
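A minimal sketch of such a mapping, again with ElementTree; it assumes a complete RDF/XML document with an rdf:RDF root, and the exact predicate spelling and quoting are our assumptions rather than the precise mapping used.

import xml.etree.ElementTree as ET

RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

def rdf_to_triples(rdf_xml):
    """Turn each record under the rdf:RDF root into Progol-readable triple/3 facts."""
    root = ET.fromstring(rdf_xml)
    facts = []
    for record in root:
        about = record.get(RDF_NS + "about")          # e.g. "back98java"
        record_type = record.tag.split("}")[-1]       # strip any XML namespace from the tag name
        facts.append("triple( type, %s, '%s' )." % (about, record_type))
        for child in record:
            predicate = child.tag.split("}")[-1]
            literal = (child.text or "").replace("'", "''")   # double quotes inside Prolog quoted atoms
            facts.append("triple( %s, %s, '%s' )." % (predicate, about, literal))
    return facts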
Progol and My Experiment
- An initial artificial experiment with a 3-instance dataset worked.
- I have been trying both multi-class classification (making Progol learn class(A, Class)) and binary classification (learning inClass(A)).
- No results to date: Progol simply "learns" the input examples, i.e. an instance is in a class only if it matches one of the examples given for that class.
- I will continue tweaking the learning/search parameters.
Discussion and Future Directions
Initial hypothesis: "Semantic representation should outperform simple sparse-vector (text) representation."
What do our results to date tell us?
Outstanding issues / questions:
- Other datasets?
- How to exploit the semantics further?
- The role of ontological inference.