A Method for a Comparative Evaluation of Semantic Browsing Sami Lini – Research Intern at the DERI, Galway – July 2008
Table of contents • Introduction • The dataset (Introduction, Ontology development, Data sources) • The interfaces (Introduction, The Blacklight Project) • The evaluation protocol (The tasks, The measures) • Conclusion
Introduction • What is faceted browsing? • Browsing a dataset by applying constraints classified into categories and sub-categories (facets) • Why for the Semantic Web? • Data described by several variables • A good way to narrow down large datasets
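To make the idea concrete, here is a minimal sketch, not taken from the slides, of faceted filtering over a small list of book records; the field names and data are illustrative only.

```python
# Minimal illustration of faceted browsing: each facet is a field, and each
# chosen constraint keeps only the records matching that facet value.
books = [
    {"title": "Dune", "language": "English", "decade": "1960s", "subject": "Science fiction"},
    {"title": "Le Petit Prince", "language": "French", "decade": "1940s", "subject": "Fiction"},
    {"title": "A Brief History of Time", "language": "English", "decade": "1980s", "subject": "Science"},
]

def facet_values(records, facet):
    """List the available values (and counts) for one facet."""
    counts = {}
    for r in records:
        counts[r[facet]] = counts.get(r[facet], 0) + 1
    return counts

def refine(records, **constraints):
    """Apply facet constraints, narrowing the result set step by step."""
    return [r for r in records if all(r.get(f) == v for f, v in constraints.items())]

print(facet_values(books, "language"))                 # {'English': 2, 'French': 1}
print(refine(books, language="English", decade="1980s"))
```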
Introduction • Why a Human-Computer Interaction (HCI) evaluation? • Semantic Web "dynamic" data treatment (domain-independent) • HCI challenges • Interface design: • Exploit the full potential of the SW • Cognitive load (memory, attention…) • Explain what the Semantic Web empowers users to do • No benchmark dataset, no standard evaluation protocol (unlike TREC)
Introduction • Requirements • Why a books dataset? • Exhaustive enough • General topic • Free to use • Allows faceted browsing
The dataset • Ontology building issues: • Use existing ontologies • Take into account as much information as possible • Find a way to match the different datasets • Solution: mapping on ISBN/LCCN values • ISBN: International Standard Book Number • LCCN: Library of Congress Control Number • One book = several ISBNs and LCCNs (different editions)
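A minimal sketch, not from the slides, of the kind of ISBN/LCCN-keyed matching described above; the identifier normalisation and the record structure are assumptions.

```python
# Group records from different sources under normalized ISBN/LCCN keys,
# so that several editions/descriptions of the same book can be matched.
def normalize(identifier):
    """Keep only digits and 'X' (ISBN-10 check digit); drop hyphens and spaces."""
    return "".join(ch for ch in identifier.upper() if ch.isdigit() or ch == "X")

index = {}  # normalized ISBN or LCCN -> list of source records

def add_record(record):
    for key in record.get("isbns", []) + record.get("lccns", []):
        index.setdefault(normalize(key), []).append(record)

add_record({"source": "Book-Crossing", "isbns": ["0-441-17271-7"], "lccns": []})
add_record({"source": "OpenLibrary",   "isbns": ["0441172717"],    "lccns": []})

print(index["0441172717"])  # both records end up under the same key
```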
The dataset • Book-Crossing dataset • CSV dataset: user IDs, book ratings, ISBN values • C++ script to convert it into RDF
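The project used a C++ script for this conversion; below is a rough Python/rdflib equivalent for illustration. The property URIs, the column order and the file encoding are assumptions, not the project's actual schema.

```python
# Sketch of the CSV -> RDF conversion for the Book-Crossing ratings file.
import csv
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/books#")   # placeholder vocabulary
g = Graph()

with open("BX-Book-Ratings.csv", newline="", encoding="latin-1") as fh:
    reader = csv.reader(fh, delimiter=";")
    next(reader, None)                        # skip the header row, if present
    for user_id, isbn, rating in reader:
        review = EX[f"rating/{user_id}/{isbn}"]
        g.add((review, EX.ratedBy, EX[f"user/{user_id}"]))
        g.add((review, EX.aboutBook, EX[f"isbn/{isbn}"]))
        g.add((review, EX.ratingValue, Literal(int(rating))))

g.serialize(destination="book-crossing.rdf", format="xml")
```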
[Diagram: the books ontology with the Book-Crossing data mapped in]
The dataset • OpenLibrary.org • Library of Congress MARC records dataset (books: title, author, publisher, date, categories…) • MARC records: MAchine-Readable Cataloguing records • Binary files • Standard way of classifying books in digital libraries • Issues: • MARC records contain either ISBN or LCCN values • Many languages = many different alphabets → character conversion issues
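As an illustration (not the project's actual tooling), the standard MARC fields can be read with pymarc: field 020$a holds the ISBN, 010$a the LCCN, 245$a the title.

```python
# Read Library of Congress MARC records and pull out ISBN, LCCN and title.
from pymarc import MARCReader

with open("records.mrc", "rb") as fh:
    for record in MARCReader(fh):
        if record is None:          # skip records pymarc could not decode
            continue
        isbns  = [v for f in record.get_fields("020") for v in f.get_subfields("a")]
        lccns  = [v for f in record.get_fields("010") for v in f.get_subfields("a")]
        titles = [v for f in record.get_fields("245") for v in f.get_subfields("a")]
        print(titles, isbns, lccns)
```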
[Diagram: the books ontology, now integrating Book-Crossing and OpenLibrary]
The dataset • LibraryThing.com • Web API: ISBN value → XML with the matching ISBN & LCCN values • XSL transformation to convert it into RDF
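The exact LibraryThing endpoint and the project's stylesheet are not given in the slides; the sketch below only shows the fetch-then-transform pattern, with a hypothetical URL and stylesheet name.

```python
# Query a web API for an ISBN, then turn the XML response into RDF/XML
# with an XSL stylesheet (URL and stylesheet below are placeholders).
from urllib.request import urlopen
from lxml import etree

isbn = "0441172717"
api_url = f"https://www.librarything.com/api/some_lookup/{isbn}"   # hypothetical endpoint

response_xml = etree.parse(urlopen(api_url))
transform = etree.XSLT(etree.parse("librarything_to_rdf.xsl"))      # hypothetical stylesheet
rdf_doc = transform(response_xml)

with open(f"{isbn}.rdf", "wb") as out:
    out.write(etree.tostring(rdf_doc, pretty_print=True))
```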
[Diagram: the books ontology, now integrating Book-Crossing, OpenLibrary and LibraryThing]
The dataset • The RDF Book Mashup • RDF dataset: contains additional book information from Amazon.com • We crawl all instances of foaf:Person to gather further information about authors
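As a small illustration (not the project's crawler), once some Book Mashup RDF has been fetched, rdflib can list every foaf:Person instance; the local dump file name is assumed.

```python
# List every foaf:Person in a crawled RDF snapshot, with its foaf:name values.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

g = Graph()
g.parse("book-mashup-snapshot.rdf", format="xml")   # assumed local dump of crawled RDF

for person in g.subjects(RDF.type, FOAF.Person):
    names = list(g.objects(person, FOAF.name))
    print(person, names)
```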
[Diagram: the books ontology, now integrating Book-Crossing, OpenLibrary, LibraryThing and the Book Mashup]
The dataset • [Diagram: the final ontology, linking the four data sources: Book-Crossing, OpenLibrary, LibraryThing and the RDF Book Mashup]
The interfaces • Mandatory criteria: • Faceted browsing • Ability to handle the same dataset as SWSE (dataset bias) • Dataset bias: Book-Crossing dataset (≈ 130,000 book IDs) + additional information (ISBN/LCCN, categories) = SWSE Books dataset → need a way to index the SWSE Books dataset in the compared interface
The interfaces • The Blacklight Project • Properties: • Faceted and keyword search • Indexes MARC records → RDF to MARC conversion needed
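A sketch of the reverse direction, building minimal MARC records from RDF book descriptions so Blacklight can index them. The property URIs and graph layout are assumptions, and the Subfield objects follow the pymarc ≥ 5 API.

```python
# Turn each RDF book description into a minimal MARC record (245 = title, 020 = ISBN).
from pymarc import Record, Field, Subfield, MARCWriter
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/books#")   # placeholder vocabulary
g = Graph()
g.parse("swse-books.rdf", format="xml")       # assumed RDF dump of the books dataset

with open("swse-books.mrc", "wb") as out:
    writer = MARCWriter(out)
    for book in set(g.subjects(EX.isbn)):
        title = str(next(g.objects(book, EX.title), ""))
        isbn = str(next(g.objects(book, EX.isbn), ""))
        record = Record()
        record.add_field(Field(tag="245", subfields=[Subfield(code="a", value=title)]))
        record.add_field(Field(tag="020", subfields=[Subfield(code="a", value=isbn)]))
        writer.write(record)
    writer.close()
```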
The interfaces • [Screenshot: the Blacklight interface, showing the keyword search box and the facets with their facet values]
The evaluation protocol • Tasks • User-based: 4 tasks with subtasks, inspired by existing tasks from digital-library and faceted-browsing evaluations • First 3 tasks: scenario-based • Last task: based on the AOL query dataset (link with the automatic evaluation), to evaluate different types of queries: directed search / simple browse / complex browse
The evaluation protocol • Tasks • User-based • Task 3: Gather materials for an essay about French history. Complete 4 subtasks, ranging from very specific to more open-ended: • find books about French history written in English; • choose the decade for which the collection seems to have the most books about history; • find all books by an author of your choice who published books during the 1980s; • find another U.S. writer who wrote about Charles de Gaulle in a different way.
The evaluation protocol • User-centred evaluation procedure • Consent form • Questionnaire (demographic questions) • Performing tasks on Blacklight/SWSE (inter-interface bias) after reading the written tasks (inter-user bias) • Questionnaire (System Usability Scale, about Blacklight) • Performing tasks on SWSE/Blacklight after reading the written tasks • Questionnaire (System Usability Scale, about SWSE) • Questionnaire about overall preferences and suggestions
The evaluation protocol • Tasks • Automatic • Automatically crawl both interfaces according to a rating table and heuristics → performance evaluation • Use of ratings (faceted browsing) (e.g. books with a better rating than…) • Use of the AOL query dataset (keyword search / faceted browsing) (e.g. “books on managing family home work school children social life and time for me”)
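A minimal sketch of what such an automatic evaluation loop could look like; the client object and its search/refine methods are hypothetical, not the project's harness.

```python
# Replay queries against one interface, recording how many refinement steps
# ("clicks") and how much time each query takes.
import time

def evaluate(client, queries):
    """client.search(keywords) and client.refine(result, facet, value) are assumed methods."""
    results = []
    for q in queries:
        start = time.perf_counter()
        result, clicks = client.search(q["keywords"]), 0
        for facet, value in q.get("refinements", []):   # driven by the rating table / heuristics
            result = client.refine(result, facet, value)
            clicks += 1
        results.append({"query": q["keywords"], "clicks": clicks,
                        "seconds": time.perf_counter() - start,
                        "hits": len(result)})
    return results
```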
The evaluation protocol • Measures • User-based • Objective: • Time taken • Clicks • Search terms used • Success score • Subjective: • Time as estimated by the user • Automatic • “Clicks” • Response time
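One possible way to group the user-based measures listed above into a single per-task record; the field names are ours, not the project's logging format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskMeasures:
    interface: str                        # "Blacklight" or "SWSE"
    task_id: str
    time_taken_s: float                   # objective: measured duration
    clicks: int                           # objective: number of clicks
    search_terms: List[str] = field(default_factory=list)
    success_score: float = 0.0            # objective: task success
    perceived_time_s: float = 0.0         # subjective: duration estimated by the user
```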
Conclusion • What has been done: • Large benchmark dataset • Comparative evaluation methodology • Blacklight up and running • Limits: • Tasks inspired by existing work (no standard evaluation protocol exists) • Blacklight: a specialised, domain-specific interface • Interface bias