A User Evaluation of Hierarchical Phrase Browsing. Katrina D. Edgar, David M. Nichols, Gordon W. Paynter, Kirsten Thomson and Ian H. Witten. [kde2, dmn, kthomson, ihw]@cs.waikato.ac.nz. gordon.paynter@ucr.edu. New Zealand Digital Library Project
A User Evaluation of Hierarchical Phrase Browsing
Katrina D. Edgar, David M. Nichols, Gordon W. Paynter, Kirsten Thomson and Ian H. Witten
[kde2, dmn, kthomson, ihw]@cs.waikato.ac.nz, gordon.paynter@ucr.edu
New Zealand Digital Library Project, Department of Computer Science, University of Waikato, New Zealand (nzdl.org)
INFOMINE Project, University of California, Riverside, USA (infomine.ucr.edu)
Overview • Background: searching, browsing, … • Inferring Hierarchical Phrase Structure • Phind: an interface for phrase browsing • Evaluating Phind • User Study • Results • Conclusion
Access • Search • Browsing • Subject • Metadata • Textual documents • Concordance • Hierarchical Phrase Browsing
Identifying phrases The basic insight of the phrase-finding method is that any phrase which appears more than once can be replaced by a grammatical rule that generates the phrase, and that this process can be continued recursively. The result is a hierarchical representation of the original sequence. • Nevill-Manning et al, IJDL, (1999)
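To make the idea of replacing repeated phrases with grammar rules concrete, here is a minimal Python sketch. It is a simplified greedy pair-replacement scheme written for illustration, not the algorithm from the cited paper; the function names `infer_grammar` and `expand` and the rule names `R0`, `R1`, … are assumptions made here, and the input is assumed not to contain tokens that look like rule names.

```python
from collections import Counter

def infer_grammar(tokens):
    """Greedily replace the most frequent repeated adjacent pair with a new rule.

    Illustrative only: rule names (R0, R1, ...) are hypothetical and are
    assumed not to clash with any input token.
    """
    seq = list(tokens)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # no pair repeats, so nothing more to replace
            break
        name = f"R{len(rules)}"
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)  # rewrite this occurrence as the rule
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def expand(symbol, rules):
    """Recursively expand a symbol back into the words it generates."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)

tokens = "forest products and forest products trade".split()
seq, rules = infer_grammar(tokens)
print(seq)    # ['R0', 'and', 'R0', 'trade']
print(rules)  # {'R0': ('forest', 'products')}
```

Because a rule's body may itself contain earlier rules, longer inputs yield nested rules, which is the hierarchical representation the slide describes.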
Extracting nice phrases • Extract text from HTML • Stopwords, punctuation delimiters • Create overlapping phrase hierarchy • Each phrase has a set of expansions which are the longer phrases that contain it • Only repeated phrases • Maximal length condition • No unique expansion in either direction • Different LHS and RHS contexts • Turn phrase hierarchy into an interactive interface • Phind • Paynter et al, Proc. DL (2000)
Phrases that occur twice or more • Prune trivial expansions
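The selection steps above (keep only repeated phrases, require different left and right contexts, then list each phrase's expansions) can be sketched roughly as follows. This is one illustrative reading of the slide, not Phind's implementation: the function names, the `max_len` cutoff, and the rule that a phrase needs at least two distinct neighbours on each side are all assumptions made here, and stopword and punctuation handling is omitted.

```python
from collections import defaultdict

def repeated_phrases(tokens, max_len=4):
    """Map each n-gram (n <= max_len) occurring at least twice to its start positions."""
    positions = defaultdict(list)
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            positions[tuple(tokens[i:i + n])].append(i)
    return {p: pos for p, pos in positions.items() if len(pos) >= 2}

def has_varied_context(phrase, starts, tokens):
    """True if the phrase has at least two distinct left and two distinct right neighbours.

    The start or end of the text counts as its own context (None).
    """
    n = len(phrase)
    left = {tokens[i - 1] if i > 0 else None for i in starts}
    right = {tokens[i + n] if i + n < len(tokens) else None for i in starts}
    return len(left) > 1 and len(right) > 1

def expansions(phrase, phrases):
    """Longer retained phrases that contain `phrase` as a contiguous run of words."""
    n = len(phrase)
    return sorted(
        p for p in phrases
        if len(p) > n and any(p[i:i + n] == phrase for i in range(len(p) - n + 1))
    )

text = ("forest products trade in forest products markets and "
        "national forest programmes or national forest policy")
tokens = text.split()
kept = {p for p, starts in repeated_phrases(tokens).items()
        if has_varied_context(p, starts, tokens)}
print(expansions(("forest",), kept))
```

In this toy text, "products" is dropped because it always follows "forest" (a unique expansion to the left), while "forest" is kept and expands to "forest products" and "national forest".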
Example • FAO on the Internet CD-ROM (1998) • Food and Agriculture Organization • 187 MB of HTML • 30 mins to extract phrases • 28 MB of index files
Phind Interface • Java applet in Web pages • Just another means of access • 2 main panels
What we have previously claimed about Phind… • Good points • Automatically created • Cheap and scalable • Bad points • Uncontrolled vocabulary (compared with a thesaurus) • Paynter et al, DL 2000 • The only previous Phind evaluation was in relation to a thesaurus • Paynter et al, Asian DL 2000
So … • It may be cheap, scalable and automatic… • … but is it any use? • What do people do when confronted by Phind? • Can they use it to find things?
User Study: participants • University of Waikato Usability Lab • http://www.cs.waikato.ac.nz/usability • 12 participants • Students, 9 male • Backgrounds: Computing, management • Individual sessions • Session length : 1 hour
User Study: collection • Existing collection within Greenstone • Web site of the Food and Agriculture Organization (FAO) of the United Nations • CD-ROM version as distributed in 1998 • 21,700 Web pages • around 13,700 associated files (image files, PDFs, etc.) • a medium-sized collection of approximately 140 million words of text
User Study: tasks • seven tasks that involve locating information, understanding content, and recognizing and using elements and functions • prompted with help during their first task • 1. exploratory questions • “find out more about national forest programmes in different countries” • 2. specific retrieval tasks • “where can golden apple snails be found?” • “what was the locust numbers situation during May in Kuwait?”
User Study: mechanics • Phind as a Java applet within Greenstone • In Internet Explorer on Windows 98 • FAO collection on public web server • nzdl.org • Video recording • Questionnaires • Before and after tasks • Summary questionnaire at end
Results: summary • Phind was • useful • liked • good at supporting exploratory tasks • bad at supporting specific tasks
Results: task performance • Specific retrieval tasks involving multiple concepts: • ‘What are the most widely planted pines for timber and pulp production in the southern United States?’ • ‘What was the locust numbers situation during May in Kuwait?’ • 12 attempts using Phind on these 2 tasks: • 4 gave up, 5 gave the wrong answer • 3 found the correct answer • 12 attempts using keyword searching: • 11 correct, 1 wrong • Quotes: • “You should be able to put more than one word” • “Confusing when I was searching for two different topics.”
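As a quick check of the arithmetic behind these counts, the short Python sketch below turns the reported outcomes into success rates. The counts are taken from the slide; the dictionary layout and the `success_rate` helper are illustrative names introduced here. With so few attempts the percentages are only indicative.

```python
# Outcome counts as reported on the slide (12 attempts per method).
outcomes = {
    "Phind": {"correct": 3, "wrong": 5, "gave up": 4},
    "keyword search": {"correct": 11, "wrong": 1},
}

def success_rate(counts):
    """Fraction of attempts that ended with the correct answer."""
    return counts["correct"] / sum(counts.values())

for method, counts in outcomes.items():
    total = sum(counts.values())
    print(f"{method}: {counts['correct']}/{total} correct ({success_rate(counts):.0%})")
```

This gives 3 of 12 (25%) for Phind against 11 of 12 (about 92%) for keyword searching on the two multi-concept tasks.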
Results: interface • 2 Windows: • Three participants minimized the document window instead of closing it • which meant that when they clicked on a document link, Phind opened the document in the hidden window • Navigation • 5 of the 12 participants did not use the ‘Previous’ or ‘Next’ buttons at all • Elements little used: • ‘get more phrases’ • ‘get more documents’
Results: questionnaires • Phind’s results (10/12): • ‘clear and easy to understand’ • ‘relevant and useful to the query’ • ‘elements or features that they most disliked about Phind’ • “not being able to go back” • During task: “Is there a way to go back?” (2) • ‘search method they preferred overall’ • 9 to 3 in favour of keyword searching
Results • 75% of the users (9 of 12) preferred keyword searching over phrase browsing overall. • Despite liking the Phind interface, the participants encountered many problems. • The main functional problem was Phind's inability to perform multi-word queries. • Phind's unfamiliarity: the new interface has too many new elements
Results: links • two previously-reported design issues • Blandford et al (JCDL’01) • “working across boundaries” • in the different paradigms of browser-based keyword searching vs. the Java-based Phind interface • inconsistent experiences with the opening of windows leading to lost documents • lack of feedback during query evaluation • unfamiliar navigation tools • problems understanding the relationship between frames and result sets. • “blind alleys” • when Phind users attempted multi-term phrase queries
Technology • Java applet in Web pages • Could be run as a server-side process • Reduce the dislocation between the 2 interfaces • Selecting words from actual vocabulary • Remove zero-hit queries • Dynamic reactive Java-like interface? • Tension between different routes forward
Caveats • Numbers • Authenticity • Motivation and domain knowledge • Prior experience • Keyword searching on the web • Lack of integration • Normal work patterns • Search mode
Conclusion • Phind seems to be ok for exploration • Multi-concept queries not good • Not integrated with other searching/browsing mechanisms • Small ‘features’ of Phind confound results • Positive subjective feedback