Using Words to Search a Thousand Images: Hierarchical Faceted Metadata in Search & Browsing

Explore the use of hierarchical faceted metadata in image search and browsing, with very careful UI design and testing, backed by usability studies.

Presentation Transcript


  1. Using Words to Search a Thousand Images: Hierarchical Faceted Metadata in Search & Browsing • Marti Hearst, SIMS, UC Berkeley • Research funded by: NSF CAREER Grant IIS-9984741

  2. Outline • How do people search for images? • Current approaches: • Spatial similarity • Keywords • Our approach: • Hierarchical Faceted Metadata • Very careful UI design and testing • Usability Study • Conclusions

  3. How do people want to search and browse images? Ethnographic studies of people who use images intensely find: • Finding specific objects is easy • Find images of the Empire State Building • Browsing is hard, and people want to use rich descriptors.

  4. Ethnographic Studies • Garber & Grunes ’92 • Art directors, art buyers, stock photo researchers • Search for appropriate images is iterative • After specifying and weighting criteria, searchers view retrieved images, then • Add restrictions • Change criteria • Redefine Search • Concept starts out loosely defined, then becomes more refined.

  5. Ethnographic Studies • Markkula & Sormunen ’00 • Journalists and newspaper editors • Choosing photos from a digital archive • Stressed a need for browsing • Searching for specific objects is trivial • Photos need to deal with themes, places, types of objects, views • Had access to a powerful interface, but it had 40 entry forms and was generally hard to use; no one used it.

  6. Query Study • Armitage & Enser ’97 • Analyzed 1,749 queries submitted to 7 image and film archives • Classified queries into a 3x4 facet matrix • Rio Carnivals: Geo Location x Kind of Event • Conclude that users want to search images according to combinations of topical categories.

  7. Ethnographic Study • Ame Elliot ’02 • Architects • Common activities: • Use images for inspiration • Browsing during early stages of design • Collage making, sketching, pinning up on walls • This is different from illustrating a PowerPoint presentation • Maintain sketchbooks & shoeboxes of images • Young professionals have ~500, older ~5k • No formal organization scheme • None of 10 architects interviewed about their image collections used indexes • Do not like to use computers to find images

  8. Current Approaches to Image Search • Using Visual “Content” • Extract color, texture, shape • QBIC (Flickner et al. ’95) • Blobworld (Carson et al. ’99) • Body Plans (Forsyth & Fleck ’00) • Piction: images + text (Srihari et al. ’91, ’99) • Two uses: • Show a clustered similarity space • Show those images similar to a selected one • Usability studies: • Rodden et al.: a series of studies • Clusters don’t work; showing textual labels is promising.

  9. Rodden et al., CHI 2001

  10. Rodden et al., CHI 2001

  11. Rodden et al., CHI 2001

  12. Current Approaches to Image Search • Keyword based • WebSeek (Smith and Jain ’97) • Commercial image vendors (Corbis, Getty) • Commercial web image search systems • Museum web sites

  13. A Disconnect Why are image search systems built so differently from what people want? • An image is worth a thousand words. • But the converse has merit too!

  14. Some Challenges • Users don’t like new search interfaces. • How to show lots more information without overwhelming or confusing?

  15. Our Approach • Integrate the search seamlessly into the information architecture. • Use proper HCI methodologies. • Use faceted metadata: • More flexible than canned hyperlinks • Less complex than full search • Help users see where to go next and return to what happened previously

  16. Faceted Metadata

  17. GeoRegion + Time/Date + Topic • Metadata: data about data • Facets: orthogonal categories
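To make the idea of orthogonal facets concrete, here is a minimal sketch of one image described along the three facets named on the slide. The class name `ImageRecord`, the helper `falls_under`, and all category values are hypothetical illustrations, not taken from the presentation or from any real system.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# A facet value is a path from the top of that facet's hierarchy downward.
Path = Tuple[str, ...]


@dataclass
class ImageRecord:
    """One image described independently along several facets (hypothetical)."""
    title: str
    facets: Dict[str, Path] = field(default_factory=dict)


def falls_under(record: ImageRecord, facet: str, path: Path) -> bool:
    """True if the record's value in `facet` lies at or below `path`."""
    value = record.facets.get(facet, ())
    return value[:len(path)] == tuple(path)


# Illustrative values only; each facet is assigned without reference to the others.
harbor = ImageRecord(
    title="Harbor at dusk",
    facets={
        "GeoRegion": ("Europe", "Italy", "Venice"),
        "Time/Date": ("19th century",),
        "Topic": ("Nature", "Water"),
    },
)

in_europe = falls_under(harbor, "GeoRegion", ("Europe",))
in_asia = falls_under(harbor, "GeoRegion", ("Asia",))
```

Because each facet carries its own hierarchy, a query can constrain any combination of them at any depth without a single fixed tree dictating the order.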

  18. Faceted Metadata: Biomedical • MeSH (Medical Subject Headings) • www.nlm.nih.org/mesh

  19. MeSH Facets (one level expanded)

  20. Questions we are trying to answer • How many facets are allowable? • Should facets be mixed and matched? • How much is too much? • Should hierarchies be progressively revealed, tabbed, some combination? • How should free-text search be integrated?

  21. An Important Trend in Information Architecture Design • Generating web pages from databases • Implications: • Web sites can adapt to user actions • Web sites can be instrumented

  22. A Taxonomy of WebSites • [Figure: web sites plotted by complexity of data (low to high) against complexity of applications (low to high)] • From: The (Short) Araneus Guide to Website development, by Mecca, et al, Proceedings of WebDB’99, http://www-rocq.inria.fr/~cluet/WEBDB/procwebdb99.html

  23. The Flamenco Interface • Nine hierarchical facets • Matrix • SingleTree • Chess metaphor • Opening • Middle game • End game • Tightly Integrated Search • Expand as well as Refine • Intermediate pages for large categories
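The "Expand as well as Refine" idea can be sketched as operations on a query that holds one hierarchical path per facet. This is a minimal illustration under assumed names (`select`, `refine`, `expand`, `preview_counts`) and made-up data; it is not the Flamenco implementation, and the preview counts are an assumption about how a next-step hint might be computed.

```python
from collections import Counter

# Each item maps a facet name to a path in that facet's hierarchy (made-up data).
collection = [
    {"Location": ("Europe", "Italy"), "Media": ("Prints",)},
    {"Location": ("Europe", "France"), "Media": ("Paintings",)},
    {"Location": ("Asia", "Japan"), "Media": ("Prints",)},
]


def select(items, query):
    """Items whose path in every constrained facet starts with the query path."""
    return [
        item for item in items
        if all(item.get(facet, ())[:len(path)] == path
               for facet, path in query.items())
    ]


def refine(query, facet, label):
    """Narrow the query: descend one level within `facet`."""
    narrowed = dict(query)
    narrowed[facet] = query.get(facet, ()) + (label,)
    return narrowed


def expand(query, facet):
    """Broaden the query: go up one level in `facet`, dropping it at the top."""
    broadened = dict(query)
    parent = broadened.pop(facet, ())[:-1]
    if parent:
        broadened[facet] = parent
    return broadened


def preview_counts(items, query, facet):
    """Count current results under each next-level category of `facet`."""
    depth = len(query.get(facet, ()))
    counts = Counter()
    for item in select(items, query):
        path = item.get(facet, ())
        if len(path) > depth:
            counts[path[depth]] += 1
    return counts


europe = refine({}, "Location", "Europe")
europe_prints = refine(europe, "Media", "Prints")
all_prints = expand(europe_prints, "Location")
```

Here `expand` relaxes one facet while leaving the others in place, which is what lets a user back out of a region without losing the media constraint.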

  24. What is Tricky About This? • It is easy to do it poorly • See Yahoo example • It is hard not to be overwhelming • Most users prefer simplicity unless complexity really makes a difference • It is hard to “make it flow” • Can it feel like “browsing the shelves”?

  25. How NOT to do it • Yahoo uses faceted metadata poorly in both their search results and in their top-level directory • They combine region + other hierarchical facets in awkward ways

  26. Yahoo’s use of facets

  27. Yahoo’s use of facets

  28. Yahoo’s use of facets

  29. Yahoo’s use of facets • Where is Berkeley? • College and University > Colleges and Universities >United States > U > University of California > Campuses > Berkeley • U.S. States > California > Cities >Berkeley > Education > College and University > Public > UC Berkeley

  30. Problem with Metadata Previews as Currently Used • Hand edited, predefined • Not tailored to task as it develops • Not personalized • Often not systematically integrated with search, or within the information architecture in general

  31. HCI Methodology • Identify Target Population • Needs assessment. • What do people want; how do they work? • Lo-fi prototyping. • Produce cheap (throw-away) prototypes • Get feedback from target population • Design / Study Round 1. • Simple interactive version. See if main ideas work. • Design / Study Round 2. • More thorough interactive version; more graphics. Begin to fine-tune, fix remaining major problems • Design / Study Round 3. • Continue to fine-tune. Introduce more advanced features.

  32. Our Project History • Identify Target Population • Architects, city planners • Needs assessment. • Interviewed architects and conducted contextual inquiries. • Lo-fi prototyping. • Showed paper prototype to 3 professional architects. • Design / Study Round 1. • Simple interactive version. Users liked metadata idea. • Design / Study Round 2. • Developed 4 different detailed versions; evaluated with 11 architects; results somewhat positive but many problems identified. Matrix emerged as a good idea. • Metadata revision. • Compressed and simplified the metadata hierarchies

  33. Our Project History • Design / Study Round 3. • New version based on results of Round 2 • Highly positive user response • Identified new user population/collection • Students and scholars of art history • Fine arts images • Study Round 4 • Compare the metadata system to a strong, representative baseline

  34. New Usability Study • Participants & Collection • 32 Art History Students • ~35,000 images from SF Fine Arts Museum • Study Design • Within-subjects • Each participant sees both interfaces • Balanced in terms of order and tasks • Participants assess each interface after use • Afterwards they compare them directly • Data recorded in behavior logs, server logs, paper surveys; one or two experienced testers at each trial. • Used 9-point Likert scales. • Session took about 1.5 hours; pay was $15/hour
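One way to balance a within-subjects design "in terms of order and tasks" is to rotate participants through every combination of interface order and task-set order. The sketch below assumes two interfaces and two task sets; the labels "faceted", "baseline", "A", and "B" and the function `counterbalance` are hypothetical, and the slide does not say how the actual assignment was made.

```python
from itertools import cycle, product


def counterbalance(participants,
                   interfaces=("faceted", "baseline"),
                   task_sets=("A", "B")):
    """Assign each participant an (interface, task set) sequence.

    Rotates through the four combinations of interface order and
    task-set order so each combination is used equally often when the
    number of participants is a multiple of four.
    """
    interface_orders = [interfaces, interfaces[::-1]]
    task_orders = [task_sets, task_sets[::-1]]
    cells = list(product(interface_orders, task_orders))
    plan = {}
    for pid, (iface_order, task_order) in zip(participants, cycle(cells)):
        # Pair the first interface with the first task set, and so on.
        plan[pid] = list(zip(iface_order, task_order))
    return plan


# 32 participants, as in the study; IDs 0..31 are placeholders.
plan = counterbalance(range(32))
faceted_first = sum(1 for sessions in plan.values()
                    if sessions[0][0] == "faceted")
```

With 32 participants, each of the four combinations covers eight people, so half see each interface first and each task set is paired with each interface equally often.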

  35. The Baseline System • Floogle • Take the best of the existing keyword-based image search systems

  36. Comparison of Common Image Search Systems

  37. sword

  38. Evaluation Quandary • How to assess the success of browsing? • Timing is usually not a good indicator • People often spend longer when browsing is going well. • Not the case for directed search • Can look for comprehensiveness and correctness (precision and recall) … • … But subjective measures seem to be most important here.
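For reference, precision and recall for a single task can be computed from the set of images a participant selected and the set judged relevant. This is a generic sketch with made-up image IDs; the function name `precision_recall` is illustrative and the slides do not describe how the study scored these measures.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one task.

    precision = fraction of retrieved items that are relevant
    recall    = fraction of relevant items that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: 4 images chosen, 3 relevant in the collection, 2 in common.
precision, recall = precision_recall(
    retrieved=["img1", "img2", "img3", "img4"],
    relevant=["img1", "img2", "img9"],
)
```

These numbers capture correctness and comprehensiveness but say nothing about whether the browsing felt productive, which is why subjective ratings carry so much weight here.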