This tutorial explores the core problems in designing user interfaces for search and information access, including query specification, browsing and searching collections, and information visualization. Learn how to design effective search interfaces and understand user goals and information seeking behavior.
SIGIR 2004 Tutorial: User Interfaces for Information Access Marti Hearst UC Berkeley http://www.sims.berkeley.edu/~hearst
Tutorial Outline • Introduction • How to Design for Search • Core Problems • Visualization for Search
Tutorial Outline • Introduction • What do people search for (and how)? • Why is designing for search difficult? • How to Design for Search • HCI and iterative design • What works? • Small details matter • Scaffolding • The Role of DWIM • Core Problems • Query specification and refinement • Browsing and searching collections • Information Visualization for Search • Summary
A Spectrum of Information Needs • Question/Answer: What is the typical height of a giraffe? • Browse and Build: What are some good ideas for landscaping my client’s yard? • Text Data Mining: What are some promising untried treatments for Raynaud’s disease?
Questions and Answers • What is the height of a typical giraffe? • The result can be a simple answer, extracted from existing web pages. • Can specify with keywords or a natural language query • However, most search engines are not set up to handle questions properly. • Get different results using a question vs. keywords
Classifying Queries • Query logs only indirectly indicate a user’s needs • One set of keywords can mean several different things • “barcelona” • “dog pregnancy” • “taxes” • Idea: pair each logged query with the search result the user clicked on. • “taxes” followed by a click on tax forms • Study performed on AltaVista logs • Author noted afterwards that Yahoo logs appear to have a different query balance. Rose & Levinson, Understanding User Goals in Web Search, Proceedings of WWW’04
Classifying Web Queries Rose & Levinson, Understanding User Goals in Web Search, Proceedings of WWW’04
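The idea of pairing each logged query with the clicked result can be sketched as a simple aggregation. This is only an illustration, not the procedure used in the study: the function name `clicks_by_query`, the log format (a list of query/URL pairs), and the example URLs are all assumptions.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def clicks_by_query(log):
    """Group clicked hosts by normalized query text.

    log: iterable of (query, clicked_url) pairs (a hypothetical log format).
    Returns a dict mapping each query to a Counter of clicked host names.
    """
    table = defaultdict(Counter)
    for query, clicked_url in log:
        host = urlparse(clicked_url).netloc
        table[query.strip().lower()][host] += 1
    return table


# Made-up log entries: the same keyword leads to different destinations.
log = [
    ("taxes", "http://forms.example.org/tax-forms"),
    ("Taxes", "http://forms.example.org/tax-forms"),
    ("taxes", "http://history.example.com/taxation"),
]
table = clicks_by_query(log)
```

A query whose clicks concentrate on one host suggests a different goal than a query whose clicks scatter across many hosts, which is the kind of signal the keywords alone do not provide.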
Information Seeking Behavior • Two parts of a process: • search and retrieval • analysis and synthesis of search results
Standard Model • Assumptions: • Maximizing precision and recall simultaneously • The information need remains static • The value is in the resulting document set
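As a reminder of what the standard model tries to maximize, precision and recall for a single query can be computed as follows. This is a minimal sketch; the function name and the document IDs are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: iterable of document IDs returned by the system
    relevant:  iterable of document IDs judged relevant to the need
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical case: 4 documents returned, 3 relevant documents exist,
# and 2 of the returned documents are relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
```

Both measures are defined against a fixed set of relevant documents, which is exactly the static-information-need assumption listed above.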
Alternative to the Standard Model • Users learn during the search process: • Scanning titles of retrieved documents • Reading retrieved documents • Viewing lists of related topics/thesaurus terms • Navigating hyperlinks • The “berry-picking” model • Interesting information is scattered like berries among bushes • The query is continually shifting Bates, The Berry-Picking Search: UI Design, in “User Interface Design”, Thimbleby (Ed.), Addison-Wesley 1990
A sketch of a searcher… “moving through many actions towards a general goal of satisfactory completion of research related to an information need.” (after Bates 89) [Figure: a path through successive queries Q0, Q1, Q2, Q3, Q4, Q5] Bates, “The Design of Browsing and Berry-Picking Techniques for the On-line Search Interface”, Online Review 13(5), 1989
Implications • Interfaces should provide clues for where to go next • Interfaces should make it easy to store intermediate results • Interfaces should make it easy to follow trails with unanticipated results • Different types of information needs require different kinds of search tools and interfaces • Lists of ranked results and snippets • Collection browsing tools • Comparison tables • We’ve only begun to scratch the surface!
What People do AFTER the Search • Look for Trends • Make Comparisons • Aggregation and Scaling • Identify a Critical Subset • Assess • Interpret • The rest: • cross-reference • summarize • find evocative visualizations • miscellaneous O’Day & Jeffries, Orienteering in an information landscape: how information seekers get from here to there, Proceedings of InterCHI ’93.
SenseMaking • The process of encoding retrieved information to answer task-specific questions • Combine • internal cognitive resources • external retrieved resources • Create a good representation • an iterative process • contend with a cost/benefit tradeoff Russell, Stefik, Pirolli, Card, The Cost Structure of Sensemaking, Proceedings of InterCHI ’93.
Why is Supporting Search Difficult? • Everything is fair game • Abstractions are difficult to represent • The vocabulary disconnect • Users’ lack of understanding of the technology
Everything is Fair Game • The scope of what people search for is all of human knowledge and experience. • Other interfaces are more constrained (word processing, formulas, etc) • Interfaces must accommodate human differences in: • Knowledge / life experience • Cultural background and expectations • Reading / scanning ability and style • Methods of looking for things (pilers vs. filers)
Abstractions Are Hard to Represent • Text describes abstract concepts • Difficult to show the contents of text in a visual or compact manner • Exercise: • How would you show the preamble of the US Constitution visually? • How would you show the contents of Joyce’s Ulysses visually? How would you distinguish it from Homer’s The Odyssey or McCourt’s Angela’s Ashes? • The point: it is difficult to show text without using text
Vocabulary Disconnect • If you ask a set of people to describe a set of things there is little overlap in the results.
The Vocabulary Problem Data sets examined (and # of participants) • Main verbs used by typists to describe the kinds of edits that they do (48) • Commands for a hypothetical “message decoder” computer program (100) • First word used to describe 50 common objects (337) • Categories for 64 classified ads (30) • First keywords for each of a set of recipes (24) Furnas, Landauer, Gomez, Dumais: The Vocabulary Problem in Human-System Communication. Commun. ACM 30(11): 964-971 (1987)
The Vocabulary Problem These are really bad results • If one person assigns the name, the probability of it NOT matching with another person’s is about 80% • What if we pick the most commonly chosen words as the standard? Still not good: Furnas, Landauer, Gomez, Dumais: The Vocabulary Problem in Human-System Communication. Commun. ACM 30(11): 964-971 (1987)
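To make the two quantities on this slide concrete, the sketch below estimates (a) the chance that two people independently pick the same name and (b) the chance that one person's name matches the most popular one. The estimator (sum of squared term frequencies), the function names, and the word counts are illustrative assumptions, not the data or method of Furnas et al.

```python
from collections import Counter


def agreement_probability(terms):
    """Chance that two independent draws from the observed terms match."""
    counts = Counter(terms)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one term")
    return sum((c / total) ** 2 for c in counts.values())


def top_term_hit_rate(terms):
    """Chance that one person's term equals the most commonly chosen term."""
    counts = Counter(terms)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one term")
    return counts.most_common(1)[0][1] / total


# Made-up names that ten people might give to the same editing operation.
names = ["delete"] * 3 + ["remove"] * 2 + ["erase", "cut", "kill", "zap", "omit"]
miss_rate = 1 - agreement_probability(names)   # two people fail to agree
best_case = top_term_hit_rate(names)           # match against the top term
```

With these invented counts the miss rate is 0.82, and even the most popular name is matched by only 3 people in 10, which mirrors the pattern the slide describes.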
Lack of Technical Understanding • Most people don’t understand the underlying methods by which search engines work.
People Don’t Understand Search Technology A study of 100 randomly-chosen people found: • 14% never type a URL directly into the address bar • Several tried to use the address bar, but did it wrong • Put spaces between words • Combinations of dots and spaces • “nursing spectrum.com” “consumer reports.com” • Several use search form with no spaces • “plumber’slocal9” “capitalhealthsystem” • People do not understand the use of quotes • Only 16% use quotes • Of these, some use them incorrectly • Around all of the words, making results too restrictive • “lactose intolerance –recipies” • Here the – excludes the recipes • People don’t make use of “advanced” features • Only 1 used “find in page” • Only 2 used Google cache Hargittai, Classifying and Coding Online Actions, Social Science Computer Review 22(2), 2004, 210-227.
People Don’t Understand Search Technology Without appropriate explanations, most of 14 people had strong misconceptions about: • ANDing vs ORing of search terms • Some assumed ANDing search engine indexed a smaller collection; most had no explanation at all • For empty results for query “to be or not to be” • 9 of 14 could not give an explanation that remotely resembled stop word removal • For term order variation “boat fire” vs. “fire boat” • Only 5 out of 14 expected different results • Understanding was vague, e.g.: • “Lycos separates the two words and searches for the meaning, instead of what’re your looking for. Google understands the meaning of the phrase.” Muramatsu & Pratt, “Transparent Queries: Investigating Users’ Mental Models of Search Engines”, SIGIR 2001.
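The behaviors users struggled to explain can be reproduced with a toy matcher. This is a minimal sketch under stated assumptions: the stop word list, the function names, and the three documents are invented, and real engines differ in many ways. In particular, this sketch ignores term order, so it does not model the “boat fire” vs. “fire boat” difference.

```python
# Illustrative stop word list only; real engines use their own lists.
STOP_WORDS = {"a", "an", "and", "be", "not", "of", "or", "the", "to"}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def query_terms(query):
    """Drop stop words from the query before matching."""
    return [t for t in tokenize(query) if t not in STOP_WORDS]


def search(query, docs, mode="and"):
    """Return IDs of documents matching all ("and") or any ("or") terms."""
    terms = query_terms(query)
    if not terms:
        return []  # every query term was a stop word
    combine = all if mode == "and" else any
    results = []
    for doc_id, text in docs.items():
        words = set(tokenize(text))
        if combine(t in words for t in terms):
            results.append(doc_id)
    return results


# Made-up collection.
docs = {
    "d1": "fire boat on the river",
    "d2": "boat show",
    "d3": "fire safety",
}
and_hits = search("boat fire", docs, mode="and")
or_hits = search("boat fire", docs, mode="or")
empty = search("to be or not to be", docs)
```

ANDing narrows the result set and ORing widens it over the same collection, and the famous Hamlet query comes back empty simply because nothing is left of it after stop word removal.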
HCI Principles • We design for the user • Not for the designers • Not for the system • AKA: user-centered design • Make use of cognitive principles where available • Important guidelines for search: • Reduce memory load • Speak the user’s language • Provide helpful feedback • Respect perceptual principles
User-Centered Design • Needs assessment • Find out • who users are • what their goals are • what tasks they need to perform • Task Analysis • Characterize what steps users need to take • Create scenarios of actual use • Decide which users and tasks to support • Iterate between • Designing • Evaluating
User Interface Design is An Iterative Process Design Evaluate Prototype Slide by James Landay
Rapid Prototyping • Build a mock-up of design • Low fidelity techniques • paper sketches • cut, copy, paste • video segments
Why Do We Prototype? • Get feedback on our design faster • Experiment with alternative designs • Fix problems before code is written • Keep the design centered on the user Slide adapted from James Landay
Evaluation • Test with real users (participants) • Formally or Informally • “Discount” techniques • Potential users interact with paper computer • Expert evaluations (heuristic evaluation) • Expert walkthroughs
What Works for Search Interfaces? • Query term highlighting • in results listings • in retrieved documents • Sorting of search results according to important criteria (date, author) • Grouping of results according to well-organized category labels (see Flamenco) • DWIM only if highly accurate: • Spelling correction/suggestions • Simple relevance feedback (more-like-this) • Certain types of term expansion • So far: not really visualization Hearst et al: Finding the Flow in Web Site Search, CACM 45(9), 2002.
Highlighting Query Terms • Boldface or color • Adjacency of terms with relevant context is a useful cue.
[Figure: highlighted query term hits using Google toolbar; panel labels: US Blackout, PGA, Microsoft; annotations: found!, don’t know]
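Query term highlighting in a result snippet can be sketched in a few lines. The function name `highlight`, the default `<b>` tags, and the sample text are assumptions for illustration, and the sketch assumes the snippet text is already safe to embed in HTML.

```python
import re


def highlight(text, query, open_tag="<b>", close_tag="</b>"):
    """Wrap whole-word, case-insensitive matches of each query term in tags."""
    terms = [re.escape(t) for t in query.split() if t]
    if not terms:
        return text
    # Longest terms first so a longer term wins over a shorter alternative.
    terms.sort(key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: open_tag + m.group(0) + close_tag, text)


# Made-up snippet: "us" inside "Discuss" is not highlighted.
snippet = highlight("Discuss the US blackout", "us blackout")
```

Matching on word boundaries keeps the highlighting from firing inside unrelated words, which would add clutter rather than a useful cue.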
Small Details Matter • UIs for search especially require great care in small details • In part due to the text-heavy nature of search • A tension between more information and introducing clutter • How and where to place things is important • People tend to scan or skim • Only a small percentage reads instructions
Small Details Matter • UIs for search especially require endless tiny adjustments • In part due to the text-heavy nature of search • Example: • In an earlier version of the Google Spellchecker, people didn’t always see the suggested correction • Used a long sentence at the top of the page: “If you didn’t find what you were looking for …” • People complained they got results, but not the right results. • In reality, the spellchecker had suggested an appropriate correction. • Interview with Marissa Mayer by Mark Hurst: http://www.goodexperience.com/columns/02/1015google.html