SIMS 202: Information Organization and Retrieval • Lecture 22: Interfaces for Information Retrieval I • Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS • Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2002 • http://www.sims.berkeley.edu/academics/courses/is202/f03/
Lecture Overview • Review of Last Time • Web Search Engines and Algorithms • Interfaces for Information Retrieval • Introduction to HCI • Why Interfaces Don’t Work • Early Visions: Memex • Discussion Questions • Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst
Directories vs. Search Engines • Directories • Hand-selected sites • Search over the contents of the descriptions of the pages • Organized in advance into categories • Search Engines • All pages in all sites • Search over the contents of the pages themselves • Organized after the query by relevance rankings or other scores
Challenges for Web Searching: Data • Distributed data • Volatile data/“Freshness”: 40% of the web changes every month • Exponential growth • Unstructured and redundant data: 30% of web pages are near duplicates • Unedited data • Multiple formats • Commercial biases • Hidden data
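The slide above notes that many web pages are near duplicates but does not say how they are detected. One common approach, shown here only as a minimal sketch, is to compare sets of overlapping word windows ("shingles") using Jaccard similarity. The function names, the window size `k=3`, and the sample pages are assumptions made for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Texts shorter than k words yield one shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical page texts that differ in a single word.
page_a = "the quick brown fox jumps over the lazy dog"
page_b = "the quick brown fox jumps over the sleepy dog"

similarity = jaccard(shingles(page_a), shingles(page_b))
print(round(similarity, 3))
```

A crawler could treat two pages as near duplicates when this score exceeds a chosen threshold; the threshold itself is a tuning decision that the slides do not specify.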
Challenges for Web Searching: Users • Users unfamiliar with search engine interfaces (e.g., Does the query “apples oranges” mean the same thing on all of the search engines?) • Users unfamiliar with the logical view of the data (e.g., Is a search for “Oranges” the same thing as a search for “oranges”?) • Many different kinds of users
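The two example questions on this slide can be made concrete with a toy matcher. The sketch below is not how any particular search engine behaves; it only illustrates that the same query returns different results depending on whether terms are combined with AND or OR and whether case is folded. The documents, the `search` function, and its parameters are hypothetical.

```python
# A tiny hypothetical collection: document id -> text.
docs = {
    1: "Apples and oranges are fruit",
    2: "Oranges grow in Florida",
    3: "apples keep the doctor away",
}


def tokens(text, fold_case=True):
    """Split text on whitespace, optionally lowercasing every word."""
    words = text.split()
    return {w.lower() for w in words} if fold_case else set(words)


def search(query, mode="AND", fold_case=True):
    """Return sorted ids of documents matching all (AND) or any (OR) query terms."""
    terms = tokens(query, fold_case)
    match = all if mode == "AND" else any
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if match(term in tokens(text, fold_case) for term in terms)
    )


print(search("apples oranges", mode="AND"))   # documents containing both terms
print(search("apples oranges", mode="OR"))    # documents containing either term
print(search("Oranges", fold_case=False))     # case-sensitive match
print(search("oranges", fold_case=False))     # different result from the line above
```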
Web Search Queries • Web search queries are SHORT • ~2.4 words on average (Aug 2000) • Has increased, was 1.7 (~1997) • User expectations • Many say “the first item shown should be what I want to see!” • This works if the user has the most popular/common notion in mind
Search Engines • Crawling • Indexing • Querying
Standard Web Search Engine Architecture (diagram) • Crawl the web • Check for duplicates, store the documents (DocIds) • Create an inverted index • Search engine servers receive the user query and consult the inverted index • Show results to user
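The indexing and querying stages of the architecture above can be sketched in a few lines. This is a minimal illustration rather than a description of any production engine: it assumes whitespace tokenization, lowercase folding, and conjunctive (AND) queries, and the function names and sample documents are invented for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of DocIds whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def query(index, text):
    """Return the DocIds that contain every term of the query."""
    terms = text.lower().split()
    if not terms:
        return set()
    # .get avoids creating empty entries for unknown terms.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical crawled documents keyed by DocId.
crawled = {
    1: "information retrieval interfaces",
    2: "web search engines crawl the web",
    3: "interfaces for web search",
}

index = build_inverted_index(crawled)
print(sorted(query(index, "web search")))
print(sorted(query(index, "interfaces")))
```

Because the index maps terms to documents, a query touches only the posting sets for its own terms instead of scanning every stored page.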
Google • Google maintains (currently) the world’s largest Linux cluster (over 15,000 servers) • These are partitioned between index servers and page servers • Index servers resolve the queries (massively parallel processing) • Page servers deliver the results of the queries • Over 3 Billion web pages are indexed and served by Google
Starting Points: What is Really Being Used? • Today’s search engines combine these methods in various ways • Integration of directories • Today most web search engines integrate categories into the results listings • Lycos, MSN, Google • Link analysis • Google uses it; others are also using it • Words on the links seem to be especially useful • Page popularity • Many use DirectHit’s popularity rankings
Ranking: Link Analysis • Assumptions: • If the pages pointing to this page are good, then this is also a good page • The words on the links pointing to this page are useful indicators of what this page is about • References: Page et al. 98, Kleinberg 98
Ranking: Link Analysis • Why does this work? • The official Toyota site will be linked to by lots of other official (or high-quality) sites • The best Toyota fan-club site probably also has many links pointing to it • Lower-quality sites do not have as many high-quality sites linking to them
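The first assumption above (a page is good if good pages point to it) is the idea behind PageRank-style scoring. The following is a simplified sketch of power iteration on a tiny hypothetical link graph, not the algorithm as deployed by any search engine. The damping factor of 0.85, the iteration count, and the page names are assumptions, and every link target is assumed to appear as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page divides its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph loosely following the slide's example.
links = {
    "official-site": [],
    "fan-club": ["official-site"],
    "dealer-page": ["official-site", "fan-club"],
    "obscure-page": ["fan-club"],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the official site ends up ranked highest and the fan-club page second, which matches the intuition on the slide: pages that attract links from other pages accumulate more rank.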
Human-Computer Interaction (HCI) • Human • The end-users of a program • The others in the organization • The designers of the program • Computer • The machines the programs run on • Interaction • The users tell the computers what they want • The computers communicate results • The computer may also tell users what the computer wants them to do
What is HCI? (diagram) • Organizational & Social Issues • Task • Design • Technology • Humans
Shneiderman on HCI • Well-designed interactive computer systems • Promote • Positive feelings of success • Competence • Mastery • Allow users to concentrate on their work, exploration, or pleasure, rather than on the system or the interface
Design Guidelines • Set of design rules to follow • Apply at multiple levels of design • Are neither complete nor orthogonal • Have psychological underpinnings (ideally)
Shneiderman’s Design Principles • Provide informative feedback • Permit easy reversal of actions • Support an internal locus of control • Reduce working memory load • Provide alternative interfaces for expert and novice users
HCI for IR • Information seeking is an imprecise process • UI should aid users in understanding and expressing their information needs • Help formulate queries • Select among available information sources • Understand search results • Keep track of the progress of their search
Provide Informative Feedback • About: • The relationship between query specification and documents retrieved • Relationships among retrieved documents • Relationships between retrieved documents and metadata describing collections
Reduce Working Memory Load • Provide mechanisms for keeping track of choices made during the search process • Allow users to: • Return to temporarily abandoned strategies • Jump from one strategy to the next • Retain information and context across search sessions • Provide browsable information that is relevant to the current stage of the search process • Related terms or metadata • Search starting points (e.g., lists of sources, topic lists)
Interfaces For Expert And Novice Users • Simplicity vs. power tradeoffs • “Scaffolded” user interface • How much information to show the user? • Number and complexity of user operations • Variants of operations • Inner workings of system itself • System history • Example: • Television remote control
User Differences • Abilities, preferences, predilections • Spatial ability • Memory • Reasoning abilities • Verbal aptitudes • Personality differences • Age, gender, ethnicity, class, sexuality, culture, education • Modality preferences/restrictions • Vision, audition, speech, gesture, haptics, locomotion
Nielsen’s Usability Slogans • Your best guess is not good enough • The user is always right • The user is not always right • Users are not designers • Designers are not users • Less is more • Details matter (from Nielsen’s “Usability Engineering”)
Who Builds UIs? • A team of specialists (ideally) • Graphic designers • Interaction / interface designers • Technical writers • Marketers • Test engineers • Software engineers • Ethnographers • Cognitive psychologists
How to Design and Build UIs • Cycle (diagram): Design, Prototype, Evaluate • Task analysis • Rapid prototyping • Evaluation • Implementation • Iterate at every stage!
Task Analysis • Observe existing work practices • Create examples and scenarios of actual use • Try out new ideas before building software
Rapid Prototyping • Build a mock-up of design • Low fidelity techniques • Paper sketches • Cut, copy, paste • Video segments • Interactive prototyping tools • Visual Basic, HyperCard, Director, etc. • UI builders • NeXT, etc.
Evaluation Techniques • Qualitative vs. quantitative methods • Qualitative (non-numeric, discursive, ethnographic) • Focus groups • Interviews • Surveys • User observation • Participatory design sessions • Quantitative (numeric, statistical, empirical) • User testing • System testing
Qualitative Questions • User experience • User preferences • User recommendations • “Design dialogue”
Quantitative Questions • Precision • Recall • Time required to learn the system • Time required to achieve goals on benchmark tasks • Error rates • Retention of the use of the interface over time
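Precision and recall, the first two measures listed above, have standard definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. The sketch below computes both for one query; the function name and the sample document ids are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids the system returned; relevant: ids judged relevant.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against division by zero when either set is empty.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result: 4 documents returned, 5 relevant in the collection.
p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6, 8, 10])
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four retrieved documents are relevant, and two of the five relevant documents were found, so the two measures can differ for the same result list.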
Why Interfaces Don’t Work • Because… • We still think of using the interface • We still talk of designing the interface • We still talk of improving the interface • “We need to aid the task, not the interface to the task.” • “The computer of the future should be invisible.”
Norman on Design Priorities • The user: what does the person really need to have accomplished? • The task: analyze the task. How best can the job be done, taking into account the whole setting in which it is embedded, including the other tasks to be accomplished, the social setting, the people, and the organization? • As much as possible, make the task dominate; make the tools invisible. • Then, get the interaction right, making the right things visible, exploiting affordances and constraints, providing the proper mental models, and so on: the rules of good design for the user, written about many, many times in many, many places.
“What Dr. Bush Foresees” • Cyclops Camera: Worn on forehead, it would photograph anything you see and want to record. Film would be developed at once by dry photography. • Microfilm: It could reduce Encyclopaedia Britannica to volume of a matchbox. Material cost: 5¢. Thus a whole library could be kept in a desk. • Vocoder: A machine which could type when talked to. But you might have to talk a special phonetic language to this mechanical supersecretary. • Thinking machine: A development of the mathematical calculator. Give it premises and it would pass out conclusions, all in accordance with logic. • Memex: An aid to memory. Like the brain, Memex would file material by association. Press a key and it would run through a “trail” of facts.