Surviving the Information Explosion Jaime Teevan, MIT with Christine Alvarado, Mark Ackerman and David Karger
Let Me Interview You! • Web: • What’s the last Web page you visited? How did you get there? • Have you looked for anything on the Web? • Email: • What’s the last email you read? What did you do with it? • Have you gone back to an email you’ve read before? • Files: • What’s the last file you looked at? How did you get to it? • Have you looked for a file?
Overview • Introduction • Related Work • Study Methodology • Results: Search • Discussion Intro RW Study Res Disc
The Information Explosion You must extract information from: • 3 billion Web pages (Google) • Dozens of incoming emails daily • Hundreds of files on your personal computer
Haystack: Personal Information Storage • Web pages • Email • Files • Calendar • Contacts
Haystack: Personal Information Storage • “What was that paper I read last week about Information Retrieval?”
Haystack: Personal Information Storage • “Ah yes! Thank you.”
Supporting Information Interaction • Treat different corpora the same? • Provide access to meta-data? • Keyword search (XP, advanced search) • Browse (Hearst) • We don’t really know … • Understand access in the wild!
Overview • Introduction • Related Work • Interaction by corpus • How people search • Study Methodology • Results: Search • Discussion
Interaction By Corpus • Paper documents • [Malone, 1983], [Whittaker & Hirshberg, 2001] • Files • [Barreau & Nardi, 1995] • Web • [Abrams, et al. 1998], [Byrne, et al. 1999] • Email/Calendar • [Whittaker & Snider, 1996], [Bellotti & Smith, 2000]
How People Look for Information • Focus: Web • Log analysis • [Catledge & Pitkow, 1995], [Tauscher & Greenberg, 1997] • Controlled tasks/environment • [Baldonado & Winograd, 1997], [Spool, 1998] • Situated navigation • Micronesian islanders [Suchman, 1987] • Electronic [Marchionini, 1995], [Hearst, 2000] • Information scent [Chi, Pirolli, Chen & Pitkow, 2001]
Method • Subjects • 15 MIT CS graduate students (5 women, 10 men) • Setup • 10 short interviews (~5 min.) • 1 long interview (~45 min.) • Topics • Web, Email, Files
Short Interviews • Modified diary study [Palen, 2002] • Randomly interrupted participant • Two question types • Last email/file/Web page looked at • Last email/file/Web page looked for • Goal: Discover patterns in searching and browsing
Long Interviews • “Guided tour” of subject’s Web space, email, and file system • Goals: • Discover organizational patterns • Discover problems in organizational structure • Relate organization to search/browse behavior
Overview • Introduction • Related Work • Study Methodology • Results: Search • What and how • Relating what and how • Individual strategies • Discussion
Complex Information Spaces • People had complex spaces • Felt in control • “That’s an interesting question. I think my email is the worst, because I have so much of it. And there are people on the other end who expect me to reply to it. My file system is pretty well organized. I have to go through it every once in a while, every couple of months and just kind of push things into the right folders and delete the old stuff. The Web just works, usually.”
What People Look For • Specific Information • A small fact • E.g., URL, phone number, appointment time • General Information • A broad set of information • E.g., good sneakers to buy, info on cancer • Specific Document • The actual document • E.g., a file to print, an email to reply to
How People Look For Information • The last thing you looked for on the Web • Did you use a search engine? • Search is more than just keyword search • Browse, use bookmarks, type URLs • “I was looking to figure out where Glaris was. When I lived in Switzerland there were only a few reasonable mapping places of the country. And so I had bookmarked [the Switzerland map site].”
Strategies Looking for Information • Teleporting • Traditional search • Jump directly to target • Specify everything up front • Orienteering • Use local navigation • [O’Day and Jeffries, 1993] • Could include keyword search
Example: Orienteering Interviewer: Have you looked for anything on the Web today? Jim: I had to look for the office number of the Harvard professor. I: So how did you go about doing that? J: I went to the homepage of the Math department at Harvard […] J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.” […] I: So you went to the Math department, and then what did you do over there? J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.
Example: Teleporting • What if Jim had teleported instead? • Could have typed into a search engine: “Connie Monroe, office number”
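The contrast between the two strategies can be sketched in code. This is only an illustration and not part of the study: the site graph, page names, page text, and the `NEXT_HOP` table of local cues are all hypothetical, loosely modeled on Jim's example. Teleporting specifies the whole target up front and jumps to it; orienteering takes one local step at a time from a known starting point.

```python
# Hypothetical toy site; none of these page names or texts come from the study.
SITE_LINKS = {
    "math-home": ["people", "courses"],
    "people": ["faculty", "visiting-faculty"],
    "visiting-faculty": ["monroe-page"],
    "courses": [],
    "faculty": [],
    "monroe-page": [],
}

PAGE_TEXT = {
    "math-home": "Department of Mathematics",
    "people": "Find people in the department",
    "visiting-faculty": "Visiting faculty: Connie Monroe",
    "courses": "Course listings",
    "faculty": "Permanent faculty",
    "monroe-page": "Connie Monroe contact information and office number",
}

# Hypothetical local cues: at each page, the link the searcher recognizes.
NEXT_HOP = {
    "math-home": "people",
    "people": "visiting-faculty",
    "visiting-faculty": "monroe-page",
}


def teleport(query):
    """Jump directly to pages matching a fully specified keyword query."""
    terms = query.lower().replace(",", " ").split()
    return [page for page, text in PAGE_TEXT.items()
            if all(term in text.lower() for term in terms)]


def follow_known_cues(page, links):
    """Pick the next link using local knowledge, else the first link."""
    return NEXT_HOP.get(page, links[0])


def orienteer(start, choose_link, is_target, max_steps=10):
    """Navigate step by step from a known source, returning the path taken."""
    path = [start]
    current = start
    for _ in range(max_steps):
        if is_target(current):
            return path
        links = SITE_LINKS.get(current, [])
        if not links:
            return path  # dead end: stop where we are
        current = choose_link(current, links)
        path.append(current)
    return path


teleport_hits = teleport("Connie Monroe, office number")
orienteer_path = orienteer("math-home", follow_known_cues,
                           lambda page: page == "monroe-page")
```

In this sketch the teleporting call only succeeds if every query term appears on the target page, while the orienteering path needs no complete description of the target, only a recognizable next step at each page.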
“Keyword Search” and “Browse” • Teleporting: “Keyword Search” • Traditional search • Jump directly to target • Specify everything up front • Orienteering: “Keyword Search” and “Browse” • Use local navigation • [O’Day and Jeffries, 1993] • Could include keyword search
Relating How and What • People orienteer a lot • What people look for is related to how they look • Surprise: Orienteer to specific information
Why So Much Orienteering? • Your last email search • Did you know what email contained that information? • What were you looking for? • People look for the information source • Specific information searches → Document searches
Looking for the Source: Example “I was looking to figure out where Glaris was. When I lived in Switzerland there were only a few reasonable mapping places of the country. And so I had bookmarked [the Switzerland map site].”
Looking for the Source: Example Interviewer: Have you looked for anything on the Web today? Jim: I had to look for the office number of the Harvard professor. I: So how did you go about doing that? J: I went to the homepage of the Math department at Harvard […] J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.” […] I: So you went to the Math department, and then what did you do over there? J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.
Individual Strategies • Search strategies varied by individual • Pilers: Pile information • Filers: File information • Where was the last email you found? • Inbox? • Elsewhere?
File or Pile Email [Figure; labels: Filer, Piler]
How Individuals Search For Files [Figure; labels: Filers, Pilers, Teleport, Orienteer]
Overview • Introduction • Related Work • Study Methodology • Results • Discussion • Understanding and applying what we learn • Future work
Understanding Teleporting v. Orienteering Why was orienteering chosen over teleporting? • Teleporting doesn’t work • Teleporting requires too much cognitive effort • Risk of over-specifying target • Orienteering gives knowledge of the source • Teleporting a failure mode • Can’t associate information with source • Can’t find the information source
Understanding Filers v. Pilers Why do filers teleport more than pilers? • Irony: Those with good organization don’t take advantage of it • Filers have strictly organized information • Are used to defining meta-data for their information • Pilers loosely organize their information • Are used to associative navigation
Haystack: Applying What We Learn • Using meta-data: Support orienteering • Not about having the perfect search interface • Need ability to prompt • Individualized support • Pilers/filers • Learning individual behaviors
Future Work: Search • Previously viewed information • Causes of failure • Searches across corpus • Getting help from others
Future Work: Organization • Consistency of organization across corpus • Corpora boundaries • Context used in organization • Organization’s effect on search
Conclusion • Look at search in the wild • Strategies: Teleport/Orienteer • Individual strategies • Future systems should: • Support orienteering • Provide individualized support
Questions? Contact us with comments: - teevan@ai.mit.edu - calvarad@ai.mit.edu To learn more about Haystack: http://haystack.lcs.mit.edu
Relating How and Corpus • Email and files: Almost always orienteered • Easy to associate information with document • Web: Teleported much more often
Relating What and Corpus • Email searches were primarily for specific information • File searches were primarily for documents • Web searches were more evenly distributed