
Summarization and Personal Information Management

This study explores personal information management (PIM) strategies and the usage of summarization tools across different PIM tools. The study examines user demographics, usage patterns, and the evolution of PIM strategies over time. The findings provide insights into the challenges and opportunities in designing effective PIM tools.

Presentation Transcript


  1. Summarization and Personal Information Management Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute

  2. Announcements • Questions? • Final exam 1 week from today (or on Friday, May 9) • You can pick which exam time you prefer • Poster session Thursday, May 1 on bridge between Wean and NSH • Posters should be 3 feet wide by 4 feet long • Rohit can help with printing • We will provide foam core and easels • Poster session will be public!!!! • I’ll bring food!!!!

  3. Announcements • Deadline for Term Project final posters and papers is Wednesday, May 14!!! • University Course Assessments • Plan for Today • Readings for Today • Boardman & Sasse, 2004 • Ian’s student presentation on Otterbacher et al., 2006 • Evaluation critiques

  4. Remember: Validity • Face validity: are you really measuring what you say you’re measuring? Are you measuring what’s important to measure? • External (ecological) validity: is your task/setting realistic? does your measure correlate with something else associated with what you want to measure? • Internal validity: are you controlling for random variance? Do you have enough statistical power?

  5. Boardman & Sasse

  6. Popping Back Up! • Zooming out from summarization to larger information management issues • Thinking about the connection between studies • Thinking about user adaptation • “self-auditing effect” • Emphasis on ecological validity • (Diagram: Problem → Human Behavior → Solution Design → Technology → Problem?)

  7. User Demographics (Stage 1) • 29 out of 31 were university people • 8 out of 31 were female • 14 researchers, 14 students, 1 IT support person, 1 IT manager, 1 unemployed • Average age 35 (min 21, max 60) How is this similar to or different from prior studies of PIM we have looked at?

  8. Interesting quote • We should study the patterns of people who have a personal human secretary doing all their PIM. In user studies like this paper, we see that people trade off effort against benefit, which encourages us to build tools only for the things people already put much effort into. However, what we think of as an ideal PIM tool is one that gives us benefit without much effort, and that is exactly what a personal human secretary provides. Such people never care how hard it is for the secretary to find a piece of information, only about the benefit they get; this is exactly what we want from a computerized secretary.

  9. Boardman and Sasse, 2004, “Stuff Goes into the Computer and Doesn’t Come Out”: A Cross-tool Study of Personal Information Management • Presented by Emil Albright

  10. Motivation • Little research has been done on PIM as a cross-tool activity • PIM strategy evolution has not been deeply explored

  11. How does the usage of diverse PIM tools contrast and interact for single users? • Examines usage of three PIM tools: files, email, bookmarks • Proposes new classifications of use patterns • Attempts to track usage change over time by a subset of participants

  12. The Study • Phase 1: interviews featuring “guided tours” of each participant's PIM strategy • 31 participants (28 academics) • All had solid computing experience • Phase 2: PIM evolution • 8 participants • Bi-weekly snapshots for 3 months, and a follow-up 5 months later • Participants were provided the author's PIM tool

  13. Phase 1 results • Users did not easily match pre-existing use patterns for email and bookmarks • Multiple organizational strategies were employed • Folder structures overlapped across tools, organized around projects and roles

  14. Phase 1 results (cont.) • Little maintenance of collections • Browsing generally preferred over search • The act of considering PIM frequently caused reorganization • Bookmarks viewed as more ephemeral / less organized

  15. Phase 1 results (cont.)

  16. Phase 2 • Participants given the WorkspaceMirror (WM) tool • 4 persistently used it, 4 only tried it • Participants who found it useful used it to reinforce their existing structures rather than reorganize • Only 2 users made major revisions of their PIM strategy • WM was not a factor; both cited the study itself • Collection growth: files > email > bookmarks

  17. Phase 2 (cont.)

  18. Overall observations • Not all tasks are supported across tools, so integration will yield varying amounts of benefit • Organizational needs vary from tool to tool (and user to user)

  19. Taxonomy extensions • Information usefulness: Active (including ephemeral and working), Dormant (inactive, potentially useful), Not useful, Un-assessed (e.g. new emails) • Information ownership: Mine (self-created or assessed), Not-mine (un-assessed or “in-the-wild” online)
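
To make the taxonomy concrete, here is a minimal sketch of how the two dimensions could be encoded as independent labels on an item. The enum and class names are illustrative assumptions, not anything defined in the paper.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Usefulness(Enum):
    ACTIVE = auto()       # includes ephemeral and working items
    DORMANT = auto()      # inactive, but potentially useful later
    NOT_USEFUL = auto()
    UNASSESSED = auto()   # e.g. newly arrived email

class Ownership(Enum):
    MINE = auto()         # self-created or already assessed
    NOT_MINE = auto()     # un-assessed, or "in the wild" online

@dataclass
class PIMItem:            # hypothetical container, for illustration only
    name: str
    usefulness: Usefulness
    ownership: Ownership

# Example: a freshly received attachment is un-assessed and not (yet) "mine".
new_attachment = PIMItem("report.pdf", Usefulness.UNASSESSED, Ownership.NOT_MINE)
```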

  20. What do we do with strategies? Why do you think previous email use categories didn’t work? Why is it important that there is a lot of overlap between email folders and file folders?

  21. What type of validity is challenged here?

  22. News to Go: Hierarchical Text Summarization for Mobile Devices

  23. Evaluation of Hierarchical Summarization System • Summarization for Mobile Devices • Web Documents • Reduced screen size • Reduced bandwidth

  24. The Algorithm • Compute salience score for each sentence • Distance from centroid • Position in document • Length • Similarity w/ first sentence • Good for news, but will this carry over well to other domains?
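
As a rough illustration of that kind of scoring function, the sketch below combines the four features with hand-picked weights. The bag-of-words overlap measure and the weights are assumptions for illustration, not the authors' exact formulation.

```python
from collections import Counter

def salience_scores(sentences, w_centroid=1.0, w_position=1.0,
                    w_length=0.5, w_first=1.0):
    """Score each sentence by centroid similarity, position in the document,
    length, and overlap with the first sentence (illustrative weights)."""
    bags = [Counter(s.lower().split()) for s in sentences]
    centroid = Counter()
    for bag in bags:
        centroid.update(bag)

    def overlap(a, b):
        # crude normalized word overlap, standing in for cosine similarity
        if not a or not b:
            return 0.0
        shared = sum((a & b).values())
        return shared / (sum(a.values()) * sum(b.values())) ** 0.5

    n = len(sentences)
    max_len = max(len(s.split()) for s in sentences) or 1
    scores = []
    for i, bag in enumerate(bags):
        scores.append(
            w_centroid * overlap(bag, centroid)
            + w_position * (n - i) / n                        # earlier is better
            + w_length * len(sentences[i].split()) / max_len
            + w_first * overlap(bag, bags[0])
        )
    return scores
```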

  25. More Algorithm • Construct a tree • Depth in the tree relates to salience in document • Nodes can be collapsed or expanded to view children in the display
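
A minimal sketch of one way to realize such a structure: fill the tree level by level in descending salience order, so depth reflects salience. The branching factor and level-by-level fill are assumptions; the paper's exact construction may differ.

```python
class SummaryNode:
    def __init__(self, sentence):
        self.sentence = sentence
        self.children = []
        self.expanded = False   # the display shows children only when expanded

def build_salience_tree(sentences, scores, branching=3):
    """Place sentences level by level in descending salience order,
    so that depth in the tree reflects salience in the document."""
    ranked = [s for s, _ in sorted(zip(sentences, scores),
                                   key=lambda pair: pair[1], reverse=True)]
    root = SummaryNode(ranked[0])
    frontier, i = [root], 1
    while i < len(ranked):
        next_frontier = []
        for parent in frontier:
            for _ in range(branching):
                if i == len(ranked):
                    break
                child = SummaryNode(ranked[i])
                parent.children.append(child)
                next_frontier.append(child)
                i += 1
        frontier = next_frontier
    return root
```

Collapsing or expanding a node in the display then amounts to toggling `expanded` and re-rendering only the visible portion of the tree.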

  26. The User Study • Extrinsic Task Based Evaluation • Reading plaintext news items • How well does this generalize to other “Web Documents”? • Used a mobile phone emulator • DeckIt WAP Phone Emulator • How would this compare to using an actual mobile phone? • Interface actions • Latency • Setting • Answering comprehension questions • One question per document • Answer contained directly in document • No notes on question selection process

  27. n • 5 sets of 10 articles • Temporally similar • News articles • One 5-option multiple-choice question per article • 39 subjects • Recruited from a student population • How generalizable? • Assigned to one of 6 conditions

  28. Experimental Conditions • Independent Variable! • Table 1: The six treatments used in the study.

  29. Experimental Design (cont) • Each subject completed between 1 and 5 of the document sets • Could leave before completing due to University policy • ~3.7 document sets on average (144/39) • Task/Treatment pairings were counterbalanced • Minimal instruction • Unlimited time to answer questions
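
Counterbalancing here means rotating which treatment each document set is paired with across subjects. Below is a minimal Latin-square-style sketch of that idea; the treatment labels are placeholders, not the paper's condition names.

```python
def counterbalanced_plan(document_sets, treatments):
    """Rotate treatments across document sets so that, over subjects,
    each set is paired with each treatment about equally often."""
    k = len(treatments)
    plans = []
    for shift in range(k):
        plans.append([(doc, treatments[(i + shift) % k])
                      for i, doc in enumerate(document_sets)])
    return plans  # subject j follows plans[j % k]

# usage sketch with placeholder names
sets = ["set1", "set2", "set3", "set4", "set5"]
conds = ["hierarchical", "full-text", "no-summary",
         "baseline-1", "baseline-2", "baseline-3"]
plan_for_first_subject = counterbalanced_plan(sets, conds)[0]
```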

  30. Research Questions • Are there significant differences between the five treatments (systems) when the effects of task difficulty are controlled? • Excluding the no-summary control • Are there any significant differences in task performance and efficiency between the hierarchical summarization setting and the full text setting? • Are there significant differences between the hierarchical summarization setting and the other three summarization methods?

  31. Variables Measured • Mean Response Time • Task Accuracy • Bytes Transferred (per task and per click)

  32. Data

  33. Answers • Are there significant differences between the five treatments (systems) when the effects of task difficulty are controlled? • ANOVA • Determined that there were significant differences between systems along all three measured metrics • Accuracy, Speed, Bytes transferred
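
For readers who want to run this kind of test themselves, here is a minimal one-way ANOVA sketch, applied once per metric. The numbers are hypothetical, and the paper's analysis additionally controls for task difficulty, which a simple one-way test does not capture.

```python
from scipy import stats

def anova_across_treatments(per_treatment_values):
    """One-way ANOVA over one metric (accuracy, response time, or bytes),
    where per_treatment_values maps treatment name -> list of scores."""
    groups = list(per_treatment_values.values())
    f_stat, p_value = stats.f_oneway(*groups)
    return f_stat, p_value

# hypothetical accuracy data, one list of per-subject scores per treatment
accuracy = {
    "hierarchical": [0.80, 0.70, 0.90, 0.75],
    "full-text":    [0.70, 0.65, 0.80, 0.70],
    "no-summary":   [0.50, 0.55, 0.60, 0.50],
}
f_stat, p = anova_across_treatments(accuracy)
print(f"F = {f_stat:.2f}, p = {p:.3f}")
```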

  34. Answers (cont) • Are there any significant differences in task performance and efficiency between the hierarchical summarization setting and the full text setting? • Only significant difference was bytes • Their system comes out ahead • Would accuracy have been significant with more subjects? • 5-8 people per cond/set

  35. Answers (cont) • Are there significant differences between the hierarchical summarization setting and the other three summarization methods?

  36. Conclusions • Do the answers to the research questions validate the conclusions? • “There is a difference between the systems” • “Our system requires less data than full text” • And performs similarly? • “Our system beats these baseline measures” • Only marginal significance in one case • How generalizable is their solution to other users, other situations, other data?

  37. Evaluation Critique

  38. Remember: Validity • Face validity: are you really measuring what you say you’re measuring? Are you measuring what’s important to measure? • External (ecological) validity: is your task/setting realistic? does your measure correlate with something else associated with what you want to measure? • Internal validity: are you controlling for random variance? Do you have enough statistical power?

  39. Discussion about Evaluation • Pick one of the evaluations from a different group than your own • Evaluate completeness of the design • Number and type of users, specifics of the task and setting, experimental manipulation, outcome measures • Evaluate the plan in terms of: Face validity, Internal validity, Ecological Validity • What makes evaluating this work tricky? • What would you do differently?

  40. Questions?
