
Summarization and Personal Information Management

This article explores the challenges and implications of language technologies in managing personal information, including email, and discusses the potential benefits of summarization. It also examines the importance of discourse structure in email and other discussion forums.



Presentation Transcript


  1. Summarization and Personal Information Management Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute

  2. Announcements • Questions? • Context: Looking forward to Group Moderation • Plan for Today • Wattenberg et al., 2005 • More discussion about discourse structure

  3. From Email to Discussion Forums • Finishing up email • Thinking about scale and architecture issues • Thinking about collaboration from a new angle (shared inboxes) • Understanding Moderation • What problem are we trying to solve (Cosley et al., 2005) and how do moderators solve that problem (Sunstein, 2006) • Flash Forums (Kushal paper is the most technology focused) • Focus on supporting productive interaction rather than simply keeping artifacts under control

  4. Wattenberg et al., 2005

  5. Real World Issues • More of an experience report than a research article • Raises important questions that should influence our research directions • Mass adoption • Permanent/long term use • Use in a variety of contexts within an organization • Huge volumes of email (client-server rather than just client)

  6. “Cool” Features that Never Shipped • Information Visualization • Person centric • Size = frequency, Color = recency • Readability issues • Text Analysis • Named entity recognition • Internationalization • Regular expression matching not feasible on server side • Instant Search • New search with every keystroke • Not enough bandwidth

  7. Server Side versus Client Side • Emails stored on server • Advantage: you can get to your email from everywhere • Disadvantage: huge volume of email processed on server in large organizations • Emails stored on client • Moving to a new machine is difficult • Harder to enforce uniform filtering/monitoring Implications related to where the emails are archived and where the processing occurs.

  8. Issues of Scale • One of the major shaping forces of research in language technologies over the past decade Where does that leave us? Should we just give up then?

  9. Client versus Server? • Navigation/Browsing • Search • Classification, Filtering, Ranking • Extraction (Text Mining, Information Extraction, Content Selection) • Presentation (Design, Reformulation, Structuring, Ordering) * Where would summarization be located?

  10. What do you think? • Seemingly trivial "implementation details" can become daunting technological constraints in the end. Does this suggest that we should keep these constraints in mind even at the design stage?

  11. Multi-User In-Boxes • Support staff, Help desks, Clubs or other organizations, distance education • Large volumes of non-personal email • Reuse of message texts • Probably a coincidence • More a function of the kind of email this type of organization receives

  12. Multi-User In-Boxes • New problems: Trouble coordinating when a message has received a response already • New use of folders to facilitate flow • Print, Done • Take home message: context of use affects which features are needed • Implications for what kind of automatic processing would support this activity

  13. Re-invention • Email usage patterns are idiosyncratic • People use features for different purposes than what they were designed for • Print and Done folders • Could be seen as a sign of success

  14. Internationalization • Dates are important and rather formulaic • Information extraction work has focused on entities like these • Difficult issues with respect to generalizability • Formats differ from country to country • Addresses • Person names • Can be solved by brute force – raises questions about what sorts of automatic processing we should invest in
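To make the "differ from country to country" point concrete, here is a minimal Python sketch of how the same formulaic date string reads differently under two conventions. The function name `read_date`, the convention labels, and the sample string are assumptions made for illustration; they do not come from Wattenberg et al.

```python
from datetime import datetime

# Two illustrative conventions (assumed for this sketch, not from the paper).
FORMATS = {
    "day_first": "%d/%m/%Y",    # common in much of Europe
    "month_first": "%m/%d/%Y",  # common in the United States
}

def read_date(text, convention):
    """Parse a slash-separated date under the named convention."""
    return datetime.strptime(text, FORMATS[convention]).date()

if __name__ == "__main__":
    sample = "03/04/2005"
    for name in FORMATS:
        # The same string yields a different calendar date per convention.
        print(name, read_date(sample, name).isoformat())
```

A brute-force solution in this spirit would simply enumerate one format table per locale, which is why the slide asks whether that kind of processing is where research effort should go.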

  15. More about Discourse Structure

  16. Thread Arcs – Implications of the conversational structure of email interactions • Threads are important and relatively easy to compute in email • Much less easy in other contexts • Empirical study shows that short, simple threads were the majority • Not true in other types of on-line communication • A bushy thread may suggest that the original message requested feedback from others; a deep thread may indicate a back-and-forth conversation.
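The bushy versus deep distinction can be sketched in a few lines of Python, assuming each message carries its own identifier and the identifier of the message it replies to (in real email these would typically come from the Message-ID and In-Reply-To headers). The tuple layout and function names below are hypothetical and are not the Thread Arcs implementation.

```python
from collections import defaultdict

def build_thread(messages):
    """Map each parent id to the ids of its direct replies.

    `messages` is a list of (message_id, parent_id) pairs;
    parent_id is None for the message that starts the thread.
    """
    children = defaultdict(list)
    for msg_id, parent_id in messages:
        children[parent_id].append(msg_id)
    return children

def thread_depth(children, node):
    """Number of messages on the longest reply chain starting at `node`."""
    replies = children.get(node, [])
    if not replies:
        return 1
    return 1 + max(thread_depth(children, r) for r in replies)

def max_branching(children):
    """Largest number of direct replies to any single message."""
    return max(
        (len(replies) for parent, replies in children.items() if parent is not None),
        default=0,
    )

if __name__ == "__main__":
    # Bushy: three people answer the same message.
    bushy = build_thread([("a", None), ("b", "a"), ("c", "a"), ("d", "a")])
    # Deep: each message answers the previous one.
    deep = build_thread([("a", None), ("b", "a"), ("c", "b"), ("d", "c")])
    for name, thread in (("bushy", bushy), ("deep", deep)):
        print(name, thread_depth(thread, "a"), max_branching(thread))
```

High branching with low depth corresponds to the "requesting feedback" shape, and low branching with high depth to the back-and-forth shape. Any thresholds for calling a thread one or the other would have to be chosen empirically.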

  17. Email and Chat

  18. Why are threads important? • Useful for navigation • But what actually is a thread? • What does the structure mean? • Will the design considerations that went into thread arcs transfer to other discussion forums?

  19. Grosz & Sidner, 1986 Attention, Intentions, and the Structure of Discourse • What is the structure of discourse? • Attentional state: What is salient now • My hamster was making a lot of noise on his hamster wheel last night. • “It is nocturnal” Vs. “It needs to be oiled” Vs. “It is new.” • How do you know what “it” refers to?

  20. Grosz & Sidner, 1986 Attention, Intentions, and the Structure of Discourse • Intentional structure • Whenever I say something, I am trying to accomplish something • In coherent discourse, separate intentions are related to one another • If I am trying to walk you through installing some software, I will first talk you through downloading it, and then I will talk you through the installation procedure

  21. Example of Hierarchical Structure of Discourse

  22. Connection between Intentions and Attentional State [Figure: discourse segments DS 0 through DS 4]
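Grosz & Sidner model attentional state as a stack of focus spaces: a space is pushed when a discourse segment opens and popped when that segment's purpose is satisfied. The Python sketch below illustrates the idea using the software installation example from the earlier slide. The class name, method names, and entity lists are invented for this illustration and are not taken from the paper or the figure.

```python
class FocusStack:
    """A toy stack of focus spaces, one per open discourse segment."""

    def __init__(self):
        self._spaces = []  # list of (segment_name, set_of_entities)

    def push(self, segment, entities):
        """Open a discourse segment and bring its entities into focus."""
        self._spaces.append((segment, set(entities)))

    def pop(self):
        """Close the most recently opened segment and return its name."""
        return self._spaces.pop()[0]

    def salient(self):
        """Entities in the topmost focus space (most salient right now)."""
        return set(self._spaces[-1][1]) if self._spaces else set()

    def accessible(self):
        """Entities in any focus space still on the stack."""
        found = set()
        for _, entities in self._spaces:
            found |= entities
        return found

if __name__ == "__main__":
    focus = FocusStack()
    focus.push("install the software", {"software"})
    focus.push("download it", {"download page", "installer file"})
    print(sorted(focus.salient()))
    focus.pop()  # the download sub-purpose is satisfied
    focus.push("run the installer", {"installer file", "setup wizard"})
    print(sorted(focus.accessible()))
```

After the download segment is popped, "download page" is no longer accessible, while "software" from the dominating segment still is. That is one way the intentional structure constrains what a later "it" can plausibly refer to.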

  23. What do threads have to do with it?

  24. What do threads have to do with it? What about information scent?

  25. What’s the point? • What does the structure of a thread mean? • What does it say about the quality of the interaction? • What does it imply about which information is more important? • What does it imply about what people will notice versus not notice? • What does it imply about how the information is related to each other? • Structure will be important for summarization • Ideas?

  26. Questions?
