Visualizing ANLP Results for Exploring Enron Data

Explore ANLP results visually to analyze Enron's message traffic and network, detect communities, and identify key players for better insights and decision-making. Contact jheer@cs.berkeley.edu with improvement ideas.

Presentation Transcript


  1. exploring enron – visualizing anlp results (an aanlp project) Jeffrey Heer – jheer@cs.berkeley.edu

  2. the problem • ANLP technologies are highly valuable but often less than usable and reliable… • Can be hard to make sense of results… how to go from reams of textual output to new knowledge and insight? • Completely automated processing can be dangerous! Can be wrong or obscure patterns, especially when trusted training data is not available.

  3. one possible solution • Turn ANLP technologies into tools usable within exploratory data environments • Enable users to directly visualize and analyze the results of processing, always providing access to the underlying source data. • Users can then use these tools to further analysis, while simultaneously making their own decisions of the quality of processing results and possibly even correcting algorithms as they go.

  4. visualize inferred social network • view message traffic and actual e-mail text

  5. visualize clustering results – color coded to enron business e-mails • pie charts indicate categorizations of e-mail traffic

  6. zoom and pan to explore large networks

  7. filter network for ‘hubs’ of higher connectivity
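One way to read the "hubs" filter is as a degree threshold on the inferred network. The following is a minimal sketch under that assumption; the slides do not say how connectivity is measured, and `build_graph`, `filter_hubs`, and the sample pairs are hypothetical names and data used only for illustration.

```python
from collections import defaultdict

def build_graph(messages):
    """Build an undirected adjacency map from (sender, recipient) pairs."""
    adjacency = defaultdict(set)
    for sender, recipient in messages:
        if sender != recipient:
            adjacency[sender].add(recipient)
            adjacency[recipient].add(sender)
    return adjacency

def filter_hubs(adjacency, min_degree):
    """Keep nodes with at least min_degree neighbours in the full graph,
    together with the edges that run between the kept nodes."""
    hubs = {node for node, nbrs in adjacency.items() if len(nbrs) >= min_degree}
    return {node: adjacency[node] & hubs for node in hubs}

# Hypothetical traffic: "a" talks to three people, "e" to only one.
messages = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
hub_graph = filter_hubs(build_graph(messages), min_degree=2)
```

Raising `min_degree` plays the role of dragging the filter further: fewer, better-connected people remain in view.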

  8. filter, zoom, details on demand! view all messages to or from a given person…

  9. …or view all message traffic between two people.
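The two detail views above (all traffic for one person, all traffic between two people) amount to simple filters over the message list. A small sketch, assuming each message carries a single sender and a single recipient; the `Message` type, the function names, and the sample records are hypothetical and not taken from the tool itself.

```python
from typing import NamedTuple

class Message(NamedTuple):
    sender: str
    recipient: str
    subject: str

def messages_involving(messages, person):
    """All messages sent to or from the given person."""
    return [m for m in messages if person in (m.sender, m.recipient)]

def messages_between(messages, a, b):
    """All messages exchanged between two people, in either direction."""
    return [m for m in messages if {m.sender, m.recipient} == {a, b}]

# Hypothetical sample data.
inbox = [
    Message("alice", "bob", "status"),
    Message("bob", "alice", "re: status"),
    Message("carol", "alice", "lunch"),
    Message("carol", "bob", "budget"),
]
alice_traffic = messages_involving(inbox, "alice")
alice_bob = messages_between(inbox, "alice", "bob")
```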

  10. networks form various communities … some obvious, some not • can we process the inferred network to automatically identify communities at various granularities? • attempt social network analysis using a hierarchical agglomerative clustering approach, greedily combining groups into communities based on a criterion function that compares within-community edges against total connectivity.
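The greedy agglomerative approach described above can be sketched as follows. The slide does not name the criterion function; this sketch assumes Newman-style modularity, which scores a partition by comparing within-community edges against what total connectivity would predict. It recomputes the score from scratch for every candidate merge, so it is only suitable for small graphs, and all function names and the sample edge list are hypothetical.

```python
from itertools import combinations

def modularity(edges, communities):
    """Sum over communities of (within-edge fraction) - (degree fraction)^2."""
    m = len(edges)
    community_of = {n: i for i, members in enumerate(communities) for n in members}
    within = [0] * len(communities)
    degree = [0] * len(communities)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            within[cu] += 1
    return sum(w / m - (d / (2 * m)) ** 2 for w, d in zip(within, degree))

def greedy_agglomerate(edges):
    """Start from singletons and repeatedly apply the merge that gives the
    highest modularity. Returns every stage as (partition, score)."""
    nodes = sorted({n for edge in edges for n in edge})
    communities = [frozenset([n]) for n in nodes]
    history = [(list(communities), modularity(edges, communities))]
    while len(communities) > 1:
        best = None
        for a, b in combinations(communities, 2):
            # Only consider groups joined by at least one edge.
            if not any((u in a and v in b) or (u in b and v in a) for u, v in edges):
                continue
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            score = modularity(edges, merged)
            if best is None or score > best[1]:
                best = (merged, score)
        if best is None:  # remaining groups are disconnected
            break
        communities = best[0]
        history.append((list(communities), best[1]))
    return history

# Hypothetical graph: two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
history = greedy_agglomerate(edges)
best_partition, best_score = max(history, key=lambda stage: stage[1])
```

Because `history` keeps every intermediate partition, a viewer can step through the merge sequence to see communities at coarser or finer granularity rather than committing to a single cut.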

  11. show results of community analysis at various stages of progress… allowing interactive exploration of the agglomerative cluster tree

  12. analysis scenario • filtered graph to isolate “power players” • looked for “california” color labels on edges • found John Shelk reporting on congressional meetings to Tim Belden – all one-way e-mails • looking at Tim Belden revealed ALL one-way e-mails sent to him, no responses, etc • seemed a bit suspicious… where is that info going?

  13. analysis scenario • all one-way e-mails to Tim Belden about various legal issues…
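The one-way pattern in this scenario was found by eye in the visualization. A rough sketch of how the same check might be automated, assuming messages are available as (sender, recipient) pairs; `one_way_pairs`, `inbound_only`, and the placeholder names in the sample log are hypothetical and are not real Enron data.

```python
from collections import Counter

def one_way_pairs(messages):
    """Map (sender, recipient) -> message count for pairs where the
    recipient never wrote back to the sender."""
    sent = Counter(messages)
    return {(s, r): n for (s, r), n in sent.items() if (r, s) not in sent}

def inbound_only(messages, person):
    """Senders whose mail to `person` was never answered."""
    return sorted(s for (s, r) in one_way_pairs(messages) if r == person)

# Hypothetical traffic with placeholder names.
log = [
    ("analyst", "trader"),
    ("analyst", "trader"),
    ("lawyer", "trader"),
    ("trader", "assistant"),
    ("assistant", "trader"),
]
silent_recipients = one_way_pairs(log)
unanswered = inbound_only(log, "trader")
```

A result like this only flags candidates for inspection; as the earlier slides argue, the underlying messages still need to be read before drawing conclusions.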

  14. guilty!

  15. future work a plenty • improved colors, filtering, and brushing • category filtering, brushing from e-mails to graph • histogram visualization over sliders • visualize network of messages themselves? • temporal dimension of data • time-selection range slider • animate evolution of the network • search search search • tie to additional analyses • automated clustering • finer social network analysis • duplicate identification, acronym resolution, etc…

  16. please send me any ideas you have to improve this!!!  jheer@cs.berkeley.edu 

  17. I’m Kenneth Lay. And I approve this message.
