Modeling Temporal Intention in Resource Sharing

Modeling Temporal Intention in Resource Sharing. Hany M. SalahEldeen & Michael L. Nelson. Old Dominion University. Department of Computer Science Web Science and Digital Libraries Lab. WADL 2013. Hany SalahEldeen & Michael Nelson Modeling Temporal Intention. WADL2013.

Presentation Transcript


  1. Modeling Temporal Intention in Resource Sharing. Hany M. SalahEldeen & Michael L. Nelson, Old Dominion University, Department of Computer Science, Web Science and Digital Libraries Lab. WADL 2013

  2. All tweets are equal… but some are more equal than others

  3. Preliminary research questions: How long would these last? And if lost, is there a backup somewhere? Is this what the author intended?

  4. Historical integrity: since tweets are considered the first draft of history, the historical integrity of the tweets could be compromised.

  5. People rely on social media for the most up-to-date information

  6. The life cycle of a social post

  7. The life cycle of a social post
      • tweets

  8. The life cycle of a social post
      • Links to tweets

  9. The life cycle of a social post
      • Links to tweets
      • What the reader receives:
          • Same state the author intended

  10. The life cycle of a social post
      • Links to tweets
      • What the reader receives:
          • Same state the author intended
          • The resource has disappeared

  11. The life cycle of a social post
      • Links to tweets
      • What the reader receives:
          • Same state the author intended
          • The resource has disappeared
          • The resource has changed

  12. Resource’s possibilities
      • What the reader receives:
          • Same state the author intended
          • The resource has disappeared
          • The resource has changed

  13. Resource’s possibilities
      • What the reader receives:
          • Same state the author intended
          • The resource has disappeared
          • The resource has changed
      • A bigger problem, since the reader might not know.

  14. We could lose the linked resource

  15. Or the resource could change. The attack on the embassy was in February 2013.

  16. Why do we want to detect the Author’s Temporal Intention?
      • Match and convey the intended information.
      • Notify:
          • the author that the resource is prone to change.
          • the reader that the resource has changed.
      • Preserve the resource by pushing snapshots into the archive automatically.
      • Retrieve the closest archived version to maintain consistency.
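The "Retrieve" goal on slide 16 can be illustrated with a small sketch. It assumes the capture times of one archived resource are already available as a sorted list of datetimes; the function name `closest_snapshot` and the sample dates are hypothetical and are not taken from the talk or from any archive API.

```python
from bisect import bisect_left
from datetime import datetime


def closest_snapshot(snapshots, target):
    """Return the capture time nearest to `target`.

    `snapshots` is a list of datetimes sorted in ascending order
    (hypothetical capture times of one archived resource).
    """
    if not snapshots:
        raise ValueError("no archived snapshots available")
    i = bisect_left(snapshots, target)
    if i == 0:
        return snapshots[0]       # target precedes every capture
    if i == len(snapshots):
        return snapshots[-1]      # target follows every capture
    before, after = snapshots[i - 1], snapshots[i]
    # On a tie, prefer the earlier capture.
    return before if target - before <= after - target else after


# Illustrative data only: three made-up capture times and a made-up tweet time.
captures = [datetime(2012, 9, 1), datetime(2013, 2, 10), datetime(2013, 6, 5)]
tweet_time = datetime(2013, 2, 14)
nearest = closest_snapshot(captures, tweet_time)
```

Preferring the earlier capture on a tie is a design choice made for this sketch only; the slides do not say how ties should be resolved.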

  17. Our investigation angles
      • The state of the archived content
      • The age of the shared resource
      • The states of the resource:
          • Missing from the live web
          • Changed from what the author intended to share
      • Detect the author’s intention and collect a dataset
      • Model this intention
      • Create a time-based navigation tool to match the predicted intention

  18. Estimating web archiving coverage
      • Goal: estimate how much of the public web is present in the public archives and how many copies are available.
      • Action: getting 4 different datasets from 4 different sources:
          • Search engine indices
          • Bit.ly
          • DMOZ
          • Delicious
      • Results: [table; courtesy of Ahmed AlSum, JCDL 2011]
      • Publications:
          • How much of the web is archived? JCDL '11

  19. Our investigation angles

  20. The timeline of the resource

  21. Timestamps accumulation

  22. Our investigation angles

  23. Six socially significant events
      • From Twitter, websites, books:
          • The Egyptian revolution.
      • From Twitter only:
          • Stanford’s SNAP dataset:
              • Iranian elections.
              • H1N1 virus outbreak.
              • Michael Jackson’s death.
              • Obama’s Nobel Peace Prize.
          • Twitter API:
              • The Syrian uprising.

  24. Resources missing & archived

  25. Revisiting after a year…

  26. Measured vs. predicted

  27. Interesting phenomenon: reappearance on the live web and disappearance from the archives

  28. Reappearing and disappearance predictions

  29. Our investigation angles

  30. Temporal Intention Relevancy Model (TIRM)
      • Between t_tweet and t_click:
          • The linked resource could have:
              • Changed
              • Not changed
          • The tweet and the linked resource could be:
              • Still relevant
              • No longer relevant

  31. Resource is changed but relevant
      • The resource changed
      • But it is still relevant
      • Intention: need the current version of the resource at any time

  32. Relevancy and intention mapping
      • Changed and relevant: Current

  33. Resource is changed and not relevant
      • The resource changed
      • And it is no longer relevant
      • Intention: need the past version of the resource at any time

  34. Relevancy and intention mapping
      • Changed and relevant: Current
      • Changed and not relevant: Past

  35. Resource is not changed and relevant
      • The resource is not changed
      • And it is relevant
      • Intention: need the past version of the resource at any time

  36. Relevancy and intention mapping
      • Changed and relevant: Current
      • Changed and not relevant: Past
      • Not changed and relevant: Past

  37. Resource is not changed and not relevant
      • The resource is not changed
      • But it is not relevant
      • Intention: I am not sure which version of the resource I need

  38. Relevancy and intention mapping
      • Changed and relevant: Current
      • Changed and not relevant: Past
      • Not changed and relevant: Past
      • Not changed and not relevant: Not Sure
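The four cases walked through on slides 31 to 38 amount to a lookup on two yes/no observations. A minimal sketch of that mapping follows; the names `Intention` and `tirm_intention` are illustrative and do not come from the talk.

```python
from enum import Enum


class Intention(Enum):
    CURRENT = "current"
    PAST = "past"
    NOT_SURE = "not sure"


def tirm_intention(changed: bool, relevant: bool) -> Intention:
    """Map the (changed, relevant) pair to the intention stated on the slides."""
    if changed:
        # Changed and relevant -> current version; changed and not relevant -> past.
        return Intention.CURRENT if relevant else Intention.PAST
    # Not changed and relevant -> past; not changed and not relevant -> not sure.
    return Intention.PAST if relevant else Intention.NOT_SURE
```

For example, `tirm_intention(changed=True, relevant=False)` returns `Intention.PAST`, matching slide 33.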

  39. Our investigation angles

  40. Feature extraction
      • For each tweet we perform:
          • Link analysis
          • Social media mining
          • Archival existence
          • Sentiment analysis
          • Content similarity
          • Entity identification

  41. Modeling and classification using Mechanical Turk
      • To remove confusion we removed the close calls, leaving 898 instances.

  42. The trained classifier
      • From the feature extraction phase we extracted 39 different features to train the classifier.
      • Using 10-fold cross-validation, the cost-sensitive classifier based on Random Forests gave the highest success rate: 90.32%.
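To make the evaluation protocol on slide 42 concrete, here is a minimal pure-Python sketch of k-fold cross-validation. It is not the authors' pipeline: the learner is a placeholder majority-class predictor rather than a cost-sensitive Random Forest, the data is synthetic, and every name (`k_fold_indices`, `cross_val_accuracy`, `fit_majority`) is an assumption made for illustration. Only the fold count (10) and the instance count (898) echo the slides.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    if n < k:
        raise ValueError("need at least k instances")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, features, labels, k=10, seed=0):
    """Average held-out accuracy over k folds.

    `fit(train_features, train_labels)` must return a `predict(x)` function.
    """
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(features[i]) == labels[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)


def fit_majority(train_features, train_labels):
    """Placeholder learner: always predicts the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda x: majority


# Synthetic stand-in data: 898 instances with one dummy feature each.
features = [[i % 7] for i in range(898)]
labels = ["past" if i % 3 else "current" for i in range(898)]
accuracy = cross_val_accuracy(fit_majority, features, labels, k=10)
```

Each instance is held out exactly once, so the averaged score reflects performance on data the learner did not see during training. Swapping `fit_majority` for a real classifier keeps the protocol unchanged.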

  43. Testing the model

  44. Our investigation angles

  45. TimeLord Navigator

  46. Thanks! Hany SalahEldeen, Web Science & Digital Libraries, Old Dominion University. Email: hany@cs.odu.edu, @hanysalaheldeen

  47. TimeLord Navigator demo: www.cnn.com, www.bbc.com

  48. Evaluation

  49. Actual vs. estimated dates
