Reconstructing the Past with MediaWiki: Programmatic Issues and Solutions




  1. Reconstructing the Past with MediaWiki: Programmatic Issues and Solutions
     Shawn M. Jones, sjone@cs.odu.edu
     Old Dominion University

  2. Reconstructing the Past with the Internet Archive
     • Images, HTML, JavaScript, CSS
     • Our goal: temporal coherence. Make the page look as it did at the time it was archived.

  3. Some Results from the Internet Archive Are Lacking
     • Images change between the time the Archive crawls the main page and the time it gets to the images.
     • Sometimes embedded images are missing by the time the Archive gets to them.
     • Sometimes the page is designed with a specific browser in mind.
     Image from “A Framework for Evaluation of Composite Memento Temporal Coherence” by S. Ainsworth, M. L. Nelson, H. Van de Sompel. http://arxiv.org/abs/1402.0928

  4. MediaWiki Shouldn’t Have This Problem
     • HTML, Images, JavaScript, CSS

  5. What we’re not doing

  6. Interest in Reconstructing the Past With MediaWiki

  7. Simplified Memento Overview

  8. Rules for Reconstructing the Past with MediaWiki
     • Do not modify any existing MediaWiki code!
     • Conform to MediaWiki coding standards
     • And…

  9. Reconstructing the Past

  10. Accessing Old Article Text
      • The oldid argument references a revision of a page within MediaWiki's database.
      • Merely visiting the URI with the oldid gives you the text content of the page as it existed at that revision.
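
A minimal Python sketch of building such a URI. The function name oldid_uri is hypothetical; the index.php?title=...&oldid=... form and the sample values are taken from the Location header shown in the backup slides.

```python
from urllib.parse import urlencode


def oldid_uri(index_php: str, title: str, oldid: int) -> str:
    """Build the URI of one stored revision of a wiki page."""
    # 'title' names the page; 'oldid' selects the revision in the database.
    query = urlencode({"title": title, "oldid": oldid})
    return f"{index_php}?{query}"


if __name__ == "__main__":
    # Sample values from the backup slides of this presentation.
    print(oldid_uri("http://ws-dl-05.cs.odu.edu/demo/index.php",
                    "Daenerys_Targaryen", 1499))
```

Fetching the printed URI with any HTTP client should return the article text as of that revision, which is the behavior the slide describes.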

  11. Reconstructing the Past

  12. Including the Right Template
      This gives us:
      • $title - the Title object for the given page
      • $parser - the Parser object for the given page
      • $id - the revision ID (oldid) for the Template page
      Using $parser and $title, we can change the $id and fetch an old revision of the Template.
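
The extension does this in PHP through $parser and $title, and that code is not part of this transcript. As an illustration only, the selection step might look like the Python sketch below. It assumes the wanted Template revision is the newest one saved at or before the time of the article revision being viewed. The names Revision and revision_as_of and the sample history are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from typing import List, Optional, Tuple

# (timestamp, revision ID); a hypothetical stand-in for a page's history.
Revision = Tuple[datetime, int]


def revision_as_of(history: List[Revision], when: datetime) -> Optional[int]:
    """Return the ID of the newest revision saved at or before `when`.

    Returns None if the page did not exist yet at that time.
    """
    ordered = sorted(history)
    timestamps = [ts for ts, _ in ordered]
    # Index of the first revision that is strictly newer than `when`.
    pos = bisect_right(timestamps, when)
    if pos == 0:
        return None
    return ordered[pos - 1][1]


if __name__ == "__main__":
    utc = timezone.utc
    template_history = [
        (datetime(2013, 1, 1, tzinfo=utc), 10),
        (datetime(2013, 6, 1, tzinfo=utc), 20),
        (datetime(2014, 1, 1, tzinfo=utc), 30),
    ]
    # An article revision from 6 June 2013 should use template revision 20.
    print(revision_as_of(template_history, datetime(2013, 6, 6, tzinfo=utc)))
```

Sorting first keeps the lookup correct even if the history arrives newest-first.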

  13. Reconstructing the Past

  14. But What About Images?
      • This map is important to understanding the content of this article.
      • This image is changed as the article is changed, to reflect its content.

  15. It’s the same map if we look at the June 6, 2013 revision now
      • Users can't view this embedded resource as it looked in June 2013 while reading the article from that time period.

  16. What should have happened
      • This is the map from June 2013 that should have been displayed.
      • This is the current map.
      • The content of the article won't match the data in this visual aid, possibly confusing a user who wanted historical information on this topic.

  17. We Tried To Solve This
      • Upon further inspection of the code in MediaWiki, the $time argument from this function is never used, as detailed here.

  18. We Just Solved This
      • Upon further inspection of the code in MediaWiki, the $file argument’s getHistory() function can be used to acquire previous revisions of images.

  19. Reconstructing the Past

  20. What about CSS/JavaScript?
      • The present CSS of this page conflicts with the past Template.

  21. We Couldn’t Solve This
      • The data is present, but we could not find any way for an extension to access or render it.

  22. Recap on Reconstructing the Past

  23. Uniform solution
      • RFC 7089, Memento, was designed to provide uniform access to past versions of all resources on the Web
      • Memento provides a web standard to access these resources
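
Under RFC 7089, a client asks a TimeGate for a past version by sending the desired time in an Accept-Datetime request header. The Python sketch below only builds that header and makes no network call. The function name timegate_headers is hypothetical, and the sample time is the Memento-Datetime value from the backup slides.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def timegate_headers(when: datetime) -> Dict[str, str]:
    """Return request headers asking a TimeGate for the version near `when`."""
    if when.tzinfo is None:
        raise ValueError("`when` must be timezone-aware")
    # Accept-Datetime uses the HTTP date format, always expressed in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": stamp}


if __name__ == "__main__":
    wanted = datetime(2007, 4, 22, 15, 1, 20, tzinfo=timezone.utc)
    print(timegate_headers(wanted))
    # prints {'Accept-Datetime': 'Sun, 22 Apr 2007 15:01:20 GMT'}
```

These headers would be sent in a GET to the URI that a page advertises with rel="timegate"; the TimeGate then redirects to the matching past version.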

  24. Resources
      • Memento Protocol: http://tools.ietf.org/html/rfc7089
      • Memento Website: http://www.mementoweb.org/
      • Memento MediaWiki Extension: http://www.mediawiki.org/wiki/Extension:Memento
      • Memento Chrome Extension: http://bit.ly/memento-for-chrome
      • More details: http://ws-dl.blogspot.com/2014/04/2014-04-01-yesterdays-wiki-page-todays.html
      • Contact me: sjone@cs.odu.edu

  25. Backup Slides

  26. Sample URI-R (Step 1) HTTP Response
      HTTP/1.1 200 OK
      Date: Sun, 25 May 2014 21:39:02 GMT
      Server: Apache
      X-Content-Type-Options: nosniff
      Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version", <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>; rel="timegate", <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap"; type="application/link-format"
      Content-language: en
      Vary: Accept-Encoding,Cookie
      Cache-Control: s-maxage=18000, must-revalidate, max-age=0
      Last-Modified: Sat, 17 May 2014 16:48:28 GMT
      Connection: close
      Content-Type: text/html; charset=UTF-8

  27. Sample URI-G (Step 2) HTTP Response
      HTTP/1.1 302 Found
      Date: Sun, 25 May 2014 21:43:08 GMT
      Server: Apache
      X-Content-Type-Options: nosniff
      Vary: Accept-Encoding, Accept-Datetime
      Location: http://ws-dl-05.cs.odu.edu/demo/index.php?title=Daenerys_Targaryen&oldid=1499
      Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap"; type="application/link-format", <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version"
      Connection: close
      Content-Type: text/html; charset=UTF-8

  28. Sample URI-M (Step 3) HTTP Response
      HTTP/1.1 200 OK
      Date: Sun, 25 May 2014 21:46:12 GMT
      Server: Apache
      X-Content-Type-Options: nosniff
      Memento-Datetime: Sun, 22 Apr 2007 15:01:20 GMT
      Link: <http://ws-dl-05.cs.odu.edu/demo/index.php/Daenerys_Targaryen>; rel="original latest-version", <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeGate/Daenerys_Targaryen>; rel="timegate", <http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Daenerys_Targaryen>; rel="timemap"; type="application/link-format"
      Content-language: en
      Vary: Accept-Encoding,Cookie
      Expires: Thu, 01 Jan 1970 00:00:00 GMT
      Cache-Control: private, must-revalidate, max-age=0
      Connection: close
      Content-Type: text/html; charset=UTF-8
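
A client moving through these three steps has to read the Link header to find the TimeGate, TimeMap, and original URIs. The Python sketch below is a simplified parser that assumes every parameter value is double-quoted, as in the sample responses; it is not a complete implementation of the Link header grammar. The function name links_by_relation is hypothetical.

```python
import re
from typing import Dict

# One link-value: <URI> followed by zero or more ;name="value" parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)')
_REL = re.compile(r'rel\s*=\s*"([^"]*)"')


def links_by_relation(link_header: str) -> Dict[str, str]:
    """Map each relation type in a Link header value to its target URI."""
    links: Dict[str, str] = {}
    for uri, params in _LINK.findall(link_header):
        match = _REL.search(params)
        if match is None:
            continue
        # A rel value may carry several space-separated relation types.
        for rel in match.group(1).split():
            links[rel] = uri
    return links


if __name__ == "__main__":
    base = "http://ws-dl-05.cs.odu.edu/demo/index.php"
    header = (
        f'<{base}/Special:TimeMap/Daenerys_Targaryen>; rel="timemap"; '
        f'type="application/link-format", '
        f'<{base}/Daenerys_Targaryen>; rel="original latest-version"'
    )
    for rel, uri in links_by_relation(header).items():
        print(rel, uri)
```

Splitting the rel value on whitespace lets "original latest-version" register under both relation types, matching how the sample responses combine them.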
