My Website Was Lost, But Now It’s Found

My Website Was Lost, But Now It’s Found. Frank McCown CS 110 – Intro to Computer Science April 23, 2007. Frank McCown. Education Ph.D. in Computer Science – Old Dominion Univ. (2007 expected) M.S. in Computer Science – Univ of Arkansas in Little Rock (2002)

Presentation Transcript


  1. My Website Was Lost, But Now It’s Found Frank McCown CS 110 – Intro to Computer Science April 23, 2007

  2. Frank McCown • Education • Ph.D. in Computer Science – Old Dominion Univ. (2007 expected) • M.S. in Computer Science – Univ of Arkansas in Little Rock (2002) • B.S. in Computer Science – Harding University (1996) • Work Experience • 1997-2004 – Instructor of CS at Harding University (Searcy, AR) • 1996-1997 – Software Eng for Lockheed Martin (Denver, CO) • 1995 – Software Engineer Intern for Auto-trol (Denver, CO) • Honors • 2007 – Outstanding Graduate Research Assistant • 2006 – College of Sciences Dissertation Fellowship • 2005 – Outstanding Graduate Assistant • 2004 – Dominion Scholar

  3. Industry vs. Academia • 2000 survey by The Scientist magazine asked its readers: Overall, which environment do you prefer? (Chart categories: No preference, Academia, Industry) • 73% of survey respondents had held research positions in industry and academia. • Source: http://www.the-scientist.com/2001/4/16/28/2/

  4. Industry vs. Academia • Movement • Academia → Industry is common • Industry → Academia is very uncommon • Flexibility • Schedule • Focus • Compensation

  5. Research Interests • Digital preservation • Will we be able to see our websites 20 years from now? • Web crawling • How can search engines and web archives duplicate/download our websites more efficiently and effectively? • Search engines • How much and what content do commercial search engines index and cache? • How synchronized are search engine APIs with what the general user sees?

  6. Black hat: http://img.webpronews.com/securitypronews/110705blackhat.jpg • Virus image: http://polarboing.com/images/topics/misc/story.computer.virus_1137794805.jpg • Hard drive: http://www.datarecoveryspecialist.com/images/head-crash-2.jpg

  7. Web Infrastructure

  8. Cached Image

  9. Warrick • First developed in the fall of 2005 • Available for download at http://www.cs.odu.edu/~fmccown/warrick/ • www2006.org – first lost website reconstructed (Nov 2005) • DCkickball.org – first website someone else reconstructed without our help (late Jan 2006) • www.iclnet.org – first website we reconstructed for someone else (mid Mar 2006) • Internet Archive officially endorses Warrick (mid Mar 2006)

  10. Warrick-related Publications • Frank McCown, Norou Diawara, and Michael L. Nelson. Factors Affecting Website Reconstruction from the Web Infrastructure. JCDL 2007. June 2007. Vancouver, British Columbia, Canada. • Catherine C. Marshall, Frank McCown, and Michael L. Nelson. Evaluating Personal Archiving Strategies for Internet-based Information. IS&T Archiving 2007. May 2007. Arlington, Virginia, USA. • Frank McCown and Michael L. Nelson. Characterization of Search Engine Caches. IS&T Archiving 2007. May 2007. Arlington, Virginia, USA. • Frank McCown, Joan A. Smith, Michael L. Nelson, and Johan Bollen. Lazy Preservation: Reconstructing Websites by Crawling the Crawlers. WIDM 2006. November 2006. Arlington, Virginia, USA. • Frank McCown and Michael L. Nelson. Evaluation of Crawling Policies for a Web-Repository Crawler. HYPERTEXT 2006. August 2006. Odense, Denmark.

  11. Search Engine APIs • Frank McCown and Michael L. Nelson. Poster: Search Engines and Their Public Interfaces: Which APIs are the Most Synchronized? WWW 2007. • Frank McCown and Michael L. Nelson. Agreeing to Disagree: Search Engines and their Public Interfaces. JCDL 2007.

  12. Thank You • Questions?
