Separating the wheat from the chaff: Identifying key elements in the NLA .au domain harvest
Preservation for Ongoing Accessibility research group: Professor Ross Harvey, Dr Bob Pymm, Dr Anne Lloyd, Geoff Fellows, Jake Wallis
Pandora - http://pandora.nla.gov.au
- NLA solution to website preservation
- Archive of over 1.7 terabytes of data
- Selective: identifies specific sites for harvest and gains permission to archive
Internet Archive - http://www.archive.org/
- Automated
- Harvests ‘the web’
- Issues?
  - cost
  - reliability of the crawl, e.g. deep web
.au Harvest by Internet Archive
- first ran 2005, producing 6.9 terabytes of data, 185 million unique files
- Issues?
  - difficulties with certain file types
  - password-protected sites
  - difficulty in accessing the ‘deep’ web
.au Harvest September 2006
- more sophisticated crawl
- 19 terabytes of data, 596 million files
- predominant dataset for POA group
Research potential?
- digital preservation
- Australian digital culture
3 broad questions
- What are the contents of the harvests?
- How can access be provided to this content?
- What is the value of the domain harvests in relation to the NLA’s overall web preservation interests?
Blogs
- low skill threshold technology
- as barometer of engagement
- social space
- catalyst for online community
- a new and important collecting point for digital cultural heritage
Archiving and preserving blogs
- how to identify Australian specific material?
- what to capture: selection criteria? linked material?
- frequency of capture to ensure accurate representation
- provision of access to harvested blog content
Aspirations
- a conceptual framework for studies in digital anthropology
- a broadening of voices within the Australian public sphere
Questions/comments?