Digital histories. Workshop 1: introduction. K. Navickas
Searching for information and making notes before digitised sources
• Dewey catalogue
• County record office; library; National Archives
• My notes, organised by where I read the information and by that repository's cataloguing system
• A bit of a gamble whether I find what I am looking for
• Card catalogue
• Boxes of books and archives
Searching for information and making notes after digitisation: googlerisation?
• Digitised archives and newspapers from all over the world
• Download them to my own computer
• Write notes on the computer, or annotate the files
Catalogue system: online
• could be determined by the original order of the repository
• could be a completely new system
• could be no system at all, e.g. Flickr collections with crowdsourced tagging
Which archives are available depends on what has been digitised, which in turn depends on funding, conservation and volunteers
Pay to access?
Still a gamble to find what I want…
Metadata old style: Leonard Bloomfield, Language (New York: Holt, Rinehart & Winston, 1933)
http://exchanges.history-compass.com/2010/03/30/doing-local-history-card-catalogues-manual-searches-and-historiography/
Metadata now: view the page source and see the HTML or the XML schema
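As a rough illustration of what "view page source" can reveal, the sketch below pulls the title and any meta tags out of a page's HTML using only Python's standard library. The sample page and its Dublin Core style field names (DC.creator, DC.date) are invented for illustration and are an assumption, not taken from any real catalogue; real sites label their metadata in many different ways.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect the <title> text and <meta name/property=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "content" in attrs:
            # Metadata is usually keyed by "name"; some schemes use "property".
            key = attrs.get("name") or attrs.get("property")
            if key:
                self.metadata[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract_metadata(page_source):
    """Return a dict of metadata fields found in an HTML string."""
    parser = MetaTagParser()
    parser.feed(page_source)
    parser.close()
    return parser.metadata


# Hypothetical page source, invented for this example.
SAMPLE_PAGE = """<html><head>
<title>Language</title>
<meta name="DC.creator" content="Bloomfield, Leonard">
<meta name="DC.date" content="1933">
</head><body><p>...</p></body></html>"""

if __name__ == "__main__":
    for key, value in extract_metadata(SAMPLE_PAGE).items():
        print(f"{key}: {value}")
```

The same author, title and date that sat on a catalogue card are here, but as machine-readable fields that a program can harvest from many pages at once.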
FAQ for the Burney collection of 17th and 18th century newspapers http://gdc.gale.com/products/17th-and-18th-century-burney-collection-newspapers/acquire-implement/faqs/#raw-text
This Day wvii) he offered to publickTnfivtiG , a: a commudiot.s ,]oom, oppoflte the New Inn, Surry S .l of 'W;liLiniter Bridge, at is. each, the Ethiopian Sav:;ge. I his aftrinihiriag Animal is of a different Species fi-om any ever feen in Etirope, and feenrs to hie the Link betwcen tile R;:;ion:l and 1,> ut, Ci-eation, as he is a ftrikingRelinbl!ance of the Huma: Species, and is allQwcd to liothleg~reateft *:urioG~y ever exhijib-J. in Lnoiantd. lligl: \Varer- tllsDdy at l om.,. n-Isridge, ilt C illiiitcsaIter S inlthc Mood nig, andl at Minotes .:ter 5 in rtieAhternonit. B> 111; :>ocsZ 109, 4 lIper (t. t I -6 , 6?', k a ' hIdia ditto, - 4 per Ct. 17',"', 7' T South Sea b3itto, - Ind. Bctin;, Ss d IOS. Dif, Ditto Old Ann. `o § Navy and Vi&t. Pi:is.- )itto New Ann. - Long Annuities, ita I 3 :ier Ct. 13k. red. 6 oEl a 6ai c Short ditto x77S, 3 C rCt. Cf. 6xz a i Scrip, 62 Ditto 1724, - Omniniu, - Ditto 1751, A-nti. 17ZS, 13 3 . AnCt. 7_
OCR text from the Burney newspaper collection: what on earth is it saying?
OCR is rubbish with tables and old fonts.
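One recurring pattern in OCR of eighteenth-century print is the long s being read as "f", which would explain forms such as "feen" and "ftriking" in the sample. The sketch below tries swapping "f" for "s" and keeps a variant only if it appears in a word list. The tiny word list and the function name are hypothetical, chosen for illustration; a real check would need a full period dictionary, and this does nothing for the broken table of stock prices.

```python
import re
from itertools import product

# Tiny stand-in word list (an assumption for this example only).
KNOWN_WORDS = {
    "species", "from", "any", "ever", "seen", "in",
    "striking", "animal", "different", "the", "of", "a", "is",
}


def repair_long_s(token, known_words, max_f=6):
    """Return a known word made by swapping some 'f' for 's', else the token.

    Repaired words come back in lower case; unrecognised tokens are
    returned unchanged.
    """
    lower = token.lower()
    if lower in known_words:
        return token
    f_positions = [i for i, ch in enumerate(lower) if ch == "f"]
    if not f_positions or len(f_positions) > max_f:
        return token
    # Try every combination of keeping 'f' or replacing it with 's'.
    for choice in product("fs", repeat=len(f_positions)):
        chars = list(lower)
        for pos, ch in zip(f_positions, choice):
            chars[pos] = ch
        candidate = "".join(chars)
        if candidate in known_words:
            return candidate
    return token


if __name__ == "__main__":
    # A fragment copied from the OCR sample.
    fragment = "Species fi-om any ever feen in Etirope"
    tokens = re.findall(r"[A-Za-z]+", fragment)
    print([repair_long_s(t, KNOWN_WORDS) for t in tokens])
```

Only "feen" is recovered here; "fi-om" and "Etirope" are damaged in ways a simple letter swap cannot fix, which is a fair picture of how much manual correction such text still needs.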
Big data?
• From ‘computing and history’ to data and text mining, corpus linguistics, topic modelling
• Are we moving from ‘close reading’ to ‘distant reading’?
Methods:
• N-grams: measuring how often a word or phrase occurs, as a proportion of all the words or phrases in a corpus of texts
• Topic modelling: estimating the probability that a group of words occurs together within a text
Study:
• ‘culturomics’
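The n-gram method listed above can be sketched in a few lines: count how often a word or phrase occurs and divide by the total number of n-grams of the same length in the corpus. The two-year corpus below is invented purely for illustration, and the function names are hypothetical; this is a minimal sketch of the idea, not how any particular n-gram viewer is implemented.

```python
import re


def tokenize(text):
    """Lower-case a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_proportion(texts, phrase):
    """Share of all n-grams in `texts` that match `phrase` exactly."""
    target_tokens = tokenize(phrase)
    n = len(target_tokens)
    if n == 0:
        return 0.0
    target = " ".join(target_tokens)
    total = 0
    hits = 0
    for text in texts:
        grams = ngrams(tokenize(text), n)
        total += len(grams)
        hits += grams.count(target)
    return hits / total if total else 0.0


# Invented toy corpus keyed by year (not real newspaper text).
CORPUS = {
    1790: ["the meeting was held at the new inn", "a riot near the bridge"],
    1819: ["the reform meeting was dispersed",
           "reform and the meeting of reformers"],
}

if __name__ == "__main__":
    for year, texts in sorted(CORPUS.items()):
        print(year, round(ngram_proportion(texts, "reform"), 3))
```

Plotting that proportion year by year is the basic move behind ‘distant reading’ of a large corpus. The count matches exact tokens only, so "reformers" is not counted as "reform"; whether to merge such forms is a research decision, not a technical one.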