History of CERN CDRom
Aim:
• Develop a platform-independent application, destined for CDRom, to search & navigate through CERN historical material
Technology chosen => stand-alone Java application
Jennifer BUTIN - History of CERN CDRom
Material
Period covered by application: 1950-1962
Material on CDS:
• Photos (1952-1962)
• Press cuttings (1950-1960)
Material not on CDS:
• Annual Reports (1955-1962)
Photos & Press cuttings
• Aim: Create database for bibliographic info
• Task: transform ALEPH data...

PHO 0001023 BASE L 81
PHO 0001023 YR L 1958
PHO 0001023 SW L $$sn$$w9850
PHO 0001023 ER L CERN-CO-5800983
PHO 0001023 OS L 983 (1958)
PHO 0001023 TI L The Mercury computer
PHO 0001023 IM L $$d12 Sep 1958
PHO 0001023 SU L Computers and Control Rooms
PHO 0001023 EXT L $$xhttp://preprints.cern.ch/cgi-bin/setlink?base=PHO&categ=photo-co&id=5800983$$nAccess to the pictures
PHO 0001023 ME L FILM
PHO 0001023 NI L Archive Collection
PHO 0001023 ICO L http://preprints.cern.ch/photo/photo-co/5800983.gif
PHO 0001023 AB L Built by Ferranti (UK), the Mercury was the first "central" computer of CERN. It was housed in building 2.
PHO 0001023 CATZZ L $$aCM$$b50$$c036136$$d036136
• ...to flat file database for application
• Transformation scripts written in Perl

Photo BI database: title.txt, date.txt, eref.txt, abstract.txt, subject.txt, image.txt
Press BI database: title.txt, date.txt, eref.txt, journal.txt, author.txt, subject.txt, image.txt

e.g. date.txt:
2 Nov 1953
20 Jun 1953
4 Nov 1953
1 Dec 1953
29 Dec 1953
25 Dec 1953
7 Dec 1953
1 Dec 1953
2 Nov 1953
...
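The slides say the transformation scripts were written in Perl and do not show them. Purely as an illustration, here is a minimal Java sketch of the same idea: read ALEPH lines of the form "base, record number, tag, L, value", group them by record number, and pull out one column per field. The class name AlephFlatten, the method names, the one-line-per-record layout, and the guess that date.txt comes from the IM field with its leading "$$d" marker stripped are all assumptions, not details taken from the original scripts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the original project used Perl scripts for this step.
final class AlephFlatten {

    /** Parses "BASE RECNO TAG L value" lines into recno -> (tag -> value). */
    static Map<String, Map<String, String>> parse(List<String> lines) {
        Map<String, Map<String, String>> records = new LinkedHashMap<>();
        for (String line : lines) {
            // Limit of 5 keeps spaces inside the value intact.
            String[] parts = line.trim().split("\\s+", 5);
            if (parts.length < 5) {
                continue; // no value on this line, skip it
            }
            // A repeated tag within one record overwrites the earlier value.
            records.computeIfAbsent(parts[1], k -> new LinkedHashMap<>())
                   .put(parts[2], parts[4]);
        }
        return records;
    }

    /** One entry per record for the given tag; a single leading "$$x" marker is dropped. */
    static List<String> column(Map<String, Map<String, String>> records, String tag) {
        List<String> out = new ArrayList<>();
        for (Map<String, String> fields : records.values()) {
            String value = fields.getOrDefault(tag, "");
            out.add(value.replaceFirst("^\\$\\$[a-z0-9]", ""));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "PHO 0001023 YR L 1958",
            "PHO 0001023 TI L The Mercury computer",
            "PHO 0001023 IM L $$d12 Sep 1958");
        Map<String, Map<String, String>> records = parse(sample);
        System.out.println(column(records, "TI")); // [The Mercury computer]
        System.out.println(column(records, "IM")); // [12 Sep 1958]
    }
}
```

Writing each returned list to its own file, one line per record, would give parallel files in which line N of every file belongs to the same record, which is one plausible reading of the title.txt / date.txt layout.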
Search: photos & press
• Search on bibliographic info only
• Search engine in Java
Types of search:
• Simple - substring in 1 field
• Advanced - 2 substrings in chosen fields + boolean operator (And, Or, Not)
• + Option to specify a subject category
• + Option for case-sensitive search
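The two search types can be sketched roughly as follows. This is not the application's actual search engine: the class FieldSearch, its method names, and the reading of "Not" as "first substring present, second absent" are assumptions made for illustration. Fields are modelled as parallel lists, in the spirit of the flat files, and the second record is placeholder data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of substring search with an optional boolean operator.
final class FieldSearch {
    enum Op { AND, OR, NOT }

    /** Simple search: substring test on one field value. */
    static boolean contains(String field, String needle, boolean caseSensitive) {
        if (caseSensitive) {
            return field.contains(needle);
        }
        return field.toLowerCase(Locale.ROOT).contains(needle.toLowerCase(Locale.ROOT));
    }

    /** Combines two substring tests; NOT is read as "first and not second" (assumption). */
    static boolean combine(boolean first, boolean second, Op op) {
        switch (op) {
            case AND: return first && second;
            case OR:  return first || second;
            default:  return first && !second;
        }
    }

    /** Advanced search: returns the indices of records whose two fields satisfy the operator. */
    static List<Integer> advanced(List<String> fieldA, String needleA,
                                  List<String> fieldB, String needleB,
                                  Op op, boolean caseSensitive) {
        List<Integer> hits = new ArrayList<>();
        int n = Math.min(fieldA.size(), fieldB.size());
        for (int i = 0; i < n; i++) {
            boolean a = contains(fieldA.get(i), needleA, caseSensitive);
            boolean b = contains(fieldB.get(i), needleB, caseSensitive);
            if (combine(a, b, op)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // First record is from the slides; the second is placeholder data.
        List<String> titles = Arrays.asList("The Mercury computer", "placeholder title");
        List<String> subjects = Arrays.asList("Computers and Control Rooms", "placeholder subject");
        System.out.println(advanced(titles, "mercury", subjects, "control", Op.AND, false)); // [0]
        System.out.println(advanced(titles, "mercury", subjects, "control", Op.AND, true));  // []
    }
}
```

A subject-category option could be treated as one more substring test on the subject field, applied before the operator is evaluated.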
Overview of search classes
[Class diagram. Classes: Base, PhoBase, PreBase, Record, PhoRecord, PreRecord, LinkList, Search, SimpleSearch, BinSearch, AndSearch, OrSearch, NotSearch. Legend: "creates", "defined on", "Abstract class".]
Overview of Display classes
[Class diagram. Classes: SearchApp, RecDisplay, LinkList, ImageDisplayer, JpegFrame, RecGraphics, Record, PreRecord, PhoRecord, ScrollableCanvas. Legend: "creates" (with multiplicity 1), "Abstract class".]
Annual Reports
Task: organise scanning of documents + design search/navigation
2 possibilities in application:
• navigation via table of contents
• full text search
Annual Reports: Navigation via table of contents
[Workflow diagram. Steps, in order: Scan TOC, Acrobat Capture OCR, Perl script, Human editing, Perl script. Outputs, in order: tif, ascii, template txt file, new template txt file, Java classes.]
Annual Reports: Full text search
[Workflow diagram. Steps, in order: Scan chapters, Acrobat Capture (OCR), Acrobat Catalog. Outputs, in order: tif, pdf Normal, Index + pdf Normal.]
• Search with Acrobat Reader + search