Datasets of the KB. Steven Claeyssens – 19 September 2013. Datasets. General characteristics. Examples. Technical characteristics. Legal aspects. Datasets: general characteristics. Collections of digital data (metadata, data).
Datasets of the KB • Steven Claeyssens – 19 September 2013
Datasets
• General characteristics
• Examples
• Technical characteristics
• Legal aspects
Datasets: general characteristics
• Collections of digital data (metadata, data)
• Born digital or the result of mass digitisation (reborn digital)
• Dutch cultural heritage material, primarily text
• New units of publication (distant reading)
Datasets: examples
1. Early Dutch Books Online (EDBO)
• Books published in the Netherlands, 1781-1800
• 11,240 volumes; 9,710 titles
• 2 million pages
• www.earlydutchbooksonline.nl
Datasets: examples
2. Historical Newspapers
• Published in the Netherlands and former colonies, 1618-1995
• 1,457 titles
• ca. 9 million pages
• ca. 99 million articles
• kranten.kb.nl
Datasets: examples
3. Periodicals
• Published in the Netherlands, 1850-1940
• 80 titles
• 1.5 million pages
• tijdschriften.kb.nl
Datasets: technical characteristics
• Documents in PDF and/or JPEG
• Metadata in Dublin Core
• OCR with word coordinates
Machine-readable access
• (Semi-)structured in XML
• Access via SRU and OAI-PMH
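As a rough illustration of what machine-readable access via SRU and OAI-PMH with Dublin Core metadata can look like, here is a minimal Python sketch. It only builds request URLs from the standard SRU and OAI-PMH parameters and pulls Dublin Core fields out of an XML fragment. The base URLs, the function names, and the sample record are all assumptions made for illustration; they are not the KB's actual endpoints or data, so check the KB data services documentation for the real addresses and record layout.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoints: placeholders, not the KB's real service addresses.
SRU_BASE = "https://example.org/sru"
OAI_BASE = "https://example.org/oai"

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def sru_search_url(query, max_records=10, start_record=1):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": query,
        "startRecord": start_record,
        "maximumRecords": max_records,
    }
    return SRU_BASE + "?" + urlencode(params)


def oai_list_records_url(set_spec=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return OAI_BASE + "?" + urlencode(params)


def dublin_core_fields(xml_text):
    """Collect every Dublin Core element in an XML fragment as {name: [values]}."""
    fields = {}
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith("{" + DC_NS + "}"):
            name = element.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


# Invented sample record, for illustration only.
SAMPLE_RECORD = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example newspaper</dc:title>
  <dc:date>1850-01-01</dc:date>
  <dc:type>article</dc:type>
</record>
"""

if __name__ == "__main__":
    print(sru_search_url("Amsterdam", max_records=5))
    print(oai_list_records_url())
    print(dublin_core_fields(SAMPLE_RECORD))
```

SRU suits targeted searches (a CQL query returns matching records), while OAI-PMH suits bulk harvesting of a whole set; in both cases the response is XML that can be read with a standard parser.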
Datasets: legal aspects
• In theory: as ‘open’ as possible
• Public Domain = Public Domain
• Metadata by KB = CC0
• In practice: most datasets are hybrid
• Solution: negotiating rights for KB site and for researchers
Thank you for your attention. Questions?
• www.kb.nl/dataservices
• E dataservices@kb.nl
• E steven.claeyssens@kb.nl
• T @sclaeyssens