
Datasets of the KB

Steven Claeyssens – 19 September 2013. Topics: general characteristics, examples, technical characteristics, legal aspects.


Presentation Transcript


  1. Datasets of the KB • Steven Claeyssens – 19 September 2013

  2. Datasets • General characteristics • Examples • Technical characteristics • Legal aspects

  3. Datasets: general characteristics. Collections of digital data (metadata, data) • Born digital or the result of mass digitisation (reborn digital) • Dutch cultural heritage material, primarily text • New units of publication (distant reading)

  4. Datasets of the KB Datasets: examples 1. Early Dutch Books Online (EDBO) • Books published in the Netherlands, 1781-1800 • 11,240 volumes; 9,710 titles • 2 million pages • www.earlydutchbooksonline.nl

  5. Datasets of the KB Datasets: examples 2. Historical Newspapers • Published in the Netherlands and former colonies, 1618-1995 • 1,457 titles • ca. 9 million pages • ca. 99 million articles • kranten.kb.nl

  6. Datasets of the KB Datasets: examples 3. Periodicals • Published in the Netherlands, 1850-1940 • 80 titles • 1.5 million pages • tijdschriften.kb.nl

  7. Datasets: technical characteristics • Documents in PDF and/or JPEG • Metadata in Dublin Core • OCR with word coordinates. Machine-readable access: • (Semi-)structured in XML • Access via SRU and OAI-PMH
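
To make the machine-readable access on slide 7 more concrete, here is a minimal Python sketch of how a client might build SRU and OAI-PMH request URLs and read Dublin Core titles from an OAI-PMH response. The base URLs, the helper names and the sample record are assumptions made for illustration; they are not taken from the presentation and are not the KB's actual endpoints or data. Only the request parameters and XML namespaces come from the SRU, OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoints, for illustration only (not the KB's real service URLs).
SRU_BASE = "https://example.org/sru"
OAI_BASE = "https://example.org/oai"

# Standard namespaces for OAI-PMH 2.0 and Dublin Core elements.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def sru_search_url(query, maximum_records=10):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": query,
        "maximumRecords": maximum_records,
    }
    return SRU_BASE + "?" + urlencode(params)


def oai_list_records_url(set_spec=None):
    """Build an OAI-PMH ListRecords URL that asks for Dublin Core metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec
    return OAI_BASE + "?" + urlencode(params)


def dc_titles(oai_xml):
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(oai_xml)
    return [el.text for el in root.iterfind(".//dc:title", NS)]


# Made-up response, shaped like an OAI-PMH ListRecords reply with one record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
          <dc:date>1790</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(sru_search_url("krant"))
    print(oai_list_records_url())
    print(dc_titles(SAMPLE_RESPONSE))
```

In practice the URLs would be fetched over HTTP and the real response passed to `dc_titles`; the sketch parses an inline sample so that it runs without network access.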

  8. Datasets: legal aspects • In theory: as ‘open’ as possible • Public Domain = Public Domain • Metadata by KB = CC0 • In practice: most datasets are hybrid • Solution: negotiating rights for KB site and for researchers

  9. Thank you for your attention. Questions? • www.kb.nl/dataservices • E dataservices@kb.nl • E steven.claeyssens@kb.nl • T @sclaeyssens
