Digitization and scientific digital libraries

Digitization and scientific digital libraries. Martin Lhoták, Knihovna AV ČR, v. v. i. (Academy of Sciences Library), 3.6.2009, UISK, Univerzita Karlova v Praze.



  1. Digitization and scientific digital libraries • Martin Lhoták • Knihovna AV ČR, v. v. i. (Academy of Sciences Library) • 3.6.2009 • UISK, Univerzita Karlova v Praze

  2. Content • Digitization Centre of the Academy of Sciences Library • Kramerius – software for dissemination • Digital Library of the Academy of Sciences • Software for metadata creation • "Digitization Registry CZ" project

  3. Digitization Centre of the AS Library • In operation since 1.1.2004 • Built with support from the EU Solidarity Fund after the 2002 floods in Czechia • Main aim: to build a digital library of scientific publications (books, articles, …) published by the Academy of Sciences of the Czech Republic (Digital Library of ASCR) • Partner of the DML-CZ (Czech Digital Mathematics Library) project since 2005

  4. The Academy of Sciences of the Czech Republic • > 50 scientific institutes • 8000 employees (4000 in R&D) • > 11 000 articles, reports, etc. a year • Publishes > 90 journals (circa 3000 articles) • > 100 years of history

  5. Digitization Centre of the AS Library • 1 x A0 color scanner ProServ ScanTech 600i • 1 x A1 color scanner Digibook 10000 • 2 x A2 B&W scanners Zeutschel OS 7000 • 1 x A4 fast production scanner Panasonic • Staff: 8 to 10 people • Also provides services to other institutions • Monthly production: 40 000 to 50 000 pages • Overall production: > 2 000 000 pages • Planned acquisition: ScanRobot http://www.treventus.com/

  6. Image Adjustment • Book Restorer software from i2S • Designed to process scanned books • Geometrical correction • Crop • Blur • Binarization • Despeckle
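
To illustrate what the binarization step on this slide does, here is a minimal Python sketch that converts a grayscale page to black and white using Otsu's global threshold. This is a generic textbook technique chosen for illustration; the slides do not say which algorithm Book Restorer uses, and the function names `otsu_threshold` and `binarize` are assumptions of this sketch.

```python
from typing import List


def otsu_threshold(pixels: List[int]) -> int:
    """Return the threshold t (0..255) that maximizes between-class variance.

    Class 0 holds pixels with value <= t, class 1 holds the rest.
    Illustrative only: not the method of any particular product.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0 so far
    sum0 = 0    # sum of pixel values in class 0 so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break               # class 1 empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(image: List[List[int]]) -> List[List[int]]:
    """Map a grayscale image (rows of 0..255 ints) to 0 (dark) / 1 (light)."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p > t else 0 for p in row] for row in image]
```

A global threshold like this works on evenly lit scans; pages with shadows near the binding usually need a locally adaptive method instead.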

  7. Basic Metadata • XML (DTD of the Czech National Library) • Title and basic bibliographic data • Book/journal structure • Physical size of the book/journal • Numbers of pages • Sirius software (CZ)

  8. OCR • FineReader 8.1 • 2 runs: - 1. to recognize the language of each paragraph - 2. to run OCR with the right language • OCR workflow developed by the DML-CZ team of Dr. P. Sojka • Output: double-layer PDF: - 1st layer: scanned image - 2nd layer: OCRed text
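
The two-run idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DML-CZ workflow itself: the `engine` callable stands in for a real OCR engine (its signature is hypothetical and is not the FineReader API), and the stopword-counting language detector is a deliberately tiny placeholder.

```python
from collections import Counter
from typing import Callable, Dict, List, Set, Tuple

# Tiny illustrative stopword lists; a real detector would use richer models.
STOPWORDS: Dict[str, Set[str]] = {
    "cs": {"a", "je", "se", "na", "že", "v", "to", "pro"},
    "en": {"the", "and", "of", "is", "in", "to", "for", "that"},
    "de": {"der", "die", "und", "das", "ist", "von", "mit", "für"},
}

# Hypothetical engine signature: (paragraph image, language code) -> text.
OcrEngine = Callable[[bytes, str], str]


def detect_language(text: str, default: str = "cs") -> str:
    """Guess the language by counting stopword hits; fall back to default."""
    words = [w.strip(".,;:!?") for w in text.lower().split()]
    scores: Counter = Counter()
    for lang, stops in STOPWORDS.items():
        scores[lang] = sum(1 for w in words if w in stops)
    lang, hits = scores.most_common(1)[0]
    return lang if hits > 0 else default


def two_pass_ocr(
    paragraph_images: List[bytes],
    engine: OcrEngine,
    first_pass_lang: str = "cs",
) -> List[Tuple[str, str]]:
    """Run OCR twice per paragraph: once to find the language, once for text."""
    results = []
    for image in paragraph_images:
        rough_text = engine(image, first_pass_lang)       # run 1: rough text
        lang = detect_language(rough_text, first_pass_lang)
        results.append((lang, engine(image, lang)))       # run 2: right language
    return results
```

The second run costs extra processing time, but it lets each paragraph of a multilingual journal be recognized with the dictionary that fits it.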

  9. Kramerius – development group and technologies used • Open source, in development since 2003 • Main purpose: access to and dissemination of digitized documents (monographs and periodicals) • Czech National Library, Academy of Sciences Library, Qbizm technologies, Moravian Library in Brno • Funded mostly by the Ministry of Culture and the Academy of Sciences Grant Agency • Technologies used: Java, Linux, Apache, Tomcat, PostgreSQL, Lucene

  10. Kramerius – current status • version: 3.3.0, build: 29.7.2008

  11. Kramerius – current status • DTD for periodicals and monographs • Import of XML, TXT and graphic files • Graphic formats: DjVu, JPG, PNG, PDF • Full-text search (Lucene) • Replication of data between individual installations • OAI-PMH for metadata harvesting • METS, PREMIS, MIX metadata standards
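
As a rough illustration of how a harvester talks to an OAI-PMH endpoint, the Python sketch below builds a `ListRecords` request URL. The verb and argument names (`metadataPrefix`, `from`, `until`, `set`, `resumptionToken`) come from the OAI-PMH 2.0 protocol; the base URL in the usage comment is a placeholder, not a real Kramerius address, and the helper name `list_records_url` is an assumption of this sketch.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    until_date: Optional[str] = None,
    set_spec: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it is
    given, no other argument besides the verb may be sent.
    """
    if resumption_token is not None:
        query = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        query = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            query["from"] = from_date
        if until_date:
            query["until"] = until_date
        if set_spec:
            query["set"] = set_spec
    return f"{base_url}?{urlencode(query)}"


# Placeholder endpoint, for illustration only:
# list_records_url("http://example.org/oai", from_date="2009-01-01")
```

A harvester would fetch this URL, parse the XML response, and repeat the request with the returned `resumptionToken` until the repository sends none.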

  12. Kramerius – current status • International and national connections: - The European Library http://www.theeuropeanlibrary.org - Uniform Information Gateway (JIB) http://www.jib.cz/ • Links to library OPACs • Persistent URLs enable persistent linking
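
The idea behind persistent URLs can be shown with a small Python sketch: external links carry a stable identifier, and a resolver maps it to wherever the document currently lives. The class, identifiers and locations below are hypothetical and do not describe how Kramerius implements this.

```python
from typing import Dict


class PersistentLinkResolver:
    """Map stable identifiers to the current location of a document."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, persistent_id: str, location: str) -> None:
        """Add a document, or update its location after it has moved."""
        self._locations[persistent_id] = location

    def resolve(self, persistent_id: str) -> str:
        """Return the current location; raises KeyError for unknown ids."""
        return self._locations[persistent_id]


resolver = PersistentLinkResolver()
resolver.register("doc-0001", "/storage-a/periodicals/0001")  # hypothetical
resolver.register("doc-0001", "/storage-b/periodicals/0001")  # document moved
current = resolver.resolve("doc-0001")  # links using "doc-0001" still work
```

Because OPACs and portals store only the identifier, the library can reorganize its storage without breaking their links.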

  13. Kramerius – new development plans • Fundamental change: use of the FEDORA repository (open source, USA) • Reasons: FEDORA is a robust engine with support for compound objects, and it is also useful for long-term preservation • Enhanced administration of users and access rights • Batch operations on digitized documents • New types of documents (maps, audio, video, …)

  14. Kramerius – institutional users • Czech National Library, Moravian Library in Brno, State Technical Library, Academy of Sciences Library • Regional scientific libraries: Havlíčkův Brod, Hradec Králové, Olomouc, Ostrava, Zlín • Museum libraries: UPM Praha, ŽM Praha, DA Praha, MVČ Hradec Králové • In total circa 5 500 000 pages (circa 500 periodical titles and 4500 monographs)

  15. Academy of Sciences Digital Library • Funded by the Academy of Sciences (2004-2009) • Digitization of historical issues (1890-1990) • Circa 1 500 000 pages digitized • Development of the Kramerius system • 1 000 000 pages accessible (not separated into articles) • Full-text search • http://kramerius.knav.cz

  16. Academy of Sciences Digital Library • New issues: a different approach • Open source EPrints (University of Southampton) • Agreements with the Academy institutes on conditions of dissemination • Final goal: merging both digital libraries (probable solution: Drupal/FEDORA – Islandora?)

  17. Collaboration with Google • Digitized journals from the Kramerius system: indexing of full texts, automatic detection of articles, link from Google to the article's first page or abstract • New articles in EPrints: indexing of full texts, link from Google

  18. Academy of Sciences Central Data Repository • Huge amount of data from digitization • 30 TB disk array with mirror • Tape library with up to 500 tapes • 3 different locations for long-term storage • Long-term preservation of R&D outputs of the Czech Academy of Sciences • Institutional repository

  19. System for journal publishing administration • Proven professional systems (Manuscript Central, Editorial Manager) • Better price for implementation and annual service fees when purchased as a consortium • On-line submission system • Complete records of authors, reviewers and articles • Automated administration of peer review • Currently 8 journals
