SWIB 2012 • Linked Open Library Data in Practice: Lessons Learned and Opportunities for data.bnf.fr • Romain Wenz, Bibliothèque nationale de France • Curator, Département de l’information bibliographique et numérique
What it looks like • Web pages about authors, works and subjects • Gathering information from: • Library records (12 million at BnF) • Archive materials • Digital objects (2 million at BnF: Gallica)
Part I • The purpose and difficulties • Build Web pages • About writers, books, subjects • Linking to all resources in the library • Completely automatic
Example • Information about Cicero, http://data.bnf.fr/11885977/ciceron/ • Most studied books, editions of these books • Digitized books • Activities, such as translations by Cicero
Grouping by « Works » http://data.bnf.fr/11952658/dante_alighieri_la_divine_comedie/ • Manuscripts • Editions • Digital books
About a « theme » • Books about swimming http://data.bnf.fr/12647518/natation/
Several formats • MARC for catalogues • XML-EAD for archives and manuscripts • Dublin Core for the digital library • Authorities: • Persons and Organisations • Works (Uniform titles) • Subject Headings
Several structures • Library records: flat structure • Archival fonds: hierarchical structure with inheritance • Digital content that can be processed: tables of contents, OCR
Purpose: info about concepts • Pages for humans • Structure for machines
Links and authorities • ARK identifiers from authorities • Material used for matching: • Dates • Preferred and alternative labels • Graph of links: relations, roles
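The matching material listed above (labels and dates) can be illustrated with a minimal Python sketch. It is not the data.bnf.fr implementation: the `Authority` class, the `match_authority` function, the date rule and the `ark:/example/...` identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import unicodedata


@dataclass
class Authority:
    ark: str  # placeholder identifier, not a real BnF ARK
    preferred_label: str
    alternative_labels: List[str] = field(default_factory=list)
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def normalize(label: str) -> str:
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_accents.lower().split())


def match_authority(name: str, year: Optional[int], authorities: List[Authority]) -> List[Authority]:
    """Return authorities whose labels equal `name` and whose dates do not contradict `year`."""
    wanted = normalize(name)
    hits = []
    for authority in authorities:
        labels = {normalize(authority.preferred_label)}
        labels.update(normalize(label) for label in authority.alternative_labels)
        if wanted not in labels:
            continue
        # Assumed rule: a document dated before the person's birth cannot be theirs.
        if year is not None and authority.birth_year is not None and year < authority.birth_year:
            continue
        hits.append(authority)
    return hits


# Illustrative data: the second entry is a hypothetical homonym.
authorities = [
    Authority("ark:/example/1", "Cicéron", ["Cicero", "Marcus Tullius Cicero"], -106, -43),
    Authority("ark:/example/2", "Cicero", [], 1950, None),
]
matches = match_authority("CICERO", 1900, authorities)
print([a.ark for a in matches])
```

Here the alternative label resolves the spelling variant, and the date removes the homonym. A real alignment would probably also use the graph of relations and roles.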
Workflow • Inputs: library catalogue records, archives and manuscripts, digital documents • Matching and alignment • Outputs: Web pages for humans, data for computers
Complex ontology
Part II • Feedback on activities
How? • FRBR principles • Things that work
FRBR principles • Functional Requirements for Bibliographic Records • Uses • Dates • Labels • Related roles • Which roles: • creation of a work • production of a version: language, type • material production: publication • life of an item
Why FRBR? • Linking writers and works with a useful type of link: • Writer of a work • Contributor to an edition: translator, preface, … • Producer of a physical copy: printer, distributor • Associated with a unique item: owner, annotator
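The role-to-level idea can be written down as a small lookup table. The following Python sketch is only an illustration: the role names come from the slide, but the `FrbrLevel` enum, the `ROLE_LEVEL` table and the `group_by_level` helper are hypothetical, and the mapping onto the standard FRBR levels (Work, Expression, Manifestation, Item) is an assumed reading of the slide.

```python
from collections import defaultdict
from enum import Enum
from typing import Dict, List, Tuple


class FrbrLevel(Enum):
    WORK = "work"
    EXPRESSION = "expression"
    MANIFESTATION = "manifestation"
    ITEM = "item"


# Assumed mapping of the roles named on the slide to FRBR levels.
ROLE_LEVEL: Dict[str, FrbrLevel] = {
    "writer": FrbrLevel.WORK,
    "translator": FrbrLevel.EXPRESSION,
    "preface writer": FrbrLevel.EXPRESSION,
    "printer": FrbrLevel.MANIFESTATION,
    "distributor": FrbrLevel.MANIFESTATION,
    "owner": FrbrLevel.ITEM,
    "annotator": FrbrLevel.ITEM,
}


def group_by_level(contributions: List[Tuple[str, str]]) -> Dict[FrbrLevel, List[Tuple[str, str]]]:
    """Group (person, role) pairs by the FRBR level their role attaches to."""
    grouped: Dict[FrbrLevel, List[Tuple[str, str]]] = defaultdict(list)
    for person, role in contributions:
        if role not in ROLE_LEVEL:
            raise ValueError(f"unknown role: {role!r}")
        grouped[ROLE_LEVEL[role]].append((person, role))
    return dict(grouped)


# Hypothetical contributions attached to one bibliographic record.
contributions = [
    ("Person A", "writer"),
    ("Person B", "translator"),
    ("Person C", "printer"),
    ("Person D", "annotator"),
]
for level, people in group_by_level(contributions).items():
    print(level.value, people)
```

Typing each link by level is what lets an author page distinguish "works written" from "editions translated" or "copies owned".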
From a bibliographic record • Make the link towards a work • Common properties • Possible « expressions » • Author • Dates • Name • Role • Type of document • Language • Date • Title
Matching (« Aligning ») • Using a « prediction function » to: • Predict which Work a bibliographic resource is associated with: • Words of all titles • Groups of words • Set a threshold • Stopwords and improvements
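The slide does not say how the prediction function is computed. As a rough sketch of the ingredients it lists (title words, groups of words, stopwords, a threshold), the Python example below uses Jaccard similarity over words and word pairs; that measure, the stopword list, the threshold of 0.5 and the names `features` and `predict_work` are all assumptions, not the BnF method.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

# Small illustrative stopword list (assumed, not the one used by BnF).
STOPWORDS = {"la", "le", "les", "l", "de", "du", "des", "d", "the", "of", "a", "an"}


def features(title: str) -> Set[str]:
    """Title words without stopwords, plus pairs of adjacent words."""
    words = [w for w in re.findall(r"\w+", title.lower()) if w not in STOPWORDS]
    pairs = {" ".join(pair) for pair in zip(words, words[1:])}
    return set(words) | pairs


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_work(record_title: str, works: Dict[str, List[str]],
                 threshold: float = 0.5) -> Optional[Tuple[str, float]]:
    """Return (work_id, score) for the best-scoring Work, or None below the threshold."""
    record_features = features(record_title)
    best_id, best_score = None, 0.0
    for work_id, titles in works.items():
        score = max((jaccard(record_features, features(t)) for t in titles), default=0.0)
        if score > best_score:
            best_id, best_score = work_id, score
    if best_id is not None and best_score >= threshold:
        return best_id, best_score
    return None


# Hypothetical Work identifiers, each with its known titles.
works = {
    "work-1": ["La Divine Comédie", "Divina Commedia"],
    "work-2": ["De oratore"],
}
print(predict_work("La divine comédie : l'Enfer", works))
print(predict_work("Traité de natation", works))
```

Raising the threshold trades recall for precision: fewer records are attached to a Work, but fewer are attached to the wrong one.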
Clustering • From the manifestations that are not matched • If there are enough common points • What it looks like in theory… • and in practice
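A minimal way to picture this step, assuming that "common points" means shared title words plus a shared author: link any two unmatched records with enough points in common, then take the connected groups as candidate Works. The record layout, the scoring and the `min_common` value below are assumptions for illustration only.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def title_words(title: str) -> Set[str]:
    return set(re.findall(r"\w+", title.lower()))


def common_points(a: Dict[str, str], b: Dict[str, str]) -> int:
    """One point per shared title word, plus one if the author is the same."""
    points = len(title_words(a["title"]) & title_words(b["title"]))
    if a["author"] == b["author"]:
        points += 1
    return points


def cluster(records: List[Dict[str, str]], min_common: int = 3) -> List[List[int]]:
    """Group record indexes linked by at least `min_common` common points (union-find)."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if common_points(records[i], records[j]) >= min_common:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = defaultdict(list)
    for i in range(len(records)):
        groups[find(i)].append(i)
    # Singletons stay unclustered.
    return [members for members in groups.values() if len(members) > 1]


# Illustrative unmatched records.
records = [
    {"author": "Verne, Jules", "title": "Voyage au centre de la Terre"},
    {"author": "Verne, Jules", "title": "Voyage au centre de la terre illustré"},
    {"author": "Unknown", "title": "Traité de natation"},
]
print(cluster(records))
```

The pairwise comparison is quadratic in the number of records, so on a real catalogue it would need blocking (for example, comparing only records by the same author). Stopwords are deliberately left in here and would inflate the score on short titles.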
The purpose • Gather data • Make them useful on the Web • Upgrade the catalogs
Part III • « Linked Open Library »
Open: Technical / Legal • With the “Open data” initiatives led by the French government, it is possible to use an Open Licence. • There is currently a strong state incentive for open data and open formats. • Once data is linked and open, what comes next?
First, changes in general use, since people can now find BnF’s resources directly on the Web. • Contact address: many messages, from « new audiences » • Usage statistics: more than 80% of users arrive from search engines • R&D: improvements to be integrated into the main catalogues and archives
Secondly, the data is being used by broader communities. • In small public libraries, new procedures are being explored for re-using the dataset in local catalogues: example of « OpenCat » with Fresnes • Use in other contexts: example of IF Verso (translations), Institut français http://ifverso.com/ • Specific catalogues (bindings)
In the long term? • Semantic Web technologies could set a standard for library data, • if we keep the data • linked and • open.
Library missions • Strengths or weaknesses? • Descriptive information: trust • Produced to handle a collection, not for marketing purposes • Describing local « concepts »: local use • For documents, not encyclopaedic • Use of standards: long-term perspective • MARC catalogues, EAD archives, DC digital collection • Already « machine-readable » • But not with Web standards yet
Thanks • romain.wenz@bnf.fr • Project: data@bnf.fr