Jacob Burckhardt (1818-1897) as a Problem for Libraries and their Users
Clustering of Catalogue Metadata for Evidence Based Decisions in Sharing, Preservation and Retro-Digitization
Dr. Rupert Schaab
PRINT ARCHIVE NETWORK FORUM, ALA Annual 2018, New Orleans, 22 June 2018
Academic libraries in Germany deselect nearly 2 million volumes per year.
Corinna Roeder: Aussonderung von Printbeständen an wissenschaftlichen Bibliotheken in Deutschland, 2016
Die Kultur der Renaissance in Italien: holdings at Göttingen University
All in all: 37 copies in 25 manifestations
Among them at the central library: 10 copies in 10 manifestations
57% of all copies were published between 1919 and 1943, only 27% after 1944
1860: 1 copy
1869: 1 copy
1877/78: 1 copy
1885: 1 copy
1899: 1 copy
1908: 1 copy
1919: 3 copies
1920: 1 copy
1922: 2 copies
1928: 2 copies
1930: 4 copies
[c. 1934]: 1 copy
[1934]: 1 copy
ca. 1935: 3 copies
1936: 1 copy
1939: 1 copy
1940: 1 copy
[1943]: 1 copy
1952: 1 copy
[1955]: 1 copy
1958: 1 copy
1966: 1 copy
1976: 1 copy
1989: 4 copies
2000: 1 copy
The Civilization of the Renaissance in Italy: holdings at Oxford University
All in all: 93 copies in 34 manifestations
Among them in the Bodleian Library: 9 copies in 9 manifestations
Only 17% of all copies were published between 1919 and 1943, but 67% after 1944
G 1860: 1 copy
G 1869: 1 copy
1878: 5 copies
G 1885: 1 copy
F 1885: 1 copy
1890: 1 copy
1892: 1 copy
1898: 1 copy
G 1904: 1 copy
G 1913: 1 copy
1914: 1 copy
G 1919: 1 copy
I 1921: 1 copy
G 1926: 1 copy
G 1928?: 1 copy
G 1934: 2 copies
1937: 9 copies
G 1943: 1 copy
c1944: 12 copies
1945: 7 copies
1950: 3 copies
1951: 2 copies
1954: 1 copy
G c1955: 1 copy
G c1956: 1 copy
G 1958: 3 copies
G 1960: 3 copies
1960: 4 copies
c1960: 3 copies
1975: 1 copy
G 1976: 15 copies
1981: 4 copies
1995: 1 copy
2010: 1 copy
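The period shares quoted on the two holdings slides can be recomputed from the listed copies. The sketch below is only a cross-check: the lists are transcribed from the slides, uncertain dates are reduced to a plain year, "1877/78" is entered as 1877, and `share` is a helper name chosen here. Note that the Oxford figure of 67% is only reproduced if the "c1944" entry is counted, so "after 1944" is read as "1944 onward".

```python
# Holdings as (publication year, copies), transcribed from the two slides.
# Bracketed or approximate dates are reduced to their year (an assumption).
GOETTINGEN = [
    (1860, 1), (1869, 1), (1877, 1), (1885, 1), (1899, 1), (1908, 1),
    (1919, 3), (1920, 1), (1922, 2), (1928, 2), (1930, 4), (1934, 1),
    (1934, 1), (1935, 3), (1936, 1), (1939, 1), (1940, 1), (1943, 1),
    (1952, 1), (1955, 1), (1958, 1), (1966, 1), (1976, 1), (1989, 4),
    (2000, 1),
]
OXFORD = [
    (1860, 1), (1869, 1), (1878, 5), (1885, 1), (1885, 1), (1890, 1),
    (1892, 1), (1898, 1), (1904, 1), (1913, 1), (1914, 1), (1919, 1),
    (1921, 1), (1926, 1), (1928, 1), (1934, 2), (1937, 9), (1943, 1),
    (1944, 12), (1945, 7), (1950, 3), (1951, 2), (1954, 1), (1955, 1),
    (1956, 1), (1958, 3), (1960, 3), (1960, 4), (1960, 3), (1975, 1),
    (1976, 15), (1981, 4), (1995, 1), (2010, 1),
]


def share(holdings, first_year, last_year):
    """Percentage of copies published between first_year and last_year (inclusive)."""
    total = sum(copies for _, copies in holdings)
    in_period = sum(c for year, c in holdings if first_year <= year <= last_year)
    return round(100 * in_period / total)


for name, holdings in (("Göttingen", GOETTINGEN), ("Oxford", OXFORD)):
    print(name,
          "1919-1943:", share(holdings, 1919, 1943), "%,",
          "1944 onward:", share(holdings, 1944, 9999), "%")
```

Under these assumptions the totals (37 and 93 copies, 25 and 34 manifestations) and the four percentages on the slides all come out as stated.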
German Journal Database (ZDB): number of copies (x) per manifestation (y), 2016
- 72% of all manifestations are represented in fewer than 4 copies.
- If the German libraries want to retain at least 3 copies of every manifestation, they would have to preserve 59% of all holdings.
- Because many journals run for a long time, the possible reduction measured in volumes will be significantly higher.
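How a retention rule turns into a share of holdings can be sketched in a few lines. The function names and the toy distribution below are invented for illustration; they are not ZDB data and will not reproduce the 72% or 59% quoted on the slide.

```python
def share_below(copies_per_manifestation, threshold=4):
    """Share of manifestations held in fewer than `threshold` copies."""
    scarce = sum(1 for c in copies_per_manifestation if c < threshold)
    return scarce / len(copies_per_manifestation)


def retention_share(copies_per_manifestation, keep=3):
    """Share of all copies that must stay if up to `keep` copies
    of every manifestation are retained."""
    total = sum(copies_per_manifestation)
    retained = sum(min(c, keep) for c in copies_per_manifestation)
    return retained / total


# Made-up example: number of copies for eight manifestations.
toy = [1, 1, 2, 3, 3, 5, 8, 12]
print(f"{share_below(toy):.0%} of manifestations have fewer than 4 copies")
print(f"{retention_share(toy):.0%} of copies must be kept to retain 3 each")
```

The point of the sketch is the `min(c, keep)` term: manifestations that are already scarce contribute all of their copies, so a long tail of rare items keeps the retained share high.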
German Collective Collection: Duplication Rates
Group-scale duplication: 87% held in < 5 libraries
Global duplication (WorldCat): 76% held in < 5 libraries
N = 59M titles (OCLC control numbers)
Preliminary results presented by Constance Malpas (OCLC), October 2016
The dissemination of attributes using the structure of Functional Requirements for Bibliographic Records (FRBR)
Levels: author, work, expression, manifestation, item
• Author level, e.g. VIAF ID, variant name forms -> clustering of all name forms needed
• Work level, e.g. uniform title, subject headings, classification -> clustering of all records of a work needed (FRBR work clusters)
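The idea of passing work-level attributes to every record of a work cluster can be sketched as follows. This assumes the work clusters already exist; `Record`, `work_key` and `spread_subjects` are hypothetical names, and the sample records are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    record_id: str
    work_key: str  # hypothetical cluster key, e.g. author ID plus uniform title
    subjects: set = field(default_factory=set)


def spread_subjects(records):
    """Give every record the union of subject headings found in its work cluster."""
    by_work = {}
    for rec in records:
        by_work.setdefault(rec.work_key, set()).update(rec.subjects)
    return {rec.record_id: sorted(by_work[rec.work_key]) for rec in records}


# Invented sample: two records of the same work, one of them without indexing.
records = [
    Record("r1", "burckhardt/kultur-der-renaissance", {"Renaissance", "Italy"}),
    Record("r2", "burckhardt/kultur-der-renaissance"),
    Record("r3", "other-work", {"Art history"}),
]
print(spread_subjects(records))
```

The same pattern would apply to classification or uniform titles; the hard part, as the following slides show, is building reliable clusters in the first place.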
Elements of different FRBR levels in one record
Heidrun Wiesenmüller, Bibliothekartag 2009
"The relationship between a work and a manifestation that embodies the work may also be recorded without identifying the expression through which the work is realized." (RDA 17.4.1)
How many different English expressions does the Library of Congress (LoC) hold?
• The civilization of the Renaissance in Italy : an essay / Jacob Burckhardt.
• The civilisation of the period of the renaissance in Italy, by Jacob Burckhardt; authorised translation by S. G. C. Middlemore ...
• The civilization of the Renaissance in Italy : an essay / by Jacob Burckhardt ; the translation of S.G.C. Middlemore, rev. and edited by Irene Gordon.
• The civilization of the Renaissance in Italy; an essay. Introd. by Hajo Holborn. [Translation by S. G. C. Middlemore].
• The civilization of the Renaissance in Italy, and other selections. Edited and abridged with an introd. by Alexander Dru.
• …
The Translation of Samuel George Chetwynd Middlemore (d. 1890)
• Preface of the 1st edition, 1878: "The translation is made from the 3rd edition of the original, recently published in Germany, with slight additions to the text, and large additions to the notes, by Dr. Ludwig Geiger of Berlin. It also contains some fresh matter communicated by Dr. Burckhardt to Professor Diego Valbusa of Mantua, the Italian translator of the Book."
• Middlemore to Burckhardt, 20.3.1889 (translated from German): "May I now ask you whether you would wish to have any additions, notes or corrections incorporated in the new edition?"
• Burckhardt to Middlemore, 23.3.1889 (translated from German): "I have no additions to send you, since I have unfortunately become estranged from Italian studies. There is no edition more recent than the 4th German one (again prepared by Prof. Geiger), and I also believe that no further one will follow, since the book now seems to have done its service."
• No clustering
• No clustering
• Same manifestation as No. 1, printer in place of publisher
• No clustering, although uniform title at hand
• Different romanization and normalization
• Cluster of 2 expressions, but in reality the same manifestation
• Cluster of 2 manifestations, but in reality one
• Cluster of "599" expressions?
• Cluster of "599" manifestations from different expressions?
• No clustering of Cyrillic and romanized titles
• No clustering of the new translation
• Different romanizations
Translations, transliterations and typos
• Basilea, Basel, Baßel, Bafel, Bâle, Basle, Bazel'
• Gottinga, Goettinga, Göttingen, Götingen, Goettingen, Goettingue, Goettinguen
• Heidelberga, Heidelberg, Haidelberg, Heidleberg, Heidelburg
• …
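One simple way to handle such variants is a lookup table from every known spelling to a preferred form. The sketch below uses only the variants listed on the slide; the choice of the modern German names as preferred forms and the function name `normalize_place` are assumptions made here.

```python
# Variant spellings from the slide, grouped under an assumed preferred form.
PLACE_VARIANTS = {
    "Basel": ["Basilea", "Basel", "Baßel", "Bafel", "Bâle", "Basle", "Bazel'"],
    "Göttingen": ["Gottinga", "Goettinga", "Göttingen", "Götingen",
                  "Goettingen", "Goettingue", "Goettinguen"],
    "Heidelberg": ["Heidelberga", "Heidelberg", "Haidelberg",
                   "Heidleberg", "Heidelburg"],
}

# Case-insensitive index: folded variant -> preferred form.
LOOKUP = {
    variant.casefold(): preferred
    for preferred, variants in PLACE_VARIANTS.items()
    for variant in variants
}


def normalize_place(name):
    """Return the preferred form for a known variant, or None if unknown."""
    key = name.strip().replace("‘", "'").replace("’", "'").casefold()
    return LOOKUP.get(key)


print(normalize_place("Bâle"))          # Basel
print(normalize_place(" goettingue "))  # Göttingen
print(normalize_place("Leipzig"))       # None
```

A table like this only covers spellings someone has already seen; typos such as "Heidleberg" show why a purely rule-based normalization would miss cases.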
Changes of publisher names and their appearances
• Vandenhoeck
• Vandenhoeck & Ruprecht
• Vandenhoeck und Ruprecht
• Vandenhoeck u. Ruprecht
• Vandenhoeck et Rvprecht
• Vandenhoeck et Ruprecht
• Bandenhoed und Ruprecht
• Vandenhoeck und Rupprecht
• Vandenhoeck & Reprecht
• Vandenhoed und Ruprecht
• Eichenberg und Vandenhoeck
• Vidua Vandenhoeck
• Wittwe Vandenhoeck
• Abram Vandenhoeck
• Abramvs Vandenhoeck
• Abraham Vandenhoeck
• Abrahamvs Vandenhoeck
• Abraham van den Hoeck
• V&R unipress
• V&R Unipress
• V & R unipress
• V & R Unipress
Great variety in describing manifestations, strong dependence on cataloguing language, orthography, abbreviations, guesses …
• [Nachdr.]
• Sonderausg.
• Lizenzausg.
• Vollst. Ausg.
• Mit zahlr. Abb.
• Neudr. d. Urausg.
• Unveränd. Nachdr. d. Ausg. von 1955
• Ungekürzt. Ausg.
• In der Textfass. d. Erstausgabe
• Ungek. Ausgabe
• 2. durchges. Aufl.
• 1950 [sic !]
• 28.-32. Tausend
• (1947)
• [1948]
• [1930?]
• [ca. 1930]
• 15. Aufl. d. Urausg., Ill. Ausg.
• 9., durchgearb. Aufl.
• 6. Aufl., unveränd. Nachdr. der 4. Aufl.
• 5. Aufl., unveränd. Abdr. der 4. Aufl. 1896
• 4., durchges. Aufl., besorgt von Ludwig Geiger 1885
• Große illustrierte Phaidon-Ausg. [ca. 1934]
• Grosse illustr. Phaidon-Ausg. ([1934])
• Grosse ill. Phaidon-Ausg.
"1.3 million different values for the edition field" (Geipel/Pohl, Culturegraph, 2011)
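Part of this variety is notational and can be reduced mechanically. As a rough sketch, the function below pulls a year out of a catalogued date and records whether it was supplied in square brackets or marked as approximate. The name `parse_date` and the returned triple are conventions chosen here, not a cataloguing standard.

```python
import re

YEAR = re.compile(r"(?<!\d)\d{4}(?!\d)")              # a stand-alone 4-digit year
BRACKETED = re.compile(r"\[[^\]]*\d{4}[^\]]*\]")        # year inside [ ... ]
CIRCA = re.compile(r"\bc(?:a\.?|\.)?\s*\d{4}", re.IGNORECASE)  # c1944, c. 1934, ca. 1935


def parse_date(statement):
    """Return (year, supplied, approximate), or None if no year is found.

    supplied    -- the year stands in square brackets
    approximate -- the year carries a question mark or a "c."/"ca." prefix
    """
    match = YEAR.search(statement)
    if match is None:
        return None
    supplied = BRACKETED.search(statement) is not None
    approximate = "?" in statement or CIRCA.search(statement) is not None
    return int(match.group()), supplied, approximate


for raw in ["[ca. 1934]", "([1934])", "[1930?]", "(1947)", "1950 [sic !]", "28.-32. Tausend"]:
    print(raw, "->", parse_date(raw))
```

This only normalizes the date notation. A statement such as "Unveränd. Nachdr. d. Ausg. von 1955" would yield 1955 although that is the year of the reprinted edition, so edition statements still need separate handling.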
International Standard Book Number (ISBN)
• Same ISBN used for different manifestations: ISBN 3499163225, 9 manifestations
• Same ISBN used for different works: ISBN 340754670X, 10 works
• Incorrect ISBNs (number conflicts with check digit): ISBN 3525513556 for 3525513666
• In the German National Library catalogue, < 0.5% of ISBNs were used more than once (in 2005)
Heinrich Allers in Inetbib 2005
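The check-digit conflict mentioned on the ISBN slide can be tested with the standard ISBN-10 rule: the digits are weighted 10 down to 1 and the sum must be divisible by 11, with "X" standing for 10 in the last position. The function name below is chosen here for illustration.

```python
def isbn10_is_valid(isbn):
    """Check an ISBN-10 against its check digit (weights 10..1, modulo 11)."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10          # "X" is only allowed as the check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


for isbn in ["3499163225", "340754670X", "3525513556", "3525513666"]:
    print(isbn, isbn10_is_valid(isbn))
```

Of the four numbers on the slide, only 3525513556 fails the check. The two reused ISBNs are formally valid, which is why a check digit cannot detect that one number was assigned to several manifestations or works.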
Works in WorldCat by FRBR types
Caution: "Revised, augmented and collected/selected works together account for only 4% of the Works in WorldCat. Yet, these works represent more than 12% of the manifestations (records)."
Bennett/Lavoie/O'Neill (OCLC): The Concept of a Work in WorldCat, 2003
- Can we cluster records of the same manifestation?
- How to count copies of a clustered manifestation (n=x)?
- Can we migrate the information that there is a retention commitment (∞), a deacidified copy (pH) or a digital copy (☻)?
- How could we do this across different databases?
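A very reduced sketch of what these questions ask for: records from several sources are grouped by a match key, copies are summed per cluster, and status flags are carried over to the whole cluster. The match key (normalized title plus year), the field names and the sample records are all hypothetical. As the earlier slides show, real clustering needs far more robust keys.

```python
import re
import unicodedata
from collections import defaultdict

FLAGS = ("retained", "deacidified", "digitized")


def match_key(title, year):
    """Hypothetical match key: accent-free, lower-case title words plus the year."""
    decomposed = unicodedata.normalize("NFKD", title)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = re.findall(r"[a-z0-9]+", plain.lower())
    return (" ".join(words), year)


def cluster(records):
    """Group records by match key, sum their copies and merge their flags."""
    clusters = defaultdict(lambda: {"copies": 0, **{flag: False for flag in FLAGS}})
    for rec in records:
        entry = clusters[match_key(rec["title"], rec["year"])]
        entry["copies"] += rec.get("copies", 1)
        for flag in FLAGS:
            entry[flag] = entry[flag] or rec.get(flag, False)
    return dict(clusters)


# Invented records, as they might arrive from two different databases.
records = [
    {"title": "Die Kultur der Renaissance in Italien", "year": 1930, "copies": 2},
    {"title": "Die Kultur der Renaissance in Italien.", "year": 1930, "digitized": True},
    {"title": "Die Kultur der Renaissance in Italien", "year": 1860, "retained": True},
]
for key, info in cluster(records).items():
    print(key, info)
```

In this toy case the two 1930 records fall into one cluster with three copies, and the digital copy recorded in one database becomes visible for all of them.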
Selected Literature
• Rick Bennett et al.: The Concept of a Work in WorldCat - An Application of FRBR, in: Library Collections, Acquisitions, and Technical Services 27 (2003), pp. 45-59
• Constance Malpas, Brian Lavoie: Strength in Numbers. The Research Libraries UK (RLUK) Collective Collection. Dublin OH 2016
• Magnus Pfeffer: Using clustering across union catalogues to enrich entries with indexing information, in: Data Analysis, Machine Learning and Knowledge Discovery, ed. by M. Spiliopoulou, Hildesheim 2014, pp. 437-445
• Corinna Roeder: Aussonderung von Printbeständen an wissenschaftlichen Bibliotheken in Deutschland - Ein Überblick über die aktuelle Praxis und Rechtslage, in: Bibliotheksdienst 50 (2016), pp. 1014-1039
• Rupert Schaab: Überlieferungssicherung als Gemeinschaftsaufgabe – Ein Vorschlag an die Wissenschaftlichen Bibliotheken Deutschlands, in: Bibliothek Forschung und Praxis 41 (2017), pp. 391-397
• Gail Thornburg: A candid look at collected works. Challenges of clustering aggregates in GLIMIR and FRBR, in: Information technology and libraries 33 (2014), pp. 53-64