Music OCLC Users Group Annual Meeting, San Jose, California, 2013 February 27
Jay Weitz, Senior Consulting Database Specialist, WorldCat Quality Management Division, OCLC
Plenary Session: WorldCat Local Panel
Reintroducing GLIMIR
Reintroducing GLIMIR: Definition and Objectives
GLIMIR = Global LIbrary Manifestation IdentifieR
• To identify records describing the same manifestation: Manifestation Clusters.
  • Parallel records: Same resource with same content in same format, but described in different languages of cataloging.
  • Create OCLC Manifestation Identifiers (OMI) and index them in WorldCat.
• To identify records describing different manifestations with the same content: Content Clusters.
  • Originals, reprints, microform reproductions, digital reproductions.
  • Create OCLC Content Identifiers (OCI) and index them in WorldCat.
• To improve FRBR work sets by merging those containing records that GLIMIR assesses to be equal in content.
  • Informing FRBR of algorithm improvements.
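The two kinds of cluster described above can be pictured with a small sketch. It assumes each bibliographic record already carries an OMI and an OCI; the `BibRecord` class, its field names, and all identifier values are hypothetical and are not taken from WorldCat.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BibRecord:
    """Hypothetical record shape; field names and values are illustrative only."""
    ocn: int                   # record number
    cataloging_language: str   # language of cataloging
    omi: int                   # manifestation identifier (assumed already assigned)
    oci: int                   # content identifier (assumed already assigned)


def cluster_by(records, attr):
    """Group record numbers by the value of one identifier attribute."""
    clusters = defaultdict(list)
    for record in records:
        clusters[getattr(record, attr)].append(record.ocn)
    return dict(clusters)


RECORDS = [
    BibRecord(1, "eng", omi=100, oci=900),  # original, English-language cataloging
    BibRecord(2, "fre", omi=100, oci=900),  # parallel record for the same manifestation
    BibRecord(3, "eng", omi=101, oci=900),  # microform reproduction, same content
]

manifestation_clusters = cluster_by(RECORDS, "omi")
content_clusters = cluster_by(RECORDS, "oci")
```

In this sketch the parallel records 1 and 2 share a manifestation cluster, while the reproduction in record 3 joins them only at the content level.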
Reintroducing GLIMIR: Relation to FRBR and DDR
Duplicate Detection and Resolution (DDR):
• Works as an offline process.
• Launches queries to find candidate duplicates.
• Resolution program determines “retained” record.
• GLIMIR adapts DDR algorithms, creates clusters and identifiers.
FRBR algorithm:
• Works in real time.
• Makes author/title key.
• Creates work clusters.
• Assigns the OCLC Work Identifier (OWI).
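The idea of grouping records into work clusters by an author/title key can be sketched as follows. The normalization rules here (strip diacritics, lowercase, drop punctuation) are assumptions chosen for illustration, not OCLC's actual FRBR algorithm, and the function names and sample data are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip diacritics, and collapse punctuation and spaces (assumed rules)."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", no_marks.lower()).strip()


def author_title_key(author, title):
    """Build a hypothetical author/title key from normalized strings."""
    return f"{normalize(author)}/{normalize(title)}"


def work_clusters(records):
    """Group record numbers whose author/title keys are identical."""
    clusters = defaultdict(list)
    for ocn, author, title in records:
        clusters[author_title_key(author, title)].append(ocn)
    return dict(clusters)


SAMPLE = [
    (11, "Dvořák, Antonín", "Rusalka"),
    (12, "Dvorak, Antonin", "Rusalka."),
    (13, "Verdi, Giuseppe", "Aida"),
]

clusters = work_clusters(SAMPLE)
```

Records 11 and 12 differ only in diacritics and punctuation, so they produce the same key and fall into one work cluster. A key this simple also shows why works without clear authors can end up split across several work sets.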
Reintroducing GLIMIR: Diagram of Metadata and Identifier Structure
• Identifiers at all levels
• Holdings at all levels
• Metadata summaries at higher levels
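Since the diagram itself is not reproduced here, the following sketch illustrates one way "holdings at all levels" could work: the library symbols attached to each record are rolled up to the manifestation, content, and work levels. The nesting order (work, then content, then manifestation), the tuple layout, and all identifiers and symbols are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (OWI, OCI, OMI, record number, holding library symbols)
ROWS = [
    ("W1", "C1", "M1", 1, {"AAA", "BBB"}),
    ("W1", "C1", "M1", 2, {"BBB", "CCC"}),  # parallel record, overlapping holdings
    ("W1", "C1", "M2", 3, {"DDD"}),
    ("W1", "C2", "M3", 4, {"AAA"}),
]


def holdings_by_level(rows):
    """Union the holding symbols of every record into each level above it."""
    levels = {
        "work": defaultdict(set),
        "content": defaultdict(set),
        "manifestation": defaultdict(set),
    }
    for owi, oci, omi, _ocn, symbols in rows:
        levels["work"][owi] |= symbols
        levels["content"][oci] |= symbols
        levels["manifestation"][omi] |= symbols
    return levels


levels = holdings_by_level(ROWS)
```

Using sets rather than counts means a library that holds two records in the same cluster is counted once at the cluster level.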
Reintroducing GLIMIR: Before
WorldCat.org: Before GLIMIR: Multiple Works, Scattered Holdings
• Retrieves and displays one representative record per work set.
• Currently there may be multiple work sets for the same work (particularly for works without clear authors).
• Depending on the search, these records may be scattered in large result sets.
Reintroducing GLIMIR: After
WorldCat.org: After GLIMIR: One Work, Consolidated Holdings
• Consolidated work set (more likely to get a thumbnail image).
• Includes translations.
• Briefer short lists, more complete retrieval.
Reintroducing GLIMIR: Perceived Duplicates
• Perception of duplicate problem in WorldCat has worsened as more non-English language of cataloging records are loaded and parallel records are added.
• Holdings scatter.
• DDR has deleted nearly 13 million records since 1992.
• Perception of duplicates in WorldCat remains.
• GLIMIR OMI should have a bigger impact on perceived duplication.
• Importance of good work groups.
Reintroducing GLIMIR: De-Duplication
GLIMIR complements de-duplication:
• Hides records that are duplicates but cannot be de-duplicated (styles/rules too different, sparse records).
• Surfaces holdings, hides less desired descriptions.
• Gives more accurate count of the numbers of manifestations in WorldCat.
Reintroducing GLIMIR: De-Duplication
Just as with FRBR, improvements to general matching have been identified:
• Typo tolerance in pagination.
• Improvements to lists of noise titles.
• Improved language and transliteration sensitivity.
• Interpretation of size (e.g. gr8 = octavo = 8o = 22 cm = 8 in.)
• Normalizing titles.
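Two of the matching improvements listed above can be illustrated with a short sketch. The size table contains only the equivalences given as the example on the slide, and the pagination rule (accept two adjacent transposed digits) is one assumed reading of "typo tolerance", not the rule GLIMIR actually applies. All function and variable names are hypothetical.

```python
import re

# Equivalences from the slide's example only; a real table would be far larger.
SIZE_CLASSES = {
    "gr8": "octavo",
    "octavo": "octavo",
    "8o": "octavo",
    "22 cm": "octavo",
    "8 in": "octavo",
}


def size_class(statement):
    """Map a size statement to a shared class, or return it cleaned if unknown."""
    key = re.sub(r"\s+", " ", statement.lower()).strip(" .")
    return SIZE_CLASSES.get(key, key)


def pages_match(a, b):
    """True if page counts are equal or differ by one adjacent digit transposition."""
    if a == b:
        return True
    sa, sb = str(a), str(b)
    if len(sa) != len(sb):
        return False
    diffs = [i for i in range(len(sa)) if sa[i] != sb[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and sa[diffs[0]] == sb[diffs[1]]
        and sa[diffs[1]] == sb[diffs[0]]
    )


same_size = size_class("22 cm") == size_class("8 in.")
likely_typo = pages_match(324, 342)
```

Here "22 cm" and "8 in." resolve to the same class, and 324 versus 342 pages is treated as a probable keying error rather than evidence of a different manifestation.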
Reintroducing GLIMIR: Music and Film
• “Cast list.”
• Dates.
• Scores, Parts, Scores and Parts.
Reintroducing GLIMIR: Same Search with GLIMIR Option Selected
Reintroducing GLIMIR: Cluster Holdings Information Displays on Each Bibliographic Record
Reintroducing GLIMIR: Acknowledgements
• Robert Bremer
• Ted Fons
• Janifer Gatenby
• Richard O. Greene
• Ying Li
• W. Michael Oskins
• Patricia Schuette Sexton
• Gail Thornburg
• Kelly Womble