CLiMB: Computational Linguistics for Metadata Building

CLiMB (Computational Linguistics for Metadata Building) is an interdisciplinary research project at Columbia University, funded by the Mellon Foundation for 2002-2004. Presentation from the Center for Research on Information Access (CRIA), Columbia University Libraries.



Presentation Transcript


  1. Center for Research on Information Access, Columbia University Libraries. CLiMB: Computational Linguistics for Metadata Building. CLiMB - Columbia University

  2. CLiMB: Interdisciplinary Research Project at Columbia University. Funded by Mellon Foundation 2002-2004
     • Center for Research on Information Access (CRIA)
     • Libraries
     • Computer Science Department

  3. Problems in Image Access
     • Cataloging digital images
     • Traditional approach: manual expertise
       • labor intensive
       • expensive
     • Can automated techniques help?

  4. Can we harvest image descriptors?
     • angled porch
     • v-shaped plan
     • sandstone boulders

  5. CLiMB Technical Contribution
     • CLiMB will identify and extract
       • proper nouns
       • terms and phrases
     • from text related to an image:

     "September 14, 1908, the basis of the Greenes' final design had been worked out. It featured a radically informal, V-shaped plan (that maintained the original angled porch) and interior volumes of various heights, all under a constantly changing roofline that echoed the rise and fall of the mountains behind it. The chimneys and foundation would be constructed of the sandstone boulders that comprised the local geology, and the exterior of the house would be sheathed in stained split-redwood shakes."
     (Edward R. Bosley. Greene & Greene. London : Phaidon, 2000. p. 127)
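The extraction step on this slide can be illustrated with a minimal sketch. It is not the CLiMB tool itself, whose internals the slides do not describe. It assumes two crude heuristics: runs of non-stopword tokens as candidate terms and phrases, and capitalized tokens that are not sentence-initial as proper-noun candidates. The stopword list and both function names are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real tool would use a fuller one.
STOPWORDS = {
    "a", "an", "and", "be", "been", "had", "in", "it", "of", "out",
    "that", "the", "would",
}

# A token is a run of letters, optionally joined by hyphens (e.g. "split-redwood").
TOKEN = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def candidate_phrases(sentence):
    """Split a sentence into runs of consecutive non-stopword tokens."""
    phrases, current = [], []
    # Punctuation also ends a run, so chunk each punctuation-delimited piece.
    for piece in re.split(r"[,.;:()]", sentence):
        for token in TOKEN.findall(piece):
            if token.lower() in STOPWORDS:
                if current:
                    phrases.append(" ".join(current))
                    current = []
            else:
                current.append(token.lower())
        if current:
            phrases.append(" ".join(current))
            current = []
    return phrases


def proper_noun_candidates(sentence):
    """Return capitalized tokens, skipping the sentence-initial one."""
    tokens = TOKEN.findall(sentence)
    return [t for t in tokens[1:] if t[0].isupper()]


quote = ("The chimneys and foundation would be constructed of the "
         "sandstone boulders that comprised the local geology")
print(candidate_phrases(quote))
print(proper_noun_candidates(
    "September 14, 1908, the basis of the Greenes' final design had been worked out."))
```

On the quoted passage this yields "sandstone boulders" among the candidate phrases and "Greenes" as a proper-noun candidate. Run on the next sentence of the quote, the capitalization heuristic would also return "V-shaped", which is why output like this still needs a human selection step.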

  6. Overall Goals
     • Research: Development of richer retrieval through increased numbers of descriptors
     • Practice: Development of a suite of CLiMB tools
     • Resources: Vocabulary list which can be used by other visual resource professionals
     The essence of CLiMB:
     • Use scholars themselves as “catalogers” by utilizing scholarly publications
     • Enhance existing descriptive metadata

  7. CLiMB Project Teams
     • Coordinating
     • Collections (Curatorial)
     • Technical
     • External Advisory

  8. CLiMB Committees
     • Coordinating: Judith Klavans, Stephen Davis, Angela Giral, Patricia Renfro, Bob Wolven
     • Curatorial: Judith Klavans, Stephen Davis, Angela Giral, Amy Heinrich, David Magier, Bob Scott, Bob Wolven, Roberta Blitz
     • Technical: Stephen Davis, Judith Klavans, Vera Horvath, David Elson, Roberta Blitz

  9. Squeezing Metadata out of Scholarly Texts
     • Image collection
     • Associated text
     • Target object identification (TOI)
     • CLiMB suite of tools
     • Evaluation

  10. [Workflow diagram] Columns: Phase, Inputs, CLiMB Processes, User Evaluation
      • Inputs: Source text, other texts, TOIs, test records, image collections
      • Phase I, process texts: Run CLiMB suite of tools; generate TEI markup; result: enriched XML
      • Phase II, select metadata from texts: Art librarians, subject specialists, catalogers, and search & retrieval experts select words & phrases to include in core descriptive records
      • Phase III, use CLiMB metadata in image search platform: End-users
      • Other diagram labels: AAT / BBIs / etc.; Core Descriptive Records; CLiMB Enriched Descriptive Records; Image Search Platform; Image Search Platform with CLiMB Metadata

  11. Squeezing Metadata out of Scholarly Texts
      • Image collection
      • Associated text
      • Target object identification (TOI)
      • CLiMB suite of tools
      • Evaluation

  12. CLiMB Collections
      • Greene & Greene Architectural Drawings, Avery Architectural and Fine Arts Library
      • Chinese Paper Gods, C.V. Starr East Asian Library
      • Photographs from the Archives, American Institute of Indian Studies

  13. Greene & Greene Architectural Records and Papers Collection. Drawings and Archives, Avery Architectural and Fine Arts Library, Columbia University Libraries

  14. Charles Sumner Greene (1868 - 1957) and Henry Mather Greene (1870 - 1954)

  15. NYDA.1960.001.00023: All Saints Episcopal Church (Pasadena, Calif.). Alterations, 1902-1903

  16. Greene & Greene Catalog Record
      • Author: Greene & Greene.
      • Title: [Mrs. Dudley P. Allen house, 1188 Hillcrest Avenue (Pasadena, Calif.). Alterations.] Residence of Mrs. Dudley P. Allen, 1188 Hillcrest Ave., Pasadena, Cal. [graphic] : Alteration / Greene & Greene, Architects.
      • Published: [1917]
      • Physical Details: 4 sheets : various media ; 87.8 x 57.3 cm. (34 5/8 x 22 5/8 in.)
      • Location: Columbia University, Avery Architectural Drawings
      • Other Authors: Greene, Charles Sumner, 1868-1957. Greene, Henry Mather, 1870-1954.
      • Subjects: Houses. Alterations. Architecture--Designs and plans--United States. Mrs. Dudley P. Allen house, 1188 Hillcrest Avenue (Pasadena, Calif.)
      • Component Item: [1] Item no. NYDA.1960.001.03224. [AVERYimage]. Electric lighting -- floor plan, part plan of basement : Sheet no.
      • Component Item: [2] Item no. NYDA.1960.001.00073. [AVERYimage]. [Electric lighting] -- floor plan, part plan of basement.

  17. Greene & Greene Bibliography
      • Bosley, Edward R. Greene & Greene. London : Phaidon, 2000.
      • Current, William R. Greene & Greene: architects in the residential style. Fort Worth [Tex.] : Amon Carter Museum of Western Art, [1974]
      • Makinson, Randell L. Greene & Greene: architecture as fine art. Salt Lake City : Peregrine Smith, c1977.
      • Makinson, Randell L. Greene & Greene: the passion and the legacy. Salt Lake City : Gibbs and Smith, c1998.
      • Smith, Bruce. Greene & Greene masterworks. San Francisco : Chronicle Books, c1998.
      • Strand, Janann. A Greene & Greene guide [Pasadena, Calif. : G. Dahlstrom, 1974]

  19. C.V. Starr East Asian Library, Columbia University. Chinese Paper Gods, Anne S. Goodrich Collection

  20. Pan-hu chih-shen: God of tigers

  21. Chinese Paper Gods Catalog Record
      • Title: Chuang gong chuang mu [graphic].
      • Published: [193-]
      • Physical Details: 1 print : wood-engraving, color ; 34 x 30 cm.
      • In: Anne S. Goodrich Collection.
      • Location: Columbia University, C.V. Starr East Asian Library (CJK) EAX GAC 1 no. 16
      • Subjects: Gods, Chinese, in art. Folk art--China.
      • Genre Or Form: Woodcuts--Chinese.
      • Notes: Date according to time period Anne S. Goodrich collected prints in Beijing.
      • Record ID: NYCP02-F20

  22. Chinese Paper Gods Bibliography
      • Day, Clarence Burton. Chinese peasant cults : being a study of Chinese paper gods. Taipei : Ch'eng Wen Pub. Co., 1974.
      • Goodrich, Anne Swann. Peking paper gods : a look at home worship. Nettetal : Steyler Verlag, 1991.
      • Laing, Ellen Johnston. Art and aesthetics in Chinese popular prints: selections from the Muban Foundation collection. Ann Arbor, MI : Center for Chinese Studies, University of Michigan, c2002

  23. Chinese gods: selection from LC Authority File
      • HEADING: Nezha (Chinese deity)
      • Used For/See From:
        • Daluoxian (Chinese deity)
        • Jinhuan Yuanshuai (Chinese deity)
        • Jinkang Yuanshuai (Chinese deity)
        • Li Nezha (Chinese deity)
        • Luoche Taizi (Chinese deity)
        • Ne Zha (Chinese deity)
        • Nezhataizi (Chinese deity)
        • No-cha (Chinese deity)
        • Nuozha (Chinese deity)
        • Tailuoxian (Chinese deity)
        • Taizi Yuanshuai (Chinese deity)
        • Taiziyeh (Chinese deity)
        • Yühuang Taizi (Chinese deity)
        • Zhongtan Yuanshuai (Chinese deity)
      • Search Also Under: Gods, Chinese
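One way an authority record like this could be put to work is as a lookup table from variant forms to the authorized heading. The sketch below is an illustration only, not a description of how CLiMB used the file. It assumes a simple normalization (strip diacritics, ignore case, spaces, and hyphens), uses a handful of the variants above with the "(Chinese deity)" qualifier dropped for brevity, and all function names are hypothetical.

```python
import unicodedata

# A few of the variant forms from the authority record, qualifier dropped for brevity.
AUTHORITY = {
    "Nezha": ["Li Nezha", "Ne Zha", "No-cha", "Nuozha", "Yühuang Taizi"],
}


def normalize(name):
    """Fold case, strip diacritics, and drop everything except letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in no_marks.casefold() if c.isalnum())


def build_index(authority):
    """Map each normalized heading or variant to its authorized heading."""
    index = {}
    for heading, variants in authority.items():
        for form in [heading, *variants]:
            index[normalize(form)] = heading
    return index


INDEX = build_index(AUTHORITY)


def lookup(name):
    """Return the authorized heading for a name, or None if it is not listed."""
    return INDEX.get(normalize(name))


print(lookup("NO-CHA"))
print(lookup("Yuhuang Taizi"))
print(lookup("Guanyin"))
```

The normalization step addresses the spacing, hyphenation, and diacritic differences between forms; genuinely different transliterations (for example "No-cha" versus "Nezha") are only matched because the authority record lists them explicitly.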

  26. Three Testbed Collections
      • Greene & Greene
        • detailed records
        • more difficult to associate text with image
      • Chinese Paper Gods
        • strong associations
        • problems with transliteration and variants
      • South Asian Temples
        • large set of digital images
        • diacritics and variants

  27. CLiMB Collections: Future
      • Additional collection of digital images
      • Close association between image and text
      • Regularized metadata
      Suggestions:
      • Catalogue raisonné
      • Museum collection catalog
      • Exhibition catalog

  28. Squeezing Metadata out of Scholarly Texts
      • Image collection
      • Associated text
      • Target object identification (TOI)
      • CLiMB suite of tools
      • Evaluation

  29. Target Object Identification (TOI)
      • Define based on institutional needs
      • Varies from collection to collection
        • Greene & Greene: Project
        • Chinese Paper Gods: Deity
        • South Asian Temples: Location & Temple
      • Compile authority list

  31. Project Name Matching
      • Locate project names in Greene & Greene
      • Challenge: finding variant name forms
        • Robert R. Blacker house (TOI)
        • Blacker estate
        • The house
      • Possible techniques to improve matching
        • Developing a semi-automatic technique
        • Use existing information to label text
        • An iterative platform for manual intervention
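The semi-automatic idea on this slide can be sketched as follows, under stated assumptions: project names from an authority list are reduced to their distinctive words, a mention is labeled automatically when it shares a distinctive word with exactly one project, and anything else is left for manual review. The list of generic words and both function names are hypothetical; the slides do not say how CLiMB implemented matching.

```python
import re

# Hypothetical set of words that cannot identify a project on their own.
GENERIC = {"the", "house", "estate", "residence", "mrs", "mr"}


def key_terms(project_name):
    """Distinctive words of a project name: not generic and longer than one letter."""
    tokens = re.findall(r"[A-Za-z]+", project_name.lower())
    return {t for t in tokens if len(t) > 1 and t not in GENERIC}


def label_mentions(mentions, project_names):
    """Label each mention with a project name, or None when a person must decide."""
    labels = {}
    for mention in mentions:
        words = set(re.findall(r"[A-Za-z]+", mention.lower()))
        matches = [p for p in project_names if key_terms(p) & words]
        # Only an unambiguous match is labeled automatically.
        labels[mention] = matches[0] if len(matches) == 1 else None
    return labels


projects = ["Robert R. Blacker house", "Mrs. Dudley P. Allen house"]
print(label_mentions(["Blacker estate", "The house"], projects))
```

Here "Blacker estate" resolves to the Robert R. Blacker house, while "The house" stays unlabeled: resolving it depends on surrounding context, which is where the iterative manual step would come in.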

  32. Squeezing Metadata out of Scholarly Texts
      • Image collection
      • Associated text
      • Target object identification (TOI)
      • CLiMB suite of tools
      • Evaluation

  33. CLiMB Suite of Tools: http://www.columbia.edu/cu/cria/climb/presentations.html

  34. Squeezing Metadata out of Scholarly Texts
      • Image collection
      • Associated text
      • Target object identification (TOI)
      • CLiMB suite of tools
      • Evaluation

  35. Next Steps: CLiMB Evaluation
      Current Developments
      • Meeting with experts, October 17th
      • Survey with experienced image searchers
      Long Term Goal
      • Test CLiMB tools and data in an image search platform

  36. CLiMB: Computational Linguistics for Metadata Building
      • Image collection
      • Associated text
      • Target object identification (TOI)
      • CLiMB suite of tools
      • Evaluation

  37. Thank you! Any questions? www.columbia.edu/cu/cria/climb
