
ARTstor A digital library of online collections for Education, research and scholarship

ARTstor A digital library of online collections for Education, research and scholarship. Digital Art History Workshop Malaga, Spain September 24, 2011 Dr. William Ying CIO and VP of Technology ARTstor. Willem van de Velde I: Calm Sea, Alte Pinakothek (Munich, Germany); Scala/Art Resource.


Presentation Transcript


  1. ARTstor A digital library of online collections for Education, research and scholarship Digital Art History Workshop Malaga, Spain September 24, 2011 Dr. William Ying CIO and VP of Technology ARTstor Willem van de Velde I: Calm Sea, Alte Pinakothek (Munich, Germany); Scala/Art Resource

  2. Application of a KOS in ARTstor and Shared Shelf: A Digital Library and a Networked Image Cataloguing and Management Solution Essential to the successful implementation and use of any digital library is the organization of that library by one or more knowledge organization systems (KOS). KOS include classification and categorization schemes that organize materials at a general level, subject headings that provide more detailed access, and authority files that control variant versions of key information. KOS also include highly structured vocabularies, such as thesauri, and less traditional schemes, such as semantic networks and ontologies. This presentation will explore how the ARTstor Digital Library has improved its usefulness to the scholarly community over the years by applying increasingly sophisticated knowledge organization systems.

  3. ARTstor Digital Library Today • Founded by The Andrew W. Mellon Foundation in 2001 • Independent nonprofit organization since 2003 • Launched on July 1, 2004 • 1.3 million images and growing • 200+ museum & special collections • Museums, archives, libraries, artists, artists’ estates, scholars, photographers • Password-protected database restricted to educational and scholarly users

  4. ARTstor is a repository of aggregated collections

  5. ARTstor is a network of educational and scholarly users

  6. ARTstor serves 1,350+ subscribing educational institutions and museums in 45 countries

  7. ARTstor is a workspace for research and teaching

  8. Commitment to research Research is not only the work of scientists. It is the work of any thoughtful person who exercises any profession…. The Andrew W. Mellon Foundation will continue to support research in the humanities and the arts in the belief that research as an activity is what we mean when we say that every mind should steadily be engaged in making “shape, order, meaning, purpose, where there was none, or none discernible.” - Don M. Randel, President, The Andrew W. Mellon Foundation, March 2011

  9. Curated collections for scholarship ARTstor keyword search – “Hercules”

  10. Differentiating general interest content from scholarly content Google images search – “Hercules”

  11. ARTstor collection types • ARTstor collection - 200+ museum & special collections • Hosted Institutions collections – collections maintained and cataloged locally by participating institutions and ingested and hosted in ARTstor • Personal collection – images uploaded and metadata maintained in ARTstor by individual users

  12. Yearly ARTstor repository growth by content type

  13. How can we help users find what they want? • Challenges: • Diverse collections come to us without uniform metadata • Metadata completeness differs tremendously between collection types • No standard way to name an artwork or creator

  14. Using Title to find records in ADL • ARTstor now uses “exact matches” for search • Jesus Carrying the Cross (The Way to Calvary)* • The Procession to Calvary • The Hunters in the Snow, also known as The Return of the Hunters* • Actually, we can use either title above to find the painting because both images, with their different titles, are in one cluster. • Would a “fuzzier” search help, e.g. matching 3 out of 4 words? • Un Enterrement à Ornans • Funeral at Ornans • A work registry with foreign-language titles would help! • Disambiguate simple search results
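The "match 3 out of 4 words" idea on this slide can be illustrated with a minimal Python sketch. This is not ARTstor's search code: the function names and the 0.75 threshold are assumptions made purely for illustration.

```python
import re


def title_words(title: str) -> set[str]:
    """Lowercase a title and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def fuzzy_title_match(query: str, title: str, min_fraction: float = 0.75) -> bool:
    """Return True when at least `min_fraction` of the query words occur in the title.

    The default of 0.75 mirrors "3 out of 4 words"; it is a hypothetical
    threshold, not a value taken from the presentation.
    """
    query_set = title_words(query)
    if not query_set:
        return False
    shared = query_set & title_words(title)
    return len(shared) / len(query_set) >= min_fraction


titles = ["The Hunters in the Snow", "The Return of the Hunters", "Funeral at Ornans"]
hits = [t for t in titles if fuzzy_title_match("Hunters in the Snow", t)]
```

Under these assumptions a partial English title finds the work, but the French title still does not match "Funeral at Ornans", which is why the slide suggests a work registry that carries foreign-language titles.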

  15. Best kept secret of ARTstor • The more you type in the search box, the less likely you are to find the artwork you want!

  16. Knowledge Organization for Digital Libraries • Essential to the successful implementation and use of any digital library is the organization of that library, either directly or indirectly, by one or more knowledge organization systems (KOS). • Knowledge organization systems include • classification and categorization schemes that organize materials at a general level, • subject headings that provide more detailed access, and • authority files that control variant versions of key information such as geographic names and personal names.

  17. How is ARTstor using Knowledge Organization Systems? • Simple search • Advanced search – Title, Creator, Date, Classification, Geography • Faceted search – Classification, Date, Geography • Browse – Geography, Classification • Topics – ARTstor created, User created • Cluster and Associated Images

  18. In the beginning • ARTstor started with 400,000 images • One image – one metadata record • NO KOS!!! • Plain old simple keyword search (POSKS)

  19. We even tried “advanced search” and it was a disaster! • We even allowed our users to do “advanced search” on many of the fields in the metadata record, such as “material” and “date”, which made no sense since the data were not normalized. The material could be O/C, oil on canvas, or 20 other combinations! • It was so bad that we had to remove the “Advanced search” feature!
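The normalization problem described on this slide can be made concrete with a small Python sketch. The synonym table and function name below are hypothetical; they only show why a field search on un-normalized "material" values fails and what a minimal normalization step might look like.

```python
# Hypothetical synonym table: every key is a spelling seen in raw records,
# every value is the single normalized form used for searching.
MATERIAL_SYNONYMS = {
    "o/c": "oil on canvas",
    "oil/canvas": "oil on canvas",
    "oil on canvas": "oil on canvas",
}


def normalize_material(raw: str) -> str:
    """Collapse whitespace, lowercase, drop a trailing period, then look up synonyms.

    Values missing from the table are returned in cleaned form, so they
    remain searchable even though they are not yet normalized.
    """
    key = " ".join(raw.lower().split()).rstrip(".")
    return MATERIAL_SYNONYMS.get(key, key)
```

With such a table, a search for "oil on canvas" would also find records catalogued as "O/C"; without it, each spelling is a separate value and most records are missed.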

  20. Next, the beginning of KOS: we started to “organize” our metadata and add “knowledge” • Added 16 classifications: While no one standard served as the basis for this classification scheme, the Metadata team consulted the Getty Art & Architecture Thesaurus (AAT), among other vocabularies, when formulating these categories. • Architecture and City Planning • Decorative Arts, Utilitarian Objects and Interior Design • Drawings and Watercolors • Fashion, Costume and Jewelry • Film, Audio, Video and Digital Art • Garden and Landscape • Graphic Design and Illustration • Humanities and Social Sciences • Manuscripts and Manuscript Illuminations • Maps, Charts and Graphs • Paintings • Performing Arts (including Performance Art) • Photographs • Prints • Science, Technology and Industry • Sculpture and Installations

  21. Date and Geography • The creation date in an ARTstor record is free text • A special algorithm and a manual process are used to create numeric earliest and latest dates • Standardized geographic terms have been applied to the descriptive data for ARTstor images, according to a controlled list of country names based on the Getty Thesaurus of Geographic Names (TGN). Geographic terms were assigned according to two different criteria: For site-specific works (architecture, mural painting, public monuments, etc.), the country term was assigned based on the location of the work. For objects now in repositories, country terms were assigned on the basis of the nationality of the creator. In cases where these two criteria overlapped (e.g. an American architect’s preparatory drawing for a building built in Spain), the records were assigned two country terms.
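The slide does not describe the date algorithm itself, so the following Python sketch is only a guess at what the automatic part might look like. The supported patterns, the function name, and the ten-year "circa" margin are all assumptions; anything the sketch cannot parse returns None, standing in for the manual process.

```python
import re
from typing import Optional, Tuple


def parse_date_range(text: str, circa_margin: int = 10) -> Optional[Tuple[int, int]]:
    """Turn a free-text creation date into (earliest, latest) years, or None.

    Handles only a few illustrative patterns: "1503", "1503-1506",
    "ca. 1500", "circa 1500" and "16th century". BCE dates, seasons,
    and abbreviated ranges are deliberately left for manual review.
    """
    s = text.strip().lower()

    # "16th century" -> (1501, 1600)
    m = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th) century", s)
    if m:
        century = int(m.group(1))
        return ((century - 1) * 100 + 1, century * 100)

    # Optional circa marker, a year, and an optional second year after a dash.
    m = re.fullmatch(r"(ca?\.?\s*|circa\s+)?(\d{3,4})(?:\s*[-\u2013]\s*(\d{3,4}))?", s)
    if m:
        start = int(m.group(2))
        end = int(m.group(3)) if m.group(3) else start
        if m.group(1):
            # Widen approximate dates by a hypothetical margin.
            start, end = start - circa_margin, end + circa_margin
        return (start, end)

    return None  # not recognized: hand over to manual cataloguing
```

Once every record has numeric earliest and latest years, date-range search and sorting by date become simple numeric comparisons.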

  22. Collection ranking • Every image record in a collection is assigned a collection ranking • Images with a high ranking show up first on the search results page

  23. Finally, the beginning of Meaningful Concept Display • Sorting in search result page: Date, Relevancy (collection ranking) • Advanced search • Title and/or creator • Date range • Geography • Classification • Dynamic Filtered search • Classification • Geography • Date

  24. Knowledge from Image Groups created by users • Simple IG – active sharing by browsing Public folder, Institution folder, password-protected folder etc. • Describe IG – active and searchable inside an institution • Describe IG – searchable and shared across institutions (future)

  25. Browsing • We can also browse by the newfound “knowledge” • Geography • Classification

  26. Browse by Featured Groups – knowledge created by ARTstor and Scholars • In each Sample Topic group, you will find iconic images mixed with other selections that are meant to trigger new ideas and provoke deeper research. Each Sample Topic is intended to be an inclusive, rather than comprehensive, introduction to a particular subject or discipline. We encourage you to use these Sample Topics to search and browse for more images related to these and other subjects in ARTstor. • The Travel Award winning groups are made available as excellent, user-contributed examples of integrating ARTstor images into teaching and research. We hope these groups will illustrate and inspire the cross-disciplinary application of ARTstor images.

  27. Clustered images • Through a lot of BST (blood, sweat and tears), images of the same work, whether duplicates or details, have been clustered behind a preferred image.

  28. Associated images (group knowledge) • We decided to follow Amazon’s item-to-item collaborative filtering approach • Using implicit data collection instead of external data, we use data captured in image groups created by instructor-grade users. We assume that each individual image group contains images that are related to each other. • We attempt to help our users find images related to the one they are interested in.

  29. How did we do it? • First, we pick all images that appear in at least X Image Groups created by instructor-grade users. • Next, for each of these images, we form a cluster that includes all images that appear in at least Y Image Groups together with this image. • Next, we combine all images that belong to the same duplicate cluster. Different users may use different versions of the Mona Lisa. • Next, we rank all images inside a cluster by how many times each image appears with the master image. • Next, we exclude all clusters that have fewer than Z members. • The result is the set of collaborative filtering clusters.
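The steps above can be sketched in Python as follows. This is an illustrative reading of the slide, not ARTstor's production code: the function name, the `duplicate_of` mapping, and the default thresholds are assumptions, and duplicates are merged up front rather than as a separate third pass, which gives the same counts with less bookkeeping.

```python
from collections import Counter, defaultdict
from itertools import permutations


def associated_images(groups, duplicate_of=None, x=2, y=2, z=1):
    """Build item-to-item clusters from instructor image groups.

    groups       -- iterable of image-id lists, one list per image group
    duplicate_of -- hypothetical map from an image id to the preferred
                    image of its duplicate cluster
    x, y, z      -- the thresholds X, Y and Z named on the slide
    """
    duplicate_of = duplicate_of or {}

    # Merge duplicates first: every image is replaced by its preferred image,
    # and a set removes repeats inside one group.
    canon_groups = [{duplicate_of.get(i, i) for i in g} for g in groups]

    # Count how many groups each image appears in, and how often each
    # ordered pair of images shares a group.
    appearances = Counter(i for g in canon_groups for i in g)
    together = defaultdict(Counter)
    for g in canon_groups:
        for a, b in permutations(g, 2):
            together[a][b] += 1

    clusters = {}
    for master, n_groups in appearances.items():
        if n_groups < x:  # step 1: master must appear in at least X groups
            continue
        # step 2: keep images seen with the master in at least Y groups
        members = [(img, c) for img, c in together[master].items() if c >= y]
        # step 4: rank by co-occurrence count (ties broken by image id)
        members.sort(key=lambda pair: (-pair[1], pair[0]))
        if len(members) >= z:  # step 5: drop clusters smaller than Z
            clusters[master] = [img for img, _ in members]
    return clusters


groups = [
    ["mona_lisa_a", "last_supper", "david"],
    ["mona_lisa_b", "last_supper"],
    ["mona_lisa_a", "last_supper", "david"],
    ["david", "pieta"],
]
clusters = associated_images(groups, duplicate_of={"mona_lisa_b": "mona_lisa_a"})
```

In this toy data the two Mona Lisa versions count as one image, so "last_supper" ranks first in its cluster, while "pieta" appears in only one group and never becomes a master image.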

  30. Name authority • We have created a system that uses ULAN (the Getty Union List of Artist Names) to normalize “creator” in the 1.4 million records in ARTstor. We are only 30% finished, and hundreds of thousands of new records are added to ARTstor every year. • The problem is that we cannot use ULAN as a major search facet in ARTstor, as most of the records would not be discoverable this way. • Instead, we have added variant names from ULAN to the ARTstor records that have been matched.

  31. Michelangelo and his variant names from ULAN • Buonarroti, Michelangelo (preferred,V,index), Michelangelo Buonarroti (V,display) • Michelangelo Buonarotti, Michelangelo, Michelagnolo di Lodovico Buonarroti Simoni, • Michelagniolo di Lodovico de Lionardo di Buonarroto Simoni • Michelagniolo di Lodovico di Lionardo di Buonarroto Simoni, Buonarroti, • Michel Angelo, Buonarroti, Michelagniolo, Bonarroti, Michelangelo, • Bonorotti, Michelangelo, Buonarota, Michelangelo, Michael Angelo Buonaroti, • Michael Angelo Buonarotti, Michelagniolo Buonarroti, Michelagnolo Buonarotti, • Michelagnolo Buonarruoti, Michelange Bonaroti, Michelang. o Bonarota • Michel Angel de Bonarrotta, Michelangelo Bonarota, Michelangelo Bonaroti • Michelangelo Buonarota, Michelangelo Buonaroti, Michelangelo Buonarrota • Michelangelo Buonnaroti, Michelangiolo Buonaroti, Micheleangelo Buonarota • Michel Angelo, Michel'Angelo, Michael Angelo, Michel Ange, Michel-Ange • Michel Aniol, Mighelagnolo, Miguel Angelo, Mikelandzhelo, Mikel-Andzhelo, Mikilanjilu
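Adding variant names to matched records, rather than searching by ULAN facet, can be sketched as a simple name index. The data structures and function name below are hypothetical; the point is that matched records become findable under every variant while unmatched records stay findable under their original creator string.

```python
def build_name_index(records, ulan_variants):
    """Map every lowercased creator name or variant to the record ids it should find.

    records       -- hypothetical dict of record id -> creator string as catalogued
    ulan_variants -- hypothetical dict of creator string -> ULAN variant names,
                     present only for creators already matched to ULAN
    """
    index = {}
    for record_id, creator in records.items():
        # Unmatched creators contribute only their own name.
        names = {creator} | set(ulan_variants.get(creator, []))
        for name in names:
            index.setdefault(name.lower(), set()).add(record_id)
    return index


records = {"r1": "Michelangelo Buonarroti", "r2": "Unknown workshop"}
ulan_variants = {"Michelangelo Buonarroti": ["Michel-Ange", "Miguel Angelo"]}
index = build_name_index(records, ulan_variants)
```

A search for "michel-ange" then reaches record r1, and r2 is still discoverable even though its creator has no ULAN match, which is the limitation the previous slide raises about using ULAN as a facet.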

  32. Using other external ontologies • From Freebase's ontology, we can further extract: • Period name, start and end dates • Creator names, birth and death dates, belonging to a period

  33. External data source • Once the two data sources were worked out, • Freebase's creator data was mapped to ULAN creator data. • Thus, creators were grouped by period.
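The mapping described here can be sketched in Python as below. The slides do not say how the Freebase-to-ULAN match was computed, so the sketch simply takes the mapping as given; all names and sample data are hypothetical.

```python
from collections import defaultdict


def group_creators_by_period(freebase_creators, freebase_to_ulan):
    """Group ULAN creators by the period Freebase assigns to them.

    freebase_creators -- hypothetical dict of Freebase creator name -> period name
    freebase_to_ulan  -- hypothetical dict of Freebase creator name -> ULAN
                         preferred name, covering only creators already mapped
    """
    by_period = defaultdict(list)
    for fb_name, period in freebase_creators.items():
        ulan_name = freebase_to_ulan.get(fb_name)
        if ulan_name is not None:  # skip creators with no ULAN match yet
            by_period[period].append(ulan_name)
    return {period: sorted(names) for period, names in by_period.items()}


freebase_creators = {
    "Michelangelo": "High Renaissance",
    "Raphael": "High Renaissance",
    "Gustave Courbet": "Realism",
}
freebase_to_ulan = {
    "Michelangelo": "Buonarroti, Michelangelo",
    "Gustave Courbet": "Courbet, Gustave",
}
periods = group_creators_by_period(freebase_creators, freebase_to_ulan)
```

Because ARTstor records are matched to ULAN creators, a grouping like this would let a period name lead indirectly to the records of every mapped creator in that period.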

  34. Freebase, ULAN and ARTstor relationship [Diagram: Freebase Period and Creator, ULAN Creator, and ARTstor records, with links labeled “By computer” and “By hand”]
