
Using Pivots to Explore Heterogeneous Collections



Presentation Transcript


  1. Using Pivots to Explore Heterogeneous Collections A Case Study in Musicology Daniel Alexander Smith 8 December 2009

  2. musicSpace http://mspace.fm/projects/musicspace • IAM Group, School of Electronics and Computer Science • Music, School of Humanities

  3. Outline • How musicologists use data • Limitations of existing approaches • Our data extraction and integration methodology • Interface walkthrough

  4. musicSpace Tasks • Triage data partners' sources • Extract information • Map data sources to schemas/ontologies • Produce interface over aggregated data • Customise interface based on feedback

  5. Data in Musicology

  6. Musicologists consult many data sources

  7. . . . but what if they could use just one?

  8. Intractable research questions • Which scribes have created manuscripts of a composer’s works, and which other composers’ works have they inscribed? • Which poets have had their poems set to music by Schubert, which of these musical settings were only published posthumously, and where can I find recordings of them? • Which electroacoustic works were published within five years of their premiere?

  9. Why they are intractable (1) • Need to consult several sources • Metadata from one source cannot be used to guide searches of another source • Solution: Integrate sources

  10. Why they are intractable (2) • They are multi-part queries, and need to be broken down with results collated manually • Requires pen and paper! • Solution: Optimally interactive UI

  11. Why they are intractable (3) • Insufficient granularity of metadata and/or search options • Solution: Increase granularity

  12. Metadata Extraction

  13. Previous work • Comb-e-chem modelled Chemistry data • We use similar approach • Translated this work to the arts • Musicology modelled using Semantic Web technologies

  14. Musicology Data Sources • Disparate data • How to pull them together and view on demand

  15. musicSpace Data Partners

  16. Data and Info Management problems • Sources allow searching, but not over everything • Data export (MARC typically) shows extra fields, e.g. characters in opera, document types hidden amongst metadata • Sometimes viewable on original site, but not searchable • Offering extracted metadata already a benefit with one source

  17. Grove Extraction Example • More complicated, as Grove is a full text encyclopaedia • Some digitisation via Grove Music Online • Weak semantic metadata extraction • Thus we performed some data entry

  18. Grove Works Lists Source Data

  19. Works List Metadata Tool

  20. Data Integration

  21. Integration • Domain Expert + Technologist partnership • This will be the case for some time to come • Technology to best automate tasks to make domain expert’s job less onerous

  22. Metadata mapping • Domain experts devise single schema • Provide mappings of fields in a particular data source to that unified schema • Enables an interface across all sources
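The mapping step on this slide can be illustrated with a small sketch. It is not the musicSpace implementation: the source names, the field names, and the unified schema below are all hypothetical, chosen only to show how per-source mappings let one interface read records from several sources.

```python
# Hypothetical per-source mappings: source field name -> unified schema field.
# None of these names come from the musicSpace project; they are illustrative.
MAPPINGS = {
    "source_a": {"author": "composer", "title": "work_title"},
    "source_b": {"creator": "composer", "name": "work_title", "scribe": "copyist"},
}


def to_unified(source, record):
    """Rename the fields of one source record into the unified schema.

    Fields without a mapping are dropped, so anything the unified
    schema does not cover is lost at this step.
    """
    mapping = MAPPINGS[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


# Made-up sample records from two sources describing the same work.
unified = [
    to_unified("source_a", {"author": "Composer X", "title": "Work 1", "shelfmark": "MS 1"}),
    to_unified("source_b", {"creator": "Composer X", "name": "Work 1", "scribe": "Copyist P"}),
]
```

In this sketch the `shelfmark` field of the first record has no mapping and is silently discarded, which is the kind of coverage gap that forces the schema and its mappings to be revised when a source carries information the unified schema did not anticipate.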

  23. Downside • New source comes online with information not covered by unified schema • Have to make changes to all mappings to ensure accurate coverage

  24. New Approach: Pivoting • Marking up a single source, versus pushing all to a single schema • Use a pivot instead to situate metadata for integration • Essentially means that the interface does the heavy lifting of integration • Reduced effort by domain experts

  25. Interface Video

  26. Interface Video • Find a composer • See all copyists of their manuscripts • Choose a copyist and see which other composers that copyist has worked on
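The walkthrough on this slide amounts to two lookups chained together, which a minimal sketch can make concrete. The records, names, and the `pivot` helper below are invented for illustration and assume manuscript metadata has already been reduced to composer/copyist pairs; the actual interface and its data model are not described in the slides.

```python
from collections import defaultdict

# Made-up manuscript records; real data would come from the integrated sources.
MANUSCRIPTS = [
    {"composer": "Composer A", "copyist": "Copyist X"},
    {"composer": "Composer A", "copyist": "Copyist Y"},
    {"composer": "Composer B", "copyist": "Copyist X"},
    {"composer": "Composer C", "copyist": "Copyist Z"},
]


def build_index(records, key, value):
    """Group the `value` field of each record under its `key` field."""
    index = defaultdict(set)
    for record in records:
        index[record[key]].add(record[value])
    return index


copyists_by_composer = build_index(MANUSCRIPTS, "composer", "copyist")
composers_by_copyist = build_index(MANUSCRIPTS, "copyist", "composer")


def pivot(composer, copyist):
    """Return the other composers whose manuscripts the chosen copyist worked on."""
    if copyist not in copyists_by_composer.get(composer, set()):
        raise ValueError(f"{copyist} is not a known copyist of {composer}")
    return composers_by_copyist[copyist] - {composer}


# Step 1: find a composer and list the copyists of their manuscripts.
copyists = sorted(copyists_by_composer["Composer A"])
# Step 2: choose a copyist and pivot to the other composers they copied.
others = pivot("Composer A", "Copyist X")
```

Keeping one index per direction means each step of the walkthrough is a single dictionary lookup, so the user never has to collate intermediate results by hand.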

  27. Thank you http://ecs.soton.ac.uk/projects/musicspace ds@ecs.soton.ac.uk
