An Iterative Approach to Building Sustainable Repository Services on Fedora Open Repositories 2009, May 19, 2009
Outline
• Organizational overview and background
  • Claire Stewart, Head, Digital Collections
• Winterton Collection project
  • Karen Miller, Monographic and Digital Projects Cataloger, Bibliographic Services
• Iterative approach
  • Bill Parod, Repository Architect, Enterprise Systems
Repository Implementation Group project schedule (timeline figure, May 2008)
Winterton Collection cataloging
• Full cataloging for each of the 76 original collections, and at the container level (album, envelope, etc.) for collections of more than one container.
• Individual photographs are (generally) not fully cataloged. Each records only:
  • Title
  • Note (optional)
  • Publisher or Creator (if available)
Full cataloging included:
• Title
• Dates of coverage
• Abstract
• Scope and contents description
• Biographical or historical note
• Physical description (size of album, how many pages, photos, etc.)
• Subject headings
Providing cataloging at the album level means that:
• Many individual photographs will not be described concisely by the subject headings assigned.
• Some subject headings may not apply at all to some photographs.
Transcribing only the photograph titles causes problems such as these for keyword searching:
• Non-English words are not translated
• People are referred to in captions by their initials, not their names
• Animals are referred to by given name, not by species
• Captions are non-descriptive
Repository Development Strategy
1. Implement models and services for ingest, preservation, and access of core content.
2. Provide tools for staff to ingest and manage repository content.
3. Facilitate integration of repository materials with end-user tools and services.
4. Iterate…
Draw Detailed Requirements from Project Commitments:
A) OAI-ORE Annotation of OCA texts
B) Cross Collection Search Project
C) Winterton Photography Collection
D) Kirtas Mounting Books Project
E) EAD Initiative
F) Hesler Photography Collection
G) Chemical Bulletin
H) Fava Masks
I) Curator-driven Digitization Project
J) Charlotte Moorman / Prgm. African Studies Audio
Inventory Content Types
1) EAD encoded finding aids
2) TEI encoded text transcriptions
3) High resolution images
4) Virtual crops of high resolution images
5) Page imaged books
6) 3D objects
7) Aggregations: full text, fielded, and faceted search
8) Audio
9) Video
Services by Content Type
• Text Service
• Image Service
• Metadata Conversion Service
• Discovery Service
Text Service

TEI Objects
TEI Disseminator Methods:
  getTOC
  getImageTextTOC
  getStructuredTextTOC
  getHeader(xml:id)
  getHeading
  getChunk(xml:id)
  getPageByNumber(pageOrdinal)
  getPageByID(xml:id)
  reindex
Datastreams: DC, MARCXML, DejaVu, Book ORE REM, Page Image ORE REM, TEI, RELS-EXT

EAD Objects
EAD Disseminator Methods:
  getEADHeader
  getComponentAsHTML(unitid)
  getComponentStructure
  getChildComponents(unitid)
  getComponents
  getComponentStructure(unitid)
  getAncestorComponents(unitid)
  getComponentChildrenAsJSON(unitid)
  getComponentAsEmbeddedHTML(unitid)
  getComponent(unitid)
  getElementById(xml:id)
  getArchDescNoComponents
  getElementsByName(element_name)
  getDigest(unitid)
  getComponentAsDC(unitid)
  getComponentAsMODS(unitid)
  reindex
Datastreams: DC, MODS, EAD, EAD to DC XSL, EAD to MODS XSL, EAD to HTML XSL, EAD to HTML Frag XSL, EAD Children to JSON XSL, RELS-EXT
EAD Objects
EAD Service methods:
  getEADHeader
  getComponentAsHTML(unitid)
  getComponentStructure
  getChildComponents(unitid)
  getComponents
  getComponentStructure(unitid)
  getAncestorComponents(unitid)
  getComponentChildrenAsJSON(unitid)
  getComponentAsEmbeddedHTML(unitid)
  getComponent(unitid)
  getElementById(xml:id)
  getArchDescNoComponents
  getElementsByName(element_name)
  getDigest(unitid)
  getComponentAsDC(unitid)
  getComponentAsMODS(unitid)
  reindex
Datastreams: DC, MODS, EAD, EAD to DC XSL, EAD to MODS XSL, EAD to HTML XSL, EAD to HTML Frag XSL, EAD Children to JSON XSL, RELS-EXT
Text Service Stack and Enhancement Options

Fedora Text Disseminator
  getComponent: unitid
  getComponentAsHTML: unitid
  getComponentAsDC: unitid
  getComponentAsMODS: unitid
  ....
  reindex
  Enhancement options: Add Fedora Disseminator Methods; Add Fedora Disseminators

SGREPServlet
  Encapsulate query syntax
  XSLT optional on query result
  Enhancement options: Add/Modify XSLT Processing on Retrieval; Add/Modify SGREP Queries

SGREP: Executable program on service host
  Enhancement option: Replace Retrieval Software

Examples:
• EAD “Digest” - C0n + title/id of children and ancestors
• JSON support for EXT-JS
• HTML design iteration
• EAD to MODS conversion maturation
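The layering on this slide can be sketched in a few lines of Python. This is an illustration of the design idea only, not the Fedora or SGREP servlet code: the class names (`RetrievalBackend`, `InMemoryBackend`, `TextServlet`, `TextDisseminator`), the `component:` query syntax, and the string wrapper standing in for an XSLT transform are all hypothetical.

```python
from typing import Callable, Optional, Protocol


class RetrievalBackend(Protocol):
    """Bottom layer: whatever software pulls a fragment out of the XML."""

    def fetch(self, query: str) -> str: ...


class InMemoryBackend:
    """Hypothetical stand-in for the retrieval program on the service host."""

    def __init__(self, fragments: dict[str, str]) -> None:
        self._fragments = fragments

    def fetch(self, query: str) -> str:
        return self._fragments[query]


class TextServlet:
    """Middle layer: hides the query syntax and optionally transforms the result."""

    def __init__(self, backend: RetrievalBackend) -> None:
        self._backend = backend

    def component(self, unitid: str,
                  transform: Optional[Callable[[str], str]] = None) -> str:
        query = f"component:{unitid}"  # hypothetical syntax, private to this layer
        raw = self._backend.fetch(query)
        return transform(raw) if transform else raw


class TextDisseminator:
    """Top layer: the stable method names that clients call."""

    def __init__(self, servlet: TextServlet) -> None:
        self._servlet = servlet

    def getComponent(self, unitid: str) -> str:
        return self._servlet.component(unitid)

    def getComponentAsHTML(self, unitid: str) -> str:
        # A trivial string wrapper stands in for an XSLT transform here.
        return self._servlet.component(
            unitid, transform=lambda xml: f"<div>{xml}</div>")


backend = InMemoryBackend({"component:c01": "<c01>Album 1</c01>"})
service = TextDisseminator(TextServlet(backend))
print(service.getComponentAsHTML("c01"))
```

Because clients only see the top-layer method names, swapping `InMemoryBackend` for another retrieval program, or changing the transform, would not change the calls they make. That is the property each enhancement option on the slide relies on.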
Cropped Image
Single Image File Referenced By Crop Information:

<svg:svg xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg">
  <svg:image x="0" y="0" width="10656" height="7992" xlink:href="inu-wint/inu-wint-22.30.jp2">
    <svg:clipPath>
      <svg:rect x="0" y="1166" width="8034" height="6036"/>
    </svg:clipPath>
  </svg:image>
</svg:svg>
CroppedPhoto
Single Image File Referenced By Crop Information:

<svg:svg xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg">
  <svg:use xlink:href="http://repository.library.northwestern.edu/fedora/get/inu:inu-wint-22-30/DELIV-OPS">
    <svg:clipPath>
      <svg:rect x="4246" y="1436" width="2997" height="2518"></svg:rect>
    </svg:clipPath>
  </svg:use>
</svg:svg>
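As a rough sketch of how a client might read this crop information, the Python below parses the CroppedPhoto SVG shown on the slide and extracts the referenced source and the clip rectangle. The function name `read_crop` is made up for illustration, and the slides do not show how the repository's own image servlet reads the datastream.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# SVG copied from the CroppedPhoto slide.
CROP_SVG = """<svg:svg xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg">
  <svg:use xlink:href="http://repository.library.northwestern.edu/fedora/get/inu:inu-wint-22-30/DELIV-OPS">
    <svg:clipPath>
      <svg:rect x="4246" y="1436" width="2997" height="2518"></svg:rect>
    </svg:clipPath>
  </svg:use>
</svg:svg>"""


def read_crop(svg_text):
    """Return (source_href, (x, y, width, height)) from a crop SVG document."""
    root = ET.fromstring(svg_text)
    # The source is referenced by svg:use (crop objects) or svg:image (image objects).
    ref = root.find(f"{{{SVG_NS}}}use")
    if ref is None:
        ref = root.find(f"{{{SVG_NS}}}image")
    if ref is None:
        raise ValueError("no svg:use or svg:image element found")
    href = ref.get(f"{{{XLINK_NS}}}href")
    rect = ref.find(f"{{{SVG_NS}}}clipPath/{{{SVG_NS}}}rect")
    if rect is None:
        raise ValueError("no clip rectangle found")
    box = tuple(int(rect.get(name)) for name in ("x", "y", "width", "height"))
    return href, box


href, box = read_crop(CROP_SVG)
print(href)
print(box)
```

The same function also handles the `svg:image` form from the Cropped Image slide, since it falls back to that element when no `svg:use` is present.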
Image and Crop Objects

Image Service methods (supported by both image and crop objects):
  getWithWidth(width)
  getWithLongSide(length)
  getWithHeight(height)
  getCropWithWidth(x, y, width, height, destwidth)
  getCropWithHeight(x, y, width, height, destheight)
  getCropWithSize(x, y, width, height, destwidth, destheight)
  getWithSize(destwidth, destheight)

Image Object Datastreams: DC, MODS, PREMIS, SVG, TIFF, EXIF, JP2, MIX_TIFF, MIX_JP2, RELS-EXT
Crop Object Datastreams: DC, MODS, PREMIS, SVG, RELS-EXT
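The slides do not say how the image service derives output dimensions. As a worked illustration only, assume `getWithLongSide(length)` scales the image so that its longer side equals `length` while preserving the aspect ratio, rounding to the nearest pixel. Both the scaling rule and the rounding are assumptions, and `scale_to_long_side` is a hypothetical helper. For the 2997 by 2518 crop rectangle from the CroppedPhoto slide, the arithmetic would be:

```python
def scale_to_long_side(width, height, length):
    """Scale (width, height) so the longer side equals `length`, keeping aspect ratio.

    Assumed behaviour for illustration; the rounding rule is also an assumption.
    """
    if width <= 0 or height <= 0 or length <= 0:
        raise ValueError("dimensions must be positive")
    factor = length / max(width, height)
    # Clamp to at least one pixel so very thin images never collapse to zero.
    return max(1, round(width * factor)), max(1, round(height * factor))


# Crop rectangle size taken from the CroppedPhoto slide.
print(scale_to_long_side(2997, 2518, 150))  # (150, 126)
```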
http:/.../fedora/get/inu:inu-wint-22-30-2/inu:sdef-addimage/getWithLongSide?length=150
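The URL above follows the pattern base/get/object PID/service definition PID/method?parameters. A small helper that assembles such URLs might look like the sketch below. The function name `dissemination_url` is hypothetical, and the host `example.org` is a placeholder because the slide elides the real one.

```python
from urllib.parse import quote, urlencode


def dissemination_url(base, pid, sdef, method, **params):
    """Build <base>/get/<pid>/<sdef>/<method>?<params>, following the slide's URL shape."""
    # Keep ':' unescaped so PIDs such as inu:sdef-addimage stay readable.
    path = "/".join(quote(part, safe=":") for part in (pid, sdef, method))
    url = f"{base.rstrip('/')}/get/{path}"
    if params:
        url += "?" + urlencode(params)
    return url


# Placeholder host; the PIDs and the method name are taken from the slide.
url = dissemination_url(
    "http://example.org/fedora",
    "inu:inu-wint-22-30-2",
    "inu:sdef-addimage",
    "getWithLongSide",
    length=150,
)
print(url)
```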
Image Service Stack and Enhancement Options

Fedora Image Disseminator
  getWithWidth(width)
  getWithLongSide(length)
  getWithHeight(height)
  getCropWithSize(x, y, width, height, destwid…)
  Enhancement options: Add Fedora Disseminator Methods; Add Fedora Disseminators

Image Servlet
  Encapsulate rendering parameters
  Object specific rendering parameters (SVG)
  User request rendering parameters
  Rendering service parameters and location
  Enhancement options: Add/Modify Rendering Options; Add/Modify Rendering Service Parameters

Rendering Service: Aware, Djatoka
  Enhancement option: Replace Rendering Software

Examples:
• Added getLongSide(length)
• Added rotation
• Optimized rendering parameters
• Rendering features - vector overlay
• Object reference chaining
• Djatoka experimentation
Image/Crop Objects
Image Service methods:
  getWithWidth(width)
  getWithLongSide(length)
  getWithHeight(height)
  getCropWithWidth(x, y, width, height, destwidth)
  getCropWithHeight(x, y, width, height, destheight)
  getCropWithSize(x, y, width, height, destwidth, destheight)
  getWithSize(destwidth, destheight)
Datastreams: DC, MODS, PREMIS, SVG, TIFF, EXIF, JP2, MIX_TIFF, MIX_JP2, RELS-EXT

EAD Objects
EAD Service methods:
  getEADHeader
  getComponentAsHTML(unitid)
  getComponentStructure
  getChildComponents(unitid)
  getComponents
  getComponentStructure(unitid)
  getAncestorComponents(unitid)
  getComponentChildrenAsJSON(unitid)
  getComponentAsEmbeddedHTML(unitid)
  getComponent(unitid)
  getElementById(xml:id)
  getArchDescNoComponents
  getElementsByName(element_name)
  getDigest(unitid)
  getComponentAsDC(unitid)
  getComponentAsMODS(unitid)
  reindex
Datastreams: DC, MODS, EAD, EAD to DC XSL, EAD to MODS XSL, EAD to HTML XSL, EAD to HTML Frag XSL, EAD Children to JSON XSL, RELS-EXT
SOLR
• MODS described collections
• Metadata conversion services
• Faceting
• “Searchable” Interface
• MODS Collection Datastream
• Facet list
• Field List Searching
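To make the facet list and field list concrete, here is a minimal sketch of assembling a faceted Solr request. The parameters `q`, `rows`, `wt`, `facet`, `facet.field`, and `fq` are standard Solr query parameters. The field names (`subject_facet`, `collection_facet`), the sample search term and filter value, the `/solr/select` path, and the helper name `solr_facet_query` are assumptions for illustration, not taken from the project.

```python
from urllib.parse import urlencode


def solr_facet_query(text, facet_fields, filters=None, rows=10):
    """Return the query string for a faceted Solr select request."""
    params = [("q", text), ("rows", rows), ("wt", "json"), ("facet", "true")]
    # One facet.field parameter per field in the collection's facet list.
    params += [("facet.field", field) for field in facet_fields]
    # Each selected facet value narrows the result set through a filter query.
    params += [("fq", f'{field}:"{value}"')
               for field, value in (filters or {}).items()]
    return urlencode(params)


# Hypothetical field names; real ones would come from the collection's facet list.
query = solr_facet_query(
    "railway",
    ["subject_facet", "collection_facet"],
    filters={"collection_facet": "Winterton"},
)
print("/solr/select?" + query)
```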
Project Checklist
A) OAI-ORE Annotation of OCA texts
B) Cross Collection Search Project
C) Winterton Photography Collection
D) Kirtas Mounting Books Project
E) EAD Initiative
F) Hesler Photography Collection
G) Chemical Bulletin
H) Fava Masks
I) Curator-driven Digitization Project
J) Charlotte Moorman / Prgm. African Studies Audio