Hey! What About Access? A guide to practical decision making. Roy Tennant California Digital Library escholarship.cdlib.org/rtennant/presentations/2001sfs/. Reality Check. It’s OK — you will survive We aren’t all Harvard, Univ. of Michigan, or Cornell
Hey! What About Access? A guide to practical decision making Roy Tennant California Digital Library escholarship.cdlib.org/rtennant/presentations/2001sfs/
Reality Check • It’s OK — you will survive • We aren’t all Harvard, Univ. of Michigan, or Cornell • You do not need to be a scanning expert to do good work • If you do not intend to throw away the originals (scanning for access rather than preservation), the requirements are less
Selection • Choose as if you will never have another chance (because you may not) • Choose for critical mass, not an exhibit • Consider technical challenges and limitations
Acquisition • Scanning is sexy… • …but outsourcing is foxy • Scan at the highest resolution and bit depth you can afford • You can afford more than you realize: • Flatbed scanners are inexpensive and getting more so • Disk space is very inexpensive (fast approaching $1/GB) • RAM is cheap
Organization • Metadata is important • My definition of metadata… • How you acquire and store it is as important as what you acquire • You can get by with less than you think, but be thankful for all you can get
Metadata: Appropriate Level • Collection-level access: • Discovery metadata describes the collection • Example: Archival finding aid encoded in SGML; see http://www.oac.cdlib.org/ • Item-level access: • Discovery metadata describes the item • Example: MARC or Dublin Core records for each item; see http://jarda.cdlib.org/search.html • Both types of access may be appropriate (“more is better”) • Doing both often takes very little extra effort
Collection Level Access • [Diagram: Search & Browse Interface; Individual Finding Aids; Images]
Item Level Access • [Diagram: Search & Browse Interface; Finding Aids; Individual Finding Aids; Images]
Combined Access • [Diagram: Main Entry Point; Search & Browse Interfaces; Individual Finding Aids; Images]
Metadata: Granularity • <name>William Randolph Hearst</name> • <name> <first>William</first> <middle>Randolph</middle> <last>Hearst</last></name> • Consider all uses for the metadata • Design for the most granular use (slice and dice as small as you can stand) • Store it in a machine-parseable format
Metadata: Machine Parseability • The ability to pull apart and reconstruct metadata via software • For example, this:<name> <first>William</first> <middle>Randolph</middle> <last>Hearst</last></name> • Can easily become:<DC.creator>Hearst, William Randolph</DC.creator>
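The transformation on this slide can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree`. This is only an illustration, not the tooling used in the talk; the function name `to_dc_creator` is hypothetical, and the sketch assumes the `<first>`, `<middle>`, and `<last>` children shown above.

```python
import xml.etree.ElementTree as ET

# The granular name element from the slide.
GRANULAR = """<name>
  <first>William</first>
  <middle>Randolph</middle>
  <last>Hearst</last>
</name>"""


def to_dc_creator(name_xml: str) -> str:
    """Rebuild a granular <name> as an inverted-order <DC.creator> element."""
    name = ET.fromstring(name_xml)
    # findtext returns None for a missing child, so fall back to "".
    first = (name.findtext("first") or "").strip()
    middle = (name.findtext("middle") or "").strip()
    last = (name.findtext("last") or "").strip()

    given = " ".join(part for part in (first, middle) if part)
    creator = ET.Element("DC.creator")
    creator.text = f"{last}, {given}" if given else last
    return ET.tostring(creator, encoding="unicode")


print(to_dc_creator(GRANULAR))
# prints: <DC.creator>Hearst, William Randolph</DC.creator>
```

Going the other way, from a single unstructured string to separate name parts, would require guessing where the family name begins, which is why the granular form is the safer one to store.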
Metadata: Qualification • <name role="creator">William Randolph Hearst</name> • <subject scheme="LCSH">Builder -- Castles -- Southern California</subject>
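One way to see what qualification buys you: software can select values by role or scheme instead of guessing from the text. The sketch below is a minimal Python illustration. The wrapping `<record>` element, the unqualified `<subject>` line, and the helper `select` are assumptions added for the example, not part of the slide.

```python
import xml.etree.ElementTree as ET

# The two qualified elements from the slide, wrapped in a hypothetical
# <record> element, plus one unqualified subject added for contrast.
RECORD = """<record>
  <name role="creator">William Randolph Hearst</name>
  <subject scheme="LCSH">Builder -- Castles -- Southern California</subject>
  <subject>castles</subject>
</record>"""


def select(root, tag, **qualifiers):
    """Return the text of every <tag> whose attributes match all qualifiers."""
    return [
        element.text
        for element in root.iter(tag)
        if all(element.get(key) == value for key, value in qualifiers.items())
    ]


root = ET.fromstring(RECORD)
print(select(root, "name", role="creator"))
print(select(root, "subject", scheme="LCSH"))
print(select(root, "subject"))  # no qualifier: every subject, controlled or not
```

With the qualifiers in place, a search interface could, for example, offer a browse list built only from the LCSH headings.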
Metadata: Schema • Which one? • Dublin Core • CDWA • MARC • VRA • EAD
Metadata: Organization • The schema or software you use to store it doesn’t matter • What does matter is that you: • Capture the quantity required for your purposes • Capture it at the granularity required for your purposes • Use appropriate vocabularies, if any • Qualify the metadata where required • Store it in a machine-parseable format • Can output it in any format required for interoperability with those important to you
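The last point, being able to output any format you need, is easiest when the stored record is granular and qualified. The following Python sketch keeps one record in a plain dictionary and emits it two ways. The record layout, the title "Example item", the `<record>` wrapper, and the function names are all hypothetical; the `DC.`-prefixed element names simply follow the style of the earlier slide.

```python
import json
import xml.etree.ElementTree as ET

# A hypothetical stored record: granular name, qualified subject.
record = {
    "title": "Example item",
    "creator": {"first": "William", "middle": "Randolph", "last": "Hearst"},
    "subjects": [
        {"scheme": "LCSH", "value": "Builder -- Castles -- Southern California"}
    ],
}


def inverted_name(name):
    """Format a granular name as 'Last, First Middle'."""
    given = " ".join(p for p in (name.get("first"), name.get("middle")) if p)
    return f"{name['last']}, {given}" if given else name["last"]


def to_dc_xml(rec):
    """Emit the record as simple DC-style XML."""
    root = ET.Element("record")
    ET.SubElement(root, "DC.title").text = rec["title"]
    ET.SubElement(root, "DC.creator").text = inverted_name(rec["creator"])
    for subject in rec["subjects"]:
        element = ET.SubElement(root, "DC.subject", scheme=subject["scheme"])
        element.text = subject["value"]
    return ET.tostring(root, encoding="unicode")


def to_json(rec):
    """Emit the same record as JSON, keeping its full structure."""
    return json.dumps(rec, sort_keys=True)


print(to_dc_xml(record))
print(to_json(record))
```

Neither output format dictates how the record is stored, which is the slide's point: the storage choice matters less than keeping enough structure to produce whatever a partner system asks for.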
Metadata: Generic Database Software • A continuum of options from free/simple to expensive/powerful • Depending on the size and complexity of your project, you will have several options • Decision criteria: ability to provide required access and output options, complexity to implement, ease of creating and maintaining records, cost, support options, etc. • [Figure: products plotted along an axis running from Less to More approximate complexity, cost, power, etc.: MS Access, FileMaker Pro; Oracle; Sybase; ht://Dig, SWISH-E; Sprite; MySQL]
Metadata: Image Databases • CONTENTdm - contentdm.com/ • Luna Insight - www.lunaimaging.com/ • Univ. of Michigan DLXS - www.umdl.umich.edu/
Metadata: Getting By • The minimum descriptive metadata you can get by with is the minimum needed for retrieval: A title? A description? A few keywords? Plus the location. • Typically, you can capture much more, even if only by setting standard defaults • And if you think you can’t afford to capture good metadata, then you likely can’t afford to digitize to begin with
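Setting standard defaults can be as simple as the Python sketch below: the cataloger supplies only a title and a location, and project-wide values are filled in automatically. The default field names and values, the file path, and the function `make_record` are hypothetical and would differ for any real project.

```python
# Hypothetical project-wide defaults, applied to every record.
DEFAULTS = {
    "type": "image",
    "language": "en",
    "rights": "Contact the holding institution",
}


def make_record(title, location, **fields):
    """Build a record from the bare minimum plus defaults and any extras."""
    if not title.strip() or not location.strip():
        raise ValueError("a title and a location are the minimum for retrieval")
    record = dict(DEFAULTS)     # start from the defaults
    record.update(fields)       # item-specific values override defaults
    record["title"] = title.strip()
    record["location"] = location.strip()
    return record


item = make_record("Example item", "images/0001.tif", keywords=["castle"])
print(item)
```

Each record then carries five or six fields for the effort of typing two or three.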
Hardware and Software • The single most expensive part of your project is…people! • Don’t allow your inexpensive resource (hardware and software) to waste the time of your expensive resource (staff) • RAM and hard disk storage are very inexpensive; offline storage presents even more options
Hardware and Software: What’s the Least You Can Get By With? • [a computer] • $100 - flatbed scanner (1200x2400dpi/36 bit) • $100 - Paintshop Pro [$500 Adobe Photoshop] • $100 - more RAM • $15 - grayscale target • $500 - OCR software (optional) • $400 - sheet feeder (optional) • TOTAL: $315 - $1,615
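The range in the total appears to work out as follows, assuming the low end uses Paintshop Pro with no optional items and the high end swaps in Adobe Photoshop and adds both optional items. That reading is an inference from the figures on the slide; the short Python check below shows the arithmetic.

```python
# Prices as listed on the slide (the computer itself is not priced).
required = {
    "flatbed scanner": 100,
    "Paintshop Pro": 100,
    "more RAM": 100,
    "grayscale target": 15,
}
# Assumption: the high end replaces Paintshop Pro ($100) with Photoshop ($500).
photoshop_upgrade = 500 - required["Paintshop Pro"]
optional = {"OCR software": 500, "sheet feeder": 400}

low = sum(required.values())
high = low + photoshop_upgrade + sum(optional.values())

print(f"TOTAL: ${low:,} - ${high:,}")
# prints: TOTAL: $315 - $1,615
```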
Interoperability • The ability of disparate systems to communicate and operate as if they were one • Track national and international developments • Be prepared to adopt emerging standards and best practices • Be prepared to take advantage of the adoption of these standards by others
Access • Think about access before you begin your project: • Who will be using it? • What will they want to do with it? • Which terms will they think of to describe what they want to do? • Think flexibility — the ability to tailor the interface to specific needs, often on-the-fly according to conditions or parameters • Basic principle: the data remains inviolate, but the depiction of it is conditional
Basic Access Model • [Diagram: Information; Transformation; Presentation] • Variables, e.g.: • need • user • location
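The principle that the data remains inviolate while its depiction is conditional can be sketched as a read-only record plus a transformation function that builds a view from variables such as need and user. Everything named in this Python example (the record fields, the `"brief"`/`"full"` and `"public"`/`"staff"` values, and the function `present`) is hypothetical.

```python
from types import MappingProxyType

# The stored information: wrapped read-only so presentation code cannot alter it.
RECORD = MappingProxyType({
    "title": "Example item",
    "creator": "Hearst, William Randolph",
    "location": "images/0001.tif",
})


def present(record, need="brief", user="public"):
    """Transformation step: build a view of the record for one situation."""
    fields = ("title",) if need == "brief" else ("title", "creator")
    view = {key: record[key] for key in fields}
    if user == "staff":
        # Staff also see where the master file lives.
        view["location"] = record["location"]
    return view


print(present(RECORD))
print(present(RECORD, need="full", user="staff"))
```

Each call returns a new dictionary, so two users can see different depictions of the same item while the underlying record stays exactly as it was captured.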
General Advice • When it comes to access, more is better • Standards and best practices matter • There are several ways you can succeed (and countless ways you can fail…) • Never put data into a system that cannot cough it up again in as structured a format as it went in • Develop incrementally (never underestimate the power of a prototype) • Don’t do perfectly what you can do almost perfectly in half the time
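The advice about systems that can cough data up again suggests a simple acceptance test for any candidate system: put a structured record in, get it back out, and compare. The Python sketch below does this against an in-memory SQLite database storing JSON text. Both the storage choice and the sample record are stand-ins chosen for the example, not a recommendation from the talk.

```python
import json
import sqlite3

# A hypothetical structured record.
record = {
    "title": "Example item",
    "creator": {"first": "William", "middle": "Randolph", "last": "Hearst"},
}


def round_trips(rec):
    """Store a record, read it back, and report whether its structure survived."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, body TEXT)")
    con.execute("INSERT INTO items (body) VALUES (?)", (json.dumps(rec),))
    (body,) = con.execute("SELECT body FROM items").fetchone()
    con.close()
    return json.loads(body) == rec


print(round_trips(record))
# prints: True
```

A system that only returned the creator as one display string would fail this check, because the separate name parts could not be recovered.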
What I Want You To Remember • Capture at the highest resolution and bit depth you can afford • Describe at the most granular and specific level you can stand • Provide access using the most flexible methods you can manage • Never underestimate the power of a $100 scanner and a person with good intentions!