Explore Servlet/JSP Model in digital library applications, how MySQL and postgreSQL differ, proposed DB schema for archaeology/genealogy, and legacy data ingestion techniques.
Archivists’ Toolkit Preliminaries: Architecture, DB (Leslie Myrick, NYU)
Possible Java Architecture • JSP Model 2 Architecture • Servlet Controller • Handles requests, View selection, instantiates beans • JSPs update the View in the browser • JavaBeans used to represent the object in memory; access DB using JDBC • manage the Model • JDBC connection to the data source
Similar Use of Servlet/JSP Model in Digital Library Applications • DSpace • UC Berkeley’s GenX system • CDL Preservation Repository
JSP Model 2 • Cleanest separation of presentation and content • Clear delineation of roles of developers and designers • Takes advantage of strengths of servlets and JSPs for serving dynamic content • JSP for presentation layer • Servlets for performing process-intensive tasks • Servlet as Controller in charge of request processing, creation of beans or objects used by JSPs, and forwarding the request to the appropriate JSP • No processing logic in JSPs: they are simply responsible for retrieving objects or beans instantiated by servlets
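The division of labour described above can be sketched in plain Java. This is only an illustration, not the Toolkit's code: it deliberately avoids the Servlet API so that it runs standalone, and the names `ObjectBean`, `View`, `Controller` and the request parameter `id` are all hypothetical. In a real application the controller would extend `HttpServlet` and forward to a JSP through a `RequestDispatcher`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical bean: holds the model data for one archival object. */
class ObjectBean {
    private final int objectId;
    private final String title;

    ObjectBean(int objectId, String title) {
        this.objectId = objectId;
        this.title = title;
    }

    int getObjectId() { return objectId; }
    String getTitle() { return title; }
}

/** Stand-in for a JSP: renders whatever the controller placed in the request scope. */
interface View {
    String render(Map<String, Object> requestScope);
}

/** Stand-in for the servlet controller: processes the request, creates the bean, selects a view. */
class Controller {
    private final Map<String, View> views = new HashMap<>();

    Controller() {
        // The views contain no processing logic; they only read what the controller prepared.
        views.put("show", scope -> {
            ObjectBean bean = (ObjectBean) scope.get("object");
            return "<h1>" + bean.getTitle() + "</h1><p>ID " + bean.getObjectId() + "</p>";
        });
        views.put("error", scope -> "<p>" + scope.get("message") + "</p>");
    }

    String handle(Map<String, String> params) {
        Map<String, Object> requestScope = new HashMap<>();
        String id = params.get("id");
        if (id == null || !id.matches("\\d{1,9}")) {
            requestScope.put("message", "Missing or invalid id");
            return views.get("error").render(requestScope);
        }
        // A real bean would be populated through JDBC; the title is hard-coded here.
        requestScope.put("object",
            new ObjectBean(Integer.parseInt(id), "Series I: Documentary Material"));
        return views.get("show").render(requestScope);
    }
}

class Model2Sketch {
    public static void main(String[] args) {
        Controller controller = new Controller();
        System.out.println(controller.handle(Map.of("id", "3")));
        System.out.println(controller.handle(Map.of()));
    }
}
```

All branching happens in `Controller.handle`; the two views only format data they are handed, which is the property Model 2 is meant to enforce.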
JSP Model 1 • Bulk of processing performed by JSP • Process requests and draw view • Fine for simple applications
MySQL vs postgreSQL • Both ACID compliant (transaction safe) • Both support referential integrity (as of MySQL 4.x) • MySQL faster; postgreSQL more robust • Finer grained locking in postgreSQL • Multi-Version Concurrency Control (MVCC) in postgreSQL • Want triggers? Views? Inheritance? For now go with postgreSQL • MySQL has built-in full-text search capability • Ease of installation and maintenance – MySQL hands down.
The ACID test • Atomicity - All elements of a given transaction take place or none do. • Consistency - Each transaction transforms the database from one valid state to another valid state. • Isolation - The effects of a transaction are not visible to other transactions in the system until it is complete. • Durability - Once a transaction has been committed, its effects are permanent, even if the system crashes or a disk dies.
Proposed DB Schema: Archaeology / Genealogy • Ultimately based on MOA II model • With refinements to NYU’s zeroDB schema for digital object metadata • Torqued to describe archival objects and their digital surrogates • Same essential hook: pure Aristotelian hierarchy
It all comes down to object • Pivotal entity is object nesting other objects • objectType can be fonds, collection, component • componentType can be series, file, item, accretion • Object hierarchy maintained through: • objectID, parentID, nextSibID
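To make the role of the three linking columns concrete, here is a minimal Java sketch that recovers the ordered children of an object from `objectID`, `parentID` and `nextSibID` alone. The class and method names are hypothetical, and the slides do not say how "no next sibling" is stored, so the sketch assumes `0` for that. The sample rows reuse IDs from the example output later in the deck; row 126 is inferred from the `nextSibID` shown there.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ObjectHierarchy {

    /** Assumption: 0 in nextSibID means "no further sibling". */
    static final int NONE = 0;

    /** One row of the object table, reduced to the three linking columns. */
    static final class Row {
        final int objectId;
        final int parentId;
        final int nextSibId;

        Row(int objectId, int parentId, int nextSibId) {
            this.objectId = objectId;
            this.parentId = parentId;
            this.nextSibId = nextSibId;
        }
    }

    /** Returns the IDs of the children of parentId, in sibling order. */
    static List<Integer> orderedChildren(List<Row> rows, int parentId) {
        Map<Integer, Row> children = new HashMap<>();
        Set<Integer> pointedTo = new HashSet<>();
        for (Row r : rows) {
            if (r.parentId == parentId) {
                children.put(r.objectId, r);
                pointedTo.add(r.nextSibId);
            }
        }
        // The first child is the one that no sibling names as its nextSibID.
        Row current = null;
        for (Row r : children.values()) {
            if (!pointedTo.contains(r.objectId)) {
                current = r;
                break;
            }
        }
        List<Integer> ordered = new ArrayList<>();
        // The size check guards against a malformed, cyclic sibling chain.
        while (current != null && ordered.size() < children.size()) {
            ordered.add(current.objectId);
            current = children.get(current.nextSibId);
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row(3, 2, 126),
            new Row(126, 2, NONE),
            new Row(4, 3, NONE),
            new Row(6, 4, NONE),
            new Row(5, 4, 6));
        System.out.println(orderedChildren(rows, 2)); // [3, 126]
        System.out.println(orderedChildren(rows, 4)); // [5, 6]
    }
}
```

Storing `nextSibID` rather than a sequence number means a component can be inserted between two siblings by updating a single existing row.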
Physical Location Tables
CREATE TABLE physLoc (
  physLocID int(11) NOT NULL auto_increment,
  physLocLevelID int(11) NOT NULL default '0',
  physLocTypeID int(11) NOT NULL default '0',
  physLoc varchar(128) NOT NULL default '',
  isPublic tinyint(1) unsigned NOT NULL default '0',
  PRIMARY KEY (physLocID)
);
--
-- Data for table 'physLocType'
--
INSERT INTO physLocType (physLocType) VALUES ('accession location');
INSERT INTO physLocType (physLocType) VALUES ('processing location');
INSERT INTO physLocType (physLocType) VALUES ('shelflist location');
INSERT INTO physLocType (physLocType) VALUES ('offsite location');
--
-- Data for table 'physLocLevel'
--
INSERT INTO physLocLevel (physLocLevel) VALUES ('repository');
INSERT INTO physLocLevel (physLocLevel) VALUES ('internal location');
INSERT INTO physLocLevel (physLocLevel) VALUES ('physical container');
Ingest of Legacy Data from MARCXML • Student Programmers’ Assignment • Will probably involve JAXP/DOM • Conversion already undertaken: records from Innopac iiirecord DTD to marc21slim schema; tape .mrc to MARCXML using MARC4J
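As a rough idea of what the JAXP/DOM side of such an assignment might look like, the sketch below pulls a single subfield out of a marc21slim record. It is an assumption-laden illustration rather than the planned ingest code: the class name `MarcXmlTitle`, its methods and the sample record are all made up for the example; only the namespace URI and the `datafield`/`subfield` element names come from the MARC21slim schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class MarcXmlTitle {
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    /** Parses a MARCXML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse MARCXML", e);
        }
    }

    /** Returns the first subfield with the given code inside a datafield with the given tag, or null. */
    static String subfield(Document doc, String tag, String code) {
        NodeList fields = doc.getElementsByTagNameNS(MARC_NS, "datafield");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if (!tag.equals(field.getAttribute("tag"))) {
                continue;
            }
            NodeList subs = field.getElementsByTagNameNS(MARC_NS, "subfield");
            for (int j = 0; j < subs.getLength(); j++) {
                Element sub = (Element) subs.item(j);
                if (code.equals(sub.getAttribute("code"))) {
                    return sub.getTextContent().trim();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical record, not taken from the converted data.
        String xml = "<record xmlns=\"" + MARC_NS + "\">"
            + "<datafield tag=\"245\" ind1=\"1\" ind2=\"0\">"
            + "<subfield code=\"a\">Sample collection papers</subfield>"
            + "</datafield></record>";
        System.out.println(subfield(parse(xml), "245", "a"));
    }
}
```

The value returned for field 245 subfield a would be the natural candidate for a title row, though the mapping from MARC fields to the schema is not specified in these slides.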
Ingest of Legacy Data from EAD • Testbed creation tool • XSLT with Java Extensions using Xalan • Get nextID from database • Extensions instantiate and increment DBID, parentID, nextSibID for each component in <dsc> • Write out to .sql file to dump into DB
<xalan:component prefix="counter" elements="init incr" functions="read">
  <xalan:script lang="javaclass" src="xalan://MyCounter"/>
</xalan:component>

<xsl:template match="/">
  <counter:init name="index"/>
  <xsl:call-template name="dsc"/>
</xsl:template>

<xsl:template name="dsc">
  <xsl:for-each select="ead/archdesc/dsc">
    <!-- the dsc takes the current counter value; its c01 children use it as parentID -->
    <xsl:variable name="dsc-parentID"><xsl:value-of select="counter:read('index')"/></xsl:variable>
    <counter:incr name="index"/>
    <xsl:for-each select="c01">
      DBID: <xsl:value-of select="counter:read('index')"/>
      PARENTID <xsl:value-of select="$dsc-parentID"/>
      Series: c01-<xsl:number/>
      Unittitle: <xsl:apply-templates select="did/unittitle"/>
      Abstract: <xsl:apply-templates select="did/abstract"/>
      <xsl:if test="./child::scopecontent">
        Scopecontent:<xsl:for-each select="scopecontent/p"><xsl:apply-templates select="."/></xsl:for-each>
      </xsl:if>
      <!-- remainder of the c01 handling is not shown on the slide -->
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>
DBID: 3 PARENTID 2
Series: c01-1
Unittitle: Series I: Documentary Material

DBID: 4 PARENTID:3
Subseries: c02-1
Unittitle: Subseries A: Subjects

DBID:5 PARENTID: 4
Subseries: c03-1
Box: 1 Folder: 1
Unittitle: Advertising
Unitdate:undated

DBID:6 PARENTID: 4
Subseries: c03-2
Box: 1 Folder: 2-6
Unittitle: Art & Collecting
Unitdate: undated
DBID: 3 PARENTID: 2 NEXTSIBID: 126
Series: c01-1
Unittitle: Series I: Documentary Material

INSERT INTO OBJECT (objectID, parentID, nextSibID, hasChildren, componentTypeID) VALUES (3,2,126,1,1);
INSERT INTO TITLE (titleID, titleTypeID, title, objectID) VALUES (NULL,1,"Series I: Documentary Material",3);
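The numbering in the output above can also be produced without XSLT extensions. The following is a minimal Java sketch using JAXP/DOM, not the testbed tool itself: it assigns IDs in document order, so the `nextSibID` of a component is simply the next free ID once its whole subtree has been numbered. Several things are assumptions: the class name `DscIdAssigner`, the use of `0` for "no next sibling", EAD without a namespace, and the omission of `componentTypeID`, whose mapping the slides do not give.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DscIdAssigner {
    /** Assumption: 0 marks "no next sibling"; the slides do not show the real convention. */
    static final int NO_SIBLING = 0;

    static final class Row {
        final int objectId;
        final int parentId;
        final boolean hasChildren;
        int nextSibId = NO_SIBLING;

        Row(int objectId, int parentId, boolean hasChildren) {
            this.objectId = objectId;
            this.parentId = parentId;
            this.hasChildren = hasChildren;
        }

        String toSql() {
            return "INSERT INTO OBJECT (objectID, parentID, nextSibID, hasChildren) VALUES ("
                + objectId + "," + parentId + "," + nextSibId + "," + (hasChildren ? 1 : 0) + ");";
        }
    }

    private int nextId;
    private final List<Row> rows = new ArrayList<>();

    private DscIdAssigner(int firstId) {
        this.nextId = firstId;
    }

    /** Direct child components in the numbered form: c01, c02, ... */
    private static List<Element> components(Element parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().matches("c\\d\\d")) {
                result.add((Element) n);
            }
        }
        return result;
    }

    private void walk(Element parent, int parentId) {
        List<Element> kids = components(parent);
        for (int i = 0; i < kids.size(); i++) {
            Row row = new Row(nextId++, parentId, !components(kids.get(i)).isEmpty());
            rows.add(row);
            walk(kids.get(i), row.objectId);
            // After the subtree is numbered, the next free ID belongs to the next sibling.
            row.nextSibId = (i + 1 < kids.size()) ? nextId : NO_SIBLING;
        }
    }

    /** The dsc itself takes firstId; its components are numbered from firstId + 1. */
    static List<Row> assign(String eadXml, int firstId) {
        try {
            Element dsc = (Element) DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(eadXml)))
                .getElementsByTagName("dsc").item(0);
            if (dsc == null) {
                throw new IllegalArgumentException("No <dsc> element found");
            }
            DscIdAssigner assigner = new DscIdAssigner(firstId);
            int dscId = assigner.nextId++;
            assigner.walk(dsc, dscId);
            return assigner.rows;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse EAD", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical skeleton with the same shape as the sample output: c01 > c02 > two c03.
        String ead = "<ead><archdesc><dsc>"
            + "<c01><c02><c03/><c03/></c02></c01>"
            + "<c01/>"
            + "</dsc></archdesc></ead>";
        for (Row row : assign(ead, 2)) {
            System.out.println(row.toSql());
        }
    }
}
```

Starting from 2, the first `c01` receives ID 3 with parent 2, matching the sample output; its `nextSibID` here is 7 rather than 126 only because the skeleton is far smaller than the real finding aid.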