Learn how VuFind's record handling was redesigned to centralize MARC-specific code, allow for multiple metadata formats, and make customization and maintenance easier.
VuFind Beyond MARC: Discovering Everything Else • Demian Katz, VuFind Developer • demian.katz@villanova.edu
How VuFind Used to Work • MARC records were loaded into Solr. • Data parsed to fields for searching/faceting. • Full binary record stored in “fullrecord” field. • Solr was used for retrieving records. • VuFind’s PHP code made heavy use of “fullrecord” data for building displays.
What’s wrong with that? • MARC must die. • Not all searchable documents are MARC. • Code for pulling data from MARC is ugly.
Redesign Goals • Centralize MARC-specific code so it can be easily replaced. • Use stored Solr fields whenever possible. • Allow arbitrary metadata formats to coexist peacefully. • Make no assumptions about metadata content.
The Solution: Record Drivers • A class interface for displaying a document retrieved from Solr. • A new Solr field tells VuFind which Record Driver to instantiate for each document. • A default Record Driver can be written to display a document based solely on stored Solr fields.
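The driver-selection idea on this slide can be sketched as a small lookup with a fallback. This is an illustration in Python rather than VuFind's actual PHP code; the field name `recordtype`, the `DRIVERS` registry, and the `describe` method are assumptions made for the example, not names taken from the slides.

```python
from typing import Any, Dict, Type


class IndexRecord:
    """Default driver: relies only on stored Solr fields."""

    def __init__(self, fields: Dict[str, Any]) -> None:
        self.fields = fields

    def describe(self) -> str:
        # Hypothetical helper, used here only to show which driver was chosen.
        return f"index record: {self.fields.get('title', '[untitled]')}"


class MarcRecord(IndexRecord):
    """Driver for documents that also carry a full MARC record."""

    def describe(self) -> str:
        return f"MARC record: {self.fields.get('title', '[untitled]')}"


# Hypothetical registry keyed by the value of the driver-selection field.
DRIVERS: Dict[str, Type[IndexRecord]] = {"marc": MarcRecord}


def driver_for(doc: Dict[str, Any], type_field: str = "recordtype") -> IndexRecord:
    """Pick a driver from the document's type field, else use the default."""
    driver_class = DRIVERS.get(doc.get(type_field, ""), IndexRecord)
    return driver_class(doc)
```

A document whose type field is missing or unrecognized still gets a usable display through the default driver, which is what lets arbitrary metadata formats coexist in one index.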
One Key Design Decision • What should the Record Driver class contain: data-oriented methods (getTitle, getAuthor, etc.) or screen-oriented methods (getSearchResult, getStaffView, etc.)?
The Answer: All of the Above
interface RecordInterface
    public getSearchResult()
    public getStaffView()
    …
class IndexRecord implements RecordInterface
    protected getAuthor()
    protected getTitle()
    …
class MarcRecord extends IndexRecord
    protected getAuthor()
    protected getTitle()
    …
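One way to read this layering: the public, screen-oriented methods are built from the protected, data-oriented ones, so a subclass changes the display by overriding only the data methods. The sketch below is Python, not VuFind's PHP. The method bodies are invented for illustration, and the `marc` dictionary is a stand-in assumption for a parsed MARC record, since real MARC parsing is out of scope here.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class RecordInterface(ABC):
    """Screen-oriented contract that every record driver must satisfy."""

    @abstractmethod
    def get_search_result(self) -> str: ...

    @abstractmethod
    def get_staff_view(self) -> str: ...


class IndexRecord(RecordInterface):
    """Builds every screen from stored Solr fields alone."""

    def __init__(self, fields: Dict[str, Any]) -> None:
        self.fields = fields

    # Data-oriented methods (a leading underscore marks them as protected).
    def _get_title(self) -> str:
        return self.fields.get("title", "")

    def _get_author(self) -> str:
        return self.fields.get("author", "")

    # Screen-oriented methods, composed from the data-oriented ones.
    def get_search_result(self) -> str:
        title, author = self._get_title(), self._get_author()
        return f"{title} / {author}" if author else title

    def get_staff_view(self) -> str:
        return "\n".join(f"{key}: {value}" for key, value in sorted(self.fields.items()))


class MarcRecord(IndexRecord):
    """Overrides data methods only; the screens are inherited unchanged."""

    def _get_title(self) -> str:
        # Assumption: "marc" holds a pre-parsed {tag+subfield: value} mapping.
        return self.fields.get("marc", {}).get("245a") or super()._get_title()
```

A local customization that only needs a different title source overrides a single method and leaves the rest of the display logic alone.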
Record Driver Benefits • Large-scale changes are possible. • Small-scale changes are easy. • Allows object-specific behaviors. • Eases maintenance of local customizations.
Next Problem… • Where’s the data? • MARC records traditionally come from an ILS export. • SolrMarc traditionally takes care of populating VuFind’s Solr index.
Growing the Toolkit • The toolkit approach is important! • Problems to solve: obtaining records from remote sources, processing harvested files, and indexing arbitrary XML.
Tool #1: OAI-PMH Harvester • Purpose of tool: harvest metadata files from an OAI-PMH server into a directory. • Key feature: ID manipulation. • Key feature: delete support.
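A minimal sketch of the two key features, assuming an OAI-PMH ListRecords response is already in hand as a string. It is Python, not VuFind's own harvester. The function name, its parameters, and the `deleted.txt` file are hypothetical; only the OAI-PMH namespace and the `status="deleted"` header attribute come from the protocol itself.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List, Tuple

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest_response(
    xml_text: str, target_dir: Path, id_search: str, id_replace: str
) -> Tuple[List[str], List[str]]:
    """Save each live record's metadata to target_dir; collect deleted IDs."""
    target_dir.mkdir(parents=True, exist_ok=True)
    saved: List[str] = []
    deleted: List[str] = []
    root = ET.fromstring(xml_text)
    for record in root.iterfind(".//oai:record", OAI_NS):
        header = record.find("oai:header", OAI_NS)
        if header is None:
            continue
        raw_id = header.findtext("oai:identifier", default="", namespaces=OAI_NS)
        # ID manipulation: rewrite the remote identifier into a local one.
        local_id = re.sub(id_search, id_replace, raw_id.strip())
        # Delete support: the protocol flags removed records in the header.
        if header.get("status") == "deleted":
            deleted.append(local_id)
            continue
        metadata = record.find("oai:metadata", OAI_NS)
        if metadata is None or len(metadata) == 0:
            continue
        file_name = re.sub(r"[^A-Za-z0-9._-]", "_", local_id) + ".xml"
        ET.ElementTree(metadata[0]).write(
            str(target_dir / file_name), encoding="utf-8", xml_declaration=True
        )
        saved.append(local_id)
    if deleted:
        # Hypothetical convention: one deleted ID per line for a later step.
        (target_dir / "deleted.txt").write_text("\n".join(deleted) + "\n", encoding="utf-8")
    return saved, deleted
```

Rewriting IDs at harvest time keeps identifiers from different repositories from colliding once they share a single index.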
Tool #2: Batch Import Scripts • Purpose of tool: process all metadata files in a directory. • Easily achieved with Windows batch or Unix shell scripting. • Several sample scripts ship with VuFind.
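The slide has batch or shell scripts in mind; the same loop is sketched here in Python to show the shape of the task. The `import_one` callback stands in for whatever importer is being run, and moving finished files into a `processed` subdirectory is an assumption of this sketch, not a documented behavior of the scripts that ship with VuFind.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def import_directory(
    source_dir: Path, import_one: Callable[[Path], None], pattern: str = "*.xml"
) -> List[Path]:
    """Run import_one on every matching file; return the files that failed."""
    processed_dir = source_dir / "processed"
    processed_dir.mkdir(exist_ok=True)
    failed: List[Path] = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(source_dir.glob(pattern)):
        try:
            import_one(path)
        except Exception as error:
            # Leave the file in place so it can be inspected and retried.
            print(f"skipping {path.name}: {error}")
            failed.append(path)
            continue
        shutil.move(str(path), str(processed_dir / path.name))
    return failed
```

Because successfully imported files leave the source directory, rerunning the function only retries what failed.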
Tool #3: XSLT Importer • Purpose of tool: with XSLT, map an XML document to a Solr document based on VuFind’s schema. • Key feature: PHP integration • Key feature: Aperture support • Several sample XSLT documents ship with VuFind (DSpace, OJS, VuDL).
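To show what the mapping step produces, here is a sketch that turns a Dublin Core record into a Solr `<add><doc>` update message. It is not XSLT: a plain Python field map stands in for the stylesheet, and the PHP integration and Aperture features are not shown. The Solr field names (`id`, `recordtype`, `title`, `author`, `topic`) and the `FIELD_MAP` table are assumptions for the example; check them against the schema of the VuFind version in use.

```python
import xml.etree.ElementTree as ET
from typing import Dict

DC_NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical mapping: Solr field name -> Dublin Core element.
FIELD_MAP: Dict[str, str] = {
    "title": "dc:title",
    "author": "dc:creator",
    "topic": "dc:subject",
}


def to_solr_add(source_xml: str, record_id: str, record_type: str) -> str:
    """Map one source record to a Solr XML update message."""
    source = ET.fromstring(source_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    def add_field(name: str, value: str) -> None:
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value

    add_field("id", record_id)
    # The record type is what later selects the record driver.
    add_field("recordtype", record_type)
    for solr_field, dc_path in FIELD_MAP.items():
        # Repeated source elements become repeated (multi-valued) fields.
        for element in source.iterfind(f".//{dc_path}", DC_NS):
            text = (element.text or "").strip()
            if text:
                add_field(solr_field, text)
    return ET.tostring(add, encoding="unicode")
```

An XSLT stylesheet expresses the same mapping declaratively, which is why a new source format can usually be supported by writing a new stylesheet rather than new code.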
Parting Thoughts • Understanding Record Drivers gives you a lot of control over VuFind. • VuFind should be able to index practically anything with a bit of effort. • Don’t be afraid to build your own tools!
More Information • VuFind: http://vufind.org • Demian Katz: demian.katz@villanova.edu