DEF System Architecture XML Web Services. Fedora and the Zebra Search Engine in an OAI Eprints Application. by Gert Schmeltz Pedersen, DTV gsp@dtv.dk - +45 4525 7244. Contents. XML Web Services and the 3-tier Architecture The DEF Eprints Service DEF-XWS Eprints Generic Search Service
DEF System Architecture: XML Web Services
Fedora and the Zebra Search Engine in an OAI Eprints Application
by Gert Schmeltz Pedersen, DTV
gsp@dtv.dk - +45 4525 7244
Contents
• XML Web Services and the 3-tier Architecture
• The DEF Eprints Service
• DEF-XWS Eprints
• Generic Search Service
• Repository Federation
DEF-XWS project suite
"XML Web Services and the 3-tier architecture" is a project suite within the programme area System Architecture at Denmark's Electronic Research Library (DEFF) (http://defxws.cvt.dk), a collaboration with The Royal Library, The State and University Library, Aarhus Business School Library, and others.
• Gain hands-on experience with web services.
• Gain hands-on experience with Fedora.
• Use Fedora to implement a web service version of DEF Eprints (international eprints metadata harvested from Open Archives, a DEF project carried out at DTV).
• Add full text indexing and retrieval.
DEFF 3-tier Service-Oriented Architecture
[Figure: 3-tier diagram with a web browser, a central portal and a local portal, local services and common services, and multiple databases.]
The DEF Eprints Service
Architecture of the DEF Eprints Service Provider
[Figure: architecture diagram. Labels: Open Archives Initiative Data Providers; OAI-PMH (Protocol for Metadata Harvesting); OAI Harvester; OAI Manager; MySQL; Export; Zebra servers (full set, subset); Web UIs w/Z39.50; Z39.50; DEF Portal User; InfoNet User; Librarian; Eprint Service Provider.]
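The harvesting step in the diagram uses OAI-PMH. The following is a minimal sketch of how a harvester might build a `ListRecords` request and read record identifiers and the resumption token from a response. It is not the DEF harvester itself: the function names, the base URL `http://example.org/oai`, and the sample response are assumptions made up for illustration. Only the `verb`, `metadataPrefix`, and `resumptionToken` arguments and the response namespace come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Made-up response, trimmed to the elements the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A real harvester would fetch each URL, store the records, and repeat until the response carries no resumption token.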
DEF-XWS Eprints
[Figure: the DEF Eprints architecture extended with a Fedora server. Labels: Open Archives Initiative Data Providers; OAI-PMH; OAI Harvester; OAI Manager; MySQL; Export; Zebra servers (full set, subset); Web UIs w/Z39.50; Z39.50; DEF Portal User; InfoNet User; Librarian; Eprint Service Provider; Batch ingest; Fedora server; Zebra server; Full text retrieval; SOAP/REST; Web UI w/SOAP (Java); Web UI w/REST (PHP); AppXYZ w/SOAP (Perl); DEF-XWS Eprints User; AppXYZ User.]
DEF-XWS Eprints
ZebraForFedora, a module for Fedora that uses Zebra (http://www.indexdata.dk/zebra)
Purpose: to obtain powerful text index and search functionality and performance.
The original text index and search functionality in Fedora is simple SQL on a table, where DC element texts are stored in fields.
ZebraForFedora is a set of Java classes that is deployed over existing Fedora and Zebra installations by running an Ant target.
In the Fedora configuration file:
  <module role="fedora.server.search.FieldSearch" class="dk.defxws.eprints.fedora.server.search.FieldSearchZebraModule">
    <comment>Instead of fedora.server.search.FieldSearchSQLModule</comment>
    <datastore id="zebra">
      <comment>Zebra server</comment>
      <param name="host" value="defxws.cvt.dk"/>
      <param name="port" value="9395"/>
    </datastore>
  </module>
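To illustrate how the configuration fragment is structured, here is a small sketch that reads the `host` and `port` parameters of the `zebra` datastore. It is written in Python for brevity and is not how ZebraForFedora (a set of Java classes) reads its settings. The helper name `datastore_params` is hypothetical, and the closing `</module>` tag is assumed, since the slide shows only the beginning of the element.

```python
import xml.etree.ElementTree as ET

# Fragment from the slide; the closing </module> tag is an assumption.
MODULE_CONFIG = """
<module role="fedora.server.search.FieldSearch"
        class="dk.defxws.eprints.fedora.server.search.FieldSearchZebraModule">
  <comment>Instead of fedora.server.search.FieldSearchSQLModule</comment>
  <datastore id="zebra">
    <comment>Zebra server</comment>
    <param name="host" value="defxws.cvt.dk"/>
    <param name="port" value="9395"/>
  </datastore>
</module>
"""


def datastore_params(config_xml, datastore_id):
    """Return the name/value pairs of one <datastore> (hypothetical helper)."""
    root = ET.fromstring(config_xml.strip())
    for datastore in root.iter("datastore"):
        if datastore.get("id") == datastore_id:
            return {p.get("name"): p.get("value") for p in datastore.findall("param")}
    raise KeyError(datastore_id)


zebra = datastore_params(MODULE_CONFIG, "zebra")
zebra_address = (zebra["host"], int(zebra["port"]))
```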
DEF-XWS Eprints
• Purpose achieved
  - Fedora hands-on experience
  - web services hands-on experience
  - DEF-XWS Eprints available from web services
    http://defxws.cvt.dk:8082/fedora/access/soap?wsdl
    http://defxws.cvt.dk:8082/fedora/accessDEF-XWS/soap?wsdl
  - ready for 3-layered system architecture
  - applications combining many web services
• Lesson
  - Do not override field search; provide a generic search service instead ...
Generic Search Service
[Figure labels: Generic; Zebra ... Lucene]
• Core Fedora Repository Service
  - New services are deployed as web applications (.war files), each with a configuration file.
• The Generic Search Service is to be a web application, configurable to use an existing Fedora repository and an existing installation of an indexing and search engine such as Zebra, Lucene, or others.
• Functionality is to be decided by a working group of Fedora users and developers.
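The idea of a search service that is configurable for different engines can be sketched as an engine-neutral interface with interchangeable backends. Everything below is hypothetical: the class names, the method names, and the toy in-memory engine that stands in for Zebra or Lucene. The functionality of the actual service was still to be decided by the working group.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Set


class SearchEngine(ABC):
    """Engine-neutral interface (hypothetical); one subclass per engine."""

    @abstractmethod
    def index(self, pid: str, text: str) -> None:
        ...

    @abstractmethod
    def search(self, query: str) -> List[str]:
        ...


class InMemoryEngine(SearchEngine):
    """Toy stand-in for a real engine such as Zebra or Lucene."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = {}  # term -> object identifiers

    def index(self, pid: str, text: str) -> None:
        for term in text.lower().split():
            self._postings.setdefault(term, set()).add(pid)

    def search(self, query: str) -> List[str]:
        terms = query.lower().split()
        if not terms:
            return []
        # An object matches only if it contains every query term.
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(hits)


class GenericSearchService:
    """Delegates to whichever engine it was configured with."""

    def __init__(self, engine: SearchEngine) -> None:
        self._engine = engine

    def update(self, pid: str, text: str) -> None:
        self._engine.index(pid, text)

    def find(self, query: str) -> List[str]:
        return self._engine.search(query)


service = GenericSearchService(InMemoryEngine())
service.update("demo:1", "Fedora and Zebra")
service.update("demo:2", "Fedora and Lucene")
```

Swapping the engine then means passing a different `SearchEngine` subclass to the service, without touching the repository's own field search.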
Generic Search Service
• Preliminary analysis of what has been done by others already: approaches people have taken, and issues they have met, in the following areas
  a. What kinds of search engines?
  b. How is indexing done, and how is it kept up to date?
  c. Configuration options? How can you specify what datastreams/disseminations to index?
  d. What interfaces for doing searches?
  e. How do you deal with security in terms of the service interacting with Fedora?
  f. What are the problems with current approaches?
  g. What would be desirable in a generic search service that would be delivered with Fedora?
• Gathering of requirements and issues for moving towards a reference implementation
  - ZebraForFedora may serve as a reference implementation
• From a broader perspective, how to deal with search for federations of repositories
  - P2P search in the EU project Alvis may be relevant
• Things that the Fedora Dev Team might need to do for new services in the Framework:
  - a notification/messaging module in the core Fedora repository service, so that other services can find out when objects are added or changed
  - how the services run securely with Fedora; a Basic Auth approach is used now
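The proposed notification/messaging module could let a search service keep its index up to date without polling. The sketch below shows the idea with a simple publish/subscribe pair. It is an assumption-laden illustration, not a Fedora API: the class names, the event names `added`, `modified`, and `purged`, and the `fetch_text` callback are all invented here.

```python
from typing import Callable, Dict, List


class RepositoryNotifier:
    """Hypothetical notification module: tells listeners about object changes."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str, str], None]] = []

    def subscribe(self, listener: Callable[[str, str], None]) -> None:
        self._listeners.append(listener)

    def publish(self, event: str, pid: str) -> None:
        for listener in self._listeners:
            listener(event, pid)


class IndexUpdater:
    """Listener that mirrors repository changes into a search index."""

    def __init__(self, fetch_text: Callable[[str], str]) -> None:
        self._fetch_text = fetch_text    # assumed way to read an object's text
        self.index: Dict[str, str] = {}  # stand-in for a real search engine

    def __call__(self, event: str, pid: str) -> None:
        if event in ("added", "modified"):
            self.index[pid] = self._fetch_text(pid)
        elif event == "purged":
            self.index.pop(pid, None)


# A dictionary plays the role of the repository in this sketch.
objects = {"demo:1": "first version"}
notifier = RepositoryNotifier()
updater = IndexUpdater(objects.__getitem__)
notifier.subscribe(updater)
notifier.publish("added", "demo:1")
```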
Repository Federation
Idea under elaboration: Fedora as Superpeer in an ALVIS peer-to-peer system
DEF-XWS Thank you!
DEF-XWS Eprints future
[Figure: web service-oriented architecture of the DEF-XWS Pilot. Labels: Web Services Description Language (WSDL); http://host/fedora/ws/soap?wsdl; Simple Object Access Protocol (SOAP) or Representational State Transfer (REST); Java Test UI; PHP Test UI; Java Eprint WS.]
Graphics from Web Services: A Manager's Guide, by Anne Thomas Manes, Addison-Wesley, 2003