ARD Prasad
Export/Import in DSpace & Backup
Where DSpace stores data
• The /dspace/assetstore directory holds all the bitstreams and licenses
• The PostgreSQL database contains
  • Metadata
  • Information about communities
  • Information about collections
  • Information about e-groups and authorizations
  • Information about e-persons and authorizations
  • A host of other information
Export/Import in DSpace
• Export and import deal only with bitstreams, metadata, licenses and handles
• They do NOT cover information about communities, collections, members, reviewers, or access permissions/restrictions
• You can export or import
  • A single item, or
  • All items in a collection
Export command syntax
    /dspace/bin/dsrun org.dspace.app.itemexport.ItemExport \
        --type=COLLECTION --id=collID \
        --dest=dest_dir --number=seq_num
Where
• --type takes either the value COLLECTION or ITEM
• --id is the handle of the collection or item (handle prefix/collection or item id), e.g. 1849/2 (or 123456789/2 if you do not have a registered handle prefix)
• --dest is the destination directory (it must be created before running the script)
• --number is the sequence number; it can be just 1
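For a single item, the same command can be run with --type=ITEM. The following is a minimal sketch, assuming a hypothetical item handle 1849/15 and a hypothetical destination /tmp/export; the helper name export_cmd is also an illustration only. It prints the command rather than running it, so the arguments can be checked first.

```shell
#!/bin/sh
# export_cmd TYPE HANDLE DEST: print the ItemExport command line (does not run it).
# The function name and the sample values below are illustrative assumptions.
export_cmd() {
    echo "/dspace/bin/dsrun org.dspace.app.itemexport.ItemExport" \
         "--type=$1 --id=$2 --dest=$3 --number=1"
}

export_cmd ITEM 1849/15 /tmp/export
```

Once the printed line looks right, it can be pasted into the shell, after creating the destination directory.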
Shell Script for exporting
    #!/bin/bash
    # Usage: <this-script> <export-directoryname>
    if test $# != 1
    then
        echo "Usage: $0 <export-directoryname>"
        exit 1
    fi
    # collection ids to export (edit this list)
    declare collection_id[5]=(2 3 4 5 6 7)
    for((i=0; i<=5; i++))
    do
        mkdir "$1/${collection_id[$i]}"
        /dspace/bin/dsrun org.dspace.app.itemexport.ItemExport \
            --type=COLLECTION \
            --id=1849/${collection_id[$i]} \
            --dest="$1/${collection_id[$i]}" \
            --number=1
    done
In the shell script...
• Look for the line
    declare collection_id[5]=(2 3 4 5 6 7)
• Replace 2 3 4 etc. with your collection ids
• Clue: collection ids are the numbers that appear in the browser URL after the handle prefix, i.e. if you have not registered with CNRI, the number that appears after 123456789/
• Also create the directory where the data should be exported to
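Note that if the number of collections changes, the loop bound (i<=5) has to change along with the list. A minimal sketch of a variant that loops over the ids directly, so that only the list needs editing; the names COLLECTION_IDS, export_all and the DRY_RUN switch are assumptions for illustration and are not part of the original script.

```shell
#!/bin/sh
# Ids that follow the handle prefix; edit this list for your own collections.
COLLECTION_IDS="2 3 4 5 6 7"

# export_all DEST_ROOT: export every listed collection under DEST_ROOT.
# With DRY_RUN=1 the planned exports are printed instead of executed.
export_all() {
    for id in $COLLECTION_IDS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "export 1849/$id -> $1/$id"
        else
            mkdir -p "$1/$id"
            /dspace/bin/dsrun org.dspace.app.itemexport.ItemExport \
                --type=COLLECTION --id="1849/$id" \
                --dest="$1/$id" --number=1
        fi
    done
}

# Preview only; drop DRY_RUN=1 to perform the export.
DRY_RUN=1 export_all /tmp/export
```

The handle prefix 1849 is the one used in the slides; replace it with 123456789 or your own prefix.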
Shell Script for Import
    #!/bin/bash
    # $1 is the directory that holds the exported collections
    declare collection_id[5]=(2 3 4 5 6 7)
    for((i=0; i<=5; i++))
    do
        /dspace/bin/dsrun org.dspace.app.itemimport.ItemImport \
            -a -e dspace@localhost.localdomain \
            -c 123456789/${collection_id[$i]} \
            -s "$1/${collection_id[$i]}" \
            -m mapfile
    done
• Here also, change the collection ids in the import program
• The -e option should have the DSpace admin id (i.e. the admin's e-mail address)
What is exported
• The following files will be created for every item
  • dublin_core.xml (metadata)
  • handle (one line containing the handle number)
  • license.txt
  • The actual file (bitstream: could be a PDF, a DOC or an image file)
  • contents (with two lines: the license file name and the actual bitstream name)
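The file list above can be used for a quick sanity check after an export. A minimal sketch, assuming the fixed files are named dublin_core.xml, contents and handle (lowercase) on disk; the function name check_item and the example path are hypothetical.

```shell
#!/bin/sh
# check_item DIR: report whether DIR looks like a complete exported item.
# Only the three fixed file names are checked; bitstream names vary per item.
check_item() {
    for f in dublin_core.xml contents handle; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f"
            return 1
        fi
    done
    echo "ok: $1"
}

# Hypothetical usage, for one exported item directory:
# check_item /tmp/export/2/1
```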
However
• Import and export are meant for data exchange
• They can, however, be used for a partial backup
• They take care of items only
• They do not back up your communities, collections, e-groups or e-persons
How to back up PostgreSQL
• Run pg_dump as the dspace user
• Example:
    $ pg_dump dspace > backupfile
• Note: dspace is the name of the database
• The backup file will contain all the table definitions and contents
• pg_dump has lots of options
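For regular backups it helps to put the date in the dump file name so that older dumps are not overwritten. A minimal sketch; the helper name backup_name and the naming scheme are assumptions for illustration, and the pg_dump line is left commented out so that nothing runs by accident.

```shell
#!/bin/sh
# backup_name DB [DATE]: build a dated dump file name.
# DATE defaults to today in YYYY-MM-DD form.
backup_name() {
    echo "$1-${2:-$(date +%Y-%m-%d)}.sql"
}

# Run as the dspace user; "dspace" is the database name.
# pg_dump dspace > "$(backup_name dspace)"
```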
How to restore the database
• psql -d dspace -f dumpedfile
• Note: psql has lots of options; to know more about the options, you can use
Alternative (using tar)
• To dump a database called mydb that contains large objects to a tar file:
    $ pg_dump -Ft -b mydb > db.tar
• To reload this database (with large objects) into an existing database called newdb:
    $ pg_restore -d newdb db.tar
Upgrading
• This procedure should be the first step when you upgrade DSpace to a newer version
• Even if the upgrade fails, you have a backup to fall back on
Upgrading Tip
• Use a different database and a different user, so that you do not have to touch the existing DSpace installation
Extra care
• It is a good idea to take a tape (or hard disk) backup of
  • The entire /dspace directory
  • The pg_dump output file
  • The export directory
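The three pieces can be packed into one archive before copying it to tape or another disk. A minimal sketch using tar; the function name bundle and all paths in the usage comment are hypothetical examples.

```shell
#!/bin/sh
# bundle ARCHIVE PATH...: pack the given paths into one gzip-compressed tar file.
bundle() {
    archive=$1
    shift
    tar -czf "$archive" "$@"
}

# Hypothetical usage; adjust the paths to your installation.
# bundle /backup/dspace-full.tar.gz /dspace /backup/dspace.sql /backup/export
```

The archive contents can be listed afterwards with tar -tzf to confirm that everything was included.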
Final Lesson
• Learning DSpace is easy
  • It can be learnt in a week
  • It can be mastered in a month
• Creating content is continuous and long-term, perhaps without end
• Be more careful with the content