Cache Administration Bill Robbins MetaArchive Annual Membership Meeting Houston, Texas Friday October 23, 2009
Overview of LOCKSS Cache Administration • Perspective on a Lockss Cache vs. Other Servers • Lockss Cache Installation • Ingesting AUs – Putting Digital Data Into Preservation • Status/Monitoring of Your Cache • Ongoing Cache Operations • Troubleshooting Caches • Central Administration • Resources & Links • Future Plans
Section One Administration of a Cache vs. Other Servers
Administration of a Cache vs. Other Servers • Cache = Linux + Lockss + Minor Add-ons • NOT a multi-purpose server • Preserve --> MUST BE Secure!! • NOT online retrievable • NOT a backup site • NOT mission critical • Lockss - Turtle - Cheap H/W, 2nd Tier vendors - Iron systems, Capricorn, etc
Administration of a Cache vs. Other Servers • How does the purpose of the server change administration policies? (Users, Backups, monitoring…) • Selecting the O/S (No fee CentOS) • O/S upgrades (Security, LOCKSS & JDK) • Lockss caches file system (RAID?) • Appliance - Very Little Classic UNIX administration • Repair contracts – lowest level Questions / Discussion
Section Two Lockss Cache Initial Installation
Lockss Cache – Initial Installation • Three Sequential Steps • Creating the Linux Server • Turn the Linux Server into a MetaArchive Node • Installation & Configuration of Lockss Software • Follow the Kickstart Instructions • Often servers will use Kickstart • Otherwise, follow latest Kickstart instructions
Lockss Cache – Initial Installation 1) Creating the Linux Server • O/S – CentOS 5.3 • Hard Disks – file system (Multi TB + for AUs) • Kickstart-Complete Automated Installation is Possible • Network Configuration – CRITICAL INFORMATION • Hostname FQDN – Fully Qualified Domain Name • IP Must be registered, fixed IP address • Netmask • Gateway • DNS Local Nameserver
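The network values listed above can be sanity-checked before the installation begins. The following is a minimal sketch, not part of the Kickstart procedure: the function name `check_network_settings` is hypothetical, and the netmask and gateway in the usage note are made-up illustrations (only the IP address appears elsewhere in these slides).

```python
import ipaddress


def check_network_settings(ip, netmask, gateway):
    """Return a list of problems found in a static network configuration."""
    problems = []
    addr = ipaddress.ip_address(ip)
    gw = ipaddress.ip_address(gateway)
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)

    if addr.is_private:
        problems.append("IP address is private (not publicly routable)")
    if gw not in network:
        problems.append("gateway is outside the host's subnet")
    if gw == addr:
        problems.append("gateway equals the host address")
    if addr in (network.network_address, network.broadcast_address):
        problems.append("IP is the network or broadcast address")
    return problems
```

For example, `check_network_settings("170.140.208.43", "255.255.255.0", "170.140.208.1")` returns an empty list, while a private address paired with a gateway on another subnet is reported with two problems. The check does not confirm that the address is registered in DNS; that still has to be verified separately.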
Lockss Cache – Initial Installation • Security Configuration • IPTABLES - Internal Allow List • IPTABLES is not a Firewall in classic sense • Lockss - V3 Protocol (TCP:9729) • User Interface via http - Needed for Local AND Central Administration (TCP:8081) • Audit Proxy (TCP:8080) • Standard Linux – ssh (TCP:22) • SE Linux – Security Enhanced Linux is enabled.
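Once the allow list is in place, it can be useful to confirm from an allowed machine that the four TCP ports named above actually answer. This is a rough sketch using only the Python standard library; the names `CACHE_PORTS`, `port_is_open` and `check_cache_ports` are hypothetical, and the host name in the usage note is a placeholder.

```python
import socket

# Port numbers are taken from the slide above; the labels are informal.
CACHE_PORTS = {
    "LOCKSS V3 protocol": 9729,
    "Web UI": 8081,
    "Audit proxy": 8080,
    "ssh": 22,
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_cache_ports(host, ports=CACHE_PORTS):
    """Map each service label to whether its port accepted a connection."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling `check_cache_ports("cacheName.university.edu")` returns a dictionary of label to True/False. A False result is ambiguous: the allow list may be blocking the caller, or the service may simply not be running, so it is a prompt to investigate rather than a diagnosis.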
Lockss Cache – Initial Installation • Software Packages Selection • Development software if needed in the future • Office Software, just in case • Not a Central Server, Not a shared resource • NO Email • NO FTP • NO Windows Share • NO games • The node is now a Basic Linux Server Questions / Discussion
Lockss Cache – Initial Installation 2) Creating the MetaArchive Node • Post Install of the Kickstart == One Shell Script • Packages at http://~~.metaarchive.org/~~/kickstart (RPM) JDK, Lockss & Denyhosts • RPM – Redhat Package Manager • Package Retrieve and install • WGET • rpm install • In some cases we get a Config File (.conf) • Set the software to run on normal startup, (run levels) • Future – Software Repository Questions / Discussion
LOCKSS Cache – Initial Installation 3) Lockss Installation & Configuration • Create the lockss user and directories • Need parameters to create lockss startup • SMTP server • Admin Email • User ID & Roles -- Known User ID & Password are needed for the Cache Manager and Troubleshooting • The LOCKSS “hostconfig” is set up • Local Administrators can set up a UID/Pass • LOCKSS Will Start on Reboot • CRITICAL INPUT – /etc/lockss/hostconfig
/etc/lockss/hostconfig Critical Parts
Fully qualified hostname (FQDN) of this machine: [devcache1.library.emory.edu]
IP address of this machine: [170.140.208.43]
Path to java: [/usr/java/jdk1.5.0_15/bin/java]
Configuration URL: [http://some-path/config/lockss.xml]
Preservation group(s): [metaarchive]
User name for web UI administration: [denisg]
Password for web UI administration user denisg: []
Password for web UI administration (again): []
-------- Done ---- lines removed
LOCKSS will start automatically at next reboot, or you may start it now by running /etc/init.d/lockss start
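The prompts on this slide follow a "Label: [value]" pattern, so a saved copy of the session can be turned into a record of what was entered. The sketch below parses only that on-screen transcript format; it is an assumption that you have captured the prompts as text, and it says nothing about how the resulting configuration is stored on disk. The function name `parse_hostconfig_prompts` is hypothetical.

```python
import re

# Matches lines such as "IP address of this machine: [170.140.208.43]".
# The label may not contain ':' or brackets; the value may be empty.
PROMPT = re.compile(r"^\s*(?P<label>[^:\[\]]+?):\s*\[(?P<value>[^\]]*)\]\s*$")


def parse_hostconfig_prompts(text):
    """Return {prompt label: bracketed value} for each prompt line in text."""
    answers = {}
    for line in text.splitlines():
        match = PROMPT.match(line)
        if match:
            answers[match.group("label").strip()] = match.group("value")
    return answers
```

Lines that are not prompts, such as the "Done" marker, are skipped. Keeping the parsed values with the server's documentation makes it easier to re-enter them if the cache ever has to be rebuilt.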
Instructions for Configuring and Running LOCKSS From /etc/lockss/README Public (routable) IP address -nslookup must work- To further administer the daemon, go to http://<hostname>:8081/ Configuration URL – THIS IS THE TITLE DATABASE The URL (or local file name) from which the LOCKSS daemon will load extended configuration and tuning parameters. Use the default value to participate in the global LOCKSS preservation community. Use a private config file to create your own preservation community. MetaArchive is a PRIVATE Lockss Network (PLN).
Instructions for Configuring and Running LOCKSS - Preservation group(s) Used to select group-specific options in the config file. Use the default value to participate in the global LOCKSS preservation community. Multiple groups may be entered, separated by semicolon. - User name/Password for web UI administration • Follow Kickstart Instructions and then ….. • Your Cache is Now Ready to Go! Questions / Discussion
Section Three Lockss Cache Ingesting Archival Units
Lockss Cache Ingesting Archival Units The initial ingestion of an Archival Unit is a short and simple process, but it is only one part of the ongoing work needed for long-term digital data preservation. Verifying preservation requires several ongoing activities both on the caches AND on the sites that are preserved. After the initial ingestion there will be ongoing preservation operations. • Ingesting Data - CAUTION • CAUTION – IS THE SITE PROTECTED?? IP ALLOW LIST AVAILABLE! • CAUTION – IS THIS CACHE SUPPOSED TO INGEST THIS DATA??
Lockss Cache Ingesting Archival Units Pre-Ingestion Checklist / Review • Conspectus Entry • Provider Site is prepared • Plugin is completed • Manifest Page is in place • The Site is accessible to the Caches • The Plugin is accessible to the Caches • The AU is registered in the Title Database • The Cooperative has been notified
Lockss Cache Ingesting Archival Units • CAUTION – IS THE SITE PROTECTED?? • Access to UI (User Interface) • Browser http://cacheName.university.edu:8081 • Access Control via IP Address • Ingesting • Journal Configuration • Add Titles • No wholesale ingesting – Please! Questions / Discussion
Section Four Lockss Cache Preservation Operations
Lockss Cache Preservation Operations • How Do You Know The AU is accurate and safe? • Crawl / Poll / Vote / Repair (if needed) • Repeat • Status Monitoring & Verification • Using the Daemon • Using The Cache Manager • Audit Proxy
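The Poll / Vote / Repair cycle can be pictured with a deliberately simplified model: each cache hashes its copy of the content, the hash held by a strict majority wins, and any cache that disagrees is a candidate for repair. This is an illustration of the idea only and is not how the LOCKSS daemon or the LCAP protocol is implemented; the function names and the use of SHA-256 are assumptions made for the sketch.

```python
import hashlib
from collections import Counter


def content_hash(data: bytes) -> str:
    """Hash one cache's copy of the content (SHA-256 chosen for illustration)."""
    return hashlib.sha256(data).hexdigest()


def tally_poll(copies):
    """copies maps cache name -> bytes held by that cache.

    Returns (winning_hash, caches_needing_repair). If no hash is held by a
    strict majority of caches, returns (None, []) because no copy can be
    trusted as the reference.
    """
    votes = {name: content_hash(data) for name, data in copies.items()}
    if not votes:
        return None, []
    winner, count = Counter(votes.values()).most_common(1)[0]
    if count * 2 <= len(votes):
        return None, []
    losers = sorted(name for name, digest in votes.items() if digest != winner)
    return winner, losers
```

With three caches where one holds a damaged copy, `tally_poll` names that one cache for repair. With only two caches that disagree, there is no majority, which is one way to see why the network benefits from several replicas of each AU.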
Lockss Cache Preservation Operations • What is that Daemon Doing? - Live Demo (Maybe – due to security) • Crawling (Ingesting) • Polling • Voting • Establish Peers / Become a Peer • Auditing / Verifying Contents • Restoral • Logging • The LCAP protocol contains most of these functions
Lockss Cache Preservation Operations The MetaArchive Cache Manager • Centralized Network Data Gathering • Caches, Collections, Archival Units, Disk Space • What is in my cache? What do the partners have? • Where are AUs replicated? What are the sizes? • Daily Snapshots are taken. Data is not live. • “Problems” are flagged. • Ruby On Rails Technology • Almost Exclusive to MetaArchive
Lockss Cache Preservation Operations When is an error not a problem? The entire cooperative network is always very full of activity. Glitches, short term outages, planned down time, etc. are expected. Any error should be present for more than a day or two before it is worth pursuing as a problem. Will review longstanding errors on the weekly call
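The "more than a day or two" rule can be applied mechanically to daily snapshot data. The sketch below assumes the flagged problems for each day have been exported as a set of identifiers keyed by date; that export format, the identifier strings and the function name `persistent_errors` are all hypothetical and are not features of the Cache Manager.

```python
from datetime import date, timedelta


def persistent_errors(snapshots, days=3):
    """Return errors flagged on each of the last `days` consecutive days.

    snapshots maps a date to the set of error identifiers flagged that day.
    If any day in the window has no snapshot, nothing is reported, since
    persistence cannot be established.
    """
    if not snapshots:
        return set()
    latest = max(snapshots)
    window = [latest - timedelta(days=i) for i in range(days)]
    if any(day not in snapshots for day in window):
        return set()
    result = set(snapshots[latest])
    for day in window[1:]:
        result &= set(snapshots[day])
    return result
```

An error that appears on three consecutive daily snapshots is returned; one that appears only today, or that cleared up yesterday, is not. The three-day default is a judgment call that mirrors the slide's guidance and can be adjusted.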
Audit Proxy How Can You See What is Preserved? This is the function of the Audit Proxy. This can be useful as well during the testing phase of the plugin. • Settings on the UI for access • Settings on the UI for the proxy • Settings on Your Browser
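A browser is the usual way to use the Audit Proxy, but the same check can be scripted. The sketch below points Python's standard `urllib` at a cache's proxy port (TCP 8080, as listed in the installation section). The helper names are hypothetical, the host name in the usage note is a placeholder, and it assumes the proxy has been enabled in the UI and that the calling machine is on the cache's allow list.

```python
import urllib.request


def build_audit_opener(cache_host, proxy_port=8080):
    """Return a URL opener that sends http requests through the cache's proxy."""
    proxy = urllib.request.ProxyHandler(
        {"http": f"http://{cache_host}:{proxy_port}"}
    )
    return urllib.request.build_opener(proxy)


def fetch_preserved(opener, url, timeout=30):
    """Fetch url through the opener; return (HTTP status, body bytes)."""
    with opener.open(url, timeout=timeout) as response:
        return response.status, response.read()
```

Usage would look like `opener = build_audit_opener("cacheName.university.edu")` followed by `fetch_preserved(opener, "http://provider-site/manifest.html")`, where both names stand in for your own cache and a URL inside a preserved AU. What the proxy returns for content it does not hold depends on the proxy settings, so consult the Wiki instructions before relying on the result.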
Audit Proxy Instructions are on the Wiki It makes a difference whether Firefox or I.E. is your browser Questions / Discussions
Section Five Lockss Cache Ongoing Operations
Lockss Cache Ongoing Operations MA Ongoing Operations • Changes to the Site Content • Changes to MetaArchive Network Server Ongoing Operations • Backup, Restore • Monitor for Problems • Typical Server Admin Issues • SOME VERY SPECIAL CONSIDERATIONS ARE NEEDED IN ONGOING CACHE OPERATIONS! • THERE ARE TWO SIDES TO ONGOING OPERATIONS!!!
Lockss Cache Ongoing Operations Ongoing Operations of the Server • Journal Configuration files are sent to central backup • File system – partly monitored by the Cache Manager, but not completely
Lockss Cache Ongoing Operations Changes to a Preservation Site Such As … • Architecture, Rate of new content Mean you might need … • New plugin, Change the Conspectus • New AU, New Manifest Page • Disable / Remove AU
Lockss Cache Ongoing Operations Changes to the Cooperative Network Such As … • New Servers Online, More Space Available Mean … • Changes to Firewall Allow Lists • AU Distributions Questions / Discussions
Section Six Lockss Cache Troubleshooting
Lockss Cache Troubleshooting Common Problems not related to ingesting • Disk Space Issues • The pollstate directory • Needing to remove an AU • Problems with ingesting the AUs are rare, provided the precautions already discussed have been taken.
Lockss Cache Troubleshooting Tools Provided by the Daemon • Logging from the Daemon • /var/log/lockss/stdout, daemon - available via UI • Debug Panel • http://your-cache.univ.edu:8081/DebugPanel • Force re-crawl of site or plugin • Force Poll • AU Troubleshooting • AU removal • AU restore
Lockss Cache Troubleshooting Tools From Unix Shell • Restarting Lockss Daemon: /etc/init.d/lockss {stop | start} • Standard Unix Tools • Disk space issues • Other bottlenecks: CPU, memory, network
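Disk space is the item most worth checking routinely, since the Cache Manager does not monitor the file system completely. The sketch below reports usage for one mount point with the standard library; the function name `disk_report`, the 90% warning threshold and any path you pass in are assumptions, so substitute the file system that actually holds your AUs.

```python
import shutil


def disk_report(path, warn_fraction=0.90):
    """Summarise disk usage for the file system containing path."""
    usage = shutil.disk_usage(path)
    # Guard against a zero-size file system to avoid dividing by zero.
    used_fraction = usage.used / usage.total if usage.total else 0.0
    gib = 1024 ** 3
    return {
        "path": path,
        "total_gib": usage.total / gib,
        "free_gib": usage.free / gib,
        "used_fraction": used_fraction,
        "warn": used_fraction >= warn_fraction,
    }
```

Running `disk_report("/")` from cron and mailing the result when `warn` is true gives early notice before a full disk stops ingestion. The same information is available interactively from `df -h`.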
Completely Removing an Archival Unit See the Wiki for more on Removing an AU • This is not a simple problem • A LOCKSS cache will NOT delete data • Data must be deleted from Unix
Restoring an Archival Unit to a Cache The Daemon will restore lost or corrupt data to a cache from the originating site, if possible. If the originating site cannot be reached, the Daemon will restore data from a Peer Cache. If the Cache is having a really bad day, you may need to restore the Journal Configuration, and then allow the site to be re-crawled. Questions / Discussion
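The order of preference described on this slide (originating site first, then a peer cache) can be written down as a small fallback routine. This is only a model of the stated order and not the daemon's code; `SourceUnavailable`, `restore_au` and the fetch callables are hypothetical names introduced for the illustration.

```python
class SourceUnavailable(Exception):
    """Raised by a fetch function when its source cannot supply the AU."""


def restore_au(au_id, fetch_from_origin, peers):
    """Try the originating site first, then each peer cache in order.

    fetch_from_origin: callable(au_id) -> bytes, may raise SourceUnavailable.
    peers: list of (peer_name, callable) pairs with the same contract.
    Returns (source_name, data), or (None, None) if every source failed.
    """
    try:
        return "origin", fetch_from_origin(au_id)
    except SourceUnavailable:
        pass  # Fall through to the peer caches.
    for name, fetch in peers:
        try:
            return name, fetch(au_id)
        except SourceUnavailable:
            continue
    return None, None
```

A `(None, None)` result corresponds to the "really bad day" case above, where manual steps such as restoring the Journal Configuration and allowing a re-crawl are needed.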
Section Seven Central Administration & Resources for the MetaArchive Cooperative
Resources & Links LINKS • Use this RSS Link to keep up to date!! http://metaarchive.org/public/resources/rss_feeds/metaarchive_links.rss • The MetaWiki == Resource Central • https://metaarchive.org/metawiki/ • Login Needed? Request it! Email: brobbi2@emory.edu • The IP allow list • Many Others on the RSS Feed • The Cache Manager • The Conspectus • The Lockss Wiki • Mailing Lists
Resources & Links LINKS • The Subversion System (SVN) • Version Control / Software Releases • NOTE: Two systems, one for plugins and one for code. • All plugins need to be stored in SVN • Clear path to develop, test, sign & jar plugins from one common source • Papers on Lockss Technology • LOCKSS itself • LCAP protocol Questions / Comments
Coming Sooner or Later New Members
Coming Sooner or Later • Encryption System for Mailing files. • Test Network – EZ Access • Software Repository • Use YUM for automated updates • Increase Pro-Active Monitoring • Cache management topics, “How To” online videos Questions / Comments
Summary • Mature Technology, but still a lot to do • Growth in Members & Data being preserved leads to better understanding of how we need to operate the network • Contacts • Bill Robbins: bill.robbins@metaarchive.org, (404) 712-2851 • Monika Mevenkamp: momeven@gmail.com