Marshall Breeding, Independent Consultant, Author, Speaker; Founder and Publisher, Library Technology Guides • http://www.librarytechnology.org/ • http://twitter.com/mbreeding • Cloud Computing in Libraries and Web-scale Library Management and Discovery • 15 March 2013 • SENYLRC
Abstract: This is an introduction to the concepts of cloud computing and how this suite of technologies is positioned to reshape the ways that libraries make use of strategic applications such as discovery and management systems. The instructor will describe the evolution of discovery systems from next-generation library catalogs, which provided some improvements in the interfaces and performance of the established online catalogs, toward the current wave of index-based or Web-scale discovery services. Major changes are also underway in the applications that libraries use to manage their operations and collections, with a new slate of library services platforms coming on the scene, providing an alternative to the integrated library systems that have been available for many decades.
Cloud Computing for Libraries • Publication Info: • Volume 11 in The Tech Set • Published by Neal-Schuman / ALA TechSource • ISBN: 9781555707859 • http://www.neal-schuman.com/ccl
Appropriate Automation Infrastructure • Current automation products out of step with current realities • Majority of library collection funds spent on electronic content • Majority of automation efforts support print activities • New discovery solutions help with access to e-content • Management of e-content continues with inadequate supporting infrastructure
Key Context: Libraries in Transition • Academic Shift from Print > Electronic • E-journal transition largely complete • Circulation of print collections slowing • E-books now in play (consultation > reading) • All libraries: • Need better tools for access to complex multi-format collections • Strong emphasis on digitizing local collections • Demands for enterprise integration and interoperability
Key Context: Changed expectations in metadata management • Moving away from individual record-by-record creation • Life cycle of metadata • Metadata follows the supply chain, improved and enhanced along the way as needed • Manage metadata in bulk when possible • E-book collections • Highly shared metadata • E.g. e-journal knowledge bases • Great interest in moving toward semantic web and open linked data • Very little progress in linked data for operational systems • AACR2 > RDA • MARC > Bibframe (http://bibframe.org/)
Fundamental technology shift • Mainframe computing • Client/Server • Web-based and Cloud Computing http://www.flickr.com/photos/carrick/61952845/ http://soacloudcomputing.blogspot.com/2008/10/cloud-computing.html http://www.javaworld.com/javaworld/jw-10-2001/jw-1019-jxta.html
Local Computing • Traditional model • Locally owned and managed • Shifting from departmental to enterprise • Departmental servers co-located in central IT data centers • Increasingly virtualized
Virtualization • The ability for multiple computing images to simultaneously exist on one physical server • Physical hardware partitioned into multiple instances using virtual machine management tools such as VMware • Applicable to local, remote, and cloud models
Cloud Computing • Major trend in Information Technology • The term “in the cloud” has devolved into marketing hype, but cloud computing in the form of multi-tenant software as a service offers libraries opportunities to break out of individual silos of automation and engage in widely shared cooperative systems • Opportunities for libraries to leverage their combined efforts into large-scale systems with more end-user impact and organizational efficiencies
Beyond “Cloudwashing” • Cloud as marketing hype • The term “cloud computing” is used very freely, tagged to almost any virtualized environment • Any arrangement where the library relies on some kind of remote hosting environment for major automation components • Includes almost any vendor-hosted product offering • Example: ASP now marketed as Software-as-a-Service
Cloud computing – characteristics • Web-based Interfaces • Externally hosted • Pricing: subscription or utility • Highly abstracted computing model • Provisioned on demand • Scaled according to variable needs • Elastic – consumption of resources can contract and expand according to demand
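The "elastic" and "provisioned on demand" characteristics above can be illustrated with a minimal sketch. The function name `instances_needed`, the per-instance capacity, and the load figures are all hypothetical, chosen only to show capacity expanding and contracting with demand; no real provider's API is implied.

```python
import math

def instances_needed(requests_per_sec: float,
                     capacity_per_instance: float = 100.0,
                     min_instances: int = 1) -> int:
    """Return how many (hypothetical) server instances a given load requires."""
    if requests_per_sec < 0:
        raise ValueError("load cannot be negative")
    # Round up so that capacity always covers demand, but never drop below the floor.
    return max(min_instances, math.ceil(requests_per_sec / capacity_per_instance))

# Hypothetical load samples (requests per second) at four points in a day.
load_samples = [20, 150, 480, 90]

# Provisioned capacity follows demand up and back down again.
provisioned = [instances_needed(load) for load in load_samples]
print(provisioned)
```

Under a utility pricing model, the library would pay for the instances in use at each point rather than for peak capacity around the clock.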
Budget Allocations • Local Computing: • Server Purchase • Server Maintenance • Application software license • Data Center overhead • Energy costs • Facility costs • Cloud Computing: • Annual Subscription • Measured Service? • Fixed fees • Factors: • Hosting • Software Licenses • Optional modules
Infrastructure-as-a-service • Provisioning of Equipment • Servers, storage • Virtual server provisioning • Examples: • Amazon Elastic Compute Cloud (EC2) • Amazon Simple Storage Service (S3) • Rackspace Cloud (http://www.rackspacecloud.com/) • EMC2 Atmos (http://www.atmosonline.com/)
Amazon EC2 • Amazon Machine Images (AMI) • Red Hat Enterprise Linux • Debian • Fedora • Ubuntu Linux • OpenSolaris • Windows Server 2003/2008
Storage-as-a-Service • Provisioned, on-demand storage • Bundled to, or separate from other cloud services
Software as a Service • Multi-tenant SaaS is the modern approach • One copy of the code base serves multiple sites • Software functionality delivered entirely through Web interfaces • No workstation clients • Upgrades and fixes deployed universally • Usually in small increments
Data as a Service • SaaS provides opportunity for highly shared data models • Bibliographic knowledge base: one globally shared copy that serves all libraries • Discovery indexes: article and object-level index for resource discovery • E-resource knowledge bases: shared authoritative repository of e-journal holdings • General opportunity to move away from library-by-library metadata management to globally shared workflows
Software-as-a-Service • Complete software application, customized for customer use • Software delivered through cloud infrastructure, data stored in the cloud • E.g.: Salesforce.com, widely used business infrastructure • Multi-tenant: all organizations that use the service share the same instance (codebase, hardware resources, etc.) • Often partitioned to separate some groups of subscribers
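The multi-tenant idea, in which all subscribers share one instance while their data stays partitioned, might be sketched as follows. The class `MultiTenantCatalog`, its methods, and the tenant names are hypothetical and do not describe any vendor's product.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class MultiTenantCatalog:
    """One shared application instance; records are partitioned by tenant ID."""
    version: str = "1.0"
    _records: Dict[str, List[str]] = field(default_factory=dict)

    def add_item(self, tenant_id: str, title: str) -> None:
        # Each library (tenant) writes only into its own partition.
        self._records.setdefault(tenant_id, []).append(title)

    def items(self, tenant_id: str) -> List[str]:
        # A tenant sees only its own records, never another tenant's.
        return list(self._records.get(tenant_id, []))

    def upgrade(self, new_version: str) -> None:
        # A single upgrade reaches every tenant, since all share one code base.
        self.version = new_version

service = MultiTenantCatalog()
service.add_item("library_a", "Title held by library A")
service.add_item("library_b", "Title held by library B")
service.upgrade("1.1")
```

By contrast, the ASP model would run a separate `MultiTenantCatalog`-like instance per library, each needing its own upgrade.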
Application service provider • Legacy business applications hosted by software vendor • Standalone application on discrete or virtualized hardware • Staff and public clients accessed via the Internet • Same user interfaces and functionality as if installed locally • Established as a deployment model in the 1990s • Can be implemented through Infrastructure-as-a-Service • Individual instances of legacy system hosted in EC2
ASP vs SaaS From: THINKstrategies: CIO’s Guide to Software-as-a-Service
Platform-as-a-Service • Virtualized computing environment for deployment of software • Application engine, no specific server provisioning • Examples: • Google App Engine • SDKs for Java, Python • Heroku: Ruby platform • Amazon Web Services • Library-specific platforms • WorldShare Platform
Cloud Computing Library Context
Library automation through SaaS • Almost all library automation products offered through hosted options • SaaS or ASP?
ILS Products offered as SaaS (mostly ASP) • SirsiDynix Symphony • SirsiDynix Horizon • Innovative Interfaces Millennium • Ex Libris Aleph • EOS International EOS.Web • Evergreen – Equinox Software • Koha – LibLime, ByWater, many others internationally • …many other examples …
Multi-tenant SaaS • Serials Solutions • Summon • Intota (Announced for 2012-13) • 360 Search, 360 Link, KnowledgeWorks • Ex Libris • Alma • Primo Central • BiblioCommons • OCLC WorldShare Management Services
Platform as a Service • OCLC WorldShare Platform • WorldShare Management Services • WorldShare License Manager • Library-created applications
Library Management in the Cloud • Almost all library automation vendors offer some form of “cloud-based” services • Server management moves from library to vendor • Subscription-based business model • Comprehensive annual subscription payment • Offsets local server purchase and maintenance • Offsets some local technology support
Leveraging the Cloud • Moving legacy systems to hosted services provides some savings to individual institutions but does not result in dramatic transformation • Globally shared data and metadata models have the potential to achieve new levels of operational efficiencies and more powerful discovery and automation scenarios that improve the position of libraries overall
Transition to Web-scale Technologies • Web-scale: a characterization or marketing tag that denotes a comprehensive, highly-scalable, globally shared model • Web-scale: One of the key characteristics of emerging library management and discovery services • Displaces applications or data models targeting individual libraries in isolation • Discovery: index-based search • Management: Library Services Platforms
Repositories in the cloud • DSpace – institutional repository application • Fedora – generalized repository platform • DuraSpace – organization now over both DSpace and Fedora • DuraCloud – shared, hosted repository platform • Pilot since 2009, production in early 2011 • http://www.duraspace.org/duracloud.php
Caveats and concerns with SaaS • Libraries must have adequate bandwidth to support access to remote applications without latency • Quality of service agreements that guarantee performance and reliability factors • Configurability and customizability limitations • Access to APIs • Ability to interoperate with third-party applications • E.g.: Connect SaaS ILS with discovery product from another vendor
Benefits of Cloud Computing • Libraries: • Elimination of capital expenses for equipment • Lower annual costs • Redeployment of technical staff to more meaningful activities • Providers / Vendors: • Higher revenues relative to software-only arrangements • Provision of infrastructure at scale with lower unit costs • Longer-term relationships with customers
Cost implications • Total cost of ownership • Do all cost components result in increased or decreased expense? • Personnel costs – need less technical administration • Hardware – server hardware eliminated • Software costs: subscription, license, maintenance/support • Indirect costs: energy costs associated with power and cooling of servers in data center • IaaS: balance elimination of hardware investments against ongoing usage fees • Especially attractive for development and prototyping
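The total-cost-of-ownership comparison above can be worked through as simple arithmetic. The sketch below is an assumption-laden illustration: the function names, the cost categories chosen, and every dollar figure are hypothetical placeholders, not data from any library or vendor.

```python
def local_tco(server_purchase: float, annual_maintenance: float,
              annual_license: float, annual_facility: float,
              annual_staff: float, years: int) -> float:
    """Total cost of a locally hosted system: one-time hardware plus recurring costs."""
    recurring = annual_maintenance + annual_license + annual_facility + annual_staff
    return server_purchase + years * recurring

def hosted_tco(annual_subscription: float, annual_staff: float, years: int) -> float:
    """Total cost of a hosted/SaaS system: subscription plus remaining local staff time."""
    return years * (annual_subscription + annual_staff)

# Hypothetical placeholder figures over a five-year horizon.
years = 5
local = local_tco(server_purchase=20_000, annual_maintenance=2_000,
                  annual_license=15_000, annual_facility=1_500,
                  annual_staff=10_000, years=years)
hosted = hosted_tco(annual_subscription=25_000, annual_staff=3_000, years=years)

print(f"Local:  {local:,.0f}")
print(f"Hosted: {hosted:,.0f}")
print(f"Difference (local - hosted): {local - hosted:,.0f}")
```

With different inputs the comparison can just as easily favor local hosting, which is why each cost component needs to be checked rather than assumed.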
Risks and concerns • Privacy of data • Policies, regulations, jurisdictions • Ownership of data • Avoid vendor lock-in • Integrity of Data • Backups and disaster recovery
Security issues • Most providers implement stronger safeguards than local institutions have the capacity to provide • Virtual instances are as susceptible to poor security practices as local computing
Cloud computing trends for libraries • Increased migration away from local computing toward some form of remote / hosted / virtualized alternative • Cloud computing especially attractive to libraries with few technology support personnel • Adequate bandwidth will continue to be a limiting factor
Increased pressure • Library automation vendors promoting SaaS offerings • Some companies already exclusively SaaS • Software pricing increasingly favorable to SaaS
Caveat • Technologies are promoted by companies and organizations that have a vested interest in their adoption • Critically assess the viability of the technology and its appropriateness for your organization
Next-Gen Library Catalogs Marshall Breeding Neal-Schuman Publishers March 2010 Volume 1 of The Tech Set
Online Catalog • Diagram: Search > ILS Data > Search Results • Scope of Search • Books, Journals, and Media at the Title Level • Not in scope: • Articles • Book Chapters • Digital objects
Next-gen Catalogs or Discovery Interface • Single search box • Query tools • Did you mean • Type-ahead • Relevance ranked results • Faceted navigation • Enhanced visual displays • Cover art • Summaries, reviews • Recommendation services • Scope of Search • Books, Journals, and Media at the Title Level • Other local and open access content • Not in scope: • Articles • Book Chapters • Digital objects
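Faceted navigation, one of the interface features listed above, amounts to counting field values across a result set and then narrowing by a chosen value. The following is a minimal sketch; the record structure, field names, and sample records are hypothetical and not drawn from any discovery product.

```python
from collections import Counter
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical title-level records, as a catalog search might return them.
results: List[Record] = [
    {"title": "Title A", "format": "Book", "language": "English"},
    {"title": "Title B", "format": "Book", "language": "Spanish"},
    {"title": "Title C", "format": "Journal", "language": "English"},
]

def facet_counts(records: List[Record], field: str) -> Counter:
    """Count how many records carry each value of the given field."""
    return Counter(r[field] for r in records if field in r)

def apply_facet(records: List[Record], field: str, value: str) -> List[Record]:
    """Narrow the result set to records matching the selected facet value."""
    return [r for r in records if r.get(field) == value]

format_facet = facet_counts(results, "format")   # shown beside the result list
books_only = apply_facet(results, "format", "Book")
```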
Discovery from Local to Web-scale • Initial products focused on interface improvements • AquaBrowser, Endeca, Primo, Encore, VuFind • LIBERO Uno, Civica Sorcer, Axiell Arena • Mostly locally-installed software • Current phase is focused on pre-populated indexes that aim to deliver Web-scale discovery • Primo Central (Ex Libris) • Summon (Serials Solutions) • WorldCat Local (OCLC) • EBSCO Discovery Service (EBSCO) • Encore Synergy (no index, though)
Discovery Interface search model • Diagram: Search > Local Index and MetaSearch Engine > Search Results • Local Index sources: ILS Data, Digital Collections • MetaSearch Engine targets: ProQuest, EBSCOhost, MLA Bibliography, ABC-CLIO, … • Real-time query and responses
Web-scale Index-based Discovery (2009-present) • Diagram: Search > Consolidated Index > Search Results • Consolidated Index sources: ILS Data, Digital Collections, Web Site Content, Institutional Repositories, Aggregated Content packages, E-Journals, Reference Sources, … • Pre-built harvesting and indexing
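The index-based model, in which content from many sources is harvested and indexed ahead of time so that a search consults only the consolidated index, might be sketched as below. This is a toy inverted index under stated assumptions: the source names, document IDs, and texts are hypothetical, and real discovery services use far more elaborate indexing and relevance ranking.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Posting = Tuple[str, str]  # (source name, document ID)

def build_index(sources: Dict[str, Dict[str, str]]) -> Dict[str, Set[Posting]]:
    """Harvest every document from every source into one consolidated index."""
    index: Dict[str, Set[Posting]] = defaultdict(set)
    for source_name, docs in sources.items():
        for doc_id, text in docs.items():
            for term in text.lower().split():
                index[term].add((source_name, doc_id))
    return index

def search(index: Dict[str, Set[Posting]], query: str) -> Set[Posting]:
    """Answer a query from the pre-built index alone; no remote source is contacted."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)  # documents containing every query term

# Hypothetical harvested content from two kinds of sources.
sources = {
    "ils": {"b1": "cloud computing for libraries"},
    "ejournals": {"a1": "cloud discovery services", "a2": "print circulation trends"},
}
index = build_index(sources)
hits = search(index, "cloud")
```

The metasearch model differs in that each query would be sent to every remote source in real time and the responses merged afterward, so response time depends on the slowest source.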