Cloud Storage
Scott Friedman
UCLA Office of Information Technology
Institute for Digital Research and Education, Research Technology Group
2013 CENIC – Building Blocks for Next Gen Networks
Calit2 / UCSD, San Diego, CA
March 11, 2013
Overview
• Archival storage for research data
  • Principal use case
• What drove us to this service/solution
  • Key factors
• What we are implementing
  • Not implementing, and what's different
• Costs and connectivity
  • Rate structure and status
Background
• IDRE provides HPC storage to campus clusters
  • Both scale-out NAS and parallel storage available
  • Relatively pricey (we charge $500/TB/year)
• Many of our HPC users, however, were asking for
  • Bulk, archival, project, and backup space
  • More affordable and flexible pricing options
• We had no solution
• Others – both HPC and enterprise
  • Could provide us with solutions
  • Just none that we or our users could afford
  • Performance and/or complexity issues
Background
• Meanwhile…
  • Non-HPC users are generating a lot of data too
  • Storing their data in all kinds of "creative" ways
  • I am certain many of you are familiar with this
• The common theme, however, is that it's CHEAP
  • $500/TB/year is not going to happen in a million years
  • They have a nephew, neighbor, or whoever that can do it even cheaper
• What do we do?
  • Nothing?
Evolution
• Doing nothing sounds good actually…
  • We have no budget for this
  • We have no mechanism flexible enough
    • to provide a service in the way our users want
    • that the university will allow us to implement
  • …nothing is still sounding pretty good
• This bothered us
  • Saying NO all the time is tiresome
• Since we were so constrained
  • We imagined the service we'd provide if we had no constraints
  • What would it look like, given the realities our users live with
    • i.e., little or no money
Approach
• We built a self-contained service
  • Funds itself, covers the cost of the service only
    • Keeps rates down – does not spread cost over our entire operation
  • Long-term funding model (no grants)
• Leverage our buying power and negotiating skills
  • Volume purchase
  • Sized to identifiable (initial) demand
• Cost appropriate
  • Priced below what users are spending now (for junk)
  • Offer discounts for volume and time commitments
Roadblocks?
• Two problems
  • Will vendors play ball?
    • Turns out, yes, they will
    • Not so surprising
  • Will the university play ball?
    • Turns out, also yes (!)
    • Very surprising
• Although it has taken eight months to work out the details
  • Pays to be persistent
  • Won't pretend this hasn't been painful
  • Budget environment helped
Definition
• We have a service, but what is it?
  • Storage
  • No kidding, but what should it look like?
• We decided to ask the users
  • A novel approach…
• Some surprising answers
  • File-based access the number one request – by far…
  • Block-based a distant second, web-based (link sharing) third
  • Object-based – almost no one (surprising to us)
  • Lots of application-type services – backup, Dropbox-like syncing, collaboration, web hosting, etc. – too many to count
Focusing
• No problem
  • We can do file- and block-based all day long
  • Can do the other stuff too
• But
  • Different groups wanting to do all kinds of things
  • Security and policy requirements
  • Everyone wants everything
    • What's new with that?
Compromise
• Cannot be everything to everyone
  • It will ruin our cost model
  • We will do it badly
  • Everyone will be unhappy
• Try not to duplicate what others are offering
• We made two important choices
  • Virtualize the front end and the storage
  • Everything to everyone – up to a point
    • Not saying yes and not saying no
Design
• We envision a layering of storage-based services
  • Provide basic file/block-based storage
    • …and that is it
  • More complex services can be stacked on top
    • Backup, syncing, collaboration, data management, portals, etc.
• Virtualization
  • Security and namespace separation
  • High availability and load balancing
Operation
• Users given a storage endpoint
  • NFS export and/or iSCSI target
  • Possibly other options (CIFS, Web)
• No access to the server by users, ever
  • Control via web account portal
• No-access default policy
  • Encryption on the client side – we never have the keys (see the sketch below)
  • Solution arrived at in consultation with campus and medical center security offices
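A minimal sketch of the "encrypt on the client, provider never holds keys" model described above. The mount point, key location, and file names are hypothetical, and Fernet (from the third-party cryptography package) is only a stand-in for whatever encryption tool a group actually chooses; the service itself just sees ciphertext on the exported endpoint.

```python
# Hypothetical client-side encryption sketch: only ciphertext ever reaches the
# mounted storage endpoint; the key stays on the user's machine.
from pathlib import Path
from cryptography.fernet import Fernet

MOUNT = Path("/mnt/idre-archive")        # hypothetical NFS mount of the storage endpoint
KEY_FILE = Path.home() / ".archive.key"  # key lives on the client, never on the server

def load_or_create_key() -> bytes:
    """Generate a key once and keep it locally; the service never sees it."""
    if KEY_FILE.exists():
        return KEY_FILE.read_bytes()
    key = Fernet.generate_key()
    KEY_FILE.write_bytes(key)
    return key

def archive(local_file: str) -> None:
    """Encrypt a file on the client, then write only ciphertext to the mount."""
    f = Fernet(load_or_create_key())
    ciphertext = f.encrypt(Path(local_file).read_bytes())
    (MOUNT / (Path(local_file).name + ".enc")).write_bytes(ciphertext)

def restore(name: str, dest: str) -> None:
    """Read ciphertext back from the mount and decrypt locally."""
    f = Fernet(load_or_create_key())
    plaintext = f.decrypt((MOUNT / name).read_bytes())
    Path(dest).write_bytes(plaintext)

if __name__ == "__main__":
    archive("results.tar.gz")
    restore("results.tar.gz.enc", "results.restored.tar.gz")
```

The same pattern works over an iSCSI target by layering block-level encryption (e.g., LUKS) on the attached volume instead of encrypting individual files.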
Implementation
• Reliability over performance
  • However, we do not expect performance to be an issue
  • Enterprise/HPC-level hardware
• We will offer local mirroring of data
  • Two copies across two on-campus data centers
• Future service to offer replication OoSC
  • Separately charged service
  • Location TBD
Details (aggregate totals worked out below)
• VM cluster
  • 8 nodes: 256 GB, 160 Gbps aggregate bandwidth each
• Storage sub-system
  • 4 cells: approximately 500 TB, 80 Gbps aggregate bandwidth each
• Network
  • 4x10 Gbps to our data center core routers
    • Easily expandable
  • 2x10 Gbps to the UCLA backbone
  • 100 Gbps campus L2 overlay / CENIC
• All elements redundant
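A back-of-the-envelope tally of the figures above; only the per-node and per-cell numbers come from the slide, the totals are simply multiplied out.

```python
# Back-of-the-envelope totals from the per-node / per-cell figures above.
VM_NODES = 8
VM_BW_GBPS_EACH = 160            # per-node aggregate bandwidth

STORAGE_CELLS = 4
CELL_CAPACITY_TB = 500           # approximate capacity per cell
CELL_BW_GBPS_EACH = 80           # per-cell aggregate bandwidth

CORE_UPLINK_GBPS = 4 * 10        # 4x10 Gbps to the data center core routers

vm_bw_total = VM_NODES * VM_BW_GBPS_EACH              # 1,280 Gbps across the VM cluster
storage_capacity = STORAGE_CELLS * CELL_CAPACITY_TB   # ~2,000 TB (~2 PB)
storage_bw_total = STORAGE_CELLS * CELL_BW_GBPS_EACH  # 320 Gbps across the cells

print(f"VM cluster bandwidth: {vm_bw_total} Gbps")
print(f"Storage capacity:     ~{storage_capacity} TB")
print(f"Storage bandwidth:    {storage_bw_total} Gbps")
print(f"Core uplink:          {CORE_UPLINK_GBPS} Gbps (the external path is the narrower one)")
```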
Details – network diagram (summary): the campus network connects at 2x10G to an existing Cisco 5048 10G switch, which feeds a Mellanox SX-1016 10G switch at 4x10G. Two HP 6500 chassis (4x SL230 Gen8 nodes each, 256 GB and 2x2 40G per node) uplink at 4x40G to a Mellanox SX-1036 40G switch; a 6x40G Mellanox MetroX link joins the two on-campus data centers (MSA DC and POD DC), with additional 4x40G links in beta testing. Cisco SG500 1G switches carry management and backup iSCSI traffic, and each site houses a NEXSAN E60+E60X+E60X data cell (1.2+ PB usable) attached at 4x10G/4x10G.
Network
• Non-UCLA users become more realistic
  • As we improve connectivity to CENIC (100 Gbps)
  • As other campuses do the same
    • Not necessarily at 100 Gbps
• We see 100 Gbps as an aggregation technology
  • From the perspective of this service
  • Local data repositories
  • Data movement to/from labs/SCCs
Status
• We are in the final stage of campus approval
• Storage will be available in
  • Volume increments of 1 TB
  • Time increments of 1 year
• Discount rate structure (see the sketch below)
  • Breaks at 2, 4, 8, 16, 32, and 64 TB
  • Breaks at 2, 3, 4, and 5 years
• Prices
  • From less than $180 down to $115 per TB per year
• Availability
  • UCLA focus; other UC campuses if there is interest
  • Testing now, June production
  • Interested in non-UCLA testers/users
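A hypothetical illustration of how the tiered rates above might be looked up. The volume breaks (2–64 TB), term breaks (2–5 years), and the under-$180 to $115 per TB per year endpoints come from the slide; the per-tier discounts in between are made-up placeholders, not the service's actual price list.

```python
# Illustrative rate lookup for the discount structure above.
# Only the break points and the $180/$115 endpoints come from the slide;
# the per-tier discount amounts are placeholders.
from bisect import bisect_right

VOLUME_BREAKS_TB = [2, 4, 8, 16, 32, 64]   # discount tiers by committed volume
TERM_BREAKS_YR = [2, 3, 4, 5]              # discount tiers by committed term

BASE_RATE = 179.0                          # "less than $180" per TB per year
FLOOR_RATE = 115.0                         # best advertised rate per TB per year
VOLUME_DISCOUNT_PER_TIER = 0.04            # placeholder
TERM_DISCOUNT_PER_TIER = 0.04              # placeholder

def rate_per_tb_year(volume_tb: int, term_years: int) -> float:
    """Return an illustrative $/TB/year rate for a given commitment."""
    vol_tier = bisect_right(VOLUME_BREAKS_TB, volume_tb)   # volume tiers reached
    term_tier = bisect_right(TERM_BREAKS_YR, term_years)   # term tiers reached
    discount = vol_tier * VOLUME_DISCOUNT_PER_TIER + term_tier * TERM_DISCOUNT_PER_TIER
    return max(FLOOR_RATE, BASE_RATE * (1.0 - discount))

if __name__ == "__main__":
    for vol, yrs in [(1, 1), (8, 1), (8, 3), (64, 5)]:
        print(f"{vol:>3} TB for {yrs} yr -> ${rate_per_tb_year(vol, yrs):.2f}/TB/year")
```

With these placeholder discounts, the maximum commitment (64 TB for 5 years) bottoms out at the $115 floor, matching the advertised range.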
Cloud Storage
Thank You!
Questions / Comments?
Scott Friedman
friedman@idre.ucla.edu