Cloud Computing Storage Architecture and Costs Comparison for PDS
November 08, 2018, MC F2F in Houston
Action Item • From the August MC: use the storage and distribution metrics to evaluate potential cloud architecture tradeoffs and storage costs for PDS.
Approach • Focused only on storage of archival data for this exercise • There are many options for computing close to the data, with scalability benefits, that could be explored • Use the PDS data sizing exercise and unfiltered web metrics to determine costs • Data sizing estimates come from the August query to the nodes and are used to cost the storage infrastructure • Web metrics come from the August web logs and are used to cost download/egress charges • Use the Amazon Web Services (AWS) cost model as the cloud hosting organization, given NASA's AWS investment across the agency as well as at other organizations • Use AWS published monthly rates for S3 (storage service) and Glacier • Note: there are different alternatives for paying down AWS costs (e.g., prepaying) • Cost one copy of the data in AWS with a secondary copy on local hardware • Either primary or backup, based on requirements and cost tradeoffs
Relevant PDS Level 1/2/3 Requirements • 2.7 PDS will provide appropriate storage for its archive • 3.2.1 PDS will provide online mechanisms allowing users to download portions of the archive • 4.1.4 PDS will develop and implement a disaster recovery plan for the archive • 4.1.5 PDS will meet U.S. federal regulations for preservation and management of the data through its Memorandum of Understanding (MOU) with the National Space Science Data Center (NSSDC)
PDS-related Policies • PDS Policy on System Availability and Recovery (2008-08-29) • Recover from data loss from the secondary repository within one week • Recover from a catastrophic event within one month • PDS Policy on Online Data Repositories (2008-08-29) • All data will be held online in a primary repository • PDS Policy on Data Delivery and Backup (2005-10-07) • Three copies of the "volume" are preserved: two copies within PDS and one at the NSSDC
Three copy rule • 1. Operational Copies • Primary Storage – online, accessible for data distribution • Secondary Storage – accessible to rebuild the primary repository and/or switch over as a mirror • 2. Deep Archive • PDS will meet U.S. federal regulations for preservation and management of the data through its Memorandum of Understanding (MOU) with the National Space Science Data Center (NSSDC) • PDS assumes that it can recover its entire holdings from one of the operational copies
Architectural Considerations The PDS4 architecture decouples a "storage service" from the registry service, allowing storage to be implemented independently. This gives PDS tremendous flexibility to meet its requirements and policies using different storage architectures (local, hybrid, commercial cloud, PDS cloud, etc.), as sketched below.
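As an illustration of that decoupling, a registry could resolve products through a storage interface and stay agnostic about whether the bytes live on local disk or in S3. This is a minimal hypothetical sketch, not the actual PDS4 service API:

```python
from abc import ABC, abstractmethod

class StorageBackend(ABC):
    """Hypothetical interface a registry could use to resolve product bytes."""

    @abstractmethod
    def url_for(self, product_id: str) -> str:
        """Return a downloadable URL for the given product."""

class LocalBackend(StorageBackend):
    """Node-hosted web server fronting local disks."""
    def __init__(self, base_url: str):
        self.base_url = base_url

    def url_for(self, product_id: str) -> str:
        return f"{self.base_url}/{product_id}"

class S3Backend(StorageBackend):
    """Objects held in an S3 bucket, addressable directly over HTTPS."""
    def __init__(self, bucket: str, region: str = "us-west-2"):
        self.bucket = bucket
        self.region = region

    def url_for(self, product_id: str) -> str:
        return f"https://{self.bucket}.s3.{self.region}.amazonaws.com/{product_id}"
```

Swapping `LocalBackend` for `S3Backend` (or mixing them per node) would not change anything the registry sees, which is the flexibility the slide refers to.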
Amazon Web Services • Enormous ecosystem of capabilities • Storage – Simple Storage Service (S3), Glacier, etc. • EC2 – Elastic Compute • Significant support for running databases, ML applications, etc. • Ability to spin up virtual machines • This cost model looked at storage models and costs for using S3, where PDS data would be in the cloud and applications hosted locally.
Benefits of AWS S3 • Built-in REST access to any file for download (see the sketch below) • Can link in authentication • Versioning of files • 99.99% uptime • Encrypted transfers • Management of bucket security • Link to compute services either co-located (EC2/AWS Workspaces, etc.) or remote • Co-location can decrease egress and increase scalability • Custom PDS tools and services for operating on the data
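For example, the built-in REST access combined with authentication can be exercised through time-limited presigned URLs. A minimal boto3 sketch, assuming an already-populated bucket (bucket and key names are placeholders, not real PDS buckets):

```python
import boto3

s3 = boto3.client("s3")  # credentials come from the usual AWS config/env

# Hypothetical bucket/key layout for a PDS4 product
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "pds-img-archive", "Key": "bundle/data/image_001.xml"},
    ExpiresIn=3600,  # link is valid for one hour
)
print(url)  # anyone with this URL can download the object until it expires
```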
AWS Rates • Storage costs are generally broken into amount of storage plus egress (out of Amazon) • Writing to AWS does not carry a cost • Glacier is lower cost but is slower to access and has high costs for retrieval
Node Data Volume The PDS August total storage metric was about 1.58 PB.
Node Monthly Data Distribution PDS August data distribution (egress) was approximately 79 TB.
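Combining the rate structure above with the August metrics gives a back-of-the-envelope monthly estimate. The per-GB rates below are illustrative assumptions in the ballpark of 2018 AWS list prices, not the figures used in the detailed costing:

```python
# Illustrative rates (assumptions, not official AWS pricing)
S3_STORAGE_PER_GB = 0.023      # $/GB-month, S3 Standard
EGRESS_PER_GB = 0.09           # $/GB transferred out to the internet

storage_gb = 1.58e6            # ~1.58 PB of archival holdings (August metrics)
egress_gb = 79e3               # ~79 TB/month distributed (August web logs)

monthly = storage_gb * S3_STORAGE_PER_GB + egress_gb * EGRESS_PER_GB
print(f"~${monthly:,.0f}/month")   # roughly $36k storage + $7k egress at these rates
```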
Findings • Imaging and then GEO drive the PDS storage costs • Both storage and egress • Most other nodes do not have the same cost drivers • Some savings could be achieved through Glacier if data sets that are rarely (if ever) accessed could be identified • Different architectural models could be adopted that would affect costs (storage and egress) • Primary vs. secondary costs • Putting a subset of PDS data in the cloud
Other Considerations • PDS storage growth seems to be averaging approximately 200 TB/year • Assume that adds about 15% to the cost per year (see the projection below) • AWS does not eliminate the need for system administration • Need a trained "cloud system administrator" • Compute can be brought to the cloud through EC2 • This could reduce egress charges • Increases the need for a cloud administrator • Opens up opportunities for more novel integration of computational capabilities • Other models for cloud and a "PDS Storage Service" are possible • Other commercial vendors; hosted locally
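As a sanity check on the growth assumption, 200 TB/year against ~1.58 PB of holdings is roughly a 13% annual increase in stored volume before any egress growth, consistent with the ~15% figure. A small projection sketch, reusing the same illustrative storage rate as above:

```python
S3_STORAGE_PER_GB = 0.023      # illustrative $/GB-month, as before
storage_gb = 1.58e6            # current holdings (~1.58 PB)
growth_gb_per_year = 200e3     # ~200 TB/year observed growth

for year in range(1, 6):
    storage_gb += growth_gb_per_year
    print(f"Year {year}: {storage_gb / 1e6:.2f} PB, "
          f"~${storage_gb * S3_STORAGE_PER_GB:,.0f}/month storage")
```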
Recommendation • Given the increasing importance of cloud computing, recommend PDS perform a cloud pilot for storage • EN would chair and provide a cloud instance for evaluation • Identify one or more nodes that would demonstrate a PDS4 implementation with data stored in the cloud • Present an experience report to the MC in August 2019 for discussion • Feed into a longer-term PDS compute and storage roadmap • Rename the PMWG and task it to lead the cloud pilot study • PDS has used the PMWG (Physical Media WG) in the past; the name seems to be well past its prime and should be changed • Possible names: Storage Management WG, Infrastructure Management WG, etc.
Study • Set up AWS/S3 instances and host a few PDS4 bundles (see the sketch below) • Assign different buckets to different nodes • Work with nodes to transfer bundles to AWS • Configure a test instance to show product access and distribution via the S3 REST API • Link from EN- and node-hosted web applications • Demonstrate the ability to easily share an API for data access across nodes
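A minimal sketch of the first two steps of the study (per-node buckets plus a bundle transfer) using boto3; all bucket names, regions, and paths are placeholders for whatever the pilot actually chooses:

```python
import boto3

s3 = boto3.client("s3", region_name="us-west-2")

# One bucket per participating node (hypothetical names)
for bucket in ("pds-pilot-img", "pds-pilot-geo"):
    s3.create_bucket(
        Bucket=bucket,
        CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
    )

# Upload one file of a PDS4 bundle; a real transfer would walk the bundle tree
s3.upload_file(
    Filename="insight_cameras/bundle.xml",   # local path (placeholder)
    Bucket="pds-pilot-img",
    Key="insight_cameras/bundle.xml",        # keep the same relative layout in S3
)
```

Once objects are in place, the presigned-URL sketch shown earlier (or plain HTTPS access for public buckets) covers the product access and distribution step.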