Explore the use cases, technical details, and state-of-the-art solutions for data storage and management in the cloud. Learn about block storage, object storage, native cloud solutions, and the EGI Federated Cloud.
Data Services and Solutions, Part 2: Data in the cloud
Enol Fernández, enol.fernandez@egi.eu
Data Services and Solutions - PART II
Outline
• Data in the cloud
• Use cases and technical details
• Block Storage
• Object Storage
• Native cloud solutions
• State of the art: status in EGI
• Examples
• Future Plans
What is the EGI Federated Cloud
The EGI Federated Cloud is a federation of institutional private clouds, offering Cloud Infrastructure as a Service to scientists in Europe and worldwide. The EGI Federated Cloud is based on:
• Standards and validation: the federation is based on common open standards (OCCI, CDMI, OVF, GLUE, etc.)
• Heterogeneous implementation: there is no mandate on the cloud technology; the only condition is to expose the chosen interfaces and services
FedCloud IaaS Capabilities
Block Storage
Persistent block-level storage to use with VMs
Object Storage
Data storage infrastructure for storing and retrieving data from anywhere at any time
Block Storage vs Object Storage
Use Cases
Block Storage: Typical Use
• Store your data on volumes
• Data persists independently of the VM
• Stripe volumes for better performance
• Share via a network filesystem (e.g. NFS) or use as a DB store
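To make the striping bullet concrete, here is a minimal sketch of how a RAID-0 style layout spreads a logical address space across several volumes. The function name, the stripe size and the volume count are illustrative assumptions; the slides do not say which striping tool or parameters are used in practice (inside a VM this would normally be handled by a tool such as LVM or mdadm, not by application code).

```python
from typing import Tuple


def locate_stripe(offset: int, stripe_size: int, n_volumes: int) -> Tuple[int, int]:
    """Map a logical byte offset to (volume index, byte offset inside that volume).

    Hypothetical helper for illustration only: consecutive stripes of
    `stripe_size` bytes are placed round-robin on `n_volumes` volumes.
    """
    if offset < 0 or stripe_size <= 0 or n_volumes <= 0:
        raise ValueError("offset must be >= 0; stripe_size and n_volumes must be > 0")
    # Which stripe the byte falls in, and where inside that stripe.
    stripe_no, within = divmod(offset, stripe_size)
    # Stripes are dealt out round-robin across the volumes.
    volume = stripe_no % n_volumes
    # Number of full stripes already stored on that volume before this one.
    row = stripe_no // n_volumes
    return volume, row * stripe_size + within


if __name__ == "__main__":
    # With two volumes and 4-byte stripes, neighbouring stripes alternate volumes,
    # so large sequential reads and writes can hit both volumes in parallel.
    for off in (0, 4, 8, 13):
        print(off, locate_stripe(off, stripe_size=4, n_volumes=2))
```

Because neighbouring stripes live on different volumes, a large sequential transfer can be served by several volumes at once, which is where the performance gain comes from.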
Block Storage: OCCI
• OCCI (Open Cloud Computing Interface) is an OGF standard API to facilitate interoperable access to cloud resources
• Block storage in FedCloud is managed via OCCI:
  • Create/delete volumes
  • Attach/detach (link/unlink in OCCI terms) volumes to VMs
• Once attached, a volume is used like any other disk in the VM
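The create and link operations listed above can be sketched as HTTP header sets in the OCCI `text/occi` rendering. This is a hedged illustration, not the FedCloud client: the helper names, the volume title and the resource paths are hypothetical, the endpoint URL and authentication are deliberately left out, and the exact collection paths to POST to depend on the site's OCCI server.

```python
from typing import List, Tuple

# Scheme of the OCCI infrastructure kinds (storage, storagelink, compute, ...).
INFRA_SCHEME = "http://schemas.ogf.org/occi/infrastructure#"

Headers = List[Tuple[str, str]]


def volume_create_headers(title: str, size_gb: int) -> Headers:
    """Headers for a POST that asks the server to create a storage resource.

    Hypothetical helper; the size is given in GB as in the OCCI
    infrastructure model (check the version your site implements).
    """
    return [
        ("Content-Type", "text/occi"),
        ("Category", f'storage; scheme="{INFRA_SCHEME}"; class="kind"'),
        ("X-OCCI-Attribute", f'occi.core.title="{title}"'),
        ("X-OCCI-Attribute", f"occi.storage.size={size_gb}"),
    ]


def volume_link_headers(vm_path: str, volume_path: str) -> Headers:
    """Headers for a POST that links (attaches) a volume to a VM.

    `vm_path` and `volume_path` are the resource locations returned by the
    server when the VM and the volume were created (illustrative values below).
    """
    return [
        ("Content-Type", "text/occi"),
        ("Category", f'storagelink; scheme="{INFRA_SCHEME}"; class="kind"'),
        ("X-OCCI-Attribute", f'occi.core.source="{vm_path}"'),
        ("X-OCCI-Attribute", f'occi.core.target="{volume_path}"'),
    ]


if __name__ == "__main__":
    for name, value in volume_create_headers("data-vol", 10):
        print(f"{name}: {value}")
    print()
    for name, value in volume_link_headers("/compute/vm-1", "/storage/vol-1"):
        print(f"{name}: {value}")
```

Detaching and deleting are the mirror images: an HTTP DELETE on the link resource, then on the storage resource.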
Object Storage: CDMI
• FedCloud object storage is managed via CDMI (Cloud Data Management Interface)
• RESTful API for operations on storage objects
• Developed by SNIA, now ISO/IEC 17826
• Very flexible API, based on capabilities:
  • Object basic capabilities (create/get/delete/list)
  • Object ACLs
  • Import from external sources, export as filesystems
  • …
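As an illustration of the basic "create" capability, the sketch below builds (but does not send) a CDMI request that stores a small text object. The endpoint, container and object names are hypothetical, the specification version is an assumption about what the server speaks, and authentication is omitted.

```python
import json
import urllib.request

# Assumed CDMI version; a real client should match what the server advertises.
CDMI_VERSION = "1.0.2"


def build_put_object(endpoint: str, path: str, text: str) -> urllib.request.Request:
    """Build a CDMI PUT request that creates a data object holding `text`.

    Hypothetical helper: nothing is sent over the network here; pass the
    result to urllib.request.urlopen() (plus credentials) to execute it.
    """
    # CDMI data objects are described by a JSON body with the value inline.
    body = json.dumps({"mimetype": "text/plain", "value": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/{path.lstrip('/')}",
        data=body,
        method="PUT",
        headers={
            "X-CDMI-Specification-Version": CDMI_VERSION,
            "Content-Type": "application/cdmi-object",
            "Accept": "application/cdmi-object",
        },
    )


if __name__ == "__main__":
    req = build_put_object("https://cdmi.example.org/", "/mycontainer/hello.txt", "hello")
    print(req.get_method(), req.full_url)
    for name, value in req.header_items():
        print(f"{name}: {value}")
    print(req.data.decode("utf-8"))
```

Reading the object back is a GET on the same URL with the same `Accept` and version headers; deleting it is a DELETE.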
Native Cloud Solutions
• Cloud Management Frameworks (CMF) provide their own APIs for managing cloud storage
• These usually offer more features than OCCI/CDMI
• However, they are not (yet) fully integrated in EGI’s FedCloud
State of the Art: Block Storage
• Block storage is supported on all FedCloud CMFs and sites
State of the Art: Object Storage
• CDMI support
• CDMI server framework by Synnefo
• Ongoing effort to support OpenStack
• Basic client available
• Native APIs allow basic and advanced capabilities
Example: Chipster
• Chipster is a graphical application for data analysis, with a server backend
• The original Chipster VM included a big collection of tools and data (~200GB)
• Deployment at FedCloud
• Separated VMs from tools and data with block storage
• NFS server for these volumes
• Chipster VMs mount the NFS exports on start-up
[Diagram labels: EGI FedCloud Resource Provider; Chipster VM (x2); NFS Server; Tools Volume; Data Volume]
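The NFS arrangement described above can be sketched as two configuration lines: one `/etc/exports` entry on the NFS server and one `/etc/fstab` entry on each Chipster VM. The helper names, host name, network range, paths and mount options below are all illustrative assumptions; the slides do not give the actual Chipster configuration.

```python
def exports_line(export_dir: str, client_network: str, read_only: bool = True) -> str:
    """Build an /etc/exports entry for the NFS server (hypothetical values)."""
    mode = "ro" if read_only else "rw"
    return f"{export_dir} {client_network}({mode},sync,no_subtree_check)"


def fstab_line(server: str, export_dir: str, mount_point: str, read_only: bool = True) -> str:
    """Build an /etc/fstab entry so a client VM mounts the export at boot."""
    mode = "ro" if read_only else "rw"
    # _netdev delays the mount until the network is up.
    return f"{server}:{export_dir} {mount_point} nfs {mode},_netdev 0 0"


if __name__ == "__main__":
    # Server side: export the directory where the tools volume is mounted.
    print(exports_line("/srv/tools", "10.0.0.0/24"))
    # Client side: every Chipster VM mounts the same export on start-up.
    print(fstab_line("nfs-server", "/srv/tools", "/opt/tools"))
```

Keeping the tools and data on volumes behind one NFS server means the Chipster VM image itself stays small and several VMs can share the same ~200GB collection.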
Example: EISCAT-3D (I)
• EISCAT-3D is a 3D imaging radar to be located in the northernmost parts of Europe
• The Open Source Geospatial Catalogue (OSGC) Portal provides access to the data stored in Object Storage providers at FedCloud
• Extra services are planned to further process the EISCAT-3D data and make it available in the portal
Example: EISCAT-3D (II)
[Architecture diagram; labels in extraction order: Off-site; On-site; EGI Federated Cloud; EISCAT archive; Object Storage; Juelich site (DE); OpenStack SWIFT; Open Source Geospatial Catalogue (OSGC); CESNET site (CZ); 5m files, ~1TB in total; CDMI with HTTP export; wget; web browser; Catalogue; Scientific users]
[Phase 1: In ENVRI. Phase 2: In EGI-Engage. Further labels: Near Real Time tool to import data automatically from receiving stations; Data administrators; Pre-processing service 1 ... N; Admin tools; Processing / visualization service 1 ... N]
Plans
• EGI-ENGAGE:
  • Effort to further develop OCCI/CDMI interfaces in FedCloud
  • Onedata development
  • Storage Testbed
• Other related projects:
  • INDIGO will develop (data) cloud solutions
Onedata: distributed multi-provider storage
• Flexible access control
• Intra-federation scenarios for sharing data
• Works with tokens or X.509 certificates
• POSIX client for mounting the user’s space
• Scalable from a single NAS to a large datacentre
• Can be deployed on top of high-performance parallel storage solutions with very small overhead (< 5%)
• Support for open data scenarios in preparation
• Onedata is currently supported by: PLGrid, EGI-Engage, INDIGO-DataCloud, ESPREX for ISS
Storage Testbed
• The testbed will make it possible to:
  • Test tools and setups on a distributed and sufficiently large collection of resources
  • Pilot applications to be migrated to production
• Currently looking for Resource Providers
• Join as users/use cases to articulate requirements and preferences for this infrastructure
References
EGI Federated Cloud resources:
• Wiki site: http://go.egi.eu/fedcloud
• User support: https://wiki.egi.eu/wiki/Federated_Cloud_user_support
• User support e-mail: support@egi.eu
• Federated Cloud Communities: https://wiki.egi.eu/wiki/Federated_Cloud_Communities
• Federated Cloud Storage HOWTO: https://wiki.egi.eu/wiki/HOWTO09_How_to_use_Federated_Cloud_Storage
Related standards:
• OCCI: http://occi-wg.org
• CDMI: http://cdmi.sniacloud.com/