This presentation outlines the goals, partners, infrastructure, and features of the National Data Storage (NDS) system in the PIONIER network. It discusses the system's components, replication options, and potential applications. NDS aims to provide distributed, reliable, and secure data storage with broadband access.
National Data Storage (NDS) in the PIONIER*) network Maciej Brzeźniak, Norbert Meyer, Rafał Mikołajczak, Maciej Stroiński *) PIONIER - Polish Optical Internet
National Data Storage (NDS) in the PIONIER network • Outline: • Project partners and status • Goals of the project • Components and infrastructure (including the PIONIER network) used to build the NDS system • Main NDS features + added value of NDS • Overall NDS architecture • Example NDS use cases + replication options in NDS • Potential end-users and applications • A few words about other storage-related projects
NDS Project Partners • 4 academic computing centres + 4 universities in Poland: • Academic Computing Center CYFRONET AGH, Cracow • Academic Computing Center in Gdańsk • Częstochowa University of Technology • Marie Curie-Skłodowska University in Lublin • Poznań Supercomputing and Networking Center • Technical University of Białystok • Technical University of Łódź • Wrocław Supercomputing and Networking Center
National Data Storage - Goals • A data storage system that: • is distributed - no centralisation! • has national 'coverage', • is reliable and secure, • provides broadband access. • Services: • Backup/Archive services • Application-level data storage: • logical filesystem: • single logical namespace (visible from multiple access points) • separate logical namespaces • accessible through: • SCP, (s)FTP, HTTP(S) protocols • and other techniques
National Data Storage components: existing and new • Existing components: • Network • Storage Hardware • Storage Management Software • New components: • NDS System Management Software
NDS existing components – PIONIER network: physical links [Map legend: installed fibers, leased fibers, PIONIER nodes, planned for 2007]
NDS existing components – PIONIER network: logical links [Map legend: GÉANT2 10+10 Gb/s edu traffic, 5 Gb/s Internet, 2x10 Gb/s (2 lambdas), CBDF 2x10 Gb/s (2 lambdas), 1 Gb/s, Metropolitan Area Networks, Metropolitan Area Networks + Supercomputing Centres]
NDS components – Storage Hardware and Software • Hardware: • disk arrays, tape libraries • starting from 1.2-2 PB (disks + tapes) • 4x 50-200 TB of disks • 4x 200-400 TB of tapes • more in the future • Storage Area Networks • file servers, application servers • Software: • Storage Management Systems • Hierarchical Storage Management (HSM) systems • Backup/Archive systems
National Data Storage – main features • Target infrastructure: • 4 main storage nodes • 4 application nodes • embedded in the PIONIER network • Storage nodes: • provide data storage services • compose the system core • manage the data objects, file space, user accounts… • control network/hardware/software components • Application nodes: • provide additional services on top of the core services, e.g.: • searching based on metadata • versioning • custom interfaces to data
National Data Store – Added Value • High level of dependability: • Data & services availability: • Geographical replication - replicas stored in multiple, distant sites • Hardware/software component redundancy • + high-end, by-design redundant components • Backbone network link redundancy • Fault-tolerance features in the NDS management software • Decentralisation vs coherency of data and metadata: • Coherency maintained by the NDS management software • of course challenging… - this is the 'core' of the research work; the rest is mainly deployment work
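The availability gain from geographical replication can be illustrated with a back-of-envelope calculation. This is a sketch with illustrative numbers, not NDS measurements: each site is assumed to be independently available with probability p, and data stays reachable as long as at least one replica site is up.

```python
# Back-of-envelope availability gain from geographical replication.
# The per-site availability (0.99) is an illustrative assumption,
# not an NDS figure; sites are assumed to fail independently.
def availability(p_site: float, replicas: int) -> float:
    """Probability that at least one of the replica sites is up."""
    return 1.0 - (1.0 - p_site) ** replicas

for k in (1, 2, 3):
    print(f"{k} replica site(s): {availability(0.99, k):.6f}")
```

With these assumed numbers, a second distant replica already turns two nines of availability into four, which is the motivation for storing replicas in multiple sites rather than only adding redundancy inside one node.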
National Data Store - Added Value • High level of dependability (continued) • Data confidentiality and integrity: • Encryption: • Where: • on the way from the client to the system • optionally, before storing the client data into NDS • architecture to support both approaches • How? • certified cryptographic solutions (software- and/or hardware-based) used for clients that require them • Data integrity: • ensured by careful system design and security audits • evaluated e.g. by digest mechanisms: MD5, SHA-1…
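The digest-based integrity check mentioned above can be sketched as follows. This is an illustrative client-side sketch, not part of the NDS interfaces; the function name is hypothetical, and SHA-1/MD5 are the algorithms the slide names (stronger digests such as SHA-256 would work the same way).

```python
# Sketch of client-side integrity checking via digest mechanisms.
# file_digest() is a hypothetical helper, not an NDS API.
import hashlib

def file_digest(path: str, algo: str = "sha1", chunk: int = 1 << 20) -> str:
    """Compute the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

# Usage idea: the client records the digest before upload and
# re-computes it after retrieval; a mismatch indicates corruption
# in transit or at rest.
```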
User interfaces • Both 'standard' and 'custom' interfaces • standard: • B/A service, • application-level storage: (s)FTP, SCP… • custom: • B/A service with encryption + integrity checks • application-level storage with encryption + integrity checks • HTTP/HTTPS interface with metadata support; • metadata can be used later, e.g. for searching files • Why various interfaces? In order to: • allow different users to exploit different features • meet contradictory requirements, e.g. security vs simplicity
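The metadata idea behind the custom HTTP/HTTPS interface can be sketched with a tiny in-memory index: files are stored with key/value metadata that can be queried later. The class and its methods are illustrative stand-ins; NDS would keep such an index server-side, behind the HTTP(S) interface.

```python
# Minimal sketch of metadata-tagged storage with later search.
# MetadataIndex is a hypothetical stand-in for the server-side
# metadata support of the NDS HTTP/HTTPS interface.
class MetadataIndex:
    def __init__(self):
        self._meta = {}  # file id -> metadata dict

    def store(self, file_id, **metadata):
        """Record a file together with its key/value metadata."""
        self._meta[file_id] = dict(metadata)

    def search(self, **criteria):
        """Return ids of files whose metadata matches all criteria."""
        return [fid for fid, md in self._meta.items()
                if all(md.get(k) == v for k, v in criteria.items())]

idx = MetadataIndex()
idx.store("scan-001.tif", project="digitisation", owner="library")
idx.store("backup-17.tar", project="backup", owner="cyfronet")
print(idx.search(project="digitisation"))  # -> ['scan-001.tif']
```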
Replication options (0) No replication at all • Compliant with standards (e.g. industry-accepted B/A clients) • Data redundancy within the confines of a given node (RAIDs, redundant tape pools) (1) 'Off-line' • Data originally stored into one site, then replicated to another site • Suitable for standard access methods • Issues: • the user only gets metadata about the created replicas, e.g. by email or on the website • replication is not atomic with the 'store' operation (2) 'On-line' • Data replicas are created by the access point in parallel with the data storage process • The assumed number of replicas is created atomically with the 'store' operation • Limitations: • suitable for 'custom' access methods, incompatible with 'standard' ones • hard to implement, possible performance penalties
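The 'on-line' option can be sketched as follows: the access point writes the assumed number of replicas in parallel and reports success only if every replica write succeeds, so replication is atomic with the store operation. The node interface (write/delete) is a hypothetical stand-in for the NDS storage-node protocol, not a real API.

```python
# Sketch of 'on-line' replication: replicas written in parallel,
# all-or-nothing with the store operation. Node objects are
# hypothetical stand-ins for NDS storage nodes.
from concurrent.futures import ThreadPoolExecutor

def store_online(data: bytes, nodes, replicas: int = 2) -> bool:
    """Store data on `replicas` nodes in parallel; succeed only if all do."""
    targets = nodes[:replicas]
    with ThreadPoolExecutor(max_workers=replicas) as pool:
        results = list(pool.map(lambda n: n.write(data), targets))
    if all(results):
        return True
    # All-or-nothing: roll back partial writes so no stray replica remains.
    for node, ok in zip(targets, results):
        if ok:
            node.delete(data)
    return False
```

This also illustrates why the option is hard to implement and slower: the store call blocks on the slowest node, and a failure path must clean up partial replicas.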
Example use case – standard B/A client, off-line replication Store/retrieve data to/from the NDS Features: - no system-side data replication - load balancing on a per-session basis possible - BUT compliant with standard clients - NOTE that replication can be done on the client side (manually or automatically)
Example use case – advanced B/A client, on-line replication Store/retrieve data to/from the NDS Features: - on-line data replication! - dynamic load balancing possible - BUT not compatible with standard B/A clients
Potential end-users of NDS • Educational institutions and projects: • Backup/Archive services for universities • Cross-centre backup copies/recovery for academic computing centres • Storage space / file-sharing facilities for: • scientific/educational projects • national and EU R&D projects • Government offices and agencies: • Backup/Archive for government agencies and organisations • e.g. police cameras, metropolitan CCTV systems, customs agencies… • Secure storage/archival of financial, medical … data • such data are confidential 'by definition' • system certification for this kind of data would be required - out of scope of the project, but planned for the future • Other end-users: • Museums, digital libraries… • Digitisation (scanning of old books, paintings…)
Summary – National Data Store • User point of view • Reliable, secure and efficient (high-performance, broadband access) • Flexible - many possible interfaces, and other options to choose from • Can add extra functionality on top of the network links • Service Provider point of view • scalable system • (cost-)efficient solution, thanks to: • economies of scale • per-TB costs are lower for large-scale systems than for small ones • using our own network links • no need to pay anyone else for networking • optimal usage of resources: • HSM systems (i.e. disks + tapes + management) used when possible instead of pure disk-based storage - allows the use of economical media types • network channel reservation on demand (instead of persistent links)
A bit off-topic slide – other storage-related projects in PSNC • Currently running storage-related projects: • CoreGRID (NoE project): • WP2 (CoreGRID Institute on Knowledge and Data Management), • Task 2.1: Distributed Storage Management • Partners: FORTH, Crete, Greece (prof. Angelos Bilas' group) and SZTAKI, Hungary, UCY Cyprus (Zsolt Nemeth) • Already finished projects: • Secure data storage for a Digital Signature System (national R&D project) • data acquired from an Oracle Database and encrypted BEFORE going into the backup system (on the client side) • a hardware-based appliance secures the transmission/storage • encrypted data put into a regular Backup/Archive system • Evaluation of the performance of the iSCSI and iFCP protocols (published at a TERENA conference) • Automated Backup System - used internally in PSNC • Planned projects: • Evaluation of the cluster-based storage approach (e.g. in the NDS environment) • Perhaps a common EU project with FORTH…
Thank YOU! Contact: Maciej Brzeźniak, maciekb@man.poznan.pl Norbert Meyer, meyer@man.poznan.pl
End-user example – Police Department in Poznań Backup/Archive service for the City Video Monitoring System (CCTV) Cameras in Poznań: 2004 – 70 cameras 2005 – 85 cameras 2006 – 165 cameras 2007 – 200 cameras… 2 TB/day 60 TB/month Data must be stored for at least a month for security purposes and are retrieved for investigations when a crime happens. Tape media are ideal for long-term storage, so we provide a B/A service to the Police Department using our B/A system and tape libraries.
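The volumes above are internally consistent, as a quick check shows. The per-camera figure below assumes all 200 cameras contribute equally, which the slide does not state.

```python
# Sanity check of the CCTV figures: 2 TB/day over a 30-day retention
# window matches the stated 60 TB/month. The equal-contribution
# per-camera estimate is an illustrative assumption.
cameras = 200
tb_per_day = 2.0
retention_days = 30

monthly_tb = tb_per_day * retention_days        # 60.0 TB kept at any time
gb_per_camera_day = tb_per_day * 1000 / cameras  # ~10 GB/day per camera
print(monthly_tb, gb_per_camera_day)
```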
Next step – using NDS to provide the B/A service for CCTV at the national scale POLICE BIAŁYSTOK - temporary storage only NDS storage node in Poznań: • long-term storage (archiving) • backup copies POLICE ŁÓDŹ POLICE CZĘSTOCHOWA - temporary storage only