Overview of Lustre
ECE, U of MN
Changjin Hong (Prof. Tewfik’s group)
hongcj92@ece.umn.edu
Monday, Aug. 19, 2002
Outline
• Reference
• Lustre Cluster
• Lustre System Components
• Distributed Lock Manager
• Object Based Storage
• Conclusion (security issues)
Reference
• Lustre: A SAN File System for Linux
  • http://www.lustre.org/docs/lustre/luswhite.pdf
• Several presentation materials from Dr. Peter J. Braam
A Lustre Cluster
[Figure: diagram of a Lustre cluster; surviving scale labels: 10,000’s, 10’s of nodes, 1,000’s]
Key Design Issue: Scalability
• I/O throughput
  • How to avoid bottlenecks
• Metadata scalability
  • How can 10,000’s of nodes work on files in the same folder
• Cluster recovery
  • If something fails, how can transparent recovery happen
• Management
  • Adding, removing, replacing systems; data migration & backup
Interaction between systems
[Figure: Client, MDS and OST connected by labeled links]
• MDS and OST: pre-allocation for file creation, recovery, file status
• Client and MDS (CMD protocol): (directory) metadata handling, inode updates, concurrency
• Client and OST (OS protocol): file I/O, allocation of blocks, striping, security enforcement
Client File System
• A directory tree, subdivided into filesets, for cluster-wide Unix file sharing semantics
• CMD protocol
• Transaction-based
• Authenticated access
• Write-behind caching for metadata (MD) updates with strict data/metadata coherency
Metadata Service (MDS)
• All access to a file is governed by the MDS, which directly or indirectly authorizes access
• Controls the namespace and manages inodes
• Load-balanced cluster service for scalability (a well-balanced API, a stackable framework for logical MDS, replicated MDS)
• Journaled, batched metadata updates
Object Storage Targets (OST)
• Keep file data objects
• File I/O service → access to the objects
• Block allocation for data objects, leading to distribution and scalability
• OST s/w modules
  • OBD server, lock server
  • Object storage driver, OBD filter
  • Portal API
Distributed Lock Manager
• Provides a generic, rich lock service
• Lock resources: resource database
  • Organize resources in trees
• High performance
  • The node that acquires a resource manages its tree
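The idea that resources are organized in trees, and that the acquiring node manages the tree, can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not Lustre's implementation: the class `ResourceDatabase`, its `acquire` method, and the rule that the first node to acquire anything under a root becomes that tree's manager are all hypothetical names and simplifications.

```python
class ResourceDatabase:
    """Toy lock-resource database; names and behavior are illustrative only."""

    def __init__(self):
        self.tree_manager = {}  # root resource name -> node managing that tree
        self.children = {}      # resource path (tuple) -> set of child names

    def acquire(self, node, path):
        """Register `path`, e.g. ("fs", "home", "a.txt"), on behalf of `node`.

        Returns the node that manages the tree the resource belongs to.
        """
        if not path:
            raise ValueError("path must name at least a root resource")
        root = path[0]
        # Assumption: the first node to acquire anything under `root`
        # becomes the manager of the whole tree.
        manager = self.tree_manager.setdefault(root, node)
        # Record every parent -> child edge along the path.
        for depth in range(1, len(path)):
            self.children.setdefault(path[:depth], set()).add(path[depth])
        return manager


db = ResourceDatabase()
first = db.acquire("node1", ("fs", "home", "a.txt"))
second = db.acquire("node2", ("fs", "home", "b.txt"))
```

Here both calls return the same manager, because the second request falls under a tree that is already managed.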
Big Picture: resource tree and namespace
[Figure: labels that survive are Apps.; <namespace> with Name1, Name2, Name3, Name4, …; distributed resource directory / hash function (LDWV) / lock directory; resource managers (R); Obj.1, Obj.2, Obj.3, Obj.4]
Mechanism in resource DB
• Hash the binary string, modulo N → get h
• Look up the system in the lock directory weight vector (LDWV) at slot [h] → find system K
• Systems
  • May occupy 0, 1 or more slots in the LDWV
  • The number of slots is the system's lock directory weight
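The lookup steps above can be sketched as follows. This is a minimal illustration, not the actual Lustre code: the function names, the use of SHA-1 as the hash, the node names, and the reading of N as the number of slots in the LDWV are all assumptions made for the example.

```python
import hashlib


def build_ldwv(weights):
    """Build a lock directory weight vector (LDWV).

    `weights` maps a system name to its lock directory weight; each system
    occupies that many slots (possibly zero).
    """
    ldwv = []
    for system, weight in weights.items():
        ldwv.extend([system] * weight)
    return ldwv


def resource_hash(name: bytes, n: int) -> int:
    """Hash a binary string and reduce it modulo n (SHA-1 is an assumption)."""
    digest = hashlib.sha1(name).digest()
    return int.from_bytes(digest[:8], "big") % n


def lookup_system(name: bytes, ldwv):
    """Return the system K that holds lock-directory duty for `name`."""
    h = resource_hash(name, len(ldwv))  # assumption: N is the LDWV length
    return ldwv[h]


# Hypothetical cluster: nodeA has weight 2, nodeB weight 1, nodeC weight 0.
weights = {"nodeA": 2, "nodeB": 1, "nodeC": 0}
ldwv = build_ldwv(weights)
owner = lookup_system(b"/home/alice/data", ldwv)
```

A system with weight 0 never appears in the vector, so it is never chosen; a system with a larger weight is chosen proportionally more often.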
Lustre DLM features
• Low concurrency
  • Want write-back caching
• High concurrency
  • Want load balancing in cluster
  • Subdivide directories etc. with hashes
  • Want the server to service the request, to limit lock revocations → ops. on the MD cluster in a client-server RPC model
• Deadlock detection
Object Based Storage
• Object Based Storage Device
  • More intelligent than a block device
  • Speaks storage at the “inode level”
    • create, unlink, read, write, getattr, setattr…
    • Iterators, security, almost arbitrary processing
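To make the “inode level” interface concrete, here is a small in-memory sketch of the operations listed above. It is a toy under stated assumptions, not the OBD API: the class name `ObjectStore`, the method signatures, and the `size` attribute are hypothetical and chosen only for illustration.

```python
import itertools


class ObjectStore:
    """Toy object-based storage device holding objects in memory."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def create(self, obj_id=None):
        """Create an object, optionally with a caller-chosen id."""
        if obj_id is None:
            obj_id = next(self._ids)
            while obj_id in self._objects:  # skip ids already taken
                obj_id = next(self._ids)
        elif obj_id in self._objects:
            raise FileExistsError(obj_id)
        self._objects[obj_id] = {"data": bytearray(), "attrs": {}}
        return obj_id

    def unlink(self, obj_id):
        del self._objects[obj_id]

    def write(self, obj_id, offset, data):
        buf = self._objects[obj_id]["data"]
        if len(buf) < offset:  # zero-fill any hole before the write
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)

    def read(self, obj_id, offset, length):
        return bytes(self._objects[obj_id]["data"][offset:offset + length])

    def getattr(self, obj_id):
        obj = self._objects[obj_id]
        return {"size": len(obj["data"]), **obj["attrs"]}

    def setattr(self, obj_id, **attrs):
        self._objects[obj_id]["attrs"].update(attrs)


store = ObjectStore()
oid = store.create()
store.write(oid, 0, b"hello")
store.setattr(oid, owner="alice")
```

The caller never sees block addresses; allocation stays inside the device, which is the point of the object interface.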
Components of OB Storage
• Storage Object Device Drivers
  • Class drivers: attach driver to interface
  • Targets, clients: remote access
  • Direct drivers: manage physical storage
  • Logical drivers: for intelligence & storage management
• Object storage applications (OSA)
  • (Cluster) file systems
  • Advanced storage: parallel I/O, snapshots
  • Specialized apps.: caches, databases, file servers
System Interface
• Modules
  • Load the kernel modules to get drivers of a certain type
  • Name devices to be of a certain type
  • Build stacks of devices with assigned types
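The stacking idea can be illustrated with a short user-space sketch. Everything here is hypothetical: the registry `DRIVER_TYPES` stands in for loaded kernel modules, and `DirectDriver`, `CountingDriver` and `build_stack` are invented names, not Lustre interfaces.

```python
class DirectDriver:
    """Hypothetical direct driver: a dict stands in for physical storage."""

    def __init__(self, lower=None):
        if lower is not None:
            raise ValueError("a direct driver must be the bottom of a stack")
        self.objects = {}

    def write(self, obj_id, data):
        self.objects[obj_id] = bytes(data)

    def read(self, obj_id):
        return self.objects[obj_id]


class CountingDriver:
    """Hypothetical logical driver: forwards calls and counts operations."""

    def __init__(self, lower=None):
        if lower is None:
            raise ValueError("a logical driver needs a device beneath it")
        self.lower = lower
        self.ops = {"read": 0, "write": 0}

    def write(self, obj_id, data):
        self.ops["write"] += 1
        self.lower.write(obj_id, data)

    def read(self, obj_id):
        self.ops["read"] += 1
        return self.lower.read(obj_id)


# Stand-in for "load modules to get drivers of a certain type".
DRIVER_TYPES = {"direct": DirectDriver, "counting": CountingDriver}


def build_stack(type_names):
    """Build a device stack bottom-up from a list of driver type names."""
    device = None
    for name in type_names:
        device = DRIVER_TYPES[name](device)
    return device


top = build_stack(["direct", "counting"])
top.write(7, b"payload")
data = top.read(7)
```

Because every layer exposes the same read/write calls, a logical driver can be inserted or removed without changing the code that uses the top of the stack.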
Benefits-clustering/SM
• Suitable for use in a SAN file system
• Shared at the level of an individual block
• Object namespace: divided into object groups. Being able to create objects with given object IDs is very advantageous. Good for snapshots!
• Hot file migration
Conclusion
• Object Based Storage
  • Processes disk operations at the higher level of individual files and inodes, rather than at the low-level h/w disk block level
• Security issues
  • Auxiliary services in cluster
    • LDAP, PKI, Kerberos
  • Purpose
    • CFS / MDS / OST authenticate to each other
    • Set up session keys
Etc.
• GSS-API for authentication and integrity checks
• Remote DMA
  • This layer must NEVER bypass security processing
  • Request processing: authentication is checked by a higher-level layer in the networking stack