Comparison and Performance Evaluation of SAN File System
Yubing Wang & Qun Cai
Outline
• Background (YW)
• SAN File System (YW)
• Comparison of SAN File Systems (YW)
• Performance Evaluation of SAN File Systems (QC)
Background
• Explosive Data Growth
  - Both documents and applications are becoming more media-rich, driving up file sizes.
  - Continued growth in the capacity of memories and disks promotes further file growth.
  - Requires more graceful management of storage sizes and speeds.
  - Availability, scalability, and reliability are critical issues.
• Classic Client/Server Distributed File Systems – the Problems
  - Single points of failure lead to low reliability and availability.
  - No means of high-performance data sharing: low bandwidths due to slow media, high protocol overheads, and server bottlenecks.
  - Limited scalability: an inherent architectural drawback.
Background - continued
• Enabling Technologies
  - Fibre Channel
    - High-bandwidth, low-latency network and channel interface.
    - Merges features from both networks and channel interfaces to create a storage interface.
    - Highly scalable, very flexible topologies.
    - Open, high-volume industry standard.
  - Network-Attached Storage (NAS)
    - Specialized servers with optimized file systems and a thin OS, tuned for the requirements of file serving.
    - Uses NFS and CIFS as file-access protocols.
• Architecture (diagram).
SAN File System
• Storage Area Networks (SAN)
  - NAS + Fibre Channel + switch + HBA (Host Bus Adapter).
  - Allows direct data transfer between disks and clients.
  - Architecture (diagram).
• SAN File Systems
  - An architecture for distributed file systems based on shared storage.
  - Fully exploits the special characteristics of Fibre Channel-based SANs.
  - The key feature is that clients transfer data directly from the storage devices across the SAN.
  - Examples: Global File System (GFS), CentraVision File System (CVFS), CXFS (SGI), SANergy (Tivoli Systems).
SAN File Systems - Continued
• Key Characteristics and Issues
  - More than one client may access data on the same storage device.
  - Each client must recognize the existence of other clients accessing the same storage device and the same file system data and meta-data.
    - Directly through the meta-data (GFS).
    - Through a file manager (CVFS, CXFS).
  - This precludes most local file systems, which assume storage devices are owned and accessed by a single host computer.
• Key Advantages
  - Availability is increased, since the shared data is not owned by a single host.
  - Load balancing is simplified by the clients' ability to access any portion of the shared disks.
  - Scalability in capacity, connectivity, and bandwidth can be achieved without the limitations inherent in file systems designed around central servers.
Two Architectures for SAN FS
• Distributed Network-Attached Storage Model
  - Makes storage a part of the network rather than a part of a local server.
  - No central server; hosts are peers.
  - The file system on each host must coordinate access to meta-data.
  - Example: GFS.
• Hybrid Network-Attached Storage Model
  - Uses meta-server(s) to coordinate all meta-data traffic.
  - Real data and meta-data travel on separate paths.
  - Examples: CVFS, CXFS.
GFS
• Global File System (GFS)
  - Developed at the University of Minnesota.
  - Uses a network storage pool – a shared address space of disk blocks.
  - The storage pool layer implements locks, striping, name services, etc.
  - Nodes are independent; there is no file manager.
  - Uses device locks to maintain consistency.
• Each GFS file system is divided into several Resource Groups (RGs).
  - Resource groups distribute file system resources across the entire storage subpool.
  - Each resource group is essentially a mini file system.
  - Each group has an RG block, data bitmaps, dinode bitmaps, dinodes, and data blocks.
CVFS & CXFS
• CentraVision File System (CVFS)
  - A distributed file system focused on the digital video and film industry.
  - Transfers data directly between network storage and clients.
  - Uses TCP/IP transport under a client/server model for control and meta-data.
  - The file access protocol is token-based.
• CXFS
  - A distributed file system from SGI.
  - Based on the SGI XFS file system.
  - Transfers data directly between network storage and clients.
  - Uses TCP/IP transport under a client/server model for control and meta-data.
  - Can have more than one meta-server.
Comparison – Symmetric or Asymmetric
• Symmetric – any client can perform any file system operation without interacting with another client.
• Asymmetric – a client must first make a request through a file manager executing on another host.
• GFS is symmetric – no central file manager (meta-server).
• CVFS and CXFS are asymmetric – meta-data operations go through meta-servers.
Comparison – Locking
• GFS
  - Locks reside on the devices (shared storage).
  - Similar to test-and-set locks in memory (see the in-memory sketch below).
  - Each device lock is associated with a logical clock.
  - Many locks per device (up to 1024).
  - This locking mechanism limits the scalability of GFS.
• CXFS
  - Locking is performed on inodes (in the file system), in either shared or exclusive mode.
• CVFS
  - Token-based: a token grants a client the right to perform an operation on a file.
  - Tokens are exchanged between the meta-server and clients.
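For intuition, a GFS device lock behaves like a polled test-and-set lock, except that the lock word lives on the storage device rather than in memory. A minimal in-memory analogue in C (the dlock_acquire/dlock_release names and the polling interval are illustrative, not part of GFS):

    #include <stdatomic.h>
    #include <unistd.h>

    /* In-memory analogue of a GFS device lock: a single test-and-set
     * flag that a client polls until it acquires the lock.  Real GFS
     * dlocks live on the storage device and carry a logical clock; this
     * sketch only illustrates the acquire/poll/release semantics. */
    static atomic_flag dlock = ATOMIC_FLAG_INIT;

    void dlock_acquire(void)
    {
        /* test-and-set returns the previous value; poll while it is held */
        while (atomic_flag_test_and_set(&dlock))
            usleep(100);            /* back off briefly between polls */
    }

    void dlock_release(void)
    {
        atomic_flag_clear(&dlock);
    }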
Comparison – Caching
• GFS
  - GFS caches meta-data both in the disk drive caches and in client memory.
  - Device locks maintain consistency for the disk drive caches.
  - Locks are polled to determine whether the meta-data cached on a client is stale.
• CXFS
  - The CXFS buffer cache supports delayed allocation.
  - CXFS uses a transactional logging mechanism to maintain consistency.
• CVFS
  - Meta-data is cached locally on the client.
  - Uses a call-back mechanism to maintain consistency.
  - Certain meta-data attributes are flushed back to the meta-server on demand.
Comparison – Sharing
• GFS
  - GFS distributes file system resources across the entire storage subsystem, allowing simultaneous access from multiple machines.
  - GFS supports both read and write sharing.
• CXFS
  - With buffered I/O, multiple readers can access a file concurrently, but only a single writer is allowed at a time.
  - With direct I/O, multiple readers and writers can all access the file simultaneously.
• CVFS
  - CVFS provides no guarantees about the consistency of write-shared data.
  - Similar to UNIX semantics: "last writer wins".
Comparison – Meta-data Operations
• GFS
  - The meta-data in GFS (dinodes) is partitioned into Resource Groups for scalability and load balancing.
  - Meta-data can exist on disk, in the disk cache, and in client memory.
  - Clients serve only local file system requests and act as file managers for their own requests.
• CXFS
  - All meta-data operations go through meta-servers.
  - Meta-data operations are infrequent compared with data operations, which go directly to the disks.
• CVFS
  - All meta-data operations go through the File System Manager (FSM).
  - The message protocol between client and FSM is token-based; a token contains information describing the file.
  - Meta-data can be cached on clients and is flushed back to the FSM only on demand.
Comparison – Data Transfer
• GFS
  - GFS does not have a separate data transfer path.
  - Clients serve only local file system requests and act as file managers for their own requests; storage devices serve data directly to clients.
• CXFS
  - CXFS uses a combination of clustering, read-ahead, write-behind, and request parallelism to exploit its underlying disk array.
  - CXFS allows applications to use direct I/O to move data directly between application memory and the disk array using DMA.
• CVFS
  - Clients locally cache the extent list, which describes the physical location of the file on disk.
  - Given the extent list and the size of the stripe groups, a client initiates I/O to each disk in the stripe group simultaneously.
  - To avoid the data copies inherent in most protocols, CVFS transfers data directly from the Fibre Channel disks into the user's application buffers.
Performance Evaluation of SAN File Systems
• Address the major aspects of SAN FS performance:
  - Scalability.
  - Sharing.
  - Meta-data operations.
  - Real data transfer.
• Provide micro-benchmarks to measure the performance of SAN file systems.
  - At the application level.
  - Measure throughput and response time (see the timing sketch below).
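Both metrics can be derived from wall-clock timing at the application level: time each request for response time, and divide bytes moved by elapsed time for throughput. A minimal C helper of the kind each micro-benchmark would use (now_sec and report are illustrative names, not part of any existing benchmark suite):

    #include <stdio.h>
    #include <time.h>

    /* Wall-clock seconds; used to time single requests (response time)
     * and whole runs (throughput). */
    static double now_sec(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    /* Report throughput and mean response time for n_reqs requests of
     * req_bytes each, given the total elapsed time of the run. */
    static void report(const char *label, long n_reqs, long req_bytes,
                       double elapsed)
    {
        double mb = (double)n_reqs * req_bytes / (1024.0 * 1024.0);
        printf("%s: %.2f MB/s, %.3f ms/request\n",
               label, mb / elapsed, elapsed * 1000.0 / n_reqs);
    }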
Related Work
• lmbench (micro-benchmark)
• Modified Andrew Benchmark
• LADDIS from SPEC (NFS performance)
• Others
Scalability
• What to measure?
  - Number of clients that can be supported.
  - Aggregate bandwidth that can be achieved.
• How to measure?
  - Increase the number of clients until the whole system saturates.
  - The workloads include writing files and reading files (see the per-client write sketch below).
  - The whole file set accessed should be huge – much larger than the cache size.
  - The file size is varied exponentially from 8 KB to 1 GB.
  - The request (block) size is varied from 8 KB to 512 KB.
  - After each write, force the data to disk (sync()).
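A sketch of the per-client write workload described above, in C: each client writes a file in fixed-size requests and forces every request to disk so the client cache does not hide the I/O cost. The slide calls for sync(); this sketch uses fsync() on the benchmark file, which flushes just that file. Names and error handling are kept minimal for illustration:

    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* One client's write workload: write file_size bytes in req_size
     * requests, forcing each request to disk.  file_size is varied from
     * 8 KB to 1 GB and req_size from 8 KB to 512 KB, as in the
     * benchmark description above. */
    static int write_file(const char *path, long file_size, long req_size)
    {
        char *buf = malloc(req_size);
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (!buf || fd < 0)
            return -1;
        memset(buf, 'x', req_size);

        for (long off = 0; off < file_size; off += req_size) {
            if (write(fd, buf, req_size) != req_size)
                return -1;
            fsync(fd);          /* force this request's data to disk */
        }
        close(fd);
        free(buf);
        return 0;
    }

Running this loop concurrently on 1, 2, 4, ... clients and summing the per-client throughput gives the aggregate bandwidth curve.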
Sharing
• What to measure?
  - Multiple processes on a single host reading or writing the same file.
  - Multiple processes on multiple hosts reading the same file.
  - Multiple processes on multiple hosts, with one process writing a file and the rest reading the same file.
  - Multiple processes on multiple hosts overwriting the same file.
  - Multiple processes on multiple hosts appending to the same file.
  - Multiple processes on multiple hosts truncating the same file.
• How to measure?
  - Measure both throughput and response time.
  - Vary the number of sharing processes and hosts (see the single-host sketch below).
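On a single host, the sharing cases can be driven with fork(): here one process writes a shared file while the other processes read it. This is only a sketch of the one-writer/many-readers case; the file name, process count, and block size are arbitrary, and readers may race ahead of the writer:

    #include <fcntl.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define NPROC 4                 /* one writer + three readers */
    #define BLK   (64 * 1024)

    int main(void)
    {
        const char *path = "shared.dat";   /* file shared by all processes */
        static char buf[BLK];              /* contents are irrelevant here */

        for (int i = 0; i < NPROC; i++) {
            if (fork() == 0) {
                /* process 0 writes; the others read the same file */
                int fd = open(path,
                              i == 0 ? (O_WRONLY | O_CREAT) : O_RDONLY, 0644);
                for (int n = 0; n < 1024; n++) {
                    if (i == 0)
                        write(fd, buf, BLK);
                    else
                        read(fd, buf, BLK);  /* may return 0 before data exists */
                }
                close(fd);
                _exit(0);
            }
        }
        while (wait(NULL) > 0)
            ;                                /* reap all child processes */
        return 0;
    }

The multi-host cases run the same reader and writer loops on separate client machines against the shared SAN file system instead of using fork().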
Meta-data Operations
• What to measure?
  - Directory operations.
  - File create and delete.
  - File access and attribute operations.
  - Symbolic links, etc.
• How to measure?
  - Measure the response time for each operation (see the sketch below).
  - Also measure meta-data operation performance in a heavily loaded environment.
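Per-operation response time can be measured by timing a tight loop of meta-data calls, for example a create/stat/unlink cycle. A minimal C sketch (the directory path, file-name pattern, and choice of operations are illustrative):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>
    #include <unistd.h>

    /* Time n create / stat / unlink cycles and report the mean response
     * time per cycle.  Running the same loop while the file system is
     * under heavy data load shows how the meta-data path degrades. */
    void bench_metadata(const char *dir, int n)
    {
        char path[256];
        struct stat st;
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < n; i++) {
            snprintf(path, sizeof(path), "%s/f%d", dir, i);
            int fd = open(path, O_CREAT | O_WRONLY, 0644);  /* file create */
            close(fd);
            stat(path, &st);                                /* attribute read */
            unlink(path);                                   /* file delete */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ms = (t1.tv_sec - t0.tv_sec) * 1000.0 +
                    (t1.tv_nsec - t0.tv_nsec) / 1e6;
        printf("%.3f ms per create+stat+unlink cycle\n", ms / n);
    }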
Data Transfer
• What to measure?
  - Read operation response time and throughput.
  - Write operation response time and throughput.
• How to measure?
  - Direct I/O (raw I/O), if supported.
  - Buffered I/O.
  - Different read modes: sequential read, random read.
  - Different write modes: sequential write, random write (see the read sketch below).
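The buffered vs. direct I/O and sequential vs. random cases map onto open() flags and lseek(). A sketch of a random-read loop with optional O_DIRECT, assuming a Linux-like platform that supports the flag (a sequential variant would simply drop the lseek(), and the write cases mirror this with write()):

    #define _GNU_SOURCE             /* exposes O_DIRECT on Linux */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Read n_reqs blocks of blk bytes from path at random offsets within
     * file_size.  use_direct = 1 bypasses the buffer cache where O_DIRECT
     * is supported (buffered I/O otherwise). */
    int random_read(const char *path, long file_size, size_t blk,
                    long n_reqs, int use_direct)
    {
        int flags = O_RDONLY | (use_direct ? O_DIRECT : 0);
        int fd = open(path, flags);
        void *buf;

        if (fd < 0 || posix_memalign(&buf, 4096, blk) != 0)
            return -1;              /* O_DIRECT requires aligned buffers */

        for (long i = 0; i < n_reqs; i++) {
            off_t off = (off_t)(rand() % (file_size / (long)blk)) * blk;
            lseek(fd, off, SEEK_SET);
            if (read(fd, buf, blk) < 0)
                return -1;
        }
        free(buf);
        close(fd);
        return 0;
    }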
Future Work
• Consider more micro-benchmarks for the characteristics of SAN file systems.
• Implement these micro-benchmarks.
• Build a micro-benchmark suite for SAN file systems.
• Build an adaptable and widely usable benchmark to measure SAN file system performance.