Performance Analysis of Cluster File System on Linux Yaodong CHENG IHEP, CAS chyd@ihep.ac.cn
Outline
• Introduction
• Review of cluster file system
• Data access model
• Performance analysis formula
• Performance test
• Some useful methods
CHEP'04, Sep 27 - Oct 1, 2004, Congress Zentrum, Interlaken, Switzerland
Introduction
• Cluster systems built from PCs are increasingly popular
• Commodity hardware and software keep improving
  • CPU, memory, hard disk, network
  • Linux software technology
• Question: how to use our existing hardware and software more efficiently
[Figure: Architecture of a cluster system. Compute nodes 1 to N, each running jobs and holding local disks, connect through a high-speed network to I/O nodes 1 to N, which are backed by disk and tape.]
Cluster file system review
• One of the most important ways to share information across a cluster system
• General characteristics:
  • Single-system image
  • Transparency
  • Good scalability
  • High performance
• Structure
  • Client/server (C/S), shared-disk, virtual shared-disk
[Figure: Data access model. Clients 1 to N reach the I/O servers (I/O nodes 1 to N, each with its own disk) and the manager node (metadata server) over the network.]
Some assumptions
• Data is processed only on the clients
• Storage nodes only provide storage capacity and handle file operations
• Traffic between clients and manager nodes is very small
• The time spent handling client requests is far smaller than the time spent transferring data
Performance analysis formula
T = max( D*c/N, D/(N*I), D/(M*I), D/(P*R) )
S = D/T = min( N/c, N*I, M*I, P*R )
• c: CPU time needed to process each byte
• D: total amount of data
• I: network speed (per node)
• M: number of I/O nodes
• N: number of clients
• P: number of disks working in parallel
• R: disk speed (per disk)
• T: minimum time to access all the data
• S: maximum aggregate bandwidth
• Constraint: P/M >= 1
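The formula can be checked numerically with a short Python sketch. The function and parameter names below are illustrative and do not come from the talk. The sketch assumes consistent units, for example D in MB, I and R in MB/s, and c in seconds per MB.

```python
def access_time(D, c, N, M, P, I, R):
    """Minimum time T to access D units of data under the model.

    T = max(D*c/N, D/(N*I), D/(M*I), D/(P*R))
    """
    if P < M:
        # The slide states the constraint P/M >= 1.
        raise ValueError("the model assumes P/M >= 1")
    return max(D * c / N, D / (N * I), D / (M * I), D / (P * R))


def aggregate_bandwidth(D, c, N, M, P, I, R):
    """Maximum aggregate bandwidth S = D/T.

    Computed as D/T rather than min(N/c, ...) so that c = 0 is allowed.
    """
    return D / access_time(D, c, N, M, P, I, R)


# Illustrative numbers only (not measurements from the talk):
# 1000 MB of data, negligible CPU cost, 4 clients, 2 I/O nodes with one
# disk each, 10 MB/s network links, 30 MB/s disks.
T = access_time(D=1000, c=0.0, N=4, M=2, P=2, I=10.0, R=30.0)
S = aggregate_bandwidth(D=1000, c=0.0, N=4, M=2, P=2, I=10.0, R=30.0)
print(T, S)
```

With these made-up inputs the two I/O nodes' network links are the limiting term, so adding clients would not raise S.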
If c is very small, the formula above reduces to:
T = max( D/(N*I), D/(M*I), D/(P*R) )
S = D/T = min( N*I, M*I, P*R )
This reduced formula is the basis of the performance analysis in this work.
Some cases
• N=1, M>=1 (or N>=1 and M=1), R>I: S depends on I
• N=1, M>=1 (or N>=1 and M=1), R<I: S depends on I and P*R
• N>1, M>1, R>I: S depends on the number of clients and I/O nodes
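The cases above can be explored with a small helper that reports which term of S = min(N*I, M*I, P*R) is limiting. This is only a sketch: the helper name, the labels and the sample speeds are assumptions chosen for illustration, not values from the talk.

```python
def bottleneck(N, M, P, I, R):
    """Return (S, limiting) for S = min(N*I, M*I, P*R).

    `limiting` lists every term that equals the minimum.
    """
    terms = {
        "client network (N*I)": N * I,
        "I/O node network (M*I)": M * I,
        "disks (P*R)": P * R,
    }
    S = min(terms.values())
    limiting = [name for name, value in terms.items() if value == S]
    return S, limiting


# One client, several I/O nodes, disks faster than the network (R > I):
print(bottleneck(N=1, M=4, P=4, I=10, R=30))
# One client, one I/O node, disk slower than the network (R < I):
print(bottleneck(N=1, M=1, P=1, I=10, R=6))
# Several clients and I/O nodes, R > I:
print(bottleneck(N=3, M=5, P=5, I=10, R=30))
```

In the first call the single client's link limits S, in the second the single disk does, and in the third S grows with the smaller of the client count and the I/O node count.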
Test environment
• Twelve PCs
  • Used as I/O nodes, manager nodes and clients
  • P4 2.8 GHz / 512 MB memory / WD 80 GB disk (8 MB cache, 7200 RPM)
• OS
  • CERN Linux 7.3.3
  • Kernel: 2.4.20-18.7.cernsmp
  • Local file system: ext3
• Network: 100 Mbit/s Ethernet
• Cluster file systems
  • OpenAFS 1.2.9, NFS v3, PVFS, CASTOR 1.6.1.2
Pre-test
• Test tools
  • Netperf 2.2pl3
  • IOzone 3.217
• Local area network bandwidth (I):
  • 100 Mbit/s Ethernet: about 94.11 Mbits/sec
• Local file system measurement (R)
  • ./iozone -Rab local.xls -g 2048M
• IOzone was recompiled and linked with the CASTOR RFIO library
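To compare the measured network bandwidth with the file system results, both need the same unit. The sketch below converts 94.11 Mbits/sec into KB/sec. It assumes that Netperf reports throughput in units of 10^6 bits per second and that IOzone's KB means 1024 bytes; both conventions should be checked against the versions actually used. The function name is illustrative.

```python
def mbits_to_kbytes_per_sec(mbits_per_sec, bits_per_mbit=1_000_000, bytes_per_kb=1024):
    """Convert a network rate in Mbit/s to KB/s.

    Defaults assume decimal megabits and binary kilobytes (see lead-in).
    """
    bytes_per_sec = mbits_per_sec * bits_per_mbit / 8
    return bytes_per_sec / bytes_per_kb


# Measured LAN bandwidth from the pre-test slide.
network_limit_kb = mbits_to_kbytes_per_sec(94.11)
print(f"network limit: {network_limit_kb:.2f} KB/sec")
```

Under these assumptions a single 100 Mbit/s link caps any one client or I/O node at roughly 11,488 KB/sec, which is the value of I to use when reading the KB/sec results.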
One client, one server
• Only one client accesses files
• Only one I/O node in the server configuration
• Write performance measurement
  • File size: 512 MB
  • Record size: 64 KB to 16 MB
  • Output unit: KB/sec
Results
Multi-process test
• Only one client and one I/O node
• Many processes access the one I/O node simultaneously
• Write performance measurement
  • File size: 100 MB
  • Record size: 512 KB
  • Number of processes: 1 to 10
  • Output unit: KB/sec
Multi-client to multi-server
• Multiple clients read and write files
• Multiple I/O nodes provide file storage
• The output is the aggregate bandwidth
• Only CASTOR and PVFS are measured
• Write performance
  • Size of each file: 200 MB
  • Record size: 2 MB
  • Output unit: MB/sec
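Before running such a test, the reduced formula S = min(N*I, M*I, P*R) gives a rough expectation of how aggregate bandwidth should scale with the number of clients. The sketch below is illustrative only: I is derived from the 94.11 Mbits/sec pre-test figure, while the disk speed R = 40 MB/s and the choice of three I/O nodes are hypothetical placeholders, since the transcript does not give those values.

```python
def predicted_aggregate(n_clients, n_servers, net_speed, disk_speed, disks_per_server=1):
    """Predicted aggregate bandwidth S = min(N*I, M*I, P*R), same unit as inputs."""
    n_disks = n_servers * disks_per_server
    return min(n_clients * net_speed, n_servers * net_speed, n_disks * disk_speed)


I = 94.11 / 8   # MB/s per link, from the measured 94.11 Mbits/sec
R = 40.0        # MB/s per disk: hypothetical, not a measured value
M = 3           # number of I/O nodes: hypothetical

# Sweep the number of clients and print the predicted aggregate bandwidth.
curve = [(n, predicted_aggregate(n, M, I, R)) for n in range(1, 7)]
for n, s in curve:
    print(f"clients={n}  predicted S={s:.2f} MB/sec")
```

The predicted curve rises linearly until the number of clients equals the number of I/O nodes and then flattens at M*I. Measured results that fall well below this line point to protocol or file system overhead rather than hardware limits.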
Some useful methods
• In theory, in a good cluster file system:
  • the data is physically balanced among the I/O devices
  • the data requirements are balanced among the application's tasks
  • the network has enough aggregate bandwidth to pass the data between the two without saturating
• In practice, the following methods are useful:
• Use a high-speed network, for example Gigabit Ethernet or Myrinet
• Use or develop a high-performance network file transfer protocol
• Use multiple servers to improve the aggregate bandwidth
• Improve the read/write speed of the disks
• File striping and parallel I/O
• Good file system design
• Improve the processing ability of the manager nodes
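File striping can be illustrated with a generic round-robin layout, in which consecutive fixed-size stripes of a file are placed on consecutive I/O nodes. The sketch below is not the layout of PVFS, CASTOR or any other system named in the talk; the function name and the 64 KB stripe size are assumptions for illustration.

```python
def locate(offset, stripe_size, num_servers):
    """Map a byte offset in a striped file to (server index, offset on that server).

    Assumes a plain round-robin layout: stripe k lives on server k % num_servers.
    """
    stripe_index, within_stripe = divmod(offset, stripe_size)
    server = stripe_index % num_servers
    # Each server stores every num_servers-th stripe back to back.
    local_offset = (stripe_index // num_servers) * stripe_size + within_stripe
    return server, local_offset


STRIPE = 64 * 1024  # hypothetical stripe size in bytes
for off in (0, STRIPE, 4 * STRIPE + 10):
    print(off, locate(off, STRIPE, num_servers=4))
```

Because neighbouring stripes land on different servers, a large sequential read or write is spread across all M I/O nodes, which is what lets the M*I and P*R terms of the formula grow with the number of servers.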
Summary
• Cluster file system review
• Performance analysis formula
• Performance test
• Some methods to improve the performance
Thank you!!