A report on the 2nd Annual Linux Storage Management Workshop Pete Grönbech Systems Manager HEPSYSMAN 2000
Workshop Topics • Local File Systems • Volume and Device Management • Kernel Development • Distributed & Cluster File Systems • Clusters and High Availability • Backup • NFS and Device Management • Storage Management • Tutorials
Local File Systems • Stephen Tweedie, Red Hat: ext2fs & ext3fs • Hans Reiser, Namesys: ReiserFS • Steve Best, IBM: IBM’s Journaled File System • Steve Lord, SGI: SGI’s XFS • Steve Pate, Veritas: VxFS & VxVM • Ted Ts’o: comments on the 2.4 kernel
Stephen Tweedie • 2.4 kernel has better SMP scalability • 2.2 was better than 2.0 but not scalable above 4 CPUs • 2.4 has large-memory support; increased from 4 GB to 64 GB using PAE36 • Need a Distributed Lock Manager • Journaling filesystems • ext3 and ReiserFS being used in production • RAID support • Software RAID 1/5 are integrated • Mylex, DPT, 3ware hardware RAID controllers • Clustering • Already have failover and load balancing
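The journaling idea behind ext3 and ReiserFS can be sketched in a few lines: metadata updates are committed to a journal before being applied in place, so a crash is recovered by replaying committed transactions rather than by a full fsck. A minimal, hypothetical sketch — the class and method names are illustrative, not ext3’s actual interfaces:

```python
# Minimal write-ahead journaling sketch: updates are committed to a
# journal before being applied in place, so a crash between commit
# and apply is recovered by replaying the journal.
class JournaledStore:
    def __init__(self):
        self.disk = {}      # the "real" on-disk metadata
        self.journal = []   # committed transactions awaiting apply

    def write_transaction(self, updates):
        self.journal.append(dict(updates))  # commit to journal first

    def checkpoint(self):
        for txn in self.journal:            # apply in journal order
            self.disk.update(txn)
        self.journal.clear()

    def recover(self):
        # After a crash, replay is safe: committed transactions are
        # re-applied in order, nothing committed is lost.
        self.checkpoint()

store = JournaledStore()
store.write_transaction({"inode/7/size": 4096})
store.recover()                             # simulate crash before checkpoint
assert store.disk["inode/7/size"] == 4096
```

The point of the sketch is that recovery time depends on the journal length, not the filesystem size — which is why the slides can claim restart "in seconds".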
ReiserFS • Perhaps more stable than ext3 at present • Shipped on latest SuSE CD?
IBM JFS • Scalable 64-bit file system • Max file size 512 TB with 512-byte block size • Max file size 4 PB with 4 KB block size • (Limited by Linux I/O structures not being 64-bit) • Journaling of metadata only • Restarts after a crash in seconds • B+tree use is extensive throughout JFS • Names stored using Unicode • JFS alpha software shipped 2/2/2000; beta by the end of the year • http://oss.software.ibm.com/developerworks/opensource/jfs/index.html • Lots of bureaucracy in IBM before a single line of code can be released as open source: lawyers have to check it against all their patents.
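The two quoted limits are consistent with a fixed per-file count of 2^40 blocks, so the maximum file size scales linearly with block size. (The 2^40 figure is inferred from the slide’s numbers, not taken from JFS documentation.)

```python
# Max file size = block_size * 2**40 blocks reproduces the figures
# quoted for JFS. The 2**40 block count is an inference from the
# slide, not a number from JFS documentation.
MAX_BLOCKS = 2 ** 40

def max_file_size(block_size):
    return block_size * MAX_BLOCKS

TB = 2 ** 40
PB = 2 ** 50
assert max_file_size(512) == 512 * TB   # 512 TB with 512-byte blocks
assert max_file_size(4096) == 4 * PB    # 4 PB with 4 KB blocks
```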
SGI’s XFS SGI has ported its XFS filesystem and associated utilities to Linux. XFS is a full 64-bit file system that can scale to handle extremely large files (2^63−1 bytes) and file systems. • Journaling • Delayed write allocation, for better disk layout • Direct I/O • DMAPI support • Beta code on 2.4.0-test5 • 2 TB block device limit
Veritas • Industrial-strength commercial product • VxFS and VxVM • $1 billion revenue in 2000 (not open source!!) • Ported to many Unix variants • Solaris, HP-UX, NT, W2K, AIX and Linux • Port to the Linux 2.4 kernel at beta stage • Has all the features: • Journaling, online grow/shrink, defrag, snapshots, clustered file system
Linux Present • 2.4 almost ready to be released (really!) • 2.4 is much more scalable • 64 GB memory on IA32 • 64-bit file access on 32-bit platforms (LFS API) • 32-bit uid, gid • Much better SMP scalability • Fine-grained locking (networking, VFS, etc.) • Better bus support (PCMCIA, USB, FireWire) • NFS v3, NFS improvements • Raw I/O
Volume and Device Management • Heinz Mauelshagen, Sistina Software: LVM • Richard Gooch, University of Calgary: Linux devfs • A virtual file system similar to /proc • With devfs, /dev reflects the hardware you have • Register devices by name rather than device numbers • Can support hot-plugging of USB, PCMCIA and FireWire • Ben Rafanello, IBM: IBM’s LVM • IBM has volume-group-based LVMs and a partition-based LVM. Linux has LVM…. • Enterprise Volume Management System to emulate multiple LVMs within a single LVM (uses plug-in modules)
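The abstraction all these LVM talks build on: a logical volume is an ordered list of fixed-size extents that may live on different physical volumes, so a volume can be grown or relocated without the filesystem on top noticing. A hypothetical sketch of that mapping — the class, names and 4 MB extent size are illustrative, not any real LVM’s code:

```python
# LVM-style extent mapping sketch: a logical volume is an ordered
# table of (physical_volume, physical_extent) pairs; logical byte
# offsets are resolved through that table. Illustrative only.
EXTENT_SIZE = 4 * 2 ** 20   # 4 MB extents (assumed)

class LogicalVolume:
    def __init__(self):
        self.extent_map = []        # index = logical extent number

    def extend(self, pv_name, pe_number):
        # Growing the LV is just appending an extent, online.
        self.extent_map.append((pv_name, pe_number))

    def resolve(self, byte_offset):
        # Map a logical byte offset to (PV name, physical byte offset).
        le, off = divmod(byte_offset, EXTENT_SIZE)
        pv, pe = self.extent_map[le]
        return pv, pe * EXTENT_SIZE + off

lv = LogicalVolume()
lv.extend("pv0", 0)    # first 4 MB lives on pv0
lv.extend("pv1", 7)    # next 4 MB lives on pv1: the LV spans disks
assert lv.resolve(EXTENT_SIZE + 100) == ("pv1", 7 * EXTENT_SIZE + 100)
```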
Kernel Development • Rik van Riel, Conectiva: VM system • Eric Youngdale, MKS, Inc: Linux SCSI mid-layer • Justin Gibbs, Adaptec: Low-level SCSI drivers
Distributed & Cluster File Systems • Peter Braam, Carnegie Mellon University: InterMezzo • A replicating high-availability file system and file synchronization tool • Ken Preslan, Sistina Software: GFS • Chris Feist, StorageTek: Secure File System • Jeremy Allison, VA Linux: Samba • Rob Ross, Argonne: PVFS
GFS • New networking technologies allow multiple machines to share storage devices. File systems that allow these machines to simultaneously mount and access files on these shared devices are called shared-disk file systems, in contrast to traditional distributed file systems where the server controls the devices. • GFS is a shared-device, cluster file system for Linux. GFS supports journaling and rapid recovery from client failures. Nodes within a GFS cluster share the same storage by means of Fibre Channel or shared SCSI devices. • The file system appears to be local on each node and GFS synchronises file access across the cluster.
[Diagram: a GFS storage cluster — nodes running GFS connect through a TCP/IP switch and a Storage Area Network to shared RAID storage]
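Synchronising file access across the cluster means GFS needs cluster-wide locks, so that two nodes never update the same on-disk structure at once. A toy sketch of per-resource lock arbitration — illustrative only, GFS’s real lock modules are far more elaborate:

```python
# Toy cluster lock table: each shared on-disk resource (e.g. an
# inode) is held by at most one node at a time. A node whose
# acquire() fails must wait and retry. Not GFS's actual code.
class ClusterLocks:
    def __init__(self):
        self.holders = {}           # resource -> node holding the lock

    def acquire(self, node, resource):
        if self.holders.get(resource, node) != node:
            return False            # another node holds it: caller waits
        self.holders[resource] = node
        return True

    def release(self, node, resource):
        if self.holders.get(resource) == node:
            del self.holders[resource]

locks = ClusterLocks()
assert locks.acquire("node0", "inode:42")
assert not locks.acquire("node1", "inode:42")   # must wait for node0
locks.release("node0", "inode:42")
assert locks.acquire("node1", "inode:42")
```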
Parallel Virtual File System • Use of multiple distributed I/O resources by a parallel application • Goal is to increase aggregate I/O performance • Accomplished by reducing bottlenecks in the I/O path • no single I/O device • no single I/O bus • no single network path • Target is medium to large clusters (64 or more nodes) • Applications using MPI (ROMIO provides the interface, MPI-IO) • Linux 2.2 kernel • TCP data transfer only • Uses the Unix interface to store data on a local file system (e.g. ext2fs, ReiserFS)
[Diagram: PVFS architecture — compute nodes (CN 0…n) communicate with I/O nodes (ION 0…n) over the PVFS network]
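PVFS avoids the single-device bottleneck by striping file data round-robin across the I/O nodes, so which ION serves a given byte is simple arithmetic. A sketch — the 64 KB stripe size and four-node count are assumed parameters for illustration, not PVFS defaults:

```python
# Round-robin striping: a byte offset maps to a stripe, and stripes
# rotate through the I/O nodes modulo the node count. Stripe size
# and node count here are illustrative assumptions.
def ion_for_offset(offset, stripe_size=65536, n_ions=4):
    stripe = offset // stripe_size
    return stripe % n_ions

# With 4 IONs and 64 KB stripes, consecutive stripes rotate through
# the nodes, so large sequential reads hit all IONs in parallel:
assert [ion_for_offset(i * 65536) for i in range(6)] == [0, 1, 2, 3, 0, 1]
```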
Clusters and High Availability • Alan Robertson, SuSE: Heartbeat • Lars Marowsky-Bree, SuSE: FailSafe • Brian Stevens, Mission Critical Linux: Kimberlite • Philipp Reisner, Qubit: drbd
Heartbeat • Membership services • Notice when machines join/leave the cluster • Notice when links go down/come back • Communication services • Cluster manager • Currently limited to 2 nodes • Resource monitoring • Not yet • Storage • Resource I/O fencing • STONITH: Shoot The Other Node In The Head • Reset or power-cycle the other node • Load balancing (optional)
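The membership service boils down to each node periodically announcing itself, and peers declaring a node dead once it has been silent longer than a deadtime. A hypothetical sketch of that detection logic — the class name and 10-second timeout are illustrative, not Heartbeat’s actual code or default:

```python
# Timeout-based membership sketch: a node counts as alive if it has
# been heard from within DEADTIME seconds. A dead node is then a
# candidate for STONITH fencing before its resources are taken over.
DEADTIME = 10.0   # assumed timeout, not Heartbeat's default

class Membership:
    def __init__(self):
        self.last_seen = {}

    def heartbeat(self, node, now):
        self.last_seen[node] = now      # node announced itself

    def alive(self, now):
        return {n for n, t in self.last_seen.items()
                if now - t <= DEADTIME}

m = Membership()
m.heartbeat("node_a", now=0.0)
m.heartbeat("node_b", now=0.0)
m.heartbeat("node_a", now=8.0)
assert m.alive(now=12.0) == {"node_a"}  # node_b silent too long: dead
```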
Distributed Replicated Block Device • Disk mirroring via the network, used for implementing high-availability servers under Linux • Web servers, file servers • Data mirroring, cheaper than with shared disks • Higher overhead on writes • Monitoring of nodes with “heartbeat” • Currently 2.2 kernel
Backup • John Jackson, Purdue: Amanda • Gawain Lavers, Big Storage: Linux DMAPI
NFS and Device Management • Andy Adamson, University of Michigan: Linux NFS v4 • Sept 1st: 2.2.14 kernel • Network Appliance sponsoring an NFS v3/v4 performance project • Dave McAllister, 3ware: Storage protocols over IP • Holger Smolinski, IBM Germany: Dynamic registration of SCSI devices
Storage Protocols over IP • With the advent of Gigabit Ethernet, and the planned drive to 10 GigE, the capability of supporting storage and SANs over IP becomes attractive. • Being defined by 3ware, working with the IETF on iSCSI • 3ware is defining a protocol that allows multiple ATA (IDE) drives to be presented as one or more SCSI drives • Allows storage on the same network as LAN traffic • Overcomes ordering restrictions that have hampered SCSI and FC • Capabilities of an FC SAN, but at much lower cost • Their hardware ran embedded Linux, but they changed to FreeBSD and got a three-fold (22 to 70 MB/s) performance improvement, because the 2.4 kernel does not have zero-copy yet.
Storage Management • Ric Wheeler, EMC: Smart Storage and Linux: an EMC perspective • Daniel Phillips, innominate AG: Tux2 filesystem • Like a journaling FS but with no journal; uses a database-type approach • Sang Oh, SANux: SANux File System • SAN-based cluster file system. Has a DLM.
My Conclusions • (Too) many different journaling file systems to choose from • ext3fs mainstream (from Red Hat) • ReiserFS may be more stable now (from SuSE) • Low-cost data mirroring with drbd • For SANs, look at GFS • For SMP, the 2.4 kernel is key. For more details and copies of the slides see http://www2.physics.ox.ac.uk/gronbech