Presentation Transcript


  1. James Annis, Gabriele Garzoglio, Peretz Partensky, Chris Stoughton. The Terabyte Analysis Machine Project: data intensive computing in astronomy. The Experimental Astrophysics Group, Fermilab

  2. TAM Design
  • The TAM is a compact analysis cluster designed to:
    • Bring compute power to bear on large data sets
    • Make high I/O rate scans through large data sets
  • Compute power brought to large data sets (hardware):
    • 10-processor Linux cluster
    • High memory (1 GB RAM/node) and local disk (140 GB/node)
    • SAN to a terabyte of global disk (Fibre Channel network and hardware RAID)
    • Global File System (9x faster read than NFS; hardware limited)
  • High I/O rate scans (software):
    • The SDSS science database SX
    • The Distance Machine framework
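The second design goal above, high I/O rate scans through large data sets, boils down to streaming a catalog off disk in large sequential chunks and applying an analysis function to every object. The following is a minimal Python sketch of that access pattern only; the file pattern, the binary record layout, and the bright-object cut are assumptions made for the example and are not the SX database or Distance Machine interfaces.

    import glob

    import numpy as np

    # Hypothetical record layout for a binary object catalog: id, position, 5 magnitudes.
    # This illustrates a chunked, sequential scan; it is not the SX or Distance Machine API.
    record = np.dtype([("objid", "i8"), ("ra", "f8"), ("dec", "f8"), ("mag", "f4", 5)])

    def scan_catalog(pattern, per_chunk, chunk_objects=500000):
        """Stream catalog files in large chunks so the scan stays sequential-I/O bound."""
        results = []
        for path in sorted(glob.glob(pattern)):
            with open(path, "rb") as f:
                while True:
                    chunk = np.fromfile(f, dtype=record, count=chunk_objects)
                    if chunk.size == 0:
                        break
                    results.append(per_chunk(chunk))
        return results

    def bright_object_count(chunk):
        """Example per-chunk analysis: count objects brighter than r = 18 (mag[2] here)."""
        return int((chunk["mag"][:, 2] < 18.0).sum())

    if __name__ == "__main__":
        # /GFS/data is the global data area described on slide 10; the file pattern is assumed.
        counts = scan_catalog("/GFS/data/catalog-*.bin", bright_object_count)
        print("bright objects:", sum(counts))

Run on a compute node against the shared data area, a scan like this is limited by how fast the node can pull bytes off the global disk, which is the quantity the GFS performance slide measures.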

  3. TAM Hardware, April 2001
  • [Diagram: 5 compute nodes with 0.5 terabyte of local disk on a Fast Ethernet switch; a Gigabit uplink to slow-access data (IDE disk farms, Enstore); a Fibre Channel switch to 1 terabyte of global disk]

  4. The Terabyte Analysis Machine
  • System integrator: Linux NetworX (ACE cluster control box)
  • Compute nodes: Linux NetworX, dual 600 MHz Pentium III, ASUS motherboard, 1 GB RAM, 2x36 GB EIDE disks, Qlogic 2100 HBA
  • Ethernet: Cisco Catalyst 2948G
  • Fibre Channel: Gadzoox Capellix 3000
  • Global disk: DotHill SanNet 4200, dual Fibre Channel controllers, 10x73 GB Seagate Cheetah SCSI disks
  • Software: Linux 2.2.19, Qlogic drivers, GFS v4.0, Condor

  5. GFS: The Global File System
  • Sistina Software (ex-University of Minnesota)
  • Open source (GPL; now Sistina Public License)
  • Linux and FreeBSD
  • 64-bit files and file system
  • Distributed, server-less metadata
  • Data synchronization via global, disk-based locks
  • Journaling and node cast-out
  • Three major pieces:
    • The network storage pool driver (on the nodes)
    • The file system (on disk)
    • The locking modules (on disk or an IP server)

  6. GFS Performance
  • Test setup:
    • 5 nodes
    • 1 five-disk RAID
  • Results:
    • RAID limited at 95 Mbytes/sec
    • At >15 threads, limited by disk head movement
    • Linear rate increase before hardware limits
    • Circa 9x faster than NFS
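The numbers above come from driving the shared file system with an increasing number of concurrent readers. A minimal sketch of that kind of measurement follows: each thread streams one large file sequentially and the aggregate rate over all threads is reported. The /GFS/bench file names, the 4 MB read size, and the eight-thread example are assumptions for illustration, not the actual benchmark behind these results.

    import threading
    import time

    BLOCK = 4 * 1024 * 1024  # 4 MB reads keep the test sequential rather than seek bound

    def reader(path, total, lock):
        """Sequentially read one large file and add the bytes moved to a shared total."""
        moved = 0
        with open(path, "rb") as f:
            while True:
                buf = f.read(BLOCK)
                if not buf:
                    break
                moved += len(buf)
        with lock:
            total[0] += moved

    def aggregate_read_rate(paths):
        """Run one reader thread per file; return the aggregate rate in Mbytes/sec."""
        total, lock = [0], threading.Lock()
        threads = [threading.Thread(target=reader, args=(p, total, lock)) for p in paths]
        start = time.time()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return total[0] / (time.time() - start) / 1e6

    if __name__ == "__main__":
        # File names under the shared /GFS mount are assumptions for illustration.
        files = ["/GFS/bench/file%02d" % i for i in range(8)]
        print("aggregate read rate: %.1f Mbytes/sec" % aggregate_read_rate(files))

Scaling the thread count upward in a test like this reproduces the behaviour described on the slide: the aggregate rate climbs roughly linearly until the RAID saturates, and beyond roughly 15 threads the disks spend their time moving heads rather than streaming.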

  7. Fibre Channel
  • Fibre Channel hardware has performed flawlessly, with no maintenance
  • Qlogic host bus adapters (single channel)
  • Gadzoox Capellix Fibre Channel switch
    • One port per node
    • Two ports per RAID box
  • Dot Hill hardware RAID system, with dual FC controllers
  • Host bus adapters: ~$800/node
  • Switches: ~$1000/port
  • The HBA shows up as a SCSI device (/dev/sda) on Linux
  • The HBA has driver code and firmware code
    • Must download the driver code and compile it into the kernel
  • We haven't explored Fibre Channel's ability to connect machines over kilometres.
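Because the HBA presents the Fibre Channel RAID to Linux as an ordinary SCSI device, one quick sanity check that the global disk is visible from a node is to read the kernel's /proc/scsi/scsi listing. The sketch below parses the Vendor/Model lines from that file; the exact strings reported depend on the controller and firmware, so the output handling here is an assumption.

    import re

    def attached_scsi_devices(path="/proc/scsi/scsi"):
        """Parse the kernel's SCSI device list; the FC RAID behind the Qlogic HBA
        appears here alongside any local SCSI disks (and ends up as e.g. /dev/sda)."""
        devices = []
        with open(path) as f:
            for line in f:
                m = re.search(r"Vendor:\s*(.+?)\s+Model:\s*(.+?)\s+Rev:", line)
                if m:
                    devices.append((m.group(1), m.group(2)))
        return devices

    if __name__ == "__main__":
        for vendor, model in attached_scsi_devices():
            print(vendor, model)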

  8. Global File System I
  • We have never lost data to a GFS problem
  • Untuned GFS clearly outperformed untuned NFS
  • Linux kernel buffering is an issue ("but I edited that file…")
  • The Sistina mailing list is very responsive
  • Must patch the Linux kernel (2.2.19 or 2.4.6)
  • Objectivity doesn't behave on GFS: one must duplicate the federation files for each machine that wants access
  • GFS itself is on the complicated side, and is unfamiliar to sys admins
  • We haven't explored GFS machines as file servers.

  9. Global File System II
  • The biggest issues are node death, power-on, and IP locking
  • Node death:
    • GFS is a journaling file system; who replays the journal?
    • STOMITH: shoot the other machine in the head
    • The cluster needs to be able to power cycle nodes so that the other nodes can replay the journal
  • Power-on:
    • Scriptable via /etc/rc.d/init.d
    • We have never survived a power loss without human intervention at power-up.
  • IP locking:
    • DMEP is the diskless protocol; it is very new, with two-vendor support
    • memexpd is an IP lock server: it allows any disk hardware, but is a single point of failure and a potential throughput bottleneck
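As a rough, hypothetical illustration of the STOMITH requirement, the sketch below declares a node dead after a heartbeat timeout and power cycles it through an external power switch before journal replay is allowed to proceed. The power-switch command, the node name, and the timeout are placeholders invented for the example; GFS and the cluster hardware expose their own interfaces for this.

    import subprocess
    import time

    # Hypothetical STOMITH sketch: before any surviving node replays a dead node's
    # journal, the dead node must be forcibly power cycled so it cannot wake up and
    # keep writing. The power-switch command below is a placeholder, not a real
    # GFS or vendor interface.
    HEARTBEAT_TIMEOUT = 30.0  # seconds of silence before a node is declared dead

    def stomith(node, last_heartbeat, power_cycle_cmd=("power-switch", "--cycle")):
        """Power cycle `node` if it has been silent too long; return True when it
        is then safe for another node to replay its journal."""
        if time.time() - last_heartbeat < HEARTBEAT_TIMEOUT:
            return False  # node is still heartbeating; leave it alone
        subprocess.run(list(power_cycle_cmd) + [node], check=True)  # fence it first
        return True

    if __name__ == "__main__":
        if stomith("tam03", last_heartbeat=time.time() - 60.0):
            print("tam03 fenced; journal replay can proceed")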

  10. Disk Area Arrangement
  • Data area on /GFS
  • Home area on /GFS
  • Standard products and UPS available from NFS
  • Normal user areas (desktops) available from NFS
  • The /GFS area is not NFS-shared to desktops
  • Large amounts of local disk (IDE) are available on each node, but are not accessible from the other nodes
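Since each node's local IDE disk is invisible to the other nodes, one natural pattern with this layout is to stage inputs from the shared /GFS data area onto local scratch before an I/O-heavy job, then read them repeatedly from local spindles. The sketch below assumes a hypothetical per-node /scratch/<hostname> directory; the path names are illustrative only.

    import os
    import shutil
    import socket

    # Paths follow the arrangement on this slide: shared data under /GFS, large
    # node-local disk mounted per node. The /scratch location is an assumption.
    GLOBAL_DATA = "/GFS/data"
    LOCAL_SCRATCH = "/scratch"  # node-local IDE disk, not visible from other nodes

    def stage_to_local(relpath):
        """Copy one input file from the shared GFS data area onto this node's local
        disk and return the local path, so repeated reads hit local spindles
        instead of the SAN."""
        src = os.path.join(GLOBAL_DATA, relpath)
        dst = os.path.join(LOCAL_SCRATCH, socket.gethostname(), relpath)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)
        return dst

    if __name__ == "__main__":
        local = stage_to_local("catalog-000.bin")
        print("staged to", local)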
