A Linux PC Farm for Physics Analysis at the ZEUS Experiment. Marek Kowal, Krzysztof Wrona, Tobias Haas, Ingo Martens, Rainer Mankel DESY, Notkestrasse 85, Hamburg, Germany http://zarah.desy.de/. A Linux PC farm - plan of talk. Overall status Key issues Hardware and Software Next steps.
A Linux PC Farm for Physics Analysis at the ZEUS Experiment Marek Kowal, Krzysztof Wrona, Tobias Haas, Ingo Martens, Rainer Mankel DESY, Notkestrasse 85, Hamburg, Germany http://zarah.desy.de/
A Linux PC farm - plan of talk • Overall status • Key issues • Hardware and Software • Next steps
Overall status • First reconstruction farm working since 1997 • The farm currently consists of 47 PCs • Both reconstruction and analysis software run efficiently on PCs
Key issues • High computing power • High I/O rate • Easy user interface • Maintenance • Cost
Hardware - introduction • “worker” PCs - 45 • “server” PCs - 2 • Fileservers - 3 TB • Old SGI Multiprocessor machines - 44 processors • Network
PCs - commodity hardware • Farm built over the past three years • Each PC equipped with a 100Mb network card

Number of PCs   Type   Processor   Memory   IDE    SCSI
16              B      PPro 200    64MB     2GB    -
1               S      PPro 200    128MB    2GB    3x8GB
19              B      PII 350     128MB    6GB    -
1               S      PII 350     256MB    6GB    3x8GB
10              B      PIII 450    128MB    8GB    -
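The inventory can be cross-checked against the 47 PCs quoted under "Overall status". The following is a minimal sketch, assuming that the B and S rows correspond to the "worker" and "server" PCs respectively (the slide does not spell the letters out); the variable and function names are illustrative only.

```python
# Farm inventory transcribed from the slide:
# (number of PCs, type, processor, memory in MB, IDE disc in GB).
# Assumption: "B" rows are worker PCs and "S" rows are server PCs.
INVENTORY = [
    (16, "B", "PPro 200", 64, 2),
    (1, "S", "PPro 200", 128, 2),
    (19, "B", "PII 350", 128, 6),
    (1, "S", "PII 350", 256, 6),
    (10, "B", "PIII 450", 128, 8),
]


def count_by_type(inventory):
    """Sum the number of PCs for each type letter."""
    totals = {}
    for count, kind, _cpu, _ram_mb, _ide_gb in inventory:
        totals[kind] = totals.get(kind, 0) + count
    return totals


totals = count_by_type(INVENTORY)
total_pcs = sum(totals.values())
total_ram_mb = sum(count * ram_mb for count, _k, _c, ram_mb, _i in INVENTORY)

print(totals, total_pcs, total_ram_mb)
```

Under that assumption the tallies agree with the 45 worker PCs and 2 server PCs listed on the hardware introduction slide.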
Fileservers - SGI • Origin 2000, 4xIP27 195MHz, 0.75GB RAM, HIPPI 800Mb, 1Gb • Challenge DM, 4xIP19 100MHz, 384 MB RAM, HIPPI 800Mb • SCSI discs - 2TB • Fibre Channel (!) discs - 1TB
SGI Challenge XL • total of three machines • total of 44 processors (IP19,IP25) /20,16,8/ • 4.3GB RAM /1.5,1,1.8/ • HIPPI 800Mb
Software - introduction • BATCH System • Job submission • tpfs & RFIO • WWW interface
Batch system • NQS and LSF evaluated, LSF chosen for PCs • LSF • possibility to define load window • possibility to define resource requirements for job (HDD!) • SGI Challenges - still NQS
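The two LSF features named on this slide can be illustrated with a toy dispatch check. This is a sketch of the idea only, not LSF's API or configuration syntax: the `Host` and `Job` classes, their fields, and all numbers are hypothetical, and the "load window" is interpreted here simply as a per-host load ceiling above which no new job is accepted.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    load: float          # current load average
    free_disk_gb: float  # free local disc space
    max_load: float      # hypothetical upper edge of the host's load window


@dataclass
class Job:
    name: str
    disk_gb: float       # local disc space the job asks for


def eligible_hosts(job, hosts):
    """Hosts whose load is inside their window and that have enough disc."""
    return [
        h for h in hosts
        if h.load <= h.max_load and h.free_disk_gb >= job.disk_gb
    ]


def pick_host(job, hosts):
    """Least loaded eligible host, or None if the job has to stay queued."""
    candidates = eligible_hosts(job, hosts)
    return min(candidates, key=lambda h: h.load) if candidates else None


hosts = [
    Host("pc01", load=0.2, free_disk_gb=1.0, max_load=1.0),  # too little disc
    Host("pc02", load=0.6, free_disk_gb=5.0, max_load=1.0),  # eligible
    Host("pc03", load=1.7, free_disk_gb=7.0, max_load=1.0),  # outside window
]

chosen = pick_host(Job("analysis", disk_gb=4.0), hosts)
print(chosen.name if chosen else "job stays queued")
```

Declaring the disc requirement up front lets the scheduler skip a lightly loaded machine whose local disc is too small, which is presumably why the slide singles out the HDD.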
Job submission software • Allows users to submit jobs (binaries & data) to the batch system • Each job is allocated its own working directory • Operations supported: submission, retrieval, querying status, listing, killing, purging • Available for: Linux, Solaris, IRIX, OSF1, Windows NT
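The operations listed on this slide can be sketched as a small in-memory registry. This is a toy model, not the ZEUS job submission software: the `JobStore` class, its method names, and the file names are hypothetical, and no batch system is actually contacted.

```python
import shutil
import tempfile
from pathlib import Path


class JobStore:
    """Toy job registry in which every job gets a private working directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.jobs = {}      # job id -> status string
        self.next_id = 1

    def _workdir(self, job_id):
        return self.root / f"job{job_id}"

    def submit(self, files):
        """Create the job's working directory and place its inputs there."""
        job_id = self.next_id
        self.next_id += 1
        workdir = self._workdir(job_id)
        workdir.mkdir(parents=True)
        for name, content in files.items():
            (workdir / name).write_bytes(content)
        self.jobs[job_id] = "queued"
        return job_id

    def status(self, job_id):
        return self.jobs[job_id]

    def list_jobs(self):
        return sorted(self.jobs)

    def retrieve(self, job_id):
        """Names of the files currently in the job's working directory."""
        return sorted(p.name for p in self._workdir(job_id).iterdir())

    def kill(self, job_id):
        self.jobs[job_id] = "killed"

    def purge(self, job_id):
        """Forget the job and delete its working directory."""
        shutil.rmtree(self._workdir(job_id))
        del self.jobs[job_id]


root = tempfile.mkdtemp()
store = JobStore(root)
first = store.submit({"analysis.bin": b"binary", "cards.dat": b"run 1"})
second = store.submit({"reco.bin": b"binary"})
store.kill(second)
store.purge(second)
print(store.list_jobs(), store.status(first), store.retrieve(first))
```

Giving each job its own directory keeps the inputs and outputs of concurrent jobs from colliding and makes purging a single directory removal.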
tpfs • tpfs - transparent access to data stored on robots (hard copy) and discs (cached copy) • automatic staging of files upon an attempt to open them
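Staging on open can be pictured as a cache sitting in front of the tape robot. The sketch below is a toy model under that reading, not the tpfs implementation: the `StagingCache` class is hypothetical, and dictionaries stand in for both the robot and the disc cache.

```python
class StagingCache:
    """Toy model of open-triggered staging: a disc cache in front of tape."""

    def __init__(self, tape):
        self.tape = tape       # file name -> bytes, stands in for the robot
        self.disc = {}         # cached copies on disc
        self.stage_count = 0   # how many times a file had to be staged

    def open(self, name):
        """Return the file contents, staging from tape on a cache miss."""
        if name not in self.disc:
            if name not in self.tape:
                raise FileNotFoundError(name)
            self.disc[name] = self.tape[name]   # the slow staging step
            self.stage_count += 1
        return self.disc[name]


cache = StagingCache({"run1234.dat": b"events"})
first_read = cache.open("run1234.dat")    # miss: staged from "tape"
second_read = cache.open("run1234.dat")   # hit: served from "disc"
print(cache.stage_count)
```

The caller only ever calls `open`; whether the data came from the robot or from disc is invisible to it, which is the sense of "transparent" on the slide.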
RFIO • stateless • available as a dynamically loaded library (DLL)
> export LD_PRELOAD=librfio.so
> some_program
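The preload mechanism can also be driven from a script that launches analysis programs. The sketch below assumes a Linux system, where the dynamic linker reads the `LD_PRELOAD` variable; the helper `preload_env` is hypothetical, and `librfio.so` and `some_program` are the names from the slide.

```python
import os


def preload_env(library, base_env=None):
    """Return a copy of the environment with `library` first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The Linux dynamic linker accepts a space- or colon-separated list.
    env["LD_PRELOAD"] = f"{library} {existing}".strip()
    return env


env = preload_env("librfio.so", base_env={"PATH": "/usr/bin"})
print(env["LD_PRELOAD"])

# To launch a program with the library preloaded, pass the environment on:
#   import subprocess
#   subprocess.run(["some_program"], env=env)
```

Because the library is injected at load time, the program itself does not need to be relinked to use it.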
WWW interface • queue status / steering • system load and statistics • disc availability and free space reports • staging status and statistics
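As an illustration of the kind of data behind a free space report, the sketch below collects disc usage for a list of paths with the Python standard library. It is not the code behind the ZEUS web pages; the function name and the choice of the temporary directory as an example path are assumptions.

```python
import shutil
import tempfile


def free_space_report(paths):
    """Return {path: (free_gb, percent_free)} for each given path."""
    report = {}
    for path in paths:
        usage = shutil.disk_usage(path)
        percent = 100.0 * usage.free / usage.total if usage.total else 0.0
        report[path] = (usage.free / 1e9, percent)
    return report


example_path = tempfile.gettempdir()
report = free_space_report([example_path])
for path, (free_gb, percent) in report.items():
    print(f"{path}: {free_gb:.1f} GB free ({percent:.0f}%)")
```

A web front end would render such a dictionary as a table and refresh it periodically.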
Maintenance & Costs • More space required for PCs than for Challenges (Racks!) • No real console available (BIOS problems!) • Synchronization of software - AFS • Price: 3000 DM (PC, rack space, switch port, cabling and LSF licence)
Next steps - hardware • complete removal of SGI machines except for the Origin 2000 fileserver • further PCs to be added • increase in size of discs attached to the fileserver - up to 4.5 TB by the end of 2000
Next steps - software • Development of new job submission software written as a Java applet accessible via WWW • New version of RFIO software with support for the Disc Cache project started at DESY (see poster session - Patrick Fuhrmann)
Have you got any questions? • … • … • ...