UTA Site Report • Jae Yu, Univ. of Texas, Arlington • 5th DOSAR Workshop, Louisiana Tech University, Sept. 27–28, 2007
Introduction • UTA is a partner of the ATLAS SWT2 • Actively participating in ATLAS production • Kaushik De is co-leading Panda development • Phase I implementation at UTACC completed and running • Phase II hardware installation completed • Software installation in progress • MonALISA-based OSG Panda monitoring implemented, allowing OSG sites to show up on the LHC Dashboard • Working on DDM monitoring • HEP group working with other disciplines on shared use of existing computing resources • Interacting with the campus HPC community • Working with HiPCAT, the Texas grid community
UTA DPCC – The 2003 Solution • UTA HEP-CSE + UTSW Medical joint project through an NSF MRI grant • Primary equipment for D0 reconstruction and MC production up to 2005 • Now primarily participating in ATLAS MC production and reprocessing as part of the SWT2 resources • Other disciplines also use this facility, but at a minimal level • Biology, Geology, UTSW Medical, etc. • Hardware capacity • PC-based Linux system assisted by some 70TB of IDE disk storage • 3 IBM PS157 Series shared-memory systems
UTA – DPCC • 84 P4 Xeon 2.4GHz CPUs = 202 GHz • 5TB of FBC + 3.2TB IDE internal • GFS file system • 100 P4 Xeon 2.6GHz CPUs = 260 GHz • 64TB of IDE RAID + 4TB internal • NFS file system • Total CPU: 462 GHz • Total disk: 76.2TB • Total memory: 168GB • Network bandwidth: 68Gb/sec • HEP – CSE joint project • DØ + ATLAS • CSE research
SWT2 • Joint effort between UTA, OU, LU, and UNM • 2,000 ft2 in the new building • Designed for 3,000 1U nodes • Could go up to 24k cores • 1MW total power capacity • Cooling with 5 Liebert units
Installed SWT2 Phase I Equipment • 160-node cluster (Dell SC1425) • 320 cores (3.2GHz Xeon EM64T) • 2GB RAM/core • 160GB SATA local disk drive per node • 8 head nodes (Dell 2850) • Dual 3.2 GHz Xeon EM64T • 8GB RAM • 2x73GB (RAID1) SCSI storage • 16TB storage system • DataDirect Networks S2A3000 system • 80x250GB SATA drives • 6 I/O servers • IBRIX Fusion file system • Dedicated internal storage network (Gigabit Ethernet) • Has been operating and conducting Panda production for over a year
SWT2 Phase II Equipment • 50-node cluster (Dell SC1435) • 200 cores (2.4GHz dual Opteron 2216) • 8GB RAM (2GB/core) • 80GB SATA disk per node • 2 head nodes • Dual Opteron 2216 • 8GB RAM • 2x73GB (RAID1) SAS drives • 75TB (raw) storage system • 10x MD1000 enclosures • 150x500GB SATA disk drives • 8 I/O nodes • dCache will be used to aggregate storage • 10Gb/s internal network capacity • Hardware installation completed and software installation in progress
ATLAS SWT2 (2007) • SWT2-PH1 @ UTACC: 320 Xeon 3.2 GHz cores = 940 GHz • 2GB RAM/core = 640GB • 160GB internal/unit = 25.6TB • 8 dual-core server nodes • 16TB of storage assisted by 6 I/O servers • Dedicated Gbit internal connections • SWT2-PH2 @ UTACPB: 200 Opteron 3.2 GHz cores = 640 GHz • 2GB RAM/core = 400GB • 80GB SATA/unit = 4TB • 8 dual-core server nodes • 75TB of storage in 10 Dell MD1000 RAID enclosures assisted by 8 I/O servers • Dedicated Gbit internal connections
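As a quick cross-check, the aggregate RAM and local-disk figures on this slide can be re-derived from the per-node counts given on the Phase I and Phase II slides. The sketch below is illustrative only; the node counts, cores per node, per-core RAM, and per-node disk sizes are taken from the preceding slides, and everything else is an assumption.

```python
# Illustrative roll-up of SWT2 capacity from the per-node figures quoted
# on the Phase I and Phase II slides (not an official inventory tool).

phases = {
    "SWT2-PH1 @ UTACC": {"nodes": 160, "cores_per_node": 2,
                         "ram_per_core_gb": 2, "disk_per_node_gb": 160},
    "SWT2-PH2 @ UTACPB": {"nodes": 50, "cores_per_node": 4,
                          "ram_per_core_gb": 2, "disk_per_node_gb": 80},
}

for name, p in phases.items():
    cores = p["nodes"] * p["cores_per_node"]
    ram_gb = cores * p["ram_per_core_gb"]
    disk_tb = p["nodes"] * p["disk_per_node_gb"] / 1000.0
    print(f"{name}: {cores} cores, {ram_gb} GB RAM, {disk_tb:.1f} TB local disk")

# Expected output (matching the slide):
# SWT2-PH1 @ UTACC: 320 cores, 640 GB RAM, 25.6 TB local disk
# SWT2-PH2 @ UTACPB: 200 cores, 400 GB RAM, 4.0 TB local disk
```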
Network Capacity History at UTA • Had DS3 (44.7 Mbits/sec) until late 2004 • Choked the network for about a month while downloading D0 data for re-reconstruction • Met with the VP of Research at UTA and emphasized the importance of the network backbone for attracting external funds • Increased to OC3 (155 Mbits/s) in early 2005 • OC12 as of early 2006 • Connected to NLR (10Gb/s) through LEARN (http://www.tx-learn.org/) via a 1Gb/s connection to NTGP • $9.8M ($7.3M for the optical fiber network) in State of Texas funds approved in Sept. 2004
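To put the DS3-era bottleneck in perspective, the sketch below estimates how long a bulk transfer would take at each of the link speeds listed above. The 10 TB dataset size is a hypothetical placeholder, not the actual D0 re-reconstruction volume.

```python
# Rough transfer-time comparison at the nominal link speeds mentioned above.
# The dataset size is a hypothetical example, not the real D0 data volume.

LINKS_MBPS = {               # nominal line rates in megabits per second
    "DS3": 44.7,
    "OC3": 155.0,
    "OC12": 622.0,
    "1 Gb/s (to NTGP)": 1000.0,
}

dataset_tb = 10                        # hypothetical bulk-transfer size
dataset_bits = dataset_tb * 1e12 * 8   # terabytes -> bits

for name, mbps in LINKS_MBPS.items():
    seconds = dataset_bits / (mbps * 1e6)
    print(f"{name:>18}: {seconds / 86400:5.1f} days for {dataset_tb} TB")
```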
NLR – National LambdaRail (network map): 10Gb/sec connections to ONENET, LONI, and LEARN
Software Development Activities • MonALISA-based ATLAS distributed analysis monitoring • A good, scalable system • Software development and implementation completed • ATLAS-OSG sites are on the LHC Dashboard • New server purchased for OSG at UTA; activation to follow shortly • Working on the DDM monitoring project
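For context, feeding site metrics into a MonALISA service is commonly done with the ApMon client library. The sketch below is a minimal illustration only, assuming the Python ApMon module (apmon.py) from the MonALISA distribution; the collector host, cluster, node, and parameter names are all hypothetical, and this is not the actual OSG Panda monitoring code.

```python
# Minimal sketch of publishing monitoring values to a MonALISA service
# with the ApMon Python client. Host/port, cluster, node, and parameter
# names are hypothetical; this is not the Panda monitoring implementation.
import apmon

# Destination MonALISA service(s) that receive the UDP datagrams (hypothetical host).
monitor = apmon.ApMon(["monalisa.example.org:8884"])

# Report a few illustrative job-level metrics for this site.
monitor.sendParameters(
    "UTA_SWT2",             # cluster name shown in MonALISA
    "gk01.example.org",     # reporting node name
    {"jobs_running": 120, "jobs_queued": 45, "cpu_load": 0.87},
)

monitor.free()              # stop background threads and release sockets
```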
CSE Student Exchange Program • Joint effort between HEP and CSE • David Levine is the primary contact at CSE • A total of 10 CSE MS students have worked on the SAM-Grid team • Five generations of students • Many of them playing leading roles in the grid community • Abishek Rana at UCSD • Parag Mashilka at FNAL • Sudhamsh Reddy working for UTA at BNL • New program with BNL implemented • First student has completed the tenure and is in on-the-job training • Second set of two Ph.D. students at BNL • Participating in the ATLAS Panda project • One student working on a pilot factory using Condor glide-ins • Working on developing middleware
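The pilot-factory work mentioned above is built around Condor glide-ins. As a rough illustration only, the sketch below generates a minimal pilot submit description and hands it to condor_submit; the wrapper script name, site name, and pilot count are hypothetical, and this is not the actual Panda pilot-factory code.

```python
# Illustrative sketch of how a simple pilot factory might submit batches of
# pilot wrapper jobs through Condor. Paths, site names, and counts are
# hypothetical; this is not the actual Panda pilot-factory implementation.
import subprocess
import tempfile

def submit_pilots(site: str, count: int) -> None:
    # Minimal Condor submit description for a batch of pilot wrappers.
    submit_description = f"""
universe   = vanilla
executable = pilot_wrapper.sh
arguments  = --site {site}
output     = pilot_{site}.$(Cluster).$(Process).out
error      = pilot_{site}.$(Cluster).$(Process).err
log        = pilot_{site}.log
queue {count}
"""
    with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
        f.write(submit_description)
        submit_file = f.name

    # Hand the description to condor_submit; raises if the submission fails.
    subprocess.run(["condor_submit", submit_file], check=True)

if __name__ == "__main__":
    submit_pilots("UTA_SWT2", 10)   # hypothetical site name and pilot count
```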
Conclusions • MonALISA-based Panda monitoring activated • New server needs to be brought up • Working on DDM monitoring • Will be involved in further DDM work • Leading the EG2 (photon) CSC note exercise • Connected to the 10Gb/s NLR via a 1Gb/s connection to UTD • Working closely with HiPCAT on state-wide grid activities • Need to get back to the LAW project, but awaiting successes at OU and ISU