Louisiana Tech Site Report
DOSAR Workshop V, September 27, 2007
Michael Bryant, Louisiana Tech University
COMPUTING IN LOUISIANA: Louisiana Tech University and LONI
Computing Locally at LTU
• At the Center for Applied Physics Studies (CAPS):
  • Small 8-node cluster with 28 processors (60 gigaflops)
  • Used by our local researchers and the Open Science Grid
  • Dedicated Condor pool of both 32-bit and 64-bit (with 32-bit compatibility libraries) machines running RHEL5
• Additional resources at LTU through the Louisiana Optical Network Initiative (LONI):
  • Intel Xeon 5 TF Linux cluster (not yet ready): 128 nodes (512 CPUs), 512 GB RAM; 4.772 TF peak performance
  • IBM Power5 AIX cluster: 13 nodes (104 CPUs), 224 GB RAM; 0.851 TF peak performance
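To make the mixed-architecture pool concrete, a minimal Condor submit description that lets a job match either class of machine might look like the following; this is an illustrative sketch, not the site's actual configuration (the executable and file names are hypothetical):

    # Sketch of a submit file for a mixed 32-/64-bit Condor pool.
    universe     = vanilla
    executable   = analyze_events        # hypothetical payload
    # Match either architecture; 32-bit binaries run on the 64-bit
    # nodes through their compatibility libraries.
    requirements = (Arch == "INTEL" || Arch == "X86_64") && (OpSys == "LINUX")
    log          = analyze.log
    output       = analyze.$(Cluster).$(Process).out
    error        = analyze.$(Cluster).$(Process).err
    queue 4

Submitting this with condor_submit queues four instances; condor_status shows which 32-bit and 64-bit slots they land on.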
Louisiana Tech Researchers
• Focused on High Energy Physics, High Availability (HA) and Grid computing, and Biomedical Data Mining
• High Energy Physics:
  • Fermilab (D0), CERN (ATLAS), and ILC: Dr. Lee Sawyer, Dr. Dick Greenwood (Institutional Rep.), Dr. Markus Wobisch
    • Joe Steele is now at TRIUMF in Vancouver
  • Jefferson Lab (G0, Qweak experiments): Dr. Kathleen Johnston, Dr. Neven Simicevic, Dr. Steve Wells, Dr. Klaus Grimm
• HA and Grid computing:
  • Dr. Box Leangsuksun, Vishal Rampure, Michael Bryant (me)
Louisiana Optical Network Initiative
The Louisiana Optical Network Initiative (LONI) is a high-speed computing and networking resource supporting scientific research and the development of new technologies, protocols, and applications to positively impact higher education and economic development in Louisiana. — http://loni.org
• 40 Gb/s bandwidth state-wide
• Next-generation network for research
• Connected to the National LambdaRail (NLR, 10 Gb/s) in Baton Rouge
• Spans 6 universities and 2 health centers
LONI Computing Resources
• 1 x Dell 50 TF Intel Linux cluster housed at the state's Information Systems Building (ISB)
  • "Queen Bee," named after Governor Kathleen Blanco, who pledged $40 million over ten years for the development and support of LONI
  • 680 nodes (5,440 CPUs), 5,440 GB RAM
    • Two quad-core 2.33 GHz Intel Xeon 64-bit processors per node
    • 8 GB RAM per node
  • 50.7 TF peak performance
  • According to the June 2007 Top500 listing*, Queen Bee ranked as the 23rd fastest supercomputer in the world
• 6 x Dell 5 TF Intel Linux clusters housed at 6 LONI member institutions
  • 128 nodes (512 CPUs), 512 GB RAM each
    • Two dual-core 2.33 GHz Xeon 64-bit processors per node
    • 4 GB RAM per node
  • 4.772 TF peak performance
• 5 x IBM Power5 575 AIX clusters housed at 5 LONI member institutions
  • 13 nodes (104 CPUs), 224 GB RAM each
    • Eight 1.9 GHz IBM Power5 processors per node
    • 16 GB RAM per node
  • 0.851 TF peak performance
Combined total of 84 teraflops
* http://top500.org/list/2007/06/100
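The quoted peak figures for the Xeon clusters follow from a simple back-of-the-envelope calculation, assuming the standard 4 floating-point operations per clock cycle per core for these processors:

    Queen Bee:      680 nodes × 8 cores × 2.33 GHz × 4 flops/cycle ≈ 50.7 TF
    5 TF clusters:  128 nodes × 4 cores × 2.33 GHz × 4 flops/cycle ≈ 4.77 TF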
LONI: The big picture… (by Chris Womack)
[Diagram: the Louisiana Optical Network linking the LONI member institutions, the IBM P5 supercomputers, the Dell 80 TF cluster, the National Lambda Rail, and a placeholder for future ("NEXT ???") resources.]
PetaShare
• Goal: enable domain scientists to focus on their primary research problem, assured that the underlying infrastructure will manage the low-level data handling issues.
• Novel approach: treat data storage resources and the tasks related to data access as first-class entities, just like computational resources and compute tasks.
• Key technologies being developed: data-aware storage systems, data-aware schedulers (e.g., Stork), and a cross-domain metadata scheme.
• Provides an additional 200 TB of disk and 400 TB of tape storage.
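To give a flavor of the data-aware scheduling idea, Stork treats a data placement as a schedulable job in its own right, described in a ClassAd-style request. The sketch below is illustrative only; the URLs are hypothetical, and the exact attribute set depends on the Stork release:

    [
      dap_type = transfer;
      src_url  = "file:///panasas/osg/data/run1234.root";
      dest_url = "gsiftp://petashare.example.edu/data/run1234.root";
    ]

Such a request is handed to the Stork server (e.g., with stork_submit), which queues and retries the transfer the same way a batch scheduler manages compute jobs.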
[Figure: participating institutions in the PetaShare project, connected through LONI, with sample research of the participating researchers pictured (e.g., biomechanics by Kodiyalam & Wischusen, tangible interaction by Ullmer, coastal studies by Walker, and molecular biology by Bishop). Sites shown: LaTech, LSU, UNO, Tulane, and ULL; research areas include high energy physics, biomedical data mining, coastal modeling, petroleum engineering, computational fluid dynamics, synchrotron X-ray microtomography, biophysics, molecular biology, geology, and computational cardiac electrophysiology.]
ACCESSING RESOURCES ON THE GRID: LONI and the Open Science Grid
OSG Compute Element: LTU_OSG
• Located here at Louisiana Tech University
• OSG 0.6.0 production site
• Using our small 8-node Linux cluster
  • Dedicated Condor pool using 20 of the 28 CPUs
  • 8 nodes (28 CPUs), 36 GB RAM:
    • 2 x dual 2.2 GHz Xeon 32-bit processors, 2 GB RAM per node
    • 2 x dual 2.8 GHz Xeon 32-bit processors, 2 GB RAM per node
    • 2 x dual 2.0 GHz Opteron 64-bit processors, 2 GB RAM per node
    • 1 x two quad-core 2.0 GHz Xeon 64-bit processors, 16 GB RAM
    • 1 x two quad-core 2.0 GHz Xeon 64-bit processors, 8 GB RAM
• We would like to…
  • Expand to a Windows coLinux Condor pool
  • Combine with the IfM and CS clusters
• Plan to move to the OSG ITB when the LONI 5 TF Linux cluster at LTU becomes available
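Jobs reach a CE like this through its Globus gatekeeper. As a quick functional test from any machine with grid credentials, something along these lines exercises the gatekeeper-to-Condor path (the gatekeeper hostname is an assumption, not the site's published address):

    # Create a short-lived proxy from your grid certificate,
    # then run a trivial job via the CE's Condor jobmanager.
    grid-proxy-init
    globus-job-run gatekeeper.phys.latech.edu/jobmanager-condor /bin/hostname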
OSG Compute Element: LTU_CCT
• Located at the Center for Computation & Technology (CCT) at Louisiana State University (LSU) in Baton Rouge, La.
• OSG 0.6.0 production site
• Using the LONI 5 TF Linux cluster at LSU (Eric)
  • PBS opportunistic single-processor queue
  • Only 64 CPUs (16 nodes) available out of the 512 CPUs total
    • 128 nodes, 512 GB RAM; two dual-core 2.33 GHz Xeon 64-bit processors and 4 GB RAM per node
    • The 16 nodes are shared with other PBS queues
• Played a big role in the DZero reprocessing effort
  • Dedicated access to the LONI cluster during reprocessing
  • 384 CPUs were used simultaneously
• Continuing to run DZero MC production at both sites
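For orientation, a single-processor job on this kind of opportunistic PBS queue would be described roughly as follows; this is a sketch, and the queue name and payload script are assumptions rather than the cluster's documented configuration:

    #!/bin/bash
    #PBS -q single              # assumed name of the single-processor queue
    #PBS -l nodes=1:ppn=1       # one processor on one node
    #PBS -l walltime=12:00:00
    #PBS -N d0-mc-test
    cd "$PBS_O_WORKDIR"
    ./run_d0_mc.sh              # hypothetical payload script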
Reprocessing at LTU_CCT
[Charts, continued over two slides: DZero reprocessing activity at LTU_CCT on the LONI cluster.]
DZero MC Production for LTU*
• 8.5 million events total
[Charts: weekly production by site and cumulative production by site.]
* LTU_CCT and LTU_OSG are combined
CURRENT STATUS AND FUTURE PLANS: LONI OSG CEs and PanDA Scalability + High Availability
Current Status of LTU_OSG
• Upgraded to OSG 0.6.0
• Upgraded to RHEL5
• Added two new Dell Precision workstations (16 CPUs total: two quad-core 2.0 GHz Xeon 64-bit processors each, with 16 GB and 8 GB RAM)
• Connected to the LONI 40 Gb/s network in June (finally!)
  • Allows us to run D0 MC again
• Running DZero MC production jobs (sent using Joel's AutoMC daemon)
• Installed standalone Athena 12.0.6 on caps10 for testing ATLAS analysis
Current Status of LTU_CCT
• Switched to the LONI 5 TF (Eric) cluster from SuperMike/Helix
• Upgraded to OSG 0.6.0
• Running DZero MC production jobs (sent using Joel's AutoMC daemon)
• Running ATLAS production test jobs
• Problems so far (see the sketch below):
  • Pacman following symlinks! (/panasas/osg/app -> /panasas/osg/grid/app on the headnode)
  • Conflict between a 32-bit Python install and the 64-bit OS (https:// not supported)
  • OSG_APP Python path was wrong
  • Incorrect Tier2 DQ2 URL
• Three successful tests so far; need a few more before running full production
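On the Pacman symlink issue: the failure mode is that Pacman resolves the application area to its physical target during installation, so the paths it records differ from the symlinked path that jobs are given. A hedged diagnostic sketch (the workaround shown is one plausible fix, not necessarily the one applied at the site):

    # Confirm the app area is a symlink on the headnode:
    ls -ld /panasas/osg/app        # -> /panasas/osg/grid/app

    # One workaround: advertise the resolved physical path so the
    # install-time and run-time views of OSG_APP agree.
    export OSG_APP=/panasas/osg/grid/app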
What's next?
• Create OSG CEs at each of the six LONI sites
  • Possibly creating a LONI state-wide grid
  • Tevfik Kosar is building a campus grid at LSU
• Begin setting up PetaShare storage at each LONI site
• PanDA scalability tests on Queen Bee (see the sketch below)
  • Proposing to the PanDA team and the LONI allocation committee
• Bringing other non-HEP projects into DOSAR using PanDA (see talk tomorrow)
• Applying HA techniques to PanDA and the Grid (see talk tomorrow)
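For the PanDA scalability tests, one natural mechanism is to fan pilot jobs out to Queen Bee through Condor-G's grid universe; a minimal sketch (the gatekeeper name and pilot script are hypothetical, and this is not a confirmed PanDA configuration):

    # Submit 50 pilot jobs to a remote GRAM/PBS gatekeeper via Condor-G.
    universe      = grid
    grid_resource = gt2 queenbee.loni.org/jobmanager-pbs   # assumed hostname
    executable    = pilot.sh                               # hypothetical pilot wrapper
    log           = pilots.log
    queue 50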
QUESTIONS / COMMENTS?