Environment from the Molecular Level
A NERC eScience testbed project

Leveraging HTC for UK eScience with Very Large Condor Pools: Demand for transforming untapped power into results.

Paul Wilson (1), John Brodholt (1), and Wolfgang Emmerich (2)

1. Department of Earth Sciences, University College London, Gower Street, London WC1E 6BT, UK
2. Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK
This talk: Part 1
1. The eMinerals problem area
2. The computational job types this generates
3. How Condor can help to sort these jobs out
4. What we gain from Condor and where to go next
5. UK institutional Condor programmes and the road ahead

This talk: Part 2
1. Condor’s additional features and how we use them
2. The eMinerals mini grid
3. Conclusion
THE PROBLEM AREA
1. Simulation of pollutants in the environment: binding of heavy metals and organic molecules in soils.
2. Studies of materials for long-term nuclear waste encapsulation: radioactive waste leaching through ceramic storage media.
3. Studies of weathering and scaling: mineral/water interface simulations, e.g. oil well scaling.

Codes relying on empirical descriptions of interatomic forces:
- DL_POLY: molecular dynamics simulations
- GULP: lattice energy/lattice dynamics simulations
- METADISE: interface simulations

Codes using a quantum mechanical description of interactions between atoms:
- CRYSTAL: Hartree-Fock implementation
- SIESTA: Density Functional Theory (DFT), numerical basis sets to describe electronic wave functions
- ABINIT: DFT, plane wave descriptions of electronic wave functions

WHAT TYPE OF JOBS WILL THESE PROBLEMS BE MANIFESTED AS?
2 TYPES OF JOB:
1) High to mid performance: requiring powerful resources, potential process intercommunication, long execution times, CPU and memory intensive.
2) Low performance/high throughput: requiring access to many hundreds or thousands of PC-level CPUs. No process intercommunication, short execution times, low memory usage.

WHERE CAN WE GET THE POWER?
TYPE 1 JOB: Masses of UK HPC resources around. It seems that UK grid resources are largely HPC!
TYPE 2 JOB: ????????

THERE HAS GOT TO BE A BETTER WAY TO OPTIMISE TYPE 2 JOBS!
…AND THERE IS: WE USE WHAT’S ALREADY THERE
930 Windows 2000 PCs (1 GHz Pentium III, 256/512 MB RAM, 1 Gbit Ethernet) clustered in 30 student cluster rooms across every department on the UCL campus, with the potential to scale up to ~3000 PCs.
These machines waste 95% of their CPU cycles 24/7: A MASSIVE UNTAPPED RESOURCE, A COUP FOR eMINERALS!
This is where Condor enters the scene: THE ONLY AVAILABLE FREE, OFF-THE-SHELF RESOURCE MANAGEMENT AND JOB BROKER FOR WINDOWS.
Install Condor on our clusters, and we harness 95% of the power of 930+ machines 24 hours a day, without spending any money.
Is it really this simple?
YES! It has surpassed all expectations, with diverse current use and ever-rising demand.
- 15 smiley happy people (our current group of users, and increasing monthly): eMinerals project, eMaterials project, UCL Computer Science, UCL medical school, University of Marburg, Universities of Bath and Cambridge, Birkbeck College, The Royal Institution…
- Over 1,000,000 hours of work completed in 6 months (105 CPU-years equivalent and counting).
- Codes migrated to Windows representing huge variety: environmental molecular work (all eMinerals codes!), materials polymorph prediction, financial derivatives research, quantum mechanical codes, climatic research, medical image realisation…

NUMBER 1 METRIC FOR SUCCESS: Users love it. Simple to use, doesn’t break and they can forget about their jobs.
NUMBER 2 METRIC FOR SUCCESS: UCL admin love it. 100% utilisation levels 24/7 on the entire cluster network, with no drop in performance and negligible costs, satisfy our dyed-in-the-wool, naturally paranoid sys admin.
NUMBER 3 METRIC FOR SUCCESS: eMinerals developers love it. Fast deployment, tweakable, can build on top of it, low admin, integratable with Globus, great metadata, great free support, great workflow capabilities, Condor-G.
NUMBER 4 METRIC FOR SUCCESS: eScience loves it. Other institutions are following our example, interest is high.
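As an illustration of why users can "forget about their jobs", here is a minimal sketch of how a batch of independent type 2 runs might be described to Condor. It is not taken from the eMinerals setup: the executable name `simulate.exe`, the input and output file names, and the helper `make_submit_description` are all hypothetical. The submit commands themselves (`universe`, `executable`, `arguments`, `transfer_input_files`, `should_transfer_files`, `when_to_transfer_output`, `output`, `error`, `log`, `queue`) and the `$(Process)` macro are standard Condor submit-description syntax. The vanilla universe is assumed here because the slides note the lack of checkpointing on Windows.

```python
def make_submit_description(executable, n_jobs, input_pattern="input_$(Process).dat"):
    """Build the text of a Condor submit description for n_jobs independent runs.

    Every name passed in here is illustrative; only the submit commands
    themselves are standard Condor syntax.
    """
    lines = [
        # Vanilla universe: plain jobs, no checkpointing (an assumption for Windows nodes).
        "universe = vanilla",
        f"executable = {executable}",
        # $(Process) expands to 0, 1, 2, ... so each job gets its own input file.
        f"arguments = {input_pattern}",
        f"transfer_input_files = {input_pattern}",
        # No shared file system is assumed, so Condor ships files to and from each node.
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "output = run_$(Process).out",
        "error = run_$(Process).err",
        "log = sweep.log",
        # One queue statement creates all n_jobs jobs in a single cluster.
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sweep of 500 short runs.
    print(make_submit_description("simulate.exe", 500))
```

Saving the printed text to a file and passing it to `condor_submit` would queue all 500 runs at once; Condor then matches each one to an idle machine as it becomes available.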
One million Condor nodes in a hollowed out volcano! Mwahahaha…

WHAT IS MOST IMPORTANT?
• Condor ENABLES any scientist to do their work in a way they previously dreamed about: beginning to make real the ability to match unbounded science with unbounded resources.
• Condor has slashed time-to-results from years to weeks. Scientists using our Condor resource have redefined their ability to achieve their goals.
• Condor has organised resources at many levels:
  - Desktop: June 2002 (2 nodes)
  - Cluster: Sept 2002 (18 nodes)
  - Department: Jan 2003 (150 nodes)
  - Campus: October 16th 2003 (930 nodes)
  - WHERE NEXT: (?????? nodes, ???? pools)…

This is the largest single Condor pool in the UK (according to Condor).
This is the first fully cross-department institutional Condor pool in the UK.
Several other institutions have followed our lead: Cambridge, Cardiff.
Much scope for combining resources (flocking, glide-in).
…Regional and national Condor resources are next…
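The "years to weeks" claim can be checked with rough arithmetic using figures quoted in the talk (1,000,000 hours of work, 930 nodes, 95% of cycles idle). The sketch below is a back-of-envelope estimate only: it assumes a 365-day year, perfectly divisible work and no scheduling overhead, and the function names are illustrative. On that basis one million hours is about 114 CPU-years on a single processor; the talk quotes 105, so the authors may have used a different basis or an earlier snapshot.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: 365-day year, no downtime


def serial_years(cpu_hours):
    """Years needed to run the work on one CPU, back to back."""
    return cpu_hours / HOURS_PER_YEAR


def pool_days(cpu_hours, nodes, idle_fraction):
    """Days needed if every node contributes its idle fraction around the clock."""
    usable_hours_per_day = nodes * idle_fraction * 24
    return cpu_hours / usable_hours_per_day


if __name__ == "__main__":
    work = 1_000_000  # CPU-hours, the figure quoted in the talk
    print(round(serial_years(work), 1))          # 114.2 years on one CPU
    print(round(pool_days(work, 930, 0.95), 1))  # 47.2 days on the idealised pool
```

So, under these idealised assumptions, work that would occupy one machine for over a century fits into under seven weeks on the campus pool. The six months actually reported is consistent with a pool that is shared, and that only reached 930 nodes in October 2003.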
…Regional and national Condor resources continued.
• Many UK institutions have small/medium Condor pools. Some (Southampton, Imperial, Cardiff, Cambridge) have large and expanding pools.
• Many UK institutions have resources wasting millions of CPU cycles.
• We have proved the usefulness of large Windows Condor resources.
• Assurances regarding security, authorisation, authentication, access and reliable job execution are essential to the take-up of Condor on this scale in the UK.
• Many potential resources are Windows, which complicates matters (for example, poor GSI port to Windows and lack of Windows checkpointing).
• With education, awareness, support and a core group to lead the way, UK institutions can form a national-level Condor infrastructure leveraging HTC resources for scientists within UK eScience.

It hasn’t all been plain sailing though…
Issues with Very Large Condor Installations
- Political: the biggest problem. Resistance to change, ownership.
- Technical: usually surmountable. Networks, deployment, admin, load.
- Policy: changes to I.S. usage. New usage: which is the primary use?
- Security: trust or certificate based. Trust is easy and works. Certs are a pain.
5) The latest from the Condor pool…
2) Latest UK Condor research: FC-UK…
UCL, Cambridge and the Condor team at Wisconsin-Madison: a 1-year project, 50% funded by Microsoft, to develop web-services-based Condor scheduler and administrative interfaces on the eMinerals mini-grid using Microsoft .NET.
This may extend into WS-RF (grid standard?) if it appears.
This is a fully integrated Condor project, and will form part of future releases.
Who? Me, Clovis, Wolfgang Emmerich (UCL); Martin Dove, Mark Calleja (Cambridge); Miron Livny, Todd Tannenbaum and Matt Farrellee (Condor); and all you prolific users!
3) Where next, given the lack of volcanoes?
UK e-Science to lead in Condor-based HTC. Here’s the idea…
- UCL to host the UK Condor download mirror (imminent).
- UK Condor support network working through the new Grid Operation Centre (discussions with UK Grid Exec and GOC are current).
- UK Condor working group to develop a National HTC Condor Service, and formalise long-term Condor integration across the UK.
- UCL to integrate W-S Condor into existing infrastructure: more choice…
UCL kicked this all off by proposing and co-leading the inaugural UK Condor Week 2004…
4) UK Condor Week 2004. Jolly exciting it is too.
When and where: October 11th to 15th 2004, National eScience Centre, Edinburgh.
Who should come: anyone with an interest in Condor, creating HTC resources and the future of UK eScience. Project members, leaders, scientists, institutional I.S. leaders and administrators, eScience decision makers and leaders.
Fully endorsed and encouraged by the Condor team, who will attend along with Miron Livny (Condor Godfather and a top bloke) and give two days of tutorials, hands-on sessions, Q&A and demos of new technology.
The other 3 days will be discussions, breakout sessions etc., with the aim of formalising a Condor/HTC roadmap for the short and near term for the UK, and agreeing on a group of people to actually do the work.
See www.nesc.ac.uk for details.
…AND FINALLY. THE MILLION DOLLAR QUESTION?
When was the millionth recorded hour of work completed?
DATE: April 2nd 2004…
HOUR: ~09:03 AM…
JOB: 1735.441…
JOB LENGTH: 23 hrs 41 minutes…
WHO GETS THE GLORY? DR SAM FRENCH, e-Materials Project, R.I. A.K.A. ‘The Poolmeister’
Summary.
Condor has enabled eMinerals scientists and their UK colleagues to perform their science:
- in significantly new ways,
- on previously untapped resources,
- on previously unutilised operating systems,
- in weeks rather than years,
- in an integrated, heterogeneous, grid-enabled environment,
- easily, painlessly and for no cost,
- with equal importance given to data handling,
- using out-of-the-box tools.
Conclusion: THIS MUST CONTINUE!
Condor has an important part to play in the UK eScience programme:
- Through meeting the increasing demands from users for large-scale, accessible Condor-enabled HTC resources.
- Through harnessing the significant volumes of existing, under-utilised, heterogeneous UK institutional hardware.
- Through providing functionality to facilitate secure accessibility to heterogeneous compute and data resources.
- Through engaging with the UK eScience programme within Condor’s grid/web service and standardisation developments.
Elvis from the Molecular Level
A NERC eScience testbed project
Uhhh thankyouverymuch. You’re beautiful.
eMinerals project: http://www.eminerals.org