Javier Lopez, Alvaro Simon, Esteban Freire / CESGA. SA3 All Hands Meeting, Barcelona, 22 May 2007. CESGA Status Report.
Outline
EGEE User Forum
• Main Achievements since November
• Work on Deliverables/Milestones
• Issues/Problems
• Next Steps
Infrastructure
• New infrastructure based on virtual machines
• This is the new common infrastructure for the SA3, PPS and Production testbeds @CESGA
• Frontend:
  • 1x Dell PowerEdge 2950: 1 TB RAID5 storage (golden images of all the services)
• Virtual Machines:
  • 4x10 Dell PowerEdge 1955: quad-core processors
• For SA3 services we use HVM machines: they allow us to use kernel 2.4 without modification to the OS
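
To illustrate the HVM choice above, a Xen 3 guest definition in the Python-syntax format read by `xm create` might look like the sketch below. The guest name, memory size, volume path and bridge name are hypothetical placeholders rather than CESGA's actual settings, and the loader and device-model paths vary by distribution.

```python
# Sketch of a Xen 3 HVM guest definition (xm config files use Python syntax).
# All values are illustrative assumptions, not the CESGA configuration.
name = "sa3-ce-test"                       # hypothetical guest name
builder = "hvm"                            # full virtualization: guest kernel runs unmodified
kernel = "/usr/lib/xen/boot/hvmloader"     # HVM firmware loader; path varies by distribution
device_model = "/usr/lib/xen/bin/qemu-dm"  # device emulator; path varies by distribution
memory = 1024                              # MB, assumed
vcpus = 1
disk = ["phy:/dev/vg_guests/sa3-ce-test,hda,w"]  # hypothetical LVM volume exported as hda
vif = ["type=ioemu, bridge=xenbr0"]        # emulated NIC on an assumed bridge name
boot = "c"                                 # boot from the first hard disk
```

Because the guest is fully virtualized, an OS image with a stock 2.4 kernel can be booted from the exported volume as-is, which is the property the slide relies on.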
Infrastructure
• Advantages
  • We can increase or decrease the capacity on demand
  • Easy to test new releases in a clean environment
  • Possible to roll back failing upgrades using the LVM snapshot capability
• We have produced a document explaining our infrastructure:
  • https://swe-wiki.egee.cesga.es/cgi-bin/moin.cgi/XEN3_Virtual_Machines_-_CESGA-EGEE
• More detailed documents also available on request
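
The snapshot-based rollback mentioned above can be sketched as two LVM commands: one taken before the upgrade, one used if the upgrade fails. The helper below only builds the command lines and does not run them. The volume group, volume and snapshot names are hypothetical, and `lvconvert --merge` assumes an LVM2 version with snapshot merging; where that is unavailable, the snapshot contents would have to be copied back instead.

```python
"""Sketch: build LVM command lines for snapshot-based rollback of a guest volume."""


def snapshot_cmd(vg, lv, snap_name, size="2G"):
    """Return the command that creates a copy-on-write snapshot of vg/lv.

    `size` is the space reserved for blocks that change after the snapshot.
    """
    return ["lvcreate", "--snapshot", "--size", size,
            "--name", snap_name, f"/dev/{vg}/{lv}"]


def rollback_cmd(vg, snap_name):
    """Return the command that merges the snapshot back into its origin volume."""
    return ["lvconvert", "--merge", f"{vg}/{snap_name}"]


if __name__ == "__main__":
    # Hypothetical names: volume group "vg_guests", guest volume "sa3-ce-test".
    print(" ".join(snapshot_cmd("vg_guests", "sa3-ce-test", "pre-upgrade")))
    print(" ".join(rollback_cmd("vg_guests", "pre-upgrade")))
```

The guest should be shut down before the merge so that the origin volume is not in use while it is reverted.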
SGE
• NOTE: This task is a joint effort between IC, LIP and CESGA
• Integration in LCG CE ready
  • RPM packages tested
  • YAIM scripts developed
  • Documentation updated
  • Ready for certification
• Redistribution of Grid Engine:
  • Reviewed license and sent to SA3 list for second review
  • Redistribution allowed
Testing SGE
• Based on the Torque/Maui tests developed @GRNET
  • Adapting the scripts
• Preliminary results available
  • Very slow submission
  • Optimizing SGE configuration
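
One way to quantify the slow submission noted above is to time each submission call and compare the summary before and after a configuration change. The harness below is a minimal sketch, not the GRNET test scripts: `measure_submission` and `summarize` are hypothetical helpers, and `submit` is any callable that submits one job (for example a wrapper around `subprocess.run(["qsub", script])` on an SGE host).

```python
import time


def measure_submission(submit, n_jobs):
    """Call submit(i) for i in range(n_jobs); return (total_seconds, latencies)."""
    latencies = []
    start = time.perf_counter()
    for i in range(n_jobs):
        t0 = time.perf_counter()
        submit(i)  # one job submission; assumed to block until accepted
        latencies.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    return total, latencies


def summarize(total, latencies):
    """Reduce raw timings to a few comparable figures."""
    n = len(latencies)
    return {
        "jobs": n,
        "total_s": total,
        "mean_s": sum(latencies) / n if n else 0.0,
        "max_s": max(latencies, default=0.0),
        "jobs_per_min": 60.0 * n / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    # Stand-in submitter that just sleeps; replace with a real submission call.
    total, lat = measure_submission(lambda i: time.sleep(0.01), 20)
    print(summarize(total, lat))
```

Keeping the submitter pluggable lets the same harness be pointed at a Torque/Maui CE and an SGE CE for a like-for-like comparison.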
SGE on gLite CE
• IP almost there (only minor changes required to the LCG version)
• Meeting with BLAH developer (David Rebatto) to understand the work required
• Required scripts are being developed @IC
• Testing will be done @CESGA
• APEL ready (Dave Kant @RAL)
Assigned Tasks
• Task #4759: Testing SGE
  • In progress
• Task #4600: Provide updated RPMs for SGE jobmanager and installation guide
  • Ready for certification
Issues/Problems
• Job submission tests:
  • Preliminary results show that the default SGE configuration needs optimization
  • Improve SGE configuration to send the stdout and stderr files back to the CE
• Modifications required to run SGE on para-virtual machines (failure in arch detection)
Next Steps
• SGE is working on an lcg-CE. Next step: CERTIFICATION
  • Add RPMs to SA3 repository
  • Integrate SGE YAIM scripts
  • Tests for SGE lcg-CE (later they will be re-used for glite-CE)
• SGE on glite-CE: started integrating support for BLAH
  • Other local middleware elements (GIIS, YAIM) basically remain unchanged for this glite-CE flavour
  • APEL ready
  • Support for external SGE_QMASTER (IC and CESGA use this type of configuration in production)
  • GridICE sensors for SGE
References
• Xen Virtualization @CESGA
  • https://swe-wiki.egee.cesga.es/cgi-bin/moin.cgi/XEN3_Virtual_Machines_-_CESGA-EGEE
• SGE Wiki Page
  • https://twiki.cern.ch/twiki/bin/view/LCG/ImplementationOfSGE