
Publishing applications on the web via the Easa Portal and integrating the Sun Grid Engine

Publishing applications on the web via the Easa Portal and integrating the Sun Grid Engine. By Michael Griffiths & Deniz Savas, CiCS Dept., Sheffield University. M.Griffiths@sheffield.ac.uk, D.Savas@sheffield.ac.uk, http://www.sheffield.ac.uk/wrgrid. Sept 2007.


Presentation Transcript


  1. Publishing applications on the web via the Easa Portal and integrating the Sun Grid Engine By Michael Griffiths & Deniz Savas CiCS Dept. Sheffield University M.Griffiths@sheffield.ac.uk D.Savas@sheffield.ac.uk http://www.sheffield.ac.uk/wrgrid Sept 2007

  2. Sheffield is in South Yorkshire, England

  3. Sheffield University: facts • Established in 1828 • 70 academic departments in 7 faculties • Number of undergraduate students: 25,500 • Number of postgraduate/research students: 5,600 • Number of international students: 3,100

  4. ‘iceberg’ the HPC cluster at the Computer Centre • AMD Opteron based, supplied by Sun Microsystems • Processors: 320 (160 of these are designated to the Physics Dept. for the PP project) • Performance: 300 GFLOPs • Main memory: 800 GB • User filestore: 9 TB • Temporary disk space: 10 TB • Physical size: 8 racks • Power usage: 50 kW

  5. ‘iceberg’ cluster hardware components • 160 general-purpose CPUs: 80 of these are in dual-core configuration with 2 GBytes of memory each (V20 model, i.e. 40 boxes with 2 CPUs + 4 GBytes), and 80 are in quad-core configuration with 4 GBytes of memory each (V40 model, i.e. 20 boxes with 4 CPUs + 16 GBytes); the latter are also connected via a Myrinet switch at 2 Gbps. • IPMI service processors: each box contains a service processor with a separate network interface for remote monitoring and control. (Photo: inside a V20.)

  6. Iceberg cluster configuration (diagram): a head node handles all remote access and acts as the license server; it connects via Eth0/Eth1 to some 60 worker nodes, each with its own service processor for remote monitoring and control; a subset of the workers is Myrinet-connected; a shared file store is NFS-mounted onto the worker nodes.

  7. White Rose Grid YHMAN Network

  8. Grid & HPC applications development tools • Development • Fortran 77/90, C, C++, Java compilers • MPI / MPICH-gm • OpenMP • NAG Mk 20, 21 • ACML • Grid • Sun Grid Engine • Globus 2.4.3 (via GPT 3.0) • SRB S-client tools

  9. Using the White Rose Grid Application Portal

  10. Features and Capabilities • Web-accessible management and execution of applications • Provides a service for rapid authoring and publication of custom applications • Easy integration of multiple heterogeneous resources

  11. Potential benefits of an applications portal • More efficient use of resources • Ease of use • Familiar GUI • Capturing of expert knowledge • Better presentation of legacy software

  12. Potential Development • Building expert systems • Allowing novice users to take advantage of parallel HPC resources • Providing HPC services over the grid • HPC centres collaborating with each other without having to provide individual usernames, file storage etc. to remote users

  13. WRG – Application Portal • Based on EASA • Three usage modes: users (run applications, have storage space, review old results), authors (build and publish applications), and administrators

  14. Using • Accessing • Managing • Applications • Workspace • Results • Help

  15. Using: Accessing • Start up a web browser and go to http://www.shef.ac.uk/wrgrid/easa.html • Log in using the provided user name and password

  16. Using: Help • Select the Help and Support tab to register • Apply to Admin for an account • Apply to authors to register applications

  17. Using: Managing • Installing a client • Setting password • Setting mode (user/author)

  18. Using: Applications • View and select available applications

  19. Running An Application

  20. User Interface

  21. Using: Workspace • Storage for uploaded files and old job files

  22. Using: Results • Check results • View job progress • Export to spreadsheet

  23. Using: Results • Viewing Results

  24. Using: Help • Documentation • Contacts

  25. Conclusions • Disadvantages: thick client, license costs • Advantages: rapid publication; enables virtualization of HPC resources; makes applications available to a broader community (become application focused); effective on a network with low bandwidth; makes applications available to collaboration partners over the internet and outside one's own organisation

  26. Demonstration Applications Developed for EASA • Demonstration of metascheduling across the White Rose Grid • Monitoring of usage across the White Rose Grid • Running applications on the local cluster: Fluent, Ansys, and generic Matlab and Scilab applications

  27. Metascheduler Demonstration: Background • Enable utilisation of resources across the White Rose Grid • Exploit use of task arrays • Job submission is seamless • The demonstration uses a generic Scilab application that runs on any of the White Rose Grid nodes • Simplistic, but effective, manageable and sustainable

  28. Metascheduler Demonstration: Method • Query and compare job queues for WRG nodes (qstat -g c) • Use slots available and total number of slots to generate weights for the different queues • Compare the weights for all queues on the different nodes and use them to select a node (see the sketch after this slide) • Use the standard EASA job submission technique to submit the job to the selected node • EASA does not know about clusters: a special easaqsub script submits the job to SGE, monitors its status, and removes the job if the wait time is exceeded; when the easaqsub job monitor completes, EASA knows that the EASA compute task has completed
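
  As an illustration of the node-selection step above, a minimal shell sketch is given below. It is not the actual easaqsub implementation: the host names are placeholders, remote queries are assumed to be possible via ssh, and the AVAIL/TOTAL column positions of "qstat -g c" differ between SGE versions, so the awk field numbers may need adjusting.

    #!/bin/bash
    # Sketch of weighted node selection across White Rose Grid nodes.
    # Hypothetical host names; assumes password-less ssh to each head node.
    best_node=""
    best_weight=-1
    for node in wrg-node1.example.ac.uk wrg-node2.example.ac.uk
    do
        # weight = available slots / total slots, summed over all cluster queues
        # (fields $5/$6 assume the AVAIL and TOTAL columns of "qstat -g c";
        # adjust if your SGE version prints different columns)
        weight=$(ssh "$node" qstat -g c | awk 'NR > 2 { avail += $5; total += $6 }
                 END { if (total > 0) print avail / total; else print 0 }')
        echo "$node: weight $weight"
        # keep the node with the highest proportion of free slots
        if awk -v w="$weight" -v b="$best_weight" 'BEGIN { exit !(w > b) }'; then
            best_weight=$weight
            best_node=$node
        fi
    done
    echo "selected node: $best_node"
    # the job is then handed to the standard EASA/easaqsub submission step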

  29. Metascheduler Demonstration: Running Scilab • The user provides a Scilab script file • Required resource files, e.g. data files or files for Scilab library routines • Can provide a zipped bundle of Scilab resources • Set the job submission information and then submit the job (a minimal job-script sketch follows)
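
  To make the compute task concrete, the per-task job script that runs the user's Scilab script might look roughly like the sketch below. The file names (resources.zip, userscript.sce) and SGE directives are illustrative assumptions, not the scripts actually generated by the portal.

    #!/bin/bash
    #$ -cwd                  # run in the directory the job was submitted from
    #$ -S /bin/bash          # interpret the job script with bash
    # Unpack the optional zipped bundle of data files / Scilab library routines.
    if [ -f resources.zip ]; then
        unzip -o resources.zip
    fi
    # Run the user-supplied Scilab script without a graphical window; the script
    # is expected to end with exit() so the Scilab session terminates.
    scilab -nw -f userscript.sce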

  30. Metascheduler Demonstration: Job Submission • Provide jobname and job description • Information used for metascheduling • Jobtime (hours) • Waittime (hours) • Number of tasks (for job array) • Submission method • Use metascheduling • Select a particular node
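
  The submission parameters on this slide map naturally onto an SGE array job. A hedged sketch of the submission step is shown below; the variable and file names are illustrative, and in the real portal the easaqsub wrapper performs this step (the wait time is not an SGE option, it is enforced by easaqsub's own monitoring, as described on slide 28).

    #!/bin/bash
    # Values below would come from the EASA submission form (placeholders).
    JOBNAME=easa_scilab        # jobname / job description
    JOBTIME_HOURS=2            # requested run time per task, in hours
    NTASKS=10                  # number of tasks in the job array
    # Submit one array job of NTASKS tasks to the selected node's SGE instance,
    # limiting each task to the requested wall-clock time.
    qsub -N "$JOBNAME" \
         -t 1-"$NTASKS" \
         -l h_rt="$JOBTIME_HOURS":00:00 \
         scilab_task.sh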

  31. Metascheduler Demonstration: Further Developments • Current method successful! It correctly selects clusters and improves turnaround for Scilab compute tasks • The current pattern can be extended to other EASA applications • Provide distributed storage across the White Rose Grid • Develop the metascheduling strategy to introduce greater dependency on user job requirements for node selection • Exploit other metascheduling systems, e.g. SGE transfer queues, Condor-G. THE END
