Experiences with Globus on DAS-2 in an educational setting
Herbert Bos & Lex Wolters
LIACS, Leiden University
{herbertb,llexx}@liacs.nl
DAS-2 workshop, June 6 2002
Seminar Grid Computing
• Fall 2001
• 11 students (year 3 or 4) started, 8 finished
• Once a week, 2 hours
• 13 classes
• Programming assignment
Goals
• Try to separate Grid hype from Grid reality
• Show the underlying technologies that are currently being developed and used to provide a 'pervasive computing grid'
Topics by lecturers
• What is Grid Computing?
• Requirements
• History
• Grid architecture
• Basic Services
• Taxonomy of Grids
• QoS (final class)
Presentations by students
• Legion
• Globus
• Resource management: GRAM
• Scheduling: AppLeS
• Communication (Nexus, etc.)
• Information service: MDS, GRIS, GIIS
• Data access: GASS + RSL
• Security: GSS-API
• Language support
Programming assignment
Using Globus to implement a Grid application:
• Computation is chopped up into subtasks, which are distributed to computational nodes
• Final result is a combination of the results of the subtasks
• Resource discovery
• At least one of the following options:
  • Data is distributed in a secure fashion
  • Incorporate costs
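The split/distribute/combine pattern asked for in the assignment can be sketched as follows. This is only an illustration in Python: a local thread pool stands in for remote computational nodes, and the names `split_range`, `subtask`, and `run_master` as well as the sum-of-squares workload are hypothetical, not taken from any of the student projects.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, parts):
    """Chop [start, stop) into `parts` contiguous subranges of near-equal size."""
    size, extra = divmod(stop - start, parts)
    bounds, lo = [], start
    for i in range(parts):
        hi = lo + size + (1 if i < extra else 0)
        bounds.append((lo, hi))
        lo = hi
    return bounds


def subtask(bounds):
    """One unit of work: sum of squares over a subrange (toy workload)."""
    lo, hi = bounds
    return sum(n * n for n in range(lo, hi))


def run_master(start, stop, workers=4):
    """Distribute the subtasks, then combine the partial results."""
    chunks = split_range(start, stop, workers)
    # Assumption: a thread pool replaces the job submission to Grid nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(subtask, chunks))
    return sum(partials)
```

In a Grid setting, the `pool.map` call is the part that resource discovery and job submission would replace; the splitting and combining logic stays the same.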
Topics
• Willem de Bruijn: Distributed Evolutionary Algorithm
• Hongqin Chen: RSA Key Breaking
• Jeroen Laros: GridCrafty
• Hui Li: Parallel Fractal Image Generation
• Yafei Sun: Adaptive Quadrature
• Arjan Tijms & Shlomo Raikin: Parallel Genetic Algorithm
Situation
• Delivery of DAS-2 delayed
• Globus installation on SUN server
• System managers unfamiliar with Globus
• Incorrect installation, e.g. certificates
• Jan 21, 2002: DAS-2 operational
• Platform for students
• Focus on PBS, MPI; not Globus
Distributed Evolutionary Algorithm
• Purpose: minimize an arbitrary function
• Strategy:
  • Self-adaptation, no distinction between worker and controller nodes
  • Predefined number of runs
• Language: C++
• Modules:
  • Communication: Globus IO
  • Resource management: GRAM
• Results:
  • Master/slave set-up gives the best results in the shortest time span
  • Other strategies increase self-adaptiveness, but give worse results in the current setting
Distr. Evolutionary Algorithm (cont’d)
• Problems:
  • Distinction between fileserver and compute node: starting up new processes
  • Wall-time value (60 s) of the scheduler cannot be altered (not even by maxTime in RSL): waiting processes are killed
• Suggestions for improvement:
  • Symbolic links to Globus libraries
  • Documentation on Globus:
    • Overall idea is neglected
    • Q&A forum, globus.org
RSA Key Breaking
• Purpose: factoring large numbers
• Strategy:
  • Pollard’s Rho factoring algorithm
  • Master/slave framework
• Language: C
• Modules:
  • Communication: Nexus
  • Job allocation: GRAM and PBS
• Results:
  • Significant speed-ups, depending on work-load/distribution
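For readers unfamiliar with Pollard's Rho, a minimal Python sketch follows (the project itself was written in C). The slides do not say how work was divided among slaves; handing each worker a different constant `c` of the iteration x → x² + c is an assumption made here for illustration, and `factor_with_workers` is a hypothetical name that simply tries those constants one after another.

```python
from math import gcd


def pollard_rho(n, c=1, x0=2):
    """Return a non-trivial factor of composite n, or None if this (c, x0) fails.

    Uses Floyd cycle detection on the sequence x -> x*x + c (mod n).
    """
    if n % 2 == 0:
        return 2
    x = y = x0
    d = 1
    while d == 1:
        x = (x * x + c) % n          # tortoise: one step
        y = (y * y + c) % n          # hare: two steps
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)
    return d if d != n else None


def factor_with_workers(n, constants=range(1, 9)):
    """Assumed work division: each 'worker' would get its own constant c.

    Here the constants are tried sequentially; returns (p, q) or None.
    """
    for c in constants:
        d = pollard_rho(n, c=c)
        if d is not None:
            return d, n // d
    return None
```

For example, `factor_with_workers(8051)` yields the two prime factors 83 and 97 in some order.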
RSA Key Breaking (cont’d)
• Problems:
  • Start-up
    • Problems getting the correct certificate
    • Libraries were not installed correctly
    • Functions were not available
  • ‘Real’ problems
    • GRAM macro definitions not in the corresponding header file
  • Documentation
    • Lack of practical guidelines and examples
GridCrafty
• Purpose: shell script which parallelises the chess engine Crafty
• Strategy:
  • Master: enumerates all possible moves; workers: grade the moves
• Modules:
  • Storage access: GASS, globus_rcp, openssh
• Results:
  • Due to problems with the Globus implementation, Globus was also bypassed entirely, which led to a speed-up of 17.5 (theoretical 22)
GridCrafty (cont’d)
• Problems:
  • Start-up
    • GASS did not work properly
    • globus_rcp was not installed
    • openssh did not work
  • ‘Real’ problem
    • Scheduling of tasks takes a lot of time
• Final implementation:
  • Connect to all nodes; query load:
    • Static: < 10% host free
    • Dynamic: client checks load before start of intensive calculations
  • ssh implementation much faster than Globus (speed-ups 17.5 versus 5-9)
Parallel Fractal Image Generation
• Purpose: see title
• Strategy
  • Master distributes work, collects output, draws image
  • Slaves calculate points line-wise
• Language: C and C++
• Modules
  • Resource management: GRAM
  • Communication: MPI
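The line-wise decomposition can be sketched as below. The slides do not name the fractal, so the Mandelbrot set, the viewing window, and the function names `mandelbrot_row` and `render` are assumptions; the real project used C/C++ with MPI, whereas this Python sketch computes the rows in a plain loop where a master would hand them to slaves.

```python
def mandelbrot_row(y, width, height, max_iter=50):
    """Slave task: escape-time counts for image line y (needs width, height >= 2)."""
    row = []
    im = -1.5 + 3.0 * y / (height - 1)       # imaginary part for this line
    for x in range(width):
        re = -2.0 + 3.0 * x / (width - 1)    # real part for this pixel
        z, c = 0j, complex(re, im)
        n = 0
        while n < max_iter and abs(z) <= 2.0:
            z = z * z + c
            n += 1
        row.append(n)
    return row


def render(width, height, max_iter=50):
    """Master: hand out line numbers, collect the rows, assemble the image."""
    # Results are keyed by line number because slaves may finish out of order.
    results = {y: mandelbrot_row(y, width, height, max_iter)
               for y in range(height)}
    return [results[y] for y in range(height)]
```

Because each line depends only on its own index, lines can be handed out in any order and the master only needs the line number to place a returned row.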
Par. Fractal Image Generation (cont’d)
• Problem
  • Conflict between current MPI set-up and GRAM job submit script (temporarily fixed on the UvA cluster only)
• Suggestions for improvement:
  • Installation of MPICH-G2
  • Where can one find good examples on exploiting Globus to get started?
Adaptive Quadrature
• Purpose: calculate the quadrature of the curve of an arbitrary function
• Strategy:
  • Divide the curve into smaller segments
  • Ring of processes
  • Results via files
• Language: gcc
• Modules
  • Process control and allocation: DUROC
  • Communication: Nexus
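The slides do not say which quadrature rule was used. As one common choice, the sketch below uses adaptive Simpson integration in Python: an interval is split in two whenever the estimate on the halves disagrees too much with the estimate on the whole. The names `simpson` and `adaptive_quadrature` and the tolerance handling are illustrative assumptions; the two recursive calls are the natural units that could be handed to separate processes.

```python
import math


def simpson(f, a, b):
    """Simpson's rule on the single interval [a, b]."""
    m = (a + b) / 2.0
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b))


def adaptive_quadrature(f, a, b, tol=1e-8, depth=50):
    """Integrate f over [a, b], subdividing only where the error estimate is large."""
    m = (a + b) / 2.0
    whole = simpson(f, a, b)
    left = simpson(f, a, m)
    right = simpson(f, m, b)
    diff = left + right - whole
    # Standard Simpson error estimate: accept when |diff| <= 15 * tol.
    if depth == 0 or abs(diff) <= 15.0 * tol:
        return left + right + diff / 15.0
    # Each half is an independent subtask with half the error budget.
    return (adaptive_quadrature(f, a, m, tol / 2.0, depth - 1)
            + adaptive_quadrature(f, m, b, tol / 2.0, depth - 1))
```

For instance, `adaptive_quadrature(math.sin, 0.0, math.pi)` approximates the exact value 2.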
Adaptive Quadrature (cont’d)
• Problems
  • Start-up
    • Getting the correct certificate
    • Using the right RSL parameter (hostCount)
  • ‘Real’ problem
    • Conflict between duroc_runtime_barrier and PBS: fixed only on UvA-cluster
• Suggestions for improvement
  • Info on different communication techniques
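To make the RSL parameters mentioned on these slides concrete, the sketch below assembles a GRAM-style RSL string in Python. The helper `rsl` is hypothetical, the executable path and all values are made up, and the output only illustrates the `&(attribute=value)` syntax with the `count`, `hostCount`, and `maxTime` attributes; it is not the job description any student actually used.

```python
def rsl(**attrs):
    """Build a conjunctive RSL string such as &(executable=...)(count=...).

    Assumption: values are simple tokens; values containing spaces or
    special characters would need quoting, which this sketch omits.
    """
    return "&" + "".join(f"({name}={value})" for name, value in attrs.items())


# Hypothetical job: 8 processes spread over 4 hosts with a time limit.
job = rsl(executable="/bin/hostname", count=8, hostCount=4, maxTime=5)
print(job)
```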
Parallel Genetic Algorithm
• Purpose: improving results of GAs
• Strategy
  • Start independent searches at different locations of the solution landscape
  • Periodically exchange the fittest individuals
• Init process: job dispatching and bootstrap communication set-up
• Master process: relays communications, synchronizes the start of worker processes, collects final results, and sets up a GUI for monitoring and progress display
• Worker processes: each performs a single N-generation run
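The strategy of independent searches with periodic exchange of the fittest individuals is often called an island model. The Python sketch below illustrates it under several assumptions not found in the slides: a toy one-dimensional objective, a ring migration topology, truncation selection with Gaussian mutation, and the names `fitness`, `evolve`, and `island_ga`. The islands run in one process here, whereas the project ran them as separate workers.

```python
import random


def fitness(x):
    """Toy objective to maximize, with its peak at x = 3 (hypothetical)."""
    return -(x - 3.0) ** 2


def evolve(pop, rng, sigma=0.3):
    """One generation: keep the better half, refill with mutated copies."""
    ranked = sorted(pop, key=fitness, reverse=True)
    parents = ranked[: max(1, len(ranked) // 2)]
    children = [rng.choice(parents) + rng.gauss(0.0, sigma)
                for _ in range(len(pop) - len(parents))]
    return parents + children


def island_ga(islands=4, size=10, generations=40, interval=10, seed=0):
    """Run several populations; every `interval` generations migrate the best."""
    rng = random.Random(seed)
    width = 20.0 / islands
    # Each island starts in its own region of the search space [-10, 10].
    pops = [[rng.uniform(-10.0 + i * width, -10.0 + (i + 1) * width)
             for _ in range(size)] for i in range(islands)]
    for gen in range(1, generations + 1):
        pops = [evolve(p, rng) for p in pops]
        if gen % interval == 0:
            bests = [max(p, key=fitness) for p in pops]
            for i, p in enumerate(pops):
                # Ring topology: replace the worst with the neighbour's best.
                worst = min(range(len(p)), key=lambda k: fitness(p[k]))
                p[worst] = bests[(i - 1) % islands]
    return max((x for p in pops for x in p), key=fitness)
```

Between migrations the islands share no state, which is why each one maps naturally onto a worker process that communicates only every `interval` generations.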
Parallel Genetic Algorithm (cont’d)
• Language: C and C++
• Modules
  • Communication: Nexus (RPC)
  • Job submission: GRAM
  • Thread creation: Globus_Common
• Preliminary results
  • Parallel algorithm achieves results that are 8-17% better than the sequential algorithm
Parallel Genetic Algorithm (cont’d)
• Problems
  • Start-up
    • Environment and path setting
    • Obtaining certificates
    • Who is responsible for Globus on DAS-2?
    • Different versions of Globus (1.1.3 versus 2.0 beta)
  • ‘Real’ problems
    • Shared libraries are not installed on the nodes
    • Delegating proxies
    • Information about resource availability is static or not present
    • Globus 2.0 is a beta version: things not implemented or missing
Parallel Genetic Algorithm (cont’d)
• Suggestions for improvement:
  • Default Globus environment
  • Globus libraries on nodes via
    • NFS partition
    • Symbolic link to the ‘strange’ globus-edg beta 2.1 names
  • ‘fork’ is the default jobmanager, which only ‘schedules’ jobs to the local file server (adding PBS makes the code dependent on this scheduler)
  • Installation of a cluster monitor better than beowulf
  • Examples and makefiles
Conclusions
• Seminar quite successful
• DAS-2
  • Great environment for teaching purposes
  • Start-up problems
  • Current setting not optimal
  • Who is responsible for DAS-2?
  • Who determines policies, implementations?
• Globus
  • Documentation, examples (probably better with current training material on globus.org)
  • Installation not trivial
• IBM
  • Pre-sales OK, after-sales???
Thanks
Many thanks to David Groep, who helped our students many, many times without any hesitation! Great job!