
Job Dispatch and Termination Performance: Agent Teamwork vs. Globus/OpenPBS




Presentation Transcript


  1. AgentTeamwork Technology Terminology

Grid vs. Cluster: A computing grid is commonly distinguished from a computing cluster by the geographic distance between members. A cluster is a group of computers in the same room or building, connected to the same physical network, while the members of a grid can be located anywhere and may be connected over several different networks.

Platform: I define an HPDC (high performance distributed computing) platform as software that provides infrastructure and scheduling services. Infrastructure services include authentication and authorization, job submission, and file transfer for job deployment. Scheduling services include dynamic resource identification and allocation, scheduling policies, and coordinating job execution.

Framework: I define a framework as a related set of software libraries that are used to write software in a particular programming model. The Single Program Multiple Data (SPMD) programming model is commonly used to achieve data-level parallelism in HPDC. MPIJava is a Java implementation of the Message Passing Interface standard, which provides a framework for programming in the SPMD model.

Agent Teamwork: AgentTeamwork is a mobile-agent-based job coordination system that targets a mixture of computing nodes, some directly connected to the public Internet, and others simply clustered in a private IP domain but not managed by a commodity job scheduler. [1]

Globus Toolkit: The Globus Toolkit is an open source software toolkit used for building Grid systems and applications. [2]

OpenPBS: OpenPBS is the original version of the Portable Batch System. It is a flexible batch queueing system developed for NASA in the early to mid-1990s. [3]
The purpose of the OpenPBS system is to provide additional controls over initiating or scheduling execution of batch jobs, and to allow routing of those jobs between different hosts. [4]

Message Passing Interface (MPI): MPI is a library specification for message passing, proposed as a standard by a broadly based committee of vendors, implementors, and users. MPI was designed for high performance on both massively parallel machines and workstation clusters. [5]

MPICH-G2: A grid-enabled implementation of the MPI v1.1 standard. Using services from the Globus Toolkit (e.g., job startup, security), MPICH-G2 allows you to couple multiple machines, potentially of different architectures, to run MPI applications. [6]

MPIJava: mpiJava is an object-oriented Java interface to the standard Message Passing Interface (MPI). [7]

[1] Munehiro Fukuda, Fault-Tolerant Job Execution over Multi-Clusters using Mobile Agents, gca07.pdf
[2] http://www.globus.org/
[3] http://www.openpbs.org/about.html
[4] Overview of the OpenPBS, http://www.openpbs.org/overview.html
[5] What is MPI, http://www-unix.mcs.anl.gov/mpi/
[6] What is MPICH-G2, http://www3.niu.edu/mpi/
[7] http://www.hpjava.org/mpiJava.html

Overview

Job Dispatch and Termination Performance: Agent Teamwork vs. Globus/OpenPBS

My goal as a research assistant was to evaluate Agent Teamwork’s “Job Dispatch & Termination” and “Framework” performance against a contemporary alternative.

Job Dispatch & Termination Evaluation: I built a reference platform to compare Agent Teamwork against by integrating the Globus Toolkit with the OpenPBS scheduler and the MPICH-G2 MPI framework.

Framework Function Evaluation: To evaluate framework performance, I wrote three benchmark programs in both the Agent Teamwork MPI framework and the MPIJava framework and compared their runtimes.
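To make the SPMD model behind these benchmarks concrete, here is a minimal Python sketch; it is not taken from the benchmark programs themselves. Every rank runs the same function on its own slice of the data, and the partial results are then combined. Ranks are emulated with a plain loop rather than an MPI library, and the names `local_sum` and `run_spmd` are hypothetical.

```python
def local_sum(rank, size, data):
    """Same program for every rank: sum only the elements this rank owns."""
    # Cyclic distribution: rank r owns indices r, r + size, r + 2*size, ...
    return sum(data[rank::size])


def run_spmd(size, data):
    """Emulate `size` ranks one after another, then combine their partial sums."""
    partials = [local_sum(rank, size, data) for rank in range(size)]
    return sum(partials)  # stands in for an MPI-style reduce step


data = list(range(1, 101))
total = run_spmd(4, data)
print(total)
```

In a real MPI framework each rank would be a separate process, possibly on a different node, and the final combination would be a message-passing reduce instead of a local `sum`.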
Evaluation of Agent Teamwork
High Performance Distributed Computing Middleware
Solomon Lane
Agent Teamwork Research Assistant
October 2006 – March 2007

Results: These graphs compare job dispatch and termination time when submitting a test program to different numbers of cluster nodes in either a depth-first or breadth-first distribution. Agent Teamwork’s job dispatch and termination performance was comparable with the reference platform in the depth-first distribution, and Agent Teamwork outperformed the reference platform with a large number of nodes in the breadth-first distribution.

Reference Platform Overview

The Clusters: The following tables describe the hardware that was used. There were a total of 66 machines divided into two clusters.

Framework Execution Performance: Agent Teamwork vs. MPIJava Framework

Results: Currently, two of the Agent Teamwork versions of the benchmark programs cannot be run across the clusters due to outstanding bugs in the framework. One of the benchmark programs, Wave2D, was able to run on a limited number of nodes. The graphs to the right show these partial results, which indicate that the Agent Teamwork version is at least one order of magnitude slower than MPIJava. At this point, however, framework debugging is ongoing.

To run a job, you generate a job definition file using the Resource Specification Language (RSL) and submit it along with your user certificate using globusrun. The GRAM client submits the job to a gatekeeper on the cluster head, which uses the GSI to authenticate and authorize the job submission. The gatekeeper then starts a job manager, which issues a callback to the GRAM client to connect standard error and standard output back to the client. The job manager then submits the job details to the PBS Server. The PBS Scheduler selects appropriate nodes from the cluster and transfers the executable to the PBS mom on the cluster nodes. The PBS mom launches the application.

Applications are written in the MPICH-G2 framework, which uses the grid infrastructure to coordinate the parallel execution.
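As a rough illustration of the submission step on the reference platform, the following Python sketch assembles a simple RSL job description and the matching globusrun command line. The host name, executable path, and node count are hypothetical placeholders, and the `jobmanager-pbs` contact suffix assumes the gatekeeper exposes its PBS job manager under that name; the poster does not show the RSL that was actually used. The sketch also assumes attribute values contain no spaces, so no RSL quoting is attempted. It only builds and prints the command; it does not run it.

```python
import shlex


def build_rsl(attributes):
    """Render a dict of RSL attributes as a single-request RSL string."""
    # Assumes values contain no spaces or quotes (no RSL quoting is done here).
    clauses = [f"({name}={value})" for name, value in attributes.items()]
    return "&" + "".join(clauses)


def build_globusrun_command(contact, rsl):
    """Return the argv list that would submit `rsl` to the gatekeeper at `contact`."""
    # -r names the target resource; -o asks for job output to be sent back to the client.
    return ["globusrun", "-o", "-r", contact, rsl]


# Hypothetical job: run an MPI test program on 4 nodes.
job = {
    "executable": "/home/user/bin/test_program",
    "count": 4,
    "jobType": "mpi",
}
rsl = build_rsl(job)
command = build_globusrun_command("head.example.edu/jobmanager-pbs", rsl)

# shlex.join quotes the RSL so the printed line is safe to paste into a shell.
print(shlex.join(command))
```

Keeping the command as an argument list, rather than a single string, avoids shell-quoting problems with the `&` and parentheses in the RSL if the command is later passed to something like `subprocess.run`.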
