AgentTeamwork : Mobile-Agent-Based Middleware for Distributed Job Coordination

Munehiro Fukuda
Computing & Software Systems, University of Washington, Bothell

Outline
• Introduction
• Execution Model
• System Design
• Performance Evaluation
• Related Work
• Conclusions

1. Introduction
• Why Grid Computing
• Background
• Objective
• Project Overview

Why Grid Computing
• Textbooks say:
  • Only 30% CPU utilization
  • Only episodic job requirements
  • Anyone and anywhere like a power grid
• Many research prototypes and commercial products:
  • Globus, Condor, Legion (Avaki), NetSolve, Ninf, Entropia PCGrid, Sun Grid Engine, etc.
• Then, have you ever used them?
  • Probably not so many of you.
• What is a big hurdle?
  • You don’t need it anyway. Or, what?

Background: Most Grid Systems
• Functional viewpoints:
  • Centralized resource/job management
  • Two drawbacks
    • A powerful central server is essential to manage all slave computing nodes.
    • Applications are based on the master-slave or parameter-sweep model.
  • Our motivation
    • Decentralized job distribution, coordination, and fault tolerance
    • Applications based on a variety of communication models
• Practical viewpoints:
  • Systems dedicated to large institutions/companies
  • Two drawbacks
    • A lot of installation work is required under the root account.
    • Groups of individual computer owners are not targeted.
  • Our motivation
    • Easy participation in grid computing and easy installation

Background: How to Pursue Our Motivation
• Use of mobile agents
  • We are experts in mobile agents.
• Most mobile agents
  • An execution model previously highlighted as a prospective infrastructure for distributed systems.
  • No more than an alternative approach to implementing centralized grid middleware.
• Our initial goal
  • Decentralized middleware design with mobile agents

Objective
• A mobile agent execution platform fitted to grid computing
  • Allowing an agent to identify which MPI rank to handle and which agent to send a job snapshot to.
• Fault-tolerant inter-process communication
  • Recovering lost messages.
  • Allowing over-gateway connections.
• Agent-collaborative algorithms for job coordination
  • Allocating computing nodes in a distributed manner.
  • Implementing decentralized snapshot maintenance and job recovery.

Project Overview
• Funded by: NSF Middleware Initiative
• Sponsored by: University of Washington
• In collaboration with: Ehime University
• In a team of: UWB undergraduates

2. Execution Model
• System Overview
• Execution Layer
• Programming Interface

System Overview
[Figure: system overview. Labeled components: User A, User B, an FTP server, commander agents, resource agents, sentinel agents, and bookkeeper agents. Each user process (two for User A, one for User B) is drawn as a stack of user program wrapper, GridTCP, and snapshot methods. Arrows are labeled TCP communication, snapshots, and results.]

Execution Layer (top to bottom)
• Java user applications
• mpiJava API
  • mpiJava-S, on Java socket
  • mpiJava-A, on GridTcp
• User program wrapper
• Commander, resource, sentinel, and bookkeeper agents
• UWAgents mobile agent execution platform
• Operating systems

Programming Interface

public class MyApplication {
    public GridIpEntry ipEntry[];   // used by the GridTcp socket library
    public int funcId;              // used by the user program wrapper
    public GridTcp tcp;             // the GridTcp error-recoverable socket
    public int nprocess;            // #processors
    public int myRank;              // processor id (or MPI rank)

    public int func_0( String args[] ) {   // constructor
        MPJ.Init( args, ipEntry, tcp );    // invoke mpiJava-A
        .....;                             // more statements to be inserted
        return 1;                          // calls func_1( )
    }

    public int func_1( ) {                 // called from func_0
        if ( MPJ.COMM_WORLD.Rank( ) == 0 )
            MPJ.COMM_WORLD.Send( ... );
        else
            MPJ.COMM_WORLD.Recv( ... );
        .....;                             // more statements to be inserted
        return 2;                          // calls func_2( )
    }

    public int func_2( ) {                 // called from func_1, the last function
        .....;                             // more statements to be inserted
        MPJ.Finalize( );                   // stops mpiJava-A
        return -2;                         // application terminated
    }
}

3. System Design
• Mobile Agents
• Job Coordination
  • Distribution
  • Monitoring
  • Resumption and migration
• Programming Support
  • Language preprocessing
  • Communication check-pointing

UWAgents – Concept of Agent Domain
• An agent domain is created for each submission from the Unix shell.
• The number of children each agent can spawn is given upon the initial submission.
• No name server
• Messages are forwarded through an agent tree.
• A user job is scheduled as a thread, using suspend/resume.
[Figure: UWInject submits a new agent from the shell to a UWPlace. Two agent domains are shown, identified by (time = 3:30pm, 8/25/05; ip = medusa.uwb.edu; name = fukuda) and (time = 3:31pm, 8/25/05; ip = perseus.uwb.edu; name = fukuda). The submissions use -m 4 and -m 3, agent ids in the trees range from 0 to 12, and one agent carries User A's user job.]

UWAgents – Over Gateway Migration

Job Distribution
[Figure: agent tree spawned upon a job submission from the user. Labels on the arrows: Job Submission, XML Query (to eXist), Spawn, snapshot.]
• Commander: id 0
• Resource: id 1
• Sensor: ids 4 and 5
• Sentinel: id 2 (rank 0); ids 8 to 11 (ranks 1 to 4); ids 32 to 34 (ranks 5 to 7)
• Bookkeeper: id 3 (rank 0); ids 12 to 15 (ranks 1 to 4); ids 48 to 50 (ranks 5 to 7)
• Legend: id = agent id, rank = MPI rank
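The agent ids listed for this slide are consistent with a simple numbering rule: if each agent may spawn at most m children (m = 4 here), the i-th child of agent p receives id p × m + i. The sketch below only illustrates that reading of the figure. The rule is inferred from the ids shown, and the class and method names are hypothetical, not part of UWAgents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: agent-id numbering inferred from the ids on the slide.
final class AgentIdSketch {
    // Ids of the children that agent parentId may spawn, given at most
    // maxChildren children per agent (the -m value at submission).
    static List<Integer> childIds(int parentId, int maxChildren) {
        List<Integer> ids = new ArrayList<>();
        // The root (id 0) skips i = 0, since 0 * m + 0 would be its own id.
        int first = (parentId == 0) ? 1 : 0;
        for (int i = first; i < maxChildren; i++) {
            ids.add(parentId * maxChildren + i);
        }
        return ids;
    }

    // Under the same assumption, a parent id follows by integer division.
    static int parentId(int childId, int maxChildren) {
        return childId / maxChildren;
    }

    public static void main(String[] args) {
        System.out.println("children of commander 0: " + childIds(0, 4));
        System.out.println("children of sentinel 2:  " + childIds(2, 4));
        System.out.println("parent of bookkeeper 50: " + parentId(50, 4));
    }
}
```

With such a rule an agent can locate its parent and children from ids alone, which fits the "no name server" design stated for agent domains.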

Resource Allocation
• Upon a job submission, the commander (id 0) and the resource agent (id 1) obtain a list of available nodes from eXist through an XML query.
• XML query fields: CPU architecture, OS, memory, disk, total nodes, multiplier
• Number of nodes requested: total nodes x multiplier
• Case 1: total nodes = 2, multiplier = 1.5
• Case 2: total nodes = 2, multiplier = 3
[Figure: in both cases the spawned sentinels (id 2, rank 0; id 8, rank 1) and bookkeepers are mapped onto the returned nodes (Node 0 to Node 5); the remaining nodes are marked "Future use".]
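The request size in the two cases follows directly from "total nodes x multiplier". A minimal worked sketch is given below; rounding up to a whole node is an assumption (both cases on the slide give whole numbers), and the class and method names are hypothetical.

```java
// Hypothetical sketch: number of nodes requested from the resource agent.
final class NodeRequestSketch {
    // total nodes x multiplier, rounded up to a whole node (assumed rounding).
    static int nodesToRequest(int totalNodes, double multiplier) {
        return (int) Math.ceil(totalNodes * multiplier);
    }

    public static void main(String[] args) {
        System.out.println("Case 1: " + nodesToRequest(2, 1.5) + " nodes");
        System.out.println("Case 2: " + nodesToRequest(2, 3.0) + " nodes");
    }
}
```

Case 1 therefore asks for 3 nodes and Case 2 for 6 nodes, which matches the six nodes (Node 0 to Node 5) drawn in the figure.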

Resource Monitoring
• Current restrictions
  • Minimum interval: 3 secs
  • Static distribution of sensor agents
• Future extensions
  • Sensor migration
  • Use of NWS at each site
[Figure: the commander (id 0) sends a resource request to the resource agent (id 1), which sends an XML query to eXist and returns a list of available nodes. The resource agent spawns sensors id 4 and id 5, which in turn spawn sensors id 16 to 19 and id 20 to 23. Sensors measure links with ttcp and report performance data.]

Job Resumption by a Parent Sentinel
(0) Send a new snapshot periodically
(1) Detect a ping error
(2) Search for the latest snapshot
(3) Retrieve the snapshot
(4) Send a new agent
(5) Restart a user program
[Figure: sentinels id 2 (rank 0), id 8 (rank 1), id 9 (rank 2), id 10 (rank 3), and id 11 (rank 4) linked by MPI connections, with bookkeeper id 15 (rank 4). A new sentinel id 11 (rank 4) replaces the failed one.]

Job Resumption by a Child Sentinel
(1) No pings for 8 * 5 (= 40 sec)
(2) Search for the latest snapshot
(3) Search for the latest snapshot
(4) Retrieve the snapshot
(5) Send a new agent
(6) No pings for 2 * 5 (= 10 sec)
(7) Search for the latest snapshot
(8) Search for the latest snapshot
(9) Retrieve the snapshot
(10) Send a new agent
(11) Detect a ping error
(12) Restart a new resource agent from its beginning
(13) Detect a ping error and follow the same child resumption procedure as in Job Resumption by a Parent Sentinel.
• Also labeled in the figure: No pings for 12 * 5 (= 60 sec)
[Figure: commander id 0, resource id 1, sentinel id 2 (rank 0), bookkeeper id 3 (rank 0), sentinel id 8 (rank 1), and bookkeeper id 12 (rank 1); the commander, the resource agent, and sentinel id 2 appear again as newly restarted agents.]
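The three timeouts on this slide (40, 10, and 60 seconds) are written as 8 * 5, 2 * 5, and 12 * 5, and the multipliers coincide with the ids of sentinel 8, sentinel 2, and bookkeeper 12. One plausible reading, offered here only as an assumption, is that each agent waits (its id x 5) seconds without pings before starting a resumption, so agents react at staggered times. The sketch below encodes that reading with hypothetical names.

```java
// Hypothetical sketch: id-scaled ping timeout, inferred from the slide's
// "8 * 5", "2 * 5", and "12 * 5" labels. Not a documented AgentTeamwork rule.
final class PingTimeoutSketch {
    static final int TIMEOUT_UNIT_SEC = 5;

    // Seconds an agent waits without pings before it acts (assumed rule).
    static int timeoutSec(int agentId) {
        return agentId * TIMEOUT_UNIT_SEC;
    }

    // True once the silence since the last ping reaches the agent's timeout.
    static boolean shouldResume(int agentId, long lastPingMillis, long nowMillis) {
        return nowMillis - lastPingMillis >= timeoutSec(agentId) * 1000L;
    }

    public static void main(String[] args) {
        for (int id : new int[] {2, 8, 12}) {
            System.out.println("agent " + id + " waits " + timeoutSec(id) + " sec");
        }
    }
}
```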

User Program Wrapper

Source Code
    statement_1;
    statement_2;
    statement_3;
    check_point( );
    statement_4;
    statement_5;
    statement_6;
    check_point( );
    statement_7;
    statement_8;
    statement_9;
    check_point( );

Preprocessed
    func_0( ) {
        statement_1;
        statement_2;
        statement_3;
        return 1;
    }
    func_1( ) {
        statement_4;
        statement_5;
        statement_6;
        return 2;
    }
    func_2( ) {
        statement_7;
        statement_8;
        statement_9;
        return -2;
    }

User Program Wrapper
    int func_id = 0;                  // start from func_0
    while ( func_id != -2 ) {         // -2 means the application terminated
        switch ( func_id ) {
        case 0: func_id = func_0( ); break;
        case 1: func_id = func_1( ); break;
        case 2: func_id = func_2( ); break;
        }
    }
    check_point( ) {
        // save this object
        // including func_id
        // into a file
    }
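The dispatch loop and check_point( ) above can be sketched as a small runnable program. This is a minimal illustration under stated assumptions, not the actual wrapper: the snapshot is taken with standard Java serialization into a byte array instead of a file, a snapshot is assumed to be taken after each function returns, and WrapperSketch, MyApp, run, and restore are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch of a wrapper that dispatches func_N and checkpoints.
final class WrapperSketch {
    static class MyApp implements Serializable {
        int funcId = 0;   // next function to run; saved with each snapshot
        int sum = 0;      // stands in for the application state
        int func_0() { sum += 1;   return 1; }
        int func_1() { sum += 10;  return 2; }
        int func_2() { sum += 100; return -2; }  // -2: application terminated
    }

    // Save the whole object, including funcId (in memory here, not a file).
    static byte[] checkPoint(MyApp app) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(app);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static MyApp restore(byte[] snapshot) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(snapshot))) {
            return (MyApp) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Run at most maxSteps functions; return the latest snapshot.
    static byte[] run(MyApp app, int maxSteps) {
        byte[] snapshot = checkPoint(app);
        for (int step = 0; step < maxSteps && app.funcId != -2; step++) {
            switch (app.funcId) {
                case 0: app.funcId = app.func_0(); break;
                case 1: app.funcId = app.func_1(); break;
                case 2: app.funcId = app.func_2(); break;
                default: throw new IllegalStateException("bad funcId " + app.funcId);
            }
            snapshot = checkPoint(app);  // assumed: snapshot after each function
        }
        return snapshot;
    }

    public static void main(String[] args) {
        byte[] snapshot = run(new MyApp(), 1);   // "crash" after func_0
        MyApp resumed = restore(snapshot);       // resume elsewhere
        run(resumed, Integer.MAX_VALUE);
        System.out.println("funcId=" + resumed.funcId + " sum=" + resumed.sum);
    }
}
```

Because funcId is part of the saved object, a resumed copy continues at the function that follows the last completed one rather than from the start.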

Preprocessor and Drawbacks

Source Code
    statement_1;
    statement_2;
    statement_3;
    check_point( );
    while (…) {
        statement_4;
        if (…) {
            statement_5;
            check_point( );
            statement_6;
        }
        else
            statement_7;
        statement_8;
    }
    check_point( );

Preprocessed Code
    int func_0( ) {
        statement_1;
        statement_2;
        statement_3;
        return 1;
    }
    int func_1( ) {              // before check_point( ) in if-clause
        while (…) {
            statement_4;
            if (…) {
                statement_5;
                return 2;
            }
            else
                statement_7;
            statement_8;
        }
    }
    int func_2( ) {              // after check_point( ) in if-clause
        statement_6;
        statement_8;
        while (…) {
            statement_4;
            if (…) {
                statement_5;
                return 2;
            }
            else
                statement_7;
            statement_8;
        }
    }

• No recursions
• Useless source line numbers indicated upon errors
• Explicit snapshot points are still needed.

GridTcp – Check-Pointed Connection
• Outgoing packets are saved in a backup queue.
• All packets are serialized in a backup file at every checkpoint.
• Upon a migration
  • Packets are de-serialized from a backup file.
  • Backup packets are restored in the outgoing queue.
  • The IP table is updated.
[Figure: user program wrappers at n1.uwb.edu, n2.uwb.edu, and n3.uwb.edu, each holding a user program with incoming, outgoing, and backup queues, connected over TCP under snapshot maintenance. The rank/ip table changes from (1, n1.uwb.edu), (2, n2.uwb.edu) to (1, n1.uwb.edu), (2, n3.uwb.edu) after rank 2 moves to n3.uwb.edu.]
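The backup-queue bullets above can be sketched as follows. This is an illustration under assumptions, not GridTcp itself: the backup "file" is a byte array, packets are plain byte arrays, trimming of acknowledged packets is omitted because the slide does not describe it, and all names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;

// Hypothetical sketch of an outgoing queue with a check-pointed backup.
final class BackupQueueSketch {
    private final Deque<byte[]> outgoing = new ArrayDeque<>();
    private final ArrayList<byte[]> backup = new ArrayList<>();

    // Every outgoing packet is also saved in the backup queue.
    void send(byte[] packet) {
        outgoing.add(packet);
        backup.add(packet);
    }

    // Hand the next packet to the network; null if nothing is pending.
    byte[] transmitNext() {
        return outgoing.poll();
    }

    int pending() {
        return outgoing.size();
    }

    // At a checkpoint, serialize all backup packets (to memory here).
    byte[] checkpoint() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(backup);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // After a migration, de-serialize and refill the outgoing queue.
    @SuppressWarnings("unchecked")
    static BackupQueueSketch restore(byte[] backupFile) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(backupFile))) {
            ArrayList<byte[]> packets = (ArrayList<byte[]>) in.readObject();
            BackupQueueSketch queue = new BackupQueueSketch();
            for (byte[] packet : packets) {
                queue.send(packet);  // back into both outgoing and backup
            }
            return queue;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A restored queue resends everything in the backup, so a real implementation would also need the receiver to discard duplicates or the sender to drop acknowledged packets; neither is shown on the slide.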

GridTcp – Over-Gateway Connection
• RIP-like connection
• Restriction: each node name must be unique.
[Figure: four user program wrappers, each running a user program, on mnode0 (rank 0), medusa.uwb.edu (rank 1), uw1-320.uwb.edu (rank 2), and uw1-320-00 (rank 3).]

MPJ Package
• MPJ: Init( ), Rank( ), Size( ), and Finalize( )
• Communicator: all communication functions: Send( ), Recv( ), Gather( ), Reduce( ), etc.
  • JavaComm (mpiJava-S): uses Java sockets and server sockets.
  • GridComm (mpiJava-A): uses GridTcp sockets.
  • InputStream for each rank
  • OutputStream for each rank
  • Use a permanent 64K buffer for serialization
  • Emulate collective communication by sending the same data to each OutputStream, which deteriorates performance
• DataType: MPJ.INT, MPJ.LONG, etc.
• MPJMessage: getStatus( ), getMessage( ), etc.
• Op: Operate( ), etc.
• Other utilities
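The note on emulated collective communication can be made concrete with a short sketch: a broadcast written as one write per destination rank. This only illustrates the technique named on the slide; the class and method names are hypothetical and the per-rank streams here are plain java.io streams, not GridTcp or mpiJava objects.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch: broadcast emulated with per-rank OutputStreams.
final class BroadcastSketch {
    // Write the same data to every rank's stream except the sender's own.
    // Returns the number of point-to-point writes performed.
    static int broadcast(byte[] data, OutputStream[] perRank, int myRank) {
        int writes = 0;
        for (int rank = 0; rank < perRank.length; rank++) {
            if (rank == myRank) {
                continue;  // no need to send to ourselves
            }
            try {
                perRank[rank].write(data);
                perRank[rank].flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            writes++;
        }
        return writes;
    }
}
```

The sender performs one full write per peer, so the cost of a collective call grows linearly with the number of ranks, which is the performance penalty the slide points out.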

Sentinel Agent
[Figure: internals of a sentinel agent running on UWPlace (the UWAgent execution platform). Threads: Main Thread, SendSnapshot Thread, ReceiveMsg Thread, TCPError Thread. The user program wrapper holds the rank/ip table (1: n1.uwb.edu; 2: n2.uwb.edu, changing to n3.uwb.edu), the user program, and the incoming, outgoing, and backup queues over TCP. Snapshots go to the bookkeeper agent. A resumed sentinel agent sends a restart message carrying a new rank/ip pair. Other labels: MPI Connection, MPI Job Execution.]

4. Performance Evaluation
• Evaluation environment:
  • An 8-node Myrinet-2000 cluster: 2.8GHz Pentium4-Xeon w/ 512MB
  • A 24-node Giga-Ethernet cluster: 3.4GHz Pentium4-Xeon w/ 512MB
• Computation Granularity
• Java Grande MPJ Benchmark
• Process Resumption Overhead

MPJ.Send and Recv Performance

Computational Granularity 1

Performance Evaluation - Series

Performance Evaluation - RayTracer

Performance Evaluation – MolDyn

Overhead of Job Resumption

5. Related Work
From the viewpoints of:
• System Architecture
• Mobile Agents
• Fault Tolerance

System Architecture
• Difference from Catalina/J-SEAL2
  • They are not fully implemented.
  • They are based on a master-slave model.

Mobile Agents

Fault Tolerance

6. Conclusions
• Project Summary
• Next Two Years

Project Summary
• Our focus
  • A decentralized job execution and fault-tolerant environment
  • Applications not restricted to the master-slave or parameter-sweeping model.
• Applications
  • 40,000 doubles x 10,000 floating-point operations
  • Moderate data transfer combined with massive/collective communication
  • At least three times larger than its computational granularity
• Current status
  • UWAgent: completed
  • Agent behavioral design: basic job deployment/resumption implemented
  • User program wrapper: completed except security feature
  • GridTcp/mpiJava: in testing
  • Preprocessor: in design

Next Two Years
• Application support
  • Preprocessor implementation
  • Efficient input/output file transfer
  • Security enhancement in remote execution
  • GUI improvement
• Agent algorithms
  • Over-gateway application deployment
  • Dynamic resource monitoring
  • Priority-based agent migration
• Performance evaluation
• Dissemination

Questions?
