GRID superscalar: a programming paradigm for GRID applications CEPBA-IBM Research Institute Rosa M. Badia, Jesús Labarta, Josep M. Pérez, Raül Sirvent
Outline • Objective • The essence • User’s interface • Automatic code generation • Run-time features • Programming experiences • Ongoing work • Conclusions
Objective • Ease the programming of GRID applications • Basic idea: apply at the Grid level (tasks lasting seconds, minutes or hours) the superscalar concepts that processors apply at the nanosecond level
Outline • Objective • The essence • User’s interface • Automatic code generation • Run-time features • Programming experiences • Ongoing work • Conclusions
The essence • Assembly language for the GRID • Simple sequential programming, well defined operations and operands • C/C++, Perl, … • Automatic run time “parallelization” • Use architectural concepts from microprocessor design • Instruction window (DAG), Dependence analysis, scheduling, locality, renaming, forwarding, prediction, speculation,…
The essence
• Simple sequential code; the operands are input/output files:

for (int i = 0; i < MAXITER; i++) {
    newBWd = GenerateRandom();
    subst(referenceCFG, newBWd, newCFG);
    dimemas(newCFG, traceFile, DimemasOUT);
    post(newBWd, DimemasOUT, FinalOUT);
    if (i % 3 == 0) Display(FinalOUT);
}
fd = GS_Open(FinalOUT, R);
printf("Results file:\n");
present(fd);
GS_Close(fd);
The essence
(Figure: at run time the main program unrolls into a task graph of Subst, DIMEMAS, EXTRACT and Display instances; independent tasks, such as the Subst/DIMEMAS chains of different iterations, execute concurrently on the CIRI Grid, while GS_open waits for the final result.)
Outline • Objective • The essence • User’s interface • Automatic code generation • Run-time features • Programming experiences • Ongoing work • Conclusions
User’s interface • Three components: • Main program • Subroutines/functions • Interface Definition Language (IDL) file • Programming languages: C/C++, Perl
User’s interface
• A typical sequential program
• Main program:

for (int i = 0; i < MAXITER; i++) {
    newBWd = GenerateRandom();
    subst(referenceCFG, newBWd, newCFG);
    dimemas(newCFG, traceFile, DimemasOUT);
    post(newBWd, DimemasOUT, FinalOUT);
    if (i % 3 == 0) Display(FinalOUT);
}
fd = GS_Open(FinalOUT, R);
printf("Results file:\n");
present(fd);
GS_Close(fd);
User’s interface
• A typical sequential program
• Subroutines/functions:

void dimemas(in File newCFG, in File traceFile, out File DimemasOUT)
{
    char command[500];
    putenv("DIMEMAS_HOME=/usr/local/cepba-tools");
    sprintf(command, "/usr/local/cepba-tools/bin/Dimemas -o %s %s",
            DimemasOUT, newCFG);
    GS_System(command);
}

void display(in File toplot)
{
    char command[500];
    sprintf(command, "./display.sh %s", toplot);
    GS_System(command);
}
User’s interface
• GRID superscalar programming requirements
• Main program: open/close files with
  • GS_FOpen, GS_Open, GS_FClose, GS_Close
  • Currently required; next versions will implement a version of the C library functions with GRID superscalar semantics
• Subroutines/functions
  • Temporary files on the local directory, or ensure uniqueness of names per subroutine invocation
  • GS_System instead of system
  • All required input/output files must be passed as arguments
User’s interface
• Gridifying the sequential program
• CORBA IDL-like interface:
  • In/Out/InOut files
  • Scalar values (in or out)
• The subroutines/functions listed in this file will be executed on a remote server in the Grid:

interface MC {
    void subst(in File referenceCFG, in double newBW, out File newCFG);
    void dimemas(in File newCFG, in File traceFile, out File DimemasOUT);
    void post(in File newCFG, in File DimemasOUT, inout File FinalOUT);
    void display(in File toplot);
};
Outline • Objective • The essence • User’s interface • Automatic code generation • Run-time features • Programming experiences • Ongoing work • Conclusions
Automatic code generation
• gsstubgen reads app.idl and generates:
  • Client side: app-stubs.c, app.h, app_constraints.cc, app_constraints_wrapper.cc, app_constraints.h
  • Server side: app-worker.c
  • app.xml
• The user writes app.c (main program) and app-functions.c
Sample stubs file

#include <stdio.h>
…
int gs_result;

void Subst(file referenceCFG, double seed, file newCFG)
{
    /* Marshalling/demarshalling buffers */
    char *buff_seed;

    /* Allocate buffers */
    buff_seed = (char *)malloc(atoi(getenv("GS_GENLENGTH")) + 1);

    /* Parameter marshalling */
    sprintf(buff_seed, "%.20g", seed);

    Execute(SubstOp, 1, 1, 1, 0, referenceCFG, buff_seed, newCFG);

    /* Deallocate buffers */
    free(buff_seed);
}
…
Sample worker main file

#include <stdio.h>
…
int main(int argc, char **argv)
{
    enum operationCode opCod = (enum operationCode)atoi(argv[2]);

    IniWorker(argc, argv);
    switch (opCod) {
    case SubstOp:
        {
            double seed;
            seed = strtod(argv[4], NULL);
            Subst(argv[3], seed, argv[5]);
        }
        break;
    …
    }
    EndWorker(gs_result, argc, argv);
    return 0;
}
Sample constraints skeleton file

#include "mcarlo_constraints.h"
#include "user_provided_functions.h"

string Subst_constraints(file referenceCFG, double seed, file newCFG)
{
    string constraints = "";
    return constraints;
}

double Subst_cost(file referenceCFG, double seed, file newCFG)
{
    return 1.0;
}
…
Sample constraints wrapper file (1)

#include <stdio.h>
…
typedef ClassAd (*constraints_wrapper)(char **_parameters);
typedef double (*cost_wrapper)(char **_parameters);

// Prototypes
ClassAd Subst_constraints_wrapper(char **_parameters);
double Subst_cost_wrapper(char **_parameters);
…

// Function tables
constraints_wrapper constraints_functions[4] = {
    Subst_constraints_wrapper,
    …
};

cost_wrapper cost_functions[4] = {
    Subst_cost_wrapper,
    …
};
Sample constraints wrapper file (2)

ClassAd Subst_constraints_wrapper(char **_parameters)
{
    char **_argp;

    // Generic buffers
    char *buff_referenceCFG;
    char *buff_seed;

    // Real parameters
    char *referenceCFG;
    double seed;

    // Read parameters
    _argp = _parameters;
    buff_referenceCFG = *(_argp++);
    buff_seed = *(_argp++);

    // Datatype conversion
    referenceCFG = buff_referenceCFG;
    seed = strtod(buff_seed, NULL);

    string _constraints = Subst_constraints(referenceCFG, seed);

    ClassAd _ad;
    ClassAdParser _parser;
    _ad.Insert("Requirements", _parser.ParseExpression(_constraints));

    // Free buffers

    return _ad;
}
Sample constraints wrapper file (3)

double Subst_cost_wrapper(char **_parameters)
{
    char **_argp;

    // Generic buffers
    char *buff_referenceCFG;
    char *buff_seed;

    // Real parameters
    char *referenceCFG;
    double seed;

    // Read parameters
    _argp = _parameters;
    buff_referenceCFG = *(_argp++);
    buff_seed = *(_argp++);

    // Datatype conversion
    referenceCFG = buff_referenceCFG;
    seed = strtod(buff_seed, NULL);

    double _cost = Subst_cost(referenceCFG, seed);

    // Free buffers

    return _cost;
}
…
Binary building
• Client binary: app.c, app-stubs.c, app_constraints.cc and app_constraints_wrapper.cc linked with the GRID superscalar runtime (GT2 client)
• Server binaries: app-worker.c linked with app-functions.c, one per server
• GT2 services used: gsiftp, gram
Call sequence without GRID superscalar
• app.c calls app-functions.c directly; everything runs on the LocalHost
Call sequence with GRID superscalar
• LocalHost: app.c calls app-stubs.c, which drives the GRID superscalar runtime over GT2
• RemoteHost: app-worker.c calls app-functions.c; app_constraints.cc and app_constraints_wrapper.cc are used for resource selection
Outline • Objective • The essence • User’s interface • Automatic code generation • Run-time features • Programming experiences • Ongoing work • Conclusions
Run-time features • Previous prototype over Condor and MW • Current prototype over Globus 2.x, using the API • File transfer, security, … provided by Globus • Run-time implemented primitives • GS_on, GS_off • Execute • GS_Open, GS_Close, GS_FClose, GS_FOpen • GS_Barrier • Worker side: GS_System
Run-time features
• Current prototype over Globus 2.x, using the API
• File transfer, security, … provided by Globus
• Features: data dependence analysis, renaming, file forwarding, shared disks management and file transfer policy, resource brokering, task scheduling, task submission, end of task notification, results collection, explicit task synchronization, file management primitives, checkpointing at task level, deployer, exception handling
Data-dependence analysis
• Detects RaW, WaR and WaW dependencies based on file parameters
• Oriented to simulations, FET solvers, bioinformatics applications
  • Main parameters are data files
• The tasks' Directed Acyclic Graph is built from these dependencies
(Figure: chains of Subst, DIMEMAS and EXTRACT task instances linked by file dependencies, with a Display task.)
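The file-based dependence detection described above can be sketched in a few lines of C. This is an illustrative model, not the actual GRID superscalar runtime: each task records the file names it reads and writes, and a dependence between an earlier task t1 and a later task t2 is found by comparing those lists.

```c
#include <string.h>

enum dep { NO_DEP, RAW, WAR, WAW };

/* A task's file parameters, split into read and write sets
   (fixed-size arrays for simplicity of the sketch). */
struct task {
    const char *reads[4];  int nreads;
    const char *writes[4]; int nwrites;
};

static int in_list(const char *f, const char *const *l, int n) {
    for (int i = 0; i < n; i++)
        if (strcmp(f, l[i]) == 0) return 1;
    return 0;
}

/* First dependence of t2 (later task) on t1 (earlier task):
   write->read is RaW, write->write is WaW, read->write is WaR. */
enum dep dependence(const struct task *t1, const struct task *t2) {
    for (int i = 0; i < t1->nwrites; i++) {
        if (in_list(t1->writes[i], t2->reads, t2->nreads))   return RAW;
        if (in_list(t1->writes[i], t2->writes, t2->nwrites)) return WAW;
    }
    for (int i = 0; i < t1->nreads; i++)
        if (in_list(t1->reads[i], t2->writes, t2->nwrites))  return WAR;
    return NO_DEP;
}
```

In the runtime, a detected dependence becomes an edge in the task DAG; only RaW edges force ordering, while WaR and WaW are removed by renaming (next slide).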
File renaming
• WaW and WaR dependencies are avoided with renaming:

while (!end_condition()) {
    T1(…, …, "f1");
    T2("f1", …, …);
    T3(…, …, …);
}

• Each iteration rewrites "f1": T1 of one iteration has a WaR dependence on the previous T2 and a WaW dependence on the previous T1
• Renaming gives each T1 instance its own version of the file ("f1", "f1_1", "f1_2", …), removing these dependencies
(Figure: instances T1_1 … T1_N writing renamed versions "f1", "f1_1", "f1_2", … consumed by the matching T2 instances.)
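The version-naming scheme in the slide can be sketched with a hypothetical helper (the name and signature are illustrative, not part of the real runtime): every new write instance of a logical file gets a fresh physical name, and readers keep the version that was current when their task was created.

```c
#include <stdio.h>
#include <string.h>

/* Map a logical file name plus a version number to the physical
   name used on disk: version 0 keeps the original name, later
   writes get "_1", "_2", ... suffixes as in the slide's figure. */
void renamed(const char *logical, int version, char *out, size_t outlen) {
    if (version == 0)
        snprintf(out, outlen, "%s", logical);
    else
        snprintf(out, outlen, "%s_%d", logical, version);
}
```

With distinct physical names, a later T1 no longer has to wait for an earlier T2 to finish reading (WaR) or for an earlier T1 to finish writing (WaW); only true RaW edges remain.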
File forwarding
• File forwarding reduces the impact of RaW data dependencies: T2 can start consuming f1 (sent by socket) while T1 is still producing it
(Figure: T1 produces f1, which is forwarded by socket to T2.)
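The forwarding idea can be demonstrated with plain POSIX primitives (a pipe stands in for the socket; this is a self-contained illustration, not the runtime's mechanism): the consumer reads the producer's output as it is written, instead of waiting for the whole file to land on disk.

```c
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

/* "T1" (child) produces the contents of f1; "T2" (parent) starts
   reading through the pipe while T1 may still be running. Returns 0
   on success and leaves the received bytes in out. */
int forward_demo(char *out, size_t outlen) {
    int fd[2];
    if (pipe(fd) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                       /* producer side */
        close(fd[0]);
        const char *data = "f1 contents"; /* stands in for the file */
        ssize_t w = write(fd[1], data, strlen(data) + 1);
        (void)w;
        close(fd[1]);
        _exit(0);
    }
    close(fd[1]);                          /* consumer side */
    ssize_t n = read(fd[0], out, outlen);
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return n > 0 ? 0 : -1;
}
```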
File transfer policy
(Figure: client and servers keep separate working directories; input files f1 and f4 are transferred from the client to server1 for T1, and file f7 is transferred to server2 for T6.)
Shared working directories
(Figure: when server1 and server2 share a working directory, file f7 produced by T1 is used by T6 without an explicit transfer.)
Shared input disks
(Figure: client, server1 and server2 all mount the same input directories, so input files need not be transferred.)
Disks configuration file
• Shared directories:

khafre.cepba.upc.es SharedDisk0 /app/DB/input_data
kandake0.cepba.upc.es SharedDisk0 /usr/DB/inputs
kandake1.cepba.upc.es SharedDisk0 /usr/DB/inputs

• Working directories:

kandake0.cepba.upc.es DiskLocal0 /home/ac/rsirvent/matmul-perl/worker_perl
kandake1.cepba.upc.es DiskLocal0 /home/ac/rsirvent/matmul-perl/worker_perl
khafre.cepba.upc.es DiskLocal1 /home/ac/rsirvent/matmul_worker/worker
Resource Broker
• Resource brokering: currently not a main project goal
• Interface between run-time and broker
• A Condor resource ClassAd is built for each resource
• Broker configuration file (fields: Machine, LimitOfJobs, Queue, WorkingDirectory, Arch, OpSys, GFlops, Mem, NCPUs, SoftNameList):

khafre.cepba.upc.es 3 none /home/ac/rsirvent/DEMOS/mcarlo i386 Linux 1.475 2587 4 Perl560 Dimemas23
kadesh.cepba.upc.es 0 short /user1/uni/upc/ac/rsirvent/DEMOS/mcarlo powerpc AIX 1.5 8000 16 Perl560 Dimemas23
kandake.cepba.upc.es /home/ac/rsirvent/McarloClAds workers localhost
Resource selection (1)
• Cost and constraints are specified by the user, per IDL task
• The cost (time) of each task instance is estimated:

double Dimem_cost(file cfgFile, file traceFile)
{
    double time;
    time = (GS_Filesize(traceFile)/1000000) * f(GS_GFlops());
    return(time);
}

• A task ClassAd is built at run time for each task instance:

string Dimem_constraints(file cfgFile, file traceFile)
{
    return "(member(\"Dimemas\", other.SoftNameList))";
}
Resource selection (2)
• Broker receives requests from the run-time
• The ClassAd library is used to match resource ClassAds against task ClassAds
• If more than one resource matches, the one minimizing FT + ET is selected, where:
  • FT = file transfer time to resource r
  • ET = execution time of task t on resource r (using the user-provided cost function)
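The selection rule above reduces to a minimum search over the matching resources. A minimal sketch, with illustrative types (the real broker works on ClassAds, not this struct):

```c
/* A matching resource with its two cost estimates: ft is the file
   transfer time to the resource, et the estimated execution time of
   the task there (from the user cost function). */
struct candidate {
    const char *machine;
    double ft;
    double et;
};

/* Return the index of the candidate minimizing ft + et,
   or -1 if no resource matched the task's constraints. */
int select_resource(const struct candidate *c, int n) {
    int best = -1;
    double best_cost = 0.0;
    for (int i = 0; i < n; i++) {
        double cost = c[i].ft + c[i].et;
        if (best < 0 || cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return best;
}
```

Note the trade-off this encodes: a fast machine can lose to a slower one that already holds the input files, because FT dominates for large traces.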
Task scheduling
• Distributed between the Execute call, the callback function and the GS_Barrier call
• Possibilities:
  • The task can be submitted immediately after being created
  • Task waiting for resource
  • Task waiting for data dependency
• A GS_Barrier primitive before the end of the program waits for all tasks
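The three possibilities listed above amount to a simple decision at task creation time. A sketch with illustrative names (not the runtime's actual state machine):

```c
/* State chosen for a freshly created task. */
enum sched { SUBMIT_NOW, WAIT_DATA, WAIT_RESOURCE };

/* Data dependencies take priority: a task with pending input files
   cannot run anywhere; a ready task still needs a free resource. */
enum sched classify_task(int pending_deps, int free_resources) {
    if (pending_deps > 0)
        return WAIT_DATA;       /* task waiting for data dependency */
    if (free_resources == 0)
        return WAIT_RESOURCE;   /* task waiting for resource */
    return SUBMIT_NOW;          /* submitted immediately after creation */
}
```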
Task submission
• A task is submitted for execution as soon as its data dependencies are solved, if resources are available
• Composed of:
  • File transfer
  • Task submission
  • All specified in RSL
• A temporary directory is created in the server working directory for each task
• Calls to Globus:
  • globus_gram_client_job_request
  • globus_gram_client_callback_allow
  • globus_poll_blocking
End of task notification
• Asynchronous state-change callback monitoring system
  • globus_gram_client_callback_allow()
  • callback_func function
• Data structures are updated in the Execute function, the GRID superscalar primitives and GS_Barrier
Results collection
• Collection of output parameters which are not files
• Partial barrier synchronization: task generation from the main code cannot continue until the scalar result value is available
• Socket and file mechanisms provided
GS_Barrier
• Explicit task synchronization: GS_Barrier
  • Inserted in the user main program when required
• Main program execution is blocked
  • globus_poll_blocking() is called
• Once all tasks have finished, the program resumes
File management primitives
• GRID superscalar file management API primitives:
  • GS_FOpen, GS_FClose
  • GS_Open, GS_Close
• Mandatory for file management operations in the main program
• Opening a file with the write option:
  • Data dependence analysis
  • Renaming is applied
• Opening a file with the read option:
  • Partial barrier until the task generating that file as output finishes
• Internally, file management functions are handled as local tasks:
  • Task node inserted
  • Data-dependence analysis
  • Function locally executed
• Future work: offer a C library with GRID superscalar semantics (source code with typical calls could be used)
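The read/write semantics above can be summarized in a tiny sketch (hypothetical helper; the real primitives are GS_FOpen/GS_Open, which additionally perform the open itself):

```c
/* What the runtime must do before the actual open, depending on the
   access mode: writes create a fresh renamed version, reads block
   until the producing task has finished (partial barrier). */
enum open_action { RENAME_THEN_OPEN, BARRIER_THEN_OPEN };

enum open_action gs_open_action(char mode) {
    if (mode == 'w' || mode == 'a')
        return RENAME_THEN_OPEN;   /* write: renaming is applied */
    return BARRIER_THEN_OPEN;      /* read: wait for the producer task */
}
```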
Task level checkpointing
• Inter-task checkpointing
• Recovers sequential consistency in the out-of-order execution of tasks
(Figure: tasks 0–6 during a successful execution; a cursor separates committed tasks from completed and still-running ones.)
Task level checkpointing
• Inter-task checkpointing
• Recovers sequential consistency in the out-of-order execution of tasks
(Figure: a failing execution; tasks that finished correctly before the failure are committed, running tasks are cancelled.)
Task level checkpointing
• Inter-task checkpointing
• Recovers sequential consistency in the out-of-order execution of tasks
(Figure: on restart, committed tasks are skipped and execution continues normally from the first uncommitted task.)
Checkpointing
• On failure: from the N renamed versions of a file back to one version (the last committed version)
• Transparent to the application developer
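The commit rule behind the three checkpointing slides can be sketched as follows (illustrative model, not the runtime's data structure): tasks complete out of order, but the commit cursor only advances over the longest prefix of completed tasks, which is what preserves sequential consistency on restart.

```c
/* completed[i] is nonzero once task i has finished correctly.
   Advance the commit cursor over every contiguously completed task;
   the returned position is the first task that must re-run after a
   failure, so everything before it can be skipped on restart. */
int advance_commit(const int *completed, int n, int cursor) {
    while (cursor < n && completed[cursor])
        cursor++;
    return cursor;
}
```

For example, if tasks 0, 1, 3 and 4 have finished but task 2 has not, only tasks 0 and 1 are committed; 3 and 4 stay uncommitted until 2 completes.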
Deployer
• Java-based GUI
• Allows specification of workers: host details, libraries location…
• Selection of Grid configuration
• Grid configuration checking process:
  • Aliveness of each host (ping)
  • The Globus service is checked by submitting a simple test
  • A remote job is sent that copies the code needed by the worker and compiles it
• Automatic deployment:
  • Sends and compiles code on the remote workers and the master
  • Configuration files generation
Deployer (2)
• Automatic deployment
(Screenshot: the deployment step in the Deployer GUI.)