Pseudo dynamic DAG control Version 1
Outline • Goal • Solution • Restrictions • Example • Case Study
Goal • The user should be able to redirect control in a workflow according to the outcome of any job. The exit value set by the executable of the user’s job determines whether the job fails or succeeds. • A failed job should therefore not stop the overall operation, and no real subsequent computational activity may be started in a branch that has proved to be false. • The solution should be a clean “user level” one that requires no change in the P-GRADE Portal middleware.
The Solution The solution is the introduction of a suggested Job template where the frame of the job is standardized.
Solution details • The executable of the original job is enveloped in a standard wrapper program which always terminates as TRUE (zero exit value). The wrapper program is written in C and can be downloaded from http://www.sztaki.hu/~ghermann/Szemelyes/PseudoDynamicDAGControl/wrapper.exe • Standard logical Input / Output channels are added to the wrapped job to control the flow
Restriction of the solution • The solution handles only internal (programmed) job failures • Failures due to the environment (resource, authentication and communication problems) are recognized by the DAGMAN and can be handled by the Rescue feature of the P-GRADE Portal
I/O convention for Job Wrapper: extension of an original job
Ports kept from the original job: original input files, original output files
Ports added to the modified job: EXECUTABLE_INPUT port, LOG_INPUT port, TRUE_OUTPUT port, FALSE_OUTPUT port
Original job: Executable();
Modified job: If(LOG_INPUT.value) LOG_INPUT.value = Executable().exit; TRUE_OUTPUT.value = LOG_INPUT.value; FALSE_OUTPUT.value = ! LOG_INPUT.value;
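The pseudocode of the modified job can be restated as a small runnable sketch. This is an illustration in Python, not the actual C wrapper; the function name `wrapper_decision` and the callable `run_executable` are hypothetical, and the sketch assumes that FALSE_OUTPUT is the negation of the (possibly updated) LOG_INPUT value and that an exit value of zero means success.

```python
from typing import Callable, Tuple


def wrapper_decision(log_input: bool,
                     run_executable: Callable[[], int]) -> Tuple[bool, bool]:
    """Return (TRUE_OUTPUT, FALSE_OUTPUT) for one wrapped job.

    log_input      -- value of the token read from LOG_INPUT
    run_executable -- hypothetical stand-in for the user executable;
                      returns its exit value (0 is treated as success)
    """
    if log_input:
        # The user executable runs only when permission has arrived.
        log_input = run_executable() == 0
    true_output = log_input
    false_output = not log_input
    return true_output, false_output
```

With this sketch, `wrapper_decision(True, lambda: 0)` yields TRUE on TRUE_OUTPUT, while a nonzero exit value or a FALSE token on LOG_INPUT yields TRUE on FALSE_OUTPUT; in the latter case the user executable is never called.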
Possible states: animation of wrapper job operation
[Animation: the wrapper job shown in three states (I, II, III) with InputData, OutputData, LOG_INPUT, T_OUTPUT and F_OUTPUT, and the internal stages “execute” and “Fake Output gen.”]
• A token with value TRUE or FALSE arrives on LOG_INPUT • TRUE on the logical input triggers the execution of the user’s program • A FALSE value on LOG_INPUT activates the subsequent jobs connected to the F(ALSE)_OUTPUT • “execute” may return a false or a true exit value • A nonzero (false) exit value on “execute” activates the subsequent jobs connected to the F(ALSE)_OUTPUT • A zero (true) exit value on “execute” activates the subsequent jobs connected to the T(RUE)_OUTPUT • In the different cases pro forma (fake) output will be generated to “cheat” the DAGMAN • Real output data will be forwarded only if the user job “execute” succeeds
RULES FOR EXTENDED JOBS • The Job Executable is a special wrapper program (wrapper.exe) • The genuine (user) executable returns the exit value • Two additional input Ports and two additional output Ports are introduced, each with a standard Internal File Name: the genuine executable is associated with “EXECUTABLE_INPUT”, the file delivering the execution permission is “LOG_INPUT”, and the files delivering the propagated permissions to the subsequent jobs in the proper direction are named “TRUE_OUTPUT” and “FALSE_OUTPUT” • The logical input and output ports accept special files with content {TRUE|FALSE} • The Internal File Names of the output files which may be produced by the user executable must be listed after the genuine arguments, separated from them by the keyword -outputs. This list is needed because if LOG_INPUT delivers a FALSE value or the user job fails, the wrapper must create pro forma (fake) output data files in place of those the user executable did not run to produce or did not produce properly. Without these files the DAGMAN would abort the job while attempting to copy the nonexistent files to the subsequent jobs.
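The argument rule above can be illustrated with a short sketch. It is written in Python rather than the wrapper's C, the helper names `split_arguments` and `create_fake_outputs` are hypothetical, and it assumes that an empty file is an acceptable pro forma output (the slides do not say what the fake files contain).

```python
import os
from typing import List, Tuple

SEPARATOR = "-outputs"  # keyword that ends the genuine arguments


def split_arguments(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split wrapper arguments into (genuine arguments, output file names)."""
    if SEPARATOR not in argv:
        return list(argv), []
    idx = argv.index(SEPARATOR)
    return argv[:idx], argv[idx + 1:]


def create_fake_outputs(names: List[str], directory: str = ".") -> List[str]:
    """Create a placeholder for every listed output that is missing.

    Assumption: an empty file is enough to let the file transfer to the
    subsequent jobs succeed. Returns the paths that were created.
    """
    created = []
    for name in names:
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            open(path, "w").close()  # empty pro forma file
            created.append(path)
    return created
```

For instance, `split_arguments(["3", "-outputs", "result.dat"])` separates the genuine argument `3` from the output name `result.dat` (both values are made up for the illustration), and `create_fake_outputs` would then be called when LOG_INPUT is FALSE or the user job fails.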
EXAMPLE: IF(C1) E1 ELSE IF(C2) E2 ELSE E3 Details • original input data port • original output data port • new LOG_INPUT port (value: TRUE, FALSE) • new EXECUTABLE_INPUT port to upload the genuine executable • new TRUE_OUTPUT port (value: TRUE, FALSE) • new FALSE_OUTPUT port (value: TRUE, FALSE) • The job executable is the standard “wrapper.exe” • Each Internal File Name of a file which can be produced by the genuine user executable must be listed after the separator attribute -outputs
EXAMPLE: IF(C1) E1 ELSE IF(C2) E2 ELSE E3 Environment • A LOG_INPUT port not connected to any (logical) output port must be associated with a file containing the ASCII string “TRUE” [Diagram labels: EXECUTABLE_INPUT, LOG_INPUT, TRUE_OUTPUT, FALSE_OUTPUT]
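The logical files carry nothing but the string TRUE or FALSE, so preparing the initial LOG_INPUT file is a one-liner. The following Python sketch shows one way to write and read such token files; the helper names are hypothetical, and tolerating surrounding whitespace when reading is an assumption, not documented wrapper behaviour.

```python
from pathlib import Path


def write_token(path, value: bool) -> None:
    """Write the ASCII string TRUE or FALSE to a logical port file."""
    Path(path).write_text("TRUE" if value else "FALSE", encoding="ascii")


def read_token(path) -> bool:
    """Read a logical port file and return its value as a bool."""
    text = Path(path).read_text(encoding="ascii").strip()
    if text not in ("TRUE", "FALSE"):
        raise ValueError(f"unexpected token content: {text!r}")
    return text == "TRUE"
```

Calling `write_token("LOG_INPUT", True)` produces the file that an unconnected LOG_INPUT port needs.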
Part II (A case study) The case study is a simple workflow of IF THEN ELSE type containing three jobs. The tested application can be downloaded from: http://www.sztaki.hu/~ghermann/Szemelyes/PseudoDynamicDAGControl/TestProgram/SZTAKI_hermann_IF_THEN_ELSE_fork_seegrid.tar.gz
Part II (Case study) • The first job of wrapper type must run unconditionally and therefore gets a file containing “TRUE” as LOG_INPUT • Input port definition to upload the executable “exitWithArg.exe” • The test job IFargEq0 is the wrapper of the executable “exitWithArg.exe”, which exits with the same value it has been given in Attributes, i.e. we expect that the workflow will execute the job FALSEBR (connected to the FALSE_OUTPUT port) • The job “TRUEBR”, connected to the TRUE_OUTPUT port of the job “IFargEq0”, will not execute its user program “multiply.exe” defined at the port:1 • Port to define the user executable “multiply.exe” • The job “FALSEBR”, connected to the FALSE_OUTPUT port of IFargEq0, will run in our experiment, executing the user program “CopyAndTime” defined at the port:1
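The expected behaviour of the three-job workflow can be traced with a small simulation. This Python sketch is only a model of the token flow, not the portal or the wrapper: the function names are hypothetical, the exit value of IFargEq0 is taken to equal its argument, and the user programs of TRUEBR and FALSEBR are assumed to succeed whenever they run.

```python
from typing import Dict, Tuple


def run_wrapped(log_input: bool, exit_value: int) -> Tuple[bool, bool, bool]:
    """Model one wrapper job.

    Returns (executed, TRUE_OUTPUT, FALSE_OUTPUT). exit_value is the value
    the user program would return if it ran (0 is treated as success).
    """
    executed = log_input
    succeeded = log_input and exit_value == 0
    return executed, succeeded, not succeeded


def simulate_if_then_else(arg: int) -> Dict[str, bool]:
    """Report which branch job runs its user program for a given argument."""
    # IFargEq0 always receives TRUE on LOG_INPUT and exits with its argument.
    _, true_out, false_out = run_wrapped(True, arg)
    # Assumption: the branch programs exit with 0 whenever they are run.
    truebr_executed, _, _ = run_wrapped(true_out, 0)
    falsebr_executed, _, _ = run_wrapped(false_out, 0)
    return {"TRUEBR": truebr_executed, "FALSEBR": falsebr_executed}
```

Under these assumptions any nonzero argument reproduces the outcome described for the experiment (FALSEBR runs its user program, TRUEBR does not), and an argument of 0 selects the other branch.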
Job IFArgEq0 output listing • Message of the embedded user program “ExitWithArg” • As this program has no “real” data output, the warning can be disregarded • The wrapper reports its decision, which determines the activation of subsequent jobs
Job TRUEBR output listing • As the preceding wrapper job produced the value “FALSE” on the TRUE_OUTPUT port, the user executable of this job will not be executed
Job FALSEBR output listing • Message of the embedded user program “CopyAndTime” • The wrapper reports its decision, which determines the activation of subsequent jobs