A Steering Portal for Condor/DAGMAN • Naoya Maruyama, on behalf of Akiko Iino, Hidemoto Nakada, Satoshi Matsuoka • Tokyo Institute of Technology
Background • Common Grid Usage Scenario • Zillions of batch jobs scheduled over a combination of private/public resources within a VO • Some jobs require steering during the workflow • “Human decision required” • Most previous steering work focused on GUI-level interactivity • Real-time, interactive steering of the application itself • Does not meld well with batch jobs • Needs significant application customizations
Objectives and Contributions • Objectives • A Steering Portal for workflow (DAGMAN) jobs with easy descriptions, w/o application, Condor, or DAGMAN modifications • Contributions • Portal to allow steering with simple additions to DAGMAN scripts • Confirmed low overhead with exemplar applications • Quantitative assessment of user steps required
Outline • Background • Motivating example • Required features of steering • Steering example • Overview and prototype implementation • Evaluation • Conclusion
Exemplar Application: Phylogenetic Tree Inference • Infer phylogenetic relationships between different species from their genomic sequences [Hasegawa&Shimodaira04] • App Characteristics • Basically executes multiple parallel jobs in sequence => workflow of batch jobs • But the termination condition of the application phases is difficult to judge => needs human steering • [Figure label: "Common Ancestor"]
Phylogenetic Tree Inference Breakdown • Narrow down the candidate phylogenetic trees: hard to automate => batch jobs difficult • Compute posterior probability: “MrBayes” • Compute likelihood value: “PAML” • Test: “CONSEL”
The Actual Workflow • 1. Execute MrBayes • 2. Termination judgment • 3. Manual input of new parameters • 4. Post-process MrBayes • 5. Execute PAML • 6. Execute CONSEL • [Figure: steps 1 and 5 each drawn as five parallel jobs; a "Need Steering" label sits beside steps 2-4]
MrBayes Example and Problems • As a standalone app, requests interactive input • Up to the user to judge computational convergence • But lacks an information display that allows good judgment • Not on this screen! • 1. User needs to periodically poll the screen and make interactive input • 2. User also needs to look at output files from 1000 jobs!
MrBayes Examples and Problems (2) • Decide on convergence • Decide on next parameter • Visualize output file • Problems: • 3. Manual conversion to graphical display • 4. Changing appropriate parameters
Steering portal features for batch workflows with interactivity elements • Pausing/resuming computation • Progress computation as much as possible until user input is absolutely needed • Resume immediately after input • Allow flexible parameter modifications • Various ways to specify parameters for output and input • Various ways to notify users: interactive screen, email, etc. • Various ways to observe parameters: various portal functions • Various ways to modify parameters • Even switching back and forth between your terminal and a cell phone 10,000 miles away!
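The pause/resume requirement above can be sketched as a small polling helper. This is a minimal illustration, not the portal's actual code: the decision file, its JSON layout, the `action` values, and both function names are assumptions made for the example.

```python
import json
import time
from pathlib import Path


def wait_for_decision(decision_file: Path, poll_seconds: float = 1.0,
                      timeout_seconds: float = 3600.0) -> dict:
    """Block until a (hypothetical) portal writes the user's decision as JSON."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if decision_file.exists():
            # Resume immediately once input is available.
            return json.loads(decision_file.read_text())
        time.sleep(poll_seconds)
    raise TimeoutError(f"no steering input found in {decision_file}")


def exit_code_for(decision: dict) -> int:
    """Return 0 to let the workflow continue, non-zero to ask for a rerun."""
    return 0 if decision.get("action") == "continue" else 1
```

A wrapper script could call `wait_for_decision` after a computation step finishes and pass `exit_code_for(...)` to `sys.exit`, so the workflow engine sees either success or a request to repeat the step with new parameters.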
Example (1): Job Submission • Standard Condor/DAGMAN job submission • But includes steering functions in job description
Example (2): User Notification • Various notification methods, incl. email • Displays Portal URL in the message • Works on various devices incl. cell phones
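As a rough sketch of the email notification path, the message below carries the portal URL in its body. The sender address, subject format, and function name are placeholders invented for the example; the slides do not specify them.

```python
from email.message import EmailMessage


def build_notification(user_addr: str, node_name: str, portal_url: str) -> EmailMessage:
    """Compose a plain-text mail telling the user that a node awaits steering."""
    msg = EmailMessage()
    msg["From"] = "steering-portal@example.org"  # placeholder sender address
    msg["To"] = user_addr
    msg["Subject"] = f"[steering] input needed for workflow node {node_name}"
    # Plain text keeps the message readable on small devices such as phones.
    msg.set_content(
        f"Node {node_name} has paused and is waiting for your decision.\n"
        f"Open the steering page: {portal_url}\n"
    )
    return msg
```

The resulting message could be handed to `smtplib.SMTP(...).send_message(msg)` on a host with a mail relay; other channels (instant messaging, etc.) would need their own senders.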
Example (3): Steering Portal • Visualize current status • Parameter input • Continuation of workflow • Portal generates steering web pages dynamically depending on workflow context
Overview of our Steering Portal • [Figure: the workflow and steering description is submitted to DAGMAN/Condor, which makes individual job submissions to the Condor pool; DAGMAN's POST scripting and Retry functions connect to the Steering Portal (user notification, web page generation, and job control), which handles steering notification, steering display, and steering input]
Overview of Steering Portal (2) • The user defines the following steering components for the steering portal in a script: • A set of applications in the workflow • Condor DAGMan + steering workflow description • Translator for converting output to input to continue the workflow • Visualization program to display application output on the steering web page • Application input/output specifications • Parameters that require steering • The steering portal: • Reads the above script • Automatically generates the steering web page • Interacts with DAGMAN to notify users (email, etc.) and take input from the web portal
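To make the component list above concrete, here is one way a steering description could be held in memory and turned into a web form. It is a sketch under stated assumptions: the `SteeringSpec` and `SteerableParam` classes, their field names, and the HTML layout are all hypothetical and do not reproduce the prototype's script format.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class SteerableParam:
    name: str            # parameter the user may change
    current: str         # value used in the last run
    help_text: str = ""


@dataclass
class SteeringSpec:
    node: str            # workflow node that pauses for input
    visualizer: str      # program that renders output for the page
    translator: str      # program that turns output into the next input
    params: list[SteerableParam] = field(default_factory=list)


def render_steering_page(spec: SteeringSpec, plot_url: str) -> str:
    """Generate a minimal HTML page: status image plus one input per parameter."""
    rows = "\n".join(
        f'  <label>{escape(p.name)} '
        f'<input name="{escape(p.name)}" value="{escape(p.current)}"> '
        f'{escape(p.help_text)}</label><br>'
        for p in spec.params
    )
    return (
        f"<h1>Steering: {escape(spec.node)}</h1>\n"
        f'<img src="{escape(plot_url)}" alt="current status">\n'
        '<form method="post">\n'
        f"{rows}\n"
        '  <button name="action" value="rerun">Rerun with these parameters</button>\n'
        '  <button name="action" value="continue">Continue workflow</button>\n'
        "</form>"
    )
```

Generating the form from the parameter list is what lets the page vary with workflow context: each paused node only exposes the parameters declared for it.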
Prototype Implementation • Coordination between DAGMAN and Steering Portal • Use DAGMan POST Scripting function to invoke the steering portal • Use DAGMan Retry function to resume workflow execution • Prototype Implementation of the Steering Portal • Interpretation of the steering descriptions embedded in DAGMAN workflow • Appropriate and multiple notifications and steering interfaces available • Notification and interfaces currently selected according to script • Automated selection for the future • Mail and messaging notification function with embedded services • CGI web page generation onto the portal server using ssh • Steering from anywhere, anytime (incl. cell phones and PDAs)
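The POST-script-plus-Retry coordination can be illustrated by generating a small DAG file. The `JOB`, `SCRIPT POST`, `RETRY`, and `PARENT ... CHILD` keywords and the `$JOB`/`$RETURN` macros are standard DAGMan syntax; the node names, submit file names, script name, and the `make_dag` helper are assumptions for illustration, and the prototype's real steering annotations are not shown in the slides.

```python
def make_dag(node: str, submit_file: str, post_script: str,
             max_retries: int, child: str, child_submit: str) -> str:
    """Build DAG file text for one steered node followed by one child node."""
    lines = [
        f"JOB {node} {submit_file}",
        f"JOB {child} {child_submit}",
        # The POST script runs after the node's job finishes; $JOB expands to
        # the node name and $RETURN to the job's exit status.
        f"SCRIPT POST {node} {post_script} $JOB $RETURN",
        # A non-zero POST exit marks the node as failed, and RETRY reruns it,
        # which is how a "not converged yet" decision can loop the step.
        f"RETRY {node} {max_retries}",
        f"PARENT {node} CHILD {child}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_dag("mrbayes", "mrbayes.sub", "steer_post.py", 10,
                   "paml", "paml.sub"))
```

In this arrangement the hypothetical `steer_post.py` would notify the user, wait for portal input, and exit 0 to let the child node start or non-zero to trigger a retry with the new parameters. The retry limit bounds how many steering rounds a node can go through.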
Evaluation • Apply to sample applications (simple pi calculation and more complex phylogenetic tree example) • Evaluate the necessary “work steps” • Items of Evaluation • Modification to the application program itself • Condor DAGMan workflow description • Translator for converting output to input to continue the workflow • Visualization program to display application output on the steering web page • Application input/output specifications • Parameters that require steering • Modifications to the Condor job submit file
Phylogenetic Tree Program (1) • 20 nine-line files; only 1 line differs among them
Conclusion and Future Work • Conclusion • Proposed a Steering Portal that allows interactive steering of batch-scheduled jobs in Condor/DAGMAN • Created prototypes with flexible notification and visualization/steering features • Applied to sample apps including Pi and phylogenetic trees • Future work • Support and automatically select various interfaces • Apply to other applications, esp. with larger workflows and more complex interactions • Apply to other workflow engines
Contact info • Satoshi Matsuoka, matsu@is.titech.ac.jp, Tokyo Institute of Technology