
Flexible Scientific Workflows Using Dynamic Embedding

Flexible Scientific Workflows Using Dynamic Embedding. Anne H.H. Ngu, Nicholas Haasch, Terence Critchlow, Shawn Bowers, Timothy McPhillips, Bertram Ludaescher. Outline: Scientific Workflow; Problems with static scientific workflow; Frame actor; Dynamic Embedding.

Presentation Transcript


  1. Flexible Scientific Workflows Using Dynamic Embedding
     Anne H.H. Ngu, Nicholas Haasch, Terence Critchlow, Shawn Bowers, Timothy McPhillips, Bertram Ludaescher

  2. Outline
     • Scientific Workflow
     • Problems with static scientific workflow
     • Frame actor
     • Dynamic Embedding
     • Implementation of Dynamic Embedding
     • TSI case study
     • Conclusion

  3. Scientific Workflows
     A model of the way a scientist works with their data and tools
     • Mentally coordinate data export, import, analysis and visualization via various software tools
     Goals:
     • Design
     • Automation
     • Component reuse
     … make data analysis and management tasks easier for the scientist!

  4. SPA/Kepler Scientific Workflow System
     • Scientific Process Automation (SPA)
     • Modeling workflows using an actor-oriented framework
     • Executing and monitoring workflows
     • Built on top of Ptolemy II (Berkeley)
     • Graphical user interface
     • Similar to a charting program (Visio)
     • Drag-and-drop components
     • Connect components
     • Execute workflows
     • Monitor execution
     [Figure labels: SPA, Ptolemy II]

  5. Actor-Oriented Modeling: Actors
     • single component or task
     • well-defined interface (signature)
     • given input data, produces output data

  6. Actor-Oriented Modeling: Parameter
     • Input that is configured statically
     • Set by the user at design time
     • Produces data

  7. Actor-Oriented Modeling: Sub-Workflows (aka Composite Actors)
     • composite actors "wrap" sub-workflows
     • like actors, have signatures (i/o ports of sub-workflow)
     • hierarchical workflows (arbitrary nesting levels) for abstraction
     • versus "atomic actors"
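As a rough illustration of the actor and composite-actor ideas on slides 5 to 7, the Java sketch below models an actor as a function from named input tokens to named output tokens, and a composite actor as a chain of inner actors behind the same interface. Every name in it (`Actor`, `UpperCase`, `Exclaim`, `Composite`, `ActorDemo`) is hypothetical and is not a Kepler or Ptolemy II class; scheduling by a director is deliberately left out.

```java
import java.util.List;
import java.util.Map;

class ActorDemo {
    public static void main(String[] args) {
        // A composite nested inside another composite (arbitrary nesting levels).
        Actor inner = new Composite(List.of(new UpperCase(), new Exclaim()));
        Actor outer = new Composite(List.of(inner, new Exclaim()));
        System.out.println(outer.fire(Map.of("text", "done")).get("text")); // prints DONE!!
    }
}

// Hypothetical stand-in for an actor: named input tokens in, named output tokens out.
interface Actor {
    Map<String, String> fire(Map<String, String> inputs);
}

// Atomic actor: one task with a fixed signature (port "text" in, port "text" out).
class UpperCase implements Actor {
    public Map<String, String> fire(Map<String, String> inputs) {
        return Map.of("text", inputs.get("text").toUpperCase());
    }
}

// Second atomic actor with the same signature.
class Exclaim implements Actor {
    public Map<String, String> fire(Map<String, String> inputs) {
        return Map.of("text", inputs.get("text") + "!");
    }
}

// Composite actor: wraps a sub-workflow (here a simple chain) and exposes
// the same interface as an atomic actor, so composites can be nested.
class Composite implements Actor {
    private final List<Actor> stages;

    Composite(List<Actor> stages) {
        this.stages = stages;
    }

    public Map<String, String> fire(Map<String, String> inputs) {
        Map<String, String> tokens = inputs;
        for (Actor stage : stages) {
            tokens = stage.fire(tokens); // output of one stage feeds the next
        }
        return tokens;
    }
}
```

Because `Composite` implements `Actor`, callers cannot tell a nested sub-workflow from an atomic actor, which is the abstraction the slide describes.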

  8. Actor-Oriented Modeling: Directors
     • define the execution semantics of workflow graphs
     • schedule and execute workflow graphs
     • sub-workflows may be governed by different directors
     • Examples: Synchronous Data-Flow (SDF), Process Networks (PN), Discrete Event (DE), Finite State Machine (FSM)

  9. Problems in current SPA/Kepler modeling framework
     • The workflow is static and must be completely specified before orchestration
     • The specific tools used in the workflow must be picked at design time (so the tool picked may not be the best one)
     • All alternative tools must be enumerated exhaustively (resulting in complex workflows)

  10. Problems in current SPA/Kepler modeling framework (cont.)
     • Our work aims to provide an abstraction of a tool, a resource, or an algorithm
     • The specific tool or resource is selected during execution

  11. Terascale Supernova Initiative (TSI) Workflow

  12. Different TSI Workflows
     • Each containing a Submit Job Actor
     [Figure: workflows A, B, and C]

  13. TSI Submit Job Actors… when we look at the sub-workflows
     • Each workflow does job submission in a different way
     • Each containing a remote execution
     [Figure: sub-workflows A, B, and C]

  14. Goals
     • There exist actors (atomic and composite) that perform a similar task
     • Submit Job
     • Transfer Files
     • Process Files
     • RemoteExecution
     • SSH2Exec
     • SSHWrapper
     • InvokeSubmitJob
     • Can we provide an abstraction for encapsulating different implementations of a specific task that can be reused across different workflows?
     • Can we execute the workflow flexibly?
     • Choosing a specific implementation depending on run time conditions

  15. Requirements
     • Select and execute an actor based on run time conditions
     • Execute with data process networks
     • Built-in capability for streaming data
     • Built-in concurrency
     • Selected actors are instantiated on demand
     • Reusable actors can be nested
     • Actors are usable by scientists

  16. Frames
     • Actors are concrete
     • Correspond to particular implementations
     • Frames are abstract
     • Placeholder for an actor / composite actor
     • Input, output, and parameter ports
     • An embedding occurs when an actor is placed in a frame
     • A refinement is an actor that can be embedded in a frame
     [Figure: SSH actor a, RemoteExecutionFrame F, embedding F[a]]
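The frame and embedding vocabulary above can be made concrete with a small Java sketch. It assumes a hypothetical `Actor` interface and a hypothetical `Frame` class with an `embed` method; none of these are the Kepler implementation, and ports are reduced to map keys. The frame is abstract until a refinement is embedded, after which F[a] behaves like the embedded actor.

```java
import java.util.Map;

class FrameDemo {
    public static void main(String[] args) {
        Frame f = new Frame("RemoteExecutionFrame");
        f.embed(new SshActor()); // the embedding F[a]
        System.out.println(f.fire(Map.of("command", "ls")).get("out")); // prints ssh: ls
    }
}

// Hypothetical actor interface: named tokens in, named tokens out.
interface Actor {
    Map<String, String> fire(Map<String, String> inputs);
}

// A concrete actor: one particular implementation of remote execution.
// It only formats a string here; no real SSH call is made.
class SshActor implements Actor {
    public Map<String, String> fire(Map<String, String> inputs) {
        return Map.of("out", "ssh: " + inputs.get("command"));
    }
}

// A frame: an abstract placeholder with the same interface as an actor.
class Frame implements Actor {
    private final String name;
    private Actor refinement; // null until an actor is embedded

    Frame(String name) {
        this.name = name;
    }

    // Embedding: place a refinement into the frame.
    void embed(Actor actor) {
        this.refinement = actor;
    }

    boolean isConcrete() {
        return refinement != null;
    }

    public Map<String, String> fire(Map<String, String> inputs) {
        if (refinement == null) {
            throw new IllegalStateException(name + " has no embedded refinement");
        }
        return refinement.fire(inputs); // delegate to the embedded actor
    }
}
```

In this sketch `embed` is called once before execution, which corresponds to embedding at design time; choosing the refinement inside `fire` instead would correspond to embedding at run time.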

  17. Static Embedding Frames
     • A refinement to a Frame is embedded during design
     • Frames become concrete and cannot be reconfigured during workflow execution
     [Figure: frame F with refinements SSH2 Exec and Web Service, shown at design time and at run time]

  18. Static Embedding Frames (cont.)
     • Execute with data process networks
     • Refinement instantiated as needed
     • Can be nested
     • Actors are usable by scientist
     • Select refinements based on run time conditions
     [Figure: frame F with refinements SSH2 Exec and Web Service at run time]

  19. Dynamic Embedding Frames
     • Refinements to frames are embedded during execution
     • Frames are not concrete during workflow execution
     [Figure: frame F with refinements SSH2 Exec and Web Service at run time]

  20. Why Dynamic Embedding?
     • Select refinements based on run time conditions
     • Execute with data process networks
     • Refinement instantiated as needed
     • Can be nested
     • Actors are usable by scientist

  21. Implementation of Dynamic Embedding
     • Construct a new workflow to execute the actor
     [Figure labels: Remote Execution Frame, Generates, Generated Workflow]

  22. Dynamic Embedding Process
     • Wait for inputs to arrive at the Frame.
     • Select a refinement.
     • Transfer input tokens from the Frame to the refinement.
     • Select mappings
       • Input Port
       • Output Port
       • Parameter
     • Construct the internal workflow.
     • Run the internal workflow.
     • Transfer output tokens from the refinement to the Frame.
     [Figure labels: Remote Execution Frame, Model Generated By Dynamic Frame]
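The process on slide 22 can be sketched in Java as a frame that picks its refinement on every firing. This is a simplified sketch under several assumptions: `Actor`, `DynamicFrame`, `WebServiceActor`, and `Ssh2ExecActor` are hypothetical names, the run time condition is passed in as a `BooleanSupplier`, port names are assumed to match so the mapping step is the identity, and "constructing the internal workflow" is reduced to instantiating the selected actor.

```java
import java.util.Map;
import java.util.function.BooleanSupplier;

class DynamicDemo {
    public static void main(String[] args) {
        boolean[] webServiceUp = {true}; // mutable flag standing in for a run time condition
        DynamicFrame frame = new DynamicFrame(() -> webServiceUp[0]);
        System.out.println(frame.fire(Map.of("command", "ls")).get("out")); // prints webservice ran ls
        webServiceUp[0] = false; // the condition changes between firings
        System.out.println(frame.fire(Map.of("command", "ls")).get("out")); // prints ssh2exec ran ls
    }
}

// Hypothetical actor interface: named tokens in, named tokens out.
interface Actor {
    Map<String, String> fire(Map<String, String> inputs);
}

class WebServiceActor implements Actor {
    public Map<String, String> fire(Map<String, String> inputs) {
        return Map.of("out", "webservice ran " + inputs.get("command"));
    }
}

class Ssh2ExecActor implements Actor {
    public Map<String, String> fire(Map<String, String> inputs) {
        return Map.of("out", "ssh2exec ran " + inputs.get("command"));
    }
}

class DynamicFrame implements Actor {
    private final BooleanSupplier webServiceAvailable;
    private String lastSelected;

    DynamicFrame(BooleanSupplier webServiceAvailable) {
        this.webServiceAvailable = webServiceAvailable;
    }

    String lastSelected() {
        return lastSelected;
    }

    public Map<String, String> fire(Map<String, String> inputs) {
        // Step 1: inputs have arrived (they are the method argument).
        // Step 2: select a refinement from the current run time condition,
        // instantiating it on demand.
        Actor refinement;
        if (webServiceAvailable.getAsBoolean()) {
            lastSelected = "webservice";
            refinement = new WebServiceActor();
        } else {
            lastSelected = "ssh2exec";
            refinement = new Ssh2ExecActor();
        }
        // Steps 3 to 5: hand the input tokens to the refinement and run it
        // (identity port mapping assumed in this sketch).
        Map<String, String> outputs = refinement.fire(inputs);
        // Step 6: hand the refinement's output tokens back to the frame's caller.
        return outputs;
    }
}
```

Unlike a statically embedded frame, the same `DynamicFrame` instance can run a different refinement on each firing.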

  23. ModelReference Actor
     • A higher-order actor that can execute a given model (workflow) through its input port.
     • It meets most of the requirements for dynamic embedding, except:
       • The user must pre-construct the given model
       • Output tokens from the given model are transferred only after completion of the internal workflow.
     • Our implementation of dynamic embedding leverages the capability of the ModelReference actor with two major improvements:
       • The given model is constructed automatically
       • Output tokens are transferred synchronously
     • Frame is thus implemented as a subclass of the ModelReference actor with four additional components: SelectActor(), FrameSourceActor(), FrameSinkActor(), and PortWiring()

  24. Implementing a Type of Frame
     • Subclass Frame and implement:
       • Selection process
         • selectActor()
       • Configure ports and parameters
         • getInputMappings()
         • getOutputMappings()
         • getParameterMappings()

  25. Selection Process: selectActor()
     • Implements the selection policy
     • Returns a refinement
     • The refinement that is returned gets automatically embedded
     Remote Execution frame:
         String selectedActor;

         // Return type assumed; the slide omits it.
         Actor selectActor() {
             if (testWebService()) {
                 selectedActor = "webservice";
                 return getWebServiceActor();
             } else {
                 selectedActor = "ssh2exec";
                 return getSSH2ExecActor();
             }
         }

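  26. Port Wiring
     • Transfer tokens
       • Frame input port to actor input port
       • Actor output port to frame output port
     • Expressed as a list containing pairs of strings: {"hostname","hostname"} {"command","cmd"}
     [Figure: frame F with ports hostname, command, errors, out; SSH2 Exec with ports hostname, cmd, errors, stdout]
     Remote Execution frame:
         String selectedActor;

         // Return type assumed; the slide omits it.
         String[][] getInputPortMapping() {
             // Compare with equals(), using the values set by selectActor().
             if ("ssh2exec".equals(selectedActor)) {
                 return new String[][] {{"hostname", "hostname"}, {"command", "cmd"}};
             } else if ("webservice".equals(selectedActor)) {
                 return new String[][] {{"hostname", "url"}, {"command", "method"}};
             }
             // The slide gives no fallback; failing loudly is an assumption.
             throw new IllegalStateException("no mapping for " + selectedActor);
         }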
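To show how a list of string pairs could drive the token transfer described on slide 26, here is a minimal Java sketch. The class `PortWiring` and its `transfer` method are hypothetical helpers written for this illustration (they are not the PortWiring() component of the actual Frame implementation), tokens are plain strings, and `host.example.org` is a placeholder value.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PortWiringDemo {
    public static void main(String[] args) {
        // Mapping for the SSH2 Exec refinement, taken from the slide:
        // frame port "hostname" -> actor port "hostname", "command" -> "cmd".
        List<String[]> sshMapping = List.of(
                new String[] {"hostname", "hostname"},
                new String[] {"command", "cmd"});
        Map<String, String> frameTokens =
                Map.of("hostname", "host.example.org", "command", "ls");
        Map<String, String> actorTokens = PortWiring.transfer(frameTokens, sshMapping);
        System.out.println(actorTokens.get("cmd")); // prints ls
    }
}

class PortWiring {
    // Copies each token from its source port name to the mapped target port name.
    // Ports without a mapping entry, or without a token, are skipped.
    static Map<String, String> transfer(Map<String, String> tokens, List<String[]> mapping) {
        Map<String, String> result = new HashMap<>();
        for (String[] pair : mapping) {
            String from = pair[0];
            String to = pair[1];
            if (tokens.containsKey(from)) {
                result.put(to, tokens.get(from));
            }
        }
        return result;
    }
}
```

The same helper would work in the output direction by passing pairs of the form {actor output port, frame output port}, for example {"stdout", "out"}.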

  27. TSI Case Study
     [Figure labels: TSI-A Workflow, SubmitJobFrame, TSI-B Workflow]

  28. Remote Execution Frame
     [Figure labels: TSI-A subworkflow, TSI-B subworkflow, SubmitJobFrame]

  29. Benefits
     • Select refinements based on run time conditions
     • Execute with data process networks
     • Refinement instantiated as needed
     • Can be nested
     • Actors are usable by scientist

  30. Limitations
     • Unable to type check the internal workflow before execution
     • There is overhead in creating an additional workflow to execute a refinement
     • A change in the selection process requires recoding/recompiling
     • Cannot monitor the internal workflow (which would be useful for debugging)

  31. Future Work
     • Semantic binding of ports and parameters
     • Configurable selection criteria
     • Intelligent brokering
     • Ptolemy Expression Language
     • Python
     • Perl
     • …
     • Simplified refinement creation
     • Caching of generated workflows
     • Design time type checking of internal workflow

  32. UCRL-ABS-226047
     Work performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract W-7405-Eng-48
