Meta-Task and Workflow Management in ProdSys II (DEfT). Maxim Potekhin BNL ADC Development Meeting February 4, 2013. ProdSys II (DEfT): Status and Summary of Progress. Progress since the last S&C week: Analyzed project requirements Developed an object model for ProdSys II (DEfT)
Meta-Task and Workflow Management in ProdSys II (DEfT) Maxim Potekhin BNL ADC Development Meeting February 4, 2013
ProdSys II (DEfT): Status and Summary of Progress Progress since the last S&C week: • Analyzed project requirements • Developed an object model for ProdSys II (DEfT) • Created database schemas for object persistence in RDBMS • Investigated multiple existing solutions for workflow management • Identified mature software components to be used in the project • Identified the proper and standardized format for describing the workflow • Developed a prototype application: DEFT v1.0 • Maintained documentation • The project is extensively documented in both TWiki and a blog: • https://twiki.cern.ch/twiki/bin/viewauth/Atlas/ProdSys • http://prodsys.blogspot.com/
Overview of this presentation • Why? • Motivations for doing this work • What? • Requirements, scope and deliverables based on motivations • How? • The choice of technologies and platforms to meet the requirements • Where? • What has been done so far, and the current prototype of the system • Whereto? • Directions for future development and integration, and the project timeline
Motivations (1) We need to address the following: • The concept of Meta-Task as a group of logically related tasks, needed to complete a specific physics project or sub-project. Absent in the original product (ProdSys I), it emerged based on operational experience with PanDA and its workflow. It effectively became the central object in the high-level workflow management • It is currently modeled in spreadsheets, which act as surrogate database and GUI, require active maintenance and limit the scalability of the overall Production System. • Meta-Tasks must be properly modeled, introduced and integrated into the system to guarantee that it delivers adequate performance going forward.
Motivations (2) …and also the following: • Automation: we need the capability to process Meta-Tasks with minimal human intervention beyond task definition. Right now it is a labor intensive semi-manual process. • But we also need the capability to have operator intervention and Meta-Task recovery: there must be adequate tools for the operators and managers to direct the Meta-Task processing, for example: • To be able to start certain steps before others are defined • To augment a Meta-Task in any stage of processing • To put a task on hold • To recover from failures in an optimal way • Flexibility of job definition (e.g. making it dynamic as opposed to static once the task is created): there are a number of advantages that we hope can be realized once there is a capability to define jobs dynamically, based on the resources and other conditions present once the task moves into the execution stage. Currently, jobs are fairly static objects in the system. • Maintainability: the code of the existing Production System was written "organically", in order to actively support emerging requests from the users, and it starts showing its age • Scalability: there are certain issues with the interaction between ProdSys I and the database back-end. In general, given the dramatic rise in the number of tasks defined and executed, we must ensure a lot of headroom going forward. • Ease of use: there is currently a great amount of detail that the end user (Physics Coordination) must define in order to achieve a valid task request. We must automate and facilitate the task creation process, whereby cumbersome logic is handled within the application, and the user interface is more transparent and friendly.
Requirements Principal requirements have already been touched upon in the “Motivations” section, where we outlined the general new functionality that needs to be implemented. Let’s put the requirements in a slightly different perspective: • Front End: capability to define and persist the Meta-Task objects, and submit them for processing in PanDA. This includes the following functionality: • Quick creation of Meta-Tasks based on pre-defined templates • Similar to template use: “cloning” of existing Meta-Tasks • As an option, the ability to handle more complex Meta-Task topologies than currently in use • Automation: capability to process Meta-Tasks without human intervention (with the exception of complex failure recovery) • Control: operators must have an option to intervene in the execution of a Meta-Task, for example: • To be able to start certain individual tasks at will before the whole Meta-Task is completely defined • To augment a live Meta-Task (for example by growing the size of the dataset) • To put a task on hold (for example to investigate some job failure) • To recover from failures in an optimal way (without using “good” data and having to restart from scratch) • Workload optimization: flexibility of job definition (such as size) by using highly dynamic PanDA brokerage system to pick resources best suited for the workload at hand, and vice versa. • Documentation and Maintainability. • Scalability: demonstration, via stress testing, of the headroom in the system throughput. • Ease of use: implementation, to the maximum extent possible, of user-friendly user interfaces. In conjunction with that, robust authorization and access control to enforce effective management of the workflow. • Monitoring.
Overall Design One principal design decision made early on was to create the new Production System, a.k.a. ProdSys II, as a tandem of two subsystems, which play complementary roles and represent two different levels in managing the overall workflow: • DEfT: Database Engine for Tasks: • DEfT is responsible for formulating the Meta-Tasks and the underlying tasks. Meta-Task topologies can include chains of tasks, bags of tasks and other types of task groups, complete with all necessary parameters. DEfT keeps track of, and persists, the state of the Meta-Tasks under its management and of their constituent tasks. It provides the interface for each Meta-Task definition, management and monitoring throughout its lifecycle. • JEDI: Job Execution and Definition Interface • JEDI uses the task definitions formulated in DEfT to define and submit individual jobs to PanDA, keep track of their progress and handle re-tries of failed jobs, job redirection etc. In addition, JEDI interfaces with data management services in order to properly aggregate and account for data generated by individual jobs (i.e. general dataset management) • Note: we won’t always capitalize DEfT and JEDI in this way, and shall use DEFT and Jedi wherever it’s easier to type
DEFT Equation Each task in a Meta-Task can be represented as a set of parameters describing its state, i.e. as a vector. The role of DEFT is to apply rules to transform this vector:
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_N \end{pmatrix} = D \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_N \end{pmatrix}$$
DEfT and JEDI working in tandem [Diagram: the Front End and User Interface (Meta-Task Editor) feeds DEfT, which holds Meta-Task 1 … Meta-Task N and their tasks (Task 1-1 … Task 1-4, Task 2-1 … Task 2-N); JEDI holds the jobs of each task (Job 1-1-1 … Job 1-1-4, Job 2-1-1 … Job 2-1-N), which go to the PanDA sites.]
DEfT and JEDI as an assembly line [Diagram: along a time axis T, DEfT carries Task 1 of Meta-Task 1 through successive states (t0, t1, t2), while JEDI handles Job 1-1-1 … Job 1-1-4 at each step.]
Major Components and their Communication Design Parameters • JEDI has been designated as the component most closely coupled with the PanDA service and brokerage itself, while DEfT is considered a more platform-neutral database engine for bookkeeping of tasks (hence its name). It makes sense to leverage this factorization and take advantage of component-based architecture, in order to reduce dependencies in the system and keep avenues for its evolution open. Based on that, DEfT will have a minimum level of dependency on PanDA specifics, while JEDI takes advantage of full integration with PanDA. Wherever possible, JEDI will have custody of the content stored in the database tables, and DEfT will largely use references to these data. Communication • A few features of the DEfT/JEDI tandem design: in managing the workflow in Deft, it’s unnecessary (and in fact undesirable) to implement subprocesses and/or multi-threaded execution such as those found in products like PaPy etc. That is because DEfT is NOT a processing engine, or even a resource provisioning engine, as is the case in some Python-based workflow engines. It's a state machine that does not necessarily need to work in real time (or even near time). Near-time state transitions and resource provisioning are done in PanDA • There are a few ways to establish communication between Deft and Jedi, which are not mutually exclusive. For example, callbacks and database token updates may co-exist. If a message is missed for some reason, the information can be picked up during a periodic sweep of the database. In summary, DEfT and JEDI will work asynchronously. • An interesting question is whether the database should be exposed through the typical access library API, or as a Web Service. This needs to be evaluated.
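The combination of callbacks with a periodic database sweep can be sketched as follows. This is only an illustration of the idea, not the DEfT/JEDI protocol: the class names, the state strings and the in-memory dictionary standing in for the shared database table are all assumptions made for the example.

```python
class TaskTable:
    """Hypothetical stand-in for the shared database table: task id -> state token."""

    def __init__(self):
        self.rows = {}

    def update(self, task_id, state):
        # The writer (JEDI in this sketch) always records the token in the table.
        self.rows[task_id] = state


class DeftListener:
    """Hypothetical DEfT-side view of task states."""

    def __init__(self, table):
        self.table = table
        self.known = {}

    def on_callback(self, task_id, state):
        # Fast path: a message that was actually delivered.
        self.known[task_id] = state

    def sweep(self):
        # Slow path: reconcile with the table and report what was missed.
        missed = sorted(
            task_id
            for task_id, state in self.table.rows.items()
            if self.known.get(task_id) != state
        )
        for task_id in missed:
            self.known[task_id] = self.table.rows[task_id]
        return missed


table = TaskTable()
listener = DeftListener(table)

table.update(1, "done")
listener.on_callback(1, "done")   # callback delivered
table.update(2, "failed")         # callback lost, only the token was written

missed = listener.sweep()         # the sweep recovers task 2
```

Because the table is the single source of truth, a lost callback only delays a state change until the next sweep, which fits a design in which the two components work asynchronously.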
Notes on Choice of Technologies Reuse. Reuse. Reuse. • Workflow management is by no means a new subject or R&D topic. We must make an effort to reuse either a complete existing solution or, failing that, major components, and do our best to avoid in-house development, in order to save on development and maintenance costs. Progress will be measured not by the number of lines of code we manage to write, but by the number of lines of code we manage not to write while still getting the desired result. • We have completed an extensive analysis of many existing Workflow management systems and packages. There are links and other information to be found in the ProdSys II TWiki. • Various BPEL engines (like Apache ODE), Pegasus, VisTrails, Soma, Weaver, PaPy, Pyphant etc • We must recognize that existing turnkey solutions for complete management of workflow have any or all of the following problems: • They are heavy-weight, with a steep learning curve • They are strongly tied to a particular platform such as Condor • They implement their own resource provisioning schemes, which is inefficient since the same is done in PanDA • They are hard to integrate with the handling and transfer of data used in ATLAS • With packages, there are also issues of performance, ongoing support and maintainability • Based on the above, an optimal solution would be to find a package or a library that can be adopted in an application integrated with PanDA Note on the language platform • Without having a fundamental preference for any particular programming language, it makes sense to stick with a Python solution due to the readily available expertise at all levels and locations of the ATLAS community
The Graph Model (1) Why use the graph model for the Workflow? • Looking at available literature in both industry and academia, a graph is the most efficient and natural way to model workflows • While not explicit, the existing system of chained tasks does assume dependencies best described in a graph, like in this simplified example illustrating a chain of tasks:
The Graph Model (2) Handle more complex cases of workflow, if needed • Can start with implementing the “chain”, “bag” and “bag of chains” topologies • Liberates the designers of the workflow from severe limitations of the current “spreadsheet” model – can do not just what’s immediately possible, but what’s best for delivery of physics results (don’t anticipate drastic changes soon, but something to prepare for) • In the following, we will generally use the concept of a task as a group of jobs using one or more datasets as input, and one or more datasets as output:
The Graph Model (3) Crucial features of the DEFT Meta-Task Model • In accordance with operational practice in ATLAS, the dependencies between two adjacent tasks in the chain, which we model as nodes of a graph, are best conceptualized as datasets, which then become the edges of the graph • In order to handle “bag” and other complex topologies, we introduce “pseudo-tasks” representing the entry and exit points. It is actually an established technique in handling workflows.
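A minimal sketch of this model, using only the Python standard library (Python 3.9+ for `graphlib`), is given below. Tasks are nodes, each edge carries the name of the dataset that links two tasks, and `entry`/`exit` are the pseudo-tasks. All task and dataset names are invented for the example and are not taken from the production system.

```python
from graphlib import TopologicalSorter

# Each edge is (producer task, consumer task, dataset carried by the edge).
# "entry" and "exit" are pseudo-tasks; all names are illustrative only.
chain = [
    ("entry", "evgen", "request.input"),
    ("evgen", "simul", "dataset.EVNT"),
    ("simul", "recon", "dataset.HITS"),
    ("recon", "exit", "dataset.AOD"),
]


def predecessors(edges):
    """Map each task to the set of tasks it depends on."""
    deps = {}
    for src, dst, _dataset in edges:
        deps.setdefault(src, set())
        deps.setdefault(dst, set()).add(src)
    return deps


def execution_order(edges):
    """Return one valid processing order of the tasks (a topological sort)."""
    return list(TopologicalSorter(predecessors(edges)).static_order())


def make_bag(tasks):
    """Build a 'bag': independent tasks hanging between entry and exit."""
    edges = []
    for task in tasks:
        edges.append(("entry", task, task + ".in"))
        edges.append((task, "exit", task + ".out"))
    return edges


chain_order = execution_order(chain)
bag_order = execution_order(make_bag(["merge_a", "merge_b"]))
```

The pseudo-tasks give a bag a single entry and a single exit point, so a chain and a bag can be traversed by the same code.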
The Petri Net Model and the DEFT State Machine Why do we find the Petri Net Model useful? • The Petri Model adequately represents the transitions between various states of the workflow graph nodes based on the state of the incoming edges • The Petri graph should not be confused with the workflow graph, it represents “places” and “transitions” • It allows the developer to conceptualize DEFT logic as a state machine, which traverses the workflow graph and toggles the state of tasks based on “tokens”, which are • State of datasets serving as inputs to the task • Control inputs set by the human operator
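The token-driven behaviour described above can be illustrated with a small state function. This is a sketch under assumptions: the state names (`waiting`, `ready`, `held`, `done`) and the two kinds of token (dataset availability and an operator hold flag) are chosen for the example and do not reproduce the actual DEFT state set.

```python
def next_state(state, inputs_ready, on_hold):
    """Return the next state of a task given its tokens.

    inputs_ready: True when every input dataset is available (dataset token).
    on_hold:      True when the operator has paused the task (control token).
    """
    if state == "done":
        return "done"        # finished tasks are never toggled again
    if on_hold:
        return "held"        # the operator's control input takes precedence
    if inputs_ready:
        return "ready"       # all incoming edges carry a token
    return "waiting"


def sweep(states, inputs, available, held):
    """Apply next_state to every task in one pass over the workflow."""
    return {
        task: next_state(
            state,
            all(dataset in available for dataset in inputs.get(task, [])),
            task in held,
        )
        for task, state in states.items()
    }


states = {"simul": "waiting", "recon": "waiting"}
inputs = {"simul": ["EVNT"], "recon": ["HITS"]}

after = sweep(states, inputs, available={"EVNT"}, held=set())
paused = sweep(states, inputs, available={"EVNT"}, held={"simul"})
```

Each pass depends only on the current tokens, so the sweep can run periodically rather than in real time.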
PyUtilib (1) What are we looking for? • We are looking for a package that implements the workflow engine logic along the lines of the models described above and forms the core of DEfT. What is PyUtilib? • Having evaluated a few available packages, we consider PyUtilib (and in particular its Workflow component) as a prime choice for the implementation of the state machine in DEfT • Developed and actively maintained by personnel at Sandia National Laboratories. Link: • https://software.sandia.gov/trac/pyutilib/wiki/Documentation/PyUtilibOverview • Successfully used in a few projects done at Sandia • Based on component architecture and contains a lot of useful tools other than just Workflow Management • What do we gain? We avoid writing tons of boilerplate code to handle complex dependencies in the workflow, and benefit from the core application code being simple (for specific reasons explained in the slides to follow). Features of the Workflow Package in PyUtilib • PyUtilib uses component architecture to offer the developer a straightforward and intuitive way to express dependencies in a set of tasks by means of “connectors” • A dependency amounts to a simple assignment operator • Traversal of dependencies is fully automatic and is not exposed to the developer, saving effort. • Individual tasks in a workflow can be actual workflows themselves, lending almost any level of granularity to the system – if we wanted to implement individual PanDA job handling this way, within a task, we could. It’s a matter of design decision (not proposing this at this point).
PyUtilib (2) In the following diagram, each “Task” is a Python object, and its inputs and outputs are attributes of same.
PyUtilib (3)
# code example (for illustration purposes only)
import pyutilib.workflow

# define the Task class as appropriate
class TaskA(pyutilib.workflow.Task):
    def __init__(self, *args, **kwds):
        """Constructor."""
        pyutilib.workflow.Task.__init__(self, *args, **kwds)
        self.inputs.declare('x')
        self.inputs.declare('y')
        self.outputs.declare('z')

    def execute(self):
        """Compute the sum of the inputs."""
        self.z = self.x + self.y  # can be anything, of course

# application code: build a workflow consisting of the single task A
A = TaskA()
w = pyutilib.workflow.Workflow()
w.add(A)
print(w(x=1, y=3))  # prints 4
PyUtilib (4)
# code example (for illustration purposes only)
import pyutilib.workflow

# define another Task class (TaskA is defined on the previous slide)
class TaskB(pyutilib.workflow.Task):
    def __init__(self, *args, **kwds):
        """Constructor."""
        pyutilib.workflow.Task.__init__(self, *args, **kwds)
        self.inputs.declare('X')
        self.outputs.declare('Z')

    def execute(self):
        """Compute the value of the input times 2."""
        self.Z = 2 * self.X

# application code: establish dependency between tasks A and B:
A = TaskA()
B = TaskB()
A.inputs.x = B.outputs.Z  # the output Z of B feeds the input x of A
w = pyutilib.workflow.Workflow()
w.add(A)
print(w(X=1, y=3))  # prints 5, i.e. 2*1 + 3
Meta-Task: the Language How do we represent and document the objects created according to the Graph Model, in human-readable format? • The need to represent Meta-Tasks and their components in a way amenable to being read, edited and archived by humans was realized early on in the project • “Domain Specific Language” was mentioned as one of the development parameters • Do we need to build it from scratch? Probably not. It’s the model! • Since we consider the Graph Model as the optimal way to represent the workflow in its various states, it is a reasonable approach to try and identify a natural way to represent the graph • This leads to the realization that there are already standard languages and schemas that do exactly that.
Meta-Task: GraphML and NetworkX Choice of Schema • There is an obvious advantage in choosing a schema that is standardized, enjoys support and has parsers already written to handle its specifics. • GraphML appears to be very simple and human-readable, and enjoys parser support in many existing visualization and analysis software products • From personal experience, JSON is less well suited for this application as it’s too terse and the data representation is unintuitive. Still an option when needed, for example for AJAX monitoring applications and similar use cases. • GraphML allows us to standardize on the workflow description, visualization, editing, documentation and versioning with essentially zero effort. NetworkX • “NetworkX is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.” • While its functionality is quite rich, and allows all sorts of graph analysis and exploration, the minimal subset of methods is quite easy to learn and use immediately • Reads GraphML, JSON, and other documents and creates an in-memory model of the graph automatically • Likewise, serializes graphs into a variety of formats, like GraphML, JSON etc • Visualization can be implemented by a few supported Python packages which need to be installed separately, such as matplotlib etc.
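To show what a Meta-Task could look like as a GraphML document, the sketch below writes and re-reads a small graph using only the standard library's `xml.etree.ElementTree`. In DEFT this job is delegated to NetworkX, so the code is a stand-in for illustration; the `dataset` edge attribute, the graph id and the task names are assumptions of this example, not the project's actual schema.

```python
import xml.etree.ElementTree as ET

NS = "http://graphml.graphdrawing.org/xmlns"  # the GraphML namespace


def tag(name):
    """Return a namespace-qualified GraphML tag name."""
    return "{" + NS + "}" + name


def to_graphml(tasks, edges):
    """Serialize tasks (nodes) and dataset-labelled edges to a GraphML string."""
    ET.register_namespace("", NS)  # emit GraphML as the default namespace
    root = ET.Element(tag("graphml"))
    # Declare one string attribute on edges, holding the dataset name.
    ET.SubElement(root, tag("key"), {
        "id": "d0", "for": "edge",
        "attr.name": "dataset", "attr.type": "string",
    })
    graph = ET.SubElement(root, tag("graph"),
                          {"id": "metatask", "edgedefault": "directed"})
    for task in tasks:
        ET.SubElement(graph, tag("node"), {"id": task})
    for src, dst, dataset in edges:
        edge = ET.SubElement(graph, tag("edge"), {"source": src, "target": dst})
        ET.SubElement(edge, tag("data"), {"key": "d0"}).text = dataset
    return ET.tostring(root, encoding="unicode")


def from_graphml(text):
    """Parse a GraphML string back into (tasks, edges)."""
    root = ET.fromstring(text)
    tasks = [node.get("id") for node in root.iter(tag("node"))]
    edges = [
        (edge.get("source"), edge.get("target"), edge.find(tag("data")).text)
        for edge in root.iter(tag("edge"))
    ]
    return tasks, edges


tasks = ["evgen", "simul", "recon"]
edges = [("evgen", "simul", "EVNT"), ("simul", "recon", "HITS")]
document = to_graphml(tasks, edges)
restored = from_graphml(document)
```

The resulting document is plain XML that can be edited by hand, kept under version control, or opened in any GraphML-aware tool.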
Meta-Task: Persisting Graphs in RDBMS How to persist graphs? • With the current choice of the format/language, we obviously already have the option to persist workflow objects as XML documents and optionally in XML databases (nice to have), and more optimally in specialized Graph Databases, with some development effort • At the same time, integration with PanDA/JEDI at this juncture pre-supposes the use of RDBMS (practically speaking, Oracle) • Persisting graphs in RDBMS has been addressed before; we revisited existing approaches and chose the “Adjacency Table” approach as the most scalable and easy to implement • See TWiki page at: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/TaskModel#Graphs_in_RDBMS • Two separate tables are kept, for the nodes and for the edges. The nodes have unique IDs, the edges contain pairs of IDs in separate columns. This allows for a straightforward representation of almost any DAG in RDBMS Note on Graph Databases • There are multiple advantages in matching the storage model to the model of the object that’s being stored. In that regard, Graph Databases could be the ideal solution for future versions of DEFT and other components of the Production System • The power of Graph Databases has recently been realized by major organizations to create new, previously unattainable levels of capability • As already mentioned, we won’t pursue this in the current design and implementation cycle, but it would be wise to maintain ongoing R&D in that direction along with other noSQL technologies.
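The two-table layout can be sketched with an in-memory SQLite database. SQLite is used here only so that the example is self-contained (the slides assume Oracle), and the table and column names are invented for illustration rather than taken from the DEFT schema.

```python
import sqlite3


def create_schema(conn):
    """Create one table for nodes (tasks) and one for edges (datasets)."""
    conn.executescript("""
        CREATE TABLE task (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE dataset_edge (
            source_id INTEGER NOT NULL REFERENCES task(id),
            target_id INTEGER NOT NULL REFERENCES task(id),
            dataset   TEXT NOT NULL
        );
    """)


def successors(conn, task_id):
    """Return the names of the tasks that consume the output of task_id."""
    rows = conn.execute(
        "SELECT t.name FROM dataset_edge e "
        "JOIN task t ON t.id = e.target_id "
        "WHERE e.source_id = ? ORDER BY t.id",
        (task_id,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany(
    "INSERT INTO task (id, name) VALUES (?, ?)",
    [(1, "evgen"), (2, "simul"), (3, "recon")],
)
conn.executemany(
    "INSERT INTO dataset_edge (source_id, target_id, dataset) VALUES (?, ?, ?)",
    [(1, 2, "EVNT"), (2, 3, "HITS")],
)

after_evgen = successors(conn, 1)
after_recon = successors(conn, 3)
```

Adding a task or a dependency is a single row insert, and walking the graph one step is a single join, which is why the approach is easy to implement on top of an existing RDBMS.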
Bringing it all together In DEFT, we combine the following elements • Graph Model to represent the Workflow, with tasks being the nodes of the graph and datasets being the edges • NetworkX package to manage the in-memory instances of the Workflow Graphs • GraphML language (XML schema) as the standard way to represent graphs in human readable form, which can be easily imported into NetworkX • The Workflow Package (a part of PyUtilib), which allows us to implement the rules governing state transitions of the tasks based on various conditions. • An array of possible visualization, graphing and editing solutions that can be used due to the standard GraphML format: • GePhi • NetworkX + matplotlib • Straightforward integration (via AJAX/JSON/XML) with advanced Javascript libraries and jQuery add-ons such as jsPlumb, WireIt, Raphael etc.
DEFT Prototype (1) Software prototype • DEFT exists as a functioning, proof-of-integration prototype (CLI utility) • Integration of NetworkX, PyUtilib and Oracle DB schemas • Capability to import and export workflows in GraphML format, as well as to persist data in RDBMS, and access and modify data transparently across these containers [Diagram: DEFT takes a template or a GraphML document as input, reads and updates the Oracle DB as the state of the Meta-Task changes, and produces a GraphML document as output.]
DEFT Prototype (2) Software prototype • Capability to support workflows described by DAG of any complexity, not just “chains” and “bags” • Straightforward cloning and copying of tasks • Possibility to interactively edit workflows in visual editors
Plans DEFT/JEDI Integration • With the DEFT prototype operational, the project is at a stage where we need to actively work on JEDI integration – this is the next immediate step. Important peripheral components • Need to create a module for automated dataset name generation. Right now this logic appears to be spread across various components of ProdSys I. Task visualization, editing and monitoring • Basic visualization tools are already available, such as GePhi and matplotlib add-on to NetworkX (cumbersome installation though). Editing is available in GePhi complete with a GUI interface, and of course GraphML files can also be edited using any text editor. • For more polished look and more dynamic and better user experience, we can develop a browser-based frontend utilizing jsPlumb, WireIt, Raphael etc – but we need to budget manpower for that, since the considerable power of these graphics systems comes with significant complexity of logic and API
Backup slides: examples of workflow visualization and editing in GePhi
Backup slides: examples of Javascript tools to aid in building Meta-Task GUI in DEFT