This paper discusses a graphical modelling environment for Grid workflows, focusing on workflow automation and code generation using Java CoG Kit. It covers related work, domain-specific modelling, model interpreter, limitations, and future work.
A Graphical Modeling Environment for the Generation of Workflows for the Globus Toolkit Francisco Hernandez, Purushotham Bangalore, Jeff Gray, and Kevin Reilly Department of Computer and Information Sciences University of Alabama at Birmingham Birmingham, AL, USA
Overview • Provide an abstract, high-level layer for modeling Grid workflows. • Automate the specification of Grid workflows. • Generate Globus-specific code from the graphical models with the help of the Java CoG Kit.
Outline • Related Work • Domain-Specific Modeling • Meta-Model • Modeling Process • Interpreter • Limitations • Future Work • Conclusions
Related Work (1) • The idea of composing applications from reusable components is not new (e.g., Webflow, Unicore, DAGMan, Symphony, Triana). • Workflows have gained increased attention for composing a flow of tasks in a Grid environment (e.g., GridAnt).
Related Work (2) • Amin et al. [1] propose a technology- and architecture-independent abstraction layer that provides interoperability across multiple Grid implementations, resulting in an Open Grid Computing Environment (OGCE). • The concept is comparable to using meta-models that abstract the underlying Grid technologies, but it is realized at a lower level of abstraction. • [1] Amin, K., Hategan, M., von Laszewski, G., and Zulezec, N., “Abstracting the Grid,” Proceedings of the 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2004), 11-13 February 2004, La Coruña, Spain.
Domain-Specific Modeling (1) • Domain-specific modeling (DSM) is a technology that focuses on higher levels of abstraction in the problem space and avoids low-level details of the solution space, allowing a user to manipulate graphical models of the problem at hand. • A special type of generator called a model interpreter can translate the models into executable specifications used to automatically synthesize software. • GME is a domain-specific modeling environment that can be configured and adapted from meta-level specifications that describe the domain.
Domain-Specific Modeling (2) • DSM has been useful in automating different kinds of applications in which the environment is dynamic and tightly integrated with the physical environment, including: • embedded systems [1], • automotive manufacturing [2], • complex QoS applications [3]. • [1] Neema, S., Bapty, T., Gray, J., and Gokhale, A., “Generators for Synthesis of QoS Adaptation in Distributed Real-Time Embedded Systems,” First ACM SIGPLAN/SIGSOFT Conference on Generative Programming and Component Engineering (GPCE ’02), Springer-Verlag LNCS 2487, Pittsburgh, PA, October 6-8, 2002, pp. 236-251. • [2] Long, E., Misra, A., and Sztipanovits, J., “Increasing Productivity at Saturn,” IEEE Computer, August 1998, pp. 35-43. • [3] Bapty, T., Neema, S., and Gray, J., “Model-Integrated Computing For Composition of Complex QoS Applications Using The Generic Modeling Environment (GME),” OMG Workshop on Real-Time and Embedded Distributed Object Computing, Washington, DC, July 15-18, 2002.
Domain-Specific Modeling (3) [Figure: diagram with elements labeled General Meta-Meta-Model, Domain Meta-Model, Domain Models, Interpreter 1, Interpreter 2, and three Applications; relationships labeled Specific Instance, Specify, Construct, and Generate.]
Meta-Model (1) • Workflows describe the execution of complex applications built from individual application components. • The basis of the meta-model is the way in which a user specifies a sequence of tasks in an application’s workflow. [Figure: tasks labeled Execute a Job, Upload a File, Download a File.]
Meta-Model (2) • Experimental knowledge of the domain • Four aspects needed to define the meta-model: • Resources • Transfer end-points • Job specifications • Workflows
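The four aspects listed above can be pictured as plain data types. The following is only a minimal sketch in Java: every class and field name (Resource, TransferEndPoint, JobSpec, Workflow, and so on) is hypothetical and does not come from the actual GME meta-model or the Java CoG Kit. The assignment of cherokeeCompute as the target host and the use of "localhost" are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical data types mirroring the four meta-model aspects.
// None of these names come from the GME meta-model or the Java CoG Kit.
class MetaModelSketch {

    // Resource: a host the user is authorized to use.
    static final class Resource {
        final String hostName;
        Resource(String hostName) { this.hostName = hostName; }
    }

    // Transfer end-point: a file location on a resource.
    static final class TransferEndPoint {
        final Resource resource;
        final String filePath;
        TransferEndPoint(Resource resource, String filePath) {
            this.resource = resource;
            this.filePath = filePath;
        }
    }

    // Job specification: what to run, where, and on how many processors.
    static final class JobSpec {
        final String name;
        final Resource resource;
        final String executable;
        final int processCount;
        JobSpec(String name, Resource resource, String executable, int processCount) {
            this.name = name;
            this.resource = resource;
            this.executable = executable;
            this.processCount = processCount;
        }
    }

    // A workflow task is either a file transfer or a job execution.
    interface Task { String describe(); }

    static final class FileTransfer implements Task {
        final TransferEndPoint source;
        final TransferEndPoint destination;
        FileTransfer(TransferEndPoint source, TransferEndPoint destination) {
            this.source = source;
            this.destination = destination;
        }
        public String describe() {
            return "transfer " + source.filePath + " from " + source.resource.hostName
                    + " to " + destination.resource.hostName;
        }
    }

    static final class JobExecution implements Task {
        final JobSpec job;
        JobExecution(JobSpec job) { this.job = job; }
        public String describe() {
            return "run " + job.name + " on " + job.resource.hostName;
        }
    }

    // Workflow: an ordered sequence of tasks.
    static final class Workflow {
        final List<Task> tasks;
        Workflow(Task... tasks) {
            this.tasks = Collections.unmodifiableList(Arrays.asList(tasks));
        }
    }

    // Upload, execute, download: host roles here are assumed for illustration.
    static Workflow example() {
        Resource local = new Resource("localhost");
        Resource compute = new Resource("cherokeeCompute");
        Task upload = new FileTransfer(new TransferEndPoint(local, "inHMMFile.txt"),
                                       new TransferEndPoint(compute, "inHMMFile.txt"));
        Task run = new JobExecution(new JobSpec("HMM", compute, "java", 2));
        Task download = new FileTransfer(new TransferEndPoint(compute, "outHMMFile.txt"),
                                         new TransferEndPoint(local, "outHMMFile.txt"));
        return new Workflow(upload, run, download);
    }

    public static void main(String[] args) {
        for (Task task : example().tasks) {
            System.out.println(task.describe());
        }
    }
}
```

Keeping the workflow as an ordered list matches the sequential tasks described in the slides; hierarchical or parallel workflows would need a richer structure.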
Meta-Model (3) [Figure: meta-model diagrams labeled Resources and Workflows.]
Modeling Process (1) hernandf authorizes the use of the remote hosts (cherokeeData and cherokeeCompute). The location of the data file should be specified for each end-point in a file transfer. hernandf specifies the location of the user’s security credentials.
Modeling Process (2) The user initiates the execution of the application by first uploading the raw input file. The output file is finally downloaded to the local host. The generator creates an RSL string from the attributes specified by the user, in this case for the job HMM.
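To make the RSL generation step concrete, the following is a minimal sketch of how job attributes could be assembled into an RSL string. RslSketch and its methods are hypothetical helpers, not part of the Java CoG Kit and not the generator described in the slides. The attribute names (executable, directory, arguments, count, stdout, environment) are intended to follow GRAM RSL conventions and should be checked against the Globus documentation; the values are taken from the HMM example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch: assembles an RSL string from job attributes.
// RslSketch is a hypothetical helper, not Java CoG Kit API.
class RslSketch {

    // Emits "&(name=value)..." followed by an optional environment clause
    // of the form (environment=(VAR value)(VAR value)).
    static String buildRsl(Map<String, String> attributes, Map<String, String> environment) {
        StringBuilder rsl = new StringBuilder("&");
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            rsl.append('(').append(attribute.getKey())
               .append('=').append(attribute.getValue()).append(')');
        }
        if (!environment.isEmpty()) {
            rsl.append("(environment=");
            for (Map.Entry<String, String> variable : environment.entrySet()) {
                rsl.append('(').append(variable.getKey())
                   .append(' ').append(variable.getValue()).append(')');
            }
            rsl.append(')');
        }
        return rsl.toString();
    }

    // Attribute values for the HMM job, as they appear in the slides.
    static String hmmExample() {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("executable", "java");
        attributes.put("directory", "/usr/bin");
        attributes.put("arguments", "HMM inHMMFile.txt outHMMFile.txt");
        attributes.put("count", "2");
        attributes.put("stdout", "/lhome/hernandf/sttOutHMM.txt");

        Map<String, String> environment = new LinkedHashMap<>();
        environment.put("INPUT_DIR", "/lhome/hernandf");
        environment.put("OUTPUT_DIR", "/lhome/hernandf");
        return buildRsl(attributes, environment);
    }

    public static void main(String[] args) {
        System.out.println(hmmExample());
    }
}
```

A LinkedHashMap keeps the attributes in insertion order, so the generated string is stable from run to run. The sketch does no quoting or escaping, which a real generator would need for values containing parentheses or quotes.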
Interpreter • The interpreter parses the model and generates the control code that manages the application execution. • GME provides an API that traverses the internal representation of the models. A model interpreter uses this API to translate the models into an application that manages the execution of the workflow. [Figure: elements labeled Workflow Model, GME API, Translator API, Code Generation, and Grid Application. Model in GME domain: models, atoms, connections. Model in Globus domain: jobs, file transfers, etc.]
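The traversal-and-translate idea can be sketched without the real tooling. In the example below, ModelNode is a hypothetical stand-in for the objects a modeling tool would expose; it is not the GME API, and the emitted calls (upload, submitJob, download) are invented placeholders rather than Java CoG Kit methods. The choice of cherokeeCompute as the host for every task is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of a model interpreter over a hypothetical in-memory model.
// ModelNode is NOT the GME API; the emitted calls are placeholders.
class InterpreterSketch {

    static final class ModelNode {
        final String kind;   // "Upload", "Job", or "Download"
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        ModelNode next;      // connection to the following task, or null

        ModelNode(String kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    // Formats one line of generated code: fn("first", "second");
    static String call(String function, String first, String second) {
        return function + "(\"" + first + "\", \"" + second + "\");";
    }

    // Follows the connections from the first node and emits one line per task.
    static List<String> interpret(ModelNode first) {
        List<String> code = new ArrayList<>();
        for (ModelNode node = first; node != null; node = node.next) {
            String host = node.attributes.get("host");
            switch (node.kind) {
                case "Upload":
                    code.add(call("upload", node.attributes.get("file"), host));
                    break;
                case "Job":
                    code.add(call("submitJob", node.name, host));
                    break;
                case "Download":
                    code.add(call("download", node.attributes.get("file"), host));
                    break;
                default:
                    throw new IllegalArgumentException("Unknown node kind: " + node.kind);
            }
        }
        return code;
    }

    // Upload input, run HMM, download output (host assignment is assumed).
    static ModelNode exampleModel() {
        ModelNode upload = new ModelNode("Upload", "uploadInput");
        upload.attributes.put("file", "inHMMFile.txt");
        upload.attributes.put("host", "cherokeeCompute");

        ModelNode job = new ModelNode("Job", "HMM");
        job.attributes.put("host", "cherokeeCompute");

        ModelNode download = new ModelNode("Download", "downloadOutput");
        download.attributes.put("file", "outHMMFile.txt");
        download.attributes.put("host", "cherokeeCompute");

        upload.next = job;
        job.next = download;
        return upload;
    }

    public static void main(String[] args) {
        for (String line : interpret(exampleModel())) {
            System.out.println(line);
        }
    }
}
```

Because each task produces its own line of output, the generated program grows with the number of tasks in the model.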
Example of generated code
// create the RSL string
GlobusRSL hmmRSL = new GlobusRSL();

hmmRSL.setArg("HMM inHMMFile.txt outHMMFile.txt");
hmmRSL.setEnvironmentVariables("(INPUT_DIR=/lhome/hernandf) (OUTPUT_DIR=/lhome/hernandf)");
hmmRSL.setStdOut("/lhome/hernandf/sttOutHMM.txt");
hmmRSL.setNumProc(2);
hmmRSL.setDir("/usr/bin");
hmmRSL.setExec("java");
Limitations • Work on the modeling environment is in the initial phase. Currently, the environment can handle only a limited set of sequential tasks. • Scalability problems due to the generation of specific code for each workflow task. • Not all of the Globus capabilities are currently supported by the meta-model.
Future Work (1) • Address the scalability problem by generating a reusable workflow engine together with the appropriate configurations from the graphical models. • Modify the meta-model to support capabilities such as: • Hierarchical workflows • Task parallelism • Checkpointing and error recovery • Querying Grid information services
Future Work (2) • Generate different output specifications: • Grid Services • GridAnt • PyGlobus • New version of Java CoG Kit.
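One way to read the reusable-engine idea is that the generator would emit only a short configuration while a fixed engine interprets it. The sketch below assumes a made-up line format ("kind argument") and handlers that merely record what they would do; WorkflowEngineSketch, the format, and the handler names are all hypothetical and do not describe the planned design.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of a reusable engine driven by generated configuration.
// The "<kind> <argument>" line format and all names are hypothetical.
class WorkflowEngineSketch {

    // A handler performs one kind of task and reports what it did.
    interface TaskHandler {
        String run(String argument);
    }

    // Looks up the handler for a task kind; real handlers would call Grid
    // services, these only return a log message.
    static TaskHandler handlerFor(String kind) {
        switch (kind) {
            case "upload":
                return argument -> "uploaded " + argument;
            case "job":
                return argument -> "executed " + argument;
            case "download":
                return argument -> "downloaded " + argument;
            default:
                throw new IllegalArgumentException("Unknown task kind: " + kind);
        }
    }

    // Runs the configured tasks in order and returns the resulting log.
    static List<String> run(List<String> configuration) {
        List<String> log = new ArrayList<>();
        for (String line : configuration) {
            String[] parts = line.trim().split("\\s+", 2);
            String argument = parts.length > 1 ? parts[1] : "";
            log.add(handlerFor(parts[0]).run(argument));
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> configuration = new ArrayList<>();
        configuration.add("upload inHMMFile.txt");
        configuration.add("job HMM");
        configuration.add("download outHMMFile.txt");
        for (String entry : run(configuration)) {
            System.out.println(entry);
        }
    }
}
```

With this split, adding a task to a workflow changes only the configuration, not the engine code.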
Conclusions (1) • The benefits of using domain-specific modeling techniques for creating Grid workflows are: • Domain modeling removes the accidental complexities of creating workflows in a Grid by focusing on higher levels of abstraction in the problem space rather than the solution space. • Modeling tools and their interpreters make it faster to change workflow details. That is, it is easier to manipulate and change domain models than the associated code. • Model-driven techniques can generate multiple artifacts from the same model. Thus, different output representations can be generated from the same domain knowledge.
Conclusions (2) • Using these techniques, a user manipulates graphical models that represent the different components of the Globus Toolkit. From these models the user generates the corresponding Java code that manages the execution of the workflow. • This work is an attempt to abstract the Grid environment into a high-level layer such that the essence is not bound to a specific Grid environment.