
Data-Centric Approach for Workflows: Managing Data Conflicts and Complexity in Service-Oriented Workflows

This paper explores a data-centric approach for managing data conflicts and complexity in service-oriented workflows. It discusses the challenges faced during workflow orchestration and presents a solution that focuses on data flow modeling and integration. The paper also introduces the concept of a data-centric workflow and explores its advantages in terms of reusability, shared data models, and simplified integration.



Presentation Transcript


  1. A Data Centric Approach for Workflows • A. Akram, J. Kewley and R. Allan • CCLRC e-Science Centre, Daresbury Laboratory, UK

  2. Web Services • Web Services are a popular “connection technology” for implementing SOA • Web Services can be • Described using a service description language, i.e. WSDL • Published to a registry of services, i.e. UDDI • Discovered through standard mechanisms • Invoked through a declared API • Composed with other services

  3. Service Oriented Workflow • Typical workflows are based on the available Web Services and their operations • A workflow requires additional work to resolve data conflicts between services • The normal steps to script any complicated workflow are: • Discovering suitable Web Services • Parsing WSDLs • Extracting the data information from WSDLs • Data mapping to match the service requirements • Data transformation at each activity level • Resolving the namespace issues related to different data sets • The effort spent on such orchestration goes into resolving namespace issues, data ambiguity, and data transformation • Data-specific issues make workflows overwhelmingly large, complicated, and difficult to manage and maintain
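The "parse the WSDL and extract the data information" steps listed above can be sketched roughly as follows. This is only an illustration, not the authors' tooling: the WSDL fragment, the `urn:example-score` namespace and the `extract_elements` helper are all hypothetical, and a real WSDL would also need imports, complex types and message parts to be resolved.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed WSDL fragment used only for illustration.
WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             targetNamespace="urn:example-score">
  <types>
    <xsd:schema targetNamespace="urn:example-score">
      <xsd:element name="getScore" type="xsd:string"/>
      <xsd:element name="getScoreResponse" type="xsd:int"/>
    </xsd:schema>
  </types>
</definitions>"""

XSD = "http://www.w3.org/2001/XMLSchema"


def extract_elements(wsdl_text):
    """Return (namespace, element name, declared type) for each top-level schema element."""
    root = ET.fromstring(wsdl_text)
    found = []
    for schema in root.iter(f"{{{XSD}}}schema"):
        namespace = schema.get("targetNamespace")
        for element in schema.findall(f"{{{XSD}}}element"):
            found.append((namespace, element.get("name"), element.get("type")))
    return found


elements = extract_elements(WSDL)
for namespace, name, declared_type in elements:
    print(namespace, name, declared_type)
```

Every partner service contributes its own namespace and element names in this way, which is where the mapping and namespace work described on the slide comes from.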

  4. Service Oriented Workflow <process ……. xmlns:ns1="urn:ehtpx-process" xmlns:ns3="http://clyde.dl.ac.uk:8080/process/services/Score"> <assign name="AssignScoreGet"> <copy> <from variable="inputVariable" part="payload" query="/ns1:ProcessAdminElement"/> <to variable="InvokeScoreGet_InputVariable" part="arg1" query="/ns2:getScoreResponse/admin"/> </copy> </assign> </process>
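To make the cost of a single copy activity concrete, the sketch below mimics what the assign step on this slide does: it takes a value from an element in one namespace and writes it into a differently named element of another message. The `ns1` URI is taken from the slide; the `ns2` URI is an assumption (the slide does not show its declaration), and `copy_value` is a hypothetical helper, not part of any workflow engine.

```python
import xml.etree.ElementTree as ET

NS1 = "urn:ehtpx-process"   # declared on the slide
NS2 = "urn:example-score"   # assumption: the slide does not show the ns2 declaration

# Stand-ins for the workflow's input variable and the partner service's input message.
source = ET.fromstring(
    f'<ns1:ProcessAdminElement xmlns:ns1="{NS1}">alice</ns1:ProcessAdminElement>'
)
target = ET.fromstring(
    f'<ns2:getScoreResponse xmlns:ns2="{NS2}"><admin/></ns2:getScoreResponse>'
)


def copy_value(src_element, dst_root, dst_path):
    """Copy the text of src_element into the element found at dst_path under dst_root."""
    destination = dst_root.find(dst_path)
    if destination is None:
        raise LookupError(f"no element at {dst_path!r}")
    destination.text = src_element.text
    return dst_root


copy_value(source, target, "admin")
result = target.find("admin").text
print(result)
```

A scripted workflow needs one such mapping for every value passed between services that do not share a data model, so the script grows with the number of mismatched fields.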

  5. Service Oriented Workflow

  6. Data Centric Workflow • Data Centric Workflows are modified Data-Flow Diagrams (DFDs) • Workflows in the context of Web Services focus on the data flowing from one process to another • Data Flow Modelling is the process of identifying, modelling and documenting how data moves across the system • Data flow modelling examines: • Processes (activities that transform data from one form to another) • Data stores (the holding areas for data) • External entities (what sends data into a system or receives data from a system) • Data flows (routes by which data can flow) • Data modelling develops an accurate model of the business requirements, stakeholders, sub-processes and activities

  7. Data Centric Workflow • A Data Centric Workflow • defines data definitions in the context of the application • captures the intermediate data sets used by sub-processes during the lifecycle of the application • The services offered by different business partners or parties forming the supply chain follow the Data Flow Model • Web Services are developed in accordance with already negotiated and accepted data models • The role of any Web Service is pre-defined in the context of an application • Web Services from multiple vendors, partners and collaborators may have the same role in the application and can be replaceable • Current services may need to be re-engineered to share a common data model and provide predictable services • Services need to develop their mapping routines and interfaces in compliance with the Data Model • The application has its own metadata, which must be coordinated through a shared data vocabulary
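The idea that services with the same role become replaceable once they share an agreed data model can be illustrated with a small sketch. All names here (`ScoreRequest`, `ScoreResult`, `VendorA`, `VendorB`, `run_workflow`) are hypothetical and stand in for a negotiated model and two partner services; the scoring logic is a placeholder.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ScoreRequest:
    """Hypothetical agreed input type, shared by every partner service."""
    admin: str
    sample_id: str


@dataclass(frozen=True)
class ScoreResult:
    """Hypothetical agreed output type."""
    sample_id: str
    score: int


class ScoreService(Protocol):
    """The pre-defined role: any service offering this operation can fill it."""
    def score(self, request: ScoreRequest) -> ScoreResult: ...


class VendorA:
    def score(self, request: ScoreRequest) -> ScoreResult:
        # Placeholder logic standing in for a real partner service.
        return ScoreResult(request.sample_id, len(request.sample_id))


class VendorB:
    def score(self, request: ScoreRequest) -> ScoreResult:
        return ScoreResult(request.sample_id, 2 * len(request.sample_id))


def run_workflow(service: ScoreService, request: ScoreRequest) -> ScoreResult:
    # No per-vendor mapping or transformation step is needed here,
    # because both sides already use the agreed types.
    return service.score(request)


request = ScoreRequest(admin="alice", sample_id="s-01")
print(run_workflow(VendorA(), request))
print(run_workflow(VendorB(), request))
```

Swapping `VendorA` for `VendorB` changes nothing in `run_workflow`, which is the replaceability property the slide describes.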

  8. Data Centric Workflow • Business relationships and the coordination of activities are becoming complex, and businesses often demand simplicity in the integration of their services • Business users want common interfaces, business processes, application functionality, tools and services • Advantages of Data Centric Workflow: • Reusable Data Set • Agreed Data Model • Shared Data Vocabulary • Standard-compliant Data Models • Uniform View • Simple Integration • Single Source of Modification • Improved Performance • Separation of Roles • Data Binding and Validation • Stability

  9. Limitations of Web Services • Web Services lack the notion of: • State • Stateful interactions • Resource lifecycle management • Notification of state changes • Support for sharing and coordinated use of diverse resources

  10. Web Services and State • Stateful Entities Exist • Data in a purchase order • Schema for a message exchange • Current usage agreement for resources • Metrics associated with work load on a server • No WS Standards for State Management • Each system does it in an “idiosyncratic way” • Integration impediment • Missing Component • Formalize a mechanism to represent “state”

  11. Web Services Resource Framework • The Web Services Resource Framework (WS-RF) is built on the adopted Web Services architecture to address the limitations of Web Services • WS-RF comprises four inter-related specifications: • WS-ResourceProperties defines how WS-Resources are described by XML documents that can be queried and modified; • WS-ResourceLifetime defines mechanisms for destroying WS-Resources; • WS-ServiceGroup describes how collections of Web Services can be represented and managed; • WS-BaseFaults defines a standard exception reporting format. • WS-RF depends on two supporting specifications: • WS-Addressing • WS-Notification

  12. WS-Resources • A Resource: • A specific set of state data expressible as an XML document • This is not typically all of the resource’s state! • Has a well-defined identity and lifecycle • Singleton resource may not have any unique identifier • Known to, and acted upon, by one or more Web services. • Many Possible Instances • Files, Database tables, EJB Entities, XML documents, Compositions of multiple data sources, Virtualized executions of applications, etc. • A WS-Resource has: • Identity: Can be uniquely identified/referenced • Lifetime: Often created & destroyed by clients • State: Part of the state can be projected as XML • Type: Its Web service interface
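The four aspects listed above (identity, lifetime, state, type) can be pictured with a minimal in-memory sketch. This is not a WS-RF implementation: the `Resource` class, its attribute names and the `ResourceProperties` root element are illustrative assumptions. It only shows that part of the state is projected as an XML document while the rest stays private.

```python
import uuid
import xml.etree.ElementTree as ET


class Resource:
    """Illustrative stand-in for a WS-Resource (not a WS-RF implementation)."""

    def __init__(self, properties, internal=None):
        self.identity = str(uuid.uuid4())    # identity: unique reference
        self.properties = dict(properties)   # state that is projected as XML
        self._internal = internal            # state that is NOT exposed
        self.destroyed = False               # lifetime flag

    def to_xml(self):
        """Project the public part of the state as an XML document."""
        root = ET.Element("ResourceProperties")
        for name, value in self.properties.items():
            ET.SubElement(root, name).text = str(value)
        return ET.tostring(root, encoding="unicode")

    def destroy(self):
        """End the resource's lifetime, as a client request might."""
        self.destroyed = True


account = Resource({"balance": 10}, internal="audit trail")
print(account.to_xml())
account.destroy()
```

The "type" of the resource corresponds to the interface of the Web Service that acts on it, which this sketch leaves out.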

  13. WS-Resources

  14. WS-Resource Sharing • WS-Resources are not bound to a single Web Service • Multiple Web Services can manage and monitor the same WS-Resource instance • WS-Resources are not confined to a single organization • Multiple organizations may work together on the same WS-Resource, leading to the concept of collaboration • Different Web Services can have distinct perspectives of a single WS-Resource • Dynamically generated WS-Resource EPRs can be: • Discovered, • Inspected and • Monitored via dedicated Web Services • The unique identity of a WS-Resource instance (its EPR) can be passed between partner processes and organizations, which: • Results in minimum network overhead • Avoids issues of stale information • Improves security options

  15. WS-Resource Sharing

  16. Managing Multiple WS-Resources • In EA, WS-Resources related to different entities can be very similar • Different types of clients, e.g. normal or super user • Different types of bank accounts, e.g. current or savings account • Different but similar types of categories • Different products in the same category • Hierarchical nature of entities • WS-Resources of a similar nature mostly have operations of a similar nature • Similar operations on different WS-Resources can be managed effectively with a single Instance Service • A Web Service managing multiple WS-Resources could be deployed as a: • Gatekeeper Service • Monitoring Service • Auditing Service

  17. WS-Resource Referencing • WS-Resources are composed of Resource Properties • Resource Properties reflect the state • Resource Properties can be references to other WS-Resources • Referencing other WS-Resources: • Defines the inter-dependency of the WS-Resources • Eliminates complicated business logic in the instance service • A WS-Resource can depend on the state of other WS-Resources for: • Query • Modification

  18. WS-Resource Referencing

  19. Implied Resource Pattern • The Implied Resource pattern has two services: • Factory Service: • Instantiates the resources • Returns the EndpointReference • Instance Service: • Works only on existing resources • Uses the EndpointReference to access and manipulate the resource • Direct instantiation of resources is prohibited • A singleton resource may not have any Factory service
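The division of labour between the two services can be sketched as follows. This is a plain in-memory illustration under stated assumptions, not WS-RF or any toolkit's API: `FactoryService`, `InstanceService`, the shared `store` dictionary and the `urn:example:resource:` reference format are all hypothetical.

```python
import uuid


class FactoryService:
    """Creates resources and hands back a reference (standing in for an EPR)."""

    def __init__(self, store):
        self._store = store

    def create(self, **properties):
        epr = f"urn:example:resource:{uuid.uuid4()}"  # hypothetical reference format
        self._store[epr] = dict(properties)
        return epr


class InstanceService:
    """Works only on existing resources, addressed by their reference."""

    def __init__(self, store):
        self._store = store

    def _resource(self, epr):
        try:
            return self._store[epr]
        except KeyError:
            raise LookupError(f"unknown resource: {epr}") from None

    def get(self, epr, name):
        return self._resource(epr)[name]

    def set(self, epr, name, value):
        self._resource(epr)[name] = value


store = {}  # shared backing store, hidden from clients
factory = FactoryService(store)
instance = InstanceService(store)

epr = factory.create(status="new")
instance.set(epr, "status", "running")
print(instance.get(epr, "status"))
```

Clients never construct a resource directly: they obtain a reference from the factory and pass it with every call to the instance service, which rejects references it does not know.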

  20. Implied Resource Pattern

  21. Proof of Concept Implementation • The Data Centric workflow was developed as a stateful Web Service • The data model was encapsulated as a WS-Resource • The workflow was instantiated through its Factory service • Factory services are independent and work in isolation • Resources are created only when required, for “late binding” • The most time-consuming activity was modelling the data related to the workflow • Application-specific data was modelled separately in various XML Schemas • To simulate a real-world problem, different namespaces were mixed in the complex data types by using <xsd:import> and <xsd:include> • Separating the data from the Web Services forced us to use Document/literal style Web Services • Development of the individual Web Services was straightforward • The business logic of the workflow was implemented in the Instance service of the workflow

  22. Proof of Concept Implementation • Minor changes in the Data Model and in the partner services do not require changes in the main instance service of the workflow: • Minor changes in the Data Model (restructuring of the data) had no impact on the workflow; • Severe changes in the Data Model may require changes in the partner services; • Interface changes of the partner services require changes in the workflow; • WSDL for the partner services was easy to manage and update; • Modelling the data before implementation solves issues related to automatic WSDL generation by various tools, e.g. Java2WSDL for Axis, wscompile for JAX-RPC and wsdl.exe for the .NET platform • The workflow calls partner services in a predefined sequence without any complicated data mapping and transformation logic; • Partner services can even call the next service in the sequence without involving the main workflow service

  23. Conclusion • Scripting workflows based on the operations of available services is impractical • Most developers avoid direct interaction with XML • Shortcomings of workflow scripts are more obvious when dealing with data type mapping and transformation • Data modelling and a top-down (WSDL first) approach are required to ensure a consistent solution for orchestration • A common platform-independent data type system facilitates: • Separation of roles, whereby the workflow can be developed in isolation from partner services • The Data Centric approach greatly increases productivity and simplifies development
