Investigates software composition through a composition-only language, a protocol for large, distributed megamodules, and a supporting system. Covers cost estimation for scheduling invocations and for selecting megamodules, and monitoring. See http://www-db.stanford.edu/CHAIMS/.
CHAIMS
Cost Estimation in CPAM, an Access Protocol for Remote and Autonomous Services
Prof. Gio Wiederhold, Dr. Dorothea Beringer, several Ph.D. and master students
Stanford University
http://www-db.stanford.edu/CHAIMS/
Objective: Investigate new approaches to large-scale software composition.
Approach: Develop and validate a composition-only language, a protocol for large, distributed, heterogeneous and autonomous megamodules, and a supporting system.

Composition of Services...
• versus composition of Components
  • reusing small components via copy/paste or shared libraries locally installed
  • large distributed components within same “domain” as composition, e.g. within one bank or airline
• versus composition and integration of Data
  • data-warehouses
  • wrapping data available on web
CPAM/CHAIMS:
» composing processes
» composing services of remote, autonomous, large megamodules

Assumption
Composition of services that are remote, autonomous, heterogeneous, computation intensive ==> specific requirements
[Figure: domain expert; client workstation (client, IO modules); CPAM and distribution systems; data and control flows; servers at provider's sites hosting megamodules a, b, c, d, e]

Challenge: Autonomy
Megamodules are autonomous:
• responsibility for maintenance is with provider
• client has no direct control over availability of services and resources provided
• heterogeneity concerning implementation languages, server platforms, distribution systems, and interface definitions (ontologies, ==> SKC project)
• yet client might be able to choose from several providers
[Figure: Megamodule A at site Stanford provided by InfoLab; Megamodule B at site SLAC provided by Admin; Megamodule C at site NewCom provided by BestCalc]

Challenge: Heavy-weight Services
Services are not free for a client:
• execution time of a service
• transfer time and fee for data
• fees for services
What we would like:
==> monitoring progress of a service
==> possibility to choose cheapest or fastest service
==> exploiting parallelism among services

Parallelism, Invocation Scheduling
Distributed services ==> potential parallel execution
• focus is on parallelism of remote (long) services, not on parallelism of local operations
[Figure: two timelines of invocations i1 to i5 and extractions e1 to e5 for services a to e; axis: time; arrows: dataflow dependency]

Cost Estimation for Scheduling (1)
Scheduling of invocations:
• defer shorter invocations so results will be extractable from all services at about the same time
[Figure: timeline of invocations i1 to i5 and extractions e1 to e5 for services a to e, annotated c (> a+b) and d (< c); axis: time]
==> ESTIMATE (methodname), returns estimated execution time, fee, and data volume of results

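The deferral rule on this slide can be sketched in a few lines. This is a minimal illustration and not CHAIMS code: the `Estimate` record, its field names and units, and the example numbers are all assumptions. It only covers independent invocations that may start at the same time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """Hypothetical container for the three values ESTIMATE returns."""
    time_s: float      # estimated execution time (seconds assumed)
    fee: float         # estimated fee (currency unit assumed)
    data_volume: int   # estimated size of the results (bytes assumed)


def deferred_start_offsets(estimates):
    """Delay each invocation so that all of them finish at about the same time.

    The longest invocation starts immediately; every shorter one is deferred
    by the difference between the longest estimate and its own estimate.
    """
    longest = max(e.time_s for e in estimates.values())
    return {name: longest - e.time_s for name, e in estimates.items()}


# Made-up estimates for two independent services, with d shorter than c.
estimates = {
    "c": Estimate(time_s=120.0, fee=5.0, data_volume=10_000),
    "d": Estimate(time_s=45.0, fee=2.0, data_volume=2_000),
}
offsets = deferred_start_offsets(estimates)
print(offsets)  # {'c': 0.0, 'd': 75.0}
```

With these numbers d is started 75 seconds after c, so both results become extractable together.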
Cost Estimation for Scheduling (2)
Scheduling of invocations:
• control-flow dependency (conditional block), yet no data-flow dependency
• control flow dependency: results of c only needed under certain conditions, determined by b
[Figure: timeline of invocations i1 to i5 and extractions e1 to e5 for services a to e, with c inside a conditional block; axis: time; arrows: dataflow dependency]

Cost Estimation for Scheduling (3)
==> cost-function based on cost information from a, b, and c
[Figure: two alternative timelines of invocations i1 to i5 and extractions e1 to e5 for services a to e; axis: time. One timeline is labeled "Risk: waste of resources and money", the other "Risk: waste of time"]

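One possible shape for such a cost function is sketched below. The slides do not give a formula, so everything here is an assumption: the probability that c will be needed, the monetary value placed on a second of waiting, and the rule itself. The sketch presumes that starting c before b has decided is the option that risks wasted fees, and that waiting for b is the option that risks wasted time.

```python
def should_invoke_speculatively(fee_c, time_b, time_c, p_needed, value_per_second):
    """Decide whether to start c in parallel with b instead of waiting for b.

    If c turns out to be needed, running it in parallel takes max(time_b, time_c)
    instead of time_b + time_c, which saves min(time_b, time_c).
    If c turns out not to be needed, its fee is wasted.
    All parameters are hypothetical inputs; fee and times would come from ESTIMATE.
    """
    time_saved = min(time_b, time_c)
    expected_benefit = p_needed * time_saved * value_per_second
    expected_waste = (1.0 - p_needed) * fee_c
    return expected_benefit > expected_waste


# c is likely to be needed: paying its fee up front is worth the time saved.
likely = should_invoke_speculatively(
    fee_c=10.0, time_b=60.0, time_c=100.0, p_needed=0.5, value_per_second=0.5)

# c is rarely needed: better to wait for the result of b.
unlikely = should_invoke_speculatively(
    fee_c=10.0, time_b=60.0, time_c=100.0, p_needed=0.1, value_per_second=0.5)

print(likely, unlikely)  # True False
```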
Cost Estimation for Selection
Choosing megamodules:
• CHAIMS client
• megamodule RouteChoose (methods: Optimum, …)
• megamodule BestPick (methods: BestRoute, ...)
Sequence over time:
• to RouteChoose: SETUP (); SETPARAM (attributes essential for cost estimation); (f1=fee, t1=time) = ESTIMATE (“Optimum”)
• to BestPick: SETUP (); SETPARAM (attributes essential for cost estimation); (f2=fee, t2=time) = ESTIMATE (“BestRoute”)
• calculate cost function ==> today BestPick is better
• BestPick.INVOKE (“BestRoute”, …)

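The selection step can be sketched as follows. The weighted sum of fee and time is only one possible cost function and is an assumption, as are the fee and time values, which stand in for what ESTIMATE would return at run time.

```python
def weighted_cost(fee, time_s, time_weight):
    """Hypothetical cost function: fee plus time converted to the fee's unit."""
    return fee + time_weight * time_s


def select_cheapest(candidates, time_weight):
    """Return the (megamodule, method) pair with the lowest weighted cost."""
    return min(
        candidates,
        key=lambda key: weighted_cost(*candidates[key], time_weight),
    )


# (fee, time) pairs as they might come back from ESTIMATE today; values are made up.
candidates = {
    ("RouteChoose", "Optimum"): (4.0, 90.0),   # f1, t1
    ("BestPick", "BestRoute"): (6.0, 30.0),    # f2, t2
}

chosen = select_cheapest(candidates, time_weight=0.1)
print(chosen)  # ('BestPick', 'BestRoute')
```

Because the estimates are requested at run time, the same client code may pick RouteChoose on another day, or when time matters less (for example with `time_weight=0.0`).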
Why Run-time Cost Estimation?
Static cost information in repository or catalog:
• upper bound? average?
• fluctuation in resources and load of server
• dependence on specific input data
Run-time cost information from megamodule:
• reflects actual load and resources
• takes into account autonomy, no out-of-date cost information in a central repository, no daily updates
• easily fits into CPAM and the concept of having several primitives for remote execution
• yet: requires in megamodule either statistics of costs over typical loads and input data gained by previous invocations, or special functionality in (wrapped) server software

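The first option in the last bullet, statistics gained from previous invocations, could be as simple as a running mean kept inside the megamodule wrapper. The class and method names below are hypothetical, and a real wrapper would presumably also condition on load and input data, which this sketch ignores.

```python
class RunningCostStats:
    """Hypothetical per-method statistics a megamodule wrapper might keep."""

    def __init__(self):
        self.count = 0
        self.mean_time = 0.0

    def record(self, observed_time):
        """Fold one observed execution time into the running mean."""
        self.count += 1
        self.mean_time += (observed_time - self.mean_time) / self.count

    def estimate(self):
        """Return the mean of past execution times, or None without history."""
        return self.mean_time if self.count else None


stats = RunningCostStats()
for seconds in (10.0, 20.0, 30.0):
    stats.record(seconds)
print(stats.estimate())  # 20.0
```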
Monitoring: Incremental Extraction
CPAM primitive EXTRACT:
• Partial extract: all results are ready, but results are extracted step by step (always available)
• Partial extract: only some of the results are ready and can be extracted before rest of results is ready
• Progressive extract: preliminary version of a result is ready and can be extracted
EXTRACT: takes list of result attributes to be returned, returns values of these attributes

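A small mock shows how partial and progressive extraction might look from the client side. `MockInvocation`, `publish`, and the attribute names are invented for illustration. The slides do not say what EXTRACT returns for an attribute that is not ready yet, so this sketch assumes it is simply left out of the reply.

```python
class MockInvocation:
    """Hypothetical stand-in for an invoked method whose results arrive over time."""

    def __init__(self):
        self._ready = {}

    def publish(self, name, value):
        """Simulate the megamodule producing or refining one result attribute."""
        self._ready[name] = value

    def extract(self, attribute_names):
        """Return the requested attributes that are ready so far (assumed behavior)."""
        return {n: self._ready[n] for n in attribute_names if n in self._ready}


inv = MockInvocation()

# Partial extract: only "route" is ready, "price" is still being computed.
inv.publish("route", ["A", "B"])
first = inv.extract(["route", "price"])

# Progressive extract: "route" has been refined, and "price" is now ready too.
inv.publish("price", 120.0)
inv.publish("route", ["A", "C", "B"])
second = inv.extract(["route", "price"])

print(first)   # {'route': ['A', 'B']}
print(second)  # {'route': ['A', 'C', 'B'], 'price': 120.0}
```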
Monitoring: Progress Information
CPAM primitive EXAMINE:
• EXAMINE reports the completion status of:
  • an invocation: DONE, NOT_DONE, PARTIAL, ERROR
• EXAMINE reports the progress of:
  • a result attribute: returns current accuracy
  • an invocation: returns current progress
Progress information is optional, not provided by every megamodule.
==> partial and progressive extraction
==> rescheduling of invocations, stopping slow invocations
==> getting preliminary results which influence program flow (e.g. invocation parameters)

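The following sketch shows a client polling EXAMINE and giving up on a slow invocation. The four status names come from the slide; the mock class, the progress value as a fraction between 0 and 1, and the polling budget are assumptions made for the example.

```python
from enum import Enum


class Status(Enum):
    DONE = "DONE"
    NOT_DONE = "NOT_DONE"
    PARTIAL = "PARTIAL"
    ERROR = "ERROR"


class MockInvocation:
    """Hypothetical invocation; each examine() call advances simulated work by one step."""

    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.steps_done = 0

    def examine(self):
        """Return (status, progress), with progress as a fraction (assumed format)."""
        if self.steps_done < self.total_steps:
            self.steps_done += 1
        progress = self.steps_done / self.total_steps
        done = self.steps_done == self.total_steps
        return (Status.DONE if done else Status.NOT_DONE), progress


def poll_until_done(invocation, max_polls):
    """Poll EXAMINE up to max_polls times; return (finished, polls_used).

    When finished is False, a client would typically TERMINATE the invocation
    or reschedule the work on another megamodule.
    """
    for polls in range(1, max_polls + 1):
        status, _progress = invocation.examine()
        if status is Status.DONE:
            return True, polls
    return False, max_polls


fast = poll_until_done(MockInvocation(total_steps=3), max_polls=10)
slow = poll_until_done(MockInvocation(total_steps=50), max_polls=10)
print(fast, slow)  # (True, 3) (False, 10)
```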
Primitives in CPAM
Pre-invocation:
  SETUP: set up a connection to a megamodule
  SET-, GETPARAM: preset / get attributes in a megamodule
  ESTIMATE: get cost estimation for optimization
Invocation and result gathering:
  INVOKE: start a specific method
  EXAMINE: test status and progress of an invoked method
  EXTRACT: extract results from an invoked method
Termination:
  TERMINATE: terminate a method invocation
  TERMINATEALL: terminate the connection to a megamodule
All primitives are procedure calls ==> asynchrony in method invocation

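To show how the primitives fit together, here is a local mock that a client drives through the full sequence. It is not the CPAM wire protocol: the class, the Python method signatures, the thread pool standing in for a remote server, and the toy "add" method are all assumptions. The point is only that INVOKE returns at once and results are fetched later.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class MockMegamodule:
    """Hypothetical in-process stand-in for a remote megamodule."""

    def __init__(self):
        self._params = {}
        self._pool = None
        self._calls = {}
        self._next_id = 0

    def setup(self):
        self._pool = ThreadPoolExecutor(max_workers=4)

    def setparam(self, **attributes):
        self._params.update(attributes)

    def getparam(self, name):
        return self._params[name]

    def estimate(self, method_name):
        # Fixed, made-up numbers; a real megamodule would reflect current load.
        return {"time": 0.05, "fee": 1.0, "datavolume": 8}

    def invoke(self, method_name, **args):
        """Start the method in the background and return a call id immediately."""
        call_id = self._next_id
        self._next_id += 1
        merged = {**self._params, **args}
        self._calls[call_id] = self._pool.submit(self._run, method_name, merged)
        return call_id

    def _run(self, method_name, args):
        time.sleep(0.05)  # simulate a long-running remote computation
        return {"sum": args["x"] + args["y"]}

    def examine(self, call_id):
        return "DONE" if self._calls[call_id].done() else "NOT_DONE"

    def extract(self, call_id, attribute_names):
        results = self._calls[call_id].result()
        return {name: results[name] for name in attribute_names}

    def terminate(self, call_id):
        self._calls.pop(call_id).cancel()

    def terminateall(self):
        self._calls.clear()
        self._pool.shutdown(wait=True)


mm = MockMegamodule()
mm.setup()
mm.setparam(x=2)                      # preset an attribute before invocation
cost = mm.estimate("add")             # ask for cost before committing
call = mm.invoke("add", y=3)          # returns immediately
while mm.examine(call) != "DONE":     # client is free to do other work here
    time.sleep(0.01)
result = mm.extract(call, ["sum"])
mm.terminate(call)
mm.terminateall()
print(result)  # {'sum': 5}
```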
CHAIMS Architecture
[Figure: CHAIMS architecture, client side and server side. Labels: Composer; Megamodule Provider; CLAM program; CHAIMS Compiler; CHAIMS Repository; Wrapper Templates; Client Side Run Time; megamodules a to e; CPAM protocol; Distribution Systems (CORBA, RMI…). Arrow labels: "writes", "provides megamodules", "adds information to", "generates"]

CPAM and WfMS
• Composing services of autonomous, computation-intensive servers
• Issues like cost estimation are also important for WfMS
Managing workflow across organizations:
• computation-intensive, autonomous services
• run-time cost estimation, progress monitoring