Dealing with “legacy” (heritage) codes in PSEs
Maozhen Li, Matthew Shields, Yan Huang, David W. Walker, Omer F. Rana
o.f.rana@cs.cf.ac.uk
Cardiff University, UK
Computer/Application Science Interaction • Wouldn’t it be nice if ... • Can I do this…? • (diagram boxes, repeated: Tools + Expertise, Scientific Code)
Influences (on Problem Solving Environments)
• HF: New Markets, Mental Models, Preferences, Profiles, Methodologies, Experience, Bias
• TF: Special Requirements, Performance, Security ...
• Technologies, Existing Codes, New Markets, Domain Credentials
Impacts
• Problem Description/Evaluation Tools - Visual Prog., Iris Explorer, MatLab, ADL (Erlangen-Nuremberg)
• Component Frameworks - Imperial College, CCA, INRIA, Gateway, Arcade, Advice etc.
• Resource Management - Codine, LSF - Globus, Legion, Condor
• Legacy Codes (wrappers) - SWIG, WG
• Result Sharing (Electronic Notebooks)
• Other diagram labels: PSE Infrastructure, PSE Services, Usage, Learning, Visualisation, Steering, Data Management, Sharability, Repeatability, Persistence, Usability, Maturity
• ALL three are important
PSE construction tools • Visual Composition • Data Flow or Control Flow • Language based • Functional Languages • Abstraction based • Petri nets, Process (Composition) Algebras IRIS Explorer, ADL, Gateway/WebFlow, ARCADE, Mathematica, MatLab etc - Good survey by “Grid Computing Environments” group -- see http://www.gridforum.org/
Why Wrap Legacy Codes as Components? • Pre-existing codes, mostly in C or Fortran • Generally domain-specific • Hard to re-use in other applications • They are still useful • They are often large, complex monoliths with little structure. • Support Re-use • Support Remote Execution • Support Remote Discovery • Support Remote Data Input/Output Re-write? - try convincing App Scientists
Dynamic Services - “Virtual Organisations” (diagram: a network of nodes labelled S, L and R)
Wrapping Approaches Similar name in DBs, but different approach • Wrapping executables - “As-Is” Approach • No source available (or provided) • Maintain execution environment • Wrapping Source - “Source-Update” Approach • Some source provided (generally I/O) • Executable can relinquish some control • Data type conversions • Source split Wrapping - “Unit-Mapping” Approach • Split source into units -- wrap units • Maintain unit execution environment + overall manager • Application Supported Wrapping - “App-Wrap” • Steering support • Data management support
Wrapping Approaches • Provide isolation between the existing code, in its present form, and the need to re-use and execute the code remotely • Enable properties of the code to be specified (perhaps in terms of its interface), so that a discovery mechanism can use them in, say, a particular application • Sustain performance, correctness of results, ownership, and availability
Automating Wrapping • Time consuming and error prone process • Automate the implementation of interfaces to access code, via a system-wide data model • Automate interactions between wrapped components, via a discovery service (from a registry to a more complicated lookup service) • Can have the same interface, different implementation
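The "same interface, different implementation" point can be sketched with a small in-memory registry. This is only an illustration: `WrappedComponent`, `ComponentRegistry` and the "MatrixSolver" name are hypothetical and do not come from the slides, and a real PSE would use a networked lookup service rather than a map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical common interface that every wrapped code exposes. */
interface WrappedComponent {
    String run(String inputs);
}

/** Minimal in-memory registry; stands in for a discovery or lookup service. */
final class ComponentRegistry {
    private final Map<String, List<WrappedComponent>> byInterface = new HashMap<>();

    /** Registers one more implementation under a shared interface name. */
    void register(String interfaceName, WrappedComponent implementation) {
        byInterface.computeIfAbsent(interfaceName, k -> new ArrayList<>()).add(implementation);
    }

    /** Returns every implementation registered for the interface name (possibly none). */
    List<WrappedComponent> discover(String interfaceName) {
        return byInterface.getOrDefault(interfaceName, Collections.emptyList());
    }

    public static void main(String[] args) {
        ComponentRegistry registry = new ComponentRegistry();
        // Two implementations behind the same interface name.
        registry.register("MatrixSolver", in -> "dense:" + in);
        registry.register("MatrixSolver", in -> "sparse:" + in);
        for (WrappedComponent c : registry.discover("MatrixSolver")) {
            System.out.println(c.run("A.dat"));
        }
    }
}
```

A client asks only for the interface name, so an implementation can be swapped without changing the caller.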
Component Model and Extensions Existing Code
<pse-def>
  <preface>
    <name alt="MD1" id="MD01">MDComponent</name>
    <pse-type>Molecular Dynamics</pse-type>
    <component-directory>/home/scmlm1/wgen/Component</component-directory>
    <legacy-code>/home/scmlm1/md/moldyn</legacy-code>
    <ORB-Compiler>idl2java</ORB-Compiler>
    <processors>8</processors>
    <host-name>sapphire.cs.cf.ac.uk</host-name>
  </preface>
  <ports>
    <outports>
      <outportnum>6</outportnum>
      <outport id="1">int</outport>
      <outport id="2">float</outport>
      <outport id="3">float</outport>
      <outport id="4">float</outport>
      <outport id="5">float</outport>
      <outport id="6">float</outport>
      <href name="file:/home/scmlm1/wgen/Component/output.data" value="output" />
    </outports>
  </ports>
</pse-def>
XML Data Model Existing Code
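A wrapper generator has to read such a specification before it can emit any interface code. The sketch below shows one way to do that with the JDK's DOM parser. The `ComponentSpec` class and its fields are hypothetical, and `SAMPLE` is a cut-down, well-formed fragment modelled on the slide rather than the full data model.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical holder for the fields a wrapper generator would need. */
final class ComponentSpec {
    /** Cut-down sample modelled on the slide's specification. */
    static final String SAMPLE =
        "<pse-def><preface>"
        + "<name alt=\"MD1\" id=\"MD01\">MDComponent</name>"
        + "<legacy-code>/home/scmlm1/md/moldyn</legacy-code>"
        + "<processors>8</processors>"
        + "</preface><ports><outports>"
        + "<outportnum>2</outportnum>"
        + "<outport id=\"1\">int</outport>"
        + "<outport id=\"2\">float</outport>"
        + "</outports></ports></pse-def>";

    final String name;
    final String legacyCode;
    final int processors;
    final List<String> outportTypes;

    private ComponentSpec(String name, String legacyCode, int processors, List<String> outportTypes) {
        this.name = name;
        this.legacyCode = legacyCode;
        this.processors = processors;
        this.outportTypes = outportTypes;
    }

    /** Parses the XML text; any malformed input is reported as IllegalArgumentException. */
    static ComponentSpec parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            List<String> types = new ArrayList<>();
            // Matches only <outport> elements, not <outports> or <outportnum>.
            NodeList outports = root.getElementsByTagName("outport");
            for (int i = 0; i < outports.getLength(); i++) {
                types.add(outports.item(i).getTextContent().trim());
            }
            return new ComponentSpec(text(root, "name"), text(root, "legacy-code"),
                Integer.parseInt(text(root, "processors")), types);
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid component specification", e);
        }
    }

    /** Text of the first element with the given tag name. */
    private static String text(Element root, String tag) {
        return root.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        ComponentSpec spec = parse(SAMPLE);
        System.out.println(spec.name + " runs " + spec.legacyCode
            + " on " + spec.processors + " processors, outports " + spec.outportTypes);
    }
}
```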
Component Model and Extensions Existing Code External Control Input (for Steering)
Component Model and Extensions Data Manager Existing Code Runtime support Execution Rules
Proxy based Server Wrap code as server Transport can be HTTP, RMI, Sockets Runtime.exec() B I/F C Server A Discover Service Parameter Marshalling and Verification Scheduling Constraints Jini, JXTA, RMI D P
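A minimal sketch of the proxy idea follows: verify the parameters, then launch the unmodified executable and capture its output. It uses `ProcessBuilder`, the newer JDK counterpart of the `Runtime.exec()` call named on the slide. The `LegacyCodeProxy` class and its whitelist rule are assumptions for illustration, and the transport (HTTP, RMI, sockets) is left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical server-side proxy around an unmodified legacy executable. */
final class LegacyCodeProxy {
    private final String executable;

    LegacyCodeProxy(String executable) {
        this.executable = executable;
    }

    /** Marshals parameters into a command line, rejecting anything outside a simple whitelist. */
    List<String> marshal(List<String> params) {
        List<String> command = new ArrayList<>();
        command.add(executable);
        for (String p : params) {
            // Assumed verification rule: letters, digits and a few path characters only.
            if (p == null || !p.matches("[A-Za-z0-9_./=-]+")) {
                throw new IllegalArgumentException("rejected parameter: " + p);
            }
            command.add(p);
        }
        return command;
    }

    /** Runs the legacy code and returns its combined stdout/stderr. */
    String invoke(List<String> params) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(marshal(params)).redirectErrorStream(true).start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        int status = process.waitFor();
        if (status != 0) {
            throw new IOException("legacy code exited with status " + status);
        }
        return output.toString();
    }

    public static void main(String[] args) {
        LegacyCodeProxy proxy = new LegacyCodeProxy("moldyn");
        // Only shows the verified command line; invoke() needs the executable to be installed.
        System.out.println(proxy.marshal(Arrays.asList("input.dat", "output.dat")));
    }
}
```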
Write your own ClassLoader • Extend java.lang.ClassLoader; the “Primordial Classloader” itself is built into the JVM • Class loading is invoked after calling the main() method • Matrix m = new Matrix(); -- execute “new” bytecode • System.out.println() -- invoke static reference to class (putstatic, getstatic etc.) • Class loaders enable Java apps (EMACS or scientific codes) to be dynamically extended • Byte code verifier - defineClass, ClassFormatError • Package over-write/addition: java.lang.hackit -- protect system namespace • Multiple class loaders can co-exist
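A minimal sketch of a user-defined loader, assuming plug-in classes are stored as .class files under a directory. `PluginClassLoader` and its `canLoad` helper are hypothetical names. The explicit check on the `java.` prefix illustrates protecting the system namespace; the JVM's own `defineClass` refuses such names as well.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical loader that reads .class files from a directory tree. */
final class PluginClassLoader extends ClassLoader {
    private final Path root;

    PluginClassLoader(Path root, ClassLoader parent) {
        super(parent); // parent-first delegation: system classes always come from the parent
        this.root = root;
    }

    /** Called by loadClass() only after the parent loaders failed to find the class. */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        if (name.startsWith("java.")) {
            // Protect the system namespace, e.g. java.lang.hackit.
            throw new ClassNotFoundException("refusing to define system class " + name);
        }
        Path file = root.resolve(name.replace('.', '/') + ".class");
        try {
            byte[] bytes = Files.readAllBytes(file);
            // defineClass raises ClassFormatError if the bytes are not a valid class file.
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    /** Convenience helper: true if the loader can resolve the named class. */
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = new PluginClassLoader(Paths.get("plugins"), ClassLoader.getSystemClassLoader());
        System.out.println("java.lang.String: " + canLoad(loader, "java.lang.String"));
        System.out.println("java.lang.hackit.Evil: " + canLoad(loader, "java.lang.hackit.Evil"));
    }
}
```

Several such loaders can be created side by side, each with its own root directory.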
If you have source ... • Take source header files - extract function defs • Create skeleton code that can be filled in, or integrated • SWIG • Interface compiler - takes C/C++ or Objective-C code • Use header file declarations to generate glue code • Scripting languages (Python, Perl, Tcl) can then access legacy code via glue code David Beazley and Peter Lomdahl “Light Weight Computational Steering of Very Large Scale MD Codes”, Supercomputing 96
Java: Linking to existing code (diagram labels: Java program; JNI library header files; C/C++ program compiled to a DLL/.so file (native library); C-F link; Fortran; Prolog; sockets)
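On the Java side, the JNI route comes down to declaring a `native` method and loading the shared library. In the sketch below, the class name `NativeForce`, the method `computeForce` and the library name "moldyn" are all hypothetical. The matching C header can be generated with `javac -h` and implemented in the DLL/.so.

```java
/** Hypothetical Java front end to a native (C/C++) force routine. */
final class NativeForce {
    /** Implemented in the native library; the signature is an assumption for illustration. */
    static native double computeForce(double[] positions);

    /** Tries to load the shared library (libNAME.so or NAME.dll); false if it is not found. */
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (tryLoad("moldyn")) {
            System.out.println(computeForce(new double[] {0.0, 1.0, 2.0}));
        } else {
            System.out.println("native library not found; build it from the JNI header first");
        }
    }
}
```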
Wrapper Generator architecture (diagram labels: Legacy Codes; Component Wrapper Generator; Component with Listener, Body and Publisher; IDL; XML; Component Repository; Name Server; CORBA ORB)
Data Flow for Wrapper Generator (diagram labels: Executable in Fortran/C; XML Specification; Wrapper Generator; Client Stub; Server Interface; Listener/Publisher, which can also be used to support Service Discovery) Supercomputing 2000, FGCS, September 2001
An Application using the WG • The legacy code: a molecular dynamics simulation written in C using MPI • Based on the Lennard-Jones fluid • Performs force and velocity calculations • Code wrapped in two ways (using Visibroker/Java ORB 4.2 from Inprise) • Single object • Multiple objects
Performance Evaluation • Experimental environment • The legacy code itself and the wrapped CORBA object were each run on a cluster of workstations and on a parallel machine • The number of molecules is increased from 2048 to 256,000 • The cluster: workstations running Sun Solaris 2.7 and MPICH 1.2.0, connected over an intranet (with shared file space) by 10Mb/s Ethernet • The parallel machine: a Sun E6500 with thirty 336MHz UltraSPARC II processors, running Solaris 2.7 and using MPI libraries from Sun Microsystems • 8 workstations of the cluster and 8 processors of the parallel machine were used
Splitting Code • Code divided into 4 MPI based objects • Initialisation - calculates starting positions (X,Y,Z) and velocities of molecules • Moveout - calculates molecule movement after each time step, and E/W/N/S/U/D comms - also handles “ghost” regions to support molecule migration • Force - calculates force between molecules • Output - generates simulation results at each time step • Controller object • co-ordinates looping across the other four objects
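The controller's co-ordination role can be sketched as a plain loop over the four wrapped units. `SimulationStage` and `Controller` are hypothetical names, and the order inside the loop (moveout, then force, then output) is an assumption; the slides only say that the controller loops across the other four objects. In the wrapped system each stage would be a remote CORBA object rather than a local lambda.

```java
import java.util.ArrayList;
import java.util.List;

/** One wrapped unit of the legacy code (hypothetical interface). */
interface SimulationStage {
    void execute(int timeStep);
}

/** Co-ordinates looping across the four stage objects. */
final class Controller {
    private final SimulationStage initialisation;
    private final SimulationStage moveout;
    private final SimulationStage force;
    private final SimulationStage output;

    Controller(SimulationStage initialisation, SimulationStage moveout,
               SimulationStage force, SimulationStage output) {
        this.initialisation = initialisation;
        this.moveout = moveout;
        this.force = force;
        this.output = output;
    }

    /** Initialises once, then runs the per-time-step stages in an assumed order. */
    void run(int timeSteps) {
        initialisation.execute(0);
        for (int t = 1; t <= timeSteps; t++) {
            moveout.execute(t);
            force.execute(t);
            output.execute(t);
        }
    }

    /** Runs the loop with stub stages that only record the call order. */
    static List<String> trace(int timeSteps) {
        List<String> log = new ArrayList<>();
        Controller controller = new Controller(
            t -> log.add("initialisation@" + t),
            t -> log.add("moveout@" + t),
            t -> log.add("force@" + t),
            t -> log.add("output@" + t));
        controller.run(timeSteps);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(2));
    }
}
```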
Executing Code (diagram labels: Data Source; Parallel; Sequential; VCCE) • This component also manages its own data • Component can have internal runtime • Undertakes co-ordination role
Performance (charts: Cluster; Cspace (Sun E6500))
Listener, Publisher and Body
class SimulationImpl extends _SimulationImplBase {
    public void Listener(String ComponentID, String inputs) {
        ...
        if (ComponentID.equals("MDSimulation")) {
            // read parameters(inputs) from UI component
            // invoke Body of the component
        }
    }
    public void Body(String parameters) {
        ...
        // execute "mpirun -np 8 moldyn parameters output.dat"
        // send data(output.dat) to the Publisher of the component
        // invoke the Publisher
    }
    public void Publisher(String ComponentID, String outputs) {
        ...
        // read data(output.dat) from the Body of the component
        ComponentID = "UIComponent";
        // invoke UI component with outputs
    }
}
Wrapping for Grids • Assume existence of standard software • Globus for resource management/description • Storage Resource Broker/DataCutter for data management • Jini for service discovery - (GRIP, GRRP) • Wrapping involves • Specifying executable properties via RSL: & (count=3) (executable=/usr/bin/java) (arguments= -classpath /home/comsc/scmds/research be2d) (lookup = Locator.locator(“cs.cf.ac.uk”)) • Publisher undertakes registration - GRRP/Discovery-Join Protocol • Pass control to GRAM/local execution tool via VCCE
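A wrapper can assemble the RSL string from the properties it already holds. The sketch below only concatenates `(attribute=value)` relations in the form shown on the slide. `RslBuilder` is a hypothetical helper, not part of Globus, and it does no quoting or escaping of values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that renders a conjunction of RSL relations. */
final class RslBuilder {
    // LinkedHashMap keeps the relations in insertion order.
    private final Map<String, String> relations = new LinkedHashMap<>();

    /** Adds or replaces one (attribute=value) relation. */
    RslBuilder with(String attribute, String value) {
        relations.put(attribute, value);
        return this;
    }

    /** Renders "&(a=b)(c=d)..."; values are emitted as given, without quoting. */
    String build() {
        StringBuilder rsl = new StringBuilder("&");
        for (Map.Entry<String, String> relation : relations.entrySet()) {
            rsl.append('(').append(relation.getKey()).append('=')
               .append(relation.getValue()).append(')');
        }
        return rsl.toString();
    }

    public static void main(String[] args) {
        String rsl = new RslBuilder()
            .with("count", "3")
            .with("executable", "/usr/bin/java")
            .with("arguments", "-classpath /home/comsc/scmds/research be2d")
            .build();
        System.out.println(rsl);
    }
}
```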
Adding capability • Supporting re-use of existing codes … but • what else? • Support additional functionality “Think Service … not Code” • Intelligent Wrappers • Scheduling support • Discovery support • Integration with other services • Application specific constraints • Service monitoring
Agent as Wrapper Interface • Gas Turbine talk: Agent Society • Mediator, Solver, User Interface, Mobile -- GrassHopper • No Intelligence - except in Pythia • Wrap code as an Agent • Code + Management function • Code receives request for service NOT data types • PERFORM, EXECUTE, UPDATE, TELL, ASK, INFORM • Can contain executable code OR can act as a proxy for a code • Programming language is now no longer important
Intelligent Agents … more than a message • Message contains more detail • Performative: (ask-one :sender pse-cardiff :content (BAYES CART ?prior) :receiver data-mine-server :reply-with prior-probability :language JAVA :ontology Data-Mining-Algorithms) • Layers: Comms layer, Message layer, Content layer
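An agent wrapper would build such messages in code rather than by hand. The sketch below renders a performative with its keyword parameters in the parenthesised form shown on the slide. `KqmlMessage` is a hypothetical class; it does not validate performative names or parse replies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical builder for a parenthesised performative message. */
final class KqmlMessage {
    private final String performative;
    // LinkedHashMap keeps the parameters in the order they were added.
    private final Map<String, String> parameters = new LinkedHashMap<>();

    KqmlMessage(String performative) {
        this.performative = performative;
    }

    /** Adds one ":keyword value" pair. */
    KqmlMessage param(String keyword, String value) {
        parameters.put(keyword, value);
        return this;
    }

    /** Renders "(performative :k1 v1 :k2 v2 ...)". */
    String render() {
        StringBuilder text = new StringBuilder("(").append(performative);
        for (Map.Entry<String, String> p : parameters.entrySet()) {
            text.append(" :").append(p.getKey()).append(' ').append(p.getValue());
        }
        return text.append(')').toString();
    }

    public static void main(String[] args) {
        String message = new KqmlMessage("ask-one")
            .param("sender", "pse-cardiff")
            .param("content", "(BAYES CART ?prior)")
            .param("receiver", "data-mine-server")
            .param("reply-with", "prior-probability")
            .param("language", "JAVA")
            .param("ontology", "Data-Mining-Algorithms")
            .render();
        System.out.println(message);
    }
}
```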
Agent Society • Create agents to support services • additional “broker” services • Agents interact via service requests • no data types • intuitive to understand • Agents are the core abstraction • can wrap codes + other software resources • negotiate for services
HPDC 2001 -- workshop on Heterogeneous Computing with Ken Hawick, David Walker, Mike Surridge, Matthew Addis, Daniel Bunford-Jones
Conclusion - Automated Wrapper Generator • Supporting Existing Codes • Enabling re-use and remote execution • Supporting interaction with third party execution • standard data model • Supporting service discovery + management • Supporting service integration • Supported through the VCCE • Wrapping at different levels of granularity • Wrapping at different levels • Complete executable - binary • Source/Partial source • Application supported wrapping • Unit wrapping Application Science as Driver
Some Questions • How valid is the “services model” in scientific computing (say, vs. business computing)? • Isn’t scientific computing much better defined? • Do we really need this complexity? • What are we trying to do with wrappers? • Can we do more with wrappers than just “wrap”? • Intelligent Wrappers • Integration with other software (Globus?) • Support for performance • Support for local data management • Support for streaming
Some Questions … 2 • Do we need standards to define services? • Math.solvers.matrix (could be LDAP based) • graphics.visualise.barchart • Could be an open standard that is community shared (could be encoded via XML) • Legacy codes could be integrated in an easy way • How would app scientists respond to wrapped codes? • Distrustful (security, accuracy etc) • Ignore • Technologies must be market + community based • Microsoft SOAP, XML (people factor) • … but also, ones with community credentials • Integration is not always obvious