A survival toolkit for the technology outback
Michael Aivazis
California Institute of Technology
NOBUGS (ok, maybe a few bugs), Sydney, 3-5 November 2008
Lay of the land
Evolutionary pressures
• Alternatives: evolve, uplift or perish…
Debunking stereotypes
[Figure: Dymaxion map (B. Fuller)]
User stereotypes
• End-user
  • occasional user of prepackaged and specialized analysis tools
• Application author
  • author of prepackaged specialized tools
• Expert user
  • investigator with a specific scientific goal
• Domain expert
  • author of analysis, modeling or simulation software
• Software integrator
  • responsible for extending software with new technology
• Framework maintainer
  • responsible for maintaining and extending the infrastructure
Technical challenges
Sources of complexity
• Project size:
  • asset complexity: number of lines of code, files, entry points
  • dependencies: number of modules, third-party libraries
  • runtime complexity: number of object types and instances
• Problem size:
  • number of processors needed, amount of memory, CPU time
• Project longevity:
  • life cycle, duty cycle
  • cost/benefit of reuse
  • managing change: people, hardware, technologies
• Locality of needed resources
  • compute/persist: where, how, when, who
• User interfaces
  • younger users will have no tolerance for either “bad” or “ugly”
• Adapting to new technology
• Turning craft into: science, engineering, … art
Past successes
• Projects:
  • Caltech ASC Center (DOE)
  • GeoFramework (NSF)
  • Computational Infrastructure for Geodynamics (NSF)
  • DANSE (NSF)
  • Caltech PSAAP Center (DOE)
• Large collaborations
  • faculty, post-docs, students
  • geographically distributed
• Challenges
  • independent but coherent evolution
  • integration
Leveraging
Flexibility through scripting
• Scripting enables us to
  • organize the large number of simulation parameters
  • allow the simulation environment to discover new capabilities without the need for recompilation or re-linking
• Integration framework is written in Python
• The interpreter
  • modern object oriented language
  • robust, portable, mature, well supported, well documented
  • easily extensible
  • rapid application development
• Support for parallel programming
  • trivial embedding of the interpreter in an MPI compliant manner
  • a python interpreter on each compute node
  • MPI is fully integrated: bindings + OO layer
• No measurable impact on either performance or scalability
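The point about discovering new capabilities without recompilation or re-linking can be illustrated with a minimal sketch. It relies only on Python's standard `importlib`; the helper name `load_capability` is hypothetical, and standard-library modules stand in for simulation plugins. This is not how Pyre itself locates components, only the general mechanism an interpreted host makes available.

```python
import importlib


def load_capability(module_name, attribute):
    """Import module_name at run time and return the named attribute from it.

    Hypothetical helper: the names would normally come from user input or a
    configuration file, so new capabilities need no rebuild of the host.
    """
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Standard-library modules stand in for simulation plugins here.
sqrt = load_capability("math", "sqrt")
dumps = load_capability("json", "dumps")

result = sqrt(16.0)
encoded = dumps({"steps": 100})
```

Because the module and attribute names are plain strings, they can be supplied on the command line or in a settings file after the application has been deployed.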
Pyre
• Pyre is a software architecture:
  • a specification of the organization of the software system
  • a description of the crucial structural elements and their interfaces
  • a specification for the possible collaborations of these elements
  • a strategy for the composition of structural and behavioral elements
• Pyre is multi-layered
  • flexibility
  • complexity management
  • robustness under evolutionary pressures
• Pyre is a component framework
[Diagram labels: application-specific, application-general, framework, computational engines]
Choosing your gear
Some pyre services
• journal
  • flexible control over the generation and delivery of simulation diagnostics from the compute nodes to the workstation
• monitor
  • a distributed service for low bandwidth, on the fly visualizations
  • currently used mostly for status monitoring and debugging
• timer: embedded performance monitor
• ipa: user authentication
  • passwords, SSL certificates, … Grid authentication
• weaver
  • a general source code generation facility
  • support for many languages
    • FORTRAN, C, C++, python, HTML, XML
  • automatic web page creation for cgi scripts
• blade: a toolkit-independent UI generator
• opal: web based UI and application hosting
  • pyre based cgi scripts
  • auto-generation of html/javascript
  • ajax support in progress (jQuery)
Distributed computing
• gsl:
  • a package that completely encapsulates the middleware
  • provides both user space and grid-enabled solutions
• User space:
  • ssh, scp
  • pyre service factories and component management
• Web services
  • full pyre/opal support for “science gateways”
  • pyGridWare from Keith Jackson’s group
• Advanced features
  • dynamic discovery for optimized deployment
  • reservation system for computational resources
Pyre components
• Component based solutions are ideal for complex systems
  • encourage the decomposition of the problem into manageable functional units
  • expose the interaction mechanisms between these units
  • enable the nearly independent evolution of the parts
• Component frameworks enable an incremental and evolutionary approach
  • existing codes can start producing results immediately
  • new services can be incorporated incrementally
[Diagram labels: component, name, properties, component core, input ports, output ports, control]
Component anatomy
• Core: encapsulation of computational engines
  • middleware that manages the interaction between the framework and codes written in low level languages
• Harness: an intermediary between a component’s core and the external world
  • framework services:
    • control
    • port deployment
  • core services:
    • deployment
    • launching
    • teardown
Component cores
• Three tier encapsulation of access to computational engines
  • engine
  • bindings
  • facility implementation by extending abstract framework services
• Cores enable the lowest integration level available
  • suitable for integrating large codes that interact with one another by exchanging complex data structures
  • UI: text editor
[Diagram labels: core, public interface, bindings, computational engine]
Application archiving
• Produce a fully repeatable execution by recording
  • scripts
  • user choices
  • sources (cvs/svn tags or even the files themselves)
  • build procedure
  • required third party libraries
  • version of as many runtime components as can be determined
  • generated data sets (urls, actual files)
• Implementation
  • meta-data in PostgreSQL
  • HDF5
    • embed XML meta-data
    • parsed for deducing the layout of the file as format evolves
    • can be extracted for easy indexing
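The list of recorded items above can be pictured as a single provenance record. The sketch below is an assumption-laden illustration, not the Pyre archiver: it swaps the PostgreSQL and HDF5 back ends named on the slide for a plain JSON string, and the function name `build_record` and the field names are hypothetical.

```python
import hashlib
import json
import platform


def build_record(script_text, choices, libraries):
    """Collect what is needed to repeat a run into a JSON-serializable dict.

    Hypothetical layout: a hash identifies the script, and the user choices
    and library versions are stored verbatim next to the interpreter version.
    """
    return {
        "script_sha256": hashlib.sha256(script_text.encode("utf-8")).hexdigest(),
        "choices": dict(choices),
        "libraries": dict(libraries),
        "python": platform.python_version(),
    }


record = build_record(
    script_text="print('hello')",
    choices={"friend": "Michael"},
    libraries={"hdf5": "unknown"},  # placeholder value, not a real version
)

# JSON stands in for the database and file back ends named on the slide.
archived = json.dumps(record, sort_keys=True)
restored = json.loads(archived)
```

Storing a hash rather than the script text keeps the record small; the slide's alternative of archiving the files themselves trades size for the ability to rerun without access to the original repository.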
Services for computational engines
• Normal engine life cycle:
  • deployment
    • staging, instantiation, static initialization, dynamic initialization, resource allocation
  • launching
    • input delivery, execution control, hauling of output
  • teardown
    • resource de-allocation, archiving, execution statistics
• Exceptional events
  • core dumps, resource allocation failures
  • diagnostics: errors, warnings, informational messages
  • monitoring: debugging information, self consistency checks
• Distributed computing
• Parallel processing
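The life cycle above (deployment, launching, teardown) and the handling of exceptional events can be sketched in a few lines. The `Engine` class and the `run` driver are hypothetical stand-ins, not Pyre classes; the only point illustrated is that teardown runs and diagnostics are collected even when launching fails.

```python
class Engine:
    """Hypothetical stand-in for a computational engine."""

    def __init__(self, fail=False):
        self.fail = fail
        self.log = []  # records which life cycle phases ran, in order

    def deploy(self):
        self.log.append("deploy")

    def launch(self, data):
        self.log.append("launch")
        if self.fail:
            raise RuntimeError("resource allocation failure")
        return sum(data)

    def teardown(self):
        self.log.append("teardown")


def run(engine, data):
    """Drive deploy, launch, teardown; teardown runs even when launch fails."""
    diagnostics = []
    result = None
    engine.deploy()
    try:
        result = engine.launch(data)
    except RuntimeError as error:
        diagnostics.append("error: %s" % error)
    finally:
        engine.teardown()
    return result, diagnostics


good_result, good_diagnostics = run(Engine(), [1, 2, 3])

failing = Engine(fail=True)
bad_result, bad_diagnostics = run(failing, [1])
```

Keeping the driver outside the engine is what lets a framework offer the same services to every engine it hosts.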
HelloApp: hello world

    from pyre.application.Application import Application  # access to the base class

    class HelloApp(Application):

        def main(self):
            print("Hello world!")
            return

        def __init__(self):
            Application.__init__(self, "hello")
            return

    # main
    if __name__ == "__main__":
        app = HelloApp()
        app.run()

• Output

    > ./hello.py
    Hello world!
Properties
• Named attributes that are under direct user control
  • automatic conversions from strings to all supported types
• Properties have
  • name
  • default value
  • optional validator functions
• Accessible from pyre.inventory
  • factory methods: str, bool, int, float, sequence, dimensional
  • validators: less, greater, range, choice

    import pyre.inventory

    flag = pyre.inventory.bool(name="some-flag", default=True)
    style = pyre.inventory.str(name="my-style", default="boring")
    scale = pyre.inventory.float(
        name="scale", default=1.0,
        validator=pyre.inventory.greater(0))

• You can derive your own property type from pyre.inventory.Property
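To make the ingredients of a property concrete (a name, a default, conversion from strings, an optional validator), here is a minimal standalone sketch in plain Python. It does not use or reproduce the Pyre API: the `Property` class, `greater` and `to_bool` below are hypothetical helpers written only for this illustration.

```python
class Property:
    """Hypothetical property: a name, a default, a converter, an optional validator."""

    def __init__(self, name, default, convert, validator=None):
        self.name = name
        self.default = default
        self.convert = convert
        self.validator = validator

    def parse(self, text):
        """Convert a string to the property type, then validate the result."""
        value = self.convert(text)
        if self.validator is not None and not self.validator(value):
            raise ValueError("invalid value %r for property %r" % (value, self.name))
        return value


def greater(bound):
    """Build a validator that accepts values strictly greater than bound."""
    return lambda value: value > bound


def to_bool(text):
    """Interpret common spellings of true; everything else is false."""
    return text.strip().lower() in ("1", "true", "yes", "on")


scale = Property("scale", 1.0, float, validator=greater(0))
flag = Property("some-flag", True, to_bool)
```

With these definitions, `scale.parse("2.5")` yields the float `2.5`, while `scale.parse("-1")` raises `ValueError` because the validator rejects it.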
HelloApp: adding properties

    from pyre.application.Application import Application

    class HelloApp(Application):
        …
        class Inventory(Application.Inventory):
            import pyre.inventory
            friend = pyre.inventory.str("friend", default="world")
        …
HelloApp: using properties
• Now you can say hello to your friend…

    from pyre.application.Application import Application

    class HelloApp(Application):
        …
        def main(self):
            print("Hello %s!" % self.inventory.friend)
            return

        def __init__(self):
            Application.__init__(self, "hello")
            return

    > ./hello.py --friend="Michael"
    Hello Michael!
Units
• Properties can have units:
  • the framework provides the type dimensional
• Support for units is in pyre.units
  • all SI base and derived units
  • most common abbreviations and alternative unit systems
  • correct handling of all arithmetic operations
    • addition, multiplication, functions from math
  • parsing expressions from the command line

    import pyre.inventory
    from pyre.units.time import s, hour
    from pyre.units.length import m, km, mile

    speed = pyre.inventory.dimensional(
        name="speed", default=50*mile/hour)
    v = pyre.inventory.dimensional(
        name="velocity", default=(0.0*m/s, 0.0*m/s, 10*km/s))
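What "correct handling of all arithmetic operations" means for dimensional values can be shown with a small standalone sketch. The `Quantity` class below is hypothetical and far simpler than pyre.units: it tracks only length and time exponents and stores magnitudes in SI units, which is enough to show that multiplication and division combine dimensions while addition demands matching ones.

```python
class Quantity:
    """Hypothetical dimensional value: SI magnitude plus (length, time) exponents."""

    def __init__(self, value, dims):
        self.value = float(value)
        self.dims = tuple(dims)

    def __mul__(self, other):
        if isinstance(other, Quantity):
            dims = tuple(a + b for a, b in zip(self.dims, other.dims))
            return Quantity(self.value * other.value, dims)
        return Quantity(self.value * other, self.dims)

    __rmul__ = __mul__  # lets plain numbers appear on the left: 50 * mile

    def __truediv__(self, other):
        if isinstance(other, Quantity):
            dims = tuple(a - b for a, b in zip(self.dims, other.dims))
            return Quantity(self.value / other.value, dims)
        return Quantity(self.value / other, self.dims)

    def __add__(self, other):
        if not isinstance(other, Quantity) or self.dims != other.dims:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.dims)


# Base units and a few derived ones, all expressed in SI magnitudes.
m = Quantity(1.0, (1, 0))
s = Quantity(1.0, (0, 1))
km = 1000 * m
mile = 1609.344 * m
hour = 3600 * s

speed = 50 * mile / hour   # dimensions: length / time
distance = km + mile       # allowed: both are lengths
```

Adding `m + s` raises `TypeError` in this sketch, which is the kind of mistake a units-aware property type is meant to catch before a simulation starts.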
Parallel HelloApp

    from mpi.Application import Application

    class HelloApp(Application):

        def main(self):
            import mpi
            world = mpi.world()
            print("[%03d/%03d] Hello world" % (world.rank, world.size))
            return

        def __init__(self):
            Application.__init__(self, "hello")
            return

    # main
    if __name__ == "__main__":
        app = HelloApp()
        app.run()
Facilities and components
• A design pattern that enables the assembly of application components at run time under user control
  • Facilities are named abstract application requirements
  • Components are concrete named engines that satisfy the requirements
• Dynamic control:
  • the application script author provides
    • a specification of application facilities as part of the Application definition
    • a component to be used as the default
  • the user can construct scripts that create alternative components that comply with the facility interface
  • the end user can
    • configure the properties of the component
    • select which component is to be bound to a given facility at runtime
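The division of labor described above (the script author declares facilities and defaults, the end user rebinds them at run time) can be sketched in plain Python. Nothing here is Pyre API: the `COMPONENTS` registry, the `App` class and both component classes are hypothetical, and a dictionary lookup stands in for whatever discovery mechanism a real framework would use.

```python
class Greeter:
    """Default component: a concrete engine that satisfies the 'greeter' facility."""

    def greet(self, friend):
        return "Hello %s!" % friend


class Shouter:
    """Alternative component that complies with the same interface."""

    def greet(self, friend):
        return "HELLO %s!" % friend.upper()


# Hypothetical registry of components that comply with the facility interface.
COMPONENTS = {"greeter": Greeter, "shouter": Shouter}


class App:
    # facility name -> name of the default component chosen by the script author
    facilities = {"greeter": "greeter"}

    def __init__(self, overrides=None):
        bindings = dict(self.facilities)
        bindings.update(overrides or {})  # end-user selections win over defaults
        # Bind each facility to a freshly built component at run time.
        self.components = {
            facility: COMPONENTS[component]()
            for facility, component in bindings.items()
        }

    def main(self, friend):
        return self.components["greeter"].greet(friend)


default_greeting = App().main("world")
custom_greeting = App({"greeter": "shouter"}).main("world")
```

The application code in `main` never names a concrete class, so a new component only has to be registered, not wired in by editing the application.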
Inversion of control
• A feature of component frameworks
  • applications require facilities and invoke the services they promise
  • component instances that satisfy these requirements are injected at the latest possible time
• The pyre solution to this problem
  • eliminates the complexity by using "service locators"
  • takes advantage of the dynamic programming possible in python
  • treats components and their initialization state fully symmetrically
  • provides simple but acceptable persistence (performance, scalability)
    • XML files, python scripts
    • databases (PostgreSQL, MySQL)
    • can easily take advantage of other object stores
  • is ideally suited for both parallel and distributed applications
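One way to picture persistence of component state in XML files is a settings document that is read after a component has been built and then applied to it. The sketch below uses Python's standard `xml.etree.ElementTree`; the element layout in `CONFIG`, the `Hello` component and `load_settings` are assumptions made for illustration and are not claimed to match the file format Pyre reads.

```python
import xml.etree.ElementTree as ET

# Assumed layout: one <component> element per component, holding named properties.
CONFIG = """
<inventory>
  <component name="hello">
    <property name="friend">Michael</property>
    <property name="repeat">2</property>
  </component>
</inventory>
"""


class Hello:
    """Hypothetical component whose state is set from outside, after construction."""

    name = "hello"
    defaults = {"friend": "world", "repeat": "1"}

    def __init__(self):
        self.settings = dict(self.defaults)

    def configure(self, settings):
        self.settings.update(settings)

    def main(self):
        greeting = "Hello %s!" % self.settings["friend"]
        return [greeting] * int(self.settings["repeat"])


def load_settings(xml_text, component_name):
    """Return the property settings stored for one component, as strings."""
    root = ET.fromstring(xml_text.strip())
    for component in root.findall("component"):
        if component.get("name") == component_name:
            return {
                prop.get("name"): (prop.text or "").strip()
                for prop in component.findall("property")
            }
    return {}


app = Hello()
app.configure(load_settings(CONFIG, app.name))
greetings = app.main()
```

Because the settings arrive as strings keyed by component name, the same document could equally be produced by a database query or a Python script, which is the symmetry between components and their initialization state that the slide points to.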
Are we there yet?
Cost/benefit … rationalizations
• Drawbacks
  • some reengineering required
  • paradigm shift
  • learning curve – not helped by the (current) lack of documentation…
• Benefits
  • clear path forward for “legacy” applications
  • easy, normalized access to a large number of facilities
  • structured way for enabling engines in modern computational environments
  • rigorous separation of UI from computational engines
  • easy re-hosting of compliant applications
Forecast
• Some things change
  • languages, platforms, networks
  • algorithms, tools, processes
  • …
• Some things don’t
  • people: your colleagues, your customers, your boss, his boss…
  • money/time always short
  • there will always be bugs
  • users never know what they want
    • but they know how you should implement it…
  • security always annoying but necessary
  • collaboration is difficult
    • see items above
• Design for “constrained change”