A Special Research Program funded by ASF. Automatic Performance Analysis: Real Tools. ASKALON: A Tool Set for Cluster and Grid Computing. Cracow’03 Grid Workshop, Oct. 2003. T. Fahringer, A. Hofer, A. Jugravu, S. Pllana, R. Prodan, C. Seragiotto, J. Testori, H.-L. Truong, A. Villazon, M. Welzl. Institute for Computer Science, University of Innsbruck. Thomas.Fahringer@uibk.univie.ac.at, informatik.uibk.ac.at/dps
Outline • ASKALON: Overview • Performance Analysis and the Grid • Automatic Experiment Management • JavaSymphony: A New Programming Method for the Grid • Summary
ASKALON: A Tool Set for Cluster and Grid Architectures (informatik.uibk.ac.at/dps) • Aksum: automatic bottleneck analysis • PerformanceProphet: modeling, simulation, performance prediction • Scalea: instrumentation, measuring, performance analysis • Zenturio: parameter studies, performance studies, experiment management, software testing • Database: performance, experiment, program, machine • Programming paradigms: MPI, GlobusMPI, OpenMP/MPI, HPF/OpenMP, JavaSymphony • Architectures: NOWs, PC clusters, SMP clusters, Grid systems, DM/SM systems
ASKALON Web Services (architecture diagram). Labels in the diagram: ASKALON Service Repository, Registry, Application, Compilation Command, Execution Command, Machine, Factory, SCALEA User Portal, SIS Instrumentor, Performance Analyzer, ZENTURIO User Portal, Experiment Generator, Performance Estimator, Middleware, AKSUM User Portal, Performance Property Analyzer, Search Engine, Service Sites, ASKALON Visualization Diagrams, PROPHET User Portal, Overhead Analyzer, Experiment Executor, ASKALON Data Repository, Scheduler, Compute Site
Performance Analysis for the Grid: so far mostly low-level analysis • monitoring, instrumentation, and analysis for the Grid infrastructure, but not for applications • lots of low-level performance data and visualization • lack of high-level summary information • difficult to associate data with specific middleware components and applications
Performance Analysis for the Grid: next steps • higher-level analysis • performance analysis for the Grid and its applications (single-entry single-exit regions) • summaries instead of details • problems and interpretation instead of raw data • combined Grid performance analysis (SCALEA, AKSUM) for network, site, and application • customizable tools instead of hard-coded analysis • multi-experiment instead of single-experiment analysis • online and scalable performance analysis
Aksum: A Tool for Semi-Automatic Multi-Experiment Performance Analysis • user-provided problem and machine sizes • automated instrumentation, experiment management, performance interpretation, and search for performance bottlenecks • performance analysis for single-entry single-exit regions • performance problems related to the program • targets OpenMP, MPI, and mixed programs • customizable (build your own performance tool) • API for performance overheads • define performance problems and code regions of interest • influence the search (strategy, time, code regions)
Specification of Performance Problems with JavaPSL • JavaPSL is • an API for the specification of performance problems • a high-level interface for raw performance data • pre-defined and user-defined JavaPSL problems • performance problems as values between 0 and 1 (interpretation)

    public class SynchronizationOverhead implements Property {
        private float severity;

        public SynchronizationOverhead(DynamicCodeRegion d,
                                       ReferenceDynamicCodeRegion r) {
            // Severity: synchronization overhead of d relative to the
            // execution time of the reference code region r.
            severity = (float) d.getSynchronizationOverhead() / r.getExecutionTime();
        }

        public boolean holds() { return severity > 0; }
        public float getSeverity() { return severity; }
        public float getConfidence() { return 1; }
    }
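The property above can be exercised in isolation with a small sketch. The three interfaces below are stand-ins written for this example, since the real JavaPSL types are not shown on the slide; the `double` return types, the time unit, the `SynchronizationOverheadSketch` name, and the `PropertyDemo.syncOverhead` helper are all assumptions.

```java
// Stand-in types: the real JavaPSL classes are not shown on the slide,
// so these minimal interfaces are assumptions made for illustration.
interface Property {
    boolean holds();
    float getSeverity();
    float getConfidence();
}

interface DynamicCodeRegion {
    double getSynchronizationOverhead(); // time spent synchronizing (assumed seconds)
}

interface ReferenceDynamicCodeRegion {
    double getExecutionTime(); // execution time of the reference region (assumed seconds)
}

// Same logic as the slide: the overhead is normalized by the execution
// time of a reference code region, giving a severity value.
class SynchronizationOverheadSketch implements Property {
    private final float severity;

    SynchronizationOverheadSketch(DynamicCodeRegion d, ReferenceDynamicCodeRegion r) {
        severity = (float) (d.getSynchronizationOverhead() / r.getExecutionTime());
    }

    public boolean holds() { return severity > 0; }
    public float getSeverity() { return severity; }
    public float getConfidence() { return 1; }
}

class PropertyDemo {
    // Hypothetical helper: builds the property from two plain numbers.
    static Property syncOverhead(double overhead, double referenceTime) {
        return new SynchronizationOverheadSketch(() -> overhead, () -> referenceTime);
    }

    public static void main(String[] args) {
        Property p = syncOverhead(2.0, 8.0);
        System.out.println(p.holds() + " " + p.getSeverity()); // prints "true 0.25"
    }
}
```

With 2 time units of synchronization in a reference region that runs for 8, the severity is 0.25, so a quarter of the reference time is attributed to this problem.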
Property hierarchy • Defines evaluation order of performance properties • Predefined hierarchies • OpenMP, MPI, mixed mode • Can be customized • Each node has: • a threshold (property instances with severity less than the threshold are discarded) • reference code region • bean properties
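A minimal sketch of how a hierarchy with per-node thresholds could drive the evaluation order. `HierarchyNode` and `HierarchySearch` are hypothetical names, the severities and the parent/child arrangement are made up, and skipping the children of a discarded node is an assumption of this sketch rather than something the slide states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type; Aksum's real classes are not shown in the slides.
class HierarchyNode {
    final String name;
    final float threshold;   // instances below this severity are discarded
    final float severity;    // stand-in for a measured severity in [0, 1]
    final List<HierarchyNode> children = new ArrayList<>();

    HierarchyNode(String name, float threshold, float severity) {
        this.name = name;
        this.threshold = threshold;
        this.severity = severity;
    }

    // Adds a child and returns this node so calls can be chained.
    HierarchyNode add(HierarchyNode child) {
        children.add(child);
        return this;
    }
}

class HierarchySearch {
    // Pre-order walk: a parent is evaluated before its children.
    // Assumption: children of a discarded node are not evaluated.
    static void evaluate(HierarchyNode node, List<String> reported) {
        if (node.severity < node.threshold) {
            return;
        }
        reported.add(node.name);
        for (HierarchyNode child : node.children) {
            evaluate(child, reported);
        }
    }

    static List<String> demo() {
        HierarchyNode root = new HierarchyNode("Inefficiency", 0.05f, 0.40f)
            .add(new HierarchyNode("ParallelInefficiency", 0.05f, 0.30f)
                .add(new HierarchyNode("SynchronizationOverhead", 0.05f, 0.20f))
                .add(new HierarchyNode("LoadImbalance", 0.05f, 0.01f)))
            .add(new HierarchyNode("SerialInefficiency", 0.05f, 0.02f));
        List<String> reported = new ArrayList<>();
        evaluate(root, reported);
        return reported;
    }

    public static void main(String[] args) {
        // prints "[Inefficiency, ParallelInefficiency, SynchronizationOverhead]"
        System.out.println(demo());
    }
}
```

Raising a node's threshold narrows the report to more severe problems without changing the property definitions themselves.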
Property Hierarchy (first levels)
    Inefficiency
        ParallelInefficiency
            DataMovementOverhead ...
            SynchronizationOverhead ...
            ControlOfParallelismOverhead ...
            LoadImbalance ...
        SerialInefficiency
            ImperfectFloatingPointBehavior
            ImperfectCacheBehavior
Application parameters • Strings to be substituted in some or all of the input files • Mapped to ZEN directives in the input files • Basis for experiment generation and execution done by ZENTURIO
Outline • Performance Analysis and the Grid • Automatic Experiment Management • JavaSymphony: A New Programming Method for the Grid • Summary
Management of Experiments and Parameter Studies. Currently scientists • manually create parameter studies • manage many different sets of input data • launch large numbers of compilations and executions • administer result files • invoke performance analysis tools • interpret/visualize performance and parameter results, etc. This is a tedious, error-prone, and time-consuming process.
ZENTURIO: An Automatic Experiment Management Framework for Cluster and Grid Architectures Support for scientists to semi-automatically conduct large sets of • parameter studies • throughput versus high-performance computing • performance studies • software tests on cluster and Grid architectures.
ZENTURIO: A Web Service based Architecture (architecture diagram). Labels in the diagram: G-Site, Instrumentation, Experiment Generator Service, Experiment Monitor, E-Site, Experiment Data Repository, Experiment Executor Service, Application Data Visualiser, Scheduler, Registry Service, User Portal, Experiment Preparation, Middleware; inputs: application, compilation, execution command, machine
Application Parameters and Value Sets • Performance and parameter results depend on application parameters and their value sets. • machine sizes {x CPUs, y Grid sites, …} • problem sizes {x atoms, matrix size, …} • program variables {1,2,3,16:110:2} • data distributions {block, cyclic, …} • loop scheduling strategies {static, guided, …} • communication networks {Myrinet, FastEthernet, …} • input/output file names, etc. • An Experiment is defined by its sources with every application parameter replaced by a specific value.
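The definition of an experiment above can be sketched as a cross product over value sets. This assumes every combination of values counts as one experiment unless something restricts it; `ExperimentGenerator`, the parameter names, and the values are hypothetical and chosen only to mirror the examples in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ExperimentGenerator {
    // Returns one map per experiment, with each parameter bound to one value.
    static List<Map<String, String>> generate(Map<String, List<String>> valueSets) {
        List<Map<String, String>> experiments = new ArrayList<>();
        experiments.add(new LinkedHashMap<>()); // start from a single empty binding
        for (Map.Entry<String, List<String>> entry : valueSets.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : experiments) {
                for (String value : entry.getValue()) {
                    // Extend each partial binding with every value of this parameter.
                    Map<String, String> bound = new LinkedHashMap<>(partial);
                    bound.put(entry.getKey(), value);
                    next.add(bound);
                }
            }
            experiments = next;
        }
        return experiments;
    }

    static List<Map<String, String>> demo() {
        Map<String, List<String>> valueSets = new LinkedHashMap<>();
        valueSets.put("cpus", Arrays.asList("2", "4"));
        valueSets.put("distribution", Arrays.asList("block", "cyclic"));
        valueSets.put("schedule", Arrays.asList("static", "guided"));
        return generate(valueSets);
    }

    public static void main(String[] args) {
        // 2 * 2 * 2 = 8 experiments, one per line
        demo().forEach(System.out::println);
    }
}
```

The number of experiments grows multiplicatively with each added parameter, which is why managing them by hand becomes impractical.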
ZEN: Directive-based Language. Specification of Arbitrarily Complex Experiments • Set of directives to specify value sets of interest for arbitrary application parameters. • Directives: • assignment • substitute • constraint • performance • Annotation of arbitrary source/input files • program files, Makefiles, scripts, input files, etc. • ZENTURIO generates sources for every different experiment based on ZEN directives.
LAPW0 Machine Size: Globus RSL script

    + (& (resourceManagerContact = "gescher/jobmanager-pbs")
         (*ZEN SUBSTITUTE count\=4 = {count={2:40} } *)
         (count=4)
         (jobtype=mpi)
         (directory="/home/radu/APPS/LAPW0")
         (executable="../SRC/lapw0")
         (arguments="lapw0.def")
      )

Generated values: count=2, count=3, count=4, ..., count=40
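The effect of the SUBSTITUTE directive on the script can be approximated with plain text replacement. This is a simplified sketch, not ZENTURIO's implementation: `RangeSubstitution` is a hypothetical name, the range is assumed to have inclusive bounds with an optional third field acting as a stride, and the directive comment itself is left out of the sample script.

```java
import java.util.ArrayList;
import java.util.List;

class RangeSubstitution {
    // Expands "low:high" or "low:high:stride" into the integers it denotes.
    // Assumption: bounds are inclusive and the third field is a stride.
    static List<Integer> expand(String range) {
        String[] parts = range.split(":");
        int low = Integer.parseInt(parts[0]);
        int high = Integer.parseInt(parts[1]);
        int stride = parts.length > 2 ? Integer.parseInt(parts[2]) : 1;
        if (stride <= 0) {
            throw new IllegalArgumentException("stride must be positive");
        }
        List<Integer> values = new ArrayList<>();
        for (int v = low; v <= high; v += stride) {
            values.add(v);
        }
        return values;
    }

    // Produces one script per value by replacing the annotated text.
    static List<String> instantiate(String script, String target, String prefix, String range) {
        List<String> scripts = new ArrayList<>();
        for (int value : expand(range)) {
            scripts.add(script.replace(target, prefix + value));
        }
        return scripts;
    }

    public static void main(String[] args) {
        List<String> scripts =
            instantiate("(count=4)(jobtype=mpi)", "count=4", "count=", "2:40");
        // prints "39 scripts, first: (count=2)(jobtype=mpi)"
        System.out.println(scripts.size() + " scripts, first: " + scripts.get(0));
    }
}
```

Because the annotation lives in a comment, the original script stays valid and runs unchanged with count=4 when no experiment generation is applied.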
Problem size: lapw0.def

    4, 'znse_6.inm', 'unknown', 'formatted', 0
    !ZEN$ SUBSTITUTE ktp_.125hour.clmsum = { ktp_.125hour.clmsum, ktp_.25hour.clmsum, ktp_.5hour.clmsum, ktp_1hour.clmsum }
    8, 'ktp_.125hour.clmsum', 'old', 'formatted', 0
    !ZEN$ SUBSTITUTE ktp_.125hour.struct = { ktp_.125hour.struct, ktp_.25hour.struct, ktp_.5hour.struct, ktp_1hour.struct }
    20, 'ktp_.125hour.struct', 'old', 'formatted', 0
    58, 'znse_6.vint', 'unknown','formatted', 0
    !ZEN$ CONSTRAINT INDEX ktp_.125hour.clmsum == ktp_.125hour.struct

The CONSTRAINT INDEX directive pairs the two value sets by index: ktp_.125hour.clmsum with ktp_.125hour.struct, ktp_.25hour.clmsum with ktp_.25hour.struct, and so on.
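A sketch of what an index constraint does to the cross product of two value sets, using the file names from the directives above. `IndexConstraint` is a hypothetical name; the nested loop is written out deliberately to show which combinations are filtered, and is not how ZENTURIO is known to implement it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexConstraint {
    // Walks the full cross product and keeps only the combinations whose
    // positions in the two value sets are equal.
    static List<String[]> pairByIndex(List<String> first, List<String> second) {
        List<String[]> kept = new ArrayList<>();
        for (int i = 0; i < first.size(); i++) {
            for (int j = 0; j < second.size(); j++) {
                if (i == j) {
                    kept.add(new String[] { first.get(i), second.get(j) });
                }
            }
        }
        return kept;
    }

    static List<String[]> demo() {
        List<String> clmsum = Arrays.asList(
            "ktp_.125hour.clmsum", "ktp_.25hour.clmsum",
            "ktp_.5hour.clmsum", "ktp_1hour.clmsum");
        List<String> struct = Arrays.asList(
            "ktp_.125hour.struct", "ktp_.25hour.struct",
            "ktp_.5hour.struct", "ktp_1hour.struct");
        return pairByIndex(clmsum, struct);
    }

    public static void main(String[] args) {
        // 4 of the 16 possible combinations remain, one per line
        for (String[] pair : demo()) {
            System.out.println(pair[0] + " with " + pair[1]);
        }
    }
}
```

Without the constraint, mismatched input files such as a .125hour charge density with a 1hour structure file would also be generated as experiments.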
ZEN Performance Behaviour Directive

    !ZEN$ CR CR_P, CR_L PERF WTIME, ODATA
    . . .
    !ZEN$ CR CR_OMPDO, CR_CALLS PERF WTIME, OSYNC BEGIN
    !$OMP DO SCHEDULE(STATIC)
    . . .
    !$OMP END DO NOWAIT
    !$OMP BARRIER
    !ZEN$ END CR

• request performance data for arbitrary code regions • CR_P = entire program • CR_L = all loops • CR_OMPDO = OpenMP do regions • CR_CALLS = procedure calls • WTIME = execution time • ODATA = data movement • OSYNC = synchronisation • 50 code region mnemonics • 40 performance metrics • supported by SCALEA
JavaSymphony: High-Level Object-Oriented Programming of Grid Applications • JavaSymphony (100% Java): a new object-oriented programming paradigm for concurrent and distributed systems • portability • higher-level programming • simple access to resources • explicit control of locality and parallelism • performance-oriented • JavaSymphony programming model: • dynamic virtual architectures (VAs) • API for system parameters • single- and multi-threaded remote distributed objects • distribution/migration of objects and code • asynchronous and one-sided (remote) method invocation • synchronization and events (distributed) And all of that without programming RMI, sockets, and threads!
Summary • Performance analysis for the Grid • higher-level analysis, performance interpretation, multi-experiments, automatic, customizable • high-level performance instrumentation interface • standardization of performance data • Multi-Experiment Performance Analysis and Parameter Studies for the Grid • request for arbitrary number of experiments • automatic management of experiments • fault tolerance, events • combine with schedulers and performance tools • JavaSymphony: A New Programming Model for Grid Applications • explicit control of locality, parallelism, and load balancing at a high level • dynamic virtual architectures, events, synchronization, migration, multi-threaded objects, asynchronous/synchronous/one-sided remote methods • no RMI, socket, or thread programming