Software Tools for Dynamic Resource Management
Irina V. Shoshmina, Dmitry Yu. Malashonok, Sergay Yu. Romanov
Institute of High-Performance Computing and Information Systems
www.csa.ru
{irena,mal,serrom}@csa.ru
State of the art
Resources:
• CONVEX(es)
• Parsytec CC/16
• Parsytec CCid
• Parsytec Power Mouse System
• SPP1600
• SGI OCTANE Workstations
• SunUltra 450
• Paritet (Intel cluster)
www.csa.ru/CSA
Scientific problems:
• hydroaerodynamics
• plasma
• nuclear physics
• medicine
• biology
• chemistry
• astronomy
Difficulties
• shortage of resources for the scientific problems being solved
• unsatisfactory management of tasks (the majority of tasks are parallel)
Shortage of resources
Integrate the computational resources of several scientific centres
Advantages of integration
• broader access to computational resources and more active use of them
• closer integration of the scientific community
• a wider range of scientific and technical problems that can be solved
Management of tasks
Tools for optimising task distribution across computational nodes
• Codine
• Sun Grid Engine
• PBS
• Condor
Disadvantages of the tools
• weak support for migration of parallel tasks
• unsatisfactory load balancing
• dependence on specific versions of PVM and MPI
Main goals of the project
• more efficient use of computing resources
• better quality of service for users
Main tasks
• migration of parallel tasks
• optimisation of distributed resource management
• integration of the resources of several scientific centres
Dynamite
Software developed by the University of Amsterdam in Esprit project 23499
Dynamite advantages
• migration and checkpointing of PVM tasks
• automatic workload balancing of PVM tasks (on a cluster of workstations)
• migration of dynamically linked tasks
• migration of communication end points
• reallocation of tasks
Dynamite disadvantages
• dependence on specific PVM versions
• no migration of MPI tasks
• no satisfactory monitoring system
• no advanced scheduling system
• no modules for global distribution
Main steps of the project
• Migration of MPI and PVM tasks
• Checkpointing of parallel tasks
• Monitoring
• Resource management
• Support for additional architectures
Two-level system
[Diagram: one global level and two local levels]
Migration of PVM and MPI tasks
Main problems of migration
• migration of PVM tasks
• migration of MPI tasks
• independence from particular versions and implementations of PVM and MPI
• support for additional architectures
• files
• sockets
• kernel-supported threads, etc.
Checkpointing of parallel tasks
• track the progress of parallel tasks
• migrate parallel tasks at two levels
• migrate a single process of a parallel task (local level)
• migrate an entire parallel task (global level)
• handle exceptional situations
Checkpointing of parallel tasks
[Diagram: one global level and three local levels]
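The two migration levels named on the checkpointing slides can be illustrated with a small model. This is only a sketch under stated assumptions: the `Process` and `ParallelTask` classes, the node and cluster names, and the use of JSON as the checkpoint format are all hypothetical stand-ins. A real checkpoint of a PVM or MPI process has to capture the process memory image, open files, and communication end points, which this sketch does not attempt.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Process:
    """One process of a parallel task (hypothetical model)."""
    rank: int
    node: str
    state: dict


@dataclass
class ParallelTask:
    """A parallel task whose processes all run inside one cluster."""
    name: str
    cluster: str
    processes: list


def checkpoint(obj) -> str:
    """Serialise a process or a whole task (JSON stands in for a real checkpoint)."""
    return json.dumps(asdict(obj))


def migrate_process(task: ParallelTask, rank: int, new_node: str) -> None:
    """Local level: checkpoint one process and restart it on another node."""
    index = next(i for i, p in enumerate(task.processes) if p.rank == rank)
    saved = json.loads(checkpoint(task.processes[index]))
    task.processes[index] = Process(saved["rank"], new_node, saved["state"])


def migrate_task(task: ParallelTask, new_cluster: str, nodes: list) -> ParallelTask:
    """Global level: checkpoint the whole task and restart it in another cluster."""
    saved = json.loads(checkpoint(task))
    processes = [
        # Processes are spread over the target nodes round-robin (an assumption).
        Process(p["rank"], nodes[i % len(nodes)], p["state"])
        for i, p in enumerate(saved["processes"])
    ]
    return ParallelTask(saved["name"], new_cluster, processes)


task = ParallelTask("demo", "cluster-a", [
    Process(0, "a1", {"step": 10}),
    Process(1, "a2", {"step": 10}),
])
migrate_process(task, rank=1, new_node="a3")            # local-level migration
moved = migrate_task(task, "cluster-b", ["b1", "b2"])   # global-level migration
```

In this model the local level replaces one process in place, while the global level produces a new task object in the target cluster and leaves the original untouched until the caller discards it.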
Monitoring
Parameters of:
• computational resources (processor load, memory, network)
• tasks and queues
• users
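The three groups of monitored parameters could be represented roughly as follows. This is a minimal sketch, not the project's monitoring system: the record types, field names, the 0.0 to 1.0 utilisation scale, and the `Monitor` class are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """Computational resource parameters, each as a fraction from 0.0 to 1.0."""
    node: str
    cpu_load: float
    memory_used: float
    network_used: float


@dataclass
class QueueSample:
    """Task and queue parameters."""
    queue: str
    waiting: int
    running: int


@dataclass
class UserSample:
    """Per-user parameters."""
    user: str
    running_tasks: int


class Monitor:
    """Keeps the latest sample per node, queue, and user."""

    def __init__(self):
        self.nodes = {}
        self.queues = {}
        self.users = {}

    def record(self, sample) -> None:
        if isinstance(sample, NodeSample):
            self.nodes[sample.node] = sample
        elif isinstance(sample, QueueSample):
            self.queues[sample.queue] = sample
        elif isinstance(sample, UserSample):
            self.users[sample.user] = sample
        else:
            raise TypeError(f"unknown sample type: {type(sample).__name__}")

    def overloaded_nodes(self, threshold: float = 0.9) -> list:
        """Names of nodes where any resource exceeds the threshold."""
        return sorted(
            name for name, s in self.nodes.items()
            if max(s.cpu_load, s.memory_used, s.network_used) > threshold
        )

    def total_waiting(self) -> int:
        """Number of tasks waiting across all queues."""
        return sum(q.waiting for q in self.queues.values())


monitor = Monitor()
for sample in [
    NodeSample("n1", 0.35, 0.50, 0.10),
    NodeSample("n2", 0.95, 0.60, 0.20),
    QueueSample("short", waiting=3, running=8),
    QueueSample("long", waiting=2, running=4),
    UserSample("user1", running_tasks=2),
]:
    monitor.record(sample)
```

Queries such as `monitor.overloaded_nodes()` and `monitor.total_waiting()` are the kind of input a scheduler or load balancer would read from the monitoring layer.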
Resource management
• immediate distribution of tasks and queues
• long-term scheduling
• dynamic load balancing at the global and local levels
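Load balancing at the global and local levels can be sketched as a two-step choice: first a centre, then a node inside it. The greedy least-loaded policy below is an assumption chosen for brevity, not the scheduling algorithm of the project, and the centre names, node names, and load figures are invented for the example.

```python
def pick_cluster(clusters: dict) -> str:
    """Global level: choose the cluster with the lowest mean node load.

    Assumes every cluster has at least one node.
    """
    return min(
        clusters,
        key=lambda name: sum(clusters[name].values()) / len(clusters[name]),
    )


def pick_node(nodes: dict) -> str:
    """Local level: choose the least loaded node of one cluster."""
    return min(nodes, key=nodes.get)


def place_task(clusters: dict, cost: float = 1.0) -> tuple:
    """Place a new task and record its load on the chosen node."""
    cluster = pick_cluster(clusters)
    node = pick_node(clusters[cluster])
    clusters[cluster][node] += cost
    return cluster, node


def propose_migration(nodes: dict, threshold: float = 1.0):
    """Local level: suggest moving work from the busiest to the idlest node.

    Returns (source, target), or None when the load gap is within the threshold.
    """
    busiest = max(nodes, key=nodes.get)
    idlest = min(nodes, key=nodes.get)
    if nodes[busiest] - nodes[idlest] > threshold:
        return busiest, idlest
    return None


# Load per node, grouped by scientific centre (hypothetical figures).
clusters = {
    "centre-a": {"a1": 3.0, "a2": 1.0},
    "centre-b": {"b1": 0.5, "b2": 2.5},
}
placement = place_task(clusters)
proposal = propose_migration(clusters["centre-a"])
```

Here `place_task` covers the immediate distribution of an incoming task, and `propose_migration` covers dynamic rebalancing of tasks that are already running. Long-term scheduling would need reservation and forecasting logic that this sketch leaves out.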
Integration with Globus
[Diagram: Globus as the global environment, with six local levels]