Workshop EU – Russia Joint Call in High Performance Computing Prof. VLADIMIR VOEVODIN Deputy Director, Research Computing Center, Moscow State University Corresponding member of Russian Academy of Sciences, voevodin@parallel.ru 25 March 2010, Brussels.
Moscow State University, 1755 – 2010
• 30+ Faculties
• 350+ Departments
• 5 major Research Institutes
• More than 40 000 students, 2500 full doctors, 6000 PhDs, 1000+ full professors, 5000 researchers.
Research Computing Center, MSU, 1955 – 2010
• 20 Laboratories
• 220+ Researchers
• 50 PhDs and 25 Full Doctors (Doctors of Sciences)
• RCC MSU – Supercomputing Center #1 in Russia
Performance analysis tools for HPC (what problems should the project address?)
• Most supercomputers have extremely low efficiency for a very wide range of applications.
• Most users have no information about their programs after submitting them to a queue…
• “Low efficiency” can’t be explained by a single cause; it is a complex problem…
• Sources of losses: policies and quotas of batch systems, run-time systems (RTSs), compilers, communication overheads, load imbalance, Amdahl’s law, the memory wall…
• There is neither a unified approach nor a software tool for detecting sources of losses, either for users (for a particular task) or for system administrators (for the whole supercomputer).
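Two of the loss sources listed above, Amdahl’s law and load imbalance, can be quantified with very little code. The sketch below is illustrative only and is not part of the proposed project: the function names, the 5% serial fraction and the per-process busy times are assumptions chosen for the example.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup under Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def parallel_efficiency(serial_fraction: float, workers: int) -> float:
    """Speedup divided by the number of workers (1.0 would be ideal)."""
    return amdahl_speedup(serial_fraction, workers) / workers


def load_balance_efficiency(busy_times: list[float]) -> float:
    """Mean busy time over maximum busy time; 1.0 means perfectly balanced."""
    return sum(busy_times) / (len(busy_times) * max(busy_times))


if __name__ == "__main__":
    # Hypothetical job: 5% of the work is serial, run on 1024 cores.
    print(f"speedup:    {amdahl_speedup(0.05, 1024):.2f}")
    print(f"efficiency: {parallel_efficiency(0.05, 1024):.4f}")
    # Hypothetical per-process busy times in seconds; one process is a straggler.
    print(f"balance:    {load_balance_efficiency([10.0, 10.0, 10.0, 20.0]):.3f}")
```

With a 5% serial fraction the speedup can never reach 20 regardless of core count, so most of a 1024-core allocation sits idle. This is the kind of per-task loss that a user cannot see without tooling.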
Performance analysis tools for HPC (some key points of the project)
• Create an integrated environment that combines existing and new tools from the level of batch systems down to hardware monitors.
• Detect all sources of losses at the level of a particular task, a particular user, and the entire supercomputer.
• Collect, analyze, filter and display data in a scalable way, up to exascale-range systems.
• Analyze instrumented as well as non-instrumented tasks.
• Target architecture – clusters of SMP nodes with multicore processors and GPUs.