Research as a Generator of Wealth (La Investigación generadora de riqueza). WINCO'05, México D.F., México, 13 April 2005. Prof. Mateo Valero, UPC, Barcelona.
Outline • High Performance Research Group at UPC • Centers of Supercomputing: • CEPBA • CIRI • BSC • Future context in Europe • Networks of Excellence • Large Scale Facilities
Basic concepts about research • Fundamental/basic research versus applied research • Good versus bad research • Good research always produces wealth • Short/Medium/Long Term Research • Products of good research: • Papers and patents • Educated people • Good education is a key component in this picture • Cross-pollination between research groups and companies is the other part of the picture • Promoting good research is the only way for Europe to be competitive in the short and long term
History • Thesis: 1974-1980 • Great difficulty • FIB • Creating a department: courses, hiring, ... • Starting to do research • The Spanish situation… no money, nothing in place,… CICYT • Strategic decisions: • Computer Architecture • Supercomputers
Computer Architecture • Computer Architecture is a rapidly changing subject • Technology changes very fast • New applications emerge continuously • Computer Architects must deal with both technology and applications • CMOS Technology is coming to an end • A new group of applications is appearing • There is a great opportunity for high performance architectures for these applications
Supercomputers • The fastest computers in the world • Used for simulation • Mainly built by US companies • No experience in Spain • Europe uses and produces software
Spain's entry into the EU • Internationalization of research • New opportunities • Before and after 1986 • Industrial projects • We used these projects to grow in basic research
CICYT: Spanish Projects (timeline, 1990-2004) • HLL-oriented Architectures (TIC89-300) • Parallelism Exploitation in High Speed Architectures (TIC89-299) • Architecture, Tools and Operating Systems for Multiprocessors (TIC89-392) • Parallel Architectures for Symbolic Computation (TIC91-1036) • Architectures and Compilers for Supercomputers (TIC92-880) • Microkernel/Applications Cooperation in Multiprocessor Systems (TIC94-439) • High Performance Computing (TIC95-429) • High Performance Computing II (TIC98-511-C02-01) • High Performance Computing III (TIC2001-995-C02-01) • High Performance Computing IV (TIC2004-7739-C02-01) • UVall, UdZ, URV and ULPGC
People: large research group • Now: 79 researchers • 34 PhD • 45 doing PhD
Computer architecture (uniprocessor) • Dynamically scheduled (superscalar) • Front-end engines: instruction fetch mechanisms and branch predictors • Speculative execution: data and address prediction • Organization of resources: • Register file, functional units, Cache organization, prefetching … • Kilo-instruction Processors • Not only performance: area and power consumption
Computer architecture (uniprocessor) • Statically scheduled (VLIW) • Organization of resources (functional units and registers) • Not only performance: area and power consumption • Advanced vector and multimedia architectures • Vector units for superscalar and VLIW architectures • Memory organization • Data and instruction level parallelism
Computer architecture (multiprocessor) • Multithreaded (hyperthreaded) architectures • Front-end engines and trace caches • Speculation at different levels • Dynamic management of thread priorities • Shared-memory multiprocessors • Memory support for speculative parallelization (hardware and software)
System software (multiprocessor) • OpenMP compilation • Proposal of language extensions to the standard • Compiler technology for OpenMP • OpenMP runtime systems • Parallel library for OpenMP (SGI Origin, IBM SP2, Intel hyperthreaded, …) • Software DSM (Distributed Shared Memory) • Intelligent runtimes: load balancing, data movement, …
System software (multiprocessor) • Scalability of MPI • IBM BG/L with 64K processors • Prediction of messages • Programming models for the GRID • Grid superscalar
Algorithms and applications • Solvers for systems of linear equations • Out-of-core kernels and applications • STORM • Metacomputing tool for performing stochastic simulations • Databases • Sorting, communication in join operations • Query optimization • Memory management in OLTP
PhD programs • Research topics are part of our PhD programs • Computer Architecture and Technology at UPC • Quality award: MCD2003-126 • 42% total number of credits, 2003-04 • 30 PhD in the last 5 years (66% of whole program)
International collaboration • Industry • Intel, IBM Watson, Toronto, Haifa and Germany Labs (*), Hewlett-Packard, STMicroelectronics, SGI, Cray, Compaq • University and research laboratories • Univ. of Illinois at Urbana-Champaign, Wisconsin-Madison, California at Irvine, William and Mary, Delft, KTH, … (more than 60) • Major research laboratories in USA: NASA Ames, San Diego (SDSC), Lawrence Livermore (LLNL), … • Standardization committees • OpenMP Futures in the Architecture Review Board (ARB) … with joint publications and developments (*) Part of the CEPBA-IBM Research Institute research agreement
International collaboration • Pre- and post doctoral short/medium/long stays • Industry: Intel, SUN, IBM, Hewlett-Packard, ... • University: Univ. of Illinois at Urbana-Champaign, Univ. of California at Irvine, Univ. of Michigan, ... • Visiting professors and researchers • More than 70 talks in our weekly seminar (last two years, external researchers) http://research.ac.upc.es/HPCseminar/ • PhD courses: • 2001-02: 5 courses (135 hours) • 2002-03: 4 courses (110 hours) • 2003-04: 6 courses (165 hours)
Some of the seminar guests • Krste Asanovic (MIT) • Venkata Krishnan (Compaq-DEC) • Trevor Mudge (U. Michigan) • Jim E. Smith (U. Wisconsin) • Luiz A. Barroso (WRL) • Josh Fisher (HP Labs) • Michel Dubois (USC) • Ronny Ronen (Intel, Haifa) • Josep Torrellas (UIUC) • Per Stenstrom (U. Gothenburg) • Wen-mei Hwu (UIUC) • Jim Dehnert (Transmeta) • Fred Pollack (Intel) • Sanjay Patel (UIUC) • Daniel Tabak (George Mason U.) • Walid Najjar (Riverside) • Paolo Faraboschi (HP Labs) • Eduardo Sánchez (EPFL) • Guri Sohi (U. Michigan) • Jean-Loup Baer (Washington Uni.) • Miron Livny (U. Wisconsin) • Thomas Sterling (NASA JPL) • Maurice V. Wilkes (AT&T Labs) • Theo Ungerer (Karlsruhe) • Mario Nemirovsky (Xstreamlogic) • Gordon Bell (Microsoft) • Timothy Pinkston (U.S.C.) • Roberto Moreno (ULPGC) • Kazuki Joe (Nara Women U.) • Alex Veidenbaum (Irvine) • G.R. Gao (U. Delaware) • Ricardo Baeza (U. de Chile, Santiago) • Gabby M. Silberman (CAS-IBM) • Sally A. McKee (U. Utah) • Evelyn Duesterwald (HP-Labs) • Yale Patt (Austin) • Burton Smith (Tera) • Doug Carmean (Intel, Oregon) • David Baker (BOPS)
International collaboration • Mobility programs (*) • Access: Transnational Access for Researchers (2000-03) • Access-2: Transnational Access for Researchers (2002-04) • Networks of Excellence • HIPEAC: High-Performance Embedded Architectures and Compilers, in evaluation (www.hipeac.org) • CoreGRID (*): middleware for GRID, in evaluation (*) Projects of the European Center for Parallelism of Barcelona
Industrial technology transfer • European projects (IST and FET) • INTONE: Innovative OpenMP Tools for Non-experts (2000-03) • DAMIEN: Distributed Applications and Middleware for Industrial use of European Networks (2001-03) • POP: Performance Portability of OpenMP (2001-04) • Antitesys: A Networked Training Initiative for Embedded Systems Design (2002-04) • Attract international companies to establish branches or laboratories in Barcelona • EASi Engineering: S. Girona (*) • Intel Labs: R. Espasa (*) and T. Juan (*), A. Gonzalez (*), • Hewlett-Packard Labs (*) Professors of the Computer Architecture Department (full or part time dedication)
Industrial Relationships • Compaq: Sabbaticals: Roger Espasa (VSSAD), Toni Juan (VSSAD), Marta Jimenez (VSSAD); Interns: Jesus Corbal (VSSAD), Alex Ramirez (WRL); Partnerships: BSSAD • HP: Sabbaticals: Josep Llosa (Cambridge); Interns: Daniel Ortega, Javier Zalamea; Partnerships: Software Prefetching, Two-Level Register File • Sun Microsystems: Interns: Pedro Marcuello, Ramon Canal, Esther Salami, Manel Fernande • Microsoft • IBM: Interns: Xavi Serrano (CAS), Daniel Jimenez (CAS), 3 more people in 2001; Partnerships: Supercomputing (CIRI), Low Power, Databases, Binary Translation; Faculty Awards • Intel: Interns: Adrian Cristal (Haifa), Alex Ramirez (MRL), Pedro Marcuello (MRL); Partnerships: Semantic Gap, Smart Registers, Memory Architecture for Multithreaded Processors, Speculative Vector Processors; Labs in Barcelona: MRL and BSSAD; Advisory Board of MRL • Xstream, Flowstorm, Kambaya: Advising committee • ST-Microelectronics • Analog Devices
Conclusions • Large team (90) with experience in many topics • Computer architecture • System software • Algorithms and applications • Good production • >100 PhD theses • Publications in top conferences (>400) and journals (>150) • Prototypes (3) used in research laboratories • 25 professionals in industry • Long track record of international collaborations • Academic • Industrial
Outline • High Performance Research Group at UPC • Centers of Supercomputing: • CEPBA • CIRI • BSC • Future context in Europe • Networks of Excellence • Large Scale Facilities • Conclusions
CEPBA • October 1991 • DAC: HPC experience • Depts. RME, LSI, FEN, FA: computing needs • CER • R+D on parallelism • Training • Technology transfer • European context
CEPBA Activities • Technological expert • Developments • Service • Training • Technology transfer • T.T. Management • R & D • 24 projects
Service • Access • User support • IBM: 64 Power3, Net. SMP, 32 GB memory, 400 GB disk • COMPAQ: 12 Alpha 21164, SMP, 2 GB memory, 32 GB disk • Parsytec: 16 Pentium II, 2-CPU nodes, 3 networks (Fast Ethernet, Myrinet, HS link), 1 GB memory, 30 GB disk • SGI: 64 R10000, CC-NUMA, 8 GB memory, 360 GB disk
European Mobility Programs • Joint CEPBA - CESCA projects • Stays and access to resources
Technology Transfer (timeline: 1986..., 1991..., 1994-1999, 2000...) • R+D projects: 23 projects • CEPBA Technology Transfer Management: 3 cluster projects, 28 subprojects • Technical management & dissemination • Technological partner & developments
R&D Projects (timeline figure, 1992-2001; category labels: Tools, Parallelization, Metacomputing, System) • Damien • Sep-tools • Intone • Identify • Asra • Promenvir+ • Bonanova • Phase • Apparc • Nanos • Supernode II • Dimemas & Paraver • Permpar • Parmat • Sloegat • Hipsid • Promenvir • ST-ORM • BMW • DDT
T.T. Management (timeline, 1994-2000) • Programs: PACOS 2, PCI-PACOS, PCI-II, CEPBA-TTN • 35 proposals, 28 projects • Promote proposals to EC • Technical management of projects • Dissemination
PCI-PACOS • Hesperia • Neosystems • UPC-EIO • Ayto. Barcelona • Uitesa • UPC-EIO • AMES • CIMNE • TGI • UPM-DATSI • Iberdrola • Uitesa • UPV • Metodos Cuantitativos • Gonfiesa • CESCA • CESGA • AZTI • UPC-LIM • Tecnatom • UMA
PCI-II • Volkswagen • Ricardo • PAC • Ferrari • Genias • P3C • Ospedali Galliera • Le Molinette • Parsytec • PAC • EDS • ENEL • EDF • CRS4 • Reiter • Kemijoki • CANDEMAT • CIMNE • CEPBA-UPC • Italeco • Geospace • Intecs • Univ. Leiden • Intera SP • Intera UK • UPC-DIT • CEPBA-UPC • Inisel Espacio • Infocarto • UPC-TSC • CEPBA-UPC • Cari Verona • AIS • PAC • Univ. Cat. Milan • Cristaleria Española • UNICAN • CEPBA-UPC
CEPBA-TTN • Iberdrola • SAGE • CEPBA-UPC • INDO • CEPBA-UPC • Soler y Palau • CIMNE • CEPBA-UPC • Torres • Software Greenhouse • CEPBA-UPC • ST Mecanica • DERBI • AUSA • CEPBA-UPC • CEPBA • CESCA • UMA • UNICAN • UPM • CEBAL-ENTEC • NEOSYSTEMS • BCN COSIVER • Mides • UPC-EIO • SENER • CIC • UNICAN • CASA • Envision • GTD • Intespace • RUS
References: Technology promotion • European Commission • AIS • CariVerona • EDF-LNH • EDS Italy • ENEL-CRIS • Ferrari Auto • Genias • Geospace • Intecs Sistemi • Intespace • Italeco • Kemijoki • Le Molinette • Ospedali Galliera • Parsytec • QuantiSci LTD • Reiter • Ricardo • Volkswagen • CESCA • CESGA • CIMNE • CRS4 • P3C • PAC • RUS • Catholic Univ. Milan • Politecnico di Milano • UNICAN • UPM • UMA • Univ. of Leiden • UPC-DIT • UPC-EIO • UPC-LIM • UPC-OE • UPC-RME • UPC-TSC • UPM-DATSI • UPV • AZTI • Candemat • CASA • CIC • Cristaleria Española • Envision • Gonfiesa • GTD • Iberdrola • Indra Espacio • Infocarto • SAGE • SENER • Tecnatom • TGI • Uitesa • AMES • AUSA • INDO • Ayto. BCN • BCN COSIVER • CEBAL-ENTEC • DERBI • HESPERIA • Métodos Cuantitativos • Mides • NEOSYSTEMS • QuantiSci SL • Software Greenhouse • Soler y Palau • ST Mecánica • Torres
Budget (PACOS, PCI-II, TTN) • Managed ESPRIT funding: 11.8 MECU (million ECU) in total
CIRI’s mission • CEPBA-IBM Research Institute (CIRI) was a research and development partnership between UPC and IBM. • Established in October 2000 with an initial commitment of four years. • Its mission was to contribute to the community through R&D in Information Technology. • Its objectives were: Research & Development • External R&D Support • Technology Transfer • Education • CEPBA (European Center for Parallelism of Barcelona) was a deep computing research center at the Technical University of Catalonia (UPC), created in 1991.
Organization • Management and Technical Boards evaluate project performance and achievements and recommend future directions. • 70 people were collaborating with the Institute: • Board of Directors (4) • Institute’s Professors (10) • Associate Professors (9) • Researchers (5) • PhD Students (21) • Graduate Students (3) • Undergraduate Students (18)
Introduction: CIRI areas of interest • Deep Computing • Performance Tools: Numerical Codes Web Application Servers • Parallel Programming • Grid • Code Optimization • Computer Architecture • Vector Processors • Network Processors • Data Bases • Performance Optimization • DB2 Development & Testing
CIRI R&D Philosophy • Technology Web • Metacomputing • ST-ORM • Applications • Steel stamping • Structural analysis • MGPOM • MPIRE • MPI • UTE2paraver • OMPITrace • OpenMP • OMPTrace • activity • hw counters • Nested Parallelism • Precedences • Indirect access Computer Architecture Performance Visualization Paraver • Dimemas • Collective, mapping • Scheduling • System scheduling • Self analysis • Performance Driven Proc. Alloc. • Process and memory control • RunTime Scheduling • Dynamic load balancing • Page migration
Barcelona Supercomputing Center Centro Nacional de Supercomputación Professor Mateo Valero Director
High Performance Computing • Theory • Experiment • Computing & Simulation • Aircraft, Automobile Design • Climate and Weather Modeling • Fusion Reactor, Accelerator Design, Material Science, Astrophysics
What drives HPC? “The Need for Speed...” • Computational needs of technical, scientific, digital media and business applications approach or exceed the Petaflop/s range • CFD Wing Simulation (Source: A. Jameson, et al.): 512x64x256 grid (8.3 x 10^6 mesh points), 5000 FLOPs per mesh point, 5000 time steps/cycles, 2.15 x 10^14 FLOPs • CFD Full Plane Simulation (Source: A. Jameson, et al.): 512x64x256 grid (3.5 x 10^17 mesh points), 5000 FLOPs per mesh point, 5000 time steps/cycles, 8.7 x 10^24 FLOPs • Materials Science (Source: D. Bailey, NERSC) • Magnetic materials: Current: 2000 atoms; 2.64 TF/s, 512 GB; Future: HDD simulation, 30 TF/s, 2 TB • Electronic structures: Current: 300 atoms; 0.5 TF/s, 100 GB; Future: 3000 atoms; 50 TF/s, 2 TB • Spare Parts Inventory Planning (Source: B. Dietrich, IBM): modelling the optimized deployment of 10000 part numbers across 100 part depots requires: 2 x 10^14 FLOP/s (12 hours on 10, 650 MHz CPUs); 2.4 PetaFlop/s sustained performance (1 hour turn-around time) • The industry trend toward rapid, frequent modelling for timely business decision support drives higher sustained performance • Digital Movies and Special Effects (Source: Pixar): ~10^14 FLOPs per frame, 50 frames/sec, 90 minute movie: 2.7 x 10^19 FLOPs, ~150 days on 2000 1 GFLOP/s CPUs
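The workload figures on this slide are simple products, so they can be re-derived as a sanity check. The following is a minimal sketch, not part of the original talk: the function names are hypothetical, and perfect parallel scaling is assumed for the run-time estimate.

```python
# Back-of-the-envelope check of the workload figures quoted on the slide.
# Function names are illustrative; perfect scaling across CPUs is assumed.

SECONDS_PER_DAY = 24 * 60 * 60


def cfd_total_flops(mesh_points, flops_per_point=5000, time_steps=5000):
    """Total floating-point operations for a CFD run."""
    return mesh_points * flops_per_point * time_steps


def movie_total_flops(minutes=90, frames_per_sec=50, flops_per_frame=1e14):
    """Total operations to render a movie frame by frame."""
    frames = minutes * 60 * frames_per_sec
    return frames * flops_per_frame


def days_to_run(total_flops, cpus, flops_per_sec_per_cpu):
    """Wall-clock days, assuming the work divides evenly over all CPUs."""
    seconds = total_flops / (cpus * flops_per_sec_per_cpu)
    return seconds / SECONDS_PER_DAY


# CFD wing: mesh size taken directly from the 512x64x256 grid.
wing_points = 512 * 64 * 256
wing_flops = cfd_total_flops(wing_points)

# CFD full plane: mesh-point count taken as quoted on the slide.
plane_flops = cfd_total_flops(3.5e17)

# Digital movie: 90 minutes at 50 frames/s on 2000 CPUs of 1 GFLOP/s each.
movie_flops = movie_total_flops()
movie_days = days_to_run(movie_flops, cpus=2000, flops_per_sec_per_cpu=1e9)

print(f"wing mesh points : {wing_points:.2e}")
print(f"wing total FLOPs : {wing_flops:.2e}")
print(f"plane total FLOPs: {plane_flops:.2e}")
print(f"movie total FLOPs: {movie_flops:.2e}")
print(f"movie run time   : {movie_days:.1f} days")
```

Under these assumptions the wing grid comes to about 8.4 x 10^6 points and 2.1 x 10^14 FLOPs, the full-plane run to about 8.75 x 10^24 FLOPs, and the movie to 2.7 x 10^19 FLOPs, or roughly 156 days. These agree with the slide's rounded values. The full-plane total follows from the quoted 3.5 x 10^17 mesh points, not from the 512x64x256 grid printed next to it.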