Large-Scale Scientific Computing 1946-2006 John G. Zabolitzky

Presentation Transcript


  1. Large-Scale Scientific Computing 1946-2006. John G. Zabolitzky. Eine Zeitreise in die Welt der Computer (A Journey Through Time into the World of Computers).

  2. Segments of Computation
     • 1. Scientific ↔ Commercial ↔ Consumer ↔ Embedded
     • Solution of technical/scientific problems such as weather, fluid dynamics, and nuclear reactor simulation (usually involving many complicated operations on real floating-point numbers), as opposed to commercial problems such as accounting, inventory, and banking (usually involving characters and few, simple operations on fixed-point numbers). Not considered: consumer applications such as music, movies, and games; web servers; embedded devices such as dishwashers, coffeemakers, and automotive systems.
     • 2. Large-Scale ↔ Small/Medium-Scale
     • Looking at the largest problems that can be treated in a given year, not at small-scale work such as laboratory automation, a student paper, or a small research problem (M$ problems, not k$ problems).
     • 3. Mainstream ↔ Experimental, Unique, Small-Market-Share Machines
     • Machines that have had a major, broad influence on science and technology in general.
     • 4. What is a Computer?
     • A stored-program (not fixed, not external), electronic (not electromechanical) computer.

  3. First 30 Years: Time line 1946-1975: Scalar ("von Neumann") Computing
     • 1946 Zuse (electromechanical), ENIAC (wired program), Whirlwind ... early attempts
     • 1950 ERA 1101 (Atlas 1)
     • 1953 ERA 1103 (Atlas 2); IBM 701 ("defense calculator")
     • 1957 IBM 709
     • 1959 CDC 1604
     • 1960 IBM 7090 (= 709T, the transistorized 709)
     • 1962 IBM 7094
     • 1963 CDC 3600
     • 1964 CDC 6600
     • 1965 IBM /360 family
     • 1969 CDC 7600
     • 1971 IBM /360-195

  4. ERA 1101 (1950)
     • Vacuum tubes
     • 2 registers (A(48), Q(24))
     • 24-bit binary parallel
     • Drum memory, 16k words
     • 4,400 add/mul/sec
     Figure key: 1 arithmetic section; 2 power supply; 3 control section; 4 maintenance section; 5 memory, electronic section; 6 memory, drum section; 7 heat transfer unit; 8, 9 control, paper tape reader/punch

  5. ERA 1103 (1953)
     • Vacuum tubes
     • 2 registers (A(72), Q(36))
     • 36-bit binary parallel
     • Williams tube (CRT) memory, 1k words
     • Drum memory, 16k words
     • 4,400 add/mul/sec

  6. IBM 701 ("defense calculator") (1953)
     • Vacuum tubes
     • 2 registers (A(38), Q(36))
     • 36-bit binary parallel
     • Williams tube (CRT) memory, 2k words
     • Drum memory, 8k words
     • 4,000 add/mul/sec

  7. IBM 709 (1957)
     • Vacuum tubes
     • 5 registers (A(38), Q(36), 3 index)
     • 36-bit binary parallel
     • Magnetic core memory, 4/8/32k words
     • Drum memory, 8/16k words
     • 5,500 add/mul/sec

  8. CDC 1604 (1959)
     • Discrete transistors
     • 8 registers (A(96), Q(48), 6 index)
     • 48-bit binary parallel
     • Magnetic core memory, 32k words
     • 40k add/mul/sec

  9. IBM 7090 (1960)
     • Discrete transistors
     • 5 registers (A(38), Q(36), 3 index)
     • 36-bit binary parallel
     • Magnetic core memory, 32k words
     • 40k add/mul/sec

  10. IBM 7094 (1962)
      • Discrete transistors
      • 9 registers (A(38), Q(36), 7 index)
      • 36-bit binary parallel
      • Magnetic core memory, 32k words
      • 80k add/mul/sec

  11. CDC 6600 (1964)
      • Discrete transistors
      • 32 registers (8 X, 8 A, 8 B, 8 instruction stack)
      • 60-bit binary parallel
      • Magnetic core memory, 128k words
      • 1 MFLOPS
      • First fluid-cooled machine

  12. CDC 6600: 10 core modules (each 6 kByte, 130 modules total); 2 logic frames

  13. Discrete wire mat; vector graphic console

  14. "Last week Control Data ... announced the 6600 system. I understand that in the laboratory developing the system there are only 34 people including the janitor. Of these, 14 are engineers and 4 are programmers ... Contrasting this modest effort with our vast development activities, I fail to understand why we have lost our industry leadership position by letting someone else offer the world's most powerful computer." -- Thomas Watson, CEO of IBM, 1964
      "It seems like Mr. Watson has answered his own question." -- Seymour Cray, Control Data Corporation

  15. [Slide contains no transcribed text.]

  16. CDC 7600 (1969)
      • The 7600 has a hardware structure similar to that of the 6600 (discrete transistors), with some improvements:
      • - 12-word instruction stack (was 8 words), for a total of 36 "registers"
      • - 275 nsec small core memory cycle time with 64 kwords (the 6600 had 1000 nsec with 128 kwords), plus 512 kwords of large core memory
      • - 36 MHz clock (was 10 MHz)
      • - more consistently pipelined functional units
      • - faster peripheral processors

  17. IBM /360-195 (1971)
      • Integrated circuits
      • 20 registers (16 GP, 4 FP)
      • 32/64-bit binary parallel
      • Magnetic core memory, 1 Mword max, 756 nsec
      • Silicon cache, 32 kByte (4 kword), 54 nsec
      • Model 195: hidden registers in the CPU to overcome /360 limitations

  18. Compiled by Erich Strohmaier

  19. Second 30 Years: Time line 1976-2006: Vector and Parallel Computing
      • 1976 Cray-1: first successful vector computer (~50 MFLOPS)
      • 1982 Cray X-MP: first multiple-processor shared-memory vector computer
      • 1985 Cray-2: large memory (256 Mword = 2 GByte)
      • 1988 Cray Y-MP: first to break the 1 GFLOPS barrier
      • 1993 Cray T3D: first successful massively parallel machine, 3D torus
          • 16 x 1 GFLOPS < 512 x 0.150 GFLOPS = 76 GFLOPS
      • 1995 Cray T3E: most widely sold MPP machine; broke the 1 TFLOPS barrier
          • ~1700 x 1.2 GFLOPS = 2 TFLOPS
      • 2004 IBM Blue Gene/L: world performance leader (development started 1999)
          • IBM today has a dominant market share (> 50%)
          • leadership recovered after 40 years of CDC/Cray dominance
          • same interconnect structure as Cray T3D/T3E (3D torus)
      • 2006: lowest-power processors (64k x 5 GFLOPS = 320 TFLOPS)
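
The growth implied by this time line can be turned into an average doubling time. The sketch below is only an illustration: it assumes steady exponential growth, takes the 160 MFLOPS Cray-1 peak quoted on the next slide as the 1976 starting point, and uses the 320 TFLOPS figure given for 2006. The function name `doubling_time_years` is made up for this example.

```python
import math

def doubling_time_years(perf_start, perf_end, years):
    """Average time for performance to double, assuming steady exponential growth."""
    doublings = math.log2(perf_end / perf_start)
    return years / doublings

# Figures taken from the slides (peak values, not sustained performance).
cray1_flops = 160e6      # Cray-1, 1976: 160 MFLOPS peak
bg_flops = 320e12        # 2006: 64k x 5 GFLOPS = 320 TFLOPS

growth = bg_flops / cray1_flops
t_double = doubling_time_years(cray1_flops, bg_flops, 2006 - 1976)
print(f"growth factor {growth:.1e}, doubling roughly every {t_double:.2f} years")
```

With these inputs, peak performance grows by a factor of about two million over the 30 years, which corresponds to a doubling roughly every year and a half. Starting from the ~50 MFLOPS figure instead would shorten the doubling time slightly.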

  20. Seymour Cray, Cray-1, 1976
      • Single processor
      • 80/160 MFLOPS peak
      • 1 Mword = 8 MByte
      Photograph courtesy of Charles Babbage Institute, University of Minnesota, Minneapolis

  21. MUCH larger working set:
      - 8 vector registers, 64 words each
      - 8 scalar registers
      - 8 address registers
      - large instruction buffer
      Performance features:
      - vector processing: one operation affects 64 vector elements, streamed through a functional unit
      - small vector startup time
      - chaining between vector operations
      - large, fast semiconductor memory
      - requires vectorization effort
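
The vectorization effort mentioned above largely means restructuring loops into strips that fit a 64-element vector register. The following is a minimal Python sketch of that strip-mining idea, not Cray code: the function name `vector_multiply_add` and the test data are invented for illustration, and the sketch models only how the work is divided, not the timing benefit of chaining.

```python
VECTOR_LENGTH = 64  # elements per vector register, as stated on the slide

def vector_multiply_add(a, b, c):
    """Compute a[i] * b[i] + c[i] in strips of at most VECTOR_LENGTH elements.

    Each strip stands in for one vector multiply followed by one vector add.
    On hardware with chaining, the add could begin consuming multiply results
    before the multiply has finished the strip; that overlap is not modeled here.
    """
    n = len(a)
    result = []
    strips = 0
    for start in range(0, n, VECTOR_LENGTH):
        end = min(start + VECTOR_LENGTH, n)
        # "Vector multiply": one operation applied to the whole strip.
        products = [a[i] * b[i] for i in range(start, end)]
        # "Vector add": takes the multiply results as its input.
        result.extend(p + c[i] for p, i in zip(products, range(start, end)))
        strips += 1
    return result, strips

a = list(range(150))
b = [2] * 150
c = [1] * 150
values, strips = vector_multiply_add(a, b, c)
print(strips, values[:3])
```

A 150-element loop needs three strips (64 + 64 + 22 elements), so the per-instruction overhead is paid three times rather than 150 times, which is where the "small vector startup time" on the slide matters.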

  22. Cray X-MP (1982)
      • 4 processors
      • 800 MFLOPS
      • 16 Mword = 128 MByte

  23. Cray-2 (1985)
      • 4 processors
      • 1200 MFLOPS
      • 256 Mword = 2 GByte
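
The word and byte figures on the Cray slides are related by the 64-bit word size of these machines (8 bytes per word). A small sketch of the conversion, with a hypothetical helper name `words_to_bytes`:

```python
WORD_BITS = 64  # word size of the Cray vector machines

MEGA = 2**20
GIGA = 2**30

def words_to_bytes(words, word_bits=WORD_BITS):
    """Convert a memory size in words to a size in 8-bit bytes."""
    return words * word_bits // 8

cray1_mbyte = words_to_bytes(1 * MEGA) // MEGA     # Cray-1: 1 Mword
xmp_mbyte = words_to_bytes(16 * MEGA) // MEGA      # Cray X-MP: 16 Mword
cray2_gbyte = words_to_bytes(256 * MEGA) // GIGA   # Cray-2: 256 Mword

print(cray1_mbyte, "MByte;", xmp_mbyte, "MByte;", cray2_gbyte, "GByte")
```

The `word_bits` parameter is there because the earlier machines in this talk used other word sizes (24, 36, 48, and 60 bits), so their word counts do not convert at the same rate.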

  24. Minnesota Supercomputer Center, Minneapolis, 1986: CDC Cyber 205, Cray-2 (4), Cray-2 (1)

  25. Cray Y-MP (1988+)
      • 8/16 processors
      • 1-16 GFLOPS
      • 16 Mword to 1 Gword = 128 MByte to 8 GByte

  26. Cray T3D (1993)
      • First widely successful massively parallel system
      • 512 x 0.15 GFLOPS = 76 GFLOPS
      • 4 Gword = 32 GByte distributed memory
      • 3D torus interconnect
      • MPP requires massive software effort

  27. Cray T3E (1995)
      • Most successful massively parallel system in the 1990s
      • 2048 x 1200 MFLOPS = 2.4 TFLOPS max. (8 cabinets)
      • 64 Gword = 256 Gbyte distributed memory (large end of config.)
      • 3D torus interconnect
      • 3 cabinets = 768 processors
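
The peak figures quoted for the massively parallel machines are simply processor count times per-processor peak. A short sketch of that arithmetic, using the processor counts and per-processor rates from the T3D and T3E slides (the function name is invented for this example):

```python
def aggregate_peak_gflops(processors, gflops_per_processor):
    """Theoretical peak of a parallel machine: every processor at full rate."""
    return processors * gflops_per_processor

# (processors, per-processor peak in GFLOPS), as given on the slides
systems = {
    "Cray T3D": (512, 0.15),
    "Cray T3E": (2048, 1.2),
}

peaks = {name: aggregate_peak_gflops(n, rate) for name, (n, rate) in systems.items()}
for name, peak in peaks.items():
    print(f"{name}: {peak:.1f} GFLOPS peak")
```

These are upper bounds. As the slides note, reaching a useful fraction of them on an MPP requires substantial software effort, so sustained application performance is typically well below the product.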

  28. Cache not always useful. Latency, congestion not discussed here.

  29. From: Thomas Lippert, FZJ

  30. From: Thomas Lippert, FZJ

  31. From: Thomas Lippert, FZJ; 1 MW ~ 1 M€/year!
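
The rule of thumb "1 MW ~ 1 M€/year" can be checked by working out which electricity price it implies. The sketch below assumes continuous operation for a full 365-day year and treats the whole 1 M€ as energy cost; both are simplifying assumptions, and the function names are made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years

def annual_energy_kwh(power_mw):
    """Energy drawn in one year at a constant power level, in kWh."""
    return power_mw * 1000 * HOURS_PER_YEAR

def implied_price_eur_per_kwh(annual_cost_eur, power_mw):
    """Electricity price that makes the given power level cost the given amount."""
    return annual_cost_eur / annual_energy_kwh(power_mw)

energy = annual_energy_kwh(1.0)
price = implied_price_eur_per_kwh(1_000_000, 1.0)
print(f"{energy:,.0f} kWh/year, implied price {price:.3f} EUR/kWh")
```

A constant 1 MW draw is 8.76 GWh per year, so the rule corresponds to a price of roughly 0.11 € per kWh.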

  32. After 40 years (1964-2004) of CDC/Cray (vector) dominance, IBM has regained the market leadership.
      Low-power technology is the key to success:
      - high density → fast communication
      - low utility cost, low building cost
      Scalar → Vector → Parallel: increasing burden on the programmer to obtain performance/efficiency