
Lecture 1. Technology Trend

COM503 Parallel Computer Architecture & Programming. Lecture 1. Technology Trend. Prof. Taeweon Suh, Computer Science Education, Korea University.


Presentation Transcript


  1. COM503 Parallel Computer Architecture & Programming • Lecture 1. Technology Trend • Prof. Taeweon Suh, Computer Science Education, Korea University

  2. Transistor Basics • Digital chips are built from transistors • A transistor is a three-terminal, voltage-controlled switch • Two of the terminals are connected or disconnected depending on the voltage on the third • For example, in the switch shown on the slide, the two terminals (d and s) are connected (ON) only when the third terminal (g) is 1

  3. Silicon • Transistors are built out of silicon, a semiconductor • Pure silicon is a poor conductor • Doped silicon conducts well • n-type: free negative charges (electrons); majority carriers are electrons, minority carriers are holes • p-type: free positive charges (holes); majority carriers are holes, minority carriers are electrons

  4. Periodic Table of the Elements

  5. MOS Transistors • Metal-oxide-semiconductor (MOS) transistors consist of: • Polysilicon gate (formerly metal) • Oxide (silicon dioxide) insulator • Doped silicon substrate and wells

  6. MOS Transistors • The MOS sandwich acts as a capacitor (two conductors with an insulator between them) • When a voltage is applied to the gate, the opposite charge is attracted to the semiconductor on the other side of the insulator, which can form a conducting channel of charge

  7. nMOS Transistor • Gate = 1: ON (source and drain are connected) • Gate = 0: OFF (source and drain are not connected)

  8. Transistor Function

  9. CMOS (Complementary MOS) • CMOS is used to build the vast majority of all transistors fabricated today • nMOS transistors pass good 0’s, so connect source to GND • pMOS transistors pass good 1’s, so connect source to VDD

  10. CMOS Layout • Top view • Cross-section

  11. NOT Gate Layout (top view)

  12. NAND Gate Layout
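
The switch behaviour described on the transistor and CMOS slides can be sketched at the logic level. This is a minimal model, not a circuit simulation: logic 1 stands for VDD and 0 for GND, the function names are made up for illustration, and the pMOS rule (conducting when the gate is 0) is the standard complement of the nMOS rule stated on slide 7 rather than something spelled out in the transcript.

```python
# Switch-level sketch of CMOS gates. Logic values: 1 = VDD, 0 = GND.
# All function names are illustrative, not taken from the lecture.

def nmos_on(gate: int) -> bool:
    """nMOS switch: source and drain are connected when the gate is 1."""
    return gate == 1


def pmos_on(gate: int) -> bool:
    """pMOS switch (assumed complement of nMOS): connected when the gate is 0."""
    return gate == 0


def cmos_not(a: int) -> int:
    """Inverter: one pMOS pulls the output up to VDD, one nMOS pulls it down to GND."""
    pull_up = pmos_on(a)        # path from VDD to the output
    pull_down = nmos_on(a)      # path from the output to GND
    assert pull_up != pull_down  # exactly one network conducts
    return 1 if pull_up else 0


def cmos_nand(a: int, b: int) -> int:
    """NAND: two pMOS in parallel (pull-up), two nMOS in series (pull-down)."""
    pull_up = pmos_on(a) or pmos_on(b)
    pull_down = nmos_on(a) and nmos_on(b)
    assert pull_up != pull_down
    return 1 if pull_up else 0


if __name__ == "__main__":
    for a in (0, 1):
        print(f"NOT {a} = {cmos_not(a)}")
    for a in (0, 1):
        for b in (0, 1):
            print(f"{a} NAND {b} = {cmos_nand(a, b)}")
```

The internal assertions check the property that makes CMOS attractive: for every input, either the pull-up or the pull-down network conducts, never both.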

  13. Now, Let’s Make an Inverter Chip • Yield is the fraction of dies that work correctly after fabrication • (Figure: a Core 2 Duo die next to your inverter chip)

  14. (Semiconductor) Technology • An IC (Integrated Circuit) combines dozens to hundreds of transistors on a single chip • VLSI (Very Large Scale Integration) describes the tremendous increase in the number of transistors per chip • (Semiconductor) technology: how small can you make a transistor? • 0.1 µm (100nm), 90nm, 65nm, 45nm, 32nm, 22nm technologies

  15. x86? • What is x86? • A generic term for processors from Intel, AMD, and VIA that implement the same instruction set family • Derived from the model numbers of the first few generations of processors: 8086, 80286, 80386, 80486 → x86 • x86-16: 16-bit processor • x86-32 (aka IA32): 32-bit processor • x86-64: 64-bit processor • Intel takes about 80% of the PC market and AMD about 20% • Apple has also been shipping Intel-based Macs since 2006 • Notes: IA = Intel Architecture; aka = also known as

  16. x86 History (as of 2008)

  17. x86 History (Cont.) • Word sizes over time: 4-bit, 8-bit, 16-bit, 32-bit (i386, i586, i686), 64-bit (x86_64) • 2009: 1st Gen. Core i7 (Nehalem) • 2011: 2nd Gen. Core i7 (Sandy Bridge) • 2012: 3rd Gen. Core i7 (Ivy Bridge) • 2013: 4th Gen. Core i7 (Haswell)

  18. Moore’s Law • Transistor count doubles about every 18 months (Moore’s own 1975 projection was a doubling every two years) • Exponential growth: 2,250 transistors → 42 million → 1.7 billion (Montecito)
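
As a rough illustration of the doubling rule on this slide, the sketch below projects a transistor count forward and, in the other direction, estimates the doubling period implied by two data points. The function names are made up for this example, and the 35-year gap between the 2,250-transistor and 1.7-billion-transistor points is an assumption, since the transcript does not preserve the years from the chart.

```python
import math


def projected_count(n0: float, months: float, doubling_months: float = 18.0) -> float:
    """Transistor count after `months`, starting from n0 and doubling every `doubling_months`."""
    return n0 * 2 ** (months / doubling_months)


def implied_doubling_months(n0: float, n1: float, years: float) -> float:
    """Doubling period (in months) implied by growth from n0 to n1 over `years`."""
    doublings = math.log2(n1 / n0)
    return years * 12 / doublings


if __name__ == "__main__":
    # One doubling period turns 2,250 transistors into 4,500.
    print(projected_count(2250, 18))
    # ASSUMPTION: 35 years separate the two data points quoted on the slide.
    print(round(implied_doubling_months(2250, 1.7e9, years=35), 1), "months per doubling")
```

Under that assumed time span, the implied doubling period comes out somewhat longer than 18 months, which is consistent with the two-year figure often quoted for Moore's Law.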

  19. Feature Size (Technology) Trend

  20. Power Dissipation • Until the early 2000s, Intel and AMD made every effort to raise clock frequency to improve CPU performance • But power consumption became the problem: P ≈ C · VDD² · f • C: capacitance • VDD: supply voltage • f: clock frequency
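
The power relation P ≈ C · VDD² · f is easy to explore numerically. The sketch below is only an illustration: the capacitance, voltage, and frequency values are made-up round numbers, not measurements of any real CPU.

```python
def dynamic_power(c_farads: float, vdd_volts: float, f_hertz: float) -> float:
    """Dynamic power in watts from P ≈ C · VDD² · f."""
    return c_farads * vdd_volts ** 2 * f_hertz


if __name__ == "__main__":
    # Hypothetical operating point, chosen only for illustration.
    c, vdd, f = 1e-9, 1.2, 3e9
    base = dynamic_power(c, vdd, f)
    print(f"baseline:          {base:.2f} W")
    # Doubling the frequency doubles the power (linear in f).
    print(f"2x frequency:      {dynamic_power(c, vdd, 2 * f) / base:.2f}x")
    # Lowering the voltage by 20% cuts power to 0.8² = 0.64 of the baseline (quadratic in VDD).
    print(f"0.8x voltage:      {dynamic_power(c, 0.8 * vdd, f) / base:.2f}x")
```

The quadratic dependence on VDD is why lowering the supply voltage is so much more effective than lowering frequency alone.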

  21. Power Density Trend Source: Intel Corp.

  22. Watch this! • Slide from Prof. H.H. Lee at Georgia Tech

  23. How to Reduce Power Consumption? • Reduce the supply voltage with new technologies, i.e., by shrinking transistors • Keep the clock frequency in a modest range; stop increasing it • Then… what would be the problem? Performance • So the strategy is to integrate many simple CPUs on a chip: dual core, quad core, …
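
The trade-off behind this strategy can be sketched with the same P ≈ C · VDD² · f relation applied per core. The scaling factors below (80% voltage, 80% frequency) are hypothetical, the helper names are made up, and the throughput figure is an upper bound that assumes the work divides perfectly across cores.

```python
def relative_power(cores: int, v_scale: float, f_scale: float) -> float:
    """Total power relative to one core at nominal voltage and frequency."""
    return cores * v_scale ** 2 * f_scale


def relative_peak_throughput(cores: int, f_scale: float) -> float:
    """Best-case throughput relative to one nominal core (perfectly parallel work)."""
    return cores * f_scale


if __name__ == "__main__":
    # HYPOTHETICAL design point: two cores, each at 80% voltage and 80% frequency.
    cores, v_scale, f_scale = 2, 0.8, 0.8
    print(f"power:           {relative_power(cores, v_scale, f_scale):.3f}x")
    print(f"peak throughput: {relative_peak_throughput(cores, f_scale):.1f}x")
```

Under these assumptions two slower cores draw about the same power as one fast core while offering more peak throughput, but only if the software can actually keep both cores busy.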

  24. Reality Check, circa 200x • Conventional processor designs ran out of steam • Power wall (thermal) • Complexity (verification) • Physics (CMOS scaling) • Unanimous direction → multi-core • Simple cores (massive number) • Keep wire communication on a leash • Keep Gordon Moore happy (Moore’s Law) • Architects’ menace: kick the ball to the other side of the court? • Modified from Prof. Sean Lee at Georgia Tech

  25. Multi-core Processor Gala • Slide from Prof. Sean Lee at Georgia Tech

  26. Intel’s Core 2 Duo • 2 cores on one chip • Two levels of caches (L1, L2) on chip • 291 million transistors in 143 mm² with 65nm technology • (Diagram: Core0 and Core1, each with its own IL1 and DL1 caches, sharing one L2 cache) • Source: http://www.sandpile.org

  27. Intel’s Core i7 (Nehalem) • 4 cores on one chip • Three levels of caches (L1, L2, L3) on chip • 731 million transistors in 263 mm² with 45nm technology

  28. Intel’s Core i7 (Sandy Bridge) • 2nd Generation Core i7 • 995 million transistors in 216 mm² with 32nm technology

  29. Intel’s Core i7 (Ivy Bridge) • 3rd Generation Core i7 • 1.4 billion transistors in 160 mm² with 22nm technology • Source: http://blog.mytechhelp.com/laptop-repair/the-ivy-bridge/

  30. Intel’s Core i7 (Haswell) • 4th Generation Core i7 • 1.6 billion transistors in 177 mm² with 22nm technology • 2x graphics performance over Ivy Bridge
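
To compare the five Intel dies above on equal footing, one can compute transistors per square millimetre from the figures quoted on the slides. The sketch uses only those quoted numbers; the table layout and function name are illustrative.

```python
# (name, transistors in millions, die area in mm², process node in nm), as quoted on the slides
CHIPS = [
    ("Core 2 Duo", 291, 143, 65),
    ("Core i7 (Nehalem)", 731, 263, 45),
    ("Core i7 (Sandy Bridge)", 995, 216, 32),
    ("Core i7 (Ivy Bridge)", 1400, 160, 22),
    ("Core i7 (Haswell)", 1600, 177, 22),
]


def density(transistors_millions: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm²."""
    return transistors_millions / area_mm2


if __name__ == "__main__":
    for name, transistors, area, node in CHIPS:
        print(f"{name:24s} {node:3d}nm  {density(transistors, area):5.2f} M transistors/mm²")
```

Density rises with every generation in this list, from roughly 2 million transistors per mm² at 65nm to roughly 9 million at 22nm.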

  31. AMD’s Opteron – Barcelona (2007) • 4 cores on one chip • 1.9GHz clock • 65nm technology • Three levels of caches (L1, L2, L3) on chip • Integrated North Bridge

  32. Intel Teraflops Research Chip • 80 CPU cores • Delivers more than 1 trillion floating-point operations per second (1 teraflops) • Introduced in September 2006

  33. Intel’s 48 Core Processor • 48 x86 cores manufactured with 45nm technology • Nicknamed “single-chip cloud computer” • Debuted in December 2009

  34. Tilera’s 100 cores (June 2011) • Tilera has introduced a range of processors (64-bit Gx family: 36 cores, 64 cores and 100 cores), aiming to take on Intel in servers that handle high-throughput web applications • 64-bit cores running up to 1.5GHz • Manufactured in 40nm technology • (Figure: TILE Gx 3000 Series overview)

  35. IBM Bluegene/Q Processor (2011) • Bluegene/Q processors power Sequoia at Lawrence Livermore National Laboratory, the world’s #3 supercomputer, which delivers 16.32 petaflops with 1,572,864 cores • Each Bluegene/Q chip has 18 cores • First processor supporting hardware transactional memory • Each core is a 64-bit, 4-way multithreaded PowerPC A2 • 16 cores run the actual computations, one runs the operating system, and the remaining one is used to improve chip reliability • 1.47 billion transistors • 1.6 GHz • Source: http://www.top500.org

  36. #1 Supercomputer (2013) • Tianhe-2 (MilkyWay-2) at the National University of Defense Technology, China • Xeon Phi • 3,120,000 cores • 1,024 TB memory • 17,808 kW (about 17.8 MW) power consumption • 33 petaflops • Source: http://www.top500.org
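
A few back-of-the-envelope figures follow from the numbers on the two supercomputer slides. The sketch below is illustrative only: the helper names are made up, it assumes the Sequoia core count includes just the 16 compute cores per chip, and it takes the Tianhe-2 power figure in kilowatts, as TOP500 reports it.

```python
def gflops_per_core(petaflops: float, cores: int) -> float:
    """Average delivered gigaflops per core (1 petaflops = 1e6 gigaflops)."""
    return petaflops * 1e6 / cores


def gflops_per_watt(petaflops: float, kilowatts: float) -> float:
    """Energy efficiency in gigaflops per watt."""
    return (petaflops * 1e6) / (kilowatts * 1e3)


def chips_needed(total_cores: int, compute_cores_per_chip: int) -> int:
    """Number of chips, assuming total_cores counts only compute cores."""
    return total_cores // compute_cores_per_chip


if __name__ == "__main__":
    # Sequoia (Bluegene/Q): 16.32 petaflops, 1,572,864 cores, 16 compute cores per chip
    print("Sequoia chips:          ", chips_needed(1_572_864, 16))
    print("Sequoia GFLOPS/core:    ", round(gflops_per_core(16.32, 1_572_864), 2))
    # Tianhe-2: 33 petaflops, 3,120,000 cores, 17,808 kW
    print("Tianhe-2 GFLOPS/core:   ", round(gflops_per_core(33, 3_120_000), 2))
    print("Tianhe-2 GFLOPS/watt:   ", round(gflops_per_watt(33, 17_808), 2))
```

Both machines land at roughly 10 gigaflops per core, which suggests that the headline performance comes from the sheer number of cores rather than from unusually fast individual cores.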

  37. Performance • If you edit your MS Word document on a dual core, would it run twice as fast? No! • The problem now is how to parallelize applications and use the hardware resources (available cores) efficiently • “If you were plowing a field, which would you rather use: two strong oxen or 1024 chickens?” - Seymour Cray (the father of supercomputing) • Well, it is hard to say in the computing world
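
One common way to quantify why a second core does not double the speed is Amdahl's law, which the slide itself does not name; it is brought in here only as an illustration. The parallel fractions below are made-up values, not measurements of any real application.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only `parallel_fraction` of the run time can use all cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # HYPOTHETICAL workloads: fraction of run time that can be parallelized.
    for fraction in (0.0, 0.5, 0.9, 1.0):
        print(f"parallel fraction {fraction:.1f}: "
              f"2 cores -> {amdahl_speedup(fraction, 2):.2f}x, "
              f"1024 cores -> {amdahl_speedup(fraction, 1024):.2f}x")
```

Only a fully parallel program reaches 2x on two cores; anything with a serial portion gets less, and for such programs 1024 slow cores can lose to two fast ones, which is the point of the oxen-versus-chickens question.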

  38. Parallel Programming Models • Most widely adopted parallel programming models • OpenMP • Shared-memory programming model • Parallel constructs are added to a sequential program written in Fortran, C or C++ • Comparatively simple to use, since the burden of working out the details of the parallel program is left to the compiler • Pthreads: POSIX (Portable Operating System Interface) Threads • Shared-memory programming model • Pthreads are defined as a set of C and C++ programming types and procedure calls • A collection of routines for creating, managing, and coordinating a collection of threads, so it is a library • Programming with Pthreads is much more complex than with OpenMP
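
The create/coordinate/join pattern of a thread library can be sketched with Python's standard `threading` module. This is a stand-in chosen so the example runs without a C toolchain; it is not the Pthreads or OpenMP API, and in CPython the global interpreter lock means it shows the shared-memory structure rather than a real speedup. The function name `parallel_sum` is made up for this example.

```python
import threading


def parallel_sum(values, num_threads=4):
    """Sum `values` using threads that all write into one shared list."""
    partial = [0] * num_threads  # shared memory: visible to every thread
    chunk = (len(values) + num_threads - 1) // num_threads  # ceiling division

    def worker(tid):
        start = tid * chunk
        # Each thread writes only its own slot, so no lock is needed here.
        partial[tid] = sum(values[start:start + chunk])

    threads = [threading.Thread(target=worker, args=(tid,)) for tid in range(num_threads)]
    for t in threads:
        t.start()   # create and launch the threads
    for t in threads:
        t.join()    # wait for all of them to finish
    return sum(partial)


if __name__ == "__main__":
    print(parallel_sum(list(range(1000)), num_threads=4))
```

With a library such as Pthreads the programmer writes this splitting, launching, and joining by hand, whereas OpenMP lets a compiler directive generate the equivalent structure for a loop.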

  39. Parallel Programming Models • MPI: Message Passing Interface • Developed for distributed-memory architectures, where multiple processes execute independently and communicate data as needed by exchanging messages • Most widely used in the high-end technical computing community, where clusters are common • Most vendors of shared-memory systems also provide MPI implementations that leverage the shared address space • Most MPI implementations consist of a specific set of APIs callable from C, C++, Fortran, or Java (source: Wikipedia)
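
The message-passing style can be sketched without an MPI installation by giving each "rank" a mailbox and letting ranks interact only through send and receive. This is an analogy, not MPI: the ranks here are Python threads with `queue.Queue` mailboxes inside one process, and `run_ranks`, `send`, and `recv` are names invented for this example.

```python
import queue
import threading


def run_ranks(data, num_workers=3):
    """Rank 0 scatters chunks of `data`; worker ranks reply with partial sums."""
    # One mailbox per rank; rank 0 is the root. Ranks communicate only via these.
    inbox = [queue.Queue() for _ in range(num_workers + 1)]

    def send(dest, message):
        inbox[dest].put(message)

    def recv(rank):
        return inbox[rank].get()  # blocks until a message arrives

    def worker(rank):
        chunk = recv(rank)             # receive this rank's private copy of the data
        send(0, (rank, sum(chunk)))    # send the partial result back to the root

    threads = [threading.Thread(target=worker, args=(rank,))
               for rank in range(1, num_workers + 1)]
    for t in threads:
        t.start()

    # Root: scatter strided slices (each slice is a copy, so nothing is shared).
    for rank in range(1, num_workers + 1):
        send(rank, data[rank - 1::num_workers])

    # Root: gather one reply per worker, in whatever order they arrive.
    total = 0
    for _ in range(num_workers):
        _rank, partial = recv(0)
        total += partial

    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(run_ranks(list(range(100)), num_workers=3))
```

In a real MPI program the ranks would be separate processes, possibly on different machines, and the mailboxes would be replaced by the library's own send, receive, and collective operations.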
