CSCE 212 Introduction to Computer Architecture. Instructor: Jason D. Bakos
What is Computer Architecture?
• The design of computer systems to:
  • Improve “performance”
    • Run programs faster
    • Use less power, last longer on battery power
    • Generate less heat, or heat that is more uniformly distributed
    • Improve video, 3D rendering, encoding, or decoding frame rate
    • Handle more secure encryption standards with reasonable latency
    • Achieve routing or network intrusion detection at higher line speeds
  • Be more scalable
  • Be less expensive (e.g. higher integration)
• Can be achieved via:
  • Software (better OS, more optimized application code), or
  • Hardware (processor)
• Designing any complex system requires abstraction
Abstraction
• Abstraction is used to manage the complexity of a design
• Hide details that are not important
[Figure: abstraction levels labeled with course numbers 145/146/240/245, 330, 311, 212, 211/611, 211, and ELCT 371]
Domains and Levels of Modeling
[Figure: “Y-chart” from Gajski & Kuhn, with three domain axes; levels in each domain are listed here from high to low level of abstraction]
• Functional: Algorithm (behavioral), Register-Transfer Language, Boolean Equation, Differential Equation
• Structural: Processor-Memory-Switch, Register-Transfer, Gate, Transistor
• Geometric: Floor Plan, Standard Cells, Sticks, Polygons
MIPS Microarchitecture
• RTL (datapath), fetch instruction:
  1. Address <= PC
  2. MemRead
  3. PC <= PC + 1
  4. IR <= MemData
• Control, fetch instruction:
  1. IorD = 0
  2. MemRead = 1
  3. PCEn = 1, ALUSrcA = 0, ALUSrcB = 01, ALUOp = ADD, PCSource = 01
  4. IRWrite = 1
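The four RTL fetch steps can be traced with a small Python sketch. It is illustrative only: the dictionary-based state, the `fetch` function name, and the sample instruction words are assumptions, and memory is treated as word-addressed so that PC advances by 1, as written on the slide.

```python
def fetch(state, memory):
    """Apply the four fetch steps to a copy of the processor state.

    `state` is a hypothetical dict holding register values; `memory`
    is a list of instruction words indexed by word address.
    """
    s = dict(state)
    s["Address"] = s["PC"]                # 1. Address <= PC
    s["MemData"] = memory[s["Address"]]   # 2. MemRead
    s["PC"] = s["PC"] + 1                 # 3. PC <= PC + 1 (word-addressed)
    s["IR"] = s["MemData"]                # 4. IR <= MemData
    return s


# Hypothetical two-word program; the values are placeholders.
memory = [0x8C010000, 0x00221820]
state = fetch({"PC": 0}, memory)
print(state["PC"], hex(state["IR"]))
```

Calling `fetch` a second time on the returned state would load the next word, since PC has already been advanced.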
Logic Synthesis
• Behavior:
  • S = A + B
  • Assume A is 2 bits, B is 2 bits, S is 3 bits
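One possible gate-level structure for this behavior is a 2-bit ripple-carry adder with a 3-bit result. The Python sketch below is an assumption about what a structural realization could look like, not the output of any particular synthesis tool; the names `full_adder`, `add2`, and `check_all` are hypothetical. It checks the structure against the behavioral sum for all 16 input combinations.

```python
from itertools import product


def full_adder(a, b, cin):
    """One-bit full adder built from XOR, AND, and OR operations."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def add2(a1, a0, b1, b0):
    """Add two 2-bit values bit by bit; returns the 3 result bits (MSB first)."""
    s0, c0 = full_adder(a0, b0, 0)
    s1, c1 = full_adder(a1, b1, c0)
    return c1, s1, s0


def check_all():
    """Compare the structural adder with the behavioral A + B exhaustively."""
    for a1, a0, b1, b0 in product((0, 1), repeat=4):
        s2, s1, s0 = add2(a1, a0, b1, b0)
        structural = (s2 << 2) | (s1 << 1) | s0
        behavioral = ((a1 << 1) | a0) + ((b1 << 1) | b0)
        assert structural == behavioral
    return True


print(check_all())
```

The third output bit is needed because the largest sum, 3 + 3 = 6, does not fit in 2 bits.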
Logic Gates: inv, NAND2, NAND3, NOR2
Latches Positive edge-sensitive latch
Semiconductors
• Silicon is a group IV element (4 valence electrons; shell capacities: 2, 8, 18, 32…)
• Forms covalent bonds with four neighboring atoms (3D cubic crystal lattice)
• Si is a poor conductor, but its conduction characteristics may be altered
• Add impurities/dopants (each replaces a silicon atom in the lattice):
  • Makes a better conductor
  • Group V element (phosphorus/arsenic) => 5 valence electrons
    • Leaves an electron free => n-type semiconductor (electrons, negative carriers)
  • Group III element (boron) => 3 valence electrons
    • Borrows an electron from a neighbor => p-type semiconductor (holes, positive carriers)
[Figure: P-N junction under forward bias and reverse bias]
MOSFETs
• Metal-poly-Oxide-Semiconductor structures built onto substrate
  • Diffusion: Inject dopants into substrate
  • Oxidation: Form layer of SiO2 (glass)
  • Deposition and etching: Add aluminum/copper wires
[Figure: NMOS/NFET and PMOS/PFET cross-sections showing current through the channel]
• NMOS/NFET: gate at positive voltage (Vdd), body/bulk at GROUND
• PMOS/PFET: gate at negative voltage relative to body (GND), body/bulk HIGH
• Source/drain to body is reverse-biased
• Channel: shorter length means a faster transistor (less distance for electrons)
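The on/off behavior of the two transistor types can be approximated with a switch-level model in Python. This is a rough sketch under stated assumptions: each transistor is treated as an ideal switch controlled by a logic-level gate input, and the inverter and NAND2 use the standard static CMOS arrangement (PMOS pull-up network, NMOS pull-down network), which the slides show only as figures. All function names are hypothetical.

```python
def nmos_on(gate):
    """Ideal NMOS switch: conducts when the gate is high (Vdd)."""
    return gate == 1


def pmos_on(gate):
    """Ideal PMOS switch: conducts when the gate is low (GND)."""
    return gate == 0


def inverter(a):
    """One PMOS to Vdd and one NMOS to GND sharing the same gate input."""
    pull_up = pmos_on(a)
    pull_down = nmos_on(a)
    assert pull_up != pull_down  # exactly one network conducts
    return 1 if pull_up else 0


def nand2(a, b):
    """Two PMOS in parallel to Vdd, two NMOS in series to GND."""
    pull_up = pmos_on(a) or pmos_on(b)
    pull_down = nmos_on(a) and nmos_on(b)
    assert pull_up != pull_down  # exactly one network conducts
    return 1 if pull_up else 0


for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand2(a, b))
```

The internal assertions check that the pull-up and pull-down networks are never on at the same time, which is the property that keeps the output from being shorted between Vdd and GND in this idealized model.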
IC Fabrication
• Chips are fabricated using a set of masks
• Photolithography
• Basic steps
  • oxidize
  • apply photoresist
  • remove photoresist with mask
  • HF acid eats oxide but not photoresist
  • piranha acid eats photoresist
  • ion implantation (diffusion, wells)
  • vapor deposition (poly)
  • plasma etching (metal)
Layout 3-input NAND
Cell Library (Snap Together) Layout
8” Wafer
• 8 inch (200 mm) wafer containing Pentium 4 processors
• 165 dies, die area = 250 mm², 55 million transistors, 0.18 µm process
Feature Size
• Shrink minimum feature size…
  • Smaller L decreases carrier transit time and increases current
  • Therefore, W may also be reduced for fixed current
  • Cg, Cs, and Cd are reduced
  • Transistor switches faster (~linear relationship)
Clock Speed
• Clock speed is affected by:
  • Fabrication technology
  • Architecture: how much work is performed in a single cycle
• Execution time = (instructions per program) × (cycles per instruction) × (seconds per cycle)
• Now we must add to the product:
  • (number of program threads / number of processor cores)
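As a worked example of the three-factor execution time product, the Python sketch below plugs in made-up numbers. The function name `execution_time` and the sample values are assumptions chosen for illustration, and the threads-per-core factor is left out.

```python
def execution_time(instructions, cycles_per_instruction, clock_hz):
    """Execution time in seconds.

    Seconds per cycle is 1 / clock_hz, so the product
    instructions * CPI * seconds-per-cycle becomes a division.
    """
    return instructions * cycles_per_instruction / clock_hz


# Hypothetical program: 1 billion instructions at 2 cycles each.
base = execution_time(1_000_000_000, 2.0, 2_000_000_000)    # 2 GHz clock
faster = execution_time(1_000_000_000, 2.0, 4_000_000_000)  # 4 GHz clock
print(base, faster)
```

Doubling the clock rate halves the time only if the instruction count and cycles per instruction stay the same, which is why the architecture bullet matters as much as the fabrication technology.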
Integration Density Core 2 Duo (2007) has ~300M transistors
Microprocessor Technology
• Advances in fabrication (lithography, photoresist, metal layers)
  • …faster transistor switching (faster processor)
  • …smaller transistors/wires
  • …higher integration density
  • …more “real estate”
  • …architectural improvements!
Microarchitectural Parallelism
• Parallelism => perform multiple operations simultaneously
• Instruction-level parallelism
  • Execute multiple instructions at the same time
  • Multiple issue
  • Out-of-order execution
  • Speculation
  • Branch prediction
• Thread-level parallelism (hyper-threading)
  • Execute multiple threads at the same time on one CPU
  • Threads share memory space and pool of functional units
• Chip multiprocessing
  • Execute multiple processes/threads at the same time on multiple CPUs
  • Cores are symmetrical and completely independent but share a common level-2 cache