Explore the intricate world of computer architecture including abstraction levels, logic synthesis, semiconductor elements, MOSFETs, IC fabrication processes, and microarchitectural parallelism concepts.
CSCE 212 Introduction to Computer Architecture Instructor: Jason D. Bakos
What is Computer Architecture? • The design of computer systems to improve “performance”: • Run programs faster • Use less power, last longer on battery power • Generate less heat, or distribute it more uniformly • Improve video, 3D rendering, encoding, or decoding frame rate • Handle more secure encryption standards with reasonable latency • Achieve routing or network intrusion detection at higher line speeds • Be more scalable • Be less expensive (e.g. higher integration) • Can be achieved via: • Software (better OS, more optimized application code) or • Hardware (processor) • Designing any complex system requires abstraction
Abstraction • Abstraction is used to manage the complexity of a design • Hide details that are not important • [Figure labels: courses 145/146/240/245, 330, 311, 212, 211/611, 211, ELCT 371]
Domains and Levels of Modeling • Three domains: Functional, Structural, Geometric • Each domain ranges from a high level of abstraction to a low level of abstraction • “Y-chart” from Gajski & Kuhn
Domains and Levels of Modeling • Functional domain: Algorithm (behavioral), Register-Transfer Language, Boolean Equation, Differential Equation • “Y-chart” from Gajski & Kuhn
Domains and Levels of Modeling • Structural domain: Processor-Memory-Switch, Register-Transfer, Gate, Transistor • “Y-chart” from Gajski & Kuhn
Domains and Levels of Modeling • Geometric domain: Polygons, Sticks, Standard Cells, Floor Plan • “Y-chart” from Gajski & Kuhn
MIPS Microarchitecture • RTL (datapath), fetch instruction: 1. Address <= PC 2. MemRead 3. PC <= PC + 1 4. IR <= MemData • Control, fetch instruction: 1. IorD = 0 2. MemRead = 1 3. PCEn = 1, ALUSrcA = 0, ALUSrcB = 01, ALUOp = ADD, PCSource = 01 4. IRWrite = 1
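The four RTL fetch steps can be traced in software. The following is a minimal sketch, not the course's simulator: the `State` class, its field names, and the two instruction words in `memory` are illustrative assumptions, and memory is treated as word-addressed so that `PC + 1` matches the slide.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical datapath state; field names mirror the slide's registers."""
    pc: int = 0
    ir: int = 0
    address: int = 0
    mem_data: int = 0
    memory: list = field(default_factory=list)  # word-addressed (assumption)


def fetch(s: State) -> State:
    s.address = s.pc                   # 1. Address <= PC
    s.mem_data = s.memory[s.address]   # 2. MemRead
    s.pc = s.pc + 1                    # 3. PC <= PC + 1
    s.ir = s.mem_data                  # 4. IR <= MemData
    return s


# Two arbitrary 32-bit words stand in for a program (illustrative values only).
state = State(memory=[0x8C010000, 0x00221820])
fetch(state)
print(hex(state.ir), state.pc)
```

Each assignment corresponds to one numbered RTL step; in hardware, the control signals listed on the slide are what enable these transfers on each step.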
Logic Synthesis • Behavior: • S = A + B • Assume A is 2 bits, B is 2 bits, S is 3 bits
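One possible gate-level realization of this behavior is a 2-bit ripple-carry adder. The sketch below is an assumption about structure (a synthesis tool may well produce a different netlist), and the function names `full_adder` and `add2` are made up for illustration. It checks the gate-level result against `A + B` for all 16 input combinations.

```python
def full_adder(a: int, b: int, cin: int) -> tuple:
    """One-bit full adder built from XOR, AND, and OR operations."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def add2(a: int, b: int) -> int:
    """Add two 2-bit values with two chained full adders; result is 3 bits."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    s0, c0 = full_adder(a0, b0, 0)    # low bit, no carry in
    s1, c1 = full_adder(a1, b1, c0)   # high bit, carry from low bit
    return (c1 << 2) | (s1 << 1) | s0  # carry out becomes the third sum bit


# Exhaustive check of the structural model against the behavior S = A + B.
for a in range(4):
    for b in range(4):
        assert add2(a, b) == a + b
```

The third output bit is the final carry, which is why a 2-bit plus 2-bit sum needs 3 bits.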
Logic Gates inv NAND2 NAND3 NOR2
Latches Positive edge-sensitive latch
Semiconductors • Silicon is a group IV element (4 valence electrons, shells: 2, 8, 18, 32…) • Forms covalent bonds with four neighbor atoms (3D cubic crystal lattice) • Si is a poor conductor, but its conduction characteristics may be altered • Add impurities/dopants (each replaces a silicon atom in the lattice): • Makes a better conductor • Group V element (phosphorus/arsenic) => 5 valence electrons • Leaves an electron free => n-type semiconductor (electrons, negative carriers) • Group III element (boron) => 3 valence electrons • Borrows an electron from a neighbor => p-type semiconductor (holes, positive carriers) • [Figure: P-N junction under forward bias and reverse bias]
MOSFETs • Metal-poly-Oxide-Semiconductor structures built onto substrate • Diffusion: Inject dopants into substrate • Oxidation: Form layer of SiO2 (glass) • Deposition and etching: Add aluminum/copper wires • [Figure: NMOS/NFET with body/bulk at GROUND and PMOS/PFET with body/bulk HIGH (S/D to body is reverse-biased); labels: positive voltage (Vdd), negative voltage (rel. to body) (GND), current, channel] • Shorter channel length (distance for electrons) gives a faster transistor
IC Fabrication • Chips are fabricated using a set of masks • Photolithography • Basic steps • oxidize • apply photoresist • remove photoresist with mask • HF acid eats oxide but not photoresist • piranha acid eats photoresist • ion implantation (diffusion, wells) • vapor deposition (poly) • plasma etching (metal)
Layout 3-input NAND
Cell Library (Snap Together) Layout
8” Wafer • 8 inch (200 mm) wafer containing Pentium 4 processors • 165 dies, die area = 250 mm², 55 million transistors, 0.18 µm
Feature Size • Shrink minimum feature size… • Smaller channel length L decreases carrier transit time and increases current • Therefore, W may also be reduced for fixed current • Cg, Cs, and Cd are reduced • Transistor switches faster (~linear relationship)
Clock Speed • Clock speed is affected by: • Fabrication technology • Architecture: how much work is performed in a single cycle • Execution time = instructions per program × cycles per instruction × seconds per cycle • Now we must add to the product: • (number of program threads / number of processor cores)
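The execution-time product can be evaluated directly. This is a minimal sketch under stated assumptions: the function name `execution_time`, its parameter names, and the example numbers are all hypothetical, and the thread/core term is applied exactly as the slide writes it, as one more factor in the product.

```python
def execution_time(instructions: float, cpi: float, clock_hz: float,
                   threads: int = 1, cores: int = 1) -> float:
    """Return execution time in seconds.

    instructions * cpi gives total cycles; dividing by clock_hz is the same
    as multiplying by seconds per cycle. threads / cores is the slide's
    extra factor, used here as a plain multiplier (an assumption).
    """
    seconds = instructions * cpi / clock_hz
    return seconds * (threads / cores)


# Illustrative numbers only: 1e9 instructions, CPI of 2, 1 GHz clock.
single = execution_time(1e9, 2, 1e9)
# Same per-thread work, 4 threads scheduled onto 2 cores.
shared = execution_time(1e9, 2, 1e9, threads=4, cores=2)
print(single, shared)
```

Lowering any one factor (fewer instructions, lower CPI, or a shorter cycle) reduces the product, which is why both fabrication technology and architecture appear in the list above.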
Integration Density Core 2 Duo (2007) has ~300M transistors
Microprocessor Technology • Advances in fabrication (lithography, photoresist, metal layers) • …faster transistor switching (faster processor) • …smaller transistors/wires • …higher integration density • …more “real estate” • …architectural improvements!
Microarchitectural Parallelism • Parallelism => perform multiple operations simultaneously • Instruction-level parallelism • Execute multiple instructions at the same time • Multiple issue • Out-of-order execution • Speculation • Branch prediction • Thread-level parallelism (hyper-threading) • Execute multiple threads at the same time on one CPU • Threads share memory space and pool of functional units • Chip multiprocessing • Execute multiple processes/threads at the same time on multiple CPUs • Cores are symmetrical and completely independent but share a common level-2 cache