CS, CoE, EE 362 Digital Computers II: Architecture • Prof. Mark Franklin: jbf@cse.wustl.edu • Course Assistants: Drew Frank: ajf1@cec.wustl.edu • Required Book: “Heuring & Jordan” 2nd Edition • Optional Book: “Intro. VHDL” Yalamanchili • Read: Academic Integrity Statement. • Course Web Site: http://www.cse.wustl.edu/~jbf/cse362.d/cse362.html Mark Franklin, S06
Four Key Questions • What components must every computer have? • How can computers be described, specified and evaluated? • What constitutes computer architecture (hardware, software, firmware, algorithms, etc.)? • How does technology affect computer architecture (chip size, feature size, power, pin density, etc.)?
Essential Computer Components • Processor: interprets and executes instructions. • Memory: stores instructions & data. • Communication Device(s): communicate with the outside world (I/O). • Classic Computer Architecture (SISD: Single Instruction Stream, Single Data Stream) [Diagram components: Processor (Control Unit, ALU), Memory, Input/Output]
Architecture Components • INSTRUCTION SET DESIGN: Programmer-visible instruction set; algorithm, compiler, and OS design; algorithmic complexity. • HIGH-LEVEL COMPONENT ORGANIZATION: Memory system, bus structure, processor design, branch handling, pipelining, execution algorithms; instructions/second, clocks/instruction. • HARDWARE: Detailed logic design, packaging; VLSI & logic design; CAD algorithms; speed, area, power, …
(SIMD) Single Instruction Stream, Multiple Data Stream Architecture • [Diagram components: Program Memory, Program Control Unit, four ALUs, Interconnection Network, Data Memory Unit, Input/Output]
Performance Expression: Amdahl’s Law
Amdahl’s Law • It does no good to have many processors if there is not enough parallelism. • What portion of a computation can be sequential if we want the processors to be used at 50 percent efficiency (S = p/2)?
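The 50 percent question can be worked through numerically. The sketch below assumes the usual p-processor form of Amdahl's Law, S = 1 / (f + (1 - f)/p), where f is the sequential fraction (the symbol f and the function names are illustrative, not from the slides). Setting S = p/2 and solving gives f = 1/(p - 1).

```python
def speedup(seq_fraction: float, p: int) -> float:
    """Amdahl's Law: speedup on p processors when seq_fraction of the work is sequential."""
    return 1.0 / (seq_fraction + (1.0 - seq_fraction) / p)


def max_seq_fraction_for_half_efficiency(p: int) -> float:
    """Sequential fraction f for which S = p/2, i.e. 50% efficiency.

    Derivation: f + (1 - f)/p = 2/p  =>  p*f + 1 - f = 2  =>  f = 1/(p - 1).
    """
    if p <= 1:
        raise ValueError("need at least two processors")
    return 1.0 / (p - 1)


if __name__ == "__main__":
    for p in (2, 4, 16, 64, 1024):
        f = max_seq_fraction_for_half_efficiency(p)
        s = speedup(f, p)
        print(f"p={p:5d}  f={f:.4f}  speedup={s:.2f}  efficiency={s / p:.2f}")
```

The tolerable sequential fraction shrinks roughly as 1/p, which is the point of the slide: large processor counts pay off only when almost all of the work is parallel.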
Generalized Amdahl’s Law

$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$

Example: “Suppose a program runs in 100 seconds on a machine. Multiply operations are responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?” What about 5 times faster? PRINCIPLE: Make the common case fast!
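The multiplication example can be checked with a few lines of code. This is a minimal sketch that works directly in seconds; the function names are illustrative and not part of the course material.

```python
def overall_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Generalized Amdahl's Law."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def required_enhancement(total_time: float, enhanced_time: float, target_speedup: float):
    """Speedup needed on the enhanced part to reach target_speedup overall.

    Returns None when the target cannot be reached, because the
    unenhanced part alone already uses up the whole time budget.
    """
    other_time = total_time - enhanced_time        # time that is not improved
    budget = total_time / target_speedup - other_time
    if budget <= 0:
        return None
    return enhanced_time / budget


if __name__ == "__main__":
    for target in (4, 5):
        need = required_enhancement(100.0, 80.0, target)
        if need is None:
            print(f"{target}x overall: not achievable by speeding up multiply alone")
        else:
            print(f"{target}x overall: multiply must be {need:g}x faster")
```

For the 4x target, the new run time is 25 seconds, of which 20 are untouched, so multiplication must fit in 5 seconds (16 times faster). For the 5x target the budget is 20 seconds, all of it already consumed by the non-multiply work, so no finite improvement suffices.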
Computer Market Partitioning (costs are for processor, not system) • Desktop Computing ($100 - $1,000): price-performance. • Servers ($200 - $2,000): availability (reliability + effectiveness); scalability; throughput. • Embedded Computers ($0.20 - $1,000): real-time performance; power and memory minimization; cost minimization; interface with special-purpose logic; use of processor cores.
HLL (e.g., C, C++, Perl) vs Machine/Assembly Language (AL) • HLL Pros: easier to express algorithms due to higher-level constructs (e.g., For, Case, arithmetic expressions, objects, etc.); type checking (hardware for type checking?); some memory allocation checking. • Assembly Language Pros: more control over the ISA → more speed, less memory; more control over I/O. • A combination is often best for embedded systems: HLL calling AL.
Example: HLL → AL Mapping
HLL: b = c + d*e
AL:
    LOAD  R1, d
    LOAD  R2, e
    LOAD  R3, c
    MPY   R4, R2, R1
    ADD   R5, R4, R3
    STORE R5, b
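To make the mapping concrete, the six instructions can be stepped through with a toy interpreter. This is only a sketch: it assumes three-operand instructions with the destination register first (as the slide's operand order suggests), and the `run` function and sample values are hypothetical.

```python
def run(program, memory):
    """Execute a tiny LOAD/MPY/ADD/STORE program against a dict-based memory."""
    regs = {}
    for line in program:
        op, args = line.split(maxsplit=1)
        operands = [a.strip() for a in args.split(",")]
        if op == "LOAD":            # LOAD Rn, var   : register <- memory
            reg, var = operands
            regs[reg] = memory[var]
        elif op == "MPY":           # MPY Rd, Ra, Rb : Rd <- Ra * Rb
            dst, a, b = operands
            regs[dst] = regs[a] * regs[b]
        elif op == "ADD":           # ADD Rd, Ra, Rb : Rd <- Ra + Rb
            dst, a, b = operands
            regs[dst] = regs[a] + regs[b]
        elif op == "STORE":         # STORE Rn, var  : memory <- register
            reg, var = operands
            memory[var] = regs[reg]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


PROGRAM = [
    "LOAD R1, d",
    "LOAD R2, e",
    "LOAD R3, c",
    "MPY R4, R2, R1",
    "ADD R5, R4, R3",
    "STORE R5, b",
]

if __name__ == "__main__":
    result = run(PROGRAM, {"c": 2, "d": 3, "e": 4})
    print(result["b"])  # b = c + d*e = 2 + 3*4 = 14
```

One HLL statement expands to six machine-level steps, which illustrates both the convenience of the HLL and the finer control available at the assembly level.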
Buses: I • A set of paths (wires) connecting on-chip or off-chip modules. • Serial bus: transmits one bit at a time. • Parallel bus: transmits many bits simultaneously. • Generally time-shared. • Generally has separate data & control paths. • Typically has a separate bus controller or arbiter that decides which modules can use the bus at any given time.
Buses: II • Some common buses: • On-chip: AMBA, Wishbone (generally not standard) • Off-chip: PCI bus family:

    Bus                              32-bit transfer   64-bit transfer
    33-MHz PCI                       133 MB/sec        266 MB/sec
    66-MHz PCI                       266 MB/sec        532 MB/sec
    100-MHz PCI-X                    ---               800 MB/sec
    133-MHz PCI-X                    ---               1 GB/sec
    PCI-e(xpress) serial, 1 lane     500 MB/sec
    PCI-e(xpress) serial, 4 lanes    2 GB/sec

• Off-chip: other buses: SCSI, IDE, Infiniband • Common issues: arbitration, congestion. • Logical equivalence between buses, multiplexers and switches.
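The parallel PCI and PCI-X figures follow from clock rate times bus width. The sketch below assumes one transfer per clock cycle and nominal clocks of 33.33, 66.66, 100 and 133.33 MHz; those exact clock values and the function name are assumptions, and the slide's numbers are rounded.

```python
def peak_bandwidth_mb_per_s(clock_mhz: float, width_bits: int) -> float:
    """Peak transfer rate of a parallel bus, assuming one transfer per clock."""
    return clock_mhz * width_bits / 8   # MHz * bytes per transfer = MB/sec


# (name, assumed nominal clock in MHz, bus width in bits)
BUSES = [
    ("33-MHz PCI, 32-bit", 33.33, 32),
    ("33-MHz PCI, 64-bit", 33.33, 64),
    ("66-MHz PCI, 32-bit", 66.66, 32),
    ("66-MHz PCI, 64-bit", 66.66, 64),
    ("100-MHz PCI-X, 64-bit", 100.0, 64),
    ("133-MHz PCI-X, 64-bit", 133.33, 64),
]

if __name__ == "__main__":
    for name, clock, width in BUSES:
        print(f"{name:24s} {peak_bandwidth_mb_per_s(clock, width):7.1f} MB/sec")
```

These are peak rates. Arbitration and congestion, listed above as common issues, reduce the bandwidth a module actually sees.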
Bandwidth Requirements
Bandwidth Trend
Simple Queuing Theory View of Buses • The bus is a shared resource and can be viewed as a server in a queuing system. • Modules attached to the bus present inputs (i.e., requests) to the server (the bus) and are queued up if the server is busy. • [Diagram components: CPU, Memory, and I/O as request sources; Queue; BUS as Server]
Basic Queueing Theory • Utilization: % of time a server is busy. • Average Queue Length: avg # of jobs in queue. • Average System Delay (latency): avg time from a job's entry into the system to its departure. • Arrival Time Distribution: Poisson arrivals (exponentially distributed interarrival times). • Service Time Distribution: exponentially distributed service times. • Queue Characteristics: infinite length; FIFO service discipline.
Basic Queueing Results (M/M/1) • [Figures: waiting time and queue length for an M/M/1 queue]
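The slide's plots are not recoverable from the text, but the standard closed-form M/M/1 results can be sketched as follows. The symbols are the conventional ones (arrival rate λ, service rate μ, utilization ρ = λ/μ) and are assumptions here, since the original equations did not survive; the function name is illustrative.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state averages for an M/M/1 queue (Poisson arrivals,
    exponential service, one server, infinite FIFO queue)."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate            # utilization
    if rho >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    return {
        "utilization": rho,
        "avg_jobs_in_system": rho / (1 - rho),
        "avg_jobs_in_queue": rho ** 2 / (1 - rho),
        "avg_system_delay": 1 / (service_rate - arrival_rate),
        "avg_wait_in_queue": rho / (service_rate - arrival_rate),
    }


if __name__ == "__main__":
    # Hypothetical bus serving 10 requests per microsecond at increasing load.
    for lam in (2.0, 5.0, 8.0, 9.5):
        m = mm1_metrics(lam, 10.0)
        print(f"rho={m['utilization']:.2f}  "
              f"jobs in system={m['avg_jobs_in_system']:.2f}  "
              f"delay={m['avg_system_delay']:.2f}")
```

Both queue length and delay grow without bound as utilization approaches 1, which is why a heavily loaded shared bus becomes a bottleneck well before it is fully busy.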
Computer Generations • 1: 1950 - 1959 Vacuum Tubes • 2: 1960 - 1968 Transistors • 3: 1969 - 1977 Integrated Circuit • 4: 1978 - 2005 LSI-Large Scale Integration; VLSI-Very LSI • 5: 2005 - 20?? ULSI-Ultra LSI; parallel processing
Technology: How we make a chip (roughly)
Integrated Circuit Cost

$$\text{Cost per die} = \frac{\text{Cost per wafer}}{(\text{Dies per wafer}) \times (\text{Yield})}$$

$$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}} \quad \text{(approximate)}$$

$$\text{Yield} = \frac{1}{\left(1 + (\text{Defects per area}) \times \dfrac{\text{Die area}}{2}\right)^{2}} \quad \text{(empirical observation)}$$

Typical: Die area = 1.5 cm x 1.5 cm; Wafer diameter = 10 inches; Defects per cm² = 1.7; Yield = 50%
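The three formulas chain together naturally in code. This sketch uses the slide's die size, wafer diameter and defect density; the wafer cost of $5,000 is a made-up value for illustration only, and the function names are hypothetical. With these inputs the yield formula as written evaluates to roughly 12%, so the 50% figure listed above is treated here as a separate typical value rather than an output of the formula.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Approximate die count: wafer area / die area (ignores edge losses)."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    return wafer_area / die_area_cm2


def die_yield(defects_per_cm2: float, die_area_cm2: float) -> float:
    """Empirical yield model: 1 / (1 + defects * area / 2)^2."""
    return 1.0 / (1.0 + defects_per_cm2 * die_area_cm2 / 2) ** 2


def cost_per_die(cost_per_wafer: float, wafer_diameter_cm: float,
                 die_area_cm2: float, defects_per_cm2: float) -> float:
    """Wafer cost spread over the good dies only."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2))
    return cost_per_wafer / good_dies


if __name__ == "__main__":
    die_area = 1.5 * 1.5          # cm^2, from the slide
    wafer_diameter = 10 * 2.54    # 10 inches in cm, from the slide
    defects = 1.7                 # per cm^2, from the slide
    wafer_cost = 5000.0           # hypothetical, not from the slide

    print(f"dies per wafer ~ {dies_per_wafer(wafer_diameter, die_area):.0f}")
    print(f"yield          ~ {die_yield(defects, die_area):.3f}")
    print(f"cost per die   ~ ${cost_per_die(wafer_cost, wafer_diameter, die_area, defects):.2f}")
```

Because yield falls off with the square of a term that grows with die area, while dies per wafer falls linearly, cost per die rises much faster than die area.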
TECHNOLOGY TRENDS • Semiconductors: • Transistor Density: +50%/year, quadruple in 4 years. • Die Size: +10 - 25%/year • IC Logic Technology: • Transistors per Chip: +50 - 60%/year • Device Speed: +30%/year • Wire/Communications Speed: ~constant (Cu vs Al) • Magnetic Disk Technology: • Density: +25 - 60% / year • Access Time: +35% / 10 years (8 ms).
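Annual growth rates compound, which is easy to misjudge by eye. The sketch below converts a yearly rate into a multi-year factor and back; the function names are illustrative. At +50%/year the four-year factor is about 5x, and quadrupling takes a little under 3.5 years, so the "quadruple in 4 years" figure is best read as a round approximation.

```python
import math


def growth_factor(rate_per_year: float, years: float) -> float:
    """Total multiplicative growth after compounding for the given years."""
    return (1.0 + rate_per_year) ** years


def years_to_multiply(rate_per_year: float, factor: float) -> float:
    """Years of compounding needed to grow by the given factor."""
    return math.log(factor) / math.log(1.0 + rate_per_year)


if __name__ == "__main__":
    print(f"+50%/year over 4 years: {growth_factor(0.50, 4):.2f}x")
    print(f"years to quadruple at +50%/year: {years_to_multiply(0.50, 4):.2f}")
    print(f"+30%/year device speed over 10 years: {growth_factor(0.30, 10):.1f}x")
```

The contrast with disk access time, which improves by about a third per decade, shows how quickly the gap between logic speed and storage latency widens.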
Wafer Size • [Figure: 12-inch wafer]
SILICON & MAGNETIC DENSITIES
Processor Performance Gains • [Figure: performance relative to the VAX-11/780]
Processor Cost Trends with Time