Machine Instruction Rulebook • A CPU's design must support its machine instructions • Thus a CPU has an instruction set • Intel, for example, has the x86 instruction set • As an instruction set grows old, designers have a choice • One option: start over (e.g. IBM's S/360) • Typical choice: backward compatibility (x86) • Middle option: modularize the instruction set (e.g. extensions for Intel, ARM)
CISC versus RISC • RISC = Reduced Instruction Set Computer • CISC = Complex Instruction Set Computer • RISC circuitry is simpler than that of CISC • Translates to speed, reliability, lower costs • RISC doesn’t have lots of “high-level” instructions at hardware level • RISC puts burden on compilers • Even CISC chips are typically RISC on the inside • Intel is CISC, ARM is RISC
Pipelining • Push multiple instructions through the fetch–execute cycle at the same time • Trick: divide instruction processing into stages • For example, fetch the first instruction • Then do two things at the same time: • Decode first instruction • Fetch next instruction • Then do three things at the same time: • Execute first instruction • Decode second instruction • Fetch third instruction
An Example Six-Stage Pipeline • FI: Fetch Instruction • DI: Decode Instruction • CO: Calculate Operands (figure out where operands are) • FO: Fetch Operands • EI: Execute Instruction • WO: Write Operand
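The overlap described above can be sketched as a small simulation. This is a minimal, idealized sketch (the function name and representation are illustrative, and it assumes no stalls): instruction i enters stage s during clock tick i + s, so a new instruction enters the pipeline every tick.

```python
# Six-stage pipeline from the slide: instruction i occupies stage s at tick i + s.
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]

def pipeline_schedule(num_instructions):
    """Return a dict mapping clock tick -> list of (instruction, stage) pairs."""
    schedule = {}
    for i in range(num_instructions):
        for s, stage in enumerate(STAGES):
            tick = i + s  # each instruction starts one tick after the previous
            schedule.setdefault(tick, []).append((i, stage))
    return schedule

# During tick 2, instruction 0 is calculating operands, instruction 1 is being
# decoded, and instruction 2 is being fetched -- three things at once:
print(pipeline_schedule(3)[2])  # [(0, 'CO'), (1, 'DI'), (2, 'FI')]
```

Note the last instruction finishes at tick n + 5, so n instructions occupy n + 5 ticks in total; that total reappears in the timing arithmetic on the next slide.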
Adding Stages, Saving Time • Each clock pulse redefined • was: “go through all stages for one instruction” • now: “go through a stage for many instructions” • The “old” clock tick included all six stages • The “new” clock tick is for all stages at once • How do we quantify? Need the delay of the longest stage • Modern pipelines can have 30 or more stages • So, how do we save time? Marginal time per instruction • Consider timing for the nine instructions in the last slide • Without pipeline: 9 “old” clock ticks • With pipeline?
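The question at the end of the slide can be worked out with simple arithmetic. This sketch assumes every stage takes the same delay (so one "old" tick costs six stage-delays) and ignores hazards and stalls:

```python
STAGES = 6  # six-stage pipeline from the previous slide

def delays_without_pipeline(n):
    # One "old" clock tick per instruction; each old tick spans all six
    # stages, so each instruction costs 6 stage-delays.
    return n * STAGES           # measured in stage-delays

def delays_with_pipeline(n):
    # The first instruction takes 6 stage-delays to flow through; after
    # that, one instruction completes every stage-delay.
    return STAGES + (n - 1)     # measured in stage-delays

print(delays_without_pipeline(9))  # 54 stage-delays (9 "old" ticks)
print(delays_with_pipeline(9))     # 14 stage-delays (14 "new" ticks)
```

So the nine instructions need 14 "new" clock ticks instead of 9 "old" ones, but a new tick is roughly one-sixth as long, making the pipelined version nearly four times faster here; the marginal cost per additional instruction drops from six stage-delays to one.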
Creating Stages with Memory • Need some way to separate stages • Need to “pipe” output of one stage to input of next • Regular memory uses flip-flops (edge-triggered) • The pipeline can remember using latches (level-triggered) • So establish latches to remember output of each stage • Latch also serves as input for the next stage
A Third Way: EPIC • Explicitly Parallel Instruction Computing • Co-developed by Intel and HP • Itanium was the first implementation of EPIC • Innovative solution to the branch prediction problem: • don't try to predict at all! • execute all the paths in the code (up to a point) • keep track of register copies for each path • when the branch is decided, free up the “extra” registers
Hazard: Data Conflict • RAW: read after write (a later instruction fetches an operand before the earlier write to it completes)
Superscalar Architecture • Use multiple function units • multiple instructions can execute in parallel • each uses its own circuitry (e.g. multiple ALUs) • Issues • some instructions shouldn’t execute in parallel • difficult to design CPU that decides • put burden on compiler (e.g. the Pentium optimized compiler versus generic compiler)
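The burden placed on the compiler (or on the CPU's issue logic) can be sketched as a toy scheduler. This is an illustrative sketch, not any real CPU's algorithm: each instruction is modeled as (destination, sources), up to `width` instructions issue per cycle, and an instruction that reads a register written in the same group must wait for the next group. Only read-after-write conflicts are checked here.

```python
def issue_groups(instructions, width=2):
    """Group instructions into issue groups of at most `width`,
    splitting whenever an instruction reads a register written
    earlier in the same group (a RAW dependency)."""
    groups, current, written = [], [], set()
    for dest, srcs in instructions:
        depends = any(s in written for s in srcs)
        if depends or len(current) == width:
            groups.append(current)       # close the current group
            current, written = [], set()
        current.append((dest, srcs))
        written.add(dest)
    if current:
        groups.append(current)
    return groups

prog = [("r1", ["r2"]), ("r3", ["r4"]),  # independent: issue together
        ("r5", ["r1"])]                  # reads r1 -> must wait a cycle
print(len(issue_groups(prog)))  # 2 issue groups
```

A compiler doing this statically (as for the Pentium-optimized builds mentioned above) would reorder instructions so that independent pairs sit next to each other, keeping both function units busy.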
Multiple Function Units • [Diagram: Fetch → Decode → multiple parallel Execute units → Store]
Doing Things Out of Order • Dispatcher can grab any instruction • while waiting on a fetch, do something useful • Look out for data hazards • RAW: read after write • WAR: write after read • WAW: write after write • Dispatcher must be aware of dependencies • Renaming registers: pros and cons
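The three hazards listed above can be classified mechanically. A minimal sketch, assuming each instruction is modeled as a pair of register sets (reads, writes) — the representation and function name are illustrative:

```python
def hazards(earlier, later):
    """Return the data hazards between an earlier and a later instruction.
    Each instruction is a (reads, writes) pair of register-name sets."""
    e_reads, e_writes = earlier
    l_reads, l_writes = later
    found = set()
    if e_writes & l_reads:
        found.add("RAW")  # later reads a value the earlier write produces
    if e_reads & l_writes:
        found.add("WAR")  # later overwrites a value the earlier one still needs
    if e_writes & l_writes:
        found.add("WAW")  # the two writes could complete in the wrong order
    return found

# ADD r1, r2, r3  followed by  SUB r4, r1, r5  -> RAW on r1
print(hazards(({"r2", "r3"}, {"r1"}), ({"r1", "r5"}, {"r4"})))  # {'RAW'}
```

This also shows why register renaming helps: WAR and WAW involve only a reused register *name*, so renaming the later write to a fresh physical register removes them, while RAW is a true data dependency that renaming cannot eliminate.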
Improving Performance • Rev up the clock speed • Redesign circuits to reduce delay, e.g. • Replace the ripple adder (e.g. with carry-lookahead): decreases worst-case ALU propagation delay • Reorder some microinstructions • Add an incrementer to the PC register • Add pre-fetcher, pipeline, superscalar units • Add lots of registers (RISC) • Branch prediction, speculative execution • Note the tradeoff among speed, cost, and space • Find things to do in parallel