The Layered Machine

• High-level language level
• Assembly language level
• Operating system level
• Instruction set architecture level
• Microarchitecture level
• Digital logic level

What Does Architecture Mean?
• Enterprise architecture
• Software architecture
• Machine design
• Chipset and motherboard design
• Chip design (Microarchitecture)

The von Neumann Architecture

Very Simple CPU Layout
• Control Unit
• Registers
• ALU

ALU and Registers

Some Important Registers
• PC
  • (program counter) tracks address of next instruction
  • add one to it or set it explicitly
• IR
  • (instruction register) holds current instruction
  • control unit’s master plan for the other parts
• ALU inputs
  • “left” and “right”
  • hold data used by current instruction

Basic Instruction Cycle
• Fetch next instruction into IR using PC
• Change PC to point to next instruction
• Decode current instruction
• Read some memory into a register (optional)
• Execute using ALU and designated registers
• Store a register into some memory (optional)
• Start over with a fetch
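The cycle above can be sketched as a small interpreter loop. This is only an illustration: the opcode names (LOAD_LEFT, LOAD_RIGHT, ADD, STORE, JUMP, HALT), the separate `result` register, and the `run` function are all made up here and do not come from the slides or from any real instruction set.

```python
# Toy machine: every opcode and register name below is hypothetical.
def run(program, memory):
    """Run a list of (opcode, operand) tuples until HALT; return memory."""
    pc = 0              # program counter: address of the next instruction
    left = right = 0    # the two ALU input registers
    result = 0          # ALU output register (an assumption of this sketch)
    while True:
        ir = program[pc]        # fetch the next instruction into IR using PC
        pc += 1                 # change PC to point to the next instruction
        opcode, operand = ir    # decode the current instruction
        if opcode == "LOAD_LEFT":       # read some memory into a register
            left = memory[operand]
        elif opcode == "LOAD_RIGHT":
            right = memory[operand]
        elif opcode == "ADD":           # execute using the ALU inputs
            result = left + right
        elif opcode == "STORE":         # store a register into some memory
            memory[operand] = result
        elif opcode == "JUMP":          # set PC explicitly instead of adding one
            pc = operand
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # falling out of the if-chain starts over with a fetch


program = [
    ("LOAD_LEFT", 0),
    ("LOAD_RIGHT", 1),
    ("ADD", None),
    ("STORE", 2),
    ("HALT", None),
]
final_memory = run(program, [2, 3, 0])
print(final_memory)  # [2, 3, 5]
```

The two load steps and the store step are the "optional" parts of the cycle: ADD touches only registers, while LOAD and STORE are the only instructions in this sketch that touch memory.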

CISC and RISC
• RISC = Reduced Instruction Set Computer
• CISC = Complex Instruction Set Computer
• RISC circuitry is simpler than that of CISC
• Translates to speed, reliability, lower costs
• RISC doesn’t have lots of “high-level” instructions at hardware level
• RISC puts burden on compilers
• CISC often wins over RISC for backward compatibility

EPIC
• Explicitly Parallel Instruction Computing
• Co-developed by Intel and HP
• Itanium is the first implementation of EPIC
• Innovative solution to the branch prediction problem
  • execute all the paths in the code (up to a point)
  • save state of each path in a unique location
  • when the branch is decided, eliminate states for paths not chosen
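The "execute all the paths, then discard" idea can be imitated in software. The sketch below is an analogy only, not Itanium's actual mechanism: the `speculate` function and the dictionary used as machine state are hypothetical, and Python runs the two paths one after another where the hardware would run them side by side.

```python
def speculate(state, condition, then_path, else_path):
    """Run both branch paths on private copies, then keep only one result."""
    # Each path gets its own copy of the state (its "unique location"),
    # so neither path can disturb the other or the original.
    then_state = then_path(dict(state))
    else_state = else_path(dict(state))
    # Once the branch is decided, keep one saved state and drop the other.
    return then_state if condition(state) else else_state


def double_x(s):
    s["x"] *= 2
    return s


def negate_x(s):
    s["x"] = -s["x"]
    return s


start = {"x": 5}
result = speculate(start, lambda s: s["x"] > 0, double_x, negate_x)
print(result)  # {'x': 10}
```

Both paths always do their work, so the gain comes only from not having to wait for the branch decision before starting.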

Pipelining
• Push multiple instructions through the cycle at the same time
• Trick: divide instruction processing into stages
• For example, fetch first instruction
• Then do two things at same time:
  • Decode first instruction
  • Fetch next instruction
• Then do three things at same time:
  • Execute first instruction
  • Decode second instruction
  • Fetch third instruction

4-Stage Pipeline
Fetch → Decode → Execute → Store
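The overlap can be tabulated cycle by cycle. The sketch below assumes an idealized pipeline in which every stage takes exactly one clock cycle and nothing ever stalls; the function name `pipeline_schedule` is made up for this example.

```python
STAGES = ["Fetch", "Decode", "Execute", "Store"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return one dict per clock cycle: stage name -> instruction number."""
    total_cycles = num_instructions + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        busy = {}
        for depth, stage in enumerate(stages):
            # Instruction i (0-based) occupies stage number d during cycle i + d.
            instr = cycle - depth
            if 0 <= instr < num_instructions:
                busy[stage] = instr + 1  # report 1-based instruction numbers
        schedule.append(busy)
    return schedule


for cycle, busy in enumerate(pipeline_schedule(3), start=1):
    print(cycle, busy)
# Cycle 3 shows: {'Fetch': 3, 'Decode': 2, 'Execute': 1}
```

Under these assumptions three instructions finish in 3 + 4 - 1 = 6 cycles, compared with 3 × 4 = 12 cycles if each instruction had to finish before the next one started.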

Superscalar Architecture
• Use multiple function units
  • multiple instructions can execute in parallel
  • each uses its own circuitry
• Problem
  • some instructions shouldn’t execute in parallel
  • too difficult to design CPU that decides
  • put burden on compiler (e.g. the Pentium optimized compiler versus generic compiler)
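One common reason two instructions shouldn't execute in parallel is a data dependence between them. The slides do not say how the check is made, so the sketch below is an assumption: it uses the textbook register-dependence test, and the instruction format `(destination, sources)` and the function name are invented for illustration.

```python
def can_run_in_parallel(first, second):
    """Return True if two register instructions have no data dependence.

    Each instruction is a (destination_register, source_registers) pair,
    with `first` coming earlier in program order.
    """
    dest1, sources1 = first
    dest2, sources2 = second
    reads_after_write = dest1 in sources2   # second needs first's result
    writes_after_read = dest2 in sources1   # second overwrites first's input
    writes_after_write = dest1 == dest2     # both write the same register
    return not (reads_after_write or writes_after_read or writes_after_write)


add = ("r1", ("r2", "r3"))      # r1 = r2 + r3
mul = ("r4", ("r5", "r6"))      # r4 = r5 * r6
sub = ("r7", ("r1", "r6"))      # r7 = r1 - r6, needs the result of add

print(can_run_in_parallel(add, mul))  # True
print(can_run_in_parallel(add, sub))  # False
```

A compiler that takes on this burden could use a test like this to order instructions so that independent ones sit next to each other.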

Multiple Function Units
Fetch → Decode → three Execute units in parallel → Store

Moore’s Law

Kurzweil’s Revision

Memory Organization
• Each location is addressable
• Addresses are binary numbers
• Addresses used at different granularities
  • each bit is possible, but not very likely
  • each byte is possible (most current CPUs are byte-addressable)
  • typical usage is a “word” of memory
• Each CPU designed with a particular word size (e.g. 8-bit, 16-bit, 32-bit, 64-bit)

Memory Addresses
• Question: How many addresses are possible with an 8-bit computer with 512 bits of memory?
• Possible answers:
  • 2**8 = 256 addresses, each of which has two bits
  • 64 addresses using an 8-bit word
• The “8-bit” refers to word size, not total # of bits (or # of addresses) in memory
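The arithmetic behind both answers is a single division. A minimal sketch, with a helper name (`address_count`) that is made up for this example:

```python
def address_count(total_bits, bits_per_location):
    """Number of addressable locations when each holds bits_per_location bits."""
    if total_bits % bits_per_location != 0:
        raise ValueError("memory size is not a whole number of locations")
    return total_bits // bits_per_location


# Second answer: 8-bit words, so 512 / 8 locations.
print(address_count(512, 8))   # 64

# First answer: insist on 2**8 addresses and see how many bits each one gets.
print(512 // 2**8)             # 2
```

Only the second reading matches what "8-bit computer" normally means, since the 8 describes the word size.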

96-bit Memory Examples

Cache Memory
• Basic idea
  • very fast, but more expensive memory
  • often closer and smaller than main memory
  • nowadays multiple levels of cache inside CPU package
  • used to save frequently used data/instructions
• Problem
  • how to decide what to put in cache?
  • locality principle: grab a block when you get a word
  • nowadays cache typically split for data and instructions
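"Grab a block when you get a word" can be shown with a toy cache. The sketch below assumes a direct-mapped layout with 4-word blocks and 2 cache lines; those numbers, the class name `BlockCache`, and its fields are all chosen for illustration and do not describe any particular CPU.

```python
class BlockCache:
    """Toy direct-mapped cache that loads a whole block on every miss."""

    def __init__(self, memory, block_size=4, num_lines=2):
        self.memory = memory
        self.block_size = block_size
        self.num_lines = num_lines
        self.lines = {}     # line index -> (block number, copy of the block)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block = address // self.block_size
        line = block % self.num_lines       # each block maps to one fixed line
        entry = self.lines.get(line)
        if entry is not None and entry[0] == block:
            self.hits += 1
        else:
            # Miss: fetch the whole block around the requested word.
            self.misses += 1
            start = block * self.block_size
            entry = (block, self.memory[start:start + self.block_size])
            self.lines[line] = entry
        return entry[1][address % self.block_size]


memory = list(range(100, 116))      # 16 words of pretend main memory
cache = BlockCache(memory)
values = [cache.read(address) for address in range(8)]
print(values)                       # [100, 101, 102, 103, 104, 105, 106, 107]
print(cache.hits, cache.misses)     # 6 2
```

Reading eight neighboring words costs only two trips to main memory, because each miss brings in the three words that follow.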
