The Layered Machine
• High-level language level
• Assembly language level
• Operating system level
• Instruction set architecture level
• Microarchitecture level
• Digital logic level

What Does Architecture Mean?
• Enterprise architecture
• Software architecture
• Machine design
• Chipset and motherboard design
• Chip design (microarchitecture)

Very Simple CPU Layout
• Control Unit
• Registers
• ALU

Some Important Registers
• PC (program counter)
  • tracks the address of the next instruction
  • add one to it or set it explicitly
• IR (instruction register)
  • holds the current instruction
  • the control unit’s master plan for the other parts
• ALU inputs
  • “left” and “right”
  • hold the data used by the current instruction

Basic Instruction Cycle
• Fetch the next instruction into IR using PC
• Change PC to point to the next instruction
• Decode the current instruction
• Read some memory into a register (optional)
• Execute using the ALU and designated registers
• Store a register into some memory (optional)
• Start over with a fetch
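
The cycle above can be sketched as a small interpreter loop. This is only an illustration: the opcodes (`LOAD`, `ADD`, `STORE`, `JUMP`, `HALT`), the single `acc` register, and the memory layout are invented for the example and do not correspond to any real instruction set.

```python
def run(memory):
    """Run a toy program stored in `memory` (hypothetical machine)."""
    pc = 0    # program counter: address of the next instruction
    acc = 0   # one data register standing in for the ALU inputs/result
    while True:
        ir = memory[pc]              # fetch the instruction into IR using PC
        pc += 1                      # change PC to point to the next instruction
        op, arg = ir                 # decode the current instruction
        if op == "LOAD":
            acc = memory[arg]        # read some memory into a register
        elif op == "ADD":
            acc = acc + memory[arg]  # execute using the "ALU"
        elif op == "STORE":
            memory[arg] = acc        # store a register into some memory
        elif op == "JUMP":
            pc = arg                 # set PC explicitly
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {op!r}")


# Instructions live at addresses 0-3, data at addresses 5-7.
program = [
    ("LOAD", 5), ("ADD", 6), ("STORE", 7), ("HALT", None),
    None, 2, 3, 0,
]
```

Running `run(list(program))` leaves the sum of the two data words (2 + 3) in address 7.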

CISC and RISC
• RISC = Reduced Instruction Set Computer
• CISC = Complex Instruction Set Computer
• RISC circuitry is simpler than that of CISC
  • translates to speed, reliability, lower costs
• RISC doesn’t have lots of “high-level” instructions at the hardware level
• RISC puts the burden on compilers
• CISC often wins over RISC for backward compatibility

EPIC
• Explicitly Parallel Instruction Computing
• Co-developed by Intel and HP
• Itanium is the first implementation of EPIC
• Innovative solution to the branch prediction problem
  • execute all the paths in the code (up to a point)
  • save the state of each path in a unique location
  • when the branch is decided, eliminate the states for paths not chosen
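
As a loose software analogy for the "execute all the paths" idea, the sketch below runs both sides of a branch on separate copies of the state and keeps only the copy for the path that is finally chosen. The function name `run_both_paths` and the dictionary-based state are assumptions made for illustration; this is not how Itanium hardware is implemented.

```python
def run_both_paths(state, condition, if_taken, if_not_taken):
    """Evaluate both branch paths, then keep only the chosen result."""
    # Each path gets its own copy of the state (its "unique location").
    taken_state = if_taken(dict(state))
    not_taken_state = if_not_taken(dict(state))
    # Once the branch is decided, the state of the other path is dropped.
    return taken_state if condition(state) else not_taken_state


start = {"x": 5}
final = run_both_paths(
    start,
    condition=lambda s: s["x"] > 3,
    if_taken=lambda s: {**s, "y": 1},
    if_not_taken=lambda s: {**s, "y": 2},
)
```

The original `start` state is never modified, since each path only ever sees a copy.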

Pipelining
• Push multiple instructions through the cycle at the same time
• Trick: divide instruction processing into stages
• For example, fetch the first instruction
• Then do two things at the same time:
  • Decode the first instruction
  • Fetch the next instruction
• Then do three things at the same time:
  • Execute the first instruction
  • Decode the second instruction
  • Fetch the third instruction

4-Stage Pipeline
Fetch → Decode → Execute → Store
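
The overlap described above can be tabulated with a short sketch. It assumes an idealized pipeline in which every stage takes exactly one clock cycle and nothing ever stalls; the function name `pipeline_schedule` is made up for this example.

```python
STAGES = ["Fetch", "Decode", "Execute", "Store"]


def pipeline_schedule(n_instructions, stages=STAGES):
    """Return one dict per clock cycle mapping stage name -> instruction index."""
    total_cycles = n_instructions + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        busy = {}
        for position, name in enumerate(stages):
            # The instruction in stage `position` entered the pipeline
            # `position` cycles ago.
            instr = cycle - position
            if 0 <= instr < n_instructions:
                busy[name] = instr
        schedule.append(busy)
    return schedule


for cycle, busy in enumerate(pipeline_schedule(3)):
    print(cycle, busy)
```

Under these assumptions, three instructions finish in 6 cycles instead of the 12 needed when each instruction runs all four stages before the next one starts.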

Superscalar Architecture
• Use multiple function units
  • multiple instructions can execute in parallel
  • each uses its own circuitry
• Problem
  • some instructions shouldn’t execute in parallel
  • too difficult to design a CPU that decides
  • put the burden on the compiler (e.g. the Pentium optimized compiler versus a generic compiler)

Multiple Function Units
Fetch → Decode → Execute (three parallel units) → Store
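
One reason some instructions shouldn't execute in parallel is that one of them depends on a register the other writes. The sketch below shows a simplified version of the kind of check a compiler might make before scheduling two instructions side by side. The `(destination, sources)` instruction format and the function name are assumptions for illustration, and a real scheduler also has to consider memory accesses and function-unit availability.

```python
def can_issue_together(first, second):
    """Each instruction is (dest_register, set_of_source_registers)."""
    dest1, srcs1 = first
    dest2, srcs2 = second
    read_after_write = dest1 in srcs2   # second reads what first writes
    write_after_read = dest2 in srcs1   # second overwrites what first reads
    write_after_write = dest1 == dest2  # both write the same register
    return not (read_after_write or write_after_read or write_after_write)


add_a = ("r1", {"r2", "r3"})   # r1 = r2 + r3
add_b = ("r4", {"r5", "r6"})   # r4 = r5 + r6 (independent of add_a)
add_c = ("r4", {"r1", "r5"})   # r4 = r1 + r5 (needs add_a's result)
```

Here `add_a` and `add_b` could go to separate function units, while `add_c` has to wait for `add_a`.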

Memory Organization
• Each location is addressable
• Addresses are binary numbers
• Addresses are used at different granularities
  • each bit is possible, but not very likely
  • each byte is possible; most current machines are byte-addressable, though data is usually moved a word at a time
  • typical usage is a “word” of memory
• Each CPU is designed with a particular word size (e.g. 8-bit, 16-bit, 32-bit, 64-bit)

Memory Addresses
• Question: How many addresses are possible with an 8-bit computer with 512 bits of memory?
• Possible answers:
  • 2**8 = 256 addresses, each of which has two bits (512 / 256 = 2)
  • 64 addresses using an 8-bit word (512 / 8 = 64)
• The “8-bit” refers to word size, not the total # of bits (or # of addresses) in memory
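
The arithmetic behind the two possible answers can be checked with a couple of helper functions. The names `address_count` and `address_width` are made up for this sketch.

```python
def address_count(total_bits, bits_per_location):
    """Number of addressable locations in a memory of `total_bits` bits."""
    if total_bits % bits_per_location != 0:
        raise ValueError("memory size is not a whole number of locations")
    return total_bits // bits_per_location


def address_width(n_addresses):
    """Bits needed to give each of `n_addresses` locations a distinct address."""
    return max(1, (n_addresses - 1).bit_length())


words = address_count(512, 8)       # 8-bit words
tiny_cells = address_count(512, 2)  # 2-bit locations
```

With 8-bit words there are 64 locations, which need only 6 address bits. Using all 2**8 = 256 addresses on the same 512 bits leaves just 2 bits per location.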

Cache Memory
• Basic idea
  • very fast, but more expensive memory
  • often closer and smaller than main memory
  • nowadays multiple levels of cache inside the CPU package
  • used to save frequently used data/instructions
• Problem
  • how to decide what to put in cache?
  • locality principle: grab a block when you get a word
  • nowadays cache is typically split for data and instructions
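
The "grab a block when you get a word" idea can be sketched with a tiny cache model. The class name `BlockCache`, the block size, the capacity, and the least-recently-used eviction rule are all assumptions chosen for illustration; the slides do not specify a replacement policy.

```python
from collections import OrderedDict


class BlockCache:
    """Toy cache that fetches a whole block of words on every miss."""

    def __init__(self, memory, block_size=4, max_blocks=2):
        self.memory = memory
        self.block_size = block_size
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()  # block number -> list of words
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block_no, offset = divmod(address, self.block_size)
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)  # mark as most recently used
        else:
            self.misses += 1
            start = block_no * self.block_size
            # Locality principle: bring in the neighbouring words too.
            self.blocks[block_no] = self.memory[start:start + self.block_size]
            if len(self.blocks) > self.max_blocks:
                self.blocks.popitem(last=False)  # evict least recently used block
        return self.blocks[block_no][offset]
```

Reading eight consecutive addresses with a block size of 4 costs two misses (one per block) and six hits, which is the payoff of fetching neighbouring words along with the one requested.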