Itanium CSE 820
IA-64 Intel introduced a new ISA with no backward compatibility to x86 IA-32. What do you get from a clean sheet? Michigan State University Computer Science and Engineering
IA-64
The first product line is the Itanium. Status:
• NEC announced that a 32-processor, Itanium 2-based server achieved the world's best TPC-C benchmark result on a 32-processor SMP platform.
• 1 GHz, 3 MB tertiary cache, 512 GB RAM
SPEC (top in 3/03)
SPECint2000
• Pentium 4, 3 GHz: 1100
• IBM 690, 1.3 GHz: 839
• Pentium 4, 2.2 GHz: 811
• Itanium 2, 1 GHz: 810
SPECfp2000
• Itanium 2, 1 GHz: 1431
• IBM 690, 1.3 GHz: 1266
• Pentium 4, 3 GHz: 1090
Registers
• 128 @ 65-bit general-purpose registers (64-bit value + NaT bit)
• 128 @ 82-bit floating-point registers (2 extra exponent bits over IEEE 80-bit)
• 64 @ 1-bit predicate registers
• 8 @ 64-bit branch registers (for indirect branches)
• Other registers for system control, memory mapping, performance counters, and communication with the OS
Integer Registers
• 0-31: general purpose
• 32-127: used as a register stack, similar to SPARC. Registers are renamed across function calls, and a current frame marker (CFM) identifies the active procedure's frame.
• Special hardware handles stack overflow
Register Rotation
Rotation of registers 32-127 is used for allocating registers in software-pipelined loops.
When combined with predication, a loop can be software-pipelined without unrolling and without separate prologue and epilogue code, which reduces code expansion.
That is, the overhead is reduced, so the technique pays off even for loops with few iterations.
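The rotation idea can be sketched in a few lines of Python. This is only an illustrative model, not the hardware mechanism in detail: the 8-entry rotating region, the class name `RotatingFile`, and the field `rrb` (a rename base) are assumptions made for the sketch. A value written to r32 in one iteration is read as r33 in the next, and the two stage predicates take the place of separate prologue and epilogue code.

```python
ROT_BASE = 32   # first stacked register, as on the slide
ROT_SIZE = 8    # assumed size of the rotating region (illustrative only)


class RotatingFile:
    """Toy model of a rotating register region (names are hypothetical)."""

    def __init__(self):
        self.phys = [0] * ROT_SIZE
        self.rrb = 0  # rename base, adjusted once per loop iteration

    def _index(self, logical):
        # Logical register number -> physical slot inside the rotating region.
        return (logical - ROT_BASE + self.rrb) % ROT_SIZE

    def read(self, logical):
        return self.phys[self._index(logical)]

    def write(self, logical, value):
        self.phys[self._index(logical)] = value

    def rotate(self):
        # After rotating, the value written as rN is visible as rN+1.
        self.rrb = (self.rrb - 1) % ROT_SIZE


def pipelined_add_one(data):
    """Two-stage pipeline: stage 1 loads into r32, stage 2 consumes r33."""
    regs = RotatingFile()
    out = []
    n = len(data)
    for i in range(n + 1):       # one extra iteration drains the pipeline
        p_load = i < n           # stage predicate: load stage active
        p_use = i >= 1           # stage predicate: use stage active
        if p_use:
            out.append(regs.read(33) + 1)
        if p_load:
            regs.write(32, data[i])
        regs.rotate()
    return out
```

The same loop body runs in every iteration; only the predicates decide which stages do work, which is why no separate fill and drain code is needed in this model.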
Explicit Parallelism
One important aspect of IA-64 is that it lets the compiler do more and communicate more information to the hardware.
In particular, the compiler can indicate when an instruction cannot be executed in parallel with its successors.
Group
A sequence of consecutive instructions with no register data dependences among them.
All instructions in a group can be executed in parallel if sufficient hardware resources exist and memory dependences are preserved.
A group can be arbitrarily long, but the compiler must explicitly mark the boundary between groups with a stop.
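How a compiler might place stops can be sketched with a greedy pass. This is a simplified illustration under stated assumptions, not the actual IA-64 rule set: it treats any read or write of a register already written in the current group as a dependence, and it ignores memory dependences, predicates, and architectural special cases. The function name `insert_stops` and the `(dest, sources)` instruction format are made up for the example.

```python
def insert_stops(instrs):
    """Split instructions into groups separated by stops.

    instrs: list of (dest, sources) tuples, where dest is a register name
    or None and sources is a list of register names.
    Returns a list of groups, each a list of instruction indices.
    """
    groups, current, written = [], [], set()
    for idx, (dest, srcs) in enumerate(instrs):
        # A register written earlier in this group may not be read or rewritten.
        conflict = dest in written or any(s in written for s in srcs)
        if conflict and current:
            groups.append(current)        # a stop goes here
            current, written = [], set()
        current.append(idx)
        if dest is not None:
            written.add(dest)
    if current:
        groups.append(current)
    return groups


program = [
    ("r1", ["r2", "r3"]),   # 0: independent
    ("r4", ["r5"]),         # 1: independent of 0
    ("r6", ["r1", "r4"]),   # 2: reads results of 0 and 1
    ("r7", ["r6"]),         # 3: reads result of 2
]
groups = insert_stops(program)
```

For this input the sketch yields three groups: instructions 0 and 1 together, then 2, then 3.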
Bundle
• 128 bits wide
• Three 41-bit instructions (4 MSB are the major opcode; 6 LSB select a predicate register)
• 5-bit template (encoded): specifies the execution unit for each instruction and indicates “stops”
The full opcode combines the 4 MSB with the template info
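The field widths above add up to 5 + 3 × 41 = 128 bits, which can be checked with a small packing sketch. The slide gives only the widths; placing the template in the low 5 bits with the three slots above it is an assumption of this example, the function names are hypothetical, and the template and opcode values used are arbitrary test numbers rather than real encodings.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slots):
    """Pack a 5-bit template and three 41-bit slots into one 128-bit integer."""
    if not 0 <= template <= TEMPLATE_MASK or len(slots) != 3:
        raise ValueError("need a 5-bit template and exactly three slots")
    word = template
    for i, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("slot does not fit in 41 bits")
        word |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return word


def unpack_bundle(word):
    """Inverse of pack_bundle: return (template, [slot0, slot1, slot2])."""
    template = word & TEMPLATE_MASK
    slots = [(word >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
             for i in range(3)]
    return template, slots


def decode_slot(slot):
    """Extract the two fields named on the slide from a 41-bit instruction."""
    return {"major_opcode": slot >> 37, "qp": slot & 0x3F}


# Arbitrary illustrative values, not real instruction encodings.
slot = (8 << 37) | 5
bundle = pack_bundle(0x10, [slot, 0, SLOT_MASK])
```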
Execution Slots
• I-unit: ALU ops, shifts, moves
• M-unit: ALU ops, loads, stores
• F-unit: FP ops
• B-unit: Branches
• L+X: Extended immediates, stops, NOP (uses two instruction slots to hold 64-bit immediates)
Predication
• Predicate registers are set using compare or test instructions
• 10 comparison tests
• Each compare writes 2 predicate registers (the result and its complement)
• Multiple comparisons can be handled in one instruction
• A conditional branch is simply a predicated branch
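A short Python model shows how a compare that writes complementary predicates lets an if/else be executed without a branch. It is a sketch only: the helper names `set_pred`, `cmp_lt`, and `run` are hypothetical, and the program format (qualifying predicate, destination, function) is invented for the illustration. Treating p0 as always true follows the IA-64 convention; everything else is simplified.

```python
def set_pred(pr, index, value):
    """Write a predicate register; p0 stays hardwired to True."""
    if index != 0:
        pr[index] = value


def cmp_lt(pr, p_true, p_false, a, b):
    """Compare a < b and write the result and its complement."""
    result = a < b
    set_pred(pr, p_true, result)
    set_pred(pr, p_false, not result)


def run(pr, program, regs):
    """Execute (qp, dest, fn) triples; an instruction is a no-op if pr[qp] is False."""
    for qp, dest, fn in program:
        if pr[qp]:
            regs[dest] = fn(regs)


def predicated_min(a, b):
    """If-converted form of: x = a if a < b else b."""
    pr = [False] * 64
    pr[0] = True
    regs = {"a": a, "b": b, "x": None}
    cmp_lt(pr, 1, 2, a, b)
    program = [
        (1, "x", lambda r: r["a"]),   # (p1) x = a
        (2, "x", lambda r: r["b"]),   # (p2) x = b
    ]
    run(pr, program, regs)
    return regs["x"]
```

Both assignments are issued every time; exactly one of them takes effect because p1 and p2 are complements.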
Deferred Exception Handling
Itanium uses poison bits:
NaT = “Not a Thing” (65th GPR bit)
NaTVal = “Not a Value” (special FP value)
These are generated by speculative loads (all ops propagate NaT and NaTVal).
There exist nonspeculative loads, which do not defer exceptions.
FP exceptions are handled separately using special FP status registers.
Deferred Exception Handling
If a NaT (or NaTVal) reaches
• a nonspeculative instruction (e.g., a store), an immediate exception is raised
• a chk.s, control branches to a compiler-generated routine that recovers from the speculative op
(Special instructions exist so the OS can save registers with NaT on a context switch.)
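The poison-bit behavior can be modeled in Python as follows. This is a minimal sketch under assumptions: memory is a dictionary, a missing address stands in for a faulting access, and the names `ld_s`, `ld`, `chk_s`, `NatConsumption`, and `MemoryFault` are invented for the example rather than taken from any real toolchain.

```python
NAT = object()          # stand-in for the NaT poison bit
MEMORY = {0x100: 42}    # toy memory; a missing address models a faulting access


class MemoryFault(Exception):
    """Raised by a nonspeculative access to an unmapped address."""


class NatConsumption(Exception):
    """Raised when a nonspeculative instruction consumes a NaT."""


def ld_s(addr):
    """Speculative load: defer the exception by returning NAT."""
    return MEMORY.get(addr, NAT)


def ld(addr):
    """Nonspeculative load: the exception is raised immediately."""
    if addr not in MEMORY:
        raise MemoryFault(hex(addr))
    return MEMORY[addr]


def add(a, b):
    """Ordinary operations propagate NaT instead of faulting."""
    return NAT if a is NAT or b is NAT else a + b


def store(addr, value):
    """A store is nonspeculative, so a NaT operand raises immediately."""
    if value is NAT:
        raise NatConsumption(hex(addr))
    MEMORY[addr] = value


def chk_s(value, recovery):
    """If the value is poisoned, run the compiler-generated recovery code."""
    return recovery() if value is NAT else value


def speculative_increment(addr):
    t = add(ld_s(addr), 1)                  # hoisted, speculative work
    return chk_s(t, lambda: add(ld(addr), 1))   # recovery redoes it nonspeculatively
```

With a valid address the recovery code never runs; with an invalid one, the fault surfaces only at the point where the check is executed.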
Advanced Loads
Hoist loads above stores they may depend on.
The ld.a instruction creates an entry in the ALAT, which records the destination register and the memory address.
On a store, the ALAT is searched by memory address to check for a conflict. If there is a conflict, the ALAT entry is marked invalid.
Advanced Load
Before any nonspeculative instruction (e.g., a store) uses the value from an advanced load, the ALAT is checked.
If OK, clear the ALAT entry.
If not OK:
• With ld.c, reexecute the load
• With chk.a, reexecute the load and any speculative instructions that depend on it
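The ld.a / store / ld.c interaction can be sketched as a small table keyed by destination register. This is an illustrative model only: the class `Alat`, its method names, and the dictionary-based memory and register file are assumptions of the example, and real hardware details such as table capacity and partial address matching are ignored.

```python
class Alat:
    """Toy advanced load address table: dest register -> {addr, valid}."""

    def __init__(self):
        self.entries = {}

    def ld_a(self, mem, regs, dest, addr):
        """Advanced load: perform the load early and record it."""
        regs[dest] = mem[addr]
        self.entries[dest] = {"addr": addr, "valid": True}

    def store(self, mem, addr, value):
        """Store: write memory and invalidate entries with a matching address."""
        mem[addr] = value
        for entry in self.entries.values():
            if entry["addr"] == addr:
                entry["valid"] = False

    def ld_c(self, mem, regs, dest, addr):
        """Check load: return True if the load had to be reexecuted."""
        entry = self.entries.pop(dest, None)    # the entry is cleared either way
        if entry is not None and entry["valid"]:
            return False                        # the early value is still good
        regs[dest] = mem[addr]                  # conflict or missing entry: reload
        return True


mem, regs, alat = {0x10: 1, 0x20: 2}, {}, Alat()

alat.ld_a(mem, regs, "r1", 0x10)
alat.store(mem, 0x20, 7)                        # different address: no conflict
reloaded_first = alat.ld_c(mem, regs, "r1", 0x10)
value_first = regs["r1"]

alat.ld_a(mem, regs, "r1", 0x10)
alat.store(mem, 0x10, 9)                        # same address: entry invalidated
reloaded_second = alat.ld_c(mem, regs, "r1", 0x10)
value_second = regs["r1"]
```

A chk.a would differ from this ld.c model by branching to recovery code that also redoes the instructions that consumed the stale value.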