Intel Itanium 2 Processor Intel’s Server Solution Raymond Ball April 2, 2004
Presentation Overview • Why Intel Itanium 2 in a DSP class? • General specifications and features • Instruction set • DSP in Itanium 2 • Itanium 2 vs. TigerSHARC (?)
Why Itanium 2 • The Itanium 2 is designed for heavily loaded, number-crunching servers, a workload that has some similarities to DSP • It’s always a good idea to see what other solutions are available • Designs tend to borrow ideas from other fields over time, so looking elsewhere may give insight • To see if the power in the processor is really worth the cost • Because I was interested
Specifications (April 2004) • Clock: 0.9-1.5 GHz • L3 cache: up to 6 MB • 64-bit • 128-bit bus (400 MHz) • Price: $3k - $5k each • IA-32 “compatible” • Considered RISC • Pipeline: 8 stages deep • 6 instructions / cycle, in 2 bundles of 3 • Power consumption: 110 W (130 W max) • Registers: 128 + 128 + 64 + 8 (general, floating-point, predicate, branch)
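As a rough sanity check on these numbers, the peak figures can be worked out directly. This is a back-of-the-envelope sketch, not a benchmark: the function names are made up for illustration, and the 400 MHz value is taken as the bus transfer rate exactly as quoted on the slide.

```python
# Figures taken from the specifications slide (April 2004).
CLOCK_HZ = 1.5e9         # top clock speed
INSTR_PER_CYCLE = 6      # 2 bundles of 3 instructions
BUS_WIDTH_BITS = 128
BUS_RATE_HZ = 400e6      # bus rate as quoted on the slide


def peak_issue_rate(clock_hz, instr_per_cycle):
    """Upper bound on instructions issued per second."""
    return clock_hz * instr_per_cycle


def bus_bandwidth_bytes(width_bits, rate_hz):
    """Peak bus bandwidth in bytes per second."""
    return (width_bits // 8) * rate_hz


if __name__ == "__main__":
    print(peak_issue_rate(CLOCK_HZ, INSTR_PER_CYCLE) / 1e9, "G instructions/s")
    print(bus_bandwidth_bytes(BUS_WIDTH_BITS, BUS_RATE_HZ) / 1e9, "GB/s")
```

The bus result, 16 bytes times 400 million transfers per second, matches the 6.4 GB/s figure quoted later in the deck. The issue rate is only a ceiling; real code rarely fills all six slots every cycle.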
Register Stack Engine (RSE) • The first 32 general registers (GR0 – GR31) are global (static) • GR0 is hardwired to 0 • Seen this in SHARC, because an immediate will kill the pipeline • GR32 – GR127 are the 96 stacked registers, which hold the local register frames of procedures • If more room is needed, the oldest registers are pushed out to memory • This transparently maintains the illusion of an infinite number of registers • Only for the GRs (other registers are all global)
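The spill-and-fill behavior can be sketched with a toy model. This is a simplification under stated assumptions, not Intel's implementation: the class and attribute names are hypothetical, frames do not overlap (the real hardware lets a caller's output registers become the callee's inputs), and only register counts are tracked, not values.

```python
class RegisterStackEngine:
    """Toy model: 96 physical stacked registers backed by memory."""

    def __init__(self, physical=96):
        self.physical = physical
        self.frames = []        # sizes of live frames, oldest first
        self.in_memory = 0      # oldest registers currently spilled to memory
        self.spilled_total = 0  # registers written to memory so far
        self.filled_total = 0   # registers read back from memory so far

    def call(self, frame_size):
        """Allocate a frame for a new procedure, spilling if needed."""
        if not 0 <= frame_size <= self.physical:
            raise ValueError("frame must fit in the physical stacked registers")
        self.frames.append(frame_size)
        overflow = sum(self.frames) - self.in_memory - self.physical
        if overflow > 0:
            # Push the oldest registers out to memory to make room.
            self.in_memory += overflow
            self.spilled_total += overflow

    def ret(self):
        """Free the current frame and make the caller's frame resident."""
        self.frames.pop()
        if not self.frames:
            return
        older = sum(self.frames) - self.frames[-1]
        missing = self.in_memory - older  # part of the caller's frame in memory
        if missing > 0:
            self.in_memory -= missing
            self.filled_total += missing


if __name__ == "__main__":
    rse = RegisterStackEngine()
    for _ in range(3):
        rse.call(40)                     # 120 registers requested, 96 available
    print(rse.spilled_total)             # 24 registers went to memory
    rse.ret()
    rse.ret()
    print(rse.filled_total)              # 24 registers were brought back
```

The procedures themselves never see the spills or fills, which is the "illusion of an infinite number of registers" the slide refers to; the cost shows up only as extra memory traffic.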
Instruction set • Instructions come in bundles of 3 operations, and 2 bundles are fetched every cycle • Uses a special Explicitly Parallel Instruction Computing (EPIC) format • The format moves the responsibility for resource management onto the compiler • The template value dictates which type of execution unit each operation is sent to • [Figure: 128-bit bundle layout, with three 41-bit instruction slots in bits 127-87, 86-46, and 45-5, and the 5-bit template in bits 4-0]
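The bundle layout can be made concrete with a few lines of bit manipulation. The sketch below assumes the field boundaries shown on the slide (bits 127, 87, 46, 5, 0), that is, a 5-bit template followed by three 41-bit slots; the function names are illustrative only, and decoding the slots themselves is out of scope.

```python
TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template, slots):
    """Build a 128-bit bundle from a template and three 41-bit slots."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != 3 or any(not 0 <= s <= SLOT_MASK for s in slots):
        raise ValueError("need three slots of at most 41 bits each")
    bundle = template
    for i, slot in enumerate(slots):
        bundle |= slot << (TEMPLATE_BITS + SLOT_BITS * i)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle into (template, (slot0, slot1, slot2))."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("bundle must be a 128-bit value")
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + SLOT_BITS * i)) & SLOT_MASK
        for i in range(3)
    )
    return template, slots


if __name__ == "__main__":
    b = pack_bundle(0x10, (1, 2, 3))
    print(unpack_bundle(b))
```

Because the template travels with the bundle, the hardware does not have to work out at run time which unit each slot needs; the compiler has already decided.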
Bundled Code Example
{ .mii
  add r1 = r2, r3
  sub r4 = r5, r6 ;;
  shr r7 = r8, r9
}
{ .mfi
  ld4 r14=[r56]
  fadd f10=f12,f13
  add r16=r18,r19
}
{ .mmi
  st4 [r16]=r67 ;;
  add r24=r56,r57
  add r28=r58,r59
}
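In this syntax the `;;` marks a stop. The operations between two stops form an instruction group that the hardware may issue in parallel, and a group can span bundle boundaries. A small sketch that splits the slide's listing into its groups follows; the parser is a hypothetical helper that understands only the layout used here, not a real assembler.

```python
EXAMPLE = """
{ .mii
  add r1 = r2, r3
  sub r4 = r5, r6 ;;
  shr r7 = r8, r9
}
{ .mfi
  ld4 r14=[r56]
  fadd f10=f12,f13
  add r16=r18,r19
}
{ .mmi
  st4 [r16]=r67 ;;
  add r24=r56,r57
  add r28=r58,r59
}
"""


def instruction_groups(asm):
    """Split a bundled listing into instruction groups at each ';;' stop."""
    groups, current = [], []
    for raw in asm.splitlines():
        line = raw.strip()
        if not line or line.startswith("{") or line == "}":
            continue  # bundle braces and template names carry no operation
        stop = line.endswith(";;")
        if stop:
            line = line[:-2].strip()
        current.append(line)
        if stop:
            groups.append(current)
            current = []
    if current:
        groups.append(current)  # trailing operations after the last stop
    return groups


if __name__ == "__main__":
    for number, group in enumerate(instruction_groups(EXAMPLE), start=1):
        print(number, group)
```

The listing yields three groups, and the middle one runs across all three bundles. In that group `add r16=r18,r19` and `st4 [r16]=r67` sit together; on real hardware a write followed by a read of the same register normally needs a stop between them, so the example is best read as an illustration of the syntax rather than as verified code.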
Save me compiler! • The instruction set and pipeline are so difficult to handle that you won’t do much better than the compiler • With the EPIC architecture, more resource management is put on the compiler, which means extra work for “human compilers” (people writing assembly by hand) • The most efficient DSP algorithms tend to come from human compilers • It is difficult to use all of the system resources the way a hand-made DSP algorithm does • What’s wrong with r1 = r2 + r3?
DSP Relation • How does the instruction set compare to a DSP processor? • RISC-type instruction set • For example, no mem-to-mem move • Itanium 2 could easily run a DSP algorithm efficiently • The Itanium 2 basically includes every trick in the book thus far, which includes borrowing ideas from DSP
Pro-DSP • Many single-cycle instructions • Instructions are designed for a heavily pipelined environment • Processor has ways of accessing the data in a SIMD fashion (8x8-bit, 4x16-bit, 2x32-bit, 1x64-bit) • High-precision registers (82-bit floating-point accumulator) • People wonder whether 64-bit processing is necessary. Well, THIS is where it’s necessary • High number of registers for fast access
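The SIMD access pattern treats one 64-bit register as several independent lanes. A minimal sketch of a lane-wise add with wraparound is below; `packed_add` is an illustrative name, and the function models only the idea of lanes, not any particular Itanium 2 instruction or its saturation options.

```python
def packed_add(a, b, lane_bits):
    """Add two 64-bit words lane by lane; carries never cross a lane."""
    if lane_bits <= 0 or 64 % lane_bits:
        raise ValueError("lane width must divide 64")
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        # Shift the lane down, add, and keep only lane_bits of the sum.
        lane = ((a >> shift) + (b >> shift)) & lane_mask
        result |= lane << shift
    return result


if __name__ == "__main__":
    a = 0x0001_FFFF_0003_0004
    b = 0x0001_0001_0001_0001
    print(hex(packed_add(a, b, 16)))  # four 16-bit lanes
    print(hex(packed_add(a, b, 64)))  # one 64-bit lane
```

With 16-bit lanes the third lane (0xFFFF + 1) wraps to 0 without disturbing its neighbor, whereas the plain 64-bit add carries into the top field. For DSP work this means four 16-bit samples can be processed for the price of one operation.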
Anti-DSP • No hardware loops • No hardware circular buffers • Only a single bus (although a fast one, at 6.4 GB/s) • High power usage
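To see what the first two points cost, here is a sketch of an FIR filter step where the circular buffer and the loop are both handled in software. The class and function names are hypothetical; the point is only that the index wrap and the loop bookkeeping, which a DSP with dedicated hardware gets for free, become explicit work.

```python
class CircularBuffer:
    """Software delay line: the index wrap is an explicit modulo."""

    def __init__(self, size):
        self.data = [0.0] * size
        self.head = 0  # index of the most recent sample

    def push(self, sample):
        self.head = (self.head + 1) % len(self.data)  # wrap done in software
        self.data[self.head] = sample

    def tap(self, k):
        """Return the sample from k steps in the past."""
        return self.data[(self.head - k) % len(self.data)]


def fir_step(buf, coeffs, sample):
    """Push one input sample and return one filtered output sample."""
    buf.push(sample)
    acc = 0.0
    for k, c in enumerate(coeffs):  # loop counter and branch are explicit too
        acc += c * buf.tap(k)
    return acc


if __name__ == "__main__":
    coeffs = [0.5, 0.5]  # two-tap moving average
    buf = CircularBuffer(len(coeffs))
    print([fir_step(buf, coeffs, x) for x in (1.0, 3.0, 5.0)])
```

Each tap costs a multiply-accumulate plus the address arithmetic around it, so the extra instructions have to be hidden by the wide issue width rather than removed by dedicated hardware.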
TigerSHARC vs. Itanium 2 • COST! ($0.3k vs. $3k) • Both heavily pipelined • Both very hard to code by hand • There really is no comparison • The processors were made for two different purposes • The framework that is typically built around the chips makes it even harder to compare
Conclusion • You get what you pay for… or maybe a little less • The Itanium 2 is considered to be a high-end server processor • Anything high-end tends to be very overpriced (rack mount equipment) • Sure, it’s a DSP processor, but for that price it should make you toast in the morning too
References • Intel Itanium 2 Processor Hardware Developer’s Manual • Intel Itanium 2 Processor Reference Manual • Stefan Rusu, Jason Stinson, Simon Tam, Justin Leung, Harry Muljono, and Brian Cherkauer, “A 1.5-GHz 130-nm Itanium 2 Processor With 6MB On-die L3 Cache,” IEEE Journal of Solid-State Circuits, vol. 38, no. 11, November 2003.