
Intel Itanium 2 Processor

Intel Itanium 2 Processor. Intel’s Server Solution Raymond Ball April 2, 2004. Presentation Overview. Why Intel Itanium 2 in a DSP class? General specifications and features Instruction set DSP in Itanium 2 Itanium 2 vs. TigerSHARC (?). Why Itanium 2.


Presentation Transcript


  1. Intel Itanium 2 Processor Intel’s Server Solution Raymond Ball April 2, 2004

  2. Presentation Overview • Why Intel Itanium 2 in a DSP class? • General specifications and features • Instruction set • DSP in Itanium 2 • Itanium 2 vs. TigerSHARC (?)

  3. Why Itanium 2 • Itanium 2 is designed for heavily loaded, number-crunching servers, which have some similarities to DSP • It’s always a good idea to see what other solutions are available • Over time, designs tend to borrow ideas from other fields, which may give insight • To see if the power in the processor is really worth the cost • Because I was interested

  4. Specifications (April 2004) • Clock: 0.9-1.5 GHz • L3 cache: up to 6 MB • 64-bit • 128-bit bus (400 MHz) • Price: $3k - $5k each • IA-32 “compatible” • Considered RISC • Pipeline: 8 stages deep • 6 instructions / cycle, in 2 bundles of 3 • Power consumption: 110 W (130 W max) • Registers: 128+128+64+8 (general, floating-point, predicate, branch)

  5. Register Stack Engine (RSE) • First 32 registers (GR0 – GR31) are global (static) • GR0 is hardwired to 0 • Seen this in SHARC, because an immediate will kill the pipeline • The remaining 96 registers (GR32 – GR127) are stacked: each procedure sees its local register frame starting at GR32 • If more room is needed, registers from older frames are spilled to memory • Transparently maintains the illusion of an infinite number of registers • Only for the GRs (other registers are all global)
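The spill-and-fill behaviour described on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions, not the real RSE: the class name `ToyRegisterStack`, its methods, and the policy of spilling the oldest registers first are hypothetical choices made for the example; only the figure of 96 stacked registers comes from the slide.

```python
class ToyRegisterStack:
    """Toy model of a register stack that spills old frames to memory.

    Hypothetical illustration only; the real RSE is more involved.
    """

    def __init__(self, physical=96):
        self.physical = physical  # stacked physical registers available
        self.frames = []          # frame sizes, oldest first
        self.spilled = 0          # oldest registers currently saved in memory

    def in_registers(self):
        """Number of stacked registers currently held on chip."""
        return sum(self.frames) - self.spilled

    def call(self, frame_size):
        """Allocate a frame for a callee; return how many registers were spilled."""
        if not 0 <= frame_size <= self.physical:
            raise ValueError("frame must fit in the physical register file")
        self.frames.append(frame_size)
        overflow = max(0, self.in_registers() - self.physical)
        self.spilled += overflow  # oldest registers go to memory
        return overflow

    def ret(self):
        """Release the newest frame; return how many registers were refilled."""
        self.frames.pop()
        if not self.frames:
            filled, self.spilled = self.spilled, 0
            return filled
        # The caller's whole frame must be back in registers after the return.
        allowed = sum(self.frames) - self.frames[-1]
        filled = max(0, self.spilled - allowed)
        self.spilled -= filled
        return filled


rs = ToyRegisterStack(physical=96)
rs.call(40)
rs.call(40)
print(rs.call(40))  # third frame overflows the 96 registers, so some are spilled
print(rs.ret())     # caller's frame is still resident, nothing to refill
print(rs.ret())     # the oldest frame was partly spilled and is refilled
```

The program never sees the spills: each procedure simply uses its own frame, which is what gives the illusion of an unlimited register supply.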

  6. Instruction set • Instructions come in bundles of 3 operations, and 2 bundles are fetched every cycle • Uses a special Explicitly Parallel Instruction Computing (EPIC) format • The format moves the responsibility for resource management onto the compiler • The template value dictates which type of execution unit each operation is sent to • [Figure: 128-bit bundle layout, with field boundaries marked at bit 127, bit 87, bit 46, bit 5, and bit 0]
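The bit positions marked on this slide correspond to a 5-bit template in bits 0-4 followed by three 41-bit instruction slots. A minimal sketch of that layout, with a bundle held as a Python integer, might look like the following; the function names `unpack_bundle` and `pack_bundle` are made up for the illustration, and the sample field values are arbitrary.

```python
TEMPLATE_BITS = 5   # bits 0-4
SLOT_BITS = 41      # three slots: bits 5-45, 46-86, 87-127
SLOT_MASK = (1 << SLOT_BITS) - 1


def unpack_bundle(bundle):
    """Split a 128-bit bundle into (template, slot0, slot1, slot2)."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("bundle must fit in 128 bits")
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slot0 = (bundle >> 5) & SLOT_MASK
    slot1 = (bundle >> 46) & SLOT_MASK
    slot2 = (bundle >> 87) & SLOT_MASK
    return template, slot0, slot1, slot2


def pack_bundle(template, slot0, slot1, slot2):
    """Inverse of unpack_bundle: assemble the four fields into one integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    for slot in (slot0, slot1, slot2):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each slot must fit in 41 bits")
    return template | (slot0 << 5) | (slot1 << 46) | (slot2 << 87)


# Arbitrary field values, chosen only to show the round trip.
fields = (0x10, 0x123, 0x456, 0x789)
assert unpack_bundle(pack_bundle(*fields)) == fields
```

The arithmetic checks out: 5 + 3 × 41 = 128 bits, so nothing is left over in the bundle.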

  7. Bundled Code Example

      { .mii
        add r1 = r2, r3
        sub r4 = r5, r6 ;;
        shr r7 = r8, r9
      }
      { .mfi
        ld4 r14=[r56]
        fadd f10=f12,f13
        add r16=r18,r19
      }
      { .mmi
        st4 [r16]=r67 ;;
        add r24=r56,r57
        add r28=r58,r59
      }
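In IA-64 assembly, the `;;` in the example marks a stop, which ends an instruction group; groups can therefore start and end in the middle of a bundle. The sketch below only shows where the stops in the slide's example fall. The helper name `instruction_groups` and the list-of-strings representation are assumptions for the illustration, and no dependency checking is attempted.

```python
def instruction_groups(bundles):
    """Split bundled instructions into groups at each ';;' stop.

    bundles: list of bundles, each a list of instruction strings.
    """
    groups, current = [], []
    for bundle in bundles:
        for text in bundle:
            has_stop = text.rstrip().endswith(";;")
            current.append(text.replace(";;", "").strip())
            if has_stop:
                groups.append(current)
                current = []
    if current:  # instructions after the last stop form a final group
        groups.append(current)
    return groups


# The three bundles from the slide, templates omitted.
EXAMPLE = [
    ["add r1 = r2, r3", "sub r4 = r5, r6 ;;", "shr r7 = r8, r9"],
    ["ld4 r14=[r56]", "fadd f10=f12,f13", "add r16=r18,r19"],
    ["st4 [r16]=r67 ;;", "add r24=r56,r57", "add r28=r58,r59"],
]

for number, group in enumerate(instruction_groups(EXAMPLE), start=1):
    print(number, group)
```

For the slide's code this yields three groups, and the middle one spans all three bundles.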

  8. Save me, compiler! • The instruction set and pipeline are so difficult to handle that you won’t do much better than the compiler • With the EPIC architecture, more resource management is put on the compiler, which means extra work for human compilers • The most efficient DSP algorithms tend to come from human compilers • It is difficult to use all of the system resources the way a hand-made DSP algorithm would • What’s wrong with r1 = r2 + r3?

  9. DSP Relation • How does the instruction set compare to a DSP processor? • RISC-type instruction set • For example, no mem-to-mem move • Itanium 2 could easily run a DSP algorithm efficiently • The Itanium 2 basically includes every trick in the book thus far, which includes borrowing ideas from DSP

  10. Pro-DSP • Many single-cycle instructions • Instructions are designed for a heavily pipelined environment • Processor can access data in a SIMD fashion (8x8-bit, 4x16-bit, 2x32-bit, 1x64-bit) • High-precision registers (82-bit floating-point accumulator) • People wonder whether 64-bit processing is necessary; well, THIS is where it’s necessary • Large number of registers for fast access
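The SIMD lane splits listed on this slide can be illustrated with a packed add on a 64-bit word, where each lane wraps on its own and no carry crosses a lane boundary. This is a software sketch of the idea only; the function name `packed_add` is hypothetical and does not correspond to a particular Itanium 2 instruction.

```python
def packed_add(a, b, lane_bits):
    """Add two 64-bit words lane by lane, wrapping inside each lane."""
    if lane_bits not in (8, 16, 32, 64):
        raise ValueError("lane width must be 8, 16, 32 or 64 bits")
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        # Masking after the add discards any carry out of this lane.
        lane = ((a >> shift) + (b >> shift)) & lane_mask
        result |= lane << shift
    return result


a = 0x0001_FFFF_0003_0004
b = 0x0001_0001_0001_0001

# 4x16-bit: the 0xFFFF lane wraps to 0 without disturbing its neighbour.
print(hex(packed_add(a, b, 16)))  # 0x2000000040005
# 1x64-bit: the same carry now propagates into the top 16 bits.
print(hex(packed_add(a, b, 64)))  # 0x3000000040005
```

One 64-bit operation therefore does the work of four 16-bit adds, which is the attraction for sample-by-sample DSP arithmetic.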

  11. Anti-DSP • No hardware loops • No hardware circular buffers • Only a single bus (although a fast one: 6.4 GB/s) • High power usage
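Without hardware loops or hardware circular buffers, the loop counter and the index wrap-around that a DSP performs for free have to be written as ordinary instructions. A minimal sketch of an FIR filter with a software circular buffer shows the extra bookkeeping; the class name `CircularFIR` and the tap values are illustrative assumptions, not something taken from the slides.

```python
class CircularFIR:
    """FIR filter whose delay line is a software circular buffer."""

    def __init__(self, taps):
        self.taps = list(taps)
        self.buf = [0.0] * len(self.taps)  # delay line
        self.pos = 0                       # where the next sample is written

    def step(self, sample):
        """Push one input sample and return one output sample."""
        n = len(self.buf)
        self.buf[self.pos] = sample
        acc = 0.0
        idx = self.pos
        for coeff in self.taps:      # explicit loop: no zero-overhead hardware loop
            acc += coeff * self.buf[idx]
            idx = (idx - 1) % n      # explicit wrap: no hardware circular addressing
        self.pos = (self.pos + 1) % n
        return acc


# Feeding an impulse should return the taps themselves, then zeros.
fir = CircularFIR([1.0, 2.0, 3.0])
print([fir.step(x) for x in [1.0, 0.0, 0.0, 0.0]])  # [1.0, 2.0, 3.0, 0.0]
```

Each modulo operation and loop branch here is work that a dedicated DSP hides in its address generators and loop hardware.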

  12. TigerSHARC vs. Itanium 2 • COST! ($0.3k vs. $3k) • Both heavily pipelined • Both very hard to code by hand • There really is no comparison • The processors were made for two different purposes • The framework that is typically built around the chips makes it even harder to compare

  13. Conclusion • You get what you pay for… or maybe a little less • The Itanium 2 is considered to be a high-end server processor • Anything high-end tends to be very overpriced (rack-mount equipment) • Sure, it’s a DSP processor, but for that price it should make you toast in the morning too

  14. References • Intel Itanium 2 Processor Hardware Developer’s Manual • Intel Itanium 2 Processor Reference Manual • Stefan Rusu, Jason Stinson, Simon Tam, Justin Leung, Harry Muljono, and Brian Cherkauer, “A 1.5-GHz 130-nm Itanium 2 Processor With 6MB On-die L3 Cache,” IEEE Journal of Solid-State Circuits, vol. 38, no. 11, November 2003.
