Enhancing the MOVE framework
Endianness port | Long Immediates
Master’s thesis | Ivo Janssen | May 11, 2001

introduction: overview
• introduction
• endianness
• long immediates
• conclusions
Enhancing the MOVE framework - Ivo Janssen - May 11, 2001

introduction: introduction
• project motivation
• overview of the Move project

introduction: motivation
• Laboratory of Computer Engineering
• NEC C&CRL, Princeton, NJ, USA
• PcomP / packet processor
• endianness
• immediates
• linux/x86 machines
  • cheap
  • little endian

introduction: the move framework
[Figure: framework overview. Labels: application (C/C++), machine description, technology description, software framework, hardware framework, cycle count, cost/performance, explorer, modify configuration.]

introduction: the move framework

C++:
    c = a+b;
    c = c<<4;
    d = func100(c);

Pascal:
    begin
      c := a+b;
      c := c*16;
      d := func100(c)
    end;

RISC                TTA
add r3,r8,r9        r8 -> add_o
                    r9 -> add_t
                    add_r -> r3
shl r3,r3,4         r3 -> shl_o
                    4 -> shl_t
                    shl_r -> r3
jump 100            100 -> jump_t

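The RISC-to-TTA column on this slide follows a regular pattern: the two sources go to the operand (`_o`) and trigger (`_t`) sockets of the function unit, and the result socket (`_r`) is moved to the destination. A minimal Python sketch of that pattern is shown here; the function name `risc_to_tta` and the string-based instruction format are assumptions made for illustration and are not part of the Move tools.

```python
def risc_to_tta(instr):
    """Translate one RISC-style instruction string into a list of TTA moves.

    Hypothetical helper: it only covers the two shapes shown on the slide,
    a three-operand operation and a single-operand jump.
    """
    mnemonic, _, operand_text = instr.strip().partition(" ")
    operands = [op.strip() for op in operand_text.split(",")]

    if mnemonic == "jump":
        # A jump is a single move: the target is written to the trigger socket.
        (target,) = operands
        return [f"{target} -> jump_t"]

    # Three-operand form: destination first, then the two sources.
    dest, src1, src2 = operands
    return [
        f"{src1} -> {mnemonic}_o",   # first source to the operand socket
        f"{src2} -> {mnemonic}_t",   # second source to the trigger socket
        f"{mnemonic}_r -> {dest}",   # result socket to the destination register
    ]


for line in ["add r3,r8,r9", "shl r3,r3,4", "jump 100"]:
    print(line, "=>", "; ".join(risc_to_tta(line)))
```

Each RISC instruction becomes up to three independent moves, which is what lets a scheduler place them on different buses and in different cycles.
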
introduction: the move framework
[Figure: schedule table that places the moves r8 -> add_o, r9 -> add_t, add_r -> r3, r3 -> shl_o, 4 -> shl_t, shl_r -> r3 and 100 -> jump_t on move buses 0-3 over cycles 0-4.]

endianness: endianness
• what is endianness
• endianness in the Move framework

endianness: what is endianness
• Gulliver’s travels
• byte ordering
• 32 bit architecture
• ‘byte addressable’

endianness: what is endianness
• little endian (x86, PDP-11, Alpha)
  least significant byte is stored at the lowest address

    memory address   00 01 02 03
    contents         11 22 33 44   = 0x44332211

• big endian (Sparc, HPPA, m68k)
  most significant byte is stored at the lowest address

    memory address   00 01 02 03
    contents         44 33 22 11   = 0x44332211

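The two memory layouts can be reproduced with Python's built-in `int.to_bytes`. This is only an illustration of the byte ordering; the helper name `dump` is an assumption introduced here.

```python
value = 0x44332211

# Byte at index 0 corresponds to memory address 00, index 3 to address 03.
little = value.to_bytes(4, byteorder="little")
big = value.to_bytes(4, byteorder="big")


def dump(layout):
    """Render a byte string as space-separated hex pairs, lowest address first."""
    return " ".join(f"{b:02x}" for b in layout)


print("little endian:", dump(little))
print("big endian:   ", dump(big))

# Reading the bytes back with the matching order recovers the same word.
assert int.from_bytes(little, byteorder="little") == value
assert int.from_bytes(big, byteorder="big") == value
```
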
endianness: changing endianness
• ‘byte swap’

    memory address   00 01 02 03
    before           11 22 33 44
    after            44 33 22 11

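A 32-bit byte swap can be written with shifts and masks. The sketch below is a generic illustration in Python, not code taken from the Move framework; `swap32` is a hypothetical name.

```python
def swap32(word):
    """Reverse the byte order of a 32-bit unsigned word."""
    return (
        ((word & 0x000000FF) << 24)   # byte 0 moves to byte 3
        | ((word & 0x0000FF00) << 8)  # byte 1 moves to byte 2
        | ((word & 0x00FF0000) >> 8)  # byte 2 moves to byte 1
        | ((word & 0xFF000000) >> 24) # byte 3 moves to byte 0
    )


print(hex(swap32(0x44332211)))
```

Swapping twice returns the original word, so the same routine converts in both directions.
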
endianness: host and target endianness
• host endianness
  • file on disk always has the same endianness
  • ‘swap’ if host != file
• target endianness
  • file on disk is e.g. a binary with a certain endianness
  • ‘swap’ if host != target

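The "swap if host != file" rule can be sketched as follows. The function names `needs_swap` and `load_word`, and the use of the strings "little" and "big" to name byte orders, are assumptions for illustration; the same comparison applies when the data order is a target endianness instead of a file format.

```python
import sys


def needs_swap(data_order, host_order=sys.byteorder):
    """Return True when data stored in data_order must be swapped on this host."""
    return data_order != host_order


def load_word(raw, data_order, host_order=sys.byteorder):
    """Read 4 raw bytes the way the host would, then swap only if required."""
    word = int.from_bytes(raw, byteorder=host_order)  # naive host-order read
    if needs_swap(data_order, host_order):
        # Byte swap: reinterpret the same four bytes in the opposite order.
        word = int.from_bytes(word.to_bytes(4, "big"), "little")
    return word


raw = bytes([0x44, 0x33, 0x22, 0x11])  # 0x44332211 stored big endian
print(hex(load_word(raw, "big", host_order="little")))
print(hex(load_word(raw, "big", host_order="big")))
```

Both calls yield the same value, which is the point of the rule: the result must not depend on the machine the tools run on.
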
endianness: endianness in the Move framework
• apply these principles to the Move framework
• host endianness
  • the Move framework has to run on both little-endian (x86) and big-endian (sparc) hosts
• target endianness
  • a host should be able to run both ‘big move’ and ‘little move’

endianness: endianness in the front end
[Figure: front-end tool flow. Labels: C/C++, gcc-move, assembler, move libraries, .o, linker, seq. TTA, bintools.]

endianness: endianness in the back end
[Figure: back-end tool flow. Labels: seq. TTA, s. simulator, profiling, profile, machine, parameters, scheduler, par. TTA, p. simulator, verification, .txt.]

immediates: immediates
• what are immediates
• existing implementation
• requirements
• possible solutions
  • ‘resource variant’
  • ‘pseudo-move variant’
• conclusions
• future work

immediates: what are immediates
• immediates take lots of bits
• more than available space?

example: dy=y-1993;
    1993 -> sub_o
    y -> sub_t
    sub_r -> dy

move slot for 1993 -> sub_o:
    fields         guard | source | destination
    width (bits)     2   |   8    |     6
    contents             |  1993  |   sub_o

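Whether a constant such as 1993 fits in the source field is a simple range check. The sketch below assumes the 8-bit source field width shown on the slide and assumes the whole field is available for a two's-complement value; a real encoding would also reserve source codes for registers and sockets. `fits_in_field` is a hypothetical helper name.

```python
def fits_in_field(value, bits, signed=True):
    """Return True when value is representable in a field of the given width."""
    if signed:
        return -(1 << (bits - 1)) <= value < (1 << (bits - 1))
    return 0 <= value < (1 << bits)


print(fits_in_field(1993, 8))    # 8-bit source field from the slide
print(fits_in_field(1993, 20))   # a 20-bit immediate register
print((1993).bit_length())       # magnitude bits needed, excluding the sign bit
```
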
immediates: existing implementation
• fixed immediate fields
• always writes to ‘immediate register’
[Figure: instruction word with three move slots and an immediate field. The immediate field holds 1993 (definition); a move slot with guard, source i0 and destination sub_o reads it (use).]

immediates: requirements
• possibility of no dedicated fields
• short immediates stay in source field
• long immediate bits in instruction stream
• add state between ‘definition’ and ‘use’
• clean code interface
• must be applicable to PcomP

immediates: possible solutions
• make move slot wider
• use multiple ‘short immediates’ to construct a large one
• schedule immediate fields in the move slots
[Figure: instruction word with an immediate field (1993) and three move slots; one move slot holds guard, i0, sub_o.]

immediates: ‘resource variant’
[Figure: resource table with columns LIT and i0 along a time axis. i0 is free, becomes busy from the cycle that carries the immediate bits until the move i0 -> r4, and is free again afterwards.]

immediates: ‘resource variant’
• decoupling of def/use
• no dedicated fields required
• LIT tag
• immediates not part of movelist
• Ifetch unit stores bits in immediate registers
• immediate registers become part of state

immediates: scheduling algorithms
• mach-file format
• data structures
• algorithms

immediates: mach file format

    LongImmediate {
      Registers:
        i0 20, signed, ir_0;
        i1 20, signed, ir_1;
        i2 32, signed, ir_2;
      Control:
        {};
        i0 20 : {4};
        i1 20 : {5};
        i0 20 : {4}, i1 20: {5}, i2 32: {4,5};
    }

    ImmediateUnits {
      i0 32, signed, ir_1;
      i1 20, signed, ir_2;
      i2 20, signed, ir_3;
    }

immediates: scheduling algorithms

    FindImmMoveBus
      if (immediate fits in source field)
        return success
      else
        forall (iregs) do
          assign ireg socket to source
          check resources on ireg
          if (possible allocation imm-use found)
            tentatively claim imm-use
            for (this cycle downto zero) do
              check if ireg is available
              check if LIT encoding is possible
              tentatively assign LIT tag
              if (movebuses allocatable) then break
            commit imm-def and imm-use
            return success

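The search in FindImmMoveBus can be approximated with a toy resource table. The sketch below is a heavily simplified model and not the scheduler's actual code: the `ResourceTable` class, its fields, the limit of `lit_slots` definitions per cycle and the signed 8-bit source field are all assumptions, and the move-bus allocation check from the pseudocode is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceTable:
    source_bits: int   # assumed width of the short-immediate source field
    iregs: list        # immediate register names, e.g. ["i0"]
    lit_slots: int     # assumed long-immediate definitions allowed per cycle
    ireg_busy: dict = field(default_factory=dict)  # ireg -> set of busy cycles
    lit_used: dict = field(default_factory=dict)   # cycle -> LIT slots claimed


def fits_source_field(value, bits):
    """Signed range check for the short-immediate case."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def find_imm_move_bus(table, value, use_cycle):
    """Return ("short", None, None), ("long", ireg, def_cycle), or None."""
    if fits_source_field(value, table.source_bits):
        return ("short", None, None)
    for ireg in table.iregs:
        busy = table.ireg_busy.setdefault(ireg, set())
        if use_cycle in busy:
            continue  # register already holds another value at the use
        # Walk backwards from the use cycle, as in "this cycle downto zero".
        for def_cycle in range(use_cycle, -1, -1):
            if def_cycle in busy:
                break  # an earlier definition would clobber a live value
            if table.lit_used.get(def_cycle, 0) < table.lit_slots:
                # Commit imm-def and imm-use: the register is busy in between.
                busy.update(range(def_cycle, use_cycle + 1))
                table.lit_used[def_cycle] = table.lit_used.get(def_cycle, 0) + 1
                return ("long", ireg, def_cycle)
    return None


# Cycle 5 has no free LIT slot, so the definition is pushed back to cycle 4.
table = ResourceTable(source_bits=8, iregs=["i0"], lit_slots=1, lit_used={5: 1})
print(find_imm_move_bus(table, 5, 3))
print(find_imm_move_bus(table, 1993, 5))
```

The backward walk is what decouples definition and use: the immediate bits can sit in any earlier instruction that has room, at the cost of keeping the immediate register occupied in between.
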
immediates: benchmarks
• various mach-files:
  • mach.pcomp
    • 6 buses, 3 imm. reg., 2 imm. ‘slots’
  • mach.one
    • 6 buses, 1 imm. reg., 1 imm. ‘slot’
  • mach.small
    • 3 buses, 1 imm. reg., 1 imm. ‘slot’
  • mach.big
    • 8 buses, 2 imm. reg., 2 imm. ‘slots’
• no dedicated fields

immediates: benchmarks
• various benchmarks:
  • dsp-suite (arfreq, music, radproc, edge, expand, flatten, smooth)
  • g722main
  • cjpeg, djpeg
  • go
  • compress
  • m88ksim

immediates: benchmarks
• metric under test:
  • instruction counts
  • also derived: code size
• prediction:
  • slight increase in instruction count
  • if dedicated fields go, huge reduction in code size

immediates: benchmark results
• instruction count increases
  • especially for smaller machines
  • 1-2% average increase (6% for small machines)
• code size decreases
  • dedicated fields are ~20% of instruction word width
  • code size decrease can be near 20% if dedicated fields go

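The "near 20%" figure follows from a rough estimate, sketched below in Python. It assumes that code size is instruction count times instruction word width and that the full width of the dedicated fields is recovered; both are simplifications, and `relative_code_size` is a hypothetical helper.

```python
def relative_code_size(instr_count_increase, dedicated_field_fraction):
    """Code size relative to the baseline, under the stated assumptions."""
    return (1.0 + instr_count_increase) * (1.0 - dedicated_field_fraction)


for label, increase in [("average, 1%", 0.01), ("average, 2%", 0.02), ("small machines, 6%", 0.06)]:
    ratio = relative_code_size(increase, 0.20)
    print(f"{label}: {ratio:.3f} of baseline, {100 * (1 - ratio):.1f}% smaller")
```

With a 1-2% instruction count increase the estimate stays close to the 20% bound, while the 6% increase on small machines eats into a larger share of the saving.
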
immediates: ‘pseudo-move variant’
• TNO-FEL needed implementation too
• paradigm shift:
  • ‘resource variant’: clean code interface
  • resulting in the ‘pseudo-move variant’

immediates: ‘pseudo-move variant’
[Figure: the move 1993 -> sub_o is replaced by an immediate operation (1993 -> i0, i0 -> r33) and the move r33 -> sub_o, linked by dflw(r33).]

immediates: ‘pseudo-move variant’
• split the immediate move into two operations
• schedule the immediate operation (def and use) as normal moves
• count on bypass of the virtual register as an optimization

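The split can be sketched on move strings, using the names from the slide's example (1993, i0, r33, sub_o). The function `split_immediate_move` and the dictionary layout of its result are assumptions for illustration, not the scheduler's data structures.

```python
def split_immediate_move(move, ireg, vreg):
    """Split 'imm -> dest' into an immediate operation and an ordinary move.

    ireg is the immediate register, vreg the virtual register that carries
    the value from the immediate operation to the original destination.
    """
    source, _, destination = (part.strip() for part in move.partition("->"))
    return {
        # Definition and use of the immediate register, scheduled as moves.
        "immediate_operation": [f"{source} -> {ireg}", f"{ireg} -> {vreg}"],
        # The original move, now reading the virtual register.
        "use": f"{vreg} -> {destination}",
    }


print(split_immediate_move("1993 -> sub_o", ireg="i0", vreg="r33"))
```

If the virtual register is bypassed, the pair `i0 -> r33` and `r33 -> sub_o` can collapse into a single move, which is the optimization the last bullet counts on.
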
immediates: qualitative comparison
‘resource variant’
• one ‘move’ less added
• more flexible encoding
• clean code interface
‘pseudo-move variant’
• importing is possible
• ‘real’ moves can be scheduled efficiently

immediates: quantitative comparison
• two completely different schedulers
• compare relative cycle counts, not absolute
• cycle count increase is about the same for both
• small machines: ‘real’ move -> better schedule

immediates: future work
• exploration
• importing of immediate writes
• sharing of immediate writes

immediates: region scheduling immediate writes
[Figure: regions labelled A, B and C; immediate bits appear in two places ahead of the move i0 -> r4.]

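immediates: sharing immediate writes
[Figure: resource table with columns LIT and i0 along a time axis. One set of immediate bits serves two uses, i0 -> sub_o and i0 -> ld_t; the i0 column shows a count that goes from 0 to 2 at the immediate bits and drops back to 0 after the uses.]
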
conclusions: endianness
• completed
• host-dependency:
  • sources compile on both platforms
• target-dependency:
  • one Makefile switch controls all tools

conclusions: immediates
• small (negligible) instruction count increase
• possibly large decrease in code size
• clean code interface not entirely achieved

drinks: Ricardishof, 21:00