The Core SVP compiler

The Core SVP compiler. Thomas A.M. Bernard (t.bernard@science.uva.nl), University of Amsterdam, September 3rd, 2008

What is the SVP compiler?
How to program multi-core processors efficiently?
(Diagram labels: Applications, Core SVP Compiler, Architecture)
• Black box…

The SVP tool chain - DRISC

µTC
• µTC is a C-based language that captures the concurrency creation and management of SVP:
  - The constructs reflect machine instructions added to DRISC* cores in order to support SVP+ at the ISA level.
  - We use µTC as an intermediate language (a compiler target).
• It is also useful in understanding the model from a programming perspective.

DRISC ISA
DRISC = RISC instruction set + µT instructions
New instructions to handle the concepts of the model:
  cre – create a family of microthreads
  brk – break from a family
  sqz – squeeze: pre-empts a family
  kill – kill the family from outside
  end – end: finishes the code of a microthread
  swch – context switch
Pseudo-instructions encoded by control stream

µTC keywords
  Function specifiers: thread, squeezable
  Types: index, family, place
  Constructs: create, sync, kill, squeeze, break
  Type qualifiers: shared

Implementation of µTC
• Based on the GCC compiler framework.
• It parses µTC and generates assembler for DRISC processors, currently based on the Alpha ISA.
• The GCC compiler is retargetable, with a back-end generator that produces compilers for arbitrary processors.

GCC 4.1 Compiler Structure
Stages: Front-end, Middle-end, Back-end
  C (input): Lexical/Syntax; Type checking; Semantic
  Machine description: Rules of ASM; Memory def.
  GIMPLE, GIMPLE-SSA: IR Generation (tree based); SSA Generation; Optimizations (target independent)
  RTL: IR Generation; RTL Generation (list of objects); Low-level optimizations
  ASM: Code Generation; Register allocator; Instruction Sched.

Design of µT Compiler
Stages: Front-end, Middle-end, Back-end
  µTC (input), new keywords: Lexical extended; Syntax definitions; Semantics rules
  Machine description: New rules (using new ISA-µT); New memory features (class of registers)
  GIMPLE-SSA: New nodes in the tree.
  GIMPLE: Optimizations upgraded upon the new information (explicit concurrency)
  RTL: New nodes in the tree. New objects
  ASM-µT: Code generator extended with the new rules; Register allocator updated
We keep the concurrency captured in the µTC language all along the compilation stages.

µTC Compilation
    thread void foo (int a, float b)
    {
        int local_a = a;
        float local_b = b;
        …
    }
    thread void main()
    {
        int a_g = 0;
        float b_g = 5.6;
        family fid;
        …
        create(fid;;0;9;1;;) foo (a_g,b_g);
        sync(fid);
        …
    }
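The create/sync pair can be read as: start a family of microthreads running foo, one per index value, then wait for the whole family to finish. Below is a minimal Python sketch of that reading. It assumes the three numeric fields of create are start, limit and step (matching the setstart, setlimit and setstep instructions in the generated assembly) and that the limit is inclusive. The helpers `family_indices`, `create` and `sync` are hypothetical names, ordinary OS threads stand in for microthreads, and the index is passed as an explicit argument for simplicity.

```python
import threading


def family_indices(start, limit, step):
    """Index values of a family; the limit is treated as inclusive (assumption)."""
    if step <= 0:
        raise ValueError("this sketch only handles a positive step")
    return list(range(start, limit + 1, step))


def create(thread_fn, start, limit, step, *args):
    """Start one worker per index value and return a handle for the family."""
    results = {}
    lock = threading.Lock()

    def run(index):
        value = thread_fn(index, *args)
        with lock:
            results[index] = value

    threads = [threading.Thread(target=run, args=(i,))
               for i in family_indices(start, limit, step)]
    for t in threads:
        t.start()
    return threads, results


def sync(family):
    """Wait until every thread of the family has finished."""
    threads, results = family
    for t in threads:
        t.join()
    return results


def foo(index, a, b):
    # Mirrors the body of the µTC thread function: copy the arguments to locals.
    local_a = a
    local_b = b
    return (index, local_a, local_b)


fid = create(foo, 0, 9, 1, 0, 5.6)   # create(fid;;0;9;1;;) foo(a_g, b_g)
done = sync(fid)                     # sync(fid)
```

Under these assumptions the family has ten members, with index values 0 to 9.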

µT-ASM
        .align 2
        .globl main
        .ent main
    $main..ng:
    main:
        .registers 0 0 0 0 0 0      #GI,SI,LI,GF,SF,LF
        .prologue 0
        stl $31,24($15)
        ldah $1,$LC0($29)           !gprelhigh
        lds $f10,$LC0($1)           !gprellow
        sts $f10,20($15)
        ldq $1,32($15)
        allocate $1                 #TCB_INSTRUCTIONS
        setstart $1,0               #TCB_INSTRUCTIONS
        swch                        #CONTEXT SWITCHING
        setlimit $1,9               #TCB_INSTRUCTIONS
        setstep $1,1                #TCB_INSTRUCTIONS
        ldq $1,32($15)
        cred $1,foo                 #CREATE_D
        ldq $1,32($15)
        bis $1,$31,$31              #SYNC
        addl $31,$1,$1
        stl $1,16($15)
        end                         #END_THREAD
        .end main
        .ident "--University of Amsterdam, Extended by CSA group-- GCC: (GNU) 4.1.0"
        .section .note.GNU-stack,"",@progbits
        .set noreorder
        .set volatile
        .set noat
        .set nomacro
        .arch ev4
        .text
        .align 2
        .globl foo
        .ent foo
    $foo..ng:
    foo:
        .registers 0 0 0 0 0 0      #GI,SI,LI,GF,SF,LF
        .prologue 0
        mov $16,$1
        sts $f17,36($15)
        stl $1,32($15)
        ldl $1,32($15)
        stl $1,20($15)
        lds $f10,36($15)
        sts $f10,16($15)
        end                         #END_THREAD
        .end foo
        .section .rodata
        .align 2

GCC registers requirement analysis
[Figure: virtual register windows (integer and FP register windows, numbered 0 to 32) for the thread functions foo and main. Recoverable labels for foo: $0 = $G0 (a), $f0 = $GF0 (b), $1 = $L0 (local_a), $f1 = $LF0 (local_b). Variables shown for main: a_g, fid, b_g. Region labels: Globals, Shared, Dependents, Locals.]
Reference: T.A.M. Bernard, C.R. Jesshope, Compile-time register requirements analysis for μTC using the GCC framework, Submitted to HiPEAC 2009.
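The analysis can be pictured as counting, per thread function, how many registers of each class are needed. Below is a minimal sketch of that counting. It assumes the six counts follow the order given in the `.registers` comment of the generated assembly (GI, SI, LI, GF, SF, LF), that non-shared parameters are globals, and that all other variables are locals. `Var`, `reg_class`, `count_classes` and the classification rule are hypothetical illustrations, not the GCC implementation.

```python
from collections import Counter
from dataclasses import dataclass

# Order taken from the "#GI,SI,LI,GF,SF,LF" comment in the assembly listing.
ORDER = ["GI", "SI", "LI", "GF", "SF", "LF"]


@dataclass(frozen=True)
class Var:
    name: str
    is_float: bool
    is_param: bool = False
    is_shared: bool = False


def reg_class(var):
    """Classify a variable (assumed rule: shared -> S, parameter -> G, else L)."""
    if var.is_shared:
        kind = "S"
    elif var.is_param:
        kind = "G"
    else:
        kind = "L"
    return kind + ("F" if var.is_float else "I")


def count_classes(variables):
    """Number of registers needed per class, in ORDER."""
    counts = Counter(reg_class(v) for v in variables)
    return [counts[c] for c in ORDER]


def registers_directive(variables):
    """Format the counts like the .registers line of the listing."""
    return ".registers " + " ".join(str(n) for n in count_classes(variables))


# Variables of the thread function foo(int a, float b).
foo_vars = [
    Var("a", is_float=False, is_param=True),
    Var("b", is_float=True, is_param=True),
    Var("local_a", is_float=False),
    Var("local_b", is_float=True),
]
directive = registers_directive(foo_vars)
```

For foo this gives one integer global, one integer local, one FP global and one FP local, in line with the $G0, $L0, $GF0 and $LF0 labels on the slide.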

Area of research (1)
• Register allocation:
  - Allocate two sets of 32 registers ($G, $L, $S, $D) per microthread.
  - Assign registers to the right variables, following the explicit declarations made at the µTC level.
  - "Dynamic allocation" (similar to IA-64).
  - Problem: the size of each register class is known only at compile time, and those sizes vary from one family to another.
Reference: T.A.M. Bernard, C.R. Jesshope, and P.M.W. Knijnenburg, Strategies for Compiling μTC to Novel Chip Multiprocessors, S. Vassiliadis et al. (Eds.): SAMOS 2007, LNCS 4599, pp. 127-138, 2007.

Area of research (2)
• Code generation:
  - Schedule/emit the new µT-ISA instructions.
  - Data dependencies: discover the dependencies between instructions in order to add the 'swch' instruction.
  - Investigating register spilling strategies (cf. Rustam's talk).
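One way to picture the dependency analysis is as a pass over the instruction list that finds where a long-latency result is first consumed. The toy sketch below assumes that loads are the only long-latency instructions and that `swch` is placed immediately before the first instruction that reads a loaded register. The tuple encoding `(opcode, dest, sources)`, the function `insert_swch` and the placement policy are illustrative assumptions, not the DRISC rules or the compiler's actual pass.

```python
# Alpha load mnemonics treated as long-latency producers (assumption).
LONG_LATENCY = {"ldl", "ldq", "lds", "ldt"}

SWCH = ("swch", None, ())


def insert_swch(instrs):
    """Return a new instruction list with swch added before the first
    consumer of each outstanding load result."""
    pending = set()   # registers written by loads and not yet consumed
    out = []
    for op, dest, srcs in instrs:
        waiting_on = pending & set(srcs)
        if waiting_on:
            out.append(SWCH)
            pending -= waiting_on       # these results are now waited for
        out.append((op, dest, srcs))
        if dest is not None:
            pending.discard(dest)       # the register is overwritten
            if op in LONG_LATENCY:
                pending.add(dest)       # a new outstanding load result
    return out


program = [
    ("ldl", "$1", ("$15",)),        # load into $1
    ("addl", "$2", ("$3", "$4")),   # independent of the load
    ("stl", None, ("$1", "$15")),   # first use of $1
    ("addl", "$5", ("$2", "$2")),
]
scheduled = insert_swch(program)
opcodes = [op for op, _, _ in scheduled]
```

In this example the independent `addl` stays between the load and its first use, and a single `swch` is inserted before the `stl` that reads `$1`.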

Questions?
