Using a CSP based Programming Model for Reconfigurable Processor Arrays By: Zain-ul-Abdin Zain-ul-Abdin@hh.se
Motivation
• Emergence of new heterogeneous parallel architectures
  • Increased performance
  • Power efficiency
• Traditional methods
  • Automatic parallelization by compilers
  • Use of the thread model of computation
    • Highly non-deterministic
• Use of a concurrent programming model
  • Expresses computations productively by matching them to the target hardware
  • Supported by a compiler to allow portability
"Using a CSP based Programming Model for Reconfigurable Processor Arrays", Zain-ul-Abdin
Array of Processors
• Consists of heterogeneous processors with specialized interconnection networks
• Improved performance by exploiting parallelism rather than scaling clock frequency
• Flexible due to a dynamically reconfigurable interconnection network
• Energy efficient
  • Individual brics can be switched off when not in use
  • The clock frequency of brics can be optimized
Ambric Programming Model
• Design consists of:
  • Objects: define the functionality in either a Java subset or assembly
  • Structured composition, described in aStruct
Ambric: Simple Example Design

Toplevel design

    design SimpleDesigntop {
      Root_IF root_Inst;
    }
    interface Root_IF {}
    binding CompRoot implements Root_IF {
      simpledesign process1;
      Vio inOut = {NumSources = 1, NumSinks = 1};
      channel c0 = {inOut.out[0], process1.in};
      channel c1 = {process1.out, inOut.in[0]};
    }

Object Structure

    interface simpledesign {
      inbound in;
      outbound out;
    }
    binding Javasimpledesign implements simpledesign {
      implementation "simpledesign.java";
    }

Object Implementation

    import ajava.io.InputStream;
    import ajava.io.OutputStream;

    public class simpledesign {
      // Copies every integer from the input channel to the output channel.
      public void run(InputStream<Integer> in, OutputStream<Integer> out) {
        while (true) {
          out.writeInt(in.readInt());
        }
      }
    }
Why use Occam-pi?
• Language-level support for concurrency
• Provides higher-order combinators for facilitating composition of re-targetable data-parallel descriptions
• Semantically transparent PAR/SEQ style
• Explicit control of granularity of parallelism and data locality
Occam-pi Language
• Based on ideas of CSP with pi-calculus
• Abstractions for underlying hardware
  • Processes
  • Channels (unbuffered message passing)
• Rendezvous behavior of channels
  • Receiver blocks until the sender has written the value
  • Sender continues after the receiver has read the value
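The rendezvous behavior described on this slide can be imitated in plain Java. The sketch below is only an analogy, not occam-pi and not the Ambric toolchain: it assumes that `java.util.concurrent.SynchronousQueue` is an acceptable stand-in for an unbuffered channel, and the class name `RendezvousSketch` and its `demo` method are hypothetical names chosen for illustration.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical illustration class; not part of occam-pi or the Ambric tools.
class RendezvousSketch {
    // Sends one value over an unbuffered channel to a deliberately late receiver.
    // Returns {valueReceived, 1 if the send completed only after the receive began, else 0}.
    static long[] demo(int value, long receiverDelayMillis) {
        SynchronousQueue<Integer> channel = new SynchronousQueue<>(); // holds no elements
        long[] putDone = new long[1];
        Thread sender = new Thread(() -> {
            try {
                channel.put(value);             // analogous to "c ! value": waits for a receiver
                putDone[0] = System.nanoTime(); // reached only after the hand-off
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();
        try {
            Thread.sleep(receiverDelayMillis);  // the receiver arrives late on purpose
            long takeStart = System.nanoTime();
            int received = channel.take();      // analogous to "c ? x": waits for a sender
            sender.join();
            long senderWaited = (putDone[0] - takeStart >= 0) ? 1 : 0;
            return new long[] { received, senderWaited };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = demo(117, 100);
        System.out.println("received=" + r[0] + " senderWaited=" + (r[1] == 1));
    }
}
```

The sender cannot get past `put` until the receiver calls `take`, which is the property the slide attributes to occam-pi channels.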
Occam-pi Language

    PROC SimpleEx()
      INT x, y:
      CHAN OF INT c, d:
      PAR
        SEQ
          c ! 117
          d ? x
        SEQ
          c ? y
          d ! 118
    :

• Primitive actions
  • Variable assignment
  • Channel output !
  • Channel input ?
• PAR
• SEQ
• Variables can only be written by one process in parallel
• Likewise, only a single process can read from a channel, and another single process can write to the channel
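For readers more familiar with threads, the SimpleEx process above can be approximated in Java. This is a rough sketch under stated assumptions, not generated compiler output: each SEQ branch becomes a thread, each `CHAN OF INT` becomes a `SynchronousQueue<Integer>`, and the class name `SimpleExSketch` is hypothetical.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical illustration class mirroring the structure of PROC SimpleEx.
class SimpleExSketch {
    // Runs both SEQ branches in parallel and returns {x, y}.
    static int[] run() {
        SynchronousQueue<Integer> c = new SynchronousQueue<>();
        SynchronousQueue<Integer> d = new SynchronousQueue<>();
        int[] result = new int[2]; // result[0] plays x, result[1] plays y

        Thread first = new Thread(() -> {
            try {
                c.put(117);            // c ! 117
                result[0] = d.take();  // d ? x
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread second = new Thread(() -> {
            try {
                result[1] = c.take();  // c ? y
                d.put(118);            // d ! 118
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        first.start();
        second.start();
        try {
            // PAR finishes only when every branch has finished.
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] xy = run();
        System.out.println("x=" + xy[0] + " y=" + xy[1]); // prints "x=118 y=117"
    }
}
```

Each thread writes a different array slot and each queue has exactly one writer and one reader, which mirrors the single-writer and single-reader rules listed on the slide. Java does not enforce these rules; in this sketch they hold only by construction.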
Compilation Methodology
• Implemented a backend for Ambric in Tock (Translator from Occam to C, by Kent)
• Staged compilation
• Native SOPL code generation for Ambric
• Use of concurrency of Occam-pi
• Reduced memory footprint
Occam-Ambric Compilation
Ambric-related Transformations
• Introduction of channel-end specifiers
• Enables use of flat data parallelism
• Replicator transformations:
  • SEQ replicators to for loops
  • PAR replicators unrolled to multiple PROCs
• Emission of aStruct structural interface and binding code for each PROC
• Emission of aJava class code corresponding to each PROC
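The two replicator transformations can be pictured with a small Java analogy. This is a sketch only: the compiler described here unrolls PAR replicators at compile time into separate PROCs, whereas the code below assumes one runtime thread per replicated instance as a stand-in. The names `ReplicatorSketch`, `seq`, `par`, and `squares` are hypothetical.

```java
import java.util.function.IntConsumer;

// Hypothetical illustration of replicator semantics; not output of the Tock backend.
class ReplicatorSketch {
    // "SEQ i = start FOR count": becomes an ordinary for loop.
    static void seq(int start, int count, IntConsumer body) {
        for (int i = start; i < start + count; i++) {
            body.accept(i);
        }
    }

    // "PAR i = start FOR count": one independent instance per index value.
    static void par(int start, int count, IntConsumer body) {
        Thread[] procs = new Thread[count];
        for (int k = 0; k < count; k++) {
            final int i = start + k;              // each instance gets its own index
            procs[k] = new Thread(() -> body.accept(i));
            procs[k].start();
        }
        for (Thread p : procs) {                  // PAR ends when all instances end
            try {
                p.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    // Example body: every instance writes only its own output slot.
    static int[] squares(int n, boolean parallel) {
        int[] out = new int[n];
        IntConsumer body = i -> out[i] = i * i;
        if (parallel) {
            par(0, n, body);
        } else {
            seq(0, n, body);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(squares(4, true)));
    }
}
```

Both variants produce the same result here because the instances share no written state, which is the situation in which unrolling a PAR replicator across processors is straightforward.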
1D-Discrete Cosine Transform
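As a reference point for the transform named on this slide, here is a direct, sequential Java sketch of a one-dimensional DCT. The slide does not say which DCT variant or scaling was used, so this assumes the common unnormalized DCT-II, X[k] = sum over n of x[n] * cos(pi/N * (n + 1/2) * k). It is not the parallel Ambric mapping, and the class name `DctSketch` is hypothetical.

```java
// Hypothetical reference implementation; assumes the unnormalized DCT-II definition.
class DctSketch {
    // Computes X[k] = sum_{i=0}^{N-1} x[i] * cos(pi / N * (i + 0.5) * k) for k = 0..N-1.
    static double[] dct(double[] x) {
        int n = x.length;
        double[] out = new double[n];
        for (int k = 0; k < n; k++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += x[i] * Math.cos(Math.PI / n * (i + 0.5) * k);
            }
            out[k] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        // A constant 8-point input has all its energy in coefficient 0.
        double[] input = {1, 1, 1, 1, 1, 1, 1, 1};
        double[] coeffs = dct(input);
        for (int k = 0; k < coeffs.length; k++) {
            System.out.printf("X[%d] = %.6f%n", k, coeffs[k]);
        }
    }
}
```

Each output coefficient depends only on the input vector and its own index k, so the outer loop is a natural candidate for the kind of data-parallel decomposition the deck discusses.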
Performance Results
• 8-point DCT implementations
Conclusions
• Proposed the use of Occam-pi for programming a coarse-grained processor architecture
• Raises the abstraction level without compromising efficiency
• Future work: extend the compiler to support the mobility features of Occam-pi for reconfigurable logic