The StreamIt Compiler: Targeting Raw
Michael Gordon, Bill Thies, Michal Karczmarek, Saman Amarasinghe
StreamIt Overview
• Filter is the basic unit of computation
• Filters communicate with neighboring blocks using typed FIFO channels
• The channels support three operations:
  • pop(): remove item from end of input channel
  • peek(i): get value i spaces from end of input channel
  • push(val): push value onto output channel
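The three channel operations can be illustrated with a minimal sketch. This is not the StreamIt library's Channel class; the class name FifoChannelSketch and its size() helper are hypothetical, and the channel is fixed to float items for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a typed FIFO channel; not the StreamIt API.
class FifoChannelSketch {
    // Index 0 is the consumer end (the oldest item still in the channel).
    private final List<Float> items = new ArrayList<>();

    // push(val): append a value at the producer end.
    void push(float val) {
        items.add(val);
    }

    // pop(): remove and return the item at the consumer end.
    float pop() {
        if (items.isEmpty()) {
            throw new IllegalStateException("pop on empty channel");
        }
        return items.remove(0);
    }

    // peek(i): read the item i positions from the consumer end, leaving it in place.
    float peek(int i) {
        if (i < 0 || i >= items.size()) {
            throw new IllegalStateException("peek beyond available items");
        }
        return items.get(i);
    }

    // Number of items currently buffered (helper for this sketch only).
    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        FifoChannelSketch ch = new FifoChannelSketch();
        ch.push(1.0f);
        ch.push(2.0f);
        ch.push(3.0f);
        System.out.println(ch.peek(1)); // second-oldest item, still in the channel
        System.out.println(ch.pop());   // oldest item, now removed
        System.out.println(ch.size());
    }
}
```

Note that peek(0) returns the same value the next pop() would return, which is what lets a filter inspect a window of input without consuming it.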
Filter
• Each filter contains:
  • An init(…) function, which is called at initialization time
  • A work() function that describes the execution of the filter in the steady state
  • Other helper functions called by init() or work()
  • Variables that persist across executions of the work() function
Composing Filters
[Figure: unstructured vs. structured stream graphs]
• Pipeline
• Split/Join
• Feedback
Pipeline
• Sequence of streams
• Each stream can be Filter, Pipeline, SplitJoin, or FeedbackLoop
[Figure: Stream 1, Stream 2, …, Stream N connected in sequence]
SplitJoin
• Independent parallel streams
[Figure: Splitter feeding Stream 1, Stream 2, …, Stream N, which feed a Joiner]
• Splitter and Joiner types are pre-defined:
  • Duplicate (Splitter): send item to all streams
  • Weighted RoundRobin: route in pattern
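The pre-defined splitter and joiner types can be sketched over plain lists. This is an illustration only: the class SplitJoinSketch and its method names are hypothetical, and the sketch assumes every weight is positive and that items are integers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitter and joiner routing; not the StreamIt API.
class SplitJoinSketch {
    // Duplicate splitter: every branch receives a copy of every input item.
    static List<List<Integer>> duplicate(List<Integer> input, int branches) {
        List<List<Integer>> out = new ArrayList<>();
        for (int b = 0; b < branches; b++) {
            out.add(new ArrayList<>(input));
        }
        return out;
    }

    // Assumption of this sketch: all weights must be positive.
    private static void checkWeights(int[] weights) {
        for (int w : weights) {
            if (w <= 0) throw new IllegalArgumentException("weights must be positive");
        }
    }

    // Weighted round-robin splitter: branch b receives weights[b] consecutive
    // items, then the next branch takes its turn, repeating until input runs out.
    static List<List<Integer>> roundRobinSplit(List<Integer> input, int[] weights) {
        checkWeights(weights);
        List<List<Integer>> out = new ArrayList<>();
        for (int b = 0; b < weights.length; b++) {
            out.add(new ArrayList<>());
        }
        int pos = 0;
        while (pos < input.size()) {
            for (int b = 0; b < weights.length; b++) {
                for (int k = 0; k < weights[b] && pos < input.size(); k++) {
                    out.get(b).add(input.get(pos++));
                }
            }
        }
        return out;
    }

    // Weighted round-robin joiner: takes weights[b] items from branch b in turn.
    // Stops as soon as a complete round is no longer available.
    static List<Integer> roundRobinJoin(List<List<Integer>> branches, int[] weights) {
        checkWeights(weights);
        List<Integer> out = new ArrayList<>();
        int[] next = new int[weights.length];
        while (true) {
            for (int b = 0; b < weights.length; b++) {
                if (next[b] + weights[b] > branches.get(b).size()) {
                    return out;
                }
            }
            for (int b = 0; b < weights.length; b++) {
                for (int k = 0; k < weights[b]; k++) {
                    out.add(branches.get(b).get(next[b]++));
                }
            }
        }
    }
}
```

Splitting a stream with a set of weights and joining it again with the same weights reproduces the original order, which is a quick way to check a routing pattern.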
Feedback Loop
• For introducing cycles
[Figure: Joiner, Body Stream, Splitter, and Loop Stream forming a cycle, with a queue of initial inputs]
• Splitters and Joiners are the same as in SplitJoin
Other StreamIt Features (not implemented)
• Messaging
  • Dynamic, low-volume messages sent from within a work() function
  • Message timing that allows a filter to specify when a message will be received
  • Broadcast support
• Re-Initialization
  • Allows the stream graph to be modified at runtime
  • Achieved through init() calls
Code Example - FM Radio

    class Adder extends Filter {
        int N;
        void init (int N) {
            this.N = N;
            input = new Channel(Float.TYPE, N);
            output = new Channel(Float.TYPE, 1);
        }
        void work() {
            float sum = 0;
            for (int i=0; i<N; i++) {
                sum += input.popFloat();
            }
            output.pushFloat(sum);
        }
    }

    public class Equalizer extends Pipeline {
        void init(float samplingRate, int N) {
            add(new SplitJoin() {
                void init() {
                    int bottom = 2500;
                    int top = 5000;
                    setSplitter(DUPLICATE());
                    for (int i=0; i<N; i++, bottom*=2, top*=2) {
                        add(new BandPassFilter(samplingRate, bottom, top));
                    }
                    setJoiner(ROUND_ROBIN());
                }
            });
            add(new Adder(N));
        }
    }

    class FMRadio extends StreamIt {
        void init() {
            add(new DataSource());
            add(new LowPassFilter(samplingRate, cutoffFrequency, numTaps));
            add(new FMDemodulator(samplingRate, maxAmplitude, bandwidth));
            add(new Equalizer(samplingRate, 4));
            add(new Speaker());
        }
    }
The Current Version of StreamIt
• Currently the StreamIt compiler supports only static rates.
  • The number of items peeked, popped, and pushed by each filter remains constant over the life of the filter
• Channels can carry only scalar data types
Compiler Flow
[Figure: StreamIt Code, KOPI front-end, Parse tree, Conversion to StreamIt IR, SIR (parameterized), Graph Expansion, SIR (expanded); the flow then divides into UNIPROCESSOR and RAW paths, with stages labeled Fusion/Fission, Raw Backend, Conversion to Low IR, and Conversion to C]
Fission / Fusion
• Fission
  • Split a filter into a pipeline for load balancing
  • Duplicate a filter, placing the duplicates in a SplitJoin to expose parallelism
• Fusion
  • Merge filters into one filter for load balancing and synchronization removal
Raw Backend (Expanded)
[Figure: backend stages labeled Flattener (FlatIR), Layout, Scheduler (Initial, Steady State), Router, Simulator, Synchronization, Joiner Scheduler (Schedule), Switch Code Generation (sw0.s, sw1.s, …), Tile Code Generation (tile0.c, tile1.c, …), Makefile Generation (Makefile)]
Raw Backend - Layout
• Layout
  • At this point layout is done by hand
  • Will be automated soon (before ASPLOS)
• After partitioning, each filter is mapped to one Raw tile
• Splitters are folded into their corresponding upstream filter (no tile needed)
• Some joiners require their own tile
• Neighboring Joiners are collapsed
Raw Backend - Switch Code
• To generate the switch code, we use a simulator to simulate the execution of the graph over the layout.
• The switch code is generated as the simulator runs.
• In its current form, StreamIt is entirely static, so its execution can be simulated (hopefully partitioning has balanced the load of each filter).
Simulator
• First we produce an initialization schedule and a steady-state schedule.
• The steady-state schedule is periodic: it preserves the number of items on each channel.
• We simulate the graph first on the initialization schedule, then on the steady-state schedule.
• Each schedule is used to calculate the number of times each filter can execute, but the ordering of executions is independent of the schedule.
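The requirement that a steady-state schedule preserve the number of items on each channel can be made concrete for a simple pipeline. The sketch below is not the compiler's scheduler; it is a minimal illustration that assumes a linear pipeline with static rates, where filter i must fire m[i] times such that m[i] * push[i] = m[i+1] * pop[i+1] on every channel. The class and method names are hypothetical.

```java
// Hypothetical illustration of steady-state firing counts for a linear pipeline.
class SteadyStateSketch {
    static long gcd(long a, long b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // pop[i] and push[i] are the static rates of filter i. Returns the smallest
    // positive firing counts that leave every channel's item count unchanged.
    // Assumption: every internal channel has positive push and pop rates.
    static long[] multiplicities(int[] pop, int[] push) {
        int n = pop.length;
        long[] num = new long[n];
        long[] den = new long[n];
        num[0] = 1;
        den[0] = 1;
        for (int i = 0; i + 1 < n; i++) {
            if (push[i] <= 0 || pop[i + 1] <= 0) {
                throw new IllegalArgumentException("internal rates must be positive");
            }
            // m[i+1] = m[i] * push[i] / pop[i+1], kept as a reduced fraction.
            long p = num[i] * push[i];
            long q = den[i] * pop[i + 1];
            long g = gcd(p, q);
            num[i + 1] = p / g;
            den[i + 1] = q / g;
        }
        // Scale all fractions by the least common multiple of the denominators.
        long lcm = 1;
        for (long d : den) {
            lcm = lcm / gcd(lcm, d) * d;
        }
        long[] m = new long[n];
        long common = 0;
        for (int i = 0; i < n; i++) {
            m[i] = num[i] * (lcm / den[i]);
            common = gcd(common, m[i]);
        }
        // Divide out any common factor to get the minimal period.
        for (int i = 0; i < n; i++) {
            m[i] /= common;
        }
        return m;
    }

    public static void main(String[] args) {
        // Source pushes 2; middle filter pops 3 and pushes 1; sink pops 2.
        long[] m = multiplicities(new int[]{0, 3, 2}, new int[]{2, 1, 0});
        System.out.println(java.util.Arrays.toString(m));
    }
}
```

For the rates in main, the source fires 3 times (6 items), the middle filter fires twice (consuming 6, producing 2), and the sink fires once (consuming 2), so every channel ends the period with as many items as it started with.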
Joiners
• Joiners require special attention because they could lead to deadlock if we program the switch to receive in the order specified by the joiner.
[Figure labels: Pop 20, Push 20; 1, 1, 1, 1; Pop 1, Push 1]
Joiners
• To resolve the deadlock, the joiner receives items as calculated by the simulator (ignoring the joiner weights).
• The joiner buffers these items internally and pushes data in the order given by the joiner weights.
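The buffering strategy described above can be sketched as follows. This is an illustration rather than the generated joiner code: the class BufferedJoinerSketch is hypothetical, items are plain integers, and the sketch assumes positive weights. Items may arrive from any input in any order; output is still emitted strictly in weight order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of a joiner that decouples receive order from push order.
class BufferedJoinerSketch {
    private final int[] weights;
    private final List<Deque<Integer>> buffers = new ArrayList<>();
    private final List<Integer> output = new ArrayList<>();
    private int current = 0; // input whose turn it is to be pushed
    private int emitted = 0; // items already pushed from 'current' in this turn

    BufferedJoinerSketch(int[] weights) {
        for (int w : weights) {
            if (w <= 0) throw new IllegalArgumentException("weights must be positive");
        }
        this.weights = weights.clone();
        for (int i = 0; i < weights.length; i++) {
            buffers.add(new ArrayDeque<>());
        }
    }

    // Accept an item from any input, in whatever order the network delivers it.
    void receive(int input, int value) {
        buffers.get(input).addLast(value);
        drain();
    }

    // Push buffered items for as long as the input whose turn it is has data.
    private void drain() {
        while (!buffers.get(current).isEmpty()) {
            output.add(buffers.get(current).removeFirst());
            if (++emitted == weights[current]) {
                emitted = 0;
                current = (current + 1) % weights.length;
            }
        }
    }

    List<Integer> output() {
        return output;
    }
}
```

With weights {1, 1}, two items arriving on input 1 before anything arrives on input 0 are simply held in the buffer; nothing blocks, and output resumes in round-robin order once input 0 delivers.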
Communication
• Filter tiles also act as data routers
• The compiler creates router nodes as necessary, on tiles that are not otherwise allocated
• The communication model cannot yet handle some forms of circular communication
Raw Backend - Tile Code
• Tile code is largely a direct translation of the Java code
• The work() function is placed in a loop, and buffers are introduced to handle channels:
  • Each filter buffers its input until it has received pop items, then it fires (done for simplicity)
  • pop() and peek() are reads from the buffer
  • A push() is a static network send
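The buffer-then-fire behavior can be sketched in Java, although the real tile code is generated C. Everything here is hypothetical: the class TileLoopSketch, a summing work() body used as a stand-in filter, and a list that stands in for the static network.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a tile's receive/fire loop; real tile code is C.
class TileLoopSketch {
    private final int popRate;
    private final int[] buffer;
    private int filled = 0;
    // Stand-in for the static network: each push() becomes an appended word.
    private final List<Integer> networkOut = new ArrayList<>();

    TileLoopSketch(int popRate) {
        if (popRate <= 0) throw new IllegalArgumentException("popRate must be positive");
        this.popRate = popRate;
        this.buffer = new int[popRate];
    }

    // Called once per word arriving at the tile; fires when the buffer is full.
    void receive(int word) {
        buffer[filled++] = word;
        if (filled == popRate) {
            work();
            filled = 0;
        }
    }

    // Example work(): each pop() is a buffer read, the push() is a network send.
    private void work() {
        int sum = 0;
        for (int i = 0; i < popRate; i++) {
            sum += buffer[i];
        }
        networkOut.add(sum);
    }

    List<Integer> sent() {
        return networkOut;
    }
}
```

Firing only once the whole input is buffered keeps the translation simple, since work() never has to stall partway through waiting on the network.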
Simple Example 1

    class Foo extends Filter {
        public void init() {
            input = new Channel (Integer.TYPE, 1);
            output = new Channel (Integer.TYPE, 1);
        }
        public void work () {
            int j, x = 0, pop;
            pop = input.popInt();
            for(j=0; j<50; j++) {
                x = x + pop;
            }
            output.pushInt (x);
        }
    }

    class HelloWorld6 extends StreamIt {
        public void init () {
            int i;
            add (new Source());
            for (i = 0; i < 14; i++)
                add(new Foo());
            add (new Sink());
        }
    }
Simple Example 1
• Pipeline of 16 filters with equal rates and work (with peeking)
Simple Example 2

    class Foo extends Filter {
        int loop;
        public void init(int i) {
            loop = i;
            input = new Channel (Integer.TYPE, 1);
            output = new Channel (Integer.TYPE, 1);
        }
        public void work () {
            int j, x = 0, pop;
            pop = input.popInt();
            for(j=0; j<5*loop; j++) {
                x = x + pop;
            }
            output.pushInt(x);
        }
    }

    class HelloWorld6 extends StreamIt {
        public void init () {
            int i;
            add (new Source());
            for (i = 0; i < 14; i++)
                add(new Foo(i));
            add (new Sink());
        }
    }
Simple Example 2
• Pipeline of 16 filters with unequal work (work increases as we move downstream)