This paper explores heuristics for synchronous data flow scheduling, focusing on graphs with fixed production and consumption rates. It compares separate buffers with a shared buffer pool, presents a SPIN model for separate buffers, and uses model checking with LTL properties to explore the state space and decide whether a given buffer bound is feasible.
Exploring heuristics for Synchronous Data Flow scheduling
Pieter Hartel, Theo Ruys, Marc Geilen
What do we mean by Synchronous Data Flow?
• Data driven
• Fixed prod & cons rates
• Cycles are ok
• Conditionals not ok
• (No self edges)
• Simple semantics
• Scheduling NP hard
• Signal processing
• Many variants
Example graph: a (x3) produces 2 tokens on c0; b (x2) consumes 3 from c0 and produces 1 on c1; c (x1) consumes 2 from c1
Periodic schedules: aababc (max c0=4) and aaabbc (max c0=6)
Two problems: separate buffers for c0 and c1, or a shared buffer?
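Semantics (Lee et al, 1987)
• Γ = topology matrix, one row per channel and one column per node:
         a   b   c
    c0   2  -3   0
    c1   0   1  -2
• s(0) = (0, 0), the initial tokens on (c0, c1)
• γ(i) = pick column of Γ
• s(i+1) = s(i)+γ(i), with s(i+1) ≥ 0
Explore the state space
• Looking for a cycle
• Avoiding duplicate work
• i.e. Model Checking...
Common buffer space: 4, separate buffers: 6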
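The semantics above can be tried out directly. The following is a minimal Python sketch, not the authors' SPIN model: it fires columns of the topology matrix of the a, b, c example, searches breadth-first for a cycle back to s(0), and uses a visited set to avoid duplicate work. The names `GAMMA`, `fire`, `find_cycle` and the `fits` predicate are assumptions made for this illustration.

```python
from collections import deque

# Topology matrix of the example: rows are channels c0, c1; columns are nodes a, b, c.
GAMMA = [[2, -3, 0],
         [0, 1, -2]]
NODES = "abc"

def fire(state, j):
    """Return state + column j of GAMMA, or None if a channel would go negative."""
    nxt = tuple(s + row[j] for s, row in zip(state, GAMMA))
    return nxt if min(nxt) >= 0 else None

def find_cycle(fits, start=(0, 0)):
    """Breadth-first search for a non-empty firing sequence from start back to start.

    Only states accepted by fits(state) are visited; the seen set avoids
    exploring the same state twice.
    """
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        state, path = queue.popleft()
        for j, name in enumerate(NODES):
            nxt = fire(state, j)
            if nxt is None or not fits(nxt):
                continue
            if nxt == start:
                return path + name
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + name))
    return None

# Common buffer pool: bound the total number of tokens.
print(find_cycle(lambda s: sum(s) <= 4))                 # aababc
print(find_cycle(lambda s: sum(s) <= 3))                 # None
# Separate buffers: bound each channel on its own.
print(find_cycle(lambda s: s[0] <= 4 and s[1] <= 2))     # aababc
```

Under these assumptions the sketch agrees with the slide: a shared pool of 4 tokens suffices, while separate buffers need 4 + 2 = 6, since no split of 5 tokens over c0 and c1 admits a cycle.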
Separate buffer SPIN model (DAC05)
    #define max(x,y) ((x) > (y) -> (x) : (y))
    byte c0, c1;
    byte s0=4, s1=2;   /* min. buffer size calculated from Γ */
    init {
      do
      /*a*/ :: c0 = c0+2; s0 = max(c0,s0)
      /*b*/ :: c0>=3 -> c0 = c0-3; c1 = c1+1; s1 = max(c1,s1)
      /*c*/ :: c1>=2 -> c1 = c1-2
      od
    }
Can we make this more abstract?
LTL feasible: [] (s0+s1 <= 6); infeasible: [] (s0+s1 <= 5)
LTL without s0,s1: [] (c0<=4 && c1<=2)
Limit # times a node is fired (sound but not complete)
Can't compute n* from c*
    byte na, nb, nc;
    #define c0 (na*2-nb*3)
    #define c1 (nb*1-nc*2)
    init {
      do
      /*a*/ :: (na<3) -> na++
      /*b*/ :: (nb<2 && c0>=3) -> nb++
      /*c*/ :: (nc<1 && c1>=2) -> nc++
      od
    }
    #define property (c0<=4 && c1<=2)
    #define reset (c0==0 && c1==0)
Repetition vector: Γ q = 0, with q = (3, 2, 1)
LTL formula: X (property U reset)
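The idea of counting firings instead of tokens can be sketched in Python as follows. This is an illustration under stated assumptions rather than a translation of the Promela model: `Q` is the repetition vector (3, 2, 1) of the example, `tokens` derives c0 and c1 from the firing counts, and `schedules` enumerates the firing sequences of one iteration that keep the property (c0 and c1 within their bounds) until the reset state is reached. All three names are hypothetical.

```python
GAMMA = [[2, -3, 0],
         [0, 1, -2]]
Q = (3, 2, 1)  # repetition vector of the example: a x3, b x2, c x1

def tokens(counts):
    """Channel contents (c0, c1) derived from firing counts (na, nb, nc)."""
    return tuple(sum(g * n for g, n in zip(row, counts)) for row in GAMMA)

def schedules(bound0, bound1, counts=(0, 0, 0), prefix=""):
    """Yield every one-iteration schedule with 0 <= c0 <= bound0 and 0 <= c1 <= bound1."""
    if counts == Q:          # all nodes fired their quota: back at c0 == c1 == 0
        yield prefix
        return
    for j, name in enumerate("abc"):
        if counts[j] >= Q[j]:            # node already fired its allowed number of times
            continue
        nxt = counts[:j] + (counts[j] + 1,) + counts[j + 1:]
        c0, c1 = tokens(nxt)
        if 0 <= c0 <= bound0 and 0 <= c1 <= bound1:
            yield from schedules(bound0, bound1, nxt, prefix + name)

print(tokens(Q))                 # (0, 0)
print(list(schedules(4, 2)))     # ['aababc']
print(list(schedules(6, 2)))     # ['aaabbc', 'aababc']
```

Because each node may fire at most its repetition count, the search space is finite regardless of the buffer bounds. The two schedules found with c0 bounded by 6 are the two periodic schedules of the example.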
Clustering is sound, but incomplete
Example: Inmarsat
[Figure: the chain c x5, d x5760, e x12 is transformed into c x5, d' x60, e x12; edge rates shown: 1152, 1, 96, 480]
Proof: 96*60 = 5760 and 96*5 = 480
Checking the bounds
• Simple C program generates Promela models from the topology matrix and initial token assignment of 10 common benchmarks in 146 versions
• SPIN does the hard work
• Performance comparable with special purpose research tools from Eindhoven & Twente (separate buffers only)
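To give an impression of what such a generator does, here is a small Python sketch (the slides mention a C program, which is not shown, so this is not the authors' code). It assumes a graph without self edges, so that each entry of the topology matrix is unambiguously a production (positive) or a consumption (negative) rate. The function name `promela_model` and its parameters are made up for this illustration, and the generated model omits the buffer-size bookkeeping.

```python
def promela_model(gamma, initial, nodes):
    """Return Promela text for a separate-buffer model of an SDF graph.

    gamma[i][j] is the token change on channel i when node j fires;
    initial[i] is the initial number of tokens on channel i.
    """
    channels = range(len(gamma))
    decls = ", ".join(f"c{i} = {initial[i]}" for i in channels)
    lines = [f"byte {decls};", "init {", "  do"]
    for j, name in enumerate(nodes):
        guards, updates = [], []
        for i in channels:
            rate = gamma[i][j]
            if rate < 0:      # node consumes: enough tokens must be present
                guards.append(f"c{i} >= {-rate}")
                updates.append(f"c{i} = c{i} - {-rate}")
            elif rate > 0:    # node produces
                updates.append(f"c{i} = c{i} + {rate}")
        guard = " && ".join(guards) or "true"
        lines.append(f"  /*{name}*/ :: {guard} -> " + "; ".join(updates))
    lines += ["  od", "}"]
    return "\n".join(lines)

model = promela_model([[2, -3, 0], [0, 1, -2]], [0, 0], "abc")
print(model)
```

For the a, b, c example this prints one guarded option per node, for instance `/*b*/ :: c0 >= 3 -> c0 = c0 - 3; c1 = c1 + 1`.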
[Chart: states stored (20-30K per second); series: limiting & look ahead, clustering]
Finding the bounds (common buffer pool)
Start with initial guess g for the optimal bound and step size s, using modified DAC05 model
    repeat
      exit if SPIN finds a schedule with an optimal bound b ≤ g   -- down
      g ← g+s                                                      -- up
    end repeat
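The loop above can be sketched in Python as follows. The SPIN run is replaced by a hypothetical `oracle(g)` callback that returns the bound b of a schedule fitting in g, or None if there is none. The `limit` parameter is an addition for this sketch so that the loop ends even when no schedule exists; `toy_oracle` simply hard-codes the optimum 4 of the a, b, c example.

```python
def find_bound(oracle, guess, step, limit=1_000_000):
    """Raise the guess by step until oracle reports a schedule, then return its bound."""
    g = guess
    while g <= limit:
        b = oracle(g)        # stand-in for one model checker run with bound g
        if b is not None:
            return b         # "down": b may be smaller than the current guess
        g += step            # "up": no schedule fits, try a larger bound
    return None

def toy_oracle(g):
    """Pretend checker for the example graph, whose optimal common pool size is 4."""
    return 4 if g >= 4 else None

print(find_bound(toy_oracle, guess=1, step=2))    # 4, after failed runs at g=1 and g=3
print(find_bound(toy_oracle, guess=10, step=1))   # 4, found by the first run
```

A large step needs fewer failed runs but can overshoot the optimum, in which case the run that succeeds has to search a larger state space.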
How well does this work?
[Figure: the search stepping up and down around the optimal bound]
Best case: start at optimal bound+1
Conclusions
• Creative laziness: use SPIN as the Swiss army knife of Computer Science
• Creating abstract models is hard
• The effective ideas can be implemented in special purpose tools
• Some new theory for the common buffer pool sizes