Embedded Computer Architecture 2 (BOCA)
Bijzondere Onderwerpen Computer Architectuur (Special Topics in Computer Architecture)
Block D: Examples of Regular Dependency Graphs
The SFG of the AR filter
Streaming environment. Infinite impulse response (y is on both sides of the equation).
Recall:
Recurrent relations:
Step 1: Expanding
Step 2: Just one operation
Step 3: Substitution of
The SFG of the AR filter
[Figure: dependency graph for N = 4 with columns i = 0, 1, 2, 3. Row z carries s_{z,-1}, s_{z,0}, ..., s_{z,3}, with input x_z and output y_z = s_{z,3}; rows z to z+6 are shown. Basic cell: inputs s_{z,i-1} and s_{z-i-1,N-1}, one multiplication (x) and one addition (+), output s_{z,i}.]
a_i is a local constant.
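The basic-cell labels suggest the recurrence s_{z,i} = s_{z,i-1} + a_i * s_{z-i-1,N-1}, with s_{z,-1} = x_z and y_z = s_{z,N-1}. The following is a minimal Python sketch under that assumption. The function names `ar_filter` and `ar_direct`, the zero initial state, and the sample coefficients are illustrative and do not come from the slides.

```python
def ar_filter(x, a):
    """Evaluate the AR recurrence row by row; a[i] is the local constant of column i."""
    n = len(a)
    y = []  # y[z] = s[z][n-1]
    for z, xz in enumerate(x):
        s = xz  # s[z][-1] = x[z] (assumed start value of the row)
        for i in range(n):
            # s[z-i-1][n-1] is the past output y[z-i-1]; assumed 0 before the stream starts.
            past = y[z - i - 1] if z - i - 1 >= 0 else 0
            s = s + a[i] * past  # one multiply-add per basic cell
        y.append(s)
    return y


def ar_direct(x, a):
    """Reference form: y[z] = x[z] + sum_i a[i] * y[z-i-1]."""
    y = []
    for z, xz in enumerate(x):
        acc = sum(a[i] * y[z - i - 1] for i in range(len(a)) if z - i - 1 >= 0)
        y.append(xz + acc)
    return y


impulse_response = ar_filter([1.0, 0.0, 0.0, 0.0], [0.5, 0.25, 0.0, 0.0])
```

Both functions compute the same sequence; the first merely spreads the sum over N chained cells, which is what the dependency graph draws.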
The SFG of the AR filter
Separating the different variables, I/O and intermediate.
All variables 2-dimensional.
Removing superfluous variables.
Results in:
[Figure: the four cells in a row, with constants a0 to a3; each cell contains one multiplier (x) and one adder (+). Labels: x_{z,-1}, s_{z,0}, s_{z,1}, s_{z,2}, y_{z,3}, u_{z-1,0}, u_{z-1,1}, u_{z-1,2}, u_{z,0}, u_{z,1}, u_{z,2}, v_{z-1,3}, v_{z,3}. The dependency graph shows rows z to z+6 over columns 0 to 3.]
The SFG of the AR filter
The only possible scheduling vector in the direction of the I/O enumeration is:
Possible scheduling vectors are:
[Figure: the dependency graph (rows z to z+6, columns 0 to 3) with d = (1,0) marked; variables x_{z,-1}, s_{z,i}, y_{z,3}, v_{z,i}, u_{z,i}; vectors (3,1) and (1,0).]
The SFG of the AR filter
SFG 1: d = (1,0), s = (1,0)
[Figure: the four cells connected through delays D0 and D1; input stream x(z)_{-1}, x(z+1)_{-1}, x(z+2)_{-1}; output stream y(z)_3, y(z+1)_3, y(z+2)_3.]
Not systolic. In one clock period, N (= 4) additions need to be done. At each clock tick an output is produced and an input is absorbed.
The SFG of the AR filter
SFG 2: d = (1,0), s = (4,1)
[Figure: the four cells connected through delays D1 and D4; input stream x(z-1)_{-1}, x(z+3)_{-1}, x(z+7)_{-1}; output stream y(z-3)_3, y(z+1)_3, y(z+5)_3.]
Systolic. In one clock period, one multiply-add needs to be done. Every N (= 4) clock ticks an output is produced and an input is absorbed.
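As a rough check of the two schedules, the delay on an SFG connection can be computed as the inner product of the scheduling vector s with the direction of the corresponding DG edge, and the firing time of node (z, i) as s · (z, i). The sketch below assumes the localized DG has edges in the directions (0, 1) (along a row) and (1, 0) (between consecutive rows). These directions are inferred from the delay labels D0/D1 and D1/D4 in the figures and are not stated in the slides; all function names are illustrative.

```python
def dot(u, v):
    """Inner product of two integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def edge_delays(s, edges):
    """Delay in clock ticks that schedule s assigns to each DG edge direction."""
    return {e: dot(s, e) for e in edges}


def is_systolic(delays):
    """Systolic here means: every connection carries at least one register."""
    return all(d >= 1 for d in delays.values())


EDGES = [(0, 1), (1, 0)]  # assumed edge directions: along a row, row to next row
N = 4                     # number of cells per row

sfg1 = edge_delays((1, 0), EDGES)  # schedule of SFG 1
sfg2 = edge_delays((4, 1), EDGES)  # schedule of SFG 2

# Firing times of the N nodes in row z = 2 under each schedule.
row_times_1 = [dot((1, 0), (2, i)) for i in range(N)]
row_times_2 = [dot((4, 1), (2, i)) for i in range(N)]
```

Under s = (1,0) all nodes of a row fire in the same tick, so the N additions form one combinational chain; under s = (4,1) each node of a row fires one tick after its left neighbour, and consecutive rows start N ticks apart.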
Switching
[Figure: a switch with inputs x0 to x3, outputs y0 to y3, and control values k0 to k3.]
The input-output relation can be described by: y_i = x_{k(i)}
in which k(i) denotes the index of the input that is connected to the output y_i.
Notice that x_{k(i)} cannot directly be implemented by a simple function, so: recurrent relations:
[Figure: the switch drawn as a 4 x 4 grid with axes i (rows) and j (columns); row i starts from s_{i,-1} = 0 and ends in output y_i; the inputs x0 to x3 are distributed over the grid.]
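Switching
Recurrent relation:
Globally recursive DG:
[Figure: 4 x 4 grid of nodes (i, j) with i, j = 0..3; inputs x0 to x3; row i receives k_i and an initial value 0 and produces y_i; row 3 is annotated s_{3,-1}, s_{3,0}, ..., s_{3,3}.]
Basic cell:
[Figure: a comparator (j = ?) on k_i drives a multiplexer (inputs labelled 1 and 0) that selects between x_j and s_{i,j-1} to produce s_{i,j}.]
The various SFGs that can be derived from this DG do not lead to interesting implementations.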
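The basic cell can be read as the recurrence s_{i,-1} = 0, s_{i,j} = x_j if j = k_i and s_{i,j-1} otherwise, with y_i = s_{i,3}. A minimal Python sketch of that reading follows; the recurrence is an interpretation of the cell labels, and the function name `switch` and the connection table `k` are hypothetical.

```python
def switch(x, k):
    """Route inputs to outputs with a chain of compare-and-select cells.

    x: list of input samples x[j]
    k: list where k[i] is the index of the input connected to output y[i]
    """
    y = []
    for i in range(len(k)):
        s = 0  # s[i][-1] = 0
        for j in range(len(x)):
            # Basic cell: take x[j] when j matches k[i], otherwise pass s along.
            s = x[j] if j == k[i] else s
        y.append(s)  # y[i] = s[i][last column]
    return y


routed = switch(["a", "b", "c", "d"], [1, 2, 0, 3])
```

Every x_j has to reach every row of the grid, which is what makes this dependency graph globally recursive.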
Time Division Multiplex Switching
[Figure: a TDM stream on a time axis t = 0..11, divided into frames (frame q, frame q+1) of four channels 0..3 carrying samples a b c d; channel p = 2 is marked. Sample sequences a b c d and b c a d are shown for switching (routing).]
The sample at t belongs to frame q and channel p with t = p + 4q, and therefore p = t mod 4 and q = floor(t / 4).
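The split of a time index into channel and frame is plain integer division with remainder. A short Python sketch, assuming four channels per frame as in the figure; the names `CHANNELS` and `channel_and_frame` are illustrative.

```python
CHANNELS = 4  # channels per frame, as in the example stream


def channel_and_frame(t):
    """Return (p, q) such that t = p + CHANNELS * q and 0 <= p < CHANNELS."""
    q, p = divmod(t, CHANNELS)
    return p, q


p, q = channel_and_frame(6)  # sample at t = 6
```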
Time Division Multiplex Switching
[Figure: a TDM switch with input stream x and output stream y over t = 0..11; sample sequences a b c d and b c a d; labelled "Switching (routing)".]
Switch function: y_t = x_{t - Q(t mod 4)}, or equivalently, with t = p + 4q, y_{p+4q} = x_{p+4q-Q(p)}
Q(p) describes the number of time units a sample from the input stream has to be delayed in order to arrive in time in the destination channel p in output stream y.
Notice that this function allows broadcasting. This is not possible with the alternative switching function
Example: Q(0) = 2, Q(1) = 1, Q(2) = 1, Q(3) = 0
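To make the role of Q concrete, the sketch below assumes the switch function has the form y_t = x_{t - Q(t mod 4)}, which follows from the description of Q as a per-destination-channel delay. The function name `tdm_switch`, the use of `None` for samples that would come from before the start of the stream, and the second delay table are assumptions made for illustration.

```python
def tdm_switch(x, Q, channels=4):
    """Delay-based TDM switch: output slot t takes the input sample Q[t mod channels] ticks earlier."""
    y = []
    for t in range(len(x)):
        src = t - Q[t % channels]
        # Samples from before t = 0 do not exist; mark them as None.
        y.append(x[src] if src >= 0 else None)
    return y


x = list("abcdefgh")  # two frames: a b c d | e f g h

# Delay table from the example: Q(0) = 2, Q(1) = 1, Q(2) = 1, Q(3) = 0.
y_example = tdm_switch(x, [2, 1, 1, 0])

# Hypothetical table showing broadcasting: every channel reads channel 0 of its own frame.
y_broadcast = tdm_switch(x, [0, 1, 2, 3])
```

Because Q is indexed by the destination channel, several destinations may point at the same source sample, which is why this form permits broadcasting.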
Time Division Multiplex Switching
[Figure: a TDM switch with input x, control input R, and output y.]
Switch function: y_t = x_{t - Q(t mod 4)}
We assume that the values Q(t mod 4) are provided by a variable R_t. So R_t = Q(t mod 4) and y_t = x_{t - R_t}.
This leads to the following recurrent relation:
Basic cell:
[Figure: a comparator (j = ?) on R_t drives a multiplexer (inputs labelled 1 and 0) that selects between x_{t-j} and s_{t,j-1} to produce s_{t,j}.]
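The basic cell suggests the recurrence s_{t,-1} = 0, s_{t,j} = x_{t-j} if j = R_t and s_{t,j-1} otherwise, with y_t = s_{t,3}. The Python sketch below evaluates one row of cells per time step under that reading. The recurrence is reconstructed from the cell labels, and the function name `tdm_switch_recurrent`, the zero fill for samples before t = 0, and the numeric test stream are assumptions.

```python
def tdm_switch_recurrent(x, R, depth=4):
    """Evaluate y[t] through a chain of `depth` compare-and-select cells per time step."""
    y = []
    for t in range(len(x)):
        s = 0  # s[t][-1] = 0
        for j in range(depth):
            if j == R[t]:
                # Select the input sample delayed by j ticks (0 if before the stream start).
                s = x[t - j] if t - j >= 0 else 0
            # Otherwise the cell passes s[t][j-1] through unchanged.
        y.append(s)  # y[t] = s[t][depth-1]
    return y


Q = [2, 1, 1, 0]                      # example delays per destination channel
x = [10, 11, 12, 13, 14, 15, 16, 17]  # hypothetical input stream, two frames
R = [Q[t % 4] for t in range(len(x))]  # R[t] = Q(t mod 4)
y = tdm_switch_recurrent(x, R)
```

Unlike the space switch, cell j only needs the sample x_{t-j}, which a delay line can supply locally.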
Time Division Multiplex Switching
DG:
[Figure: 4 x 4 grid of nodes (t, j) with t, j = 0..3; row t receives x_t, R_t and an initial value 0, and produces y_t; row 0 is annotated s_{0,-1} and s_{0,2}.]
From global to local gives 8 alternatives: 2 for s, 2 for x, 2 for R.
Time Division Multiplex Switching
Locally recursive DG:
[Figure: the 4 x 4 grid of nodes (t, j) with local connections carrying s_{t,j}, R_t, x_t and y_t; annotations s_{0,-1}, s_{0,2} and R_{0,1}; the vector d and the allowed scheduling vectors s are marked.]
Time Division Multiplex Switching
Locally recursive DG with s = (1,1):
[Figure: the 4 x 4 grid of nodes (t, j) with the vector d marked; annotations s_{0,-1}, s_{0,2} and R_{0,1}.]
Every line indicates a register.
Time Division Multiplex Switching
SFG:
[Figure: a row of four cells (0,0), (0,1), (0,2), (0,3) with the vectors s and d marked; inputs x_t and R_t, output y_t; connections carry delays D and D2; stream annotations x(t)_1, R(t)_0, s(t)_{-1}, s(t)_3.]
This is a nice systolic implementation. Designs like this were 'invented' and patented in the sixties and seventies. Here they are simply derived.