Optoelectronic Parallel Computing with its Applications. Prasanta K. Jana, Department of Computer Sc. & Engg., Indian School of Mines, Dhanbad. IWLSC-06, 8-10 Feb.
Optoelectronic Parallel Computers. A hybrid structure using optical and electronic links to support massively parallel processing. An example: OTIS (Optical Transpose Interconnection System), proposed by Marsden et al. in Optics Letters, 18 (1993) 1083-1085.
Why Hybrid? Background: Processors are spread over several levels of the packaging hierarchy.
Why Hybrid? (Contd.) Advantages of Optical Links over Electrical Links
• 1. Higher speed: light pulses are used, and waveguides support pipelining
• 2. Less crosstalk: an optical bus can be shared, whereas a shared electronic bus carries only one message
Why Hybrid? Advantages (Contd.)
• 3. Less power consumption: the power requirement is nearly independent of link length
References: M. Feldman et al., Applied Optics, 27(9), 1988, 1742-1751; F. Kiamilev et al., Journal of Lightwave Technology, 9(12), 1991.
Why Hybrid? (Contd.)
• Electronic links are very efficient when the distance is no more than a few millimeters
Conclusion: use optical links between distant processors and electronic links between nearby processors, i.e. a hybrid.
OTIS Organization
• Processors are divided into groups
• Each group contains several processors
• Electronic links connect processors within a group; optical links connect processors in different groups
• Processor (G, P), i.e. processor P of group G, is connected to processor (P, G)
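The transpose rule above can be sketched in a few lines of Python. The function names `otis_partner` and `optical_links` are illustrative only, and the sketch assumes that a processor with G = P has no optical link, since the transpose would map it to itself.

```python
def otis_partner(group, proc):
    """Return the processor reached from (group, proc) over its optical link.

    Assumption: a processor with group == proc has no optical link,
    because the transpose maps it to itself.
    """
    if group == proc:
        return None
    return (proc, group)


def optical_links(size):
    """Enumerate the optical links of an OTIS network with `size` groups
    of `size` processors each, as unordered pairs of (group, proc) labels."""
    links = set()
    for g in range(1, size + 1):
        for p in range(1, size + 1):
            partner = otis_partner(g, p)
            if partner is not None:
                links.add(frozenset({(g, p), partner}))
    return links
```

For example, `otis_partner(1, 3)` gives `(3, 1)`, and a network of four groups of four processors has six optical links under this assumption.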
Figure: 16-processor OTIS-Mesh. Groups 1 to 4 each hold four processors, labelled (G, P) from (1,1) to (4,4).
OTIS Models
• OTIS-Mesh
• OTIS-Hypercube
• OTIS-Ring
• OTIS-Mesh of Trees
Existing Parallel Algorithms on Optoelectronic Models
[1] C. F. Wang and S. Sahni, “Basic operations on the OTIS-Mesh optoelectronic computer,” IEEE Trans. on Parallel and Distributed Systems, Vol. 9, No. 12, pp. 1226-1236, December 1998.
[2] K. Day, “Topological Properties of OTIS-Networks,” IEEE Trans. on Parallel and Distributed Systems, Vol. 13, No. 4, pp. 359-366, April 2002.
[3] A. Datta, “Summation and Routing on a Partitioned Optical Passive Stars Network with Large Group Size,” IEEE Trans. on Parallel and Distributed Systems, Vol. 14, No. 12, pp. 1275-1285, 2003.
[4] J. Lie, Yi Pan and H. Shen, “Sublogarithmic Deterministic Selection on Arrays with a Reconfigurable Optical Bus,” IEEE Trans. on Computers, Vol. 51, No. 6, pp. 702-707, June 2002.
[5] A. Osterloh, “Sorting on the OTIS-Mesh,” Proc. 14th Int. Parallel and Distributed Processing Symposium (IPDPS 2000), pp. 269-274, 2000.
[6] C. F. Wang and S. Sahni, “Matrix multiplication on the OTIS-Mesh optoelectronic computer,” IEEE Trans. on Computers, Vol. 50, No. 7, pp. 635-646, July 2001.
[7] S. Sahni and C. F. Wang, “BPC permutations on the OTIS-Mesh optoelectronic computer,” Proc. Fourth Int’l. Conference on Massively Parallel Processing Using Optical Interconnections (MPPOI ’97), pp. 130-135, 1997.
[8] S. Rajasekaran and S. Sahni, “Randomized routing, selection, and sorting on the OTIS-Mesh optoelectronic computer,” IEEE Trans. on Parallel and Distributed Systems, Vol. 9, No. 9, pp. 833-840, 1998.
[9] C. F. Wang and S. Sahni, “Image processing on the OTIS-Mesh optoelectronic computer,” IEEE Trans. on Parallel and Distributed Systems, Vol. 11, No. 2, pp. 97-109, February 2000.
[10] P. K. Jana, “Polynomial interpolation on OTIS-Mesh optoelectronic computers,” Distributed Computing - IWDC 2004, Lecture Notes in Computer Science (Springer), Heidelberg, pp. 373-378, 2004.
[11] P. K. Jana, “Improved Parallel Prefix Computation on Optical Multi-Trees,” in Proceedings of IEEE Indicon 2004, IIT Kharagpur, India, 20-22 Dec. 2004, pp. 414-418.
[12] P. K. Jana and Koushik Sinha, “Bit reversal permutation on optical multi-trees (OMULT),” in Proceedings of the 12th International Conference on Advanced Computing and Communications (ADCOM 2004), Ahmedabad, India, 15-18 December 2004.
Prefix Problem: Given N data values x_1, x_2, …, x_N, compute P_i = x_1 o x_2 o x_3 o … o x_i, 1 ≤ i ≤ N, where o is an associative binary operation.
Expanded form:
P_1 = x_1
P_2 = x_1 o x_2
P_3 = x_1 o x_2 o x_3
…
P_N = x_1 o x_2 o x_3 o … o x_N
Sequential Algorithm
P_1 = x_1
P_i = P_{i-1} o x_i, i ≥ 2
Requires O(N) time
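The recurrence above translates directly into a loop. The following is a minimal sketch; the name `sequential_prefix` and the default choice of addition for the operation o are assumptions made for illustration.

```python
from operator import add


def sequential_prefix(values, op=add):
    """Compute P_1, ..., P_N with P_1 = x_1 and P_i = P_{i-1} o x_i.

    `op` stands for the associative operation o (addition by default).
    One application of `op` per element after the first, so O(N) time.
    """
    prefixes = []
    for x in values:
        if prefixes:
            prefixes.append(op(prefixes[-1], x))
        else:
            prefixes.append(x)
    return prefixes
```

Any associative operation can be passed in, for instance `sequential_prefix([3, 1, 4, 1, 5], op=max)` for running maxima.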
Applications of Prefix Computation
• Knapsack problem
• Job sequencing with deadlines
• Compiler design
• Computational biology
• Evaluation of polynomials
• Solving systems of linear equations
• Polynomial interpolation
Our Proposed Algorithm on OTIS-Mesh. An n-point prefix computation on an OTIS-Mesh with n processors takes 5.5n^(1/4) + 3 electronic moves + 2 OTIS moves.
Figure: 16-processor OTIS-Mesh with groups indexed 11, 12, 21 and 22, each holding processors (1,1), (1,2), (2,1) and (2,2).
Initialization (index of the data value held by each of the 256 processors)
  1   5   9  13   65  69  73  77  129 133 137 141  193 197 201 205
  2   6  10  14   66  70  74  78  130 134 138 142  194 198 202 206
  4   8  12  16   68  72  76  80  132 136 140 144  196 200 204 208
  3   7  11  15   67  71  75  79  131 135 139 143  195 199 203 207
 17  21  25  29   81  85  89  93  145 149 153 157  209 213 217 221
 18  22  26  30   82  86  90  94  146 150 154 158  210 214 218 222
 20  24  28  32   84  88  92  96  148 152 156 160  212 216 220 224
 19  23  27  31   83  87  91  95  147 151 155 159  211 215 219 223
 49  53  57  61  113 117 121 125  177 181 185 189  241 245 249 253
 50  54  58  62  114 118 122 126  178 182 186 190  242 246 250 254
 52  56  60  64  116 120 124 128  180 184 188 192  244 248 252 256
 51  55  59  63  115 119 123 127  179 183 187 191  243 247 251 255
 33  37  41  45   97 101 105 109  161 165 169 173  225 229 233 237
 34  38  42  46   98 102 106 110  162 166 170 174  226 230 234 238
 36  40  44  48  100 104 108 112  164 168 172 176  228 232 236 240
 35  39  43  47   99 103 107 111  163 167 171 175  227 231 235 239
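Block Prefix Computation (i-j denotes x_i o … o x_j). Each block computes the prefixes of its own 16 values. First block, shown in full:
 1-1  1-5  1-9   1-13
 1-2  1-6  1-10  1-14
 1-4  1-8  1-12  1-16
 1-3  1-7  1-11  1-15
Other blocks (first and last prefix shown): 65-65 … 65-80, 129-129 … 129-144, 193-193 … 193-208, 17-17 … 17-32, 81-81 … 81-96, 145-145 … 145-160, 209-209 … 209-224, 49-49 … 49-64, 113-113 … 113-128, 177-177 … 177-192, 241-241 … 241-256, 33-33 … 33-48, 97-97 … 97-112, 161-161 … 161-176, 225-225 … 225-240.
Time: 2n^(1/4) + 1 electronic moves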
OTIS Move, applied to the block prefixes of the previous step (not all moves are shown). Time: one OTIS move
Result After OTIS Move (block totals)
 1-16   65-80   129-144  193-208
 17-32  81-96   145-160  209-224
 49-64  113-128 177-192  241-256
 33-48  97-112  161-176  225-240
Modified Prefix
 0     1-64   1-128  1-192
 1-16  1-80   1-144  1-208
 1-48  1-112  1-176  1-240
 1-32  1-96   1-160  1-224
Time: 2n^(1/4) + 3 electronic moves
OTIS Move: the modified prefix values (0, 1-64, 1-128, 1-192, 1-16, …, 1-224) travel over the optical links to the blocks holding the block prefixes (1-1 … 1-16, 17-17 … 17-32, …, 225-225 … 225-240). Time: one OTIS move
Final Result by Broadcasting: every block now holds global prefixes (first and last shown): 1-1 … 1-16, 1-17 … 1-32, 1-33 … 1-48, 1-49 … 1-64, 1-65 … 1-80, 1-81 … 1-96, 1-97 … 1-112, 1-113 … 1-128, 1-129 … 1-144, 1-145 … 1-160, 1-161 … 1-176, 1-177 … 1-192, 1-193 … 1-208, 1-209 … 1-224, 1-225 … 1-240, 1-241 … 1-256.
Time: 1.5n^(1/4) - 1 electronic moves
Overall Time Complexity: 5.5n^(1/4) + 3 electronic moves + 2 OTIS moves
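The three computational phases (block prefix, modified prefix over the block totals, final combination) can be checked with a sequential simulation. This is only a sketch of the arithmetic: it ignores the mesh layout and the data movement, the name `blocked_prefix` is hypothetical, and `None` is used in place of the 0 shown on the slides so that no identity element is needed.

```python
from operator import add


def blocked_prefix(values, block_size, op=add):
    """Sequentially simulate the three phases of the block-based prefix.

    Phase 1: each block computes the prefixes of its own values.
    Phase 2: an exclusive ("modified") prefix is taken over the block totals.
    Phase 3: each block combines its offset with its local prefixes.
    """
    blocks = [values[i:i + block_size]
              for i in range(0, len(values), block_size)]

    # Phase 1: local prefix inside every block.
    local = []
    for block in blocks:
        acc = []
        for x in block:
            acc.append(op(acc[-1], x) if acc else x)
        local.append(acc)

    # Phase 2: exclusive prefix over block totals (None marks "nothing before").
    offsets = [None]
    for prefixes in local[:-1]:
        total = prefixes[-1]
        offsets.append(total if offsets[-1] is None else op(offsets[-1], total))

    # Phase 3: combine each block's offset with its local prefixes.
    result = []
    for offset, prefixes in zip(offsets, local):
        if offset is None:
            result.extend(prefixes)
        else:
            result.extend(op(offset, v) for v in prefixes)
    return result
```

With 256 values and blocks of 16 this mirrors the example on the slides. The electronic move counts of the three phases, 2n^(1/4) + 1, 2n^(1/4) + 3 and 1.5n^(1/4) - 1, add up to the stated 5.5n^(1/4) + 3.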
All-to-All Communication on OTIS-Ring. Problem statement: each processor holds one message and sends it to every other processor. Also known as gossiping, total exchange, or all-broadcast.
Figure: All-to-All Communication among processors P1, P2, …, Pn-1, Pn.
Applications of All-to-All Broadcast
• Matrix-matrix multiplication
• Matrix-vector multiplication
• Extrema finding
• Reduction
• Prefix computation
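To make the all-to-all problem concrete, here is a minimal simulation on a plain unidirectional ring, where every processor forwards the message it received last to its clockwise neighbour. This is an assumption chosen for illustration, not the OTIS-Ring schedule of the talk; the name `ring_all_to_all` is hypothetical.

```python
def ring_all_to_all(messages):
    """Simulate all-to-all broadcast on a unidirectional ring.

    messages[i] is the message initially held by processor i.
    Returns, for every processor, the list of all messages in processor order.
    Takes n - 1 communication steps for n processors.
    """
    n = len(messages)
    known = [{i: messages[i]} for i in range(n)]
    # in_transit[i] is the index of the message processor i forwards next.
    in_transit = list(range(n))
    for _ in range(n - 1):
        # Each processor receives what its anticlockwise neighbour held.
        in_transit = [in_transit[(i - 1) % n] for i in range(n)]
        for i in range(n):
            known[i][in_transit[i]] = messages[in_transit[i]]
    return [[known[i][j] for j in range(n)] for i in range(n)]
```

After the n - 1 steps every processor holds all n messages, which is the starting point for the applications listed above.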
Polynomial Interpolation: Given a set of function values y_1, y_2, …, y_N at discrete points x_1, x_2, …, x_N, the interpolation problem is to evaluate the function at an intermediate point x, where x_1 < x < x_N.
N-point Lagrange formula: …
N-point Hermite formula: …
Expanded form of the product over all j ≠ i: (x_i - x_1)(x_i - x_2)…(x_i - x_{i-1})(x_i - x_{i+1})…(x_i - x_N), i = 1, 2, …, N
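For reference, the standard N-point Lagrange formula can be written as below. The symbols f, π(x) and π'(x_i) are assumed notation, not necessarily the talk's own; π'(x_i) is the expanded product over all j ≠ i given above.

```latex
f(x) \approx \sum_{i=1}^{N} \frac{\pi(x)}{(x - x_i)\,\pi'(x_i)}\; y_i,
\qquad
\pi(x) = \prod_{j=1}^{N} (x - x_j),
\qquad
\pi'(x_i) = \prod_{\substack{j=1 \\ j \neq i}}^{N} (x_i - x_j).
```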
Ben Goertzel, “Lagrange interpolation on a processor tree with ring connections,” J. of Parallel and Distributed Computing, Vol. 22, No. 2 (1994) 321-323.
Ring of 5 processors, initial state: processor i (i = 1, …, 5) holds 1, Xi, Xi, Xi.
Ring of 5 processors, after the first exchange with both neighbours:
Processor  Product            Values held
1          (X1-X5)(X1-X2)     X5, X2, X1
2          (X2-X1)(X2-X3)     X1, X3, X2
3          (X3-X2)(X3-X4)     X2, X4, X3
4          (X4-X3)(X4-X5)     X3, X5, X4
5          (X5-X4)(X5-X1)     X4, X1, X5
Ring of 5 processors, after the second exchange:
Processor  Product                          Values held
1          (X1-X5)(X1-X2)(X1-X4)(X1-X3)     X4, X3, X1
2          (X2-X1)(X2-X3)(X2-X5)(X2-X4)     X5, X4, X2
3          (X3-X2)(X3-X4)(X3-X1)(X3-X5)     X1, X5, X3
4          (X4-X3)(X4-X5)(X4-X2)(X4-X1)     X2, X1, X4
5          (X5-X4)(X5-X1)(X5-X3)(X5-X2)     X3, X2, X5
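The ring states above suggest the following scheme: in round d every processor receives the values of the two processors at ring distance d and multiplies both differences into its running product. The sketch below simulates that sequentially and then evaluates the standard Lagrange sum. The function names are hypothetical, and the handling of an even number of processors (the opposite processor is counted once) is an assumption not shown on the slides.

```python
def ring_denominators(xs):
    """For each i, compute the product of (xs[i] - xs[j]) over all j != i,
    accumulating two factors per round as on a bidirectional ring."""
    n = len(xs)
    products = [1.0] * n
    for d in range(1, n // 2 + 1):
        for i in range(n):
            left = (i - d) % n
            right = (i + d) % n
            products[i] *= xs[i] - xs[left]
            # For even n, the processor at distance n/2 is reached from
            # both sides; its factor must be counted only once.
            if right != left:
                products[i] *= xs[i] - xs[right]
    return products


def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    denominators = ring_denominators(xs)
    total = 0.0
    for i in range(len(xs)):
        numerator = 1.0
        for j in range(len(xs)):
            if j != i:
                numerator *= x - xs[j]
        total += ys[i] * numerator / denominators[i]
    return total
```

For five points the loop runs two rounds, matching the two exchanges shown for the 5-processor ring.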
An open problem: If there exists a length-L Hamiltonian cycle in G (the group network), then there exists a length-L² Hamiltonian cycle in OTIS-G. Khaled Day et al., “Topological properties of OTIS-Networks,” IEEE TPDS, Vol. 13, No. 4, April 2002.
Figure: Hamiltonian cycle on an OTIS network with 5 groups of 5 processors, nodes labelled (1,1) to (5,5).
Lemma 1: Starting from (1, T+1), where T = ⌈L/2⌉, a Hamiltonian cycle can always be found if L is odd.
Proof: In short notation, the cycle is: (1, T+1) CR (1, T) OM (T, 1) CR (T, L) OM (L, T) CR (L, T-1) OM (T-1, L) CR (T-1, L-1) OM (L-1, T-1) CR (L-1, T-2) … (T+1, 1) OM (1, T+1)
An example for L = 7: (1,5) CR (1,4) OM (4,1) CR (4,7) OM (7,4) CR (7,3) OM (3,7) CR (3,6) OM (6,3) CR (6,2) OM (2,6) CR (2,5) OM (5,2) CR (5,1) OM (1,5).
Groups visited in order: S = {1, 4, 7, 3, 6, 2, 5, 1}. Two divisions: S1 = {1, 7, 6, 5}, S2 = {4, 3, 2, 1}.
Figure: ring of groups labelled 1 to 7.
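Fails for even values of L. For example, L = 6: (1,4) CR (1,3) OM (3,1) CR (3,6) OM (6,3) CR (6,2) OM (2,6) CR (2,5) OM (5,2) CR (5,1) OM (1,5).
S = {1, 3, 6, 2, 5, 1}, S1 = {1, 6, 5} and S2 = {3, 2, 1}. The walk re-enters group 1 at (1,5) instead of the start (1,4), and group 4 is never visited.
Figure: ring of groups labelled 1 to 6.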
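The construction in Lemma 1 and its failure for even L can be checked by simulation. The sketch below rests on one reading of the short notation, which is an assumption: CR is taken as a traversal of the whole ring of a group in increasing index order, from the entry processor p around to p - 1, and OM as the optical move from (g, p) to (p, g). The function names are hypothetical.

```python
import math


def otis_ring_walk(L):
    """Follow the CR/OM pattern of Lemma 1 on an OTIS-Ring with L groups (L >= 3).

    Returns the list of visited (group, processor) nodes and a flag telling
    whether the final optical move leads back to the start (1, T + 1).
    """
    T = math.ceil(L / 2)
    g, p = 1, T + 1
    start = (g, p)
    walk = []
    for _ in range(L):
        # CR: visit all L processors of group g, starting at p, ending at p - 1.
        for step in range(L):
            walk.append((g, (p - 1 + step) % L + 1))
        exit_proc = walk[-1][1]
        # OM: optical move from (g, exit_proc) to (exit_proc, g).
        g, p = exit_proc, g
    return walk, (g, p) == start


def is_hamiltonian_cycle(L):
    """True if the walk closes and visits each of the L * L nodes exactly once."""
    walk, closed = otis_ring_walk(L)
    return closed and len(set(walk)) == L * L
```

Under this reading, `is_hamiltonian_cycle(7)` holds and the groups are entered in the order 1, 4, 7, 3, 6, 2, 5, while `is_hamiltonian_cycle(6)` fails, in line with the two examples above.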
Figure: Hamiltonian cycle on the same 5-group network with nodes labelled from 0, (0,0) to (4,4).
References on OTIS Models
[1] G. C. Marsden, P. J. Marchand, P. Harvey and S. C. Esener, “Optical transpose interconnection system architectures,” Optics Letters, Vol. 18, No. 13, pp. 1083-1085, July 1993.
[2] F. Zane, P. Marchand, R. Paturi and S. Esener, “Scalable network architectures using the optical transpose interconnection system (OTIS),” J. of Parallel and Distributed Computing, Vol. 60, No. 5, pp. 521-538, 2000.
[3] C. F. Wang and S. Sahni, “OTIS optoelectronic computers,” Parallel Computation Using Optical Interconnections, K. Li, Y. Pan and S. Q. Zhang, Eds., Kluwer Academic, 1998.
[4] S. Sahni, “Models and Algorithms for Optical and Optoelectronic Parallel computers,”
[5] C. F. Wang and S. Sahni, “Basic operations on the OTIS-Mesh optoelectronic computer,” IEEE Trans. on Parallel and Distributed Systems, Vol. 9, No. 12, pp. 1226-1236, December 1998.
[6] Egecioglu and A. Srinivasan, “Optimal parallel prefix on mesh architectures,” Parallel Algorithms and Applications 1 (1993), 191-209.
[7] P. K. Jana, B. D. Naidu, S. Kumar, M. Arora and B. P. Sinha, “Parallel prefix computation on extended multi-mesh network,” Information Processing Letters, Vol. 84, No. 6, pp. 295-303, October 2002.
[8] S. G. Akl, The Design and Analysis of Parallel Algorithms. Englewood Cliffs, NJ: Prentice Hall, 1989.
[9] S. Rajasekaran and S. Sahni, “Randomized routing, selection, and sorting on the OTIS-Mesh optoelectronic computer,” IEEE Trans. on Parallel and Distributed Systems, Vol. 9, No. 9, pp. 833-840, 1998.
[10] C. F. Wang and S. Sahni, “Image processing on the OTIS-Mesh optoelectronic computer,” IEEE Trans. on Parallel and Distributed Systems, Vol. 11, No. 2, pp. 97-109, February 2000.
[11] C. F. Wang and S. Sahni, “Matrix multiplication on the OTIS-Mesh optoelectronic computer,” IEEE Trans. on Computers, Vol. 50, No. 7, pp. 635-646, July 2001.
[12] A. Osterloh, “Sorting on the OTIS-Mesh,” Proc. 14th Int’l. Parallel and Distributed Processing Symposium (IPDPS 2000), pp. 269-274, 2000.
[13] S. Sahni and C. F. Wang, “BPC permutations on the OTIS-Mesh optoelectronic computer,” Proc. Fourth Int’l. Conference on Massively Parallel Processing Using Optical Interconnections (MPPOI ’97), pp. 130-135, 1997.
[14] Chih-Fang Wang and S. Sahni, “OTIS Optoelectronic Computers,” Parallel Computation Using Optical Interconnections, K. Li, Y. Pan and S. Q. Zhang, Eds., Kluwer Academic, 1998.
Thank You
Modified Prefix on Mesh (figure: contents of a mesh holding 36 data values after each step)
Step 1: data initially stored
Step 2: 0.5n^(1/4) - 1 steps
Step 3: 2 steps
Step 4: 2 steps
Step 5: n^(1/4) - 1 steps
Step 6: 0.5n^(1/4) + 1 steps
Total time: 2n^(1/4) + 3 steps
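As a check, the step counts of Steps 2 to 6 add up to the stated total:

```latex
\left(0.5\,n^{1/4} - 1\right) + 2 + 2 + \left(n^{1/4} - 1\right) + \left(0.5\,n^{1/4} + 1\right)
= 2\,n^{1/4} + 3 .
```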