Evolutionary Computation on the Connex Architecture
István Lőrentz (1), Mihaela Malita (2), Răzvan Andonie (3, presenter)
(1) Electronics and Computers Department, Transylvania University of Brasov, Romania
(2) Computer Science Department, Saint Anselm College, Manchester, NH, USA
(3) Computer Science Department, Central Washington University, Ellensburg, WA, USA
MAICS 2011: The 22nd Midwest Artificial Intelligence and Cognitive Science Conference
Presentation Outline • The Connex Architecture (more in Prof. Gheorghe M. Stefan) • Evolutionary Algorithms (EA) • Parallelizing EA on Connex • Example problems • Results • Conclusions
The Connex Chip
• The Connex Array:
  • Many-core data-parallel area of 1024 Processing Cells (PC)
  • Area: ~50 mm² for the 1024-PC array, including 1 MB of memory and the two controllers
  • Clock speed: 400 MHz
• Also on the chip:
  • Multi-core area: 4 MIPS cores
  • Speculative parallel pipe of 8 PE
  • Interfaces: DDR, PCI; video and audio interfaces for 2 HDTV channels
• Total power: ~5 W
• Total area: 82 mm²
• 65 nm implementation
The Connex Array
• Sequencer
  • Issues in each cycle (on a 2-stage pipe) one instruction for the Connex Array and one instruction for itself
• I/O Controller
  • Controls a 6.4 GB/s I/O channel
  • Works in parallel with code running on the Connex Array
• Processing Cell
  • Integer unit
  • Data memory
  • Boolean (predicate) unit
Genetic Algorithms (GA)
• Chromosomes represented as vectors of integer components in Connex
• Maximum chromosome length: 1024 elements
• Population forms a matrix
• Processing blocks are parallelized
Flowchart: Initialize population randomly → Crossover → Mutation → Evaluation → Select new generation → Convergence or limit? (No: back to Crossover; Yes: STOP)
Evolution Strategy (ES)
• Similar algorithm to GA
• Population and mutation parameters encoded in vectors
• Recombination forms a new individual from multiple parents
• Mutation adds a Gaussian-distributed random variable to each vector component
• Deterministic selection of the new generation, based on fitness ranking
Flowchart: Initialize population randomly → Recombination → Mutation → Evaluation → Select new parent generation → Convergence or limit? (No: back to Recombination; Yes: STOP)
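The ES loop on this slide can be sketched in plain Python. This is a minimal sequential sketch under stated assumptions, not the Connex implementation: the names `es_step` and `sphere`, the fixed step size `sigma`, the two-parent averaging used for recombination, and the (mu + lambda) form of selection are all illustrative choices that do not come from the slides.

```python
import random

def sphere(x):
    """Hypothetical fitness to minimize: sum of squared components."""
    return sum(v * v for v in x)

def es_step(parents, n_offspring, sigma, fitness, rng):
    """One generation of a (mu + lambda) evolution strategy (illustrative)."""
    mu = len(parents)
    offspring = []
    for _ in range(n_offspring):
        # Recombination: average two randomly chosen parents component-wise.
        a, b = rng.sample(parents, 2)
        child = [(u + v) / 2.0 for u, v in zip(a, b)]
        # Mutation: add a Gaussian-distributed value to every component.
        child = [c + rng.gauss(0.0, sigma) for c in child]
        offspring.append(child)
    # Deterministic selection: keep the mu best individuals by fitness rank.
    ranked = sorted(parents + offspring, key=fitness)
    return ranked[:mu]

rng = random.Random(0)
population = [[rng.uniform(-5.0, 5.0) for _ in range(8)] for _ in range(4)]
initial_best = min(sphere(p) for p in population)
for _ in range(50):
    population = es_step(population, 12, 0.1, sphere, rng)
final_best = min(sphere(p) for p in population)
```

Because parents compete with their offspring in this variant, the best fitness can never get worse from one generation to the next. The component-wise recombination and mutation steps are the parts that map naturally onto one processing cell per vector component.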
Parallel Crossover
• Combines genes of two individuals (parents)
• Example: 1-point crossover at a random position, in Vector-C:
    vector crossover(vector X, vector Y) {
        vector C;                           // offspring
        int position = rand(VECTOR_SIZE);   // random crossover point
        where (i < position) C = X;         // cells left of the point copy X
        elsewhere C = Y;                    // remaining cells copy Y
        return C;
    }
• Uses Connex's parallel-if construct: where (cond) {…} elsewhere {…}
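For readers without access to the Connex toolchain, the masked selection behind where/elsewhere can be emulated sequentially. The sketch below is an assumption-laden illustration in plain Python: the helper `where` and the explicit `position` argument are hypothetical names introduced here, and a Python list stands in for a Connex vector.

```python
import random

def where(mask, if_true, if_false):
    """Emulate where(cond) {...} elsewhere {...} as a per-element select."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]

def crossover(x, y, position):
    """One-point crossover: take x before `position`, y from it onward."""
    # range(len(x)) plays the role of the per-cell index i.
    mask = [i < position for i in range(len(x))]
    return where(mask, x, y)

x = [1, 1, 1, 1, 1, 1]
y = [0, 0, 0, 0, 0, 0]
child = crossover(x, y, 2)   # first two genes from x, the rest from y
random_child = crossover(x, y, random.randrange(len(x) + 1))
```

Every element of the result depends only on its own index and the shared crossover point, which is why the operation needs no communication between cells.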
Parallel Mutation
• A single position is selected at random:
    vector mutate(vector X) {
        int pos = rand(VECTOR_SIZE);    // random position
        float amount = rand11();        // random mutation amount
        where (i == pos) X += amount;   // only the cell whose index matches is updated
        return X;
    }
• The operation affects only the selected position
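The same single-position update can be emulated with an equality mask. This is a hedged sketch in plain Python, not Vector-C: `mutate` takes the position and amount as explicit arguments for testability, and drawing `amount` uniformly from [-1, 1] is only an assumption about what `rand11()` returns.

```python
import random

def mutate(x, pos, amount):
    """Add `amount` to the element at index `pos`; leave the others unchanged."""
    mask = [i == pos for i in range(len(x))]   # true in exactly one cell
    return [v + amount if m else v for v, m in zip(x, mask)]

rng = random.Random(1)
x = [0.0, 0.0, 0.0, 0.0]
pos = rng.randrange(len(x))
amount = rng.uniform(-1.0, 1.0)   # assumed meaning of rand11()
mutated = mutate(x, pos, amount)
```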
Evaluation of the fitness function
The fitness functions that can be evaluated efficiently on Connex are those composed of:
1. a data-parallel stage (local computation on each PC), followed by
2. a parallel reduction (sum)
For example:
- Sum of squared differences
- Knapsack problem: sum of weighted items
- Travelling salesman problem: sum of distances between cities in a route
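The two-stage pattern can be illustrated with the first example, the sum of squared differences. The sketch below is plain Python under stated assumptions: `tree_sum` models a reduction as pairwise halving purely to show the shape of the computation, and its step count is a property of this sketch, not a statement about how the Connex reduction network is built.

```python
def tree_sum(values):
    """Sum by repeated pairwise addition; also report the number of levels."""
    level = list(values)
    steps = 0
    while len(level) > 1:
        if len(level) % 2:       # pad odd-length levels with a neutral element
            level.append(0)
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        steps += 1
    return (level[0] if level else 0), steps

def sum_squared_differences(x, target):
    # Stage 1: data-parallel, each element is computed independently.
    local = [(a - b) ** 2 for a, b in zip(x, target)]
    # Stage 2: reduction of the local results to a single scalar.
    return tree_sum(local)

total, steps = sum_squared_differences([1, 2, 3, 4], [1, 1, 1, 1])
big_total, big_steps = tree_sum(range(1024))
```

In this model a 1024-element vector is reduced in 10 pairwise levels, whereas a sequential loop would need 1023 additions.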
Example 1: The Rosenbrock function
Benchmark problem for optimization
Vector-C implementation:
    where (i < N) Xsh = rotateLeft(X, 1);   // Xsh[i] = X[i+1]
    where (i < (N-1)) {
        X2 = X * X;
        Xsh -= X2;          // X[i+1] - X[i]^2
        Xsh *= Xsh * 100;   // 100 * (X[i+1] - X[i]^2)^2
        X2 = 1 - X;
        X2 = X2 * X2;       // (1 - X[i])^2
        X2 += Xsh;
    }
    return sumv(X2);
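The Vector-C code computes the standard N-dimensional Rosenbrock function, f(x) = sum over i = 1..N-1 of [100 (x_{i+1} - x_i^2)^2 + (1 - x_i)^2]. A sequential Python reference that follows the same steps is sketched below; `rotate_left` is a hypothetical stand-in for `rotateLeft`, and the sketch assumes that cells outside the `where` mask contribute zero to the final sum.

```python
def rotate_left(x, k):
    """Rotate a list left by k positions, so result[i] == x[(i + k) % len(x)]."""
    return x[k:] + x[:k]

def rosenbrock(x):
    n = len(x)
    xsh = rotate_left(x, 1)          # xsh[i] holds x[i + 1]
    terms = [0.0] * n                # masked-out cells stay at zero (assumption)
    for i in range(n - 1):           # corresponds to where (i < N - 1)
        d = xsh[i] - x[i] * x[i]
        terms[i] = 100.0 * d * d + (1.0 - x[i]) ** 2
    return sum(terms)                # corresponds to sumv(X2)

at_minimum = rosenbrock([1.0, 1.0, 1.0])
at_origin = rosenbrock([0.0, 0.0])
```

The global minimum is at x = (1, …, 1), where every term vanishes, which gives a quick sanity check for any implementation.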
Example 2: The molecular distance geometry problem (MDGP)
The problem: given a set of distance measurements between atoms, determine their Cartesian coordinates
Formulated as a global optimization problem: minimize the mismatch between the distances computed from the coordinates and the given distances
• Not all distances are known
• Some distances can be given as upper and lower bounds
Representing MDGP on Connex
• Each given distance d(i,j) is mapped to a processing element
• Some PCs share vertices
• Shared vertices also share random generator seeds
• No interprocessor communication (except parallel reduction)
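Running MDGP on Connex
Evaluate distances
Xi, Yi, Xj, Yj = coordinates of the two vertices of each distance; D = vector of known distances
    float evaluateDist(vector Xi, vector Yi, vector Xj, vector Yj, vector D) {
        vector Dx, Dy;
        Dx = Xi - Xj;   // per-cell coordinate differences
        Dy = Yi - Yj;
        Dx *= Dx;
        Dy *= Dy;
        Dx += Dy;       // squared distance in each cell
        return sumAbsDiff(Dx, D);
    }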
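The evaluation step can be checked on a small 2D example in plain Python. This is a sketch under stated assumptions: `gather` and `evaluate_dist` are hypothetical helper names, one list element stands in for one processing cell, and the target vector is assumed to hold squared distances, because the slide code compares squared coordinate differences against D directly.

```python
def gather(vertices, edges):
    """Lay out both endpoints of every edge as per-edge coordinate vectors."""
    xi = [vertices[i][0] for i, _ in edges]
    yi = [vertices[i][1] for i, _ in edges]
    xj = [vertices[j][0] for _, j in edges]
    yj = [vertices[j][1] for _, j in edges]
    return xi, yi, xj, yj

def evaluate_dist(xi, yi, xj, yj, d_squared):
    """Sum of absolute differences between computed and target squared distances."""
    dx = [a - b for a, b in zip(xi, xj)]
    dy = [a - b for a, b in zip(yi, yj)]
    squared = [u * u + v * v for u, v in zip(dx, dy)]            # local stage
    return sum(abs(s - t) for s, t in zip(squared, d_squared))   # reduction

edges = [(0, 1), (0, 2), (1, 2)]
d_squared = [9.0, 16.0, 25.0]        # a 3-4-5 right triangle
exact = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
perturbed = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
error_exact = evaluate_dist(*gather(exact, edges), d_squared)
error_perturbed = evaluate_dist(*gather(perturbed, edges), d_squared)
```

A vertex that belongs to several edges appears in several positions of the gathered vectors, which mirrors the slide's remark that some PCs share vertices.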
Results
Summary of operations: parallel instruction counts, sequential instruction counts, and speedups, where N = 1024 is the vector size.
Conclusions
- The Connex chip is well suited to parallelizing evolutionary algorithms through vectorization
- With horizontal data mapping, a certain class of optimization problems can benefit from the parallel reduction