Parallel Covariance Matrix Creation Final Presentation Supervisor: Oded Green
Table of Contents - Overview • Introduction • Building the covariance matrix • The naïve algorithm • Our algorithm • Terminology • The Algorithm • Optimizations • Results • MVM on Plurality • The MVM algorithm • Plurality Platform • Results • Future Projects • Conclusions Parallel Covariance Matrix Creation - Final Presentation
Project’s Goals • Developing a parallel algorithm for the creation of a covariance matrix • Compatibility with Plurality’s HAL platform • Maximized parallelization and core utilization • Integrating the algorithm into Elta’s MVM (Minimum Variance Method) algorithm implementation
MVM Algorithm MVM is a modern 2-D spectral estimation algorithm used by Elta’s Synthetic Aperture Radar (SAR). The MVM algorithm: • Improves resolution • Removes side lobe artifacts (noise) • Reduces speckle compared to what is possible with conventional Fourier transform SAR imaging techniques • One of MVM’s main building blocks is the creation of a covariance matrix
Plurality Platform Plurality’s HyperCore Architecture Line (HAL) family of massively parallel manycore processors features: • Unique task-oriented programming model • Near-serial programmability • High performance at low cost per watt per square millimeter • Unique shared memory architecture - 2 MB cache size
Implementing the Naïve Algorithm • Motivation: implementing the naïve algorithm gives us a better understanding of the parallelization problem.
The Naïve Algorithm • Chip [NxM]
The Naïve Algorithm • Sub-aperture [N1xM1]
The Naïve Algorithm • Every sub-aperture holds its own covariance matrix Cov
The Naïve Algorithm • The covariance matrix Cov is the sum of the Cov matrices of all sub-apertures
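The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the function name `naive_cov`, the column-by-column flattening of each sub-aperture, and the tiny 4x4 test chip are all made up for the example. The conjugated product follows the RES=M1∙M2* convention used later in the deck.

```python
def naive_cov(chip, n1, m1):
    """Sum the covariance matrices of all n1 x m1 sub-apertures of chip.

    chip is a list of rows of complex numbers. Flattening each
    sub-aperture column by column is an assumption of this sketch.
    """
    rows, cols = len(chip), len(chip[0])
    size = n1 * m1
    cov = [[0j] * size for _ in range(size)]
    multiplies = 0
    for top in range(rows - n1 + 1):
        for left in range(cols - m1 + 1):
            # Flatten the current sub-aperture into a vector (column-major).
            v = [chip[top + r][left + c] for c in range(m1) for r in range(n1)]
            # Add this sub-aperture's own Cov (outer product v v^H) to the total.
            for i in range(size):
                for j in range(size):
                    cov[i][j] += v[i] * v[j].conjugate()
                    multiplies += 1
    return cov, multiplies


# Hypothetical tiny example: 4x4 chip, 3x3 sub-aperture.
chip = [[complex(r + 1, c) for c in range(4)] for r in range(4)]
cov, multiplies = naive_cov(chip, 3, 3)
print(len(cov), multiplies)  # 9 324 (4 sub-apertures x 81 products each)
```

Even in this toy case most of the 324 products are repeats, because neighbouring sub-apertures share most of their pixels.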
The Naïve Algorithm • Shortcomings • Each multiplication is executed many times • For a 32x32 chip, the total number of multiplications is 11.4M, while the optimal number is 208K (x28!) • The naïve algorithm is difficult to parallelize, for two main reasons: • Simultaneous writes to the same Rcells require mutexes • Holding a Cov matrix for every permutation (250 KB each) costs too much memory
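The quoted counts can be reproduced with a short calculation. The sketch below assumes the 13x13 sub-aperture mentioned later in the deck, and it assumes "optimal" means one product per distinct pixel pair whose offset fits inside a sub-aperture, with a pair and its conjugate counted once. The function names are hypothetical.

```python
def naive_multiplies(n, m, n1, m1, upper_only=False):
    """Products computed when every sub-aperture builds its own Cov."""
    windows = (n - n1 + 1) * (m - m1 + 1)
    size = n1 * m1
    per_window = size * (size + 1) // 2 if upper_only else size * size
    return windows * per_window


def unique_multiplies(n, m, n1, m1):
    """Distinct pixel-pair products, a pair and its conjugate counted once."""
    # Ordered pixel pairs whose offset fits in a sub-aperture.
    ordered = sum((n - abs(dr)) * (m - abs(dc))
                  for dr in range(-(n1 - 1), n1)
                  for dc in range(-(m1 - 1), m1))
    self_pairs = n * m  # offset (0, 0): a pixel times its own conjugate
    return (ordered - self_pairs) // 2 + self_pairs


naive = naive_multiplies(32, 32, 13, 13)
naive_upper = naive_multiplies(32, 32, 13, 13, upper_only=True)
unique = unique_multiplies(32, 32, 13, 13)
print(naive, unique)  # 11424400 207880
print(naive / unique, naive_upper / unique)
```

Under these assumptions the two totals match the 11.4M and 208K on the slide. Their ratio comes to about 55. The quoted x28 is matched only if the naïve algorithm is assumed to compute just the upper triangle of each Hermitian sub-aperture matrix, which is a guess of this sketch rather than something the deck states.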
The Naïve Algorithm • Disadvantages on the Plurality platform • Mutexes add complexity • Memory space: the cache size is only 2 MB • The problem requires a different solution!
Our Algorithm • A whole different ball game!
But first … Before presenting the algorithm, we need to establish a common language for the terms we have created.
Terminology – Permutation • Examples: • Permutation [1,0] • Permutation [1,1] (figure labels: M1, M2)
Terminology – Block (figure labels: M1, M2, Block)
Terminology – BNW
Terminology – Shifting • Shift only upwards and leftwards • The block is always inside the shifted window • A shift of (0,0) is named the zero iteration (figure labels: Block, M1, M2)
Terminology – Cov: the covariance matrix [M∙N, M∙N]
Terminology – Rcell
Our Algorithm – Key Features • Parallel • Each multiplication is executed once (208K for a 32x32 chip) • Memory efficient • Generic • Concept: each Rcell in Cov is calculated by one specific permutation. This enables different permutations to work simultaneously.
Our Algorithm (simplified)
1. For each permutation (1:313)
  1.1. For each legal BNW
    1.1.1. Multiply the two multipliers
    1.1.2. For each legal shift (including the zero iteration)
      1.1.2.1. Add the multiplication product to the matching Rcell in Cov
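The loop structure above can be sketched in Python as follows. The deck defines its terms mostly in figures, so the mapping here is an interpretation, not the authors' code: a permutation is taken to be an offset (dr, dc) between two cells, a legal BNW is taken to be each placement of that cell pair in the chip, and a legal shift is each sub-aperture window that contains both cells. Filling the mirrored Rcell with the conjugate, the column-by-column cell numbering, and the name `fast_cov` are further assumptions.

```python
def fast_cov(chip, n1, m1):
    """Build Cov with one multiplication per distinct pixel pair."""
    rows, cols = len(chip), len(chip[0])
    size = n1 * m1
    cov = [[0j] * size for _ in range(size)]
    multiplies = 0
    # One permutation per offset; an offset and its negation count once.
    offsets = [(dr, dc) for dr in range(n1) for dc in range(-(m1 - 1), m1)
               if dr > 0 or dc >= 0]
    for dr, dc in offsets:
        # Every placement of the cell pair inside the chip.
        for r1 in range(rows - dr):
            for c1 in range(max(0, -dc), cols - max(0, dc)):
                r2, c2 = r1 + dr, c1 + dc
                res = chip[r1][c1] * chip[r2][c2].conjugate()
                multiplies += 1
                # Every window position (top, left) that contains both cells.
                top_lo, top_hi = max(0, r2 - n1 + 1), min(r1, rows - n1)
                left_lo = max(0, max(c1, c2) - m1 + 1)
                left_hi = min(min(c1, c2), cols - m1)
                for top in range(top_lo, top_hi + 1):
                    for left in range(left_lo, left_hi + 1):
                        # Column-major cell numbers inside the window.
                        i = (c1 - left) * n1 + (r1 - top)
                        j = (c2 - left) * n1 + (r2 - top)
                        cov[i][j] += res
                        if i != j:
                            cov[j][i] += res.conjugate()
    return cov, multiplies


# Hypothetical tiny example: 4x4 chip, 3x3 sub-aperture.
chip = [[complex(r + 1, c) for c in range(4)] for r in range(4)]
cov, multiplies = fast_cov(chip, 3, 3)
print(multiplies)  # 106
```

For this toy chip, summing the outer products of all four 3x3 sub-apertures would take 324 multiplications. The sketch needs 106 and produces the same matrix.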
Our Algorithm (simplified): Finding all unique permutations
• Iterative algorithm
1. Initialize the Delta(x,y) set D and the Permutation(x,y) set P
2. For each pair of cells (M1,M2) in an N1xM1 matrix
  2.1. If |M1-M2| is not in D
    2.1.1. Add |M1-M2| to D
    2.1.2. Add (M1,M2) to P
• The unique permutation count is 313 (for a [13x13] sub-aperture)
• Executed off-line
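A minimal Python sketch of this off-line step follows. It assumes that |M1-M2| means the two-dimensional offset between the cells, with an offset and its negation treated as the same permutation. That reading is an assumption, chosen because it reproduces the count of 313. The function name is hypothetical.

```python
def unique_permutations(n1, m1):
    """Return one representative (M1, M2) cell pair per unique offset."""
    cells = [(r, c) for r in range(n1) for c in range(m1)]
    deltas = set()      # D: offsets seen so far
    permutations = []   # P: first cell pair found for each offset
    for a in cells:
        for b in cells:
            delta = (b[0] - a[0], b[1] - a[1])
            # Treat an offset and its negation as the same permutation.
            if delta < (0, 0):
                delta = (-delta[0], -delta[1])
            if delta not in deltas:
                deltas.add(delta)
                permutations.append((a, b))
    return permutations


print(len(unique_permutations(13, 13)))  # 313
```

The count also follows directly: a 13x13 sub-aperture allows 25 x 25 = 625 signed offsets, and pairing each non-zero offset with its negation leaves (625 - 1) / 2 + 1 = 313.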
Our Algorithm (simplified) • Cov: the covariance matrix [M∙N, M∙N] • Chip [NxM]
Our Algorithm (simplified) • For a given permutation [1,1] (cells M1 and M2)
Our Algorithm (simplified) • There is a block
Our Algorithm (simplified) • Legal BNWs for this block
Our Algorithm (simplified) • For a given BNW
Our Algorithm (simplified) • RES = M1∙M2*
Our Algorithm (simplified) • The multipliers' numbering
Our Algorithm (simplified) • The zero iteration: Rcell (1,5) (figure labels: RES, Block, Diag(5-1), Main Diag)
Our Algorithm (simplified) • Shifting: Rcell (2,6) (figure labels: RES, Block, Diag(5-1), Main Diag)
Our Algorithm (simplified) • Shifting (figure labels: 5, 9, RES, Block, Diag(5-1), Main Diag)
Our Algorithm (simplified) • We found a regularity in the offset of the Rcell coordinates when shifting: • Leftwards: (+sub-aperture size, +sub-aperture size) • Upwards: (+1, +1)
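This regularity can be checked with a small Python sketch. It assumes that the cells of a sub-aperture are numbered column by column starting from 1, and that the walkthrough figures use a 3x3 sub-aperture. Both are inferred from the numbers shown on the slides, (1,5), (2,6) and the labels 5 and 9, and are not stated in the deck. The helper names are hypothetical.

```python
def cell_number(cell, n1):
    """1-based, column-by-column number of a 0-based (row, col) cell."""
    row, col = cell
    return col * n1 + row + 1


def rcell(cell1, cell2, n1):
    """Rcell coordinates addressed by the two multipliers' cells."""
    return cell_number(cell1, n1), cell_number(cell2, n1)


def shifted(cell, up=0, left=0):
    """Window position of a fixed chip pixel after the window moves up/left."""
    return cell[0] + up, cell[1] + left


n1 = 3                              # assumed sub-aperture size (3x3)
m1_cell, m2_cell = (0, 0), (1, 1)   # permutation [1,1] at the zero iteration

print(rcell(m1_cell, m2_cell, n1))                                 # (1, 5)
print(rcell(shifted(m1_cell, up=1), shifted(m2_cell, up=1), n1))   # (2, 6)
print(rcell(shifted(m1_cell, up=1, left=1),
            shifted(m2_cell, up=1, left=1), n1))                   # (5, 9)
```

Moving the window up by one row adds 1 to both coordinates, and moving it left by one column adds the sub-aperture size (3 here), so every Rcell of this permutation stays on the same diagonal.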
Our Algorithm (simplified) • Each color represents a different permutation
Our Algorithm (simplified) Summary For a given permutation: • RES is always written into the same group of Rcells • All on the same diagonal • Not necessarily all of the diagonal's cells • There is no overlap between the Rcells of different permutations • This is the basis for parallelism! • Each shift writes to one unique Rcell • Theoretically, this enables parallelism at Rcell granularity (an instance per Rcell)
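The summary properties can be illustrated on a small case. The Python sketch below groups the upper-triangle Rcells of a 3x3 sub-aperture by permutation. It assumes column-by-column cell numbering and treats a permutation as the offset between the two cells. The function name is hypothetical.

```python
from collections import defaultdict


def rcells_by_permutation(n1, m1):
    """Map each permutation (cell offset) to its upper-triangle Rcells."""
    cells = [(r, c) for c in range(m1) for r in range(n1)]  # column-major
    groups = defaultdict(set)
    for i, a in enumerate(cells):
        for j in range(i, len(cells)):
            b = cells[j]
            offset = (b[0] - a[0], b[1] - a[1])
            groups[offset].add((i, j))
    return groups


groups = rcells_by_permutation(3, 3)

# Each permutation's Rcells share one diagonal (constant j - i).
diagonal_of = {}
for offset, rcells in groups.items():
    diagonals = {j - i for i, j in rcells}
    assert len(diagonals) == 1
    diagonal_of[offset] = diagonals.pop()

# No Rcell belongs to two permutations: group sizes add up to the triangle.
total = sum(len(rcells) for rcells in groups.values())
print(len(groups), len(set(diagonal_of.values())), total)  # 13 9 45
```

There are 13 permutations but only 9 diagonals, so some diagonals are shared by two permutations. This is consistent with a permutation covering "not necessarily all" cells of its diagonal. The Rcell groups themselves never overlap, which is what lets different permutations write to Cov without locking.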