Sparse code optimization
Automatic transformation of linked list pointer structures
Sven Groot

Restructuring compiler
• Special type of optimizing compiler
• Restructures source code (e.g. to enable vectorization or parallelization)
• Techniques:
  • Loop interchange
  • Strip mining
  • Loop collapsing
  • Loop fusion and fission
  • Data structure transformation
• A sparse compiler is a type of restructuring compiler
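Of the techniques listed, strip mining is the only loop transformation the slides do not illustrate. The following is a minimal sketch of the idea, not code from the presentation: the names ScalePlain and ScaleStripMined and the stripSize parameter are assumptions chosen for illustration.

```c
#include <assert.h>

/* Plain loop: one pass over all n elements. */
void ScalePlain(float *data, int n, float factor)
{
    int i;
    for( i = 0; i < n; ++i )
        data[i] *= factor;
}

/* Strip-mined loop: the same iteration space, split into strips of at most
   stripSize elements. The outer loop walks the strips; the inner loop has a
   short, fixed-size trip count that a compiler can map onto vector
   registers. stripSize is a hypothetical tuning parameter. */
void ScaleStripMined(float *data, int n, float factor, int stripSize)
{
    int start, end, i;
    assert(stripSize > 0);
    for( start = 0; start < n; start += stripSize ) {
        end = (start + stripSize < n) ? start + stripSize : n;
        for( i = start; i < end; ++i )
            data[i] *= factor;
    }
}
```

Both functions visit every element exactly once and in the same order, so they produce identical results; only the loop structure differs.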
Restructuring compiler (cont’d)
• Loop interchange example
    void MatrixMultiply1(float **result, float **left, float **right, int size, int rightWidth)
    {
        int row, col, x;
        for( row = 0; row < size; ++row ) {
            for( col = 0; col < rightWidth; ++col ) {
                for( x = 0; x < size; ++x ) {
                    result[row][col] += left[row][x] * right[x][col];
                }
            }
        }
    }
Restructuring compiler (cont’d)
• Loop interchange result
    void MatrixMultiply2(float **result, float **left, float **right, int size, int rightWidth)
    {
        int row, col, x;
        for( row = 0; row < size; ++row ) {
            for( x = 0; x < size; ++x ) {
                for( col = 0; col < rightWidth; ++col ) {
                    result[row][col] += left[row][x] * right[x][col];
                }
            }
        }
    }
Restructuring compiler (cont’d)
• Pointers pose problems
    void MatrixMultiply3(float **result, float **left, float **right, int size, int rightWidth)
    {
        int row, col;
        float **tempRight;
        float *tempLeft;
        for( row = 0; row < size; ++row ) {
            for( col = 0; col < rightWidth; ++col ) {
                tempRight = right;
                tempLeft = left[row];
                while( tempRight < right + size ) {
                    result[row][col] += *tempLeft * (*tempRight)[col];
                    ++tempRight;
                    ++tempLeft;
                }
            }
        }
    }
Linked list matrix
• [Diagram: a Matrix node with Col and Row pointers; ColHead nodes (Index=1, 2, 3) and RowHead nodes (Index=1, 3), each chained by Next; every head points to its first Cell; Cells are chained by RowNext and ColNext]
Linked list matrix (cont’d)
    struct Cell
    {
        float Value;
        int ColIndex;
        int RowIndex;
        struct Cell *RowNext; // Cell in the next row
        struct Cell *ColNext; // Cell in the next column
    };
    struct RowHead
    {
        int RowIndex;
        struct Cell *Cell;
        struct RowHead *Next;
    };
    struct ColHead
    {
        int ColIndex;
        struct Cell *Cell;
        struct ColHead *Next;
    };
    struct Matrix
    {
        int Dimensions;
        struct ColHead *Col;
        struct RowHead *Row;
    };
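To make the layout concrete, here is a sketch of how such a matrix could be built from a dense array. The helper MatrixFromDense is hypothetical and does not appear in the slides; it assumes a square n x n input stored row by row and zero-based indices (as the multiplication code uses), and it only creates heads for rows and columns that contain at least one non-zero value.

```c
#include <assert.h>
#include <stdlib.h>

struct Cell { float Value; int ColIndex; int RowIndex;
              struct Cell *RowNext; struct Cell *ColNext; };
struct RowHead { int RowIndex; struct Cell *Cell; struct RowHead *Next; };
struct ColHead { int ColIndex; struct Cell *Cell; struct ColHead *Next; };
struct Matrix { int Dimensions; struct ColHead *Col; struct RowHead *Row; };

/* Hypothetical helper: builds the linked list matrix from a dense n x n
   array stored row by row. Zero entries get no Cell. */
struct Matrix MatrixFromDense(const float *dense, int n)
{
    struct Matrix m;
    struct RowHead *lastRow = NULL;
    struct ColHead *lastCol = NULL;
    struct Cell **lastInCol;   /* last cell created so far in each column */
    struct ColHead **colHeads; /* head of each non-empty column */
    int row, col;

    assert(n > 0);
    m.Dimensions = n; m.Col = NULL; m.Row = NULL;
    lastInCol = malloc(sizeof(struct Cell *) * n);
    colHeads = malloc(sizeof(struct ColHead *) * n);
    assert(lastInCol != NULL && colHeads != NULL);
    for( col = 0; col < n; ++col ) { lastInCol[col] = NULL; colHeads[col] = NULL; }

    for( row = 0; row < n; ++row ) {
        struct Cell *lastInRow = NULL;
        for( col = 0; col < n; ++col ) {
            struct Cell *cell;
            if( dense[row * n + col] == 0.0f )
                continue; /* omitted value: no cell */
            cell = malloc(sizeof(struct Cell));
            assert(cell != NULL);
            cell->Value = dense[row * n + col];
            cell->ColIndex = col;
            cell->RowIndex = row;
            cell->RowNext = NULL;
            cell->ColNext = NULL;
            if( lastInRow == NULL ) { /* first cell of this row: new RowHead */
                struct RowHead *head = malloc(sizeof(struct RowHead));
                assert(head != NULL);
                head->RowIndex = row; head->Cell = cell; head->Next = NULL;
                if( lastRow == NULL ) m.Row = head; else lastRow->Next = head;
                lastRow = head;
            } else {
                lastInRow->ColNext = cell; /* next column, same row */
            }
            lastInRow = cell;
            if( lastInCol[col] == NULL ) { /* first cell of this column */
                colHeads[col] = malloc(sizeof(struct ColHead));
                assert(colHeads[col] != NULL);
                colHeads[col]->ColIndex = col;
                colHeads[col]->Cell = cell;
                colHeads[col]->Next = NULL;
            } else {
                lastInCol[col]->RowNext = cell; /* next row, same column */
            }
            lastInCol[col] = cell;
        }
    }
    /* Chain the non-empty column heads in ascending column order. */
    for( col = 0; col < n; ++col ) {
        if( colHeads[col] == NULL ) continue;
        if( lastCol == NULL ) m.Col = colHeads[col]; else lastCol->Next = colHeads[col];
        lastCol = colHeads[col];
    }
    free(lastInCol);
    free(colHeads);
    return m;
}
```

For the 3 x 3 input {1,0,2, 0,0,0, 0,3,0} this yields row heads for rows 0 and 2 only, and column heads for columns 0, 1 and 2. The sketch never frees the cells and heads it allocates; a real program would need a matching cleanup routine.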
Linked list matrix (cont’d)
• Matrix multiplication using linked lists
    void MatrixMultiply(struct Matrix left, float **right, float **result, int rightWidth)
    {
        struct RowHead *leftRow = left.Row;
        struct Cell *leftCell;
        int dimensions = left.Dimensions;
        int col, row, x;
        for( col = 0; col < rightWidth; ++col ) {
            leftRow = left.Row;
            for( row = 0; row < dimensions; ++row ) {
                if( leftRow != NULL && leftRow->RowIndex < row )
                    leftRow = leftRow->Next;
                if( leftRow != NULL && leftRow->RowIndex == row ) {
                    leftCell = leftRow->Cell;
                    for( x = 0; x < dimensions; ++x ) {
                        if( leftCell != NULL && leftCell->ColIndex < x )
                            leftCell = leftCell->ColNext;
                        if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                            result[row][col] += leftCell->Value * right[x][col];
                        }
                    }
                }
            }
        }
    }
Linked list matrix (cont’d)
• Alternative matrix multiplication
    int MatrixMultiplyAlternative(struct Matrix left, float **right, float **result, int rightWidth)
    {
        struct RowHead *leftRow = left.Row;
        struct Cell *leftCell;
        int dimensions = left.Dimensions;
        int col;
        for( col = 0; col < rightWidth; ++col ) {
            leftRow = left.Row;
            while( leftRow != NULL ) {
                leftCell = leftRow->Cell;
                while( leftCell != NULL ) {
                    result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col];
                    leftCell = leftCell->ColNext;
                }
                leftRow = leftRow->Next;
            }
        }
        return 0;
    }
Transformation
• The goal: remove all references to the linked list from the loop
• The means: move linked list references into an initialization loop
  • Initialization copies linked list contents into an array
  • Transformed loop uses the array
• Two methods: sublimation and annihilation
• Must be done automatically
Sublimation
• Transforming the innermost loop
    for( x = 0; x < dimensions; ++x ) {
        if( leftCell != NULL && leftCell->ColIndex < x )
            leftCell = leftCell->ColNext;
        if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
            result[row][col] += leftCell->Value * right[x][col];
        }
    }
Sublimation (cont’d)
• Initialization
    leftCellArray = malloc(sizeof(float) * dimensions);
    for( x = 0; x < dimensions; ++x ) {
        if( leftCell != NULL && leftCell->ColIndex < x )
            leftCell = leftCell->ColNext;
        if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
            leftCellArray[x] = leftCell->Value;
        } else
            leftCellArray[x] = 0;
    }
• Transformed main loop
    for( x = 0; x < dimensions; ++x ) {
        result[row][col] += leftCellArray[x] * right[x][col];
    }
Sublimation (cont’d)
• Transforming the inner loop (alternative)
    while( leftCell != NULL ) {
        result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col];
        leftCell = leftCell->ColNext;
    }
• Initialization
    leftCellArray = malloc(sizeof(float) * dimensions);
    memset(leftCellArray, 0, sizeof(float) * dimensions);
    while( leftCell != NULL ) {
        leftCellArray[leftCell->ColIndex] = leftCell->Value;
        leftCell = leftCell->ColNext;
    }
• Transformed main loop
    for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
        result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * right[leftCellCounter][col];
    }
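The fragments above depend on surrounding variables, so here is a self-contained sketch of sublimation applied to a single row, reduced to a dot product with a dense vector. The struct RowCell and the functions RowDotSparse and RowDotSublimated are simplified stand-ins invented for this example, not part of the presentation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical cell: one row of the linked list matrix. */
struct RowCell { float Value; int ColIndex; struct RowCell *ColNext; };

/* Original sparse loop: walks the list, one iteration per stored cell. */
float RowDotSparse(const struct RowCell *cell, const float *vec)
{
    float sum = 0.0f;
    while( cell != NULL ) {
        sum += cell->Value * vec[cell->ColIndex];
        cell = cell->ColNext;
    }
    return sum;
}

/* Sublimated version: the initialization loop copies the list into a dense
   array (omitted values are filled in with 0), and the main loop no longer
   touches the list. Assumes every ColIndex lies in [0, dimensions). */
float RowDotSublimated(const struct RowCell *cell, const float *vec, int dimensions)
{
    float sum = 0.0f;
    float *cellArray;
    int x;
    assert(dimensions > 0);
    cellArray = malloc(sizeof(float) * dimensions);
    assert(cellArray != NULL);
    memset(cellArray, 0, sizeof(float) * dimensions); /* fill-in value */
    while( cell != NULL ) {                           /* initialization */
        cellArray[cell->ColIndex] = cell->Value;
        cell = cell->ColNext;
    }
    for( x = 0; x < dimensions; ++x )                 /* main loop */
        sum += cellArray[x] * vec[x];
    free(cellArray);
    return sum;
}
```

With cells (column 1, value 2) and (column 3, value 4) and the vector {1, 2, 3, 4, 5}, both functions return 20. The sublimated main loop runs 5 iterations instead of 2, but it is a plain array loop that a restructuring compiler can analyze.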
Loop extraction
• Putting it in context
    for( row = 0; row < dimensions; ++row ) {
        if( leftRow != NULL && leftRow->RowIndex < row )
            leftRow = leftRow->Next;
        if( leftRow != NULL && leftRow->RowIndex == row ) {
            leftCell = leftRow->Cell;
            /* initialization */
            leftCellArray = malloc(sizeof(float) * dimensions);
            for( x = 0; x < dimensions; ++x ) {
                if( leftCell != NULL && leftCell->ColIndex < x )
                    leftCell = leftCell->ColNext;
                if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                    leftCellArray[x] = leftCell->Value;
                } else
                    leftCellArray[x] = 0;
            }
            /* main loop */
            for( x = 0; x < dimensions; ++x ) {
                result[row][col] += leftCellArray[x] * right[x][col];
            }
            free(leftCellArray);
        }
    }
Loop extraction (cont’d)
• Initialization
    for( row = 0; row < dimensions; ++row ) {
        if( leftRow != NULL && leftRow->RowIndex < row )
            leftRow = leftRow->Next;
        /* each row holds floats, so the element size is sizeof(float) */
        leftCellArrayArray[row] = malloc(sizeof(float) * dimensions);
        memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions);
        if( leftRow != NULL && leftRow->RowIndex == row ) {
            leftCell = leftRow->Cell;
            for( x = 0; x < dimensions; ++x ) {
                if( leftCell != NULL && leftCell->ColIndex < x )
                    leftCell = leftCell->ColNext;
                if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                    leftCellArrayArray[row][x] = leftCell->Value;
                } else
                    leftCellArrayArray[row][x] = 0;
            }
        }
    }
Loop extraction (cont’d)
• Transformed main loop
    for( row = 0; row < dimensions; ++row ) {
        for( x = 0; x < dimensions; ++x ) {
            result[row][col] += leftCellArrayArray[row][x] * right[x][col];
        }
    }
Loop extraction (cont’d)
• Putting it in context (alternative)
    while( leftRow != NULL ) {
        leftCell = leftRow->Cell;
        leftCellArray = malloc(sizeof(float) * dimensions);
        memset(leftCellArray, 0, sizeof(float) * dimensions);
        while( leftCell != NULL ) {
            leftCellArray[leftCell->ColIndex] = leftCell->Value;
            leftCell = leftCell->ColNext;
        }
        for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
            result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * right[leftCellCounter][col];
        }
        free(leftCellArray);
        leftRow = leftRow->Next;
    }
Loop extraction (cont’d)
• Initialization (alternative)
    leftCellArrayArray = malloc(sizeof(float*) * dimensions);
    for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
        leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float));
        memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float));
        if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) {
            leftCell = leftRow->Cell;
            while( leftCell != NULL ) {
                leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value;
                leftCell = leftCell->ColNext;
            }
            leftRow = leftRow->Next;
        }
    }
• Transformed main loop
    for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
        for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
            result[leftRowCounter][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col];
        }
    }
Loop extraction (cont’d)
• Once more, in context
    for( col = 0; col < rightWidth; ++col ) {
        leftRow = left.Row;
        /* initialization */
        leftCellArrayArray = malloc(sizeof(float*) * dimensions);
        for( row = 0; row < dimensions; ++row ) {
            if( leftRow != NULL && leftRow->RowIndex < row )
                leftRow = leftRow->Next;
            /* each row holds floats, so the element size is sizeof(float) */
            leftCellArrayArray[row] = malloc(sizeof(float) * dimensions);
            memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions);
            if( leftRow != NULL && leftRow->RowIndex == row ) {
                leftCell = leftRow->Cell;
                for( x = 0; x < dimensions; ++x ) {
                    if( leftCell != NULL && leftCell->ColIndex < x )
                        leftCell = leftCell->ColNext;
                    if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                        leftCellArrayArray[row][x] = leftCell->Value;
                    } else
                        leftCellArrayArray[row][x] = 0;
                }
            }
        }
        /* main loop */
        for( row = 0; row < dimensions; ++row ) {
            for( x = 0; x < dimensions; ++x ) {
                result[row][col] += leftCellArrayArray[row][x] * right[x][col];
            }
        }
        for( row = 0; row < dimensions; ++row )
            free(leftCellArrayArray[row]);
        free(leftCellArrayArray);
    }
Loop extraction (cont’d)
• Once more, in context (alternative)
    for( col = 0; col < rightWidth; ++col ) {
        leftRow = left.Row;
        /* initialization */
        leftCellArrayArray = malloc(sizeof(float*) * dimensions);
        for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
            leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float));
            memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float));
            if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) {
                leftCell = leftRow->Cell;
                while( leftCell != NULL ) {
                    leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value;
                    leftCell = leftCell->ColNext;
                }
                leftRow = leftRow->Next;
            }
        }
        /* main loop: indexes result by leftRowCounter, because leftRow has
           already reached the end of the list after initialization */
        for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
            for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
                result[leftRowCounter][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col];
            }
        }
        for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter )
            free(leftCellArrayArray[leftRowCounter]);
        free(leftCellArrayArray);
    }
Transformation result
    void MatrixMultiplySublimation(struct Matrix left, float **right, float **result, int rightWidth)
    {
        struct RowHead *leftRow = left.Row;
        struct Cell *leftCell;
        float **leftCellArrayArray; /* generated declaration */
        int dimensions = left.Dimensions;
        int col, row, x;
        /* initialization */
        leftCellArrayArray = malloc(sizeof(float*) * dimensions);
        for( row = 0; row < dimensions; ++row ) {
            if( leftRow != NULL && leftRow->RowIndex < row )
                leftRow = leftRow->Next;
            /* each row holds floats, so the element size is sizeof(float) */
            leftCellArrayArray[row] = malloc(sizeof(float) * dimensions);
            memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions);
            if( leftRow != NULL && leftRow->RowIndex == row ) {
                leftCell = leftRow->Cell;
                for( x = 0; x < dimensions; ++x ) {
                    if( leftCell != NULL && leftCell->ColIndex < x )
                        leftCell = leftCell->ColNext;
                    if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                        leftCellArrayArray[row][x] = leftCell->Value;
                    } else
                        leftCellArrayArray[row][x] = 0;
                }
            }
        }
        /* main loop */
        for( col = 0; col < rightWidth; ++col ) {
            for( row = 0; row < dimensions; ++row ) {
                for( x = 0; x < dimensions; ++x ) {
                    result[row][col] += leftCellArrayArray[row][x] * right[x][col];
                }
            }
        }
        for( row = 0; row < dimensions; ++row )
            free(leftCellArrayArray[row]);
        free(leftCellArrayArray);
    }
Transformation result (alternative)
    void MatrixMultiplyAlternativeSublimation(struct Matrix left, float **right, float **result, int rightWidth)
    {
        struct RowHead *leftRow = left.Row;
        struct Cell *leftCell;
        int dimensions = left.Dimensions;
        int col;
        /* generated declarations */
        float **leftCellArrayArray;
        int leftCellCounter, leftRowCounter;
        /* initialization */
        leftCellArrayArray = malloc(sizeof(float*) * dimensions);
        for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
            leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float));
            memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float));
            if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) {
                leftCell = leftRow->Cell;
                while( leftCell != NULL ) {
                    leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value;
                    leftCell = leftCell->ColNext;
                }
                leftRow = leftRow->Next;
            }
        }
        /* main loop */
        for( col = 0; col < rightWidth; ++col ) {
            for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
                for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
                    result[leftRowCounter][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col];
                }
            }
        }
        for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter )
            free(leftCellArrayArray[leftRowCounter]);
        free(leftCellArrayArray);
    }
Annihilation
• Alternative method of transformation
• No fill-in: omitted values stay omitted
• Sublimation:
  • Sparse loop: more iterations
  • Semi-dense loop: same number of iterations
• Annihilation:
  • Sparse loop: same number of iterations
  • Semi-dense loop: fewer iterations
• Can require other transformations
Annihilation (cont’d)
• Recall the innermost loop
    for( x = 0; x < dimensions; ++x ) {
        if( leftCell != NULL && leftCell->ColIndex < x )
            leftCell = leftCell->ColNext;
        if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
            result[row][col] += leftCell->Value * right[x][col];
        }
    }
Annihilation (cont’d)
• Initialization
    leftCellArraySize = 100;
    leftCellArray = malloc(sizeof(float) * leftCellArraySize);
    newDimensions = 0;
    leftCellCopy = leftCell;
    for( x = 0; x < dimensions; ++x ) {
        if( newDimensions >= leftCellArraySize ) {
            leftCellArraySize *= 2;
            leftCellArray = realloc(leftCellArray, sizeof(float) * leftCellArraySize);
        }
        if( leftCellCopy != NULL && leftCellCopy->ColIndex < x )
            leftCellCopy = leftCellCopy->ColNext;
        if( leftCellCopy != NULL && leftCellCopy->ColIndex == x && leftCellCopy->RowIndex == row ) {
            leftCellArray[newDimensions] = leftCellCopy->Value;
            ++newDimensions;
        }
    }
Annihilation (cont’d)
• Initialization (cont’d)
    rightArraySize = 100;
    rightArray = malloc(sizeof(float*) * rightArraySize);
    newDimensions = 0;
    leftCellCopy = leftCell;
    for( x = 0; x < dimensions; ++x ) {
        if( newDimensions >= rightArraySize ) {
            rightArraySize *= 2;
            /* rightArray holds row pointers, so the element size is sizeof(float*) */
            rightArray = realloc(rightArray, sizeof(float*) * rightArraySize);
        }
        if( leftCellCopy != NULL && leftCellCopy->ColIndex < x )
            leftCellCopy = leftCellCopy->ColNext;
        if( leftCellCopy != NULL && leftCellCopy->ColIndex == x && leftCellCopy->RowIndex == row ) {
            rightArray[newDimensions] = right[x];
            ++newDimensions;
        }
    }
Annihilation (cont’d)
• Transformed main loop
    for( x = 0; x < newDimensions; ++x ) {
        result[row][col] += leftCellArray[x] * rightArray[x][col];
    }
Annihilation (cont’d)
• Inner loop (alternative)
    while( leftCell != NULL ) {
        result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col];
        leftCell = leftCell->ColNext;
    }
Annihilation (cont’d)
• Initialization (alternative)
    /* leftCell */
    leftCellArraySize = 100;
    leftCellArray = malloc(sizeof(float) * leftCellArraySize);
    newDimensions = 0;
    leftCellCopy = leftCell;
    while( leftCellCopy != NULL ) {
        if( newDimensions >= leftCellArraySize ) {
            leftCellArraySize *= 2;
            leftCellArray = realloc(leftCellArray, sizeof(float) * leftCellArraySize);
        }
        leftCellArray[newDimensions] = leftCellCopy->Value;
        ++newDimensions;
        leftCellCopy = leftCellCopy->ColNext;
    }
    /* right */
    rightArraySize = 100;
    rightArray = malloc(sizeof(float*) * rightArraySize);
    newDimensions = 0;
    leftCellCopy = leftCell;
    while( leftCellCopy != NULL ) {
        if( newDimensions >= rightArraySize ) {
            rightArraySize *= 2;
            /* rightArray holds row pointers, so the element size is sizeof(float*) */
            rightArray = realloc(rightArray, sizeof(float*) * rightArraySize);
        }
        /* index with the cell being visited (leftCellCopy), not the list head */
        rightArray[newDimensions] = right[leftCellCopy->ColIndex];
        ++newDimensions;
        leftCellCopy = leftCellCopy->ColNext;
    }
Annihilation (cont’d)
• Transformed main loop (alternative)
    for( leftCellCounter = 0; leftCellCounter < newDimensions; ++leftCellCounter ) {
        result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * rightArray[leftCellCounter][col];
    }
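The annihilation fragments can be combined into one self-contained sketch for a single row and a single column of the right-hand matrix. The struct RowCell and the function RowTimesColumnAnnihilated are hypothetical names for this example. As a simplification, the sketch counts the cells first and allocates exactly that many entries, instead of the doubling realloc scheme shown on the slides.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, hypothetical cell: one row of the linked list matrix. */
struct RowCell { float Value; int ColIndex; struct RowCell *ColNext; };

/* Annihilation for one row: stored values are packed into cellArray with
   no fill-in, and the matching rows of right are gathered into rightArray
   so both arrays can be walked with the same dense index. */
float RowTimesColumnAnnihilated(const struct RowCell *row, float **right, int col)
{
    const struct RowCell *cell;
    float *cellArray;
    float **rightArray;
    float sum = 0.0f;
    int newDimensions = 0;
    int x;

    /* Count stored cells (assumption: replaces the doubling realloc). */
    for( cell = row; cell != NULL; cell = cell->ColNext )
        ++newDimensions;
    if( newDimensions == 0 )
        return 0.0f;

    cellArray = malloc(sizeof(float) * newDimensions);
    rightArray = malloc(sizeof(float*) * newDimensions);
    assert(cellArray != NULL && rightArray != NULL);

    /* Initialization: pack values and gather row pointers of right. */
    x = 0;
    for( cell = row; cell != NULL; cell = cell->ColNext ) {
        cellArray[x] = cell->Value;
        rightArray[x] = right[cell->ColIndex];
        ++x;
    }

    /* Transformed main loop: newDimensions iterations, no list access. */
    for( x = 0; x < newDimensions; ++x )
        sum += cellArray[x] * rightArray[x][col];

    free(cellArray);
    free(rightArray);
    return sum;
}
```

For a row with cells (column 0, value 2) and (column 3, value 5) and a 4 x 2 right matrix with rows {1,2}, {3,4}, {5,6}, {7,8}, the function returns 37 for column 0 and 44 for column 1. The main loop runs twice, matching the number of stored cells, whereas the sublimated version would run four times.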
Post-initialization
• Pre-initialization: before the main loop
• Post-initialization: after the main loop
• Needed when an expression that needs to be transformed is written to
• Needs to use an index expression
• Fill-in value not needed
Post-initialization (cont’d)
• Example
    while( node != NULL ) {
        node->Value = node->Value * 2;
        node = node->Next;
    }
• Result
    /* pre-initialization */
    nodeArray = malloc(size * sizeof(int));
    nodeCopy = node;
    memset(nodeArray, 0, size * sizeof(int));
    while( nodeCopy != NULL ) {
        /* the array is indexed by the node's index expression, not by the pointer */
        nodeArray[nodeCopy->Index] = nodeCopy->Value;
        nodeCopy = nodeCopy->Next;
    }
    /* main loop */
    for( nodeCounter = 0; nodeCounter < size; ++nodeCounter ) {
        nodeArray[nodeCounter] = nodeArray[nodeCounter] * 2;
    }
    /* post-initialization */
    nodeCopy = node;
    while( nodeCopy != NULL ) {
        nodeCopy->Value = nodeArray[nodeCopy->Index];
        nodeCopy = nodeCopy->Next;
    }
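A runnable version of this pattern is sketched below. The struct Node layout and the function DoubleValuesTransformed are assumptions made for the example; in particular the sketch assumes every node carries a distinct Index in the range [0, size).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: Index is the dense position of the node. */
struct Node { int Value; int Index; struct Node *Next; };

/* Doubles every value in the list via a dense array, using the three
   phases from the slide: pre-initialization, main loop, and
   post-initialization. Assumes distinct Index values in [0, size). */
void DoubleValuesTransformed(struct Node *node, int size)
{
    int *nodeArray;
    struct Node *nodeCopy;
    int nodeCounter;

    if( size <= 0 )
        return;
    nodeArray = malloc(size * sizeof(int));
    assert(nodeArray != NULL);
    memset(nodeArray, 0, size * sizeof(int));

    /* Pre-initialization: copy list contents into the array. */
    for( nodeCopy = node; nodeCopy != NULL; nodeCopy = nodeCopy->Next )
        nodeArray[nodeCopy->Index] = nodeCopy->Value;

    /* Main loop: works on the array only. */
    for( nodeCounter = 0; nodeCounter < size; ++nodeCounter )
        nodeArray[nodeCounter] = nodeArray[nodeCounter] * 2;

    /* Post-initialization: write the results back into the list. */
    for( nodeCopy = node; nodeCopy != NULL; nodeCopy = nodeCopy->Next )
        nodeCopy->Value = nodeArray[nodeCopy->Index];

    free(nodeArray);
}
```

Array slots that correspond to no node are computed but never written back, which is why the choice of fill-in value does not matter here.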
Automated transformation
• Seven steps
  1. Find candidate structures
  2. Analyze usage of these structures in the code
  3. Determine transformation safety
  4. Identify data members
  5. Generate dense data structures
  6. Transform
  7. Loop extraction
• Code must be normalized
Conditions
• The linked list expression must not have side effects
• Loop termination control must be trivial
• The linked list iteration statement must be the only statement in the loop body that modifies the linked list expression
• The “next” pointer member may not be a data member
• Any expression, other than the linked list iteration statement, that might be moved to an initialization loop must not have side effects, and may use only constants, loop-invariant values, linked list members and loop control variables
• If the linked list expression is guarded, it must be possible to move that entire guard, including both the true and false parts, to the initialization loop
• When performing annihilation on a semi-dense loop, there must be a single guard that covers all statements in the loop body except for the linked list iteration statement and its guard, and statements related to loop control (such as those that increment the counter)
Transformation directives
• Fill in gaps in the compiler’s knowledge
• Embedded in source code as comments
• Examples
  • SAFE_CODE, UNSAFE_CODE
  • SAFE_LOOP, UNSAFE_LOOP
  • DENSE_INDEX
  • DENSE_DIMENSIONS
  • FILL_IN
  • Etc.
Transformation directives (cont’d)
• Example
    /***SAFE_CODE***/
    /***DENSE_INDEX(node, node->Index)***/
    /***DENSE_DIMENSION(node, size)***/
    while( node != NULL ) {
        node->Value = node->Value * 2;
        node = node->Next;
    }
    /***UNSAFE_CODE***/
Experimentation
• Tested on three matrices
• Sublimation code: ran through MT1
• Annihilation code: loop interchange
• Tools used: Intel C Compiler, Intel FORTRAN Compiler
• Test system: Dual Intel Xeon 3.06GHz, 1GB RAM
Experimentation (cont’d)
• [Two slides of result charts; figures not preserved]