Sparse code optimization

Sparse code optimization. Automatic transformation of linked list pointer structures. Sven Groot.

Presentation Transcript


  1. Sparse code optimization
     Automatic transformation of linked list pointer structures
     Sven Groot

  2. [Figure: a sparse matrix written out in full. Almost every entry is 0, with isolated 1 entries at regular intervals, consistent with ones on the diagonal.]

  3. Restructuring compiler
     • Special type of optimizing compiler
     • Restructures source code (e.g. to enable vectorization or parallelization)
     • Techniques:
       • Loop interchange
       • Strip mining
       • Loop collapsing
       • Loop fusion and fission
       • Data structure transformation
     • A sparse compiler is a type of restructuring compiler
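
Of the techniques listed on this slide, strip mining is easy to show in isolation. The sketch below is not from the presentation: the function names `ScaleOriginal` and `ScaleStripMined` and the strip length `STRIP` are assumptions chosen for illustration. It splits one loop into an outer loop over fixed-size strips and an inner loop over the elements of a strip, which is the shape a vectorizing compiler typically wants.

```c
#include <assert.h>
#include <stddef.h>

#define STRIP 4  /* hypothetical strip length, e.g. a vector register width */

/* Original loop: a single pass over all n elements. */
void ScaleOriginal(float *data, size_t n, float factor)
{
    size_t i;
    assert(data != NULL || n == 0);
    for( i = 0; i < n; ++i )
        data[i] *= factor;
}

/* Strip-mined loop: the outer loop steps by STRIP, the inner loop
   handles one strip. The end bound covers a final partial strip. */
void ScaleStripMined(float *data, size_t n, float factor)
{
    size_t start, i;
    assert(data != NULL || n == 0);
    for( start = 0; start < n; start += STRIP ) {
        size_t end = (n - start < STRIP) ? n : start + STRIP;
        for( i = start; i < end; ++i )
            data[i] *= factor;
    }
}
```

Both functions visit the same elements in the same order, so they produce identical results; only the loop structure differs.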

  4. Restructuring compiler (cont’d)
     • Loop interchange example
       void MatrixMultiply1(float **result, float **left, float **right, int size, int rightWidth)
       {
           int row, col, x;
           for( row = 0; row < size; ++row ) {
               for( col = 0; col < rightWidth; ++col ) {
                   for( x = 0; x < size; ++x ) {
                       result[row][col] += left[row][x] * right[x][col];
                   }
               }
           }
       }

  5. Restructuring compiler (cont’d)
     • Loop interchange result
       void MatrixMultiply2(float **result, float **left, float **right, int size, int rightWidth)
       {
           int row, col, x;
           for( row = 0; row < size; ++row ) {
               for( x = 0; x < size; ++x ) {
                   for( col = 0; col < rightWidth; ++col ) {
                       result[row][col] += left[row][x] * right[x][col];
                   }
               }
           }
       }

  6. Restructuring compiler (cont’d)
     • Pointers pose problems
       void MatrixMultiply3(float **result, float **left, float **right, int size, int rightWidth)
       {
           int row, col;
           float **tempRight;
           float *tempLeft;
           for( row = 0; row < size; ++row ) {
               for( col = 0; col < rightWidth; ++col ) {
                   tempRight = right;
                   tempLeft = left[row];
                   while( tempRight < right + size ) {
                       result[row][col] += *tempLeft * (*tempRight)[col];
                       ++tempRight;
                       ++tempLeft;
                   }
               }
           }
       }

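  7. Linked list matrix
     [Diagram: a Matrix node points through Col to a chain of ColHead nodes (Index=1, Index=2, Index=3) and through Row to a chain of RowHead nodes (Index=1, Index=3); the heads in each chain are linked by Next. Each head points to its first Cell, and Cell nodes are linked to one another by ColNext and RowNext pointers.]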

  8. Linked list matrix (cont’d)
       struct Cell {
           float Value;
           int ColIndex;
           int RowIndex;
           struct Cell *RowNext; // Cell in the next row
           struct Cell *ColNext; // Cell in the next column
       };
       struct RowHead {
           int RowIndex;
           struct Cell *Cell;
           struct RowHead *Next;
       };
       struct ColHead {
           int ColIndex;
           struct Cell *Cell;
           struct ColHead *Next;
       };
       struct Matrix {
           int Dimensions;
           struct ColHead *Col;
           struct RowHead *Row;
       };
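
The slides do not show how such a matrix is built. As a minimal sketch, the hypothetical helper `MatrixAppend` below (not part of the presentation) adds one non-zero value to the structure, assuming values arrive in ascending row order and, within a row, in ascending column order. It maintains only the row lists, which is all the multiplication routines on the following slides traverse; the column heads and `RowNext` links are left unset.

```c
#include <assert.h>
#include <stdlib.h>

struct Cell {
    float Value;
    int ColIndex;
    int RowIndex;
    struct Cell *RowNext;  /* cell in the next row */
    struct Cell *ColNext;  /* cell in the next column */
};

struct RowHead {
    int RowIndex;
    struct Cell *Cell;
    struct RowHead *Next;
};

struct ColHead {
    int ColIndex;
    struct Cell *Cell;
    struct ColHead *Next;
};

struct Matrix {
    int Dimensions;
    struct ColHead *Col;
    struct RowHead *Row;
};

/* Hypothetical helper: append a non-zero value to the row lists.
   Assumes calls arrive in ascending (row, col) order. */
void MatrixAppend(struct Matrix *m, int row, int col, float value)
{
    struct RowHead **head = &m->Row;
    struct Cell **link;
    struct Cell *cell;

    /* Skip the heads of earlier rows. */
    while( *head != NULL && (*head)->RowIndex < row )
        head = &(*head)->Next;

    /* No head for this row yet: create one at the end of the chain. */
    if( *head == NULL ) {
        *head = malloc(sizeof(struct RowHead));
        assert(*head != NULL);
        (*head)->RowIndex = row;
        (*head)->Cell = NULL;
        (*head)->Next = NULL;
    }
    assert((*head)->RowIndex == row);  /* fails if rows arrive out of order */

    /* Walk to the end of the row and link the new cell there. */
    link = &(*head)->Cell;
    while( *link != NULL )
        link = &(*link)->ColNext;

    cell = malloc(sizeof(struct Cell));
    assert(cell != NULL);
    cell->Value = value;
    cell->ColIndex = col;
    cell->RowIndex = row;
    cell->RowNext = NULL;
    cell->ColNext = NULL;
    *link = cell;
}
```

Rows without non-zero values simply get no `RowHead`, which is why the traversal code on the next slides has to compare `RowIndex` against the loop counter.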

  9. Linked list matrix (cont’d)
     • Matrix multiplication using linked lists
       void MatrixMultiply(struct Matrix left, float **right, float **result, int rightWidth)
       {
           struct RowHead *leftRow = left.Row;
           struct Cell *leftCell;
           int dimensions = left.Dimensions;
           int col, row, x;
           for( col = 0; col < rightWidth; ++col ) {
               leftRow = left.Row;
               for( row = 0; row < dimensions; ++row ) {
                   if( leftRow != NULL && leftRow->RowIndex < row )
                       leftRow = leftRow->Next;
                   if( leftRow != NULL && leftRow->RowIndex == row ) {
                       leftCell = leftRow->Cell;
                       for( x = 0; x < dimensions; ++x ) {
                           if( leftCell != NULL && leftCell->ColIndex < x )
                               leftCell = leftCell->ColNext;
                           if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                               result[row][col] += leftCell->Value * right[x][col];
                           }
                       }
                   }
               }
           }
       }

  10. Linked list matrix (cont’d)
      • Alternative matrix multiplication
        int MatrixMultiplyAlternative(struct Matrix left, float **right, float **result, int rightWidth)
        {
            struct RowHead *leftRow = left.Row;
            struct Cell *leftCell;
            int dimensions = left.Dimensions;
            int col;
            for( col = 0; col < rightWidth; ++col ) {
                leftRow = left.Row;
                while( leftRow != NULL ) {
                    leftCell = leftRow->Cell;
                    while( leftCell != NULL ) {
                        result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col];
                        leftCell = leftCell->ColNext;
                    }
                    leftRow = leftRow->Next;
                }
            }
            return 0;
        }

  11. Transformation
      • The goal: remove all references to the linked list from the loop
      • The means: move linked list references into an initialization loop
        • Initialization copies linked list contents into an array
        • Transformed loop uses the array
      • Two methods: sublimation and annihilation
      • Must be done automatically

  12. Sublimation
      • Transforming the innermost loop
        for( x = 0; x < dimensions; ++x ) {
            if( leftCell != NULL && leftCell->ColIndex < x )
                leftCell = leftCell->ColNext;
            if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                result[row][col] += leftCell->Value * right[x][col];
            }
        }

  13. Sublimation (cont’d)
      • Initialization
        leftCellArray = malloc(sizeof(float) * dimensions);
        for( x = 0; x < dimensions; ++x ) {
            if( leftCell != NULL && leftCell->ColIndex < x )
                leftCell = leftCell->ColNext;
            if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                leftCellArray[x] = leftCell->Value;
            }
            else
                leftCellArray[x] = 0;
        }
      • Transformed main loop
        for( x = 0; x < dimensions; ++x ) {
            result[row][col] += leftCellArray[x] * right[x][col];
        }

  14. Sublimation (cont’d)
      • Transforming the inner loop (alternative)
        while( leftCell != NULL ) {
            result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col];
            leftCell = leftCell->ColNext;
        }
      • Initialization
        leftCellArray = malloc(sizeof(float) * dimensions);
        memset(leftCellArray, 0, sizeof(float) * dimensions);
        while( leftCell != NULL ) {
            leftCellArray[leftCell->ColIndex] = leftCell->Value;
            leftCell = leftCell->ColNext;
        }
      • Transformed main loop
        for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
            result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * right[leftCellCounter][col];
        }

  15. Loop extraction
      • Putting it in context
        for( row = 0; row < dimensions; ++row ) {
            if( leftRow != NULL && leftRow->RowIndex < row )
                leftRow = leftRow->Next;
            if( leftRow != NULL && leftRow->RowIndex == row ) {
                leftCell = leftRow->Cell;
                /* initialization */
                leftCellArray = malloc(sizeof(float) * dimensions);
                for( x = 0; x < dimensions; ++x ) {
                    if( leftCell != NULL && leftCell->ColIndex < x )
                        leftCell = leftCell->ColNext;
                    if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                        leftCellArray[x] = leftCell->Value;
                    }
                    else
                        leftCellArray[x] = 0;
                }
                /* main loop */
                for( x = 0; x < dimensions; ++x ) {
                    result[row][col] += leftCellArray[x] * right[x][col];
                }
                free(leftCellArray);
            }
        }

  16. Loop extraction (cont’d)
      • Initialization
        for( row = 0; row < dimensions; ++row ) {
            if( leftRow != NULL && leftRow->RowIndex < row )
                leftRow = leftRow->Next;
            leftCellArrayArray[row] = malloc(sizeof(float) * dimensions);
            memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions);
            if( leftRow != NULL && leftRow->RowIndex == row ) {
                leftCell = leftRow->Cell;
                for( x = 0; x < dimensions; ++x ) {
                    if( leftCell != NULL && leftCell->ColIndex < x )
                        leftCell = leftCell->ColNext;
                    if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                        leftCellArrayArray[row][x] = leftCell->Value;
                    }
                    else
                        leftCellArrayArray[row][x] = 0;
                }
            }
        }

  17. Loop extraction (cont’d)
      • Transformed main loop
        for( row = 0; row < dimensions; ++row ) {
            for( x = 0; x < dimensions; ++x ) {
                result[row][col] += leftCellArrayArray[row][x] * right[x][col];
            }
        }

  18. Loop extraction (cont’d)
      • Putting it in context (alternative)
        while( leftRow != NULL ) {
            leftCell = leftRow->Cell;
            leftCellArray = malloc(sizeof(float) * dimensions);
            memset(leftCellArray, 0, sizeof(float) * dimensions);
            while( leftCell != NULL ) {
                leftCellArray[leftCell->ColIndex] = leftCell->Value;
                leftCell = leftCell->ColNext;
            }
            for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
                result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * right[leftCellCounter][col];
            }
            free(leftCellArray);
            leftRow = leftRow->Next;
        }

  19. Loop extraction (cont’d)
      • Initialization (alternative)
        leftCellArrayArray = malloc(sizeof(float*) * dimensions);
        for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
            leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float));
            memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float));
            if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) {
                leftCell = leftRow->Cell;
                while( leftCell != NULL ) {
                    leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value;
                    leftCell = leftCell->ColNext;
                }
                leftRow = leftRow->Next;
            }
        }
      • Transformed main loop
        for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
            for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
                result[leftRowCounter][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col];
            }
        }

  20. Loop extraction (cont’d)
      • Once more, in context

  21. [Code slide: the original version after loop extraction of the row loop]
        for( col = 0; col < rightWidth; ++col ) {
            leftRow = left.Row;
            /* initialization */
            leftCellArrayArray = malloc(sizeof(float*) * dimensions);
            for( row = 0; row < dimensions; ++row ) {
                if( leftRow != NULL && leftRow->RowIndex < row )
                    leftRow = leftRow->Next;
                leftCellArrayArray[row] = malloc(sizeof(float) * dimensions);
                memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions);
                if( leftRow != NULL && leftRow->RowIndex == row ) {
                    leftCell = leftRow->Cell;
                    for( x = 0; x < dimensions; ++x ) {
                        if( leftCell != NULL && leftCell->ColIndex < x )
                            leftCell = leftCell->ColNext;
                        if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                            leftCellArrayArray[row][x] = leftCell->Value;
                        }
                        else
                            leftCellArrayArray[row][x] = 0;
                    }
                }
            }
            /* main loop */
            for( row = 0; row < dimensions; ++row ) {
                for( x = 0; x < dimensions; ++x ) {
                    result[row][col] += leftCellArrayArray[row][x] * right[x][col];
                }
            }
            for( row = 0; row < dimensions; ++row )
                free(leftCellArrayArray[row]);
            free(leftCellArrayArray);
        }

  22. [Code slide: the alternative version after loop extraction of the row loop]
        for( col = 0; col < rightWidth; ++col ) {
            leftRow = left.Row;
            /* initialization */
            leftCellArrayArray = malloc(sizeof(float*) * dimensions);
            for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
                leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float));
                memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float));
                if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) {
                    leftCell = leftRow->Cell;
                    while( leftCell != NULL ) {
                        leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value;
                        leftCell = leftCell->ColNext;
                    }
                    leftRow = leftRow->Next;
                }
            }
            /* main loop */
            for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
                for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
                    result[leftRowCounter][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col];
                }
            }
            for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter )
                free(leftCellArrayArray[leftRowCounter]);
            free(leftCellArrayArray);
        }

  23. Transformation result

  24. [Code slide: final result of sublimation for the original version]
        void MatrixMultiplySublimation(struct Matrix left, float **right, float **result, int rightWidth)
        {
            struct RowHead *leftRow = left.Row;
            struct Cell *leftCell;
            float **leftCellArrayArray; /* generated declaration */
            int dimensions = left.Dimensions;
            int col, row, x;
            /* initialization */
            leftCellArrayArray = malloc(sizeof(float*) * dimensions);
            for( row = 0; row < dimensions; ++row ) {
                if( leftRow != NULL && leftRow->RowIndex < row )
                    leftRow = leftRow->Next;
                leftCellArrayArray[row] = malloc(sizeof(float) * dimensions);
                memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions);
                if( leftRow != NULL && leftRow->RowIndex == row ) {
                    leftCell = leftRow->Cell;
                    for( x = 0; x < dimensions; ++x ) {
                        if( leftCell != NULL && leftCell->ColIndex < x )
                            leftCell = leftCell->ColNext;
                        if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                            leftCellArrayArray[row][x] = leftCell->Value;
                        }
                        else
                            leftCellArrayArray[row][x] = 0;
                    }
                }
            }
            /* main loop */
            for( col = 0; col < rightWidth; ++col ) {
                for( row = 0; row < dimensions; ++row ) {
                    for( x = 0; x < dimensions; ++x ) {
                        result[row][col] += leftCellArrayArray[row][x] * right[x][col];
                    }
                }
            }
            for( row = 0; row < dimensions; ++row )
                free(leftCellArrayArray[row]);
            free(leftCellArrayArray);
        }

  25. [Code slide: final result of sublimation for the alternative version]
        void MatrixMultiplyAlternativeSublimation(struct Matrix left, float **right, float **result, int rightWidth)
        {
            struct RowHead *leftRow = left.Row;
            struct Cell *leftCell;
            int dimensions = left.Dimensions;
            int col;
            /* generated declarations */
            float **leftCellArrayArray;
            int leftCellCounter, leftRowCounter;
            /* initialization */
            leftCellArrayArray = malloc(sizeof(float*) * dimensions);
            for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
                leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float));
                memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float));
                if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) {
                    leftCell = leftRow->Cell;
                    while( leftCell != NULL ) {
                        leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value;
                        leftCell = leftCell->ColNext;
                    }
                    leftRow = leftRow->Next;
                }
            }
            /* main loop */
            for( col = 0; col < rightWidth; ++col ) {
                for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) {
                    for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) {
                        result[leftRowCounter][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col];
                    }
                }
            }
            for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter )
                free(leftCellArrayArray[leftRowCounter]);
            free(leftCellArrayArray);
        }

  26. Annihilation
      • Alternative method of transformation
      • No fill-in: omitted values stay omitted
      • Sublimation:
        • Sparse loop: more iterations
        • Semi-dense loop: same number of iterations
      • Annihilation:
        • Sparse loop: same number of iterations
        • Semi-dense loop: fewer iterations
      • Can require other transformations
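
The difference in iteration counts can be made concrete with a reduced example. The sketch below is not from the slides: it uses a hypothetical singly linked `struct Node` and a dot product against a dense vector instead of the full matrix code, and it counts the iterations of the transformed main loop. Sublimation fills a dense array of length `n` with zeros for the missing entries; annihilation packs only the stored values, together with the matching entries of `vec`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sparse vector node, for illustration only. */
struct Node {
    int Index;
    float Value;
    struct Node *Next;
};

/* Sublimation: dense copy with zero fill-in, main loop runs n times. */
float DotSublimation(const struct Node *list, const float *vec, int n, int *iterations)
{
    float *dense;
    float sum = 0.0f;
    int i;
    assert(n > 0);
    dense = malloc(sizeof(float) * (size_t)n);
    assert(dense != NULL);
    for( i = 0; i < n; ++i )
        dense[i] = 0.0f;                      /* fill-in value */
    for( ; list != NULL; list = list->Next )
        dense[list->Index] = list->Value;     /* initialization loop */
    *iterations = 0;
    for( i = 0; i < n; ++i ) {                /* transformed main loop */
        sum += dense[i] * vec[i];
        ++*iterations;
    }
    free(dense);
    return sum;
}

/* Annihilation: packed copy without fill-in, main loop runs once per stored value. */
float DotAnnihilation(const struct Node *list, const float *vec, int n, int *iterations)
{
    float *packed, *gathered;
    float sum = 0.0f;
    int count = 0, i;
    assert(n > 0);
    packed = malloc(sizeof(float) * (size_t)n);    /* at most n stored values */
    gathered = malloc(sizeof(float) * (size_t)n);
    assert(packed != NULL && gathered != NULL);
    for( ; list != NULL; list = list->Next ) {     /* initialization loop */
        packed[count] = list->Value;
        gathered[count] = vec[list->Index];
        ++count;
    }
    *iterations = 0;
    for( i = 0; i < count; ++i ) {                 /* transformed main loop */
        sum += packed[i] * gathered[i];
        ++*iterations;
    }
    free(packed);
    free(gathered);
    return sum;
}
```

For a list with 3 stored values in a vector of length 8, the sublimated loop runs 8 times and the annihilated loop 3 times, while both return the same sum. The slides size the packed arrays with a growing `realloc` instead of allocating `n` entries up front; the fixed allocation here is a simplification.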

  27. Annihilation (cont’d)
      • Recall the innermost loop
        for( x = 0; x < dimensions; ++x ) {
            if( leftCell != NULL && leftCell->ColIndex < x )
                leftCell = leftCell->ColNext;
            if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) {
                result[row][col] += leftCell->Value * right[x][col];
            }
        }

  28. Annihilation (cont’d)
      • Initialization
        leftCellArraySize = 100;
        leftCellArray = malloc(sizeof(float) * leftCellArraySize);
        newDimensions = 0;
        leftCellCopy = leftCell;
        for( x = 0; x < dimensions; ++x ) {
            if( newDimensions >= leftCellArraySize ) {
                leftCellArraySize *= 2;
                leftCellArray = realloc(leftCellArray, sizeof(float) * leftCellArraySize);
            }
            if( leftCellCopy != NULL && leftCellCopy->ColIndex < x )
                leftCellCopy = leftCellCopy->ColNext;
            if( leftCellCopy != NULL && leftCellCopy->ColIndex == x && leftCellCopy->RowIndex == row ) {
                leftCellArray[newDimensions] = leftCellCopy->Value;
                ++newDimensions;
            }
        }

  29. Annihilation (cont’d)
      • Initialization (cont’d)
        rightArraySize = 100;
        rightArray = malloc(sizeof(float*) * rightArraySize);
        newDimensions = 0;
        leftCellCopy = leftCell;
        for( x = 0; x < dimensions; ++x ) {
            if( newDimensions >= rightArraySize ) {
                rightArraySize *= 2;
                rightArray = realloc(rightArray, sizeof(float*) * rightArraySize);
            }
            if( leftCellCopy != NULL && leftCellCopy->ColIndex < x )
                leftCellCopy = leftCellCopy->ColNext;
            if( leftCellCopy != NULL && leftCellCopy->ColIndex == x && leftCellCopy->RowIndex == row ) {
                rightArray[newDimensions] = right[x];
                ++newDimensions;
            }
        }

  30. Annihilation (cont’d)
      • Transformed main loop
        for( x = 0; x < newDimensions; ++x ) {
            result[row][col] += leftCellArray[x] * rightArray[x][col];
        }

  31. Annihilation (cont’d)
      • Inner loop (alternative)
        while( leftCell != NULL ) {
            result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col];
            leftCell = leftCell->ColNext;
        }

  32. Annihilation (cont’d)
      • Initialization (alternative)
        /* leftCell */
        leftCellArraySize = 100;
        leftCellArray = malloc(sizeof(float) * leftCellArraySize);
        newDimensions = 0;
        leftCellCopy = leftCell;
        while( leftCellCopy != NULL ) {
            if( newDimensions >= leftCellArraySize ) {
                leftCellArraySize *= 2;
                leftCellArray = realloc(leftCellArray, sizeof(float) * leftCellArraySize);
            }
            leftCellArray[newDimensions] = leftCellCopy->Value;
            ++newDimensions;
            leftCellCopy = leftCellCopy->ColNext;
        }
        /* right */
        rightArraySize = 100;
        rightArray = malloc(sizeof(float*) * rightArraySize);
        newDimensions = 0;
        leftCellCopy = leftCell;
        while( leftCellCopy != NULL ) {
            if( newDimensions >= rightArraySize ) {
                rightArraySize *= 2;
                rightArray = realloc(rightArray, sizeof(float*) * rightArraySize);
            }
            rightArray[newDimensions] = right[leftCellCopy->ColIndex];
            ++newDimensions;
            leftCellCopy = leftCellCopy->ColNext;
        }

  33. Annihilation (cont’d)
      • Transformed main loop (alternative)
        for( leftCellCounter = 0; leftCellCounter < newDimensions; ++leftCellCounter ) {
            result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * rightArray[leftCellCounter][col];
        }

  34. Post-initialization
      • Pre-initialization: before the main loop
      • Post-initialization: after the main loop
      • Needed when an expression that is being transformed is written to
      • Needs to use the index expression
      • Fill-in value not needed

  35. Post-initialization (cont’d)
      • Example
        while( node != NULL ) {
            node->Value = node->Value * 2;
            node = node->Next;
        }
      • Result
        /* pre-initialization */
        nodeArray = malloc(size * sizeof(int));
        nodeCopy = node;
        memset(nodeArray, 0, size * sizeof(int));
        while( nodeCopy != NULL ) {
            nodeArray[nodeCopy->Index] = nodeCopy->Value;
            nodeCopy = nodeCopy->Next;
        }
        /* main loop */
        for( nodeCounter = 0; nodeCounter < size; ++nodeCounter ) {
            nodeArray[nodeCounter] = nodeArray[nodeCounter] * 2;
        }
        /* post-initialization */
        nodeCopy = node;
        while( nodeCopy != NULL ) {
            nodeCopy->Value = nodeArray[nodeCopy->Index];
            nodeCopy = nodeCopy->Next;
        }
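
The fragment on this slide omits its declarations. A self-contained sketch of the same three phases follows, assuming a hypothetical `struct Node` with `Index`, `Value` and `Next` members (the slide uses these members but never shows the type) and a wrapper function `DoubleValues` introduced here for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type; the slide only implies these members. */
struct Node {
    int Index;   /* position of this node in the dense array */
    int Value;
    struct Node *Next;
};

/* Doubles every value in the list by way of a dense array.
   size must be larger than every Index in the list. */
void DoubleValues(struct Node *node, int size)
{
    struct Node *nodeCopy;
    int *nodeArray;
    int nodeCounter;

    assert(size > 0);
    nodeArray = malloc((size_t)size * sizeof(int));
    assert(nodeArray != NULL);

    /* Pre-initialization: copy the list contents into the array. */
    memset(nodeArray, 0, (size_t)size * sizeof(int));
    for( nodeCopy = node; nodeCopy != NULL; nodeCopy = nodeCopy->Next )
        nodeArray[nodeCopy->Index] = nodeCopy->Value;

    /* Main loop: no linked list references remain. */
    for( nodeCounter = 0; nodeCounter < size; ++nodeCounter )
        nodeArray[nodeCounter] = nodeArray[nodeCounter] * 2;

    /* Post-initialization: write the results back through the index. */
    for( nodeCopy = node; nodeCopy != NULL; nodeCopy = nodeCopy->Next )
        nodeCopy->Value = nodeArray[nodeCopy->Index];

    free(nodeArray);
}
```

Array slots that have no corresponding node are computed in the main loop but never copied back, which is why the fill-in value does not matter for the write-back.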

  36. Automated transformation
      • Seven steps
        • Find candidate structures
        • Analyze usage of these structures in the code
        • Determine transformation safety
        • Identify data members
        • Generate dense data structures
        • Transform
        • Loop extraction
      • Code must be normalized

  37. Conditions
      • The linked list expression must not have side effects
      • Loop termination control must be trivial
      • The linked list iteration statement must be the only statement in the loop body that modifies the linked list expression
      • The “next” pointer member may not be a data member
      • Any expression, other than the linked list iteration statement, that might be moved to an initialization loop must not have side effects and may use only constants, loop-invariant values, linked list members and loop control variables
      • If the linked list expression is guarded, it must be possible to move the entire guard, including both the true and false parts, to the initialization loop
      • When performing annihilation on a semi-dense loop, there must be a single guard that covers all statements in the loop body except for the linked list iteration statement and its guard, and statements related to loop control (such as those that increment the counter)

  38. Transformation directives
      • Fill in gaps in the compiler’s knowledge
      • Embedded in source code as comments
      • Examples
        • SAFE_CODE, UNSAFE_CODE
        • SAFE_LOOP, UNSAFE_LOOP
        • DENSE_INDEX
        • DENSE_DIMENSIONS
        • FILL_IN
        • Etc.

  39. Transformation directives (cont’d)
      • Example
        /***SAFE_CODE***/
        /***DENSE_INDEX(node, node->Index)***/
        /***DENSE_DIMENSION(node, size)***/
        while( node != NULL ) {
            node->Value = node->Value * 2;
            node = node->Next;
        }
        /***UNSAFE_CODE***/

  40. Experimentation
      • Tested on three matrices
      • Sublimation code: run through MT1
      • Annihilation code: loop interchange
      • Tools used: Intel C Compiler, Intel FORTRAN Compiler
      • Test system: dual Intel Xeon 3.06 GHz, 1 GB RAM

  41. Experimentation (cont’d)

  42. Experimentation (cont’d)

  43. Sparse code optimization
      Automatic transformation of linked list pointer structures
      Sven Groot
