Modification of a Copy Function to Reduce Average Cycles Per Element Using the Y86 Processor • By Jake Coogle and Doris Marley
Implementing iaddl • Most significant modification • Replaced numerous instructions • Lowered CPE by 2.93
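As a reminder of what the new instruction does, here is a minimal C sketch of `iaddl V, rB`: it adds a constant to a register and sets the condition codes, so one instruction can stand in for loading the constant into a scratch register and then adding it. The `cond_codes` struct and the `iaddl` function are hypothetical names for illustration, and the model assumes 32-bit two's-complement registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the three Y86 condition codes. */
typedef struct {
    bool zf; /* result was zero */
    bool sf; /* result was negative */
    bool of; /* signed overflow occurred */
} cond_codes;

/* Sketch of iaddl V, rB: returns rB + V and updates ZF, SF and OF. */
int32_t iaddl(int32_t v, int32_t rb, cond_codes *cc)
{
    assert(cc != NULL);
    /* Add as unsigned to avoid signed-overflow undefined behavior in C. */
    int32_t result = (int32_t)((uint32_t)rb + (uint32_t)v);
    cc->zf = (result == 0);
    cc->sf = (result < 0);
    /* Overflow: both operands share a sign that the result does not. */
    cc->of = ((v < 0) == (rb < 0)) && ((result < 0) != (rb < 0));
    return result;
}
```

For example, calling `iaddl(-1, 1, &cc)` returns 0 and sets `cc.zf`, which is the kind of result a loop-counter decrement produces on its last iteration.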
Better Branch Prediction • Second most significant modification. • Reordered code so that conditional jumps are taken more often. • Fewer mispredicted branches means fewer cycles spent on misprediction recovery. • Some code had to be duplicated to preserve functionality. • Lowered CPE by 1.85.
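The effect of making jumps "taken more often" can be sketched with a small cost model. The slides do not state the prediction strategy, so this assumes the textbook pipelined Y86 design, which predicts every conditional jump as taken and loses two cycles on each misprediction. The function name and the penalty constant are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed cost, in cycles, of recovering from one mispredicted branch. */
enum { MISPREDICT_PENALTY = 2 };

/* Cycles lost when every conditional jump is predicted taken.
   taken[i] records whether the i-th executed jump was actually taken. */
long mispredict_cycles(const bool *taken, size_t n)
{
    assert(taken != NULL || n == 0);
    long lost = 0;
    for (size_t i = 0; i < n; i++) {
        if (!taken[i]) {
            /* The "taken" prediction was wrong for this jump. */
            lost += MISPREDICT_PENALTY;
        }
    }
    return lost;
}
```

Under these assumptions, a jump that falls through three times out of four costs 6 cycles over four executions, while the same test inverted so that it is taken three times out of four costs 2.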
Eliminating Bubble • Next most significant modification. • Two back-to-back memory accesses caused a bubble. • Remedied by inserting another instruction in between. • Eliminated one instruction through each loop iteration. • Lowered CPE by 1.0.
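A rough way to see why separating the two memory accesses helps is to count bubbles in a toy instruction list. This sketch assumes the bubble was a load/use hazard (an instruction reading a register that the immediately preceding instruction loaded from memory), which the slides do not say explicitly. The `insn` record and register numbers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { NO_REG = -1 };

/* Hypothetical, simplified instruction record. */
typedef struct {
    int is_load; /* nonzero for a memory read such as mrmovl */
    int dst;     /* register written, or NO_REG */
    int src_a;   /* first register read, or NO_REG */
    int src_b;   /* second register read, or NO_REG */
} insn;

/* Count one bubble whenever an instruction reads the register
   loaded by the instruction immediately before it. */
int load_use_bubbles(const insn *code, size_t n)
{
    assert(code != NULL || n == 0);
    int bubbles = 0;
    for (size_t i = 1; i < n; i++) {
        const insn *prev = &code[i - 1];
        const insn *cur = &code[i];
        if (prev->is_load && prev->dst != NO_REG &&
            (cur->src_a == prev->dst || cur->src_b == prev->dst)) {
            bubbles++;
        }
    }
    return bubbles;
}
```

In this model, a load into register 0 followed directly by a store of register 0 counts one bubble. Moving an unrelated instruction between the two brings the count to zero without adding any work to the loop.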
Check If Positive • Tied with the previous modification in significance. • An earlier instruction already sets the condition codes, so the jump uses those codes instead of a separate test. • Eliminated one instruction through each loop iteration. • Lowered CPE by 1.0.
From Count++ to Count-- • Fifth most significant modification. • Start the count at length instead of 0. • Decrement when an element is negative. • Count register updated less frequently. • Lowered CPE by 0.78.
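The change of counting direction can be sketched in C. This assumes the copy function returns the number of positive elements it copied and that "negative" on the slide covers every non-positive value, zero included. Neither detail is stated in the slides, and both function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: copy src to dst, counting up for each positive element. */
int copy_count_up(const int *src, int *dst, int len)
{
    assert(len >= 0);
    int count = 0;
    for (int i = 0; i < len; i++) {
        int val = src[i];
        dst[i] = val;
        if (val > 0) {
            count++;
        }
    }
    return count;
}

/* Variant: start at len and count down for each non-positive element. */
int copy_count_down(const int *src, int *dst, int len)
{
    assert(len >= 0);
    int count = len;
    for (int i = 0; i < len; i++) {
        int val = src[i];
        dst[i] = val;
        if (val <= 0) {
            count--;
        }
    }
    return count;
}
```

Both versions return the same value. The second touches the count less often only when positive elements outnumber the rest, so the gain depends on the input data.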
Implementing ileave • Least significant modification. • Replaced one instruction per function call. • Reduced CPE by 0.07
Total CPE reduction: 7.63 • Average CPE reduced to 10.52
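The total can be checked by adding the six per-modification reductions. The sketch below works in hundredths of a cycle so the sum is exact. The function name is illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Sum CPE reductions given in hundredths of a cycle (293 means 2.93). */
int total_hundredths(const int *reductions, size_t n)
{
    assert(reductions != NULL || n == 0);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += reductions[i];
    }
    return total;
}
```

With the values from the slides (293, 185, 100, 100, 78, 7), the function returns 763, matching the stated 7.63. If the reductions are simply additive, the starting average CPE would have been 10.52 + 7.63 = 18.15, although the slides do not give that figure directly.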