
Modification of a Copy Function to Reduce Average Cycles Per Element Using the Y86 Processor

By Jake Coogle and Doris Marley.


Presentation Transcript


  1. Modification of a Copy Function to Reduce Average Cycles Per Element Using the Y86 Processor. By Jake Coogle and Doris Marley.

  2. Implementing iaddl • Most significant modification. • Replaced numerous instructions. • Lowered CPE (cycles per element) by 2.93.
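To illustrate why an add-immediate instruction removes instructions: in the base Y86 instruction set an ALU operation takes both operands from registers, so adding a constant first needs an irmovl to load that constant. The following is a minimal C model of that idea, not the authors' code; the `Cpu` struct and all function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of two Y86 registers plus an instruction counter. */
typedef struct {
    int32_t ebx;    /* pointer being advanced */
    int32_t edi;    /* scratch register for constants */
    int executed;   /* instructions issued so far */
} Cpu;

/* irmovl $v, %edi : load an immediate into the scratch register */
static void irmovl_edi(Cpu *c, int32_t v) { c->edi = v; c->executed++; }

/* addl %edi, %ebx : register-to-register add */
static void addl_edi_ebx(Cpu *c) { c->ebx += c->edi; c->executed++; }

/* iaddl $v, %ebx : add an immediate directly to a register */
static void iaddl_ebx(Cpu *c, int32_t v) { c->ebx += v; c->executed++; }

/* Advance a pointer by 4 without iaddl: two instructions, one scratch register. */
Cpu advance_without_iaddl(Cpu c) {
    irmovl_edi(&c, 4);
    addl_edi_ebx(&c);
    return c;
}

/* Advance a pointer by 4 with iaddl: one instruction, %edi left untouched. */
Cpu advance_with_iaddl(Cpu c) {
    iaddl_ebx(&c, 4);
    return c;
}
```

Both functions leave the same value in `ebx`; only the instruction count differs. A copy loop that updates several counters and pointers per element would save one instruction at each such update.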

  3. Better Branch Prediction • Second most significant modification. • Reordered code so that conditional jumps are taken more often. • Fewer mispredicted branches means less time spent recovering from mispredictions. • Some code had to be duplicated to preserve functionality. • Lowered CPE by 1.85.
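A rough way to see the effect of reordering is to count wasted cycles under an always-taken predictor. The sketch below assumes the textbook Y86 pipeline behaviour (conditional jumps predicted taken, two cycles lost per misprediction). Whether the authors' simulator matches this exactly is an assumption, and the function names are hypothetical.

```c
#include <stddef.h>

/* Assumed cost of one mispredicted branch on the textbook pipeline. */
enum { MISPREDICT_PENALTY = 2 };

/* Cycles lost by an always-taken predictor: every not-taken jump is a miss.
   taken[i] is 1 if the i-th execution of the jump was taken, else 0. */
int wasted_cycles(const int *taken, size_t n) {
    int wasted = 0;
    for (size_t i = 0; i < n; i++) {
        if (!taken[i]) wasted += MISPREDICT_PENALTY;
    }
    return wasted;
}

/* The same decisions after reordering the code so the jump condition is
   inverted: what used to fall through now jumps, and vice versa. */
int wasted_cycles_inverted(const int *taken, size_t n) {
    int wasted = 0;
    for (size_t i = 0; i < n; i++) {
        if (taken[i]) wasted += MISPREDICT_PENALTY;
    }
    return wasted;
}
```

For a jump that is taken on only 2 of 8 executions, the first layout loses 12 cycles and the inverted layout loses 4, which is the kind of saving the reordering aims for.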

  4. Eliminating Bubble • Third most significant modification (tied with the next one). • Two back-to-back memory accesses caused a bubble. • Remedied by inserting another instruction in between. • Eliminated one instruction in each loop iteration. • Lowered CPE by 1.0.
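The bubble described here is presumably a load/use hazard: a memory read immediately followed by an instruction that needs the loaded register, such as the store that copies the value. The C sketch below models only that rule. The `Insn` encoding, the register names, and the two example sequences are hypothetical, not taken from the authors' code.

```c
#include <stddef.h>

enum { NO_REG = -1, EBX = 0, ECX = 1, ESI = 2 };

typedef struct {
    int is_load;    /* 1 for a memory read (mrmovl) */
    int dst;        /* register written, or NO_REG */
    int src[2];     /* registers read, NO_REG if unused */
} Insn;

/* Counts load/use bubbles: a load directly followed by an instruction
   that reads the register the load writes. */
int count_load_use_bubbles(const Insn *code, size_t n) {
    int bubbles = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (!code[i].is_load || code[i].dst == NO_REG) continue;
        const Insn *next = &code[i + 1];
        if (next->src[0] == code[i].dst || next->src[1] == code[i].dst) {
            bubbles++;
        }
    }
    return bubbles;
}

/* Load, store, then pointer update: the store waits on the load. */
static const Insn back_to_back[] = {
    {1, ESI,    {EBX, NO_REG}},  /* mrmovl (%ebx), %esi */
    {0, NO_REG, {ESI, ECX}},     /* rmmovl %esi, (%ecx) */
    {0, EBX,    {EBX, NO_REG}},  /* iaddl $4, %ebx      */
};

/* Same work with the pointer update moved between load and store. */
static const Insn separated[] = {
    {1, ESI,    {EBX, NO_REG}},  /* mrmovl (%ebx), %esi */
    {0, EBX,    {EBX, NO_REG}},  /* iaddl $4, %ebx      */
    {0, NO_REG, {ESI, ECX}},     /* rmmovl %esi, (%ecx) */
};
```

The instruction moved into the gap must not read the loaded register itself, otherwise the stall simply moves with it. One bubble per element is consistent with the 1.0 CPE improvement reported on the slide.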

  5. Check If Positive • Tied with the previous modification. • An earlier instruction already sets the condition codes, so its result is used to decide whether to jump. • Eliminated one instruction in each loop iteration. • Lowered CPE by 1.0.
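As a sketch of why the extra test is redundant: if the instruction that decrements a counter already sets the condition codes, a following `andl` of the register with itself yields the same "greater than zero" decision. The model below assumes the check in question is the loop-counter test, which the slide does not state explicitly. The `Flags` type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool zf, sf, of; } Flags;

/* Condition codes after a + b, as an addl/iaddl would set them. */
static Flags flags_add(int32_t a, int32_t b, int32_t *out) {
    int32_t r = (int32_t)((uint32_t)a + (uint32_t)b);
    bool overflow = ((a < 0) == (b < 0)) && ((r < 0) != (a < 0));
    Flags f = { r == 0, r < 0, overflow };
    *out = r;
    return f;
}

/* Condition codes after andl x, x: the result is x and OF is cleared. */
static Flags flags_and_self(int32_t x) {
    Flags f = { x == 0, x < 0, false };
    return f;
}

/* jg is taken when the last result was greater than zero. */
static bool jg_taken(Flags f) { return f.sf == f.of && !f.zf; }

/* True when the jump decision is the same with and without the extra test. */
bool same_decision(int32_t len) {
    int32_t next;
    Flags after_dec  = flags_add(len, -1, &next);  /* iaddl $-1, %edx */
    Flags after_test = flags_and_self(next);       /* andl %edx, %edx */
    return jg_taken(after_dec) == jg_taken(after_test);
}
```

In this model the two decisions agree for every non-negative length; they can differ only when the decrement itself overflows, which a valid length never triggers.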

  6. From Count++ to Count-- • Fifth most significant modification. • Start the count at the length instead of 0. • Decrement for each element that is not positive. • Count register updated less frequently. • CPE lowered by 0.78.
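At the C level the change can be sketched as follows, assuming the copy function returns the number of positive elements it copied (an assumption based on the slide titles, not confirmed by the deck). Both function names are hypothetical.

```c
/* Baseline: count starts at 0 and is incremented for each positive value. */
int copy_count_up(const int *src, int *dst, int len) {
    int count = 0;
    for (int i = 0; i < len; i++) {
        dst[i] = src[i];
        if (src[i] > 0) count++;
    }
    return count;
}

/* Variant: count starts at len and is decremented for each value that is
   not positive, so positive values need no update at all. */
int copy_count_down(const int *src, int *dst, int len) {
    int count = len > 0 ? len : 0;
    for (int i = 0; i < len; i++) {
        dst[i] = src[i];
        if (src[i] <= 0) count--;
    }
    return count;
}
```

The two versions return the same result. The second updates the count less often only when positive values outnumber the rest, so the gain depends on the input data.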

  7. Implementing ileave • Least significant modification. • Replaced one instruction per function call. • Reduced CPE by 0.07.
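Assuming "ileave" refers to a `leave` instruction with the usual semantics (restore the stack pointer from the frame pointer, then pop the saved frame pointer), the saving can be modelled in C as below. The `Machine` struct, its toy memory, and the function names are hypothetical.

```c
#include <stdint.h>

/* Toy machine: byte addresses, 4-byte words stored in a small array. */
typedef struct {
    uint32_t esp, ebp;
    uint32_t mem[16];
    int executed;   /* instructions issued so far */
} Machine;

/* Epilogue without leave: rrmovl %ebp, %esp followed by popl %ebp. */
Machine epilogue_two_insns(Machine m) {
    m.esp = m.ebp;               /* rrmovl %ebp, %esp */
    m.executed++;
    m.ebp = m.mem[m.esp / 4];    /* popl %ebp */
    m.esp += 4;
    m.executed++;
    return m;
}

/* Epilogue with leave: the same state change as a single instruction. */
Machine epilogue_leave(Machine m) {
    uint32_t saved_ebp = m.mem[m.ebp / 4];
    m.esp = m.ebp + 4;
    m.ebp = saved_ebp;
    m.executed++;
    return m;
}
```

Because the epilogue runs once per call rather than once per element, the saved instruction is spread across the whole array, which fits the small 0.07 CPE gain.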

  8. CPE Reduction

  9. Total CPE reduction: 7.63. Average CPE reduced to 10.52.
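As a check on the totals, the per-modification reductions reported on slides 2 through 7 add up to the stated figure:

$$
2.93 + 1.85 + 1.00 + 1.00 + 0.78 + 0.07 = 7.63
$$

If every figure was measured on the same benchmark, this would imply a starting average CPE of about $10.52 + 7.63 = 18.15$; the deck does not state the baseline, so that value is an inference.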

  10. THE END
