Cache Tuning

Learn about cache concepts, memory hierarchy, cache coherence, and specific techniques for cache tuning in the Global Cyber Bridges program. Includes exercises and discussion.

rbenjamin

Presentation Transcript


  1. Cache Tuning – Global Cyber Bridges • Student: João Gabriel Gazolla • Professor: Dr. S. Masoud Sadjadi

  2. Sections • Cache Concepts • Locality • Cache Hit and Miss • Memory Hierarchy • Kinds of Cache • Cache Coherence • Specifics • Thrashing • Cache Exercises • Conclusion • Discussion

  3. Cache Concepts • CPU time required to perform an operation = clock cycles executing instructions + clock cycles waiting for memory • Example instructions: ADD A,B,C; MOVE B,A; MUL A,B,C

  4. Cache Concepts • The CPU cannot perform useful work while it is waiting for data to arrive from memory.

  5. Cache Concepts • The memory system is a major factor in the performance of your program, and a large part of that is how well your program uses the cache.

  8. Cache Concepts • Other Comments:

  9. Interleaving • The memory bank cycle time is 4-8 times the CPU clock cycle time. • If several banks can be accessed in parallel, the problem is solved: more data is fetched at once and then put together. • Sequential elements are stored together (Fortran style):

  10. Temporal Locality • "When an item is referenced, it will be referenced again soon." • 90% of the time is spent in 10% of the code: cache it!
      #include <iostream>
      ...
      int main() {
          long long a = 0;  // running sum; it grows past the range of a 32-bit int
          for (int i = 0; i < 987654; i++) {
              a = a + i;  // a and i are referenced again on every iteration
              std::cout << a << std::endl;
          }
          return 0;
      }

  11. Spatial Locality • "When an item is referenced, items whose addresses are nearby will tend to be referenced soon." • When data item N is fetched, also fetch N+1, N+2, N+3, N+4, but not too many more.

  12. Cache Hit • Maximize it! • What is Cache Hit Rate?

  13. Cache Miss • Minimize it! • What is Cache Miss Rate? • What is Cache Miss Penalty?
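The questions on the two slides above are usually answered with the standard textbook definitions: the hit rate is the fraction of accesses found in the cache, the miss rate is one minus the hit rate, and the miss penalty is the extra time needed to fetch the data from the next level of the hierarchy. A minimal sketch of how these combine into an average access time follows; the function names are illustrative and do not come from the slides.

```cpp
#include <cstdint>

// Fraction of memory accesses that were found in the cache.
double hit_rate(std::uint64_t hits, std::uint64_t accesses) {
    return accesses == 0 ? 0.0
                         : static_cast<double>(hits) / static_cast<double>(accesses);
}

// Fraction of memory accesses that had to go to the next memory level.
double miss_rate(std::uint64_t hits, std::uint64_t accesses) {
    return accesses == 0 ? 0.0 : 1.0 - hit_rate(hits, accesses);
}

// Average access time = hit time + miss rate * miss penalty.
// The result is in the same unit as hit_time and miss_penalty (for example, cycles).
double average_access_time(double hit_time, double miss_rate_value, double miss_penalty) {
    return hit_time + miss_rate_value * miss_penalty;
}
```

With hypothetical numbers (a 1-cycle hit time, a miss rate of 0.25, and a 100-cycle miss penalty), `average_access_time(1.0, 0.25, 100.0)` gives 26 cycles, which is why even a modest miss rate dominates the cost.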

  14. Memory Hierarchy • Sizes: 1 KByte = 1024 Bytes • 1 MByte = 1024 KBytes • 1 GByte = 1024 MBytes

  15. There are 3 kinds of cache: • Direct mapped cache • Set associative cache • Fully associative cache

  16. Direct Mapped Cache • How does it work? Each memory block can go in exactly one cache line, chosen with the MOD operation (block number MOD number of cache lines).
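The MOD mapping can be sketched as a toy model that only tracks which memory block each line currently holds. The class name, the block size parameter, and the choice to store the whole block number instead of a tag are assumptions made for illustration, not details from the slides.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy direct mapped cache: records which memory block each line holds (no data).
class DirectMappedCache {
public:
    DirectMappedCache(std::size_t num_lines, std::size_t block_bytes)
        : block_bytes_(block_bytes), valid_(num_lines, false), block_(num_lines, 0) {}

    // The only line that may hold this address: block number MOD number of lines.
    std::size_t line_index(std::uint64_t address) const {
        return static_cast<std::size_t>((address / block_bytes_) % valid_.size());
    }

    // Returns true on a hit. On a miss, the block in that line is replaced.
    bool access(std::uint64_t address) {
        const std::uint64_t block = address / block_bytes_;
        const std::size_t line = line_index(address);
        if (valid_[line] && block_[line] == block) {
            return true;
        }
        valid_[line] = true;
        block_[line] = block;
        return false;
    }

private:
    std::uint64_t block_bytes_;
    std::vector<bool> valid_;
    std::vector<std::uint64_t> block_;
};
```

With 4 lines of 16 bytes, addresses 0 and 64 both map to line 0, so a program that alternates between them misses on every access even though the other three lines stay empty.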

  17. Thrashing • A process does not have enough pages • The page-fault rate becomes extremely high • CPU usage drops • The system reacts by increasing the degree of multiprogramming, which leaves each process with even fewer pages

  18. Fully Associative Cache

  19. Set Associative Cache • This is a trade-off between a direct mapped cache and a fully associative cache.

  20. Cache Block Replacement • direct mapped cache

  21. Cache Block Replacement • set associative cache • Replacement policies: FIFO, Random, LRU • LRU follows from temporal locality: "When an item is referenced, it will be referenced again soon"
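Of the three policies, LRU is the one that needs bookkeeping. A minimal sketch of a single set with LRU replacement is shown here, assuming a software model built from a linked list and a hash map; real hardware approximates this with a few bits per line, and the class name is illustrative.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

// One set of a set associative cache with LRU replacement (software model).
class LruSet {
public:
    explicit LruSet(std::size_t ways) : ways_(ways) {}

    // Returns true on a hit. On a miss in a full set, the least recently
    // used block is evicted to make room for the new one.
    bool access(std::uint64_t block) {
        if (ways_ == 0) {
            return false;  // a set with no ways can never hit
        }
        auto it = where_.find(block);
        if (it != where_.end()) {
            // Hit: move the block to the front (most recently used position).
            order_.splice(order_.begin(), order_, it->second);
            return true;
        }
        if (order_.size() == ways_) {
            where_.erase(order_.back());  // back of the list = least recently used
            order_.pop_back();
        }
        order_.push_front(block);
        where_[block] = order_.begin();
        return false;
    }

private:
    std::size_t ways_;
    std::list<std::uint64_t> order_;  // front = most recent, back = least recent
    std::unordered_map<std::uint64_t, std::list<std::uint64_t>::iterator> where_;
};
```

In a 2-way set, the access sequence 1, 2, 1, 3 evicts block 2 rather than block 1, because block 1 was referenced more recently.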

  22. Cache Specifics • Machines: Itanium, SGI Origin 2000, Pentium III • Cache Size • Replacement • Access Time • Commands to Measure Performance • For the specifics of each technology, go to: tinyurl.com/gcbcache2

  23. Cache Coherence • Data A in memory can be cached in several places at once: Copy 1 of Data A, Copy 2 of Data A, Copy 3 of Data A

  24. Cache Coherence: Snoop Protocol • Processors P1, P2, P3, ..., PN share MEMORY • When one processor writes to line 4, the copies of line 4 held by the other processors are not valid any more
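The invalidation step on the snoop slide can be sketched as a toy write-invalidate model that keeps only one valid bit per processor and line. The class and method names are hypothetical, and the model ignores data transfer and bus arbitration.

```cpp
#include <cstddef>
#include <vector>

// Toy write-invalidate snooping model: one valid bit per (processor, line).
class SnoopBus {
public:
    SnoopBus(std::size_t num_processors, std::size_t num_lines)
        : valid_(num_processors, std::vector<bool>(num_lines, false)) {}

    // Processor p reads a line and ends up holding a valid copy of it.
    void read(std::size_t p, std::size_t line) { valid_[p][line] = true; }

    // Processor p writes a line. Every other cache sees the write on the
    // shared bus and marks its own copy of that line as invalid.
    void write(std::size_t p, std::size_t line) {
        for (std::size_t q = 0; q < valid_.size(); ++q) {
            valid_[q][line] = (q == p);
        }
    }

    bool has_valid_copy(std::size_t p, std::size_t line) const {
        return valid_[p][line];
    }

private:
    std::vector<std::vector<bool>> valid_;
};
```

If three processors have read line 4 and processor 0 then writes it, only processor 0 still holds a valid copy; the others must read the line again before using it.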

  25. Cache Coherence: Directory Based Protocol • Directory Based Protocol: cache lines contain extra bits that indicate which other processors have a copy of that cache line, and the status of the cache line: clean (the cache line does not need to be sent back to main memory) or dirty (main memory must be updated with the content of the cache line). • Hardware Cache Coherence: cache coherence on the Origin computer is maintained in hardware, transparent to the programmer.
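The extra bits described on the directory slide could be modeled roughly as follows. This is only a sketch: the 64-processor limit, the struct name, and the method names are assumptions, and a real protocol would also write back a dirty line before letting another processor read it.

```cpp
#include <bitset>
#include <cstddef>

constexpr std::size_t kMaxProcessors = 64;  // assumed limit for this sketch

// Extra bits kept for one cache line by a directory based protocol.
struct DirectoryEntry {
    std::bitset<kMaxProcessors> sharers;  // bit p is set if processor p has a copy
    bool dirty = false;                   // true if main memory is out of date

    // Processor p obtains a copy of the line.
    void record_read(std::size_t p) { sharers.set(p); }

    // Processor p writes the line: it becomes the only holder, and the line is dirty.
    void record_write(std::size_t p) {
        sharers.reset();
        sharers.set(p);
        dirty = true;
    }

    // The line is written back to main memory, so it is clean again.
    void record_write_back() { dirty = false; }
};
```

Unlike snooping, nothing is broadcast here: the entry says exactly which processors need to be told about a write.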

  26. Cache Coherence: False Sharing
      struct foo {
          volatile int x;
          volatile int y;  // adjacent to x, so both typically sit in the same cache line
      };
      foo f;
      // Only reads f.x.
      int sum_a() {
          int s = 0;
          for (int i = 0; i < 1000000; ++i)
              s += f.x;
          return s;
      }
      // Only writes f.y.
      void inc_b() {
          for (int i = 0; i < 1000000; ++i)
              ++f.y;
      }
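In the code above, sum_a and inc_b never touch the same variable, yet when they run on different processors every write to y can invalidate the cache line that also holds x. One common remedy, sketched here, is to place each field on its own cache line. The 64-byte line size and the names ending in `_padded` are assumptions for illustration; the actual line size depends on the hardware.

```cpp
#include <cstddef>

// Assumed cache line size; 64 bytes is common but hardware-specific.
constexpr std::size_t kCacheLineBytes = 64;

// Same fields as foo, but each field starts on its own cache line.
struct padded_foo {
    alignas(kCacheLineBytes) volatile int x;
    alignas(kCacheLineBytes) volatile int y;
};

padded_foo pf;

// Only reads pf.x, whose cache line is no longer written by inc_b_padded.
int sum_a_padded() {
    int s = 0;
    for (int i = 0; i < 1000000; ++i) {
        s += pf.x;
    }
    return s;
}

// Only writes pf.y, which now lives on a different cache line than pf.x.
void inc_b_padded() {
    for (int i = 0; i < 1000000; ++i) {
        pf.y = pf.y + 1;
    }
}
```

The cost is memory: the struct grows from 8 bytes to two full cache lines, so this trade is usually reserved for data that different threads update frequently.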

  27. Cache Exercises • Examples of Locality:
      sum = 0;
      for (i = 0; i < n; i++)
          sum += a[i];
      return sum;
      • Data • Access to the array elements in series: Spatial • Reference to sum in each iteration: Temporal
      • Instruction • Instructions executed in sequence: Spatial • Repeatedly cycling through the loop: Temporal

  28. Cache Exercises • Does this function have good locality?
      int sumarrayrows(int a[M][N]) {
          int i, j, sum = 0;
          for (i = 0; i < M; i++)        /* outer loop over rows */
              for (j = 0; j < N; j++)    /* inner loop over columns */
                  sum += a[i][j];
          return sum;
      }
      [Figure: 4 × 4 example array with 1s on the diagonal and 0s elsewhere]

  29. Cache Exercises • Does this function have good locality?
      int sumarraycols(int a[M][N]) {
          int i, j, sum = 0;
          for (j = 0; j < N; j++)        /* outer loop over columns */
              for (i = 0; i < M; i++)    /* inner loop over rows */
                  sum += a[i][j];
          return sum;
      }
      [Figure: 4 × 4 example array with 1s on the diagonal and 0s elsewhere]
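One way to approach the two exercises is to list the order in which each loop nest touches memory. C stores a[M][N] row by row, so element a[i][j] sits at offset i*N + j from the start of the array. The helper functions below are hypothetical and only record those offsets, counted in elements.

```cpp
#include <cstddef>
#include <vector>

// Offsets (in elements) visited by the row-by-row loop nest of sumarrayrows.
std::vector<std::size_t> row_order(std::size_t M, std::size_t N) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < M; i++)
        for (std::size_t j = 0; j < N; j++)
            offsets.push_back(i * N + j);
    return offsets;
}

// Offsets (in elements) visited by the column-by-column loop nest of sumarraycols.
std::vector<std::size_t> column_order(std::size_t M, std::size_t N) {
    std::vector<std::size_t> offsets;
    for (std::size_t j = 0; j < N; j++)
        for (std::size_t i = 0; i < M; i++)
            offsets.push_back(i * N + j);
    return offsets;
}
```

For a 2 × 3 array the row order visits offsets 0, 1, 2, 3, 4, 5 (stride 1, good spatial locality), while the column order visits 0, 3, 1, 4, 2, 5 (stride N inside the inner loop). With a large N, each access in the column version tends to land on a different cache line.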

  30. Conclusions

  31. Sources • Slides prepared from the CI-Tutor Courses at NCSA by S. Masoud Sadjadi • Simone Martins. Memória Cache (Cache Memory), 2008. • Wikipedia • www.ariadne.ac.uk • parasol.tamu.edu/~rwerger/Courses/654/cachecoherence1.pdf • www.cs.unc.edu/~montek/teaching/fall-05/lectures/lecture-16.ppt • http://www.ic.uff.br/~simone/sistemascomp/ • David A. Patterson and John L. Hennessy. Organização e Projeto de Computadores: A Interface Hardware/Software. LTC, 2000. Book page in English.

  32. Sources • Randal E. Bryant and David R. O'Hallaron. Computer Systems: A Programmer's Perspective. Prentice Hall, 2002. Book page. • Many Google Image queries

  33. Questions? Comments? Extras? • Download of the Presentation: • www.gabrielgazolla.com/gcbCT.zip
