ECE/CS 752: Midterm 2 Review
Instructor: Mikko H. Lipasti
Spring 2012, University of Wisconsin-Madison
Midterm 2 Review Topics
• Replacement policies (Lect. 11)
• Advanced Caches (Lect. 14)
• Prefetching (Lect. 15)
• Main Memory (Lect. 16)
• Advanced Microarchitecture (Lect. 17)
• Multiple Threads (Lect. 18)
Replacement Policies
• Replacement policies affect capacity and conflict misses
• Policies covered:
  • Belady's optimal replacement
  • Least-recently used (LRU)
  • Practical pseudo-LRU (tree LRU)
  • Protected LRU
  • LIP/DIP variant
  • Set dueling to dynamically select policy
  • Not-recently-used (NRU) or clock algorithm
  • RRIP (re-reference interval prediction)
  • Least frequently used (LFU)
• Contest results
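As a review aid, the gap between LRU and Belady's optimal replacement can be seen on a tiny trace. The sketch below is not from the lecture: it assumes a single fully associative set, and the function names `lru_misses` and `belady_misses` and the example trace are made up for illustration.

```python
from collections import OrderedDict


def lru_misses(trace, capacity):
    """Count misses for LRU on one fully associative set (illustrative model)."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return misses


def belady_misses(trace, capacity):
    """Count misses for Belady's policy: evict the block reused farthest ahead."""
    cache = set()
    misses = 0
    for i, block in enumerate(trace):
        if block in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            def next_use(candidate):
                # Index of the next reference to candidate, or infinity if none.
                for j in range(i + 1, len(trace)):
                    if trace[j] == candidate:
                        return j
                return float("inf")

            cache.remove(max(cache, key=next_use))
        cache.add(block)
    return misses


# A cyclic pattern over 3 blocks with room for only 2 defeats LRU entirely.
trace = [0, 1, 2, 0, 1, 2, 0, 1, 2]
print(lru_misses(trace, 2), belady_misses(trace, 2))  # prints "9 6"
```

Cyclic working sets slightly larger than the cache are the kind of pattern that insertion-policy variants such as LIP/DIP are meant to handle better than plain LRU.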
Advanced Caches
• Coherent Memory Interface
• Evaluation methods
• Better miss rate: skewed associative caches, victim caches
• Reducing miss costs through software restructuring
• Higher bandwidth: lock-up free caches, superscalar caches
• Beyond simple blocks
• Two level caches
Prefetching
• Prefetching anticipates future memory references
• Software prefetching
• Next-block, stride prefetching
• Markov, Global history buffer prefetching
• Issues/challenges
  • Accuracy
  • Timeliness
  • Overhead (bandwidth)
  • Conflicts (displace useful data)
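Stride prefetching from the list above can be sketched as a table indexed by load PC. This is a minimal model under assumptions not stated on the slide: one entry per PC, an unbounded table, and a prefetch issued only after the same nonzero stride is seen twice in a row. The class name `StridePrefetcher` and its `access` method are hypothetical.

```python
class StridePrefetcher:
    """Toy PC-indexed stride prefetcher (illustrative, not a real design)."""

    def __init__(self):
        # pc -> (last address, last observed stride)
        self.table = {}

    def access(self, pc, addr):
        """Record a load and return the address to prefetch, or None."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = (addr, 0)  # first sighting: no stride yet
            return None
        last_addr, last_stride = entry
        stride = addr - last_addr
        self.table[pc] = (addr, stride)
        # Prefetch only when the same nonzero stride repeats (accuracy filter).
        if stride != 0 and stride == last_stride:
            return addr + stride
        return None


pf = StridePrefetcher()
for a in (100, 164, 228, 292, 1000):
    print(a, pf.access(0x400, a))
```

On this trace the third and fourth loads trigger prefetches of 292 and 356, and the irregular jump to 1000 suppresses prefetching again. Requiring a repeated stride trades timeliness for accuracy, which is the tension the issues list names.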
Main Memory
• DRAM chips
• Memory organization
  • Interleaving
  • Banking
• Memory controller design
• Hybrid Memory Cube
• Phase Change Memory (reading)
• Virtual memory
• TLBs
• Interaction of caches and virtual memory (Baer et al.)
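The effect of interleaving on bank selection can be illustrated with two address mappings. The slide does not specify a mapping, so the functions below, the 64-byte block size, the 4 banks, and the 1024-byte bank size are all assumptions chosen for the example.

```python
def low_order_bank(addr, num_banks, block_bytes):
    """Low-order interleaving: consecutive blocks rotate across banks."""
    return (addr // block_bytes) % num_banks


def high_order_bank(addr, num_banks, bank_bytes):
    """High-order mapping: each bank holds one contiguous address region."""
    return (addr // bank_bytes) % num_banks


# Eight sequential 64-byte blocks, 4 banks (all values are hypothetical).
addrs = list(range(0, 512, 64))
print([low_order_bank(a, 4, 64) for a in addrs])     # [0, 1, 2, 3, 0, 1, 2, 3]
print([high_order_bank(a, 4, 1024) for a in addrs])  # [0, 0, 0, 0, 0, 0, 0, 0]
```

Under the low-order mapping a sequential stream touches every bank in turn, so accesses can overlap; under the high-order mapping the same stream serializes on a single bank.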
Adv. Memory - Readings
• Read on your own:
  • Review: Shen & Lipasti Chapter 3
  • W.-H. Wang, J.-L. Baer, and H. M. Levy. Organization of a two-level virtual-real cache hierarchy, Proc. 16th ISCA, pp. 140-148, June 1989 (B6).
  • D. Kroft. Lockup-Free Instruction Fetch/Prefetch Cache Organization, Proc. International Symposium on Computer Architecture, May 1981 (B6).
  • N.P. Jouppi. Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers, Proc. International Symposium on Computer Architecture, June 1990 (B6).
• Discuss in class:
  • Review due 3/24/2010: Benjamin C. Lee, Ping Zhou, Jun Yang, Youtao Zhang, Bo Zhao, Engin Ipek, Onur Mutlu, Doug Burger, "Phase-Change Technology and the Future of Main Memory," IEEE Micro, vol. 30, no. 1, pp. 143-143, Jan./Feb. 2010.
  • Read Sec. 1, skim Sec. 2, read Sec. 3: Bruce Jacob, "The Memory System: You Can't Avoid It, You Can't Ignore It, You Can't Fake It," Synthesis Lectures on Computer Architecture 2009 4:1, 1-77.
Adv. Microarchitecture
• Instruction scheduling overview
  • Scheduling atomicity
  • Speculative scheduling
  • Scheduling recovery
• Complexity-effective instruction scheduling techniques
  • CRIB reading
• Scalable load/store handling
  • NoSQ reading
• Building large instruction windows
  • Runahead, CFP, iCFP
• Control Independence
• 3D die stacking
Adv. Microarchitecture - Readings
• Read on your own:
  • Shen & Lipasti Chapter 10 on Advanced Register Data Flow (skim)
  • I. Kim and M. Lipasti, "Understanding Scheduling Replay Schemes," in Proceedings of the 10th International Symposium on High-performance Computer Architecture (HPCA-10), February 2004.
  • Srikanth Srinivasan, Ravi Rajwar, Haitham Akkary, Amit Gandhi, and Mike Upton, "Continual Flow Pipelines," in Proceedings of ASPLOS 2004, October 2004.
  • Ahmed S. Al-Zawawi, Vimal K. Reddy, Eric Rotenberg, Haitham H. Akkary, "Transparent Control Independence," in Proceedings of ISCA-34, 2007.
• To be discussed in class:
  • T. Sha, M. Martin, A. Roth, "NoSQ: Store-Load Communication without a Store Queue," in Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, 2006.
  • Erika Gunadi, Mikko Lipasti, "CRIB: Combined Rename, Issue, and Bypass," ISCA 2011.
  • Andrew Hilton, Amir Roth, "BOLT: Energy-efficient Out-of-Order Latency-Tolerant execution," Proceedings of HPCA 2010.
  • Loh, G. H., Xie, Y., and Black, B. 2007. Processor Design in 3D Die-Stacking Technologies. IEEE Micro 27, 3 (May 2007), 31-48.
Multiple Threads - Topics
• Thread-level parallelism
• Synchronization
• Multiprocessors
• Explicit multithreading
• Implicit multithreading: Multiscalar
• Niagara case study
Multiple Threads - Readings
• Read on your own:
  • Shen & Lipasti Chapter 11
  • G. S. Sohi, S. E. Breach and T.N. Vijaykumar. Multiscalar Processors, Proc. 22nd Annual International Symposium on Computer Architecture, June 1995.
  • Dean M. Tullsen, Susan J. Eggers, Joel S. Emer, Henry M. Levy, Jack L. Lo, and Rebecca L. Stamm. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor, Proc. 23rd Annual International Symposium on Computer Architecture, May 1996 (B5).
• To be discussed in class:
  • Poonacha Kongetira, Kathirgamar Aingaran, Kunle Olukotun, Niagara: A 32-Way Multithreaded Sparc Processor, IEEE Micro, March-April 2005, pp. 21-29.