Arch Explorer: Lecture 5. John Cavazos, Dept of Computer & Information Sciences, University of Delaware. www.cis.udel.edu/~cavazos/cisc879
Motivation: Need for systematic quantitative comparison [MICRO 2004, Gracia-Pérez et al.]
Computer Arch Research • Idea • Reproduction of existing techniques • Fair comparison • Exploration
Design space exploration: AUTOMATIC EXPLORATION. Need more than intuition and experience! • Multi-objectives: execution time, power, area • Time-to-market
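With several objectives (execution time, power, area), there is usually no single best design, only a set of non-dominated trade-offs. The following is a minimal sketch of Pareto filtering, assuming all three objectives are minimized; the `DesignPoint` type and the numbers are hypothetical and are not taken from the lecture.

```python
from typing import NamedTuple


class DesignPoint(NamedTuple):
    # Hypothetical design point; all three objectives are minimized.
    name: str
    exec_time: float
    power: float
    area: float


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    a_obj, b_obj = a[1:], b[1:]
    return all(x <= y for x, y in zip(a_obj, b_obj)) and any(
        x < y for x, y in zip(a_obj, b_obj)
    )


def pareto_front(points: list[DesignPoint]) -> list[DesignPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative numbers only, not measurements.
designs = [
    DesignPoint("A", exec_time=1.0, power=2.0, area=3.0),
    DesignPoint("B", exec_time=1.2, power=1.5, area=3.0),
    DesignPoint("C", exec_time=1.3, power=2.5, area=3.5),  # dominated by A and B
]
front_names = [p.name for p in pareto_front(designs)]
print(front_names)
```

The quadratic scan is fine for a few thousand points; an automatic explorer over a larger space would likely want an incremental front instead.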
ArchExplorer (archexplorer.org): FULLY AUTOMATIC • Website: upload, test, pick design points • Server-side infrastructure: database, simulation cluster, add results, daily update
How to compare? • Custom simulator • Hardware compatibility • Software compatibility • Upload [Figure: a custom simulator (pipeline stages F, D, S, EX, WB, CM, SM, M, with caches ($), TLBs and MEM) becomes a wrapped simulator with parameter ranges, built from CPU, BP, IL1, DL1, L2 and MEM blocks]
Hardware compatibility • Instruction caches • Data caches • Branch predictors • Interconnects • Main memory • Accelerators • ...
Software compatibility: Isolate the hardware block, possibly by moving from centralized control to distributed control
Software compatibility • Wrapping in the SystemC-based UNISIM communication layer • Models of computation • Self-configuration and parameter legality
Case study: Memory sub-system for embedded processor • PowerPC405 • 8 different cache modules available • Complex hierarchies automatically explored • Ranking designs for performance, power, energy, area,... Cache modules: Victim Cache; Timekeeping Victim Cache; Stride Prefetcher; Content-Directed Prefetcher; Stride + Content-Directed Prefetcher; Tag Prefetcher; Global History Prefetcher; Skewed Associative Cache
Accurate comparison needs compiler tuning as well [Figure: P1 vs. P2 at baseline, tuned to P1, and tuned to P2; plot labels: 2.62, 1.23, 1.09, "P1 < P2", "P1 > P2"]
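One way to read this slide is that a ranking of two design points can flip once the compiler is tuned for each of them separately. The sketch below illustrates that effect under stated assumptions: the flag-set names and all speedup values are hypothetical and are not the numbers from the slide.

```python
# Hypothetical speedups over a common baseline: speedups[design][flag_set].
# The numbers are illustrative only and are not taken from the slides.
speedups = {
    "P1": {"default": 1.10, "tuned_for_P1": 1.40, "tuned_for_P2": 1.05},
    "P2": {"default": 1.20, "tuned_for_P1": 1.00, "tuned_for_P2": 1.30},
}


def rank(designs, flag_of):
    """Rank designs by speedup; flag_of(design) picks the flag set used for it."""
    return sorted(designs, key=lambda d: speedups[d][flag_of(d)], reverse=True)


def best_flags(design):
    """Flag set giving the highest speedup for this design."""
    return max(speedups[design], key=speedups[design].get)


# Same flags for everyone vs. each design compiled with its own best flags.
untuned = rank(speedups, lambda d: "default")
tuned = rank(speedups, best_flags)
print(untuned, tuned)
```

With these made-up values the untuned ranking favors P2 while the per-design tuned ranking favors P1, which is the kind of reversal that makes compiler tuning part of a fair comparison.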
Best data cache mechanisms per area. CONCLUSIONS: Contrast to Gracia-Pérez et al. [MICRO 2004] • No clear winner • Close to tuned parametric cache
Check out this website: ARCHEXPLORER.ORG
Conclusion • Permanent open competition(s) • Future: • superscalar processor • branch predictor repository • multi-cores • Open for your ideas! • NoC, compiler extensions,...
Statistical Exploration: Genetic Search Algorithm • Permanently ranks all designs • per area bucket • speedup or power • assigning higher probability to better points • Picking a point according to distribution • Mutations & crossover • Natural selection • Convergence [Figure: CPU, BP, caches ($) and MEM blocks] Veerle Desmet – Sylvain Girbal – Olivier Temam, 6th HiPEAC Industrial Workshop – Thales, Nov 26th, 2008
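The bullets above can be sketched as a small genetic search. This is a minimal illustration, not ArchExplorer's implementation: the parameter space `SPACE`, the `area_bucket` and `speedup` stand-ins, and the rank-based weighting are all assumptions made for the example. A real setup would replace `speedup` with a simulator run.

```python
import random

# Hypothetical parameter space for a cache design point.
SPACE = {"size_kb": [4, 8, 16, 32], "assoc": [1, 2, 4, 8], "line_b": [16, 32, 64]}


def random_point(rng):
    return {k: rng.choice(v) for k, v in SPACE.items()}


def area_bucket(point):
    # Stand-in area model (assumption): bucket by cache size only.
    return point["size_kb"]


def speedup(point):
    # Stand-in for a simulation result (assumption), not a real performance model.
    return 1.0 + 0.01 * point["size_kb"] + 0.02 * point["assoc"]


def pick_parent(population, rng):
    """Rank designs within each area bucket, then sample with rank-based weights."""
    buckets = {}
    for p in population:
        buckets.setdefault(area_bucket(p), []).append(p)
    candidates, weights = [], []
    for members in buckets.values():
        ranked = sorted(members, key=speedup)  # worst first
        for rank, p in enumerate(ranked, start=1):
            candidates.append(p)
            weights.append(rank)  # better rank in its bucket -> higher probability
    return rng.choices(candidates, weights=weights, k=1)[0]


def crossover(a, b, rng):
    """Take each parameter from one of the two parents at random."""
    return {k: rng.choice([a[k], b[k]]) for k in SPACE}


def mutate(point, rng, rate=0.1):
    """Re-draw each parameter with probability `rate`."""
    return {k: rng.choice(SPACE[k]) if rng.random() < rate else v for k, v in point.items()}


def evolve(generations=20, size=12, seed=0):
    rng = random.Random(seed)
    population = [random_point(rng) for _ in range(size)]
    for _ in range(generations):
        children = [
            mutate(crossover(pick_parent(population, rng), pick_parent(population, rng), rng), rng)
            for _ in range(size)
        ]
        # Natural selection: keep the best `size` of parents and children.
        population = sorted(population + children, key=speedup, reverse=True)[:size]
    return population


final = evolve()
print(max(final, key=speedup))
```

For brevity the survivor selection here is global, which lets large designs crowd out small ones; keeping survivors per area bucket would stay closer to the per-bucket ranking the slide describes.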
Features for Systematic DSE (http://unisim.org) • Standardized Interfaces • Module Repository: WB$, NBWB$, VC$, SP$, TVC$, CDP$, CDPSP$, TagP$, GHB$, BUS, DRAM • Compatibility Database: module category, module interfaces, known models • Parameter Check: configuration validity, ranges, params. relationship (e.g. DRAM nBanks {2;4;8}, tRAS+tCD<tRCD) • Parameter Introspection: probing neighbors' parameters • Compiler Flag Database (PPC, ARM): machine description, compiler flags, benchmarks, datasets, configs • Design Space Exploration: module exploration, module parameter tuning, compiler exploration, predictive modeling, focused search algorithm, selection probability, fast convergence
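The parameter check on this slide combines per-parameter ranges with relationships between parameters. The following is a minimal sketch of such a legality check, assuming a plain dictionary configuration. Only `nBanks {2;4;8}` and the relation `tRAS+tCD<tRCD` come from the slide (the relation is reproduced as written there); the timing ranges and the function names are hypothetical.

```python
from itertools import product

# Allowed values per parameter. nBanks {2;4;8} is from the slide; the timing
# ranges are hypothetical placeholders.
RANGES = {
    "nBanks": [2, 4, 8],
    "tRAS": [1, 2, 3],
    "tCD": [1, 2],
    "tRCD": [3, 4, 5],
}

# Relationships between parameters; this one is reproduced as written on the slide.
RELATIONS = [
    ("tRAS+tCD<tRCD", lambda c: c["tRAS"] + c["tCD"] < c["tRCD"]),
]


def violations(config):
    """Return the reasons why config is illegal (empty list if it is legal)."""
    problems = [
        f"{name}={config[name]} not in {allowed}"
        for name, allowed in RANGES.items()
        if config[name] not in allowed
    ]
    problems += [label for label, holds in RELATIONS if not holds(config)]
    return problems


def legal_configs():
    """Enumerate every configuration that passes both the range and relation checks."""
    names = list(RANGES)
    for values in product(*RANGES.values()):
        config = dict(zip(names, values))
        if not violations(config):
            yield config


legal = list(legal_configs())
print(len(legal), "legal configurations")
```

Filtering illegal points before simulation keeps the search from spending cluster time on configurations that could never be built.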