Pitfalls of ORION-Based Simulation Mitchell Hayenga Daniel Johnson Mikko Lipasti
Background • ORION is the most widely used NoC estimation tool • Provides useful area/power estimations for on-chip networks • Utilized by 2/3 of all NoC papers (ISCA, HPCA, MICRO 2010-2011) • History • ORION 1.0 – MICRO 2002 • Initial power model • ORION 2.0 – DATE 2009 • Improved power modeling, updated technology nodes • Clock & Link • Area Models
Motivation • Surveying the literature • Large disagreement on area/power of interconnection networks • Area estimates differing by 10x • Power estimates differing by 5x • Seem to be 2 primary groups • ORION-based • Independent Models • ORION-based papers impacted by implicit assumptions/approximations/errors present in the underlying framework.
Router Power Estimates • Large differences between ORION and non-ORION estimates • [Chart: router power estimates, series labeled ORION and Other]
Overview • Component-wise investigation into ORION’s area/power models • Alternative tools and published designs used for validation • Observe ORION overestimating results • 5x buffer area • 1.4x buffer power • 30x crossbar area • Observe some model inconsistencies and potential pitfalls • All evaluations done at the 45nm tech node (ORION 2.0) • Investigation into how this may have impacted others’ results • Tradeoffs matter
SRAM Buffers • Evaluate SRAM-based buffers • Area and power • Utilized two memory compilers for validation • FabMem • A commercial memory compiler (unnamed due to a confidentiality agreement) • Limited evaluation to single-column SRAMs
Buffer Area 4.5x
Buffer Area Observations • ORION is modeling only data array area • Sense amp/precharge/decoder area missing • Overheads sizable for these small SRAMs • Area/power model is based upon the assumed bitcell size • sim_router_area.c:56-64 • 5.16x difference between ORION and FabMem SRAM cells
Buffer Area Model • BitCellHeight = RegHeight + 2 * WordLineSpacing • BitCellWidth = RegWidth + 2 * BitLineSpacing * (ReadPorts + WritePorts) • [Diagram: Reg Cell]
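The two bit-cell equations on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names, the `data_array_area` helper, and all numeric values are assumptions made here, not ORION's actual code or technology parameters.

```python
def bitcell_dimensions(reg_height, reg_width, wordline_spacing,
                       bitline_spacing, read_ports, write_ports):
    """Bit-cell height and width following the slide's two equations."""
    height = reg_height + 2 * wordline_spacing
    width = reg_width + 2 * bitline_spacing * (read_ports + write_ports)
    return height, width


def data_array_area(flit_width_bits, num_entries, cell_height, cell_width):
    """Hypothetical helper: area of the data array only.

    Assumes one bit cell per stored bit and no sense amp, precharge,
    or decoder overhead, mirroring the 'data array only' observation.
    """
    return flit_width_bits * num_entries * cell_height * cell_width


if __name__ == "__main__":
    # Placeholder dimensions in arbitrary units, not real 45nm values.
    h, w = bitcell_dimensions(reg_height=1.0, reg_width=2.0,
                              wordline_spacing=0.5, bitline_spacing=0.25,
                              read_ports=1, write_ports=1)
    print("bit cell:", h, "x", w)
    print("data array area:", data_array_area(128, 4, h, w))
```

Because the width term scales with the total port count, each added read or write port widens every cell in the array, so an oversized assumed bit cell propagates directly into the buffer area estimate.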
Buffer Power • Also compared power of associated SRAM structures • FabMem is not optimized for power • Future work • Precharge voltage • Body bias • Measured power required to Read+Write one flit • ORION estimates higher • 14% - 59% (geomean 40%)
Orion Crossbar Models • Two crossbar models • Tristate-based “Matrix” • Multiplexer-tree based design
Matrix Crossbar • Not addressed in the paper • Crossbars generally wire or logic dominated • This one appears to be neither
Alternative Crossbar • Similar design used in previous publication [Hayenga MICRO 2011] • Area determined by wire spacing • Similar to Balfour/Dally’s “Design Tradeoffs in Tiled CMP Networks” [ICS 2006]
Alternative Crossbar • [Diagram: inputs North0, East0, South0, West0, Local0, North1, East1, South1, West1, Local1; three Control signals; outputs Output0, Output1, Output2]
Crossbar Area 30x
Model Consistency • ORION’s multiplexer area model and power model compute area differently • Power Model • Assumes the structural crossbar mentioned earlier • Area Model • Back-of-the-envelope-like calculation • Questionable
Model Consistency • Power Model’s Area Calculation
Model Consistency • Area Model’s Calculation • Works in 2 phases • Calculates the number of MUXes needed to realize an N:1 MUX using a tree of multiplexers • Multiplies this area by a variety of constants, input widths, etc. to realize the full crossbar area • Supports 2-input and 4-input based MUX trees
Area Model Calculation • Depth = ceil(log_4(InputPorts)) • NumMux = (4^Depth - 1) / 3 • MuxArea = 1.5 * NumMux * AreaMux4 • CrossbarArea = FlitWidth * MuxArea * Outputs * Inputs
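As a rough sketch, the four steps of this calculation for the 4-input MUX tree can be written out in Python. The function names and the `area_mux4` value are assumptions chosen for illustration; this follows the formulas as written on the slide and is not taken from ORION's source.

```python
def mux_tree_depth(input_ports, fan_in=4):
    """Depth = ceil(log_fan_in(input_ports)), using integer arithmetic
    to avoid floating-point rounding at exact powers of the fan-in."""
    depth = 0
    capacity = 1
    while capacity < input_ports:
        capacity *= fan_in
        depth += 1
    return depth


def num_mux4(depth):
    """NumMux = (4^Depth - 1) / 3, the node count of a full 4-ary tree.

    The division is exact because (4^d - 1) / 3 = 1 + 4 + ... + 4^(d-1).
    """
    return (4 ** depth - 1) // 3


def crossbar_area(input_ports, output_ports, flit_width, area_mux4):
    """Crossbar area following the slide's formulas step by step."""
    depth = mux_tree_depth(input_ports)
    mux_area = 1.5 * num_mux4(depth) * area_mux4
    return flit_width * mux_area * output_ports * input_ports


if __name__ == "__main__":
    # Placeholder: area_mux4 = 1.0 in arbitrary units.
    print(crossbar_area(input_ports=5, output_ports=5,
                        flit_width=128, area_mux4=1.0))
```

For a 5-input router the depth rounds up to 2, so the model charges for a full 16:1 tree (5 MUX4 cells) per bit, and the final product then multiplies by the input count as well as the output count.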
Model Consistency 6.25x
Impacts on the Literature • Relative cost of buffers • Network topology • Router Radix • Bus vs Multi-hop
Impacts on the Literature – Buffer Modeling • Buffered vs. Bufferless designs • Non-ORION [Michelogiannakis/Dally NOCS 2010] • Best case results in an insignificant benefit • Often results in less performance and higher power • ORION • 54.9-73.4% reduction in network power [Fallin/Mutlu HPCA 2011] • 32% reduction in network power [Jafri/Hong/Vijaykumar MICRO 2010]
Impacts on the Literature – Network Design • Bus-based interconnects vs. conventional NoCs • Report enormous energy savings • 26x Reduction [Carpenter/Wu ISCA 2011] • 31x Reduction [Udipi/Balasubramonian HPCA 2010] • Correcting ORION’s overestimation would somewhat reduce these papers’ benefits
Conclusion • Detailed evaluation of ORION’s structural modeling • SRAM Buffers • Crossbar • Saw large overestimations for both area and power • Case studies of current literature • Hopefully some good insight from the rebuttal