HAsim
Michael Adler†, Artur Klauser†, Angshuman Parashar†, Michael Pellauer‡, Murali Vijayaraghavan‡, Joel Emer†‡
† VSSAD, Intel   ‡ CSAIL, MIT

Overview
• Goal
  • Produce compelling evidence for architecture ideas
• Requirements
  • Cycle-accurate simulation
  • Representative simulation length
  • Software development (often)
• Current approach
  • Mostly software simulation (10 kHz to 1 kHz)
• New approach
  • Build a performance model in an FPGA

FPGA-based approaches
• Prototyping
  • Build a logically isomorphic representation of the design
• Modeling
  • Build a performance simulation in gates
• Hybrids
  • Build something that is partially a prototype and partially a model

Recreate Asim in hardware
• Modularity
• Inter-module communication
• Functional/Timing Partitioning
• Modeling Utilities

Why modularity?
• Speed of model development
  • Shared components between products
  • Reuse across generations
• Encourages isomorphism to design
  • Improved fidelity
  • Facilitates speed/fidelity trade-offs
• Architectural experimentation
  • Factorial development and evaluations
  • Sharing

ASIM Module Hierarchy
[Diagram: module hierarchy; node labels S, B, C, M, N and pipeline stages F, D, R, X, C, W]

ASIM Module Selection
[Diagram: module hierarchy with selectable alternatives; node labels S, B, C, M, N and pipeline stages F, D, R, X, C, W]

Module Selection
[Diagram: selected modules assembled into a model; node labels S, B, C, M, N and pipeline stages F, D, R, X, C, W]

Module Replacement
[Diagram: one module in the hierarchy swapped for an alternative; node labels S, B, C, M, N, X and pipeline stages F, D, R, X, C, W]

(H)ASIM Module Hierarchy

Communication
[Diagram: communication among pipeline stages F, D, R, X, C, W and modules N, C]

Named connections
[Diagram: modules S and D joined by a named connection with endpoints A-out and A-in]

Model and FPGA Cycles
[Diagram: Module A and Module B, each with input and output ports]

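One way to read the port diagram is that modules exchange one message per model cycle, so a module can take a varying number of FPGA cycles to complete a single model cycle. The sketch below is a Python illustration, not HAsim code. It assumes a port behaves like a FIFO pre-filled with `latency` empty messages and that a module advances only when its input port holds a message. `Port`, `Producer`, `Consumer`, `try_step`, and the port name `a2b` are hypothetical.

```python
from collections import deque

class Port:
    """Point-to-point channel with a fixed latency in model cycles (assumed design)."""
    def __init__(self, name, latency):
        self.name = name
        # Pre-fill with `latency` empty messages so the receiver can start at once.
        self.queue = deque([None] * latency)

    def can_receive(self):
        return len(self.queue) > 0

    def send(self, msg):
        self.queue.append(msg)

    def receive(self):
        return self.queue.popleft()


class Producer:
    """Sends its current model-cycle number once per model cycle."""
    def __init__(self, out_port):
        self.out, self.model_cycle = out_port, 0

    def try_step(self):
        self.out.send(self.model_cycle)
        self.model_cycle += 1
        return True


class Consumer:
    """Advances one model cycle only when its input port holds a message."""
    def __init__(self, in_port):
        self.inp, self.model_cycle, self.seen = in_port, 0, []

    def try_step(self):
        if not self.inp.can_receive():
            return False  # stall: this host (FPGA) cycle does no model work
        self.seen.append(self.inp.receive())
        self.model_cycle += 1
        return True


port = Port("a2b", latency=1)
a, b = Producer(port), Consumer(port)

# Host cycles: the producer fires only on even host cycles, mimicking a
# module that needs two FPGA cycles per model cycle.
for host_cycle in range(6):
    b.try_step()
    if host_cycle % 2 == 0:
        a.try_step()
```

In this run the consumer stalls on two host cycles, yet the message sent in the producer's model cycle k is always received in the consumer's model cycle k+1, so the modeled latency does not depend on host timing.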
Functional/Timing Decomposition
[Diagram: the Timing Partition (micro-architecture) sends requests such as Fetch(PC) to the Functional Partition (ISA semantics, platform semantics), which returns results such as an Instruction]
• Simplifies timing model
• Amortize functional model design effort over many models
• Can be pipelined for performance
• Can be FPGA-friendly design
• Can be split across hardware and software

Execute@execute phases
• Fetch instruction
• Speculatively execute instruction
• Read memory*
• Speculatively write memory* (locally visible)
• Commit or Abort instruction
• Write memory* (globally visible)
* Optional depending on instruction type

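The phase list can be sketched as a small functional partition that a timing model drives one phase at a time. This is a Python illustration, not HAsim code: the two-operation toy ISA (`li`, `st`), the `FunctionalPartition` class, and its method names are assumptions made for the example. Speculative results stay attached to a token until commit, and abort discards them.

```python
class FunctionalPartition:
    """Toy functional model: executes instructions in phases, keyed by token."""
    def __init__(self, program, memory):
        self.program = program        # list of (op, target, value) tuples
        self.regs = {}                # committed register state
        self.memory = dict(memory)    # globally visible memory
        self.in_flight = {}           # token -> speculative per-instruction state
        self.next_token = 0

    def fetch(self, pc):
        token = self.next_token
        self.next_token += 1
        self.in_flight[token] = {"inst": self.program[pc],
                                 "reg_write": None, "mem_write": None}
        return token

    def execute(self, token):
        """Speculatively compute results; nothing is architecturally visible yet."""
        state = self.in_flight[token]
        op, target, value = state["inst"]
        if op == "li":                # load immediate into a register
            state["reg_write"] = (target, value)
        elif op == "st":              # store a constant to memory
            state["mem_write"] = (target, value)

    def commit(self, token):
        """Make the instruction's register result architectural."""
        state = self.in_flight[token]
        if state["reg_write"]:
            reg, value = state["reg_write"]
            self.regs[reg] = value

    def write_memory_global(self, token):
        """Final phase: make a committed store globally visible and retire the token."""
        state = self.in_flight.pop(token)
        if state["mem_write"]:
            addr, value = state["mem_write"]
            self.memory[addr] = value

    def abort(self, token):
        """Discard all speculative state for a mis-speculated instruction."""
        del self.in_flight[token]


fp = FunctionalPartition([("li", "r1", 7), ("st", 0x10, 99), ("st", 0x20, 5)],
                         memory={0x10: 0})
for pc in (0, 1):                     # two instructions run to completion
    t = fp.fetch(pc)
    fp.execute(t)
    fp.commit(t)
    fp.write_memory_global(t)
t = fp.fetch(2)                       # third instruction is squashed
fp.execute(t)
fp.abort(t)
```

The aborted store to `0x20` never reaches `fp.memory`, which is the property the commit/abort split is meant to provide.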
Execution in phases
[Diagram: example phase sequences]
  F D X C
  F D X R C
  F D X W C W
  F D X R A
  F D X X C W
Assertion: All data dependencies can be represented in these phases

HAsim: Partitioning Overview
[Diagram: the Timing Partition drives a Functional Partition with stages Token Gen, Fet, Dec, Exe, Mem, LCom, GCom; the Functional Partition holds Memory State and Register State (RegFile)]

Common Infrastructure
• Modules
• Inter-module communication
• Statistics gathering
• Event logging
• Debug Tracing
• Simulation control
• …

Bluespec (Asim-style) module
  module [HAsim_module] mkCache#() (Empty);
    // Requests arrive on a receive port; responses leave on a send port.
    Port#(Addr) req_port  <- mkRecvPort("a2cache");
    Port#(Bool) resp_port <- mkSendPort("cache2a");
    TagArray tagarray <- mkTagArray();

    rule cycle (True);
      Maybe#(Addr) mx = req_port.get();
      if (isValid(mx))
        resp_port.put(tagarray.lookup(validValue(mx)));
    endrule
  endmodule

Bluespec (Asim-style) submodule
  module mkTagArray(TagArray);
    RegFile#(Bit#(12), Bit#(4)) tagArray <- mkRegFileFull(...);

    // Helper functions are declared before the method that uses them.
    function Bit#(4) getTag(Address x);
      return x[15:12];
    endfunction

    function Bit#(12) getIndex(Address x);
      return x[11:0];
    endfunction

    method Bool lookup(Bit#(16) a);
      return (tagArray.sub(getIndex(a)) == getTag(a));
    endmethod
  endmodule

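The address split used by the tag array can be checked with a short worked example. The Python sketch below mirrors `getTag` (bits 15:12) and `getIndex` (bits 11:0) for a 16-bit address. The `fill` method is a hypothetical helper that is not on the slide, added only so the lookup has something to find.

```python
TAG_BITS = 4
INDEX_BITS = 12

def get_tag(addr):
    """Bits [15:12] of a 16-bit address."""
    return (addr >> INDEX_BITS) & ((1 << TAG_BITS) - 1)

def get_index(addr):
    """Bits [11:0] of a 16-bit address."""
    return addr & ((1 << INDEX_BITS) - 1)

class TagArray:
    """Direct-mapped tag store: one 4-bit tag per 12-bit index."""
    def __init__(self):
        self.tags = [0] * (1 << INDEX_BITS)

    def fill(self, addr):
        # Hypothetical helper (not in the slide): record that addr's line is present.
        self.tags[get_index(addr)] = get_tag(addr)

    def lookup(self, addr):
        return self.tags[get_index(addr)] == get_tag(addr)

tags = TagArray()
tags.fill(0xABCD)
hit = tags.lookup(0xABCD)    # same index, same tag
miss = tags.lookup(0x1BCD)   # same index, different tag
```

For `0xABCD` the tag is `0xA` and the index is `0xBCD`. Like the slide's version, this sketch has no valid bit, so an entry that was never filled matches any address whose tag is 0.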
Support functions - stats
  module mkCache#(...) (Empty);
    ...
    cache_hits <- mkStat(...);
    ...
    hit = tagarray.lookup(...);
    if (hit)
      cache_hits.increment();
    ...
  endmodule
[Diagram: three modules, each with a stat counter, and one stat dumper]

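The arrangement of per-module counters and a single dumper can be sketched as follows. This is a Python illustration under assumptions, not the HAsim statistics API: `StatDumper`, `Stat`, the `register` call, and the `cache_misses` counter are hypothetical names introduced for the example.

```python
class StatDumper:
    """Collects every counter so all statistics can be reported in one place."""
    def __init__(self):
        self.counters = []

    def register(self, counter):
        self.counters.append(counter)

    def dump(self):
        return {c.name: c.value for c in self.counters}


class Stat:
    """A named counter that registers itself with the dumper on creation."""
    def __init__(self, name, dumper):
        self.name, self.value = name, 0
        dumper.register(self)

    def increment(self):
        self.value += 1


class Cache:
    """Toy module: counts hits and misses against a fixed set of resident lines."""
    def __init__(self, dumper, resident):
        self.resident = set(resident)
        self.cache_hits = Stat("cache_hits", dumper)
        self.cache_misses = Stat("cache_misses", dumper)  # hypothetical extra stat

    def access(self, addr):
        hit = addr in self.resident
        (self.cache_hits if hit else self.cache_misses).increment()
        return hit


dumper = StatDumper()
cache = Cache(dumper, resident=[0x10, 0x20])
for addr in [0x10, 0x30, 0x20, 0x10]:
    cache.access(addr)
report = dumper.dump()
```

The module code only ever touches its own counters; collecting and reporting them is left to the shared dumper.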
2Dreams

Support functions - events
  module mkCache#(...) (Empty);
    ...
    cache_event <- mkEvent(...);
    ...
    hit = tagarray.lookup(...);
    cache_event.report(hit);
    ...
  endmodule
[Diagram: three modules, each with an event register, and one event dumper]

Support functions – global controller
  module mkCache#(...) (Empty);
    ...
    ctrl <- mkCntrlr(...);
    ...
    rule (ctrl.run())
      ...
    endrule
  endmodule
[Diagram: three modules, each with a local controller, and one global controller]

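A minimal Python sketch of the gating idea, assuming the global controller simply broadcasts a run flag that each module's local controller exposes as `run()`. Only `run()` appears on the slide; `GlobalController`, `LocalController`, `attach`, `start`, and `stop` are hypothetical names.

```python
class GlobalController:
    """Owns the simulation-wide run flag (assumed behavior)."""
    def __init__(self):
        self.running = False

    def attach(self):
        return LocalController(self)

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


class LocalController:
    """Per-module view of the global run flag."""
    def __init__(self, global_ctrl):
        self._global = global_ctrl

    def run(self):
        return self._global.running


class Module:
    """Toy module whose only rule fires while the controller says run."""
    def __init__(self, global_ctrl):
        self.ctrl = global_ctrl.attach()
        self.cycles = 0

    def rule_cycle(self):
        if self.ctrl.run():          # plays the role of the rule predicate
            self.cycles += 1


gc = GlobalController()
mods = [Module(gc) for _ in range(3)]

def tick():
    for m in mods:
        m.rule_cycle()

tick()                # not started yet: nothing fires
gc.start()
tick(); tick()        # two cycles of work
gc.stop()
tick()                # stopped again: nothing fires
cycles = [m.cycles for m in mods]
```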
FPGA-based prototype Prototyping Catch-22…

Module Instantiation
[Diagram: multiple instances of the pipeline stages F, D, R, X, C, W within modules U, C, M, N]

Factorial Coding/Experiments
[Diagram: several model configurations built from combinations of modules labeled S, C, M, N, RC, RM, SC, SM]

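Factorial experiments follow from module selection: if each slot in the hierarchy has several interchangeable implementations, every combination of choices is a distinct model. The Python sketch below enumerates such combinations. The slot names and alternatives are purely illustrative and do not correspond to the labels in the diagram.

```python
from itertools import product

# Hypothetical alternatives for three module slots; names are illustrative only.
alternatives = {
    "core": ["inorder", "ooo"],
    "cache": ["direct_mapped", "set_associative"],
    "network": ["bus", "ring"],
}

def enumerate_models(alternatives):
    """Yield one {slot: implementation} mapping per combination of choices."""
    slots = sorted(alternatives)
    for choice in product(*(alternatives[s] for s in slots)):
        yield dict(zip(slots, choice))

models = list(enumerate_models(alternatives))
```

Three slots with two alternatives each give 2 × 2 × 2 = 8 models, and adding an alternative to one slot adds models without touching the other modules.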
HAsim: Current status - models
• Simple RISC functional model operating
  • Simple RISC ISA
  • Pipelined multi-phase instruction execution
  • Supports speculative OOO design
  • Physical Reg File and ROB
  • Small physically addressed memory
  • Fast speculative rewinds
• Instruction-per-cycle (APE) model
  • Runs simple benchmarks on FPGA
• Five stage pipeline
  • Supports branch mis-speculation
  • Runs simple benchmarks (in software simulation)
• X86 functional model architecture under development

Connections Implement Ports
[Diagram: a PM shown as a module tree with connections (foo, bar, baz) and as hardware modules with wrappers; the ports are implemented via connections]

Timing Model Resources (Fast)
• OOO, branch prediction, three functional units, 32KB 2-way set associative ICache and DCache, iTLB, dTLB
  • 2142 slices (15% of a 2VP30)
  • 21 block RAMs (15% of a 2VP30)
• Configurable cache model
  • 32KB 4-way set associative cache with 16B cache-lines
    • 165 slices (1% of a 2VP30)
    • 17 block RAMs (12% of a 2VP30)
  • 2MB 4-way set-associative cache with 64B cache-lines
    • 140 slices (1% of a 2VP30)
    • 40 block RAMs (29% of a 2VP30)
• Current FPGAs (4VFX140)
  • 142,128 logic cells (63,168 slices)
  • 552 block RAMs
  • 2 PowerPCs

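The utilization percentages can be reproduced with a few lines of arithmetic. The slide does not give device totals, so the Python sketch below assumes an XC2VP30 has 13,696 slices and 136 block RAMs (figures recalled from the Xilinx data sheet). It also assumes the percentages were truncated rather than rounded.

```python
# Assumed XC2VP30 totals (not stated on the slide).
XC2VP30_SLICES = 13_696
XC2VP30_BRAMS = 136

# (slices, block RAMs) as reported on the slide.
usage = {
    "timing model": (2142, 21),
    "32KB cache model": (165, 17),
    "2MB cache model": (140, 40),
}

def percent(used, total):
    """Whole-number percentage, truncated."""
    return 100 * used // total

shares = {
    name: (percent(slices, XC2VP30_SLICES), percent(brams, XC2VP30_BRAMS))
    for name, (slices, brams) in usage.items()
}
```

Under these assumptions every computed pair matches the slide, and block RAM rather than logic is the scarcer resource for the larger cache model.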