Combining Simulators and FPGAs “An Out-of-Body Experience”
Eric S. Chung, Brian Gold, James C. Hoe, Babak Falsafi
{echung, bgold, jhoe, babak}@ece.cmu.edu
SIMFLEX/PROTOFLEX
The RAMP full-system challenge
• RAMP vision for studying systems w/ FPGAs
  • functional & cycle-accurate simulation
  • scalability, speed, & flexibility on FPGAs
  • full-system (run unmodified binaries & OS)
[Figure: full-system target with CPUs, memory, PCI bus, terminal, graphics card, disks, and IRQ, DMA, I/O MMU, Ethernet, and SCSI controllers]
‘Full-sys’ RAMP will incur large effort, yet not all behaviors are frequently used (e.g., I/O)
Eric S. Chung / RAMP 2006 Summer Retreat
Combining simulators & FPGAs
• Simulators already provide full-system support, so why not simulate infrequent behaviors (e.g., I/O devices)?
[Figure: simulator and FPGA hosts, each shown with CPUs, Ethernet, SCSI, memory, and disk]
• Advantages
  • avoids implementing infrequent behaviors, which lowers full-system FPGA development effort
  • low impact on scalability & performance on FPGA
Outline
• Motivation
• Migration
• Implementation status
• Conclusion
Migration
[Figure: a target design made of “target objects” (e.g., a functional or timing CPU), each mapped to the FPGA or the simulator; labels 1 to 3 mark the three mappings below]
• 3 ways to map a target object to a host
  1. FPGA-only
  2. Simulation-only
  3. Migratable
• Migratable objects
  • switch modes between FPGA & simulator hosts
  • target behavior need not be 100% implemented in FPGA mode, e.g., implement 80% of target behavior in FPGA, 100% in simulator
Migration example
Target-to-host mappings:
• CPU = migratable
• Memory = FPGA-only
• Devices = SW-only
[Figure: example CPU instruction stream over time (load, add, multiply, I/O, add, sub, ...); the I/O instruction issues a SCSI command, and CPU state is transferred between the FPGA and the simulator]
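The migration example can be sketched in a few lines of Python. This is a minimal illustration, not the ProtoFlex implementation: the set `FPGA_SUPPORTED`, the `CpuState` fields, and the policy of migrating back to the FPGA as soon as the next supported instruction arrives are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

# Hypothetical set of instruction behaviors implemented in the FPGA.
FPGA_SUPPORTED = {"load", "add", "sub", "multiply"}


@dataclass
class CpuState:
    """Architectural state that must be copied on every migration."""
    pc: int = 0
    regs: dict = field(default_factory=dict)


def run(stream):
    """Execute a stream, migrating the CPU whenever the host must change.

    Returns (trace, migrations); trace is a list of (instruction, host).
    """
    host, state = "fpga", CpuState()
    trace, migrations = [], 0
    for instr in stream:
        wanted = "fpga" if instr in FPGA_SUPPORTED else "simulator"
        if wanted != host:
            # State transfer: the new host receives a copy of the CPU state.
            state = CpuState(pc=state.pc, regs=dict(state.regs))
            host = wanted
            migrations += 1
        trace.append((instr, host))
        state.pc += 4  # fixed-width instructions (assumed)
    return trace, migrations


trace, migrations = run(["load", "add", "multiply", "io", "add", "sub"])
```

With this stream, only the `io` instruction runs on the simulator, so the CPU migrates twice: once out of the FPGA and once back.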
Advantages
• Lowers development effort
  • avoid bring-up of infrequent behaviors
  • migrate & validate reference models from simulator
  • tailor implementation to workload (avoid rarely used instructions; good for CISC x86)
• Fast & scalable
  • performance-critical objects on FPGA (e.g., CPU, memory)
  • scalable for MPs: add migratable CPUs
[Figure: FPGA and simulator hosts with multiple CPUs, memory, SCSI, and disk]
Subtleties
• Objects separated in simulator/FPGA interact
  • examples: interrupts, DMA
  • handle by forwarding messages between FPGA/simulator
• FPGA-only & SW-only mapped objects are easy to locate
• migrated objects require tracking
[Figure: a DMA transfer forwarded between the simulator and FPGA hosts]
Subtleties
• Two options for an interrupt that crosses hosts
  • Option 1: forwarded interrupt
  • Option 2: forced migration
[Figure: an interrupt crossing between the simulator and FPGA hosts]
• Cross-host interactions are rare, so they have low impact on FPGA performance
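Tracking migrated objects and choosing between forwarding and forced migration can be sketched as a small location table. The class, function, and object names below (`LocationTable`, `deliver_interrupt`, `cpu0`) are hypothetical and chosen only for illustration; the slides do not describe the actual data structures.

```python
class LocationTable:
    """Maps each target object to the host that currently runs it."""

    def __init__(self, fixed, migratable):
        self.fixed = dict(fixed)            # FPGA-only and SW-only objects
        self.migratable = dict(migratable)  # updated on every migration

    def host_of(self, obj):
        if obj in self.fixed:
            return self.fixed[obj]
        return self.migratable[obj]

    def migrate(self, obj, host):
        if obj not in self.migratable:
            raise ValueError(f"{obj} is not migratable")
        self.migratable[obj] = host


def deliver_interrupt(table, src, dst, policy):
    """Deliver an interrupt from src to dst using the chosen policy."""
    src_host, dst_host = table.host_of(src), table.host_of(dst)
    if src_host == dst_host:
        return "local"
    if policy == "forward":
        # Option 1: send the interrupt as a message across hosts.
        return f"forwarded {src_host}->{dst_host}"
    if policy == "migrate":
        # Option 2: force the destination object onto the source's host.
        table.migrate(dst, src_host)
        return f"migrated {dst} to {src_host}"
    raise ValueError(f"unknown policy: {policy}")


table = LocationTable(
    fixed={"memory": "fpga", "scsi": "simulator"},
    migratable={"cpu0": "fpga"},
)
```

Fixed objects never change hosts, so only the `migratable` map needs updating at run time, which matches the observation that FPGA-only and SW-only objects are easy to locate.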
Subtleties cont.
• Migration cost
  • migrating an object requires a state copy, e.g., a migratable CPU has registers & TLBs
  • FPGA-to-simulator latency & simulation time limit the number of migrations per instruction
• FPGA & simulator asynchrony
  • simulated time “ticks” at different rates in FPGA & simulator
  • must synchronize for deterministic replay & accurate device timing
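The migration-cost point can be made concrete with a simple average-time model. This is a back-of-the-envelope sketch: the 12 MIPS FPGA rate is the worst-case figure quoted on the BlueSPARC slide, while the migration rate of one per 100,000 instructions and the 1 ms round-trip cost are hypothetical values chosen only to show the shape of the trade-off.

```python
def effective_mips(fpga_mips, migrations_per_instr, round_trip_s):
    """Average throughput when some instructions trigger a migration.

    round_trip_s is the assumed total cost of one excursion to the
    simulator: state copy, link latency, and simulation time.
    """
    fpga_time = 1.0 / (fpga_mips * 1e6)                 # seconds per instruction
    avg_time = fpga_time + migrations_per_instr * round_trip_s
    return 1.0 / avg_time / 1e6


# Hypothetical: 1 migration per 100,000 instructions, 1 ms per round trip.
print(round(effective_mips(12, 1e-5, 1e-3), 2))  # prints 10.71
```

Under these assumed numbers the migration overhead costs about 11% of throughput; a tenfold higher migration rate would roughly halve it.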
Outline
• Motivation
• Migration
• Implementation in progress
• Conclusion
Implementation status
• Target system
  • Sun Fire™ 3800 Server (up to 24-way)
  • UltraSPARC III ISA
  • Solaris 8
• Proof-of-concept software-to-software migration
  • run 2 instances of Virtutech Simics
  • migration designed & tested in 2 weeks
  • can migrate on arbitrary behavior (e.g., ADD instruction)
BlueSPARC core (in progress)
• In-order SPARC V9 core
  • supports 144 out of 170 integer instruction behaviors
  • supports partial MMU w/ I- & D-TLBs
  • goal: 99.999% of instructions & behaviors in target workloads
    • SPEC (mostly user-level), OLTP/DB2 (high TLB misses, 40% of time in privileged mode)
  • CPI ranges from 5 to 7 cycles
  • synthesis: 15k LUTs on Virtex-II Pro 30, 85 MHz, 12 MIPS (worst-case)
  • developed in Bluespec HDL, 6000 lines in 6 weeks
• Core validation
  • run RTL in lockstep w/ Simics’s UltraSPARC simulation model
  • workload validation w/ SPEC, OLTP/DB2, OpenSPARC verification suite
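The quoted throughput appears to follow directly from the clock rate and the CPI range; the short calculation below is an inference from the slide's numbers, not a statement from the authors, and the function name `mips` is only illustrative.

```python
def mips(clock_hz, cpi):
    """Millions of instructions per second for a given clock and CPI."""
    return clock_hz / cpi / 1e6


CLOCK_HZ = 85e6  # 85 MHz, from the synthesis result on the slide

worst = mips(CLOCK_HZ, 7)  # CPI of 7: about 12.1 MIPS
best = mips(CLOCK_HZ, 5)   # CPI of 5: 17 MIPS
```

The worst-case value of about 12.1 MIPS is consistent with the “12 MIPS (worst-case)” figure on the slide.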
Migration on FPGA (in progress)
[Figure: Virtutech Simics (Simics UltraSPARC model, simulated target devices) connected over Ethernet to a Xilinx XUP Virtex-II Pro 30 board (BlueSPARC, PowerPC, migration & message interface, DDR memory)]
• PowerPC functions
  • core & memory initialization from Simics checkpoints
  • facilitates migration for BlueSPARC
  • connects simulated devices to memory (e.g., SCSI DMA)
Conclusion
• Contributions
  • virtualizes infrequent behaviors using simulation
  • simplifies full-system FPGA emulator, still fast/scalable
  • incremental validation from reference system
• Future work
  • support migration in RDL?
  • adding cores + scaling across multiple FPGAs
• We are ready for BEE2
Thanks! Questions? echung@ece.cmu.edu
PROTOFLEX/SIMFLEX (http://www.ece.cmu.edu/~simflex)