
SPREE Tutorial

Peter Yiannacouras, April 13, 2006


Presentation Transcript


  1. SPREE Tutorial Peter Yiannacouras April 13, 2006

  2. Processors on FPGAs
     • You all used FPGAs (ECE241)
       • Adders
       • 7-segment decoders
       • Etc.
     • We are putting whole microprocessors on them
     • We call these soft processors

  3. Hard Versus Soft Processors
     • Soft processor
       • Written in HDL (Verilog)
       • Programmed onto chip
     • Hard processors
       • Made of transistors
       • Cost millions to make
       • Faster, smaller, less power

  4. Processors and FPGA Systems
     • FPGAs are a common platform for digital systems
       • [Diagram: an FPGA containing a soft processor, UART, memory interface, custom logic, and Ethernet]
     • The soft processor performs coordination and even computation
     • Better processors => less hardware to design
     • We aim to improve soft processors by customizing them

  5. Our Research Problem
     • Soft processors have worse
       • Area
       • Speed
       • Power
     • But they are flexible: use that flexibility to counteract these costs
     • How? Customize the processor’s architecture
       • E.g. Intel vs AMD
       • E.g. Motorola 68360 vs 68010

  6. Research Goals
     • Understand tradeoffs in soft processors
       • E.g. a hardware multiplier is big but can perform multiplies fast
     • Customize the processor to the application
       • E.g. bubble sort doesn’t use multiplies, so remove the hardware multiplier and save on area
     • We developed SPREE, software to help us do both
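The bubble sort example above amounts to checking whether a program ever issues a multiply. A minimal sketch of that check, assuming the program is available as a list of MIPS assembly lines; the helper name and both instruction listings are made up for illustration and are not part of SPREE:

```python
# Hypothetical helper (not part of SPREE): decide whether a program
# needs the hardware multiplier by scanning its instruction mnemonics.
MULTIPLY_MNEMONICS = {"mult", "multu"}  # the MIPS I multiply instructions


def uses_multiplier(instructions):
    """Return True if any instruction line starts with a multiply mnemonic."""
    for line in instructions:
        fields = line.split()
        if fields and fields[0] in MULTIPLY_MNEMONICS:
            return True
    return False


# Made-up instruction listings, not real benchmark disassembly.
sort_like = ["lw $t0, 0($a0)", "slt $t2, $t1, $t0", "sw $t1, 0($a0)"]
filter_like = ["lw $t0, 0($a0)", "mult $t0, $t1", "mflo $t2"]

print(uses_multiplier(sort_like))    # no multiply: the multiplier could be removed
print(uses_multiplier(filter_like))  # multiply present: keep the multiplier
```

A program for which this returns False is a candidate for a multiplier-free processor, which is the area saving the slide describes.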

  7. SPREE System (Soft Processor Rapid Exploration Environment)
     • Input: Processor description
       • ISA
       • Datapath
     • SPREE System
       • Verify ISA against datapath
       • Datapath instantiation
       • Control generation
     • Output: Synthesizable Verilog

  8. Input: Instruction Set Architecture (ISA) Description
     • Graph of Generic Operations (GENOPs)
     • Edges indicate flow of data
     • ISA currently fixed (subset of MIPS I)
     • Example: MIPS ADD – add rd, rs, rt
       • GENOPs shown: FETCH, RFREAD, RFREAD, ADD, RFWRITE
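The GENOP graph for ADD can be sketched as a small dataflow graph. This is an illustration only: the slide names the five GENOPs, but the exact edges below (FETCH feeding both register reads, which feed ADD, which feeds RFWRITE) and the node names with `_rs`/`_rt` suffixes are assumptions, and SPREE itself describes these graphs in its own format, not in Python.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each GENOP maps to the set of GENOPs whose data it consumes.
# Edge structure is assumed from the usual meaning of "add rd, rs, rt".
add_graph = {
    "FETCH": set(),
    "RFREAD_rs": {"FETCH"},                # read source register rs
    "RFREAD_rt": {"FETCH"},                # read source register rt
    "ADD": {"RFREAD_rs", "RFREAD_rt"},     # add the two operands
    "RFWRITE": {"ADD"},                    # write the sum to rd
}

# A topological order is one valid sequence in which data can flow.
order = list(TopologicalSorter(add_graph).static_order())
print(order)
```

Any valid order starts with FETCH and ends with RFWRITE; the two register reads are independent of each other, which is why they can happen in parallel in hardware.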

  9. Input: Datapath Description
     • Interconnection of hand-coded components
     • Allows efficient synthesis
     • Described using C++
     • [Diagram: components such as Ifetch, Reg File, Mul, Shifter, ALU, Data Mem, and Write Back are drawn from the SPREE Component Library (RTL) and connected into a datapath]

  10. Component Selection
      • Select by name
      • Names looked up in library
      • Stored in cpugen/rtl_lib

      RTLComponent *ifetch = new RTLComponent("ifetch");
      RTLComponent *reg_file = new RTLComponent("reg_file");

  11. Datapath Wiring Example
      • Ifetch ports: rd, rs, rt, offset
      • Regfile ports: dst, a_reg, a_data, b_reg, b_data, writedata
      • ALU ports: opA, opB, result

      proc.addConnection(ifetch, "rs", reg_file, "a_reg");
      proc.addConnection(ifetch, "rt", reg_file, "b_reg");
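The wiring idea can be modelled in a few lines. The sketch below is a stand-in written in Python, not the SPREE C++ API: the `Component` and `Datapath` classes and the port check are hypothetical, and only the port names and the two connections come from the slide.

```python
class Component:
    """Hypothetical stand-in for a library component with named ports."""

    def __init__(self, name, ports):
        self.name = name
        self.ports = set(ports)


class Datapath:
    """Hypothetical container that records port-to-port connections."""

    def __init__(self):
        self.connections = []

    def add_connection(self, src, src_port, dst, dst_port):
        # Reject wiring to a port the component does not have.
        for comp, port in ((src, src_port), (dst, dst_port)):
            if port not in comp.ports:
                raise ValueError(f"{comp.name} has no port {port!r}")
        self.connections.append((src.name, src_port, dst.name, dst_port))


# Port names as listed on the slide.
ifetch = Component("ifetch", ["rd", "rs", "rt", "offset"])
reg_file = Component(
    "reg_file", ["dst", "a_reg", "a_data", "b_reg", "b_data", "writedata"]
)

proc = Datapath()
proc.add_connection(ifetch, "rs", reg_file, "a_reg")
proc.add_connection(ifetch, "rt", reg_file, "b_reg")
print(proc.connections)
```

Checking port names at connection time is a design choice of this sketch; whether SPREE validates connections this way is not stated in the slides.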

  12. SPREE System + Backend (Soft Processor Rapid Exploration Environment)
      • Processor Description → SPREE generator (spegen) → Verilog
      • Verilog → Quartus II CAD Software (specadflow) → 1. Area, 2. Clock Frequency, 3. Power
      • Verilog + Benchmarks → Modelsim Verilog Simulator (spebenchmark) → 4. Cycle Count
      • Benchmarks → Mint MIPS Simulator (simulator/run)
      • Compare traces from the two simulators

  13. Walking through an Example (see README.txt)
      • Choose a pre-built processor
        • cpugen/src/arch lists all the processors
      • Let’s choose pipe3_serialshift
        • 3-stage pipeline with serial shifter

  14. Using SPREE on a Processor
      • Generate, benchmark, synthesize

      % spegen pipe3_serialshift         ← Generates Verilog
      % spebenchmark pipe3_serialshift   ← Runs benchmarks
      % specadflow pipe3_serialshift     ← Synthesizes processor
      % specompare pipe3_serialshift     ← Displays results
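The four commands above always run in the same order with the same argument, so they are easy to script. A minimal sketch, assuming the SPREE tools are on the PATH and each takes only the processor name as shown; the function names and the dry-run default are choices of this example, not part of SPREE:

```python
import subprocess

# Tool names and order as given on the slide.
SPREE_STEPS = ["spegen", "spebenchmark", "specadflow", "specompare"]


def spree_commands(processor):
    """Build the command line for each step of the flow."""
    return [[tool, processor] for tool in SPREE_STEPS]


def run_flow(processor, dry_run=True):
    """Print the commands, or run them and stop at the first failure."""
    for cmd in spree_commands(processor):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if a step fails


run_flow("pipe3_serialshift")  # dry run: only prints the four commands
```

Calling `run_flow("pipe3_serialshift", dry_run=False)` would execute the real tools on a machine where SPREE is installed.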

  15. spegen – Generating Processors
      • Input: Processor description
      • Syntax: spegen <processor name>
      • Output: a folder named after the processor, containing
        • Hand-coded Verilog modules
        • system.v
          • Generated hookup and control
        • OUT.cpugen
          • Stages per instruction
          • Hazard window/branch penalty
        • test_bench.v
          • Test bench for Modelsim simulation

  16. Benchmarking
      • Run programs on the processor
        • Measure time taken till completion
        • Verify functionality
      • Can do this without knowing anything about the benchmarks themselves

  17. spebenchmark – Benchmarking
      • Input: Processor implementation
      • Syntax: spebenchmark <processor>
      • Output: (ideally)
        • Cycle counts of all benchmarks
        • Traces: /tmp/modelsim_trace.txt

      ******* Benchmarking pipe3_serialshift ********
      Simulating bubble_sort ... Success! Cycle count=2994
      Simulating crc ... Success! Cycle count=112750
      Simulating des ... Success! Cycle count=5129
      Simulating fft ... Success! Cycle count=5077
      Simulating fir ... Success! Cycle count=1214
      ...
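The cycle counts in this output are easy to pull into a table for later comparison. A minimal sketch, assuming each benchmark is reported on its own line in exactly the form shown on the slide; the regular expression and function name are this example's own:

```python
import re

# Sample output copied from the slide; one benchmark per line is assumed.
SAMPLE = """\
******* Benchmarking pipe3_serialshift ********
Simulating bubble_sort ... Success! Cycle count=2994
Simulating crc ... Success! Cycle count=112750
Simulating des ... Success! Cycle count=5129
Simulating fft ... Success! Cycle count=5077
Simulating fir ... Success! Cycle count=1214
"""

# Matches only successful runs; failed runs are deliberately skipped.
LINE = re.compile(r"Simulating (\S+) \.\.\. Success! Cycle count=(\d+)")


def cycle_counts(text):
    """Map benchmark name to cycle count for every successful run."""
    return {name: int(count) for name, count in LINE.findall(text)}


counts = cycle_counts(SAMPLE)
for name, cycles in counts.items():
    print(f"{name:12s} {cycles:8d}")
```

Skipping failed runs keeps cycle counts from a mismatching simulation out of the results.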

  18. Benchmarking – under the hood
      • C source benchmarks → Compiler (gcc - MIPS) → Binary Executable
      • Binary Executable → Mint MIPS Simulator (simulator/run) → Trace (applications/<benchmark name>/mint)
      • Binary Executable + Verilog → Modelsim Verilog Simulator (spebenchmark) → Trace (/tmp/modelsim_trace.txt, /tmp/modelsim_store_trace.txt) and Cycle Count
      • spebenchmark compares the two traces

  19. specompiler - Setup compiler
      • Choose the path to your compiler (prebuilt)
        • Default: /jayar/b/b0/yiannac/spe/compiler
          • GCC 3.3.3, software division
        • Another: /jayar/b/b0/yiannac/spe/compiler-softmul
          • GCC 3.3.3, software division and software multiplication
      • specompiler will:
        • Compile all benchmarks (and store binaries)
        • Simulate all benchmarks (and store traces)

      % specompiler /jayar/b/b0/yiannac/spe/compiler-softmul

      • After this point, you can just run spebenchmark

  20. spebenchmark - failure
      • Shows discrepancy between MINT and Modelsim

      ******* Benchmarking pipe3_serialshift ********
      Simulating bubble_sort ... Error: Trace does not match, Cycle count=381
      Discrepancy found at 6800000 ps
      Modelsim: PC=04000064 | IR=24090001 | 05: 00000000
      Mint:     PC=040000b8 | IR=8c47004c | 07: 00000064

      • Clues to where the error occurred
        • The field before the colon (05, 07) is the destination register
        • The field after the colon is the value being written
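Finding the discrepancy comes down to walking the two traces in step and stopping at the first entry that differs. The sketch below shows one simple way to do that; it is not spebenchmark's actual implementation. Trace entries are modelled as (PC, IR, destination register, value) tuples, the first entry in each trace is made up, and the second entries are the mismatching lines from the slide.

```python
def first_mismatch(reference, candidate):
    """Return (index, ref_entry, cand_entry) at the first difference, else None."""
    for i, (ref, cand) in enumerate(zip(reference, candidate)):
        if ref != cand:
            return i, ref, cand
    if len(reference) != len(candidate):
        # One trace ended early: report the first unmatched entry.
        i = min(len(reference), len(candidate))
        ref = reference[i] if i < len(reference) else None
        cand = candidate[i] if i < len(candidate) else None
        return i, ref, cand
    return None


# Entry 0 is invented for illustration; entry 1 is taken from the slide.
mint_trace = [
    ("04000060", "24080064", "08", "00000064"),
    ("040000b8", "8c47004c", "07", "00000064"),
]
modelsim_trace = [
    ("04000060", "24080064", "08", "00000064"),
    ("04000064", "24090001", "05", "00000000"),
]

print(first_mismatch(mint_trace, modelsim_trace))
```

The PC and IR of the first mismatching entry point at the instruction where the hardware first diverged from the reference simulator.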

  21. spebenchmark - waveforms
      • Can see any signal within the processor

      % sim_gui bubble_sort pipe3_serialshift

  22. Modelsim
      • LEARN IT!!!
      • Quartus Simulator is vastly inferior, and even unusable for our purposes

  23. The Testbench (test_bench.v)
      • What is it?
        • The stimulus and monitor for your circuit
      • SPREE generates it automatically
        • And hence it works right away
      • Handcoding your own processor means
        • You have to interface with the test bench
      • Once you have the testbench you can use spebenchmark

  24. Manual Interfacing with the Testbench
      • Need only 6 wires
        • To track writes to register file and data mem
      • Wires between your soft processor and test_bench.v:
        • regfile_we
        • regfile_dst
        • regfile_data
        • datamem_we
        • datamem_addr
        • datamem_data

  25. SPREE System + Backend (Soft Processor Rapid Exploration Environment)
      • Processor Description → SPREE generator (spegen) → Verilog
      • Verilog → Quartus II CAD Software (specadflow) → 1. Area, 2. Clock Frequency, 3. Power
      • Verilog + Benchmarks → Modelsim Verilog Simulator (spebenchmark) → 4. Cycle Count
      • Benchmarks → Mint MIPS Simulator (simulator/run)
      • Compare traces from the two simulators

  26. specadflow – Synthesis
      • Input: Processor implementation
      • Syntax: specadflow <processor name>
      • Performs a “seed sweep”
        • Average several runs since results are noisy
        • Run several instances of quartus
          • Across several machines in parallel
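The averaging step of a seed sweep is plain arithmetic. A minimal sketch with made-up clock frequencies for five hypothetical seeds; the numbers are illustrative only and are not specadflow results:

```python
from statistics import mean, stdev

# Made-up clock frequencies (MHz), one per hypothetical seed.
fmax_by_seed = {1: 74.9, 2: 76.3, 3: 75.1, 4: 77.0, 5: 75.7}

values = list(fmax_by_seed.values())
average = mean(values)   # the figure a sweep would report
spread = stdev(values)   # how noisy the individual runs are

print(f"average = {average:.2f} MHz, sample std dev = {spread:.2f} MHz")
```

Reporting the spread alongside the average makes it clear whether a difference between two processors is larger than the seed-to-seed noise.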

  27. specadflow Output
      • Output:
        • Synthesis results (hidden)
        • Summary output

      Started Tue 6:27PM, Waiting for processes: 10.0.0.61 10.0.0.57 10.0.0.56 10.0.0.55 10.0.0.54 10.0.0.51
      Finished Tue 6:33PM
      1081 75.7812 0.99822
      ...
      Waiting on eda writer

      • The three numbers are, in order:
        • Area (LEs or ALUTs)
        • Clock Frequency (MHz)
        • Estimated Energy/cycle dissipated (nJ/cycle)
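Cycle count, clock frequency, and energy per cycle combine into wall-clock time and total energy for a benchmark. The sketch below works this through using the bubble_sort cycle count from the spebenchmark slide and the summary numbers above. Pairing them is an assumption made for illustration, since the slides do not say that both outputs come from the same processor and run.

```python
def runtime_us(cycles, clock_mhz):
    """Execution time in microseconds: cycles divided by cycles-per-microsecond."""
    return cycles / clock_mhz


def energy_nj(cycles, nj_per_cycle):
    """Total energy in nanojoules: cycles times energy per cycle."""
    return cycles * nj_per_cycle


cycles = 2994            # bubble_sort cycle count from the spebenchmark slide
clock_mhz = 75.7812      # clock frequency from the summary above
nj_per_cycle = 0.99822   # estimated energy per cycle from the summary above

print(f"runtime: {runtime_us(cycles, clock_mhz):.2f} us")
print(f"energy:  {energy_nj(cycles, nj_per_cycle):.2f} nJ")
```

This is why both simulation and synthesis are needed: a design with fewer cycles can still be slower overall if its clock frequency drops.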

  28. Any Questions?
      • For technical support, ask me

  29. EXTRAS

  30. Setup/Install
      • Copy and unpack the SPREE tarball:
        • /jayar/b/b0/yiannac/spree.tar.gz
      • Build all the SPREE software
        • Follow instructions in INSTALL.txt
        • If there are any errors, email me

      % cd spree
      % make

  31. SPREE Directory Structure
      spree
        applications   Benchmarks (C source)
        compiler       binutils, gcc, newlib
        cpugen         the cpu generator + processor descriptions
        simulator      MIPS simulator
        quartus        synthesis
        modelsim       Verilog simulator

  32. Setup cluster
      • Choose the cluster you’re using
        • aenao – high performance, limited access
        • eecg – any eecg-connected machine
          • Edit quartus/machines.txt
          • Put a list of 11 or so good eecg machines

      % specluster eecg
      OR
      % specluster aenao
