Implementing a Rad-Hard Compact PCI Bus-based System Using Actel FPGAs
Robert H. Klenke, SAIC
Robert F. Hodson, NASA LaRC
Tak-kwong Ng, NASA LaRC
GIFTS
• Geosynchronous
• Imaging
• Fourier
• Transform
• Spectrometer
GIFTS Modules
[Block diagram: Sensor Module, Control Module, Modulators (Downlink)]
GIFTS Control Module
• 6U CPCI, 33 MHz, 32 bit, 3.3 V, fully redundant
• IC (Instrument Controller): BAE 750
• MEM: 4 MB SRAM
• IO, DL, EDS
GIFTS Control Module Data Flow
[Block diagram: MEM, IC, IO, DLINK, and EDS boards on the PCI bus; four blocks are labeled "Actel PCI Core"]
• SM Data I/F: synchronous serialized LVDS, 21-bit data, 16 MHz
• SM Command I/F: 422 differential
• X-Band: 80 Mbps
• SMQ-11
• Spacecraft I/F: 1553
• Requires sequential "block" readout to downlink
GIFTS Memory Board Architecture
[Block diagram]
• Clock domains: PCI clock domain, core clock domain (PCI clock/2), LVDS clock domain
• Devices: Actel RT54SX32S FPGA, Actel RT54SX32S FPGA, Actel RT54SX32S FPGA (2, bit-sliced), BAE 238A792 SRAM (2), BAE 238A792 SRAM (2)
• Blocks: Actel PCI Core, DMA Engine, DMA FIFO, LVDS FIFO, SRAM Controller
• Bus widths: 32-bit PCI bus, 16-bit LVDS data, 64-bit internal data paths
GIFTS Memory Board Architecture (cont.)
• Three separate clock domains interfaced by asynchronous FIFOs
• Requires 4 Actel RT54SX32S FPGAs
• Core is bit-sliced across 32-bit boundaries due to I/O limitations
• Includes 4 Mbytes of SRAM to implement the overall FIFO function of the memory board
• Receives single-clocked, 16-bit (plus type and valid) data from the LVDS interface at 16 MHz
• Double-clocked data at 32 MHz could also be handled with a modification to the LVDS FIFO
• Status:
  • PCI core FPGA design complete; place and route results meet PCI specs for setup, propagation, and hold time at min, typ, and max timing
  • LVDS FIFO design complete
  • SRAM controller and bit-sliced implementation complete
  • System testing in progress
  • Brass board layout in progress
Memory Board Design Flow and Tools
• Actel Libero Platinum toolset
• VHDL design entry using a text editor
• Pre-synthesis functional simulation using ModelSim and a highly modified Actel-supplied PCI test bench
• Synthesis using the Synplicity synthesis tool
• Post-synthesis simulation using ModelSim
  • A fully licensed copy of ModelSim (as opposed to the reduced-performance version supplied with Libero) is necessary to complete post-synthesis and post-place & route simulations in reasonable time
• Place & route with Actel's Designer tools
• Timer static timing analysis tool used to check rough timing and apply timing constraints
• Post-place & route timing simulation with ModelSim
Memory Board Design Issues
• Collapsing hierarchy changes synthesis output
• Hold time problems on PCI bus
• Glitches on post-synthesis outputs
• Tri-states not elevated to uppermost level
• Timing data inaccurate and difficult to debug
• No timing checks in simulation test bench
Collapsing Hierarchy Changes Synthesis Output
• Pre-synthesis functional simulation of the LVDS FIFO FPGA indicated correct functionality
  • Internal read pointers reset and functioned correctly
• Post-synthesis simulation did not function correctly
• Debugging efforts led to the discovery that the FIFO read pointers were not being reset correctly
• Synthesis attributes were added to the VHDL file to prevent the synthesis tool from collapsing the hierarchy, initially so that internal signals could be traced more easily:
    attribute syn_hier : string;
    attribute syn_hier of MIXED : architecture is "hard";
• The problem "fixed itself" and post-synthesis simulation indicated correct functionality
• Apparently, when the hierarchy is collapsed, the synthesis tool was connecting together signals that have the same name but sit on different levels of the hierarchy
• Not collapsing the hierarchy appears to have little impact on the size of the synthesized design
Collapsing Hierarchy Changes Synthesis Output (cont.)
[Waveform: pre-synthesis simulation showing correct behavior of the read pointers]
[Waveform: post-synthesis simulation showing incorrect resetting of the read pointers]
Collapsing Hierarchy Changes Synthesis Output (cont.)
[Waveform: post-synthesis simulation of the design with the hierarchy not flattened, showing correct behavior of the read pointers]
Hold Time Problem on PCI Bus
• PCI bus specification states minimum hold time is 0 ns
• Actel RT54SX devices have a significant delay on the Hclock tree that is greater than the typical input-to-register delay
• The result is a non-zero hold time requirement on the external inputs (from the PCI bus) while we wait for the clock to arrive
  • The clock delay becomes the required external hold time
[Timing diagram: PCI CLK cycle 30 ns (at 33 MHz); Tval 11 ns max.; Tprop 10 ns max.; Tsu 7 ns min.; signals CLK, D, and the delayed internal clock CLK_d]
Hold Time Problem on PCI Bus
• The Actel Timer tool will report hold times on external inputs (one path only), but cannot use them to guide place and route
• The solution was to increase the delay of the datapath by hand so that the data stays in sync with the increased clock delay
  • Add delay buffers (bufd, which will not be optimized out by the synthesis tool) to the signals to get the delay close
  • If necessary, hand place the buffers on those signals to increase the data delay
• One problem, of course, is balancing the delay required for 0 ns hold time against the setup time required for the next clock cycle
• Signals requiring close tolerances could require many (long) cycles of re-spinning to get the timing correct
    -- Two bufd delay buffers in series on each of the 32 AD lines
    ad_bufd_loop: for x in 31 downto 0 generate
      B1: bufd port map(a => ad(x), y => ad_slow1(x));
      B2: bufd port map(a => ad_slow1(x), y => ad_slow2(x));
    end generate;
Glitches on Post-Synthesis Outputs
• Address lines, chip select, write enable, and (for a memory write) data to the SRAM MUST be stable for the entire setup period to make timing for a memory read or write cycle
• The SRAM is being used as a FIFO, so complex arithmetic must be done on the addresses used for reading (FIFO read pointer) and writing (FIFO write pointer)
• Complex statements to calculate the value on the address lines led to complex logic that had multiple glitches during state transitions (where the address was not supposed to change)
• This in turn led to incorrect operation of the SRAM, as shown by the accurate SRAM models used in the simulations
• The solution was to recode the VHDL so that both values were always calculated and the value on the address lines was selected from the two possibilities, which synthesized as logic with a multiplexor on the output
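The recoding described above might look roughly like the following. This is a minimal sketch of the coding style, not the memory board's actual source: the entity, port, and signal names are hypothetical, ADDR_WIDTH is a placeholder, and holding both pointers in registers is an assumption made to keep the example small.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: all names and the address width are illustrative only.
entity sram_addr_select is
  generic (ADDR_WIDTH : positive := 19);
  port (
    clk       : in  std_logic;
    rst_n     : in  std_logic;
    rd_inc    : in  std_logic;  -- advance the FIFO read pointer
    wr_inc    : in  std_logic;  -- advance the FIFO write pointer
    sel_write : in  std_logic;  -- '1' selects the write pointer
    sram_addr : out std_logic_vector(ADDR_WIDTH - 1 downto 0)
  );
end entity sram_addr_select;

architecture rtl of sram_addr_select is
  signal rd_ptr : unsigned(ADDR_WIDTH - 1 downto 0);
  signal wr_ptr : unsigned(ADDR_WIDTH - 1 downto 0);
begin
  -- Both candidate addresses are always calculated (assumed registered here).
  pointers : process (clk, rst_n)
  begin
    if rst_n = '0' then
      rd_ptr <= (others => '0');
      wr_ptr <= (others => '0');
    elsif rising_edge(clk) then
      if rd_inc = '1' then
        rd_ptr <= rd_ptr + 1;  -- wraps modulo 2**ADDR_WIDTH
      end if;
      if wr_inc = '1' then
        wr_ptr <= wr_ptr + 1;  -- wraps modulo 2**ADDR_WIDTH
      end if;
    end if;
  end process pointers;

  -- The address output is a plain 2:1 selection between the two values,
  -- rather than one large expression that computes the address in place.
  sram_addr <= std_logic_vector(wr_ptr) when sel_write = '1'
               else std_logic_vector(rd_ptr);
end architecture rtl;
```

Whether a given synthesis tool maps the final assignment to a single multiplexor stage should be confirmed in the post-synthesis netlist and timing simulation.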
Tri-States not elevated to upper level
• To maximize code reuse and minimize testing difficulties, it was decided to develop a single 64-bit behavioral description of the SRAM core and then instantiate it in two 32-bit wrappers to implement the two bit-sliced SRAM controllers
• The tri-state drivers on the bi-directional data busses were coded inside the 64-bit behavioral description
• The syn_hier attribute of hard was used to avoid the synthesis problems previously encountered
• A problem occurred when importing this design into the Designer tool for place and route. It was later traced to the synthesis tool including internal tri-state drivers in the EDIF netlist; Actel FPGAs do not have internal tri-state drivers
[Diagram: a tri-state driver inside the VHDL behavioral core and VHDL structural wrapper appears in the gate-level post-synthesis netlist, which is not the same as (≠) an Actel tri-state I/O pad driver]
Tri-States not elevated to upper level (cont.)
• The two possible solutions were to:
  • Re-code the behavioral description and the wrappers so that the tri-states were explicitly included at the top level of the design
  • Remove the syn_hier "hard" attribute to allow the tri-state drivers to be promoted to the external pins, as long as the synthesis still produced correct results
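The first option could be sketched as follows. This is only an illustration under stated assumptions: sram_core_stub is a hypothetical stand-in for the behavioral core, and every entity, port, and signal name here is invented for the example. The point is that the core exposes separate data-in, data-out, and output-enable signals, and the only 'Z' assignment sits in the top-level architecture, where it can map to an I/O pad driver.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-in for the behavioral core: no inout ports, no 'Z'.
entity sram_core_stub is
  port (
    clk      : in  std_logic;
    wr_en    : in  std_logic;
    wr_data  : in  std_logic_vector(31 downto 0);
    data_in  : in  std_logic_vector(31 downto 0);  -- from the SRAM pins
    data_out : out std_logic_vector(31 downto 0);  -- toward the SRAM pins
    data_oe  : out std_logic;                      -- '1' = drive the pins
    rd_data  : out std_logic_vector(31 downto 0)
  );
end entity sram_core_stub;

architecture rtl of sram_core_stub is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      data_out <= wr_data;
      data_oe  <= wr_en;
      rd_data  <= data_in;
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical top level for one 32-bit slice.
entity sram_slice_top is
  port (
    clk       : in    std_logic;
    wr_en     : in    std_logic;
    wr_data   : in    std_logic_vector(31 downto 0);
    rd_data   : out   std_logic_vector(31 downto 0);
    sram_data : inout std_logic_vector(31 downto 0)  -- bi-directional pins
  );
end entity sram_slice_top;

architecture structural of sram_slice_top is
  signal data_out : std_logic_vector(31 downto 0);
  signal data_oe  : std_logic;
begin
  core : entity work.sram_core_stub
    port map (
      clk      => clk,
      wr_en    => wr_en,
      wr_data  => wr_data,
      data_in  => sram_data,
      data_out => data_out,
      data_oe  => data_oe,
      rd_data  => rd_data
    );

  -- The tri-state is described explicitly at the top level of the design.
  sram_data <= data_out when data_oe = '1' else (others => 'Z');
end architecture structural;
```

With this structure the hierarchy below the top level can stay "hard" without any tri-state driver ending up inside a lower-level block.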
Timing not accurate and difficult to interpret
• Values of timing data as well as flip-flop models changed with different versions of the Libero toolset
  • This occurred in the middle of the design process and caused quite a bit of extra work, as well as uncovering the hold time problem on the PCI bus signals
• Because of the explosion of gate-level signals in the post-synthesis, post-layout design, tracing important signals from the behavioral description can be difficult
  • ModelSim's features for finding signal names and combining signals into a bus for display are invaluable
• New signals created in the post-synthesis design make finding "cause and effect" relationships harder
  • ModelSim's "dataflow" window is somewhat helpful here
• Failure mechanisms may differ between min., typ., and max. timing simulations
• Do not underestimate the difficulty and time consumption of obtaining timing closure!
No timing checks in simulation test bench
• Static timing analysis results from the Timer tool are helpful, but in addition to reporting false paths, they are quite voluminous and difficult to interpret
• Simulation with back-annotated SDF timing is necessary to be completely satisfied that timing constraints have been met
• Unfortunately, the only timing value in the Actel-supplied test bench was an 11 ns delay on the PCI outputs of the master controller
• The following timing checks were added to the Actel-supplied test bench:
  • A 23 ns delay on the PCI outputs of the master controller to check the 7 ns setup time on the DUT's PCI inputs (30 ns cycle minus 23 ns leaves 7 ns)
  • A 'stable(19 ns) check on the PCI inputs of the PCI bus monitor to check the 11 ns maximum output delay of the DUT's PCI outputs (30 ns cycle minus 11 ns leaves 19 ns)
  • An option to set all PCI signals to X's a user-specified time after the rising edge of the clock to check the hold time requirements on the DUT's PCI inputs
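A stripped-down illustration of these three checks is sketched below. It is not the Actel-supplied test bench: all names are hypothetical, a single 32-bit vector stands in for the PCI signals, and dut_model is a placeholder for the back-annotated DUT (its 10 ns output delay is an arbitrary value chosen for the example).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch of the added timing checks; all names are illustrative.
entity pci_timing_checks is
end entity pci_timing_checks;

architecture sim of pci_timing_checks is
  constant T_CYC    : time := 30 ns;          -- 33 MHz PCI clock period
  constant T_SU     : time := 7 ns;           -- setup needed at DUT inputs
  constant T_VAL    : time := 11 ns;          -- max DUT clock-to-output
  constant T_DRIVE  : time := T_CYC - T_SU;   -- 23 ns master output delay
  constant T_QUIET  : time := T_CYC - T_VAL;  -- 19 ns required stable window
  constant T_HOLD_X : time := 0 ns;           -- user-specified hold window

  signal done       : boolean := false;
  signal clk        : std_logic := '0';
  signal master_out : std_logic_vector(31 downto 0) := (others => '0');
  signal dut_in     : std_logic_vector(31 downto 0) := (others => 'X');
  signal dut_out    : std_logic_vector(31 downto 0) := (others => '0');
begin
  -- Free-running clock that stops once the stimulus is finished.
  clk <= not clk after T_CYC / 2 when not done else '0';

  -- Master side: new data is chosen on each rising edge.
  stimulus : process
  begin
    for i in 1 to 4 loop
      wait until rising_edge(clk);
      master_out <= not master_out;
    end loop;
    done <= true;
    wait;
  end process stimulus;

  -- Inputs go to X after the hold window and become valid again 23 ns
  -- after the edge, leaving exactly the 7 ns setup time.
  drive : process
  begin
    wait until rising_edge(clk);
    wait for T_HOLD_X;
    dut_in <= (others => 'X');
    wait for T_DRIVE - T_HOLD_X;
    dut_in <= master_out;
  end process drive;

  -- Placeholder for the DUT; the 10 ns delay is an assumption.
  dut_model : process (clk)
  begin
    if rising_edge(clk) then
      dut_out <= dut_in after T_VAL - 1 ns;
    end if;
  end process dut_model;

  -- Bus monitor: DUT outputs must have been stable for 19 ns at each edge.
  monitor : process (clk)
  begin
    if rising_edge(clk) then
      assert dut_out'stable(T_QUIET)
        report "DUT output changed later than Tval after the previous edge"
        severity error;
    end if;
  end process monitor;
end architecture sim;
```

If the DUT samples an input later than T_HOLD_X after the clock edge, it captures X's, which then propagate and make the hold violation visible in the waveforms.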