Field Programmable Gate Arrays • Ability to reconfigure circuitry for a desired application or function at any time after manufacturing • Adaptive hardware that continuously changes in response to the input data or processing environment • Combination of general-purpose processors and ASICs • Quick reconfiguration time, on the order of 100 µs to 1 ms
Basic FPGA Design and Structure • A myriad of Configurable Logic Blocks (CLBs) • A CLB may be configured to, for example, add or compare two numbers • Connections between CLBs are established through signal-controlled grid connections • Current FPGAs have more than 100,000 logic gates
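The idea of a block that can be configured to either add or compare can be sketched in software. The model below is only an illustration, not the architecture of any particular device: the `CLB` class, its two 3-input lookup tables, and the `ripple` helper are all hypothetical names introduced here. Reconfiguring the block means rewriting its truth tables, while the wiring between blocks stays the same.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Build the truth table (the 'configuration bits') of an n-input function."""
    return {bits: func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)}


class CLB:
    """Toy configurable logic block with two 3-input lookup tables (assumed layout)."""

    def __init__(self):
        self.lut_result = {}
        self.lut_chain = {}

    def configure(self, result_func, chain_func):
        # Reconfiguration only rewrites the truth tables.
        self.lut_result = make_lut(result_func, 3)
        self.lut_chain = make_lut(chain_func, 3)

    def evaluate(self, a, b, chain_in):
        key = (a, b, chain_in)
        return self.lut_result[key], self.lut_chain[key]


# Configuration 1: one-bit full adder (result = sum bit, chain = carry).
def adder_sum(a, b, cin):
    return a ^ b ^ cin


def adder_carry(a, b, cin):
    return (a & b) | (cin & (a ^ b))


# Configuration 2: one-bit "greater than" slice, evaluated LSB first.
def bits_equal(a, b, _gt_in):
    return a ^ b ^ 1


def greater_than(a, b, gt_in):
    return (a & (b ^ 1)) | ((a ^ b ^ 1) & gt_in)


def ripple(clb, a, b, width):
    """Chain the same block over `width` bit positions, least significant bit first."""
    chain, result = 0, 0
    for i in range(width):
        bit, chain = clb.evaluate((a >> i) & 1, (b >> i) & 1, chain)
        result |= bit << i
    return result, chain


clb = CLB()
clb.configure(adder_sum, adder_carry)
total, carry_out = ripple(clb, 5, 6, 4)    # 4-bit addition

clb.configure(bits_equal, greater_than)    # same block, new function
_, nine_gt_six = ripple(clb, 9, 6, 4)      # 4-bit comparison
```

In the adder configuration the chained output is the carry; in the comparator configuration it is the running "greater than" flag.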
Advantages of FPGA • Reconfiguration ability enables performing specific computational tasks at will • Higher flexibility for adaptive coding for multimedia requirements such as: • Bandwidth availability • Quality of Service requirements • Channel characteristics • Rapid prototyping and design iteration • Certain function implementations lead to reduction in die area
Disadvantages of FPGA • Hardware is not an ASIC, which can lead to non-optimized performance and density • Reconfiguration takes longer than loading software • High power consumption during reconfiguration
Parallel Banks Technique • Codec implemented on 2 or more FPGAs • Each FPGA has all parts of the codec • Enables multiple pieces of data to be processed simultaneously • Advantages: • Easy to implement • Die area is not a constraint • High data throughput due to parallelism • Disadvantages: • Too much hardware • Leads to a non-optimized configuration
Compile-Time Reconfiguration • Entire chip is configured once for the target application • Advantages: • Easy control signals • Disadvantages: • More than 1 FPGA may be needed
Run-Time Reconfiguration (RTR) • Chip is reconfigured to perform different functions during an application • Advantages: • Reduced hardware • Critical path is short • Disadvantages: • Reconfiguration causes significant delay (can be compensated by partial reconfiguration) • May make the control system difficult to implement
Prototype Video Codec from UCLA • Transformation scheme (e.g., DCT) • Quantization • Entropy Coding • No Motion Compensation performed
Detailed Description of UCLA Video Codec • Utilizes RTR implementation • Partitioned into 3 separate configurations • Discrete Wavelet Transform, Addressing, and Control Logic • Quantization and Run Length Coding • Entropy Coding • RTR uses partial reconfiguration technique • QCIF resolution • 60-600 kb/s • CDMA for RF link
Configuration One • Discrete Wavelet Transform • Short filter with integer coefficients • Requires 318 gates and 241 flip-flops • Corresponds to 681 CLBs • Addressing and Control Logic • Correct data retrieval from RAM • Provides access to peripheral system
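The slide says only that the transform uses a short filter with integer coefficients; it does not name the filter. As an illustration of why such filters suit small hardware, the sketch below uses the integer Haar transform (often called the S-transform), which is an assumption made here and not necessarily the filter in the UCLA codec. It needs only a subtraction, an addition, and a one-bit shift per pair of samples, and it is exactly invertible.

```python
def forward_s_transform(samples):
    """One level of the integer Haar (S) transform on an even-length list."""
    low, high = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        d = a - b          # detail (high-pass) coefficient
        s = b + (d >> 1)   # floor((a + b) / 2), the low-pass coefficient
        low.append(s)
        high.append(d)
    return low, high


def inverse_s_transform(low, high):
    """Exactly undo forward_s_transform using integer arithmetic only."""
    out = []
    for s, d in zip(low, high):
        b = s - (d >> 1)
        a = d + b
        out.extend([a, b])
    return out


pixels = [10, 12, 7, 7, 200, 3]
low_band, high_band = forward_s_transform(pixels)
# low_band  == [11, 7, 101]
# high_band == [-2, 0, 197]
restored = inverse_s_transform(low_band, high_band)  # equals pixels
```

Applying the same step to the rows and then the columns of a frame would give one level of a 2D decomposition.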
Configuration Two • Quantization and Run Length Coding • Requires 2500 gates • Addressing and Control Logic • Same as configuration 1 • Never reconfigured • Data from previous configuration stored in another RAM
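A minimal sketch of what this configuration computes, under stated assumptions: the quantizer here is a plain uniform quantizer with a hypothetical step size, and the run-length format, (number of preceding zeros, nonzero value) pairs, is one common choice. The slides do not specify either detail.

```python
def quantize(coeffs, step):
    """Uniform quantizer that truncates toward zero (step size is an assumption)."""
    return [int(c / step) for c in coeffs]


def run_length_encode(values):
    """Encode as (zero_run, value) pairs: the zeros preceding each nonzero value."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((run, 0))  # trailing zeros, marked with value 0
    return pairs


def run_length_decode(pairs):
    """Invert run_length_encode."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        if v != 0:
            out.append(v)
    return out


coeffs = [101, -2, 0, 197, 3, -1, 0, 0]
q = quantize(coeffs, step=4)      # [25, 0, 0, 49, 0, 0, 0, 0]
pairs = run_length_encode(q)      # [(0, 25), (2, 49), (4, 0)]
```

Quantization pushes small coefficients to zero, which is what makes the following run-length step effective.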
Configuration Three • Entropy Coding • Provides 2:1 lossless compression • Addressing and Control Logic • Same as configuration 1 and 2 • Never reconfigured • Data from previous configuration stored in another RAM
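The slides do not say which entropy coder is used. To show how a lossless entropy coder shortens frequent symbols, here is a small Huffman coder; choosing Huffman coding is an assumption for illustration, and the sample data is made up. It is not meant to reproduce the 2:1 figure quoted on the slide.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a {symbol: bitstring} prefix code built from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, code):
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


data = [0] * 12 + [1] * 4 + [2] * 2 + [3] * 2   # made-up, zero-heavy symbols
code = huffman_code(data)
bits = encode(data, code)   # 32 bits, versus 40 bits at a fixed 2 bits per symbol
```

Because the code is prefix-free, `decode(bits, code)` returns the original symbols exactly, which is what "lossless" means here.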
Experiment Results • RTR provides lowest silicon area • Partial reconfiguration decreases reconfiguration delay by 50% compared with global reconfiguration • Critical path is 220 ns (5 MHz system) • Load and ready time approximately 1.6 ms • Compression rate of 15:1 was achieved • Independent of frame size
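As a rough consistency check on these figures, the arithmetic below relates the 15:1 compression rate to the bit-rate range quoted earlier, and the 220 ns critical path to the clock rate. QCIF is 176 x 144 pixels; the 8 bits per pixel (luminance only) is an assumption, and the 20 frames/second is the pure-FPGA figure from the question-and-answer slide.

```python
QCIF_WIDTH, QCIF_HEIGHT = 176, 144  # standard QCIF frame size


def compressed_bitrate_kbps(fps, bits_per_pixel, compression_ratio):
    """Bit rate after compression, in kb/s, for a QCIF sequence."""
    raw_bits_per_second = QCIF_WIDTH * QCIF_HEIGHT * bits_per_pixel * fps
    return raw_bits_per_second / compression_ratio / 1000


def max_clock_mhz(critical_path_ns):
    """Highest clock frequency a given critical path allows."""
    return 1000 / critical_path_ns


rate = compressed_bitrate_kbps(fps=20, bits_per_pixel=8, compression_ratio=15)
# rate is about 270.3 kb/s, inside the quoted 60-600 kb/s range
clock = max_clock_mhz(220)
# clock is about 4.5 MHz, which the slide rounds to a 5 MHz system
```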
Alternate Implementation: FPGA-VSP Co-Processor • Allows more operations: • 7 x 7 Mask 2D Filter (13.3 f/sec) • 8 x 8 Block DCT (55 f/sec) • 4 x 4 Block VQ at 0.5 bpp (7.4 f/sec) • 1-level WT (35.7 f/sec) • Max FPGA clock of 20 MHz • Max VSP clock of 50 MHz
Other Notable Implementations and Techniques • Dual FPGA, One RTR at any time • FPGA and General Processor Co-Processing • Systolic • Look Up Table for transform coefficients
References • J. Villasenor and W.H. Mangione-Smith, "Configurable Computing," Scientific American, pp. 66-71, June 1997. • J. Villasenor, C. Jones, and B. Schoner, "Video Communications using Rapidly Reconfigurable Hardware," IEEE Transactions on Circuits and Systems for Video Technology, vol. 5, pp. 565-567, December 1995. • B. Schoner, C. Jones, and J. Villasenor, "Issues in Wireless Video Coding Using Run-time-reconfigurable FPGAs," Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines, pp. 85-89, Napa, CA, April 1995. • B. Schoner, J. Villasenor, S. Molloy, and R. Jain, "Techniques for FPGA Implementation of Video Compression," ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 1995.
Related Sites • FPGA Based Codec Site • www.icsl.ucla.edu/~ipl • Techniques and Implementations • www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15828-s98/www/index.html • www.ece.cmu.edu/research/piperench/ • Hardware Sites • www.xilinx.com • www.altera.com
Questions and Answers • How does an FPGA compare to a direct hardware implementation? • An FPGA would be slower than today's video cards. I believe this is because today's semiconductor technology cannot yet make FPGA wiring and density optimal. • What is the frame rate of the UCLA video codec? • The frame rate depends on which hardware implementation is used. In the co-processing method, the frame rate is variable (from 7 to 35 frames/second). The pure FPGA implementation runs at 20 frames/second. Although the comparison may look "funny", one must also take into account that the pure FPGA implementation is a much simpler codec than the co-processing method. • How fast can an FPGA reconfigure itself? • Initial design download is 1.6 ms. Global reconfiguration is 3 ms. Partial reconfiguration is 1.5 ms.
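To put the reconfiguration times in context, the sketch below estimates what fraction of each frame period would be spent reconfiguring. It assumes that every frame passes through all three configurations, so the chip is reconfigured three times per frame, and that frames arrive at the 20 frames/second quoted above. Both are simplifying assumptions, not measurements from the presentation.

```python
def reconfiguration_overhead(fps, reconfig_ms, reconfigs_per_frame=3):
    """Fraction of the frame period spent reconfiguring (assumed schedule)."""
    frame_period_ms = 1000 / fps
    return reconfigs_per_frame * reconfig_ms / frame_period_ms


partial = reconfiguration_overhead(fps=20, reconfig_ms=1.5)     # about 0.09 (9%)
global_cfg = reconfiguration_overhead(fps=20, reconfig_ms=3.0)  # about 0.18 (18%)
```

Under these assumptions, partial reconfiguration halves the share of each 50 ms frame period lost to reconfiguration, from about 18% to about 9%.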