Experiments with the Peripheral Virtual Component Interface. Roman L. Lysecky, Frank Vahid*, Tony D. Givargis Dept. of Computer Science & Engineering University of California, Riverside *also with the Center for Embedded Computer Systems, UC Irvine.
Experiments with the Peripheral Virtual Component Interface
Roman L. Lysecky, Frank Vahid*, Tony D. Givargis
Dept. of Computer Science & Engineering, University of California, Riverside
*also with the Center for Embedded Computer Systems, UC Irvine
This work was supported by the National Science Foundation under grant # CCR-9811164, and by a Design Automation Conference graduate scholarship.
Introduction
• Advent of Systems-on-a-Chip (SOCs) and cores
• Peripheral cores
  • Microprocessor support components
  • UARTs, DMA controllers, CODECs, off-chip bus interfaces, etc.
• Problem: how to integrate cores into different SOCs having different on-chip peripheral buses?
[Figure: a core library of peripheral cores feeding a system-on-a-chip (microprocessor, memory, on-chip system bus, bridge, on-chip peripheral bus, peripheral cores) and other systems]
Introduction: The Core Integration Problem
• Solution 1: User modifies core for specific bus
  • Could accidentally change the core's functionality
• Solution 2: Different core version per bus
  • Can't consider all buses
• Solution 3: Standard bus
  • Not likely [VSIA]
• Solution 4: Bus wrappers
  • Promising, but how much overhead?
[Figure: the four solutions applied to a core library: a core modified for peripheral bus X; separate core versions for buses X, Y and Z; cores written for a standard bus (std); a peripheral core attached to bus X through a bus wrapper for X]
Introduction
• Bus wrapper approach
  • Proposed by Virtual Socket Interface Alliance
  • Separate core into internals and bus wrapper
• PVCI: Peripheral Virtual Component Interface
  • Standard between wrapper and internals
• Eases integration
  • Only bus wrapper need be modified for different buses
• What overhead comes with a bus-wrapper solution?
[Figure: system-on-a-chip (microprocessor, memory, bridge, on-chip system bus, on-chip peripheral bus) in which each peripheral core consists of a bus wrapper connected through PVCI to the peripheral core internals]
Setup for evaluating PVCI overhead
• Digital camera example
  • Synthesizable RTL VHDL
  • Synopsys synthesis, simulation and power analysis
  • About 100,000 cells
• 3 versions of the CCD and CODEC peripherals
  • Integrated
  • Non-PVCI wrapper (bi-direct.)
    • Designed before PVCI
  • PVCI wrapper (uni-direct.)
• 2 peripheral buses
  • ISA
  • Custom
[Figure: digital camera block diagram: MIPS, MEM. and BIOS on the system bus; a BRIDGE to the on-chip peripheral bus; CODEC and CCD on the peripheral bus]
PVCI general structure
• Two uni-directional buses
• Handshake control
• Synchronous
[Figure: peripheral core on the on-chip peripheral bus; the bus wrapper connects to the peripheral core internals through the PVCI signals wdata, rdata, clock, val, ack, read, address]
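The handshake named on this slide can be illustrated with a small cycle-level model. This is a minimal sketch, not the PVCI specification: only the signal names (val, ack, read, address, wdata, rdata) come from the slide, while the `PeripheralInternals` class, the `transfer` helper, the `latency` parameter and the exact val/ack semantics are assumptions made for illustration.

```python
class PeripheralInternals:
    """Toy core internals: a register file that raises ack after `latency` clock edges.

    Hypothetical model; the real PVCI timing rules are not given in the slides.
    """

    def __init__(self, latency=1):
        self.regs = {}
        self.latency = latency
        self._wait = 0  # clock edges seen since val went high

    def clock(self, val, read, address, wdata=None):
        """Process one rising clock edge and return (ack, rdata)."""
        if not val:
            self._wait = 0
            return False, None
        self._wait += 1
        if self._wait < self.latency:
            return False, None          # still busy, wrapper must keep val high
        self._wait = 0
        if read:
            return True, self.regs.get(address, 0)   # data on the rdata bus
        self.regs[address] = wdata                    # data taken from the wdata bus
        return True, None


def transfer(core, read, address, wdata=None, max_cycles=16):
    """Wrapper side: hold val high until ack is seen; return (rdata, cycles used)."""
    for cycle in range(1, max_cycles + 1):
        ack, rdata = core.clock(True, read, address, wdata)
        if ack:
            return rdata, cycle
    raise TimeoutError("core internals never asserted ack")


core = PeripheralInternals(latency=2)
write_result = transfer(core, read=False, address=0x10, wdata=0xAB)
read_result = transfer(core, read=True, address=0x10)
```

Separate wdata and rdata arguments mirror the two uni-directional buses, so the model needs no tri-state or direction control, unlike a bi-directional data bus.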
Experiments with the ISA bus
• 23-bit address bus
• 32-bit bi-directional data bus
• 4 cycles per access minimum
• Slower peripherals can extend access time using the iochrdy signal
[Figure: bus master and peripheral (bus slave) connected by isa_addr, isa_data, isa_ior, isa_iow, isa_ale and isa_iochrdy; timing diagram of clock, isa_addr, isa_ale, isa_data, isa_ior, isa_iow and isa_iochrdy, with "start transfer" and "data ready" marked]
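The two timing rules on this slide (a 4-cycle minimum, extendable through iochrdy) can be sketched as a small function. The 4-cycle figure and the signal name come from the slide; the sampling rule, the `isa_access_cycles` name and the trace representation are assumptions for illustration, not the ISA specification.

```python
ISA_MIN_CYCLES = 4  # minimum access length stated on the slide


def isa_access_cycles(iochrdy_trace):
    """Return how many cycles one access takes.

    iochrdy_trace[i] is the level of isa_iochrdy sampled in cycle i + 1
    (True means the peripheral is ready).
    Assumption: the access ends on the first cycle at or after the
    4-cycle minimum in which isa_iochrdy is high.
    """
    for cycle, ready in enumerate(iochrdy_trace, start=1):
        if cycle >= ISA_MIN_CYCLES and ready:
            return cycle
    raise ValueError("peripheral never released isa_iochrdy")


fast_peripheral = isa_access_cycles([True, True, True, True])
slow_peripheral = isa_access_cycles([True, False, False, False, False, True])
```

Under this model a peripheral that is ready early still costs 4 cycles, while one that holds iochrdy low past the fourth cycle stretches the access.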
Experiments with the ISA bus

Version            Ex.     Size of wrapper   Size of internals   Time for 1 frame   Power for 1 frame
Integrated         ccd     0                 34367               82955              7.88
                   codec   0                 1968
Non-PVCI wrapper   ccd     1684              34556               82955              8.11
                   codec   1679              1904
PVCI wrapper       ccd     1478              33978               82955              7.97
                   codec   1474              1588

PVCI vs. Integrated
• Size overhead of about 1000 gates per peripheral
• Power overhead of about 0.05 milliwatts (<1%)
• No performance overhead
  • Since ISA has 4-cycle minimum access delay
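The overhead figures quoted for the ISA bus can be rechecked from the table. The sketch below is only arithmetic on the table values; it assumes that sizes are in gates and power in milliwatts (the units used in the bullets), and that the power difference is split evenly between the two wrapped peripherals. The dictionary layout and function names are illustrative.

```python
# (wrapper size, internals size) per peripheral, and power for one frame (ISA bus)
isa = {
    "integrated": {"ccd": (0, 34367), "codec": (0, 1968), "power": 7.88},
    "pvci": {"ccd": (1478, 33978), "codec": (1474, 1588), "power": 7.97},
}


def total_size(version, core):
    """Wrapper plus internals for one peripheral."""
    wrapper, internals = isa[version][core]
    return wrapper + internals


def size_overhead(core):
    """Extra size of the PVCI version over the integrated version."""
    return total_size("pvci", core) - total_size("integrated", core)


ccd_overhead = size_overhead("ccd")
codec_overhead = size_overhead("codec")

# Assumption: two wrapped peripherals (ccd, codec) share the power increase equally.
power_overhead_per_core = (isa["pvci"]["power"] - isa["integrated"]["power"]) / 2
power_overhead_percent = 100 * power_overhead_per_core / isa["integrated"]["power"]
```

Both peripherals come out a little above 1000 gates, and the per-core power difference is roughly 0.05 under the stated assumptions.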
Experiments with a custom peripheral bus
• Similar to ISA, but...
  • No 4-cycle minimum
  • Handshake
• Performance overhead on reads can occur
[Figure: read timing diagrams. Integrated version: clock, bus_addr, bus_data, bus_ior, bus_rdy, with data ready asserted by the core. Wrapper version: the same bus signals asserted by the bus wrapper, plus wrp_addr, wrp_data, wrp_read, wrp_ack, with data ready asserted by the core internals; the extra delay is marked as performance overhead]
Experiments with a custom peripheral bus

Version            Ex.     Size of wrapper   Size of internals   Time for 1 frame   Power for 1 frame
Integrated         ccd     0                 34320               75175              7.90
                   codec   0                 1926
Non-PVCI wrapper   ccd     1661              34556               79054              8.11
                   codec   1674              1904
PVCI wrapper       ccd     1439              33978               79054              7.98
                   codec   1434              1588

PVCI vs. Integrated
• Size overhead of about 1000 gates per peripheral
• Power overhead of about 0.05 milliwatts (<1%)
• Performance overhead of about 5% in this example
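The 5% performance figure follows from the "Time for 1 frame" column for the custom bus. A short check, using only values from the table (the time unit is not stated in the slides, but the ratio does not depend on it; the variable names are illustrative):

```python
# Time to process one frame on the custom peripheral bus, from the table
time_integrated = 75175
time_pvci = 79054

extra_time = time_pvci - time_integrated
performance_overhead_percent = 100 * extra_time / time_integrated

print(f"extra time per frame: {extra_time}")
print(f"performance overhead: {performance_overhead_percent:.2f}%")
```

The same calculation on the ISA table gives zero, since all three versions there take 82955 per frame.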
Experiments
• 1000 gates per core overhead is fairly small
  • Typical peripheral core may have from 5,000 to 20,000 gates [Inventra library]
• 0.05 milliwatts per core overhead is also small
• No performance overhead with ISA bus
• Performance overhead of 5% on reads with faster bus
  • Essentially due to reads taking 4 cycles instead of 2 cycles
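One way to see why the wrapper costs time on the custom bus but not on ISA is to model a read as taking the larger of the bus minimum and the peripheral's response time. The 2-cycle and 4-cycle read figures and the 4-cycle ISA minimum come from the slides; the `read_cycles` function, the max-based model and the use of a 1-cycle floor for the custom bus are simplifying assumptions.

```python
INTEGRATED_READ = 2   # cycles for an integrated core to answer a read (from the slide)
WRAPPED_READ = 4      # cycles when the read passes through a bus wrapper (from the slide)


def read_cycles(bus_minimum, response_cycles):
    """Assumed model: a read lasts until both the bus minimum and the core are satisfied."""
    return max(bus_minimum, response_cycles)


# ISA bus: 4-cycle minimum hides the wrapper latency
isa_integrated = read_cycles(4, INTEGRATED_READ)
isa_wrapped = read_cycles(4, WRAPPED_READ)

# Custom bus: no 4-cycle minimum (modelled here as a 1-cycle floor)
custom_integrated = read_cycles(1, INTEGRATED_READ)
custom_wrapped = read_cycles(1, WRAPPED_READ)
```

In this model both ISA cases take 4 cycles, while the custom bus exposes the 2-cycle versus 4-cycle difference on every read.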
Conclusions
• Overheads in size, power and performance of PVCI vs. Integrated core were small
• Only significant overhead was performance in certain cases
  • Our earlier work on pre-fetching can reduce or eliminate this overhead [ISSS'99, DATE'00]
  • Remerging the bus wrapper with core internals can also reduce this overhead
• PVCI and non-PVCI cores were competitive
• Integration advantages of bus-wrapper approach seem to come with acceptable overhead