Device Driver for Generic ASC Module Project Presentation. Sponsored by: High Speed Digital Systems Lab Parallel Systems Lab. By: Yigal Korman Erez Fuchs Instructor: Evgeny Fiksman. Abstract ASC – A Stream Compiler Project Goal Development Platform Hardware Platform
Device Driver for Generic ASC Module: Project Presentation • Sponsored by: High Speed Digital Systems Lab, Parallel Systems Lab • By: Yigal Korman, Erez Fuchs • Instructor: Evgeny Fiksman
Table of Contents • Abstract • ASC – A Stream Compiler • Project Goal • Development Platform • Hardware Platform • Software Platform • Project Research & Development • Part A: Hardware Side • Part B: Software Side
Abstract • Problem • Many complex functions require a lot of CPU resources • Find the best way to implement these functions under the constraints of resources and cost • Possible Solutions • Pure software implementation: low cost, low performance • Pure hardware implementation: high cost, high performance • Combination of software & hardware: ASC technology, a fine balance between cost and performance
Vision - Future Workflow combining hardware & software design • Write a conventional software implementation • Locate critical code sections • Convert these sections into a hardware implementation • Compile the software program and create a hardware-specific netlist • Communication between the software and the hardware will be added automatically during compilation • Load the netlist into the FPGA and execute the program alongside it • High performance, low cost!
ASC – A Stream Compiler • Combined (SW/HW) code • Write the software code in C/C++ • Add hardware sections in between using C++ code • Hardware is described by special C++ libraries • Hardware is compiled into standard netlist output (.edif) • Supported by standard CAD tools • Supported by Xilinx FPGA architectures • Provides HW optimization [Figure: Internal Design]
ASC Code Example
    #include "asc.h"
    main(int argc, char **argv) {
        printf("Hello World\n");
        STREAM_START;                  // ASC code start
        // Hardware Variable Declarations
        HWint a(IN, 32), b(OUT, 32);
        STREAM_LOOP(SIZE);
        b = a + 1;
        STREAM_END;                    // ASC code end
    }
[Slide labels: Software, Hardware]
ASC – Not Everything Is Perfect • No automatic addition of communication between software and hardware • ASC creates the hardware netlist and the software program but doesn't provide an interface for communication between them • The designer is left with the difficult task of implementing an interconnection between the HW & SW and integrating it into the design • This is where we come into play...
Project Goal: Complete The Equation • Create a generic communication system between the software side and the hardware side • Demonstrate a working system [Diagram labels: Software, Embedded OS, Stream Driver, Hardware, Hardware System (FPGA), Hardware Logic (ASC)]
Device Driver for Generic ASC Module: Stream Driver • Define a simple generic dual-sided interface through which the hardware and the software will communicate • Implement the interconnect device driver • The hardware will use specific ports as inputs/outputs • The software will use specific function calls to send data into and out of the hardware core • The device driver will manage the transport of data in between and all other communication
Hardware Platform [Photo: Xilinx ML310 Development Board] • Xilinx ML310 development board • Based on the Virtex-II Pro FPGA chip, which includes a versatile programmable logic array and 2 PowerPC CPUs • Includes the hardware components commonly found in computer systems: IDE, NIC, USB, SDRAM, PCI and more
Hardware Tools • Xilinx Platform Studio 6.3 • IDE for creating programmable platform designs • Runs library generation, compiler tool chains and linker script generation • Creates implementation and simulation netlists • Development, debug and verification tools • Board support package for the ML310
Software Platform • MontaVista Embedded Linux • Based on open source Linux kernel 2.4.x • Independent file system • Can develop hardware specific drivers for the operating system • Supports the Xilinx ML310 board: Device drivers written specifically for ML310 peripherals
Software Tools • MontaVista DevRocket 1.1 • Eclipse-based IDE to write code projects in C/C++ for the MV Embedded Linux • Configure target settings • Develop & debug platform code • Create deployable platform images • Includes a cross-compiler for the PowerPC CPUs on the ML310 board
Project Goals – Part A: Hardware side • Study the Xilinx development platform • Study the use of interrupts & DMA with IP core • Design a communication protocol between the ASC IP core and the software • Implement an example for direct communication between a specific IP core and its software wrapper running on the CPU (non OS)
Hardware Interface: ASC Backend • ASC couples the hardware logic it creates with a generic backend: • Provides a way to communicate with the logic • Read and Write FIFOs for transferring data into and from the logic • Data registers for internal status queries as well as fine tuning the logic • We will create a driver that will communicate with this backend
Hardware Interface: ASC Backend, continued… [Block diagram of the ASC backend. Labels: Data In, Write FIFO, Processing Pipe, Read FIFO, Data Out, Pipe Flushing, Flush, Clock, Reset, Registers Access, ASC Stream Configuration Registers]
ASC – Pipe Workflow [Animation: sample words move from the Write FIFO through the Processing Pipe into the Read FIFO. Legend: # = flush, garbage data, data overwritten] • Data is inserted into the Write FIFO and pushed through the Processing Pipe • The results of the calculations are pushed from the pipe to the Read FIFO in cyclic order • Garbage data is inserted into the pipe during a flush operation
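The pipe workflow above can be sketched as a toy software model. This is only an illustration under stated assumptions, not the ASC implementation: the pipe depth (`PIPE_LATENCY`), the filler value (`GARBAGE`), the squaring operation and all function names are hypothetical. The model treats the pipe as a shift register, so every word that enters pushes the oldest word out into the Read FIFO, and a flush pushes filler words in until the last real results have emerged.

```c
#include <stddef.h>
#include <stdint.h>

#define PIPE_LATENCY 2            /* assumed pipe depth in words (hypothetical) */
#define GARBAGE      0xDEADBEEFu  /* stand-in for undefined filler data */

/* Toy processing pipe: a shift register holding PIPE_LATENCY words. */
struct pipe_model {
    uint32_t stage[PIPE_LATENCY];
};

/* After reset (or after a previous flush) the pipe holds only filler. */
static void pipe_reset(struct pipe_model *p)
{
    for (size_t i = 0; i < PIPE_LATENCY; i++)
        p->stage[i] = GARBAGE;
}

/* One clock: the oldest word leaves for the Read FIFO, a new one enters. */
static uint32_t pipe_clock(struct pipe_model *p, uint32_t entering)
{
    uint32_t leaving = p->stage[PIPE_LATENCY - 1];
    for (size_t i = PIPE_LATENCY - 1; i > 0; i--)
        p->stage[i] = p->stage[i - 1];
    p->stage[0] = entering;
    return leaving;
}

/* Stream n words in (squaring each one as the example logic), then flush.
 * Every word that leaves the pipe lands in read_fifo, which must have room
 * for n + PIPE_LATENCY words. Returns the number of words stored. */
static size_t stream_and_flush(struct pipe_model *p, const uint32_t *in,
                               size_t n, uint32_t *read_fifo)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        read_fifo[count++] = pipe_clock(p, in[i] * in[i]);
    for (size_t i = 0; i < PIPE_LATENCY; i++)   /* flush with filler words */
        read_fifo[count++] = pipe_clock(p, GARBAGE);
    return count;
}

/* Garbage avoidance in this model: the first PIPE_LATENCY words in the
 * Read FIFO are filler, so valid results start at this index. */
static size_t first_valid_index(void)
{
    return PIPE_LATENCY;
}
```

In this model, streaming the words 3, 5, 2, 7 leaves two filler words followed by 9, 25, 4, 49 in the Read FIFO, so a reader that skips `first_valid_index()` words sees only real results.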
Part A - Investigate • PLB IPIF Core • Provides a bi-directional interface between a user IP core and the PLB 64-bit bus standard • Supports various services and features that can be optioned in or out: • Local IP interrupt connection with user software • Programmable enables/disables • User software triggered reset generator for localized reset of user’s core • User configured RdFIFO and WrFIFO with optional IP packet support • DMA function with optional Scatter/Gather mechanization • Configurable user address ranges, which can be used to directly address the user IP on the PLB bus
Part A – Hardware Design: Connecting the ASC to the PLB Bus [Block diagram labels: PLB Bus, PLB IPIF (Slave Attachment, Master Attachment, Master Request Arbiter, DMA/Scatter Gather, Byte Steering, S/W Reset, Interrupt Device ISC, IPIF Interrupts), Clock Divider (PLB Clock to ASC Clock), ASC Backend; signals: IP Reset, PLB to IP Address, PLB to IP Rd/Wr Request, PLB to IP Data, IP to PLB Data, Device Interrupt, Slave Reply, Load Transfer Request/Reply, Transfer Request, Request Status, PLB Master Reply] • Use IPIF facilities to implement the hardware interface to the ASC • DMA controller for data transfer • Interrupt lines for asynchronous transfers • Reset lines • The ASC core will connect to the IPIF and be accessed through the user addressable memory the IPIF provides • Use PLB clock signal + clock divider for the ASC stream clock
Part A – Hardware Design: Data Flow [Diagram labels: PPC 405 CPU, Interrupt Controller, PLB Bus, PLB IPIF (Interrupt Support, DMA), ASC Backend (R/W FIFOs), ASC Logic, Memory; arrows: Set DMA, DMA Done, Data to ASC, Data to Memory]
Part A – Hardware Test • Configure and test the hardware: • Load ML310 with default configuration • SDRAM support – large memory capacity • Local Area Network support – data transfer between development environment and target board via FTP (for software development part) • PCI support – for future extension • Add a simple ASC core to the system
Part A – Hardware Test Continued… • Write a simple test to transfer data between the OS memory and ASC core • Use Xilinx standalone OS API and code examples: • Data IO • DMA Transfer • Interrupt handling • Check that the underlying hardware works correctly • The test program will provide a conceptual foundation for the driver
Part A – Hardware Test Flow Diagram • Start • Initialization: init DMA status; init IPIF core • Enable interrupts: DMA interrupts on IPIF; CPU exceptions; interrupt controller • Register interrupt handler in interrupt controller • Prepare DMA transaction • DMA write data to ASC • Wait on DMA interrupt
Part A – Hardware Test Flow Diagram, continued… • Interrupt handler: ACK DMA interrupt handling; return transaction status • DMA done? (No / Yes) • Prepare DMA transaction • DMA read data from ASC • Wait on DMA interrupt • Interrupt handler: ACK DMA interrupt handling; return transaction status • DMA done? (No / Yes) • Check read data • Finish, or Error
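The test flow above can be sketched in plain C with the hardware replaced by stubs. This is a simulation for illustration only: none of these functions are Xilinx standalone API calls, the "DMA" is a `memcpy` that completes immediately, and the "ASC core" is assumed to be a simple loopback buffer with no processing. All names (`dma_start`, `wait_for_dma`, `hardware_test`, and so on) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORDS 4

/* --- Simulated hardware (hypothetical stand-ins, not Xilinx API) --- */
static uint32_t asc_fifo[WORDS];   /* pretend ASC-side storage (loopback) */
static volatile int dma_done;      /* set by the "interrupt handler" */
static int dma_status;             /* 0 = OK, recorded by the handler */

/* Interrupt handler: acknowledge the DMA interrupt and record the
 * transaction status for the waiting code. */
static void dma_interrupt_handler(int status)
{
    dma_status = status;
    dma_done = 1;
}

/* Prepare and start a "DMA" transaction. In this simulation the copy
 * finishes at once and then raises the interrupt itself. */
static void dma_start(uint32_t *dst, const uint32_t *src, size_t words)
{
    dma_done = 0;
    memcpy(dst, src, words * sizeof *src);
    dma_interrupt_handler(0);
}

/* Wait on the DMA interrupt, then return the transaction status. */
static int wait_for_dma(void)
{
    while (!dma_done)
        ;   /* real hardware would sit here until the interrupt fires */
    return dma_status;
}

/* Runs the whole flow. Returns 0 on Finish, -1 on Error. */
static int hardware_test(void)
{
    uint32_t tx[WORDS] = {1, 2, 3, 4};
    uint32_t rx[WORDS] = {0};

    dma_start(asc_fifo, tx, WORDS);        /* DMA write data to ASC */
    if (wait_for_dma() != 0)
        return -1;

    dma_start(rx, asc_fifo, WORDS);        /* DMA read data from ASC */
    if (wait_for_dma() != 0)
        return -1;

    /* Check read data: the loopback should return what was written. */
    return memcmp(tx, rx, sizeof tx) == 0 ? 0 : -1;
}
```

The completion flag is `volatile` because on real hardware it would be written from interrupt context while the main flow polls it.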
Project Goals – Part B: Software side • Hardware is ready and functional • Install Linux • Study the MontaVista development environment • Study writing Linux device drivers • Design device driver specification • Write the device driver
Design: A Stream Driver • The driver's function is to communicate with the ASC backend • The ASC core always works as stream hardware: a pipeline of data is streamed through the core, where it is evaluated, calculated and output • The driver will give the user flexibility and comply with Linux driver standards
Device Driver: Module vs. Kernel • Our choice: implement the driver as a MODULE
Linux Device Driver System Stack [Layer diagram, top to bottom: Custom & Third Party Applications (Test Application) • Middleware & Application Services (MontaVista Graphics, Advanced Embedded Features, Networking & Application Packages) • MontaVista Linux Kernel (Real-time Functionality, Robust Security, High Performance Networking, High Reliability) • Device Drivers (ASC Stream Driver) • Reference Hardware (ASC Module)]
The Stream Driver Features • Read & Write to ASC FIFOs • ASC Register access • Fast asynchronous data transfers (Interrupts & DMA) • ASC pipe flush & garbage avoidance • Status information • Mapping hardware memory to user space • Support of multiple ASC modules
Stream Driver Workflow • The best way to explain something is by example • Load the device driver module • Create a file in the file system to represent the device (ASC core) • Open the device file • Configure the driver with device parameters • Write data into the device • Flush processing pipe to receive results • Read results from the device • Close the device file
Stream Driver Workflow: Load the device driver module
    # insmod stream_driver.o
• This loads the device driver into the kernel and makes the driver's functionality available to users • The driver has a unique hard-coded identifier (major number) that marks the device files that should use this driver (our number is 120) • The driver can be used simultaneously with more than one device
Stream Driver Workflow: Create a file in the file system to represent the device (ASC core)
    # mknod /dev/stream_device0 c 120 0
• This creates a single character device file called 'stream_device0' and connects it to our driver through the major number (120) • More than one file can be created, each with a different name and a different minor number (0 here), to represent more than one ASC core in the system
Stream Driver Workflow: Open the device file
    device_file = open("/dev/stream_device0", O_RDWR);
• This is part of a C program using the Linux system call API • Supply the location of the file and the mode in which to operate, usually read & write (O_RDWR) • This runs the driver's initialization code in the background • Our driver will not fully initialize until it receives configuration information that lets it correctly identify the device
Stream Driver Workflow: Configure the driver with device parameters
    ioctl(device_file, STREAM_IOCTL_CONFIG, &device_conf);
• Use a special function (ioctl) to communicate with the driver: pass the device file (device_file) and the command (STREAM_IOCTL_CONFIG) • The last argument is the configuration info itself:
    struct stream_config device_conf = {
        .module_base_address = XPAR_TEST_PLB_CORE_0_BASEADDR,
        .module_high_address = XPAR_TEST_PLB_CORE_0_HIGHADDR,
        .dma_channels_offset = PLB_STREAM_IPIF_DMA_SG_SPACE_OFFSET,
        .irq                 = XPAR_OPB_INTC_0_TEST_PLB_CORE_0_IP2INTC_IRPT_INTR,
        .stream_base_address = XPAR_TEST_PLB_CORE_0_AR0_BASEADDR,
        .stream_high_address = XPAR_TEST_PLB_CORE_0_AR0_HIGHADDR,
        .stream_num_inputs   = NUM_INS,
        .stream_num_outputs  = NUM_OUTS,
        .stream_letancy      = STREAM_LATENCY,
        .stream_cycle        = STREAM_CYCLE,
    };
Stream Driver Workflow: Write data into the device
    write(device_file, data_in, sizeof(data_in));
• Use the write command to pass a data buffer (data_in) to the device • This command is blocking: the calling program waits until the transfer is complete, but the rest of the system keeps running because an interrupt notifies the driver when the transfer has finished
Stream Driver Workflow: Flush processing pipe to receive results, then read results from the device
    fsync(device_file);
• This command asks the driver to activate the flush mechanism in the ASC module. The driver also calculates the location and quantity of garbage data inserted during the flush
    read(device_file, data_out, sizeof(data_out));
• Reading is as simple as writing: the user does not need to locate the correct position in the read FIFO, because the driver does this automatically
Stream Driver Workflow: Close the device file
    close(device_file);
• Always close the device after use. This frees valuable resources occupied by the driver • You can also unload the driver from the kernel with:
    # rmmod stream_driver
• The driver supports more operations, such as direct memory mapping (mmap), register access (STREAM_IOCTL_REGISTER_READ/WRITE) and others; see the project book for more detail
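Put together, the user-space side of the workflow (open, configure, write, flush, read, close) might look like the sketch below. It is an illustration under assumptions rather than the project's actual code: the field types in `struct stream_config`, the ioctl request number behind `STREAM_IOCTL_CONFIG`, and the helper name `stream_run` are all hypothetical, since the real definitions live in the driver's header, which the slides do not show. Only the field names and the order of calls are taken from the slides.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout: field names follow the slides, types are assumed. */
struct stream_config {
    unsigned long module_base_address;
    unsigned long module_high_address;
    unsigned long dma_channels_offset;
    unsigned int  irq;
    unsigned long stream_base_address;
    unsigned long stream_high_address;
    unsigned int  stream_num_inputs;
    unsigned int  stream_num_outputs;
    unsigned int  stream_letancy;      /* spelled as on the slide */
    unsigned int  stream_cycle;
};

/* Hypothetical request number; the real value comes from the driver. */
#define STREAM_IOCTL_CONFIG _IOW('s', 1, struct stream_config)

/* Run one complete transaction against a stream device file.
 * Returns 0 on success, -1 if any step fails. */
static int stream_run(const char *path, const struct stream_config *conf,
                      const void *data_in, size_t in_len,
                      void *data_out, size_t out_len)
{
    int rc = -1;
    int fd = open(path, O_RDWR);               /* open the device file */
    if (fd < 0)
        return -1;

    if (ioctl(fd, STREAM_IOCTL_CONFIG, conf) == 0 &&          /* configure */
        write(fd, data_in, in_len) == (ssize_t)in_len &&      /* write data */
        fsync(fd) == 0 &&                                     /* flush pipe */
        read(fd, data_out, out_len) == (ssize_t)out_len)      /* read results */
        rc = 0;

    close(fd);                                 /* always release the device */
    return rc;
}
```

Checking every return value matters here because each step depends on the previous one: reading before a successful flush, for example, could return garbage or incomplete results.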
Demonstration • Load Driver • Run Test Application • Test Results • Unload Driver
The Future • Loading custom hardware during runtime: no need to reboot and reload the netlists • All you will need to do is tell the driver the location of the ASC module hardware implementation, and it will load it into the FPGA chip:
    struct module myAsc = {
        .netlist     = "my_asc.netlist",
        .device_file = "",
    };
    ioctl(module_generator, STREAM_IOCTL_GENERATE, &myAsc);
    write(myAsc.device_file, data_in, sizeof(data_in));
    ...
The Future, continued… • Fully automated platform for writing and running ASC-oriented applications • Write the program, run the compiler and receive a running application that uses custom-made hardware acceleration • An application that will identify critical code sections in the program, convert them into ASC hardware modules and insert the appropriate calls to the device driver [Diagram labels: Software, Operating System, Hardware]
Thanks… • Special thanks to our instructor Evgeny for his teaching, ideas and help • Many thanks to the High Speed Digital Systems Laboratory for their support and patience: Eli, Ina, Mony and of course Bruriya