High Speed Digital Systems Lab
Dynamic Hardware Reconfiguration Controlled by LINUX OS on ZYNQ
Performed by: Itamar Niddam and Lior Motorin
Instructor: Inna Rivkin
Bi-Semesterial. Winter 2012/2013
Department of Electrical Engineering (Electronics, Computers, Communications), Technion, Israel Institute of Technology
SOPC on ZYNQ running LINUX: application performance acceleration
• Standard SOPC approach
• The hardware is constant
• The user can only switch between applications
• Task specific: software only
[Diagram labels: Terminal; Processing System running LINUX; Core 0: A9 ARM; Core 1: A9 ARM; Software application; Peripherals; Controllers; AXI4 bus; Programmable Logic; Hardware accelerator]
SOPC on ZYNQ running LINUX: performance acceleration of applications using dynamic partial hardware reconfiguration
New approach:
• The user controls both the software and the hardware
• Task specific: software and hardware
• Partial dynamic hardware reconfiguration is driven by the OS (LINUX)
• The hardware system is dynamically changed and adapted to a specific application
• The hardware change is done at run time by the application and the OS
[Diagram labels: Terminal; Processing System running LINUX; Core 0: A9 ARM; Core 1: A9 ARM; Application #1; Application #2; Peripherals; Controllers; AXI4 bus; Programmable Logic (dynamic hardware); Custom IP Hardware Accelerator #1; Custom IP Hardware Accelerator #2]
Sobel Edge Detection Filter Example
    for (i = 0; i < height; i++) {
        for (j = 0; j < width; j++) {
            x_dir = 0;
            y_dir = 0;
            /* Skip the one-pixel border, where the 3x3 window would leave the image. */
            if ((i > 0) && (i < (height - 1)) && (j > 0) && (j < (width - 1))) {
                for (rowOffset = -1; rowOffset <= 1; rowOffset++) {
                    for (colOffset = -1; colOffset <= 1; colOffset++) {
                        x_dir = x_dir + input_image[i + rowOffset][j + colOffset] * Gx[1 + rowOffset][1 + colOffset];
                        y_dir = y_dir + input_image[i + rowOffset][j + colOffset] * Gy[1 + rowOffset][1 + colOffset];
                    }
                }
                /* Approximate the gradient magnitude as |Gx| + |Gy|. */
                edge_weight = ABS(x_dir) + ABS(y_dir);
                output_image[i][j] = edge_weight;
            }
        }
    }
Sobel Filter Software Implementation
    > ./filter_cmd sobel_software
[Image: Sobel Filter]
Sobel Filter Software Implementation
Sobel Filter Software & Hardware Specific Accelerator
    > ./filter_cmd hw /home/sobel.bin
[Diagram labels: Processing System running LINUX; Core 0: A9 ARM; Core 1: A9 ARM; Sobel software application; Peripherals; Controllers; AXI4 bus; Programmable Logic (dynamic hardware); Sobel hardware module]
[Image: Sobel Filter]
Sobel Filter Using Hardware Specific Accelerator
SYSTEM COMPONENTS
[Diagram labels: Processing System; Core 0: A9 ARM; Core 1: A9 ARM; UART; USB 0; AXI4; Programmable Logic; LogicBricks HDMI Controller; Custom IP; FMC; HDMI]
Solution architecture
Linux applications use shared objects (SO files) to invoke functions that were not compiled into the application itself. We use that mechanism to create SO files that export the same function symbol but provide different implementations: one in software, and one that drives the hardware.
[Diagram labels: Main App code; Software impl SO; Hardware impl SO; Linux Kernel; Xylon GPU Driver; Processing System (ARM CPU0); Programmable Logic (Xylon Hardware); Zynq ZC-702]
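The shared-object mechanism described above can be sketched with the standard dlopen/dlsym calls. This is a minimal sketch and not the project's code: the helper name load_filter is hypothetical, the symbol name is supplied by the caller, and ZNQ_S32 is assumed to be a 32-bit signed integer. On older glibc versions the program must be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdint.h>
#include <stdio.h>

/* Filter signature as given in the slides; ZNQ_S32 is assumed to be int32_t. */
typedef void (*img_filter_fn)(int32_t *rgb_data_in, int32_t *rgb_data_out,
                              int height, int width, int stride);

/* Hypothetical helper: open the shared object at so_path and look up `symbol`.
 * Returns NULL on failure. On success *handle_out holds the handle that the
 * caller must eventually pass to dlclose(). */
img_filter_fn load_filter(const char *so_path, const char *symbol,
                          void **handle_out)
{
    img_filter_fn fn = NULL;
    void *handle;
    const char *err;

    *handle_out = NULL;
    handle = dlopen(so_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    /* POSIX idiom for storing the dlsym result in a function pointer. */
    *(void **)(&fn) = dlsym(handle, symbol);
    err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return fn;
}
```

Because the software SO and the hardware SO export the same symbol, the main application can call load_filter with either file and then invoke the returned pointer in exactly the same way.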
Solution architecture (cont.)
The hardware-implementation SO loads the required bitstream on the fly and then invokes the newly configured hardware to process the data. The main app gets hardware acceleration for that computation without being aware of it.
[Diagram labels: Main App code; Hardware impl SO; Linux Kernel; Xylon GPU Driver; Processing System (ARM CPU0); Programmable Logic (Xylon Hardware); Zynq ZC-702]
Synthesized Design: PR block
Application with SOFTWARE implementation: flow
• The user invokes a regular application that uses some computation-heavy functions.
• The application loads the SO file and its symbols:
    int main()
    {
        dlopen(...);
        some_heavy_function();
    }
• The call is a regular function call into the software implementation file:
    heavy_computation()
    {
        …
    }
• The SO file contains a list of function symbols and their implementations. Example symbol listing:
    000000000009a850 T regcompW
    0000000000085560 T regerrorA
    000000000009a4d0 T regerrorW
    0000000000085130 T regexecA
    000000000009a0a0 T regexecW
    0000000000084ee0 T regfreeA
    000000000009a000 T regfreeW
                     U sprintf@@GLIBC_2.2.5
                     U strcat@@GLIBC_2.2.5
                     U strcmp@@GLIBC_2.2.5
Application with HARDWARE implementation: flow
• The user invokes a regular application that uses some computation-heavy functions.
• The application loads the SO file and its symbols:
    int main()
    {
        dlopen(...);
        some_heavy_function();
    }
• The SO file contains the same symbol name but with a different implementation.
• The hardware implementation code is invoked.
• The hardware implementation programs the PL of the ZC-702 on the fly with the supplied bitstream.
• The hardware is initialized and started by this function.
• The hardware logic runs under the control of the hardware implementation code.
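The "program the PL on the fly" step can be sketched as streaming the bitstream file into a configuration device node. This is only a sketch under assumptions: the slides do not say which kernel interface the project used, so the device path is a parameter (a node such as /dev/xdevcfg is an assumption, not something stated here), the helper name write_bitstream is hypothetical, and any driver-specific step needed to mark the bitstream as partial is not shown.

```c
#include <stdio.h>

/* Hypothetical helper: copy the bitstream file at bit_path into the
 * configuration device at dev_path (assumed to accept a plain byte stream).
 * Returns the number of bytes written, or -1 on any error. */
long write_bitstream(const char *bit_path, const char *dev_path)
{
    char buf[4096];
    long total = 0;
    size_t n;
    int ok = 1;
    FILE *in;
    FILE *out;

    in = fopen(bit_path, "rb");
    if (in == NULL)
        return -1;

    out = fopen(dev_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Copy in fixed-size chunks until the input is exhausted. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            ok = 0;
            break;
        }
        total += (long)n;
    }
    if (ferror(in))
        ok = 0;

    fclose(in);
    if (fclose(out) != 0) /* a failed flush also counts as a failed load */
        ok = 0;

    return ok ? total : -1;
}
```

In the flow above, the hardware-implementation SO would call such a helper before initializing and starting the accelerator, so the main application never sees the reconfiguration.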
IMPLEMENTATION EXAMPLE: video & image processing filters
• Apply filters to image and video data.
• Each filter should have both a hardware and a software implementation.
• Measure the speedup of the hardware implementation over the software implementation.
• The main function code does not need to change; only the inner functions are swapped, as presented earlier.
IMPLEMENTATION EXAMPLE: video & image processing filters
• We created a new command-line program:
    ./filter_cmd [MODE] [FILE] [TIME_MEASURE / DISPLAY_FILTER_RESULT] [VIDEO/IMAGE]
• MODE:
    0 - No filtering; the "source" image is displayed on the screen
    1 - Hardware filtering
    2 - Software filtering
• FILE: the path of the software or hardware filter file
    In hardware mode, a hardware filter file is expected (for example sobel.bin).
    In software mode, a software filter file is expected: a shared object that contains exactly one function, with the following signature:
    void (*f_img_sw_filter)(ZNQ_S32 *rgb_data_in, ZNQ_S32 *rgb_data_out, int height, int width, int stride);
• TIME_MEASURE / DISPLAY_FILTER_RESULT:
    0 - Measure the processing time
    1 - Display the processing result on the screen
• VIDEO/IMAGE:
    0 - Work on an image pattern as input
    1 - Work on a video pattern as input
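The command-line contract above could be validated along the following lines. This is a hedged sketch, not the tool's actual parser: the struct, enum, and function names are hypothetical, and it assumes that all four arguments are always supplied, including in mode 0.

```c
#include <stdlib.h>

/* Values of the MODE argument as listed in the slides. */
enum filter_mode { MODE_NONE = 0, MODE_HARDWARE = 1, MODE_SOFTWARE = 2 };

/* Hypothetical container for the parsed command line. */
struct filter_cfg {
    enum filter_mode mode;
    const char *file; /* .bin bitstream or .so filter, depending on mode */
    int display;      /* 0 = measure time, 1 = display result */
    int video;        /* 0 = image input, 1 = video input */
};

/* Parse a decimal integer and check that it lies in [lo, hi]. */
static int parse_int_in_range(const char *s, int lo, int hi, int *out)
{
    char *end;
    long v = strtol(s, &end, 10);

    if (end == s || *end != '\0' || v < lo || v > hi)
        return -1;
    *out = (int)v;
    return 0;
}

/* Returns 0 on success and -1 if the arguments do not match the usage. */
int parse_filter_args(int argc, char **argv, struct filter_cfg *cfg)
{
    int mode;

    if (argc != 5)
        return -1;
    if (parse_int_in_range(argv[1], 0, 2, &mode) != 0)
        return -1;
    cfg->mode = (enum filter_mode)mode;
    cfg->file = argv[2];
    if (parse_int_in_range(argv[3], 0, 1, &cfg->display) != 0)
        return -1;
    if (parse_int_in_range(argv[4], 0, 1, &cfg->video) != 0)
        return -1;
    return 0;
}
```

For example, the arguments "1 /home/sobel.bin 0 0" would select hardware filtering with the sobel.bin file, time measurement, and an image input.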
SOBEL FILTER • The Sobel operator is used in image processing, particularly within edge detection algorithms. Technically, it is a discrete differentiation operator, computing an approximation of the gradient of the image intensity function. SOBEL
SEPIA FILTER • In photography, toning is a method of changing the color of black-and-white photographs. The effects of these processes can be emulated with software in digital photography. SEPIA
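A software sepia filter with the shared-object signature shown earlier might look like the sketch below. Several things here are assumptions rather than facts from the slides: the weights are the commonly quoted sepia coefficients (scaled by 1000), ZNQ_S32 is taken to be int32_t, the pixel layout is taken from the hardware example in this deck (B in bits 7..0, G in bits 15..8, R in bits 23..16), and stride is taken to be the row pitch in pixels.

```c
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit channel range. */
static uint32_t clamp_u8(uint32_t v)
{
    return v > 255u ? 255u : v;
}

/* Tone one pixel. The top byte of the word is passed through unchanged. */
uint32_t sepia_pixel(uint32_t p)
{
    uint32_t r = (p >> 16) & 0xffu;
    uint32_t g = (p >> 8) & 0xffu;
    uint32_t b = p & 0xffu;

    /* Commonly quoted sepia weights, scaled by 1000 (an assumption). */
    uint32_t nr = clamp_u8((393u * r + 769u * g + 189u * b) / 1000u);
    uint32_t ng = clamp_u8((349u * r + 686u * g + 168u * b) / 1000u);
    uint32_t nb = clamp_u8((272u * r + 534u * g + 131u * b) / 1000u);

    return (p & 0xff000000u) | (nr << 16) | (ng << 8) | nb;
}

/* Whole-image wrapper; stride is assumed to be the row pitch in pixels. */
void sepia_filter(int32_t *rgb_data_in, int32_t *rgb_data_out,
                  int height, int width, int stride)
{
    int i, j;

    for (i = 0; i < height; i++) {
        for (j = 0; j < width; j++) {
            int idx = i * stride + j;
            rgb_data_out[idx] =
                (int32_t)sepia_pixel((uint32_t)rgb_data_in[idx]);
        }
    }
}
```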
INSTAGRAM FILTER • Mapping colors as follows: • Green -> Blue. • Blue -> Red. • Red -> Green.
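As a software counterpart, the channel mapping above could be written with the shared-object filter signature as follows. This is a sketch under assumptions: ZNQ_S32 is taken to be int32_t, the pixel layout (B in bits 7..0, G in bits 15..8, R in bits 23..16) is borrowed from the hardware example in this deck, stride is taken to be the row pitch in pixels, and the function names are hypothetical.

```c
#include <stdint.h>

/* Rotate the channels of one pixel: Green -> Blue, Blue -> Red, Red -> Green.
 * The top byte of the word is passed through unchanged. */
uint32_t rotate_channels(uint32_t p)
{
    uint32_t r = (p >> 16) & 0xffu;
    uint32_t g = (p >> 8) & 0xffu;
    uint32_t b = p & 0xffu;

    /* New R = old B, new G = old R, new B = old G. */
    return (p & 0xff000000u) | (b << 16) | (r << 8) | g;
}

/* Whole-image wrapper; stride is assumed to be the row pitch in pixels. */
void instagram_sw_filter(int32_t *rgb_data_in, int32_t *rgb_data_out,
                         int height, int width, int stride)
{
    int i, j;

    for (i = 0; i < height; i++) {
        for (j = 0; j < width; j++) {
            int idx = i * stride + j;
            rgb_data_out[idx] =
                (int32_t)rotate_channels((uint32_t)rgb_data_in[idx]);
        }
    }
}
```

Applying the rotation three times returns the original pixel, which gives a simple sanity check for either implementation.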
SECURITY FILTER
• A common data-filtering technique used for security purposes.
• Randomly changes the internal representation of the picture (its byte array).
• Does not change the picture as the user sees it (the human eye cannot notice the change).
• Visits each pixel of the picture and randomly increases or decreases its brightness by 1.
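One way to read the description above is sketched below: each pixel gets a random delta of +1 or -1, applied to all three color channels and clamped to the 8-bit range. Applying the same delta to every channel is an interpretation of "brightness", not something the slides specify. ZNQ_S32 is assumed to be int32_t, the pixel layout is assumed to be B in bits 7..0, G in bits 15..8, R in bits 23..16, and stride is assumed to be the row pitch in pixels.

```c
#include <stdint.h>
#include <stdlib.h>

/* Add delta to one 8-bit channel and clamp the result to 0..255. */
static uint32_t shift_channel(uint32_t c, int delta)
{
    int v = (int)c + delta;

    if (v < 0)
        v = 0;
    if (v > 255)
        v = 255;
    return (uint32_t)v;
}

/* Brighten or darken one pixel by delta; the top byte is passed through. */
uint32_t perturb_pixel(uint32_t p, int delta)
{
    uint32_t r = shift_channel((p >> 16) & 0xffu, delta);
    uint32_t g = shift_channel((p >> 8) & 0xffu, delta);
    uint32_t b = shift_channel(p & 0xffu, delta);

    return (p & 0xff000000u) | (r << 16) | (g << 8) | b;
}

/* Whole-image wrapper: each pixel independently gets +1 or -1. */
void security_filter(int32_t *rgb_data_in, int32_t *rgb_data_out,
                     int height, int width, int stride)
{
    int i, j;

    for (i = 0; i < height; i++) {
        for (j = 0; j < width; j++) {
            int idx = i * stride + j;
            int delta = (rand() & 1) ? 1 : -1;

            rgb_data_out[idx] =
                (int32_t)perturb_pixel((uint32_t)rgb_data_in[idx], delta);
        }
    }
}
```

rand() is used only to keep the sketch short; a filter that is actually meant to serve a security purpose would need a better source of randomness.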
INSTAGRAM FILTER EXAMPLE
    #include <stdio.h>
    #include "instagram.h"

    // RGB to "Instagram" conversion: rotate the color channels
    RGB instagram_operator(RGB *rgb)
    {
        RGB instagram;
        short B;
        short R;
        short G;

        R = rgb->R.to_int();
        G = rgb->G.to_int();
        B = rgb->B.to_int();

        instagram.R = (unsigned char) B;  // Blue  -> Red
        instagram.G = (unsigned char) R;  // Red   -> Green
        instagram.B = (unsigned char) G;  // Green -> Blue
        return instagram;
    }
INSTAGRAM FILTER EXAMPLE – Cont.
    // Main function for the Instagram filter.
    // Pixels are streamed in and out, one pixel per loop iteration.
    void instagram_filter(AXI_PIXEL inter_pix[MAX_HEIGHT][MAX_WIDTH],
                          AXI_PIXEL out_pix[MAX_HEIGHT][MAX_WIDTH],
                          int rows, int cols)
    {
        // The 8-bit color components are packed into a 24-bit container;
        // B is the least significant byte.

        // Create AXI streaming interfaces for the core
        AP_BUS_AXI_STREAMD(inter_pix, INPUT_STREAM);
        AP_BUS_AXI_STREAMD(out_pix, OUTPUT_STREAM);

        AP_INTERFACE(rows, ap_none);
        AP_INTERFACE(cols, ap_none);
        AP_BUS_AXI4_LITE(rows, CONTROL_BUS);
        AP_BUS_AXI4_LITE(cols, CONTROL_BUS);
        AP_CONTROL_BUS_AXI(CONTROL_BUS);

        int i;
        int j;
INSTAGRAM FILTER EXAMPLE – Cont.
        for (i = 0; i < rows; i++) {
            for (j = 0; j < cols; j++) {
    #pragma AP PIPELINE II = 1
                RGB to_instagram;
                RGB instagram;
                AXI_PIXEL input_pixel;
                AXI_PIXEL output_pixel;
                ap_uint<8> padding = 0xff;

                // Unpack the input pixel: B in bits 7..0, G in 15..8, R in 23..16
                input_pixel = inter_pix[i][j];
                to_instagram.B = input_pixel.data.range(7, 0);
                to_instagram.G = input_pixel.data.range(15, 8);
                to_instagram.R = input_pixel.data.range(23, 16);

                instagram = instagram_operator(&to_instagram);

                // Repack as {padding, R, G, B} by concatenation
                output_pixel.data = (instagram.R, instagram.G);
                output_pixel.data = (output_pixel.data, instagram.B);
                output_pixel.data = (padding, output_pixel.data);

                // Mark the last pixel of each row
                if (j == (cols - 1))
                    output_pixel.last = 1;
                else
                    output_pixel.last = 0;

                out_pix[i][j] = output_pixel;
            }
        }
    }
Android STEPS
• We set up a development environment to modify and compile the Android OS and the Linux kernel.
• We configured a fully working Android OS system on ZYNQ with a touch screen.
• We implemented custom IP cores that worked correctly on Android in a "static" hardware mode.
• We developed a simple Linux device driver (a char device) that Android on Zynq can read from and write to.
• We enabled smooth access to the custom hardware from any typical Android Java code.
• We implemented a PR driver on the Android OS.
• Partial reconfiguration failed on Android: the system crashes during the reconfiguration.
Xilinx Platform Studio • Xilinx SDK • VIVADO HLS • XPS & SDK - Setup and configure the base system (which can run Android OS). • Xilinx Vivado HLS - implement a custom IP module using a native programming language (C). • XPS & SDK - Integrate the custom IP with the system.
Developing the custom device driver in C. • Developing the HAL (Hardware Abstraction Layer) which supplies a simple interface between the user App and the custom hardware. • Customizing the Android Kernel in order to provide the partial reconfiguration OS support. • C/C++ for Android Kernel
Developing a custom Android application in Java, which can use the HAL in order to get the services provided by the custom IP we implemented. • Developing the system service which is a part of the HAL. • Java Eclipse