A SELF-ORGANIZING LEARNING ARRAY AND ITS HARDWARE-SOFTWARE CO-SIMULATION
Janusz Starzyk and Yongtao Guo
School of Electrical Engineering and Computer Science, Ohio University, Athens, OH 45701, U.S.A.
September 2003
OUTLINE
• 1. Introduction
• 2. SOLAR Principle
• 3. Simulation Results
• 4. HW/SW Co-Simulation
• 5. Hardware Organization
• 6. Conclusion
Self-Organizing Learning Array (SOLAR)
• New learning algorithm
  • Multi-layer structure and on-line learning
  • Local and sparse interconnections
  • Entropy-based self-organized learning
• Superior performance
  • Parallel computing organization
  • Low power dissipation
  • Efficient communication
  • High chip utilization rate
• Potential to be a leading technology in machine learning
  • Paves the way to machine intelligence application areas including pattern recognition, intelligent control, signal processing, robotics, and biological research
DARPA: Cognitive Information Processing Technology
• Wanted: a machine that can reason, using substantial amounts of knowledge
• Can learn from its experiences so that its performance improves with knowledge and experience
• Can explain itself and can accept direction
• Is aware of its own behavior and reflects on its own capabilities
• Responds in a robust manner to a surprise
Self-Organizing Learning ARray (SOLAR) (Dowling, 1998, p. 17)
Here, the three quantities represent the class probabilities, the attribute probabilities, and the joint probabilities, respectively.
Self-organizing Principle
• Neuron self-organization includes:
  • Selection of inputs
  • Choosing the transformation function
  • Setting the threshold
  • Providing output probabilities
  • Setting output control
Self-organizing Process: Matlab Simulation [figure panels: initial interconnection; learning process]
Credit Card Data Set: SOLAR self-organizing structure [figure]
SW/HW Codesign of SOLAR [diagram labels: software running on PC; PCI bus; hardware board; Virtex XCV800 FPGA; dynamic configuration; JTAG programming]
Cosimulation: What and Why?
• Cosimulation
  • Simulation of heterogeneous systems whose hardware and software components interact
• Benefits of cosimulation
  • Verifying correct functionality of the target even before the hardware is built
  • Profiling the dynamic behavior
  • Identifying the performance bottleneck
  • Preventing system-integration problems such as over-design or under-design
  • Reducing system development cost and cycle time
Traditional Cosimulation Environment
Two simulators
• A software process
  • Written in a high-level language, such as C/C++
• A simulation process of the hardware model
  • Written in a hardware description language, such as VHDL
• Inter-process communication (IPC) routines
  • Connect the hardware process and the software process
[Diagram labels: Software Model (C program) with IPC routines; IPC; Hardware Model (VHDL) with foreign IPC procedures]
Traditional Cosimulation
• To perform cosimulation, two simulators must be combined and complex IPC routines must be developed. These routines are error-prone because they have to handle various data formats and processed signals.
• When the focus is on the hardware part, the software part should be kept minimal and the HW/SW communication simple and reliable.
SOLAR Cosimulation
One simulator
• A software process
  • Written in behavioral VHDL, which is not synthesizable
• A hardware process
  • Written in RTL VHDL, which is synthesizable
• HW/SW communication
  • FSM and FIFOs
[Diagram labels: Software Model (Behavioral VHDL); FSM and FIFOs; Hardware Model (RTL VHDL)]
SOLAR Cosimulation
• SOLAR cosimulation uses a single VHDL simulator, so complex, error-prone IPC is avoided, and data formats and related problems are easy to handle.
• The HW/SW interface is implemented with several FIFOs controlled by an FSM, which is simple, reliable, and easily modified.
• File I/O functions are used to simplify the design of the software part when the focus is on implementing the hardware part.
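The FSM-and-FIFO interface described above can be illustrated with a small software model. The sketch below is Python rather than VHDL, and everything specific in it is an assumption made for illustration: the state names (IDLE, LOAD, RUN, DRAIN), the FIFO depth, and the doubling operation that stands in for the hardware are hypothetical and not taken from the SOLAR design.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    # Hypothetical state names, chosen only for this illustration.
    IDLE = auto()
    LOAD = auto()   # "software" pushes data into the input FIFO
    RUN = auto()    # "hardware" pops, processes, and pushes results
    DRAIN = auto()  # "software" reads the output FIFO


class Fifo:
    """Bounded first-in first-out buffer (the depth is an assumed parameter)."""

    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    def full(self):
        return len(self.items) >= self.depth

    def empty(self):
        return not self.items

    def push(self, value):
        if self.full():
            raise OverflowError("FIFO full")
        self.items.append(value)

    def pop(self):
        return self.items.popleft()


def cosimulate(samples, depth=4, hw_op=lambda x: 2 * x):
    """Move samples through input FIFO -> hw_op -> output FIFO under FSM control."""
    in_fifo, out_fifo = Fifo(depth), Fifo(depth)
    pending = deque(samples)
    results, trace = [], []
    state = State.IDLE
    while True:
        trace.append(state)
        if state is State.IDLE:
            if not pending:
                break                      # nothing left to transfer
            state = State.LOAD
        elif state is State.LOAD:
            while pending and not in_fifo.full():
                in_fifo.push(pending.popleft())
            state = State.RUN
        elif state is State.RUN:
            while not in_fifo.empty():
                out_fifo.push(hw_op(in_fifo.pop()))
            state = State.DRAIN
        elif state is State.DRAIN:
            while not out_fifo.empty():
                results.append(out_fifo.pop())
            state = State.IDLE
    return results, trace
```

Calling `cosimulate(range(1, 11))` moves ten samples through the interface in bursts of at most four, and the returned trace records the sequence of FSM states visited. Because both sides see only the FIFOs and the FSM state, either side can be replaced without touching the other, which is the property the slide attributes to this interface.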
Co-simulation System Decomposition
[Diagram: three layers: system architecture modelling (behavioral VHDL), interface modeling (RTL VHDL), and self-organizing learning architecture (structural VHDL). Block labels: Main, Initialization, File I/O, SOLAR, "Training over?" (Yes/No), Input FIFO, Output FIFO, Control FIFO, FSM, Interface, MEM, OP, EBE, REG]
SW Organization: VHDL Model
• All functions and signal variables in the packages are shared, and program execution is functionally interleaved.
• The lower-level packages describe system input and output, and the initialization and update of the memory elements in the network.
• The higher-level packages encapsulate new system functions built on the functions described by the lower-level packages.
• The highest-level function, which represents the software part of the overall system, implements system organization and management.
Single Neuron's Hardware Architecture
Figure 4: Single neuron's learning architecture [block labels: interface, REG, CTRL, OP, ALU, main controller, FIFO/DMA CTRL, 1024x32 FIFO]
Interface Process [timing diagram of SW and HW activity over time; labels: configuration, conf done, send command, send data, read registers, DMA request, start, wait, command over, receive data]
Interface Modeling
Figure 5: Interface modeling using FSM & FIFO [diagram labels: software (behavioral VHDL); hardware (structural VHDL); interface with 4 FIFOs, Ctrl, and memory module; transitions numbered 1 to 6; training, class, other]
Interface Simulation Small Training Data Set
System Synchronized Work [timeline labels: software work, interface work, hardware work, time]
Incremental Prototyping
• The HW function is completely defined and prototyped.
• Overall system design can be accelerated by replacing a HW subcomponent with real hardware once it has been successfully simulated.
[Diagram labels: HW function; VHDL-simulated (incremental part); t]
EBE Simulation
The main procedures are:
• Sending data from software to chip memory
• Triggering the start signal
• ALU calculation for all data
• Moving calculated results to intermediate memory
• Threshold scanning & ID calculation
• Updating the intermediate values
• Data movement if the current ID is optimal
• Repeating from 3 to 6 until all functions are scanned
• Sending data from chip to software
In this simulation waveform, the signals "Opt_Threshold" and "ID" represent the optimal threshold and the corresponding information index deficiency for this particular training neuron in its learning subspace.
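The repeated ALU calculation, threshold scanning, and optimality check described above can be sketched in software. The Python below is a minimal illustration and not the EBE implementation. The slide does not give the formula for ID or say whether a larger or smaller value is optimal, so the sketch assumes a simple information index, one minus the ratio of the class entropy remaining after a threshold split to the class entropy before it, and treats a higher value as better. The names `entropy`, `information_index`, and `scan` are hypothetical, and the transformation functions are supplied by the caller.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_index(values, labels, threshold):
    """Assumed measure: 1 - H(class | side of threshold) / H(class)."""
    base = entropy(labels)
    if base == 0.0:
        return 0.0
    below = [c for v, c in zip(values, labels) if v < threshold]
    above = [c for v, c in zip(values, labels) if v >= threshold]
    n = len(labels)
    remaining = (len(below) / n) * entropy(below) \
        + (len(above) / n) * entropy(above)
    return 1.0 - remaining / base


def scan(data, labels, functions):
    """Try every function and every candidate threshold; keep the best.

    Returns (index, function name, threshold).
    """
    best = (-1.0, None, None)
    for name, fn in functions.items():
        outputs = [fn(x) for x in data]            # ALU calculation
        for threshold in sorted(set(outputs)):      # threshold scanning
            index = information_index(outputs, labels, threshold)
            if index > best[0]:                     # keep result if optimal
                best = (index, name, threshold)
    return best
```

As a usage example, with `data = [-3, -2, -1, 1, 2, 3]`, `labels = [1, 1, 0, 0, 1, 1]`, and candidate functions for identity, squaring, and negation, only squaring separates the two classes with a single threshold, so `scan` selects it.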
EBE Prototyping: SOLAR Training
• Minimum period: 23.140 ns (maximum frequency: 43.215 MHz)
• Minimum input arrival time before clock: 11.036 ns
• Maximum output required time after clock: 13.758 ns
• Mapped onto Virtex (57.8% logic, 60.3% route)
Run Time
For instance, suppose a particular neuron has 1024 subspace data items.
• PC to chip: 38 x 1024 = 38912 CLKs
• ALU calculation: 16 x 1024 = 16384 CLKs
• Threshold scan & ID calculation (maximum): (4 x 1024 + 7) x 1024 = 4201472 CLKs
• Data movement (maximum): 1 x 1024 = 1024 CLKs
• Chip to PC: 1 x 1024 = 1024 CLKs
• Other (starting sequence, wait, handshaking, etc.): 20 x 1024 = 20480 CLKs
Total, with the ALU calculation, threshold scan, and data movement repeated for each of the 7 functions: 38912 + (16384 + 4201472 + 1024) x 7 + 1024 + 20480 = 29592576 CLKs
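The clock-count arithmetic above can be reproduced in a few lines of Python. The per-stage cycle counts come from the slide. Converting the total into seconds is an added assumption: it supposes the chip runs at the 23.140 ns minimum period reported for the EBE prototype, which the slides do not state for this estimate.

```python
N = 1024        # subspace data items for one neuron (from the slide)
FUNCTIONS = 7   # functions scanned per neuron (from the slide)

pc_to_chip = 38 * N
alu = 16 * N
threshold_scan = (4 * N + 7) * N   # maximum
data_movement = 1 * N              # maximum
chip_to_pc = 1 * N
other = 20 * N                     # starting sequence, wait, handshaking, etc.

# ALU calculation, threshold scan, and data movement repeat per function.
per_function = alu + threshold_scan + data_movement
total_clks = pc_to_chip + per_function * FUNCTIONS + chip_to_pc + other

# Assumption: clock period equals the prototype's reported minimum period.
MIN_PERIOD_NS = 23.140
seconds = total_clks * MIN_PERIOD_NS * 1e-9

print(total_clks)
print(round(seconds, 3))
```

The threshold scan dominates: it grows with the square of the number of data items, while every other stage grows linearly.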
SOLAR will grow
• Chip (Virtex XCV1000): 1 million gates
• Board (6 chips, 2x3): 6 million gates
• Rack (4 boards, 1x4): 24 million gates
• System (16 cabinets, 4x4): half a billion gates