[Figure: DNP interfaces. BUS Slave (receives commands from RISC & DSP); BUS Master (reads from intra-tile memories); BUS Master (simultaneous intra-tile memory write); 3DT X+, X-, Y+, Y-, Z+, Z- (forward/receive inter-tile off-chip packets); Collective communication; NoC (forwards/receives inter-tile on-chip packets).]

SHAPES: scalable Software Hardware Architecture Platform for Embedded Systems
Hardware Architecture
Atmel Roma, INFN Roma, ST Microelectronics Grenoble, Università di Cagliari, Università di Pisa

SHAPES Tiled Architecture
The key objectives for deep sub-micron technologies are the minimization of wire delay problems and the management of design complexity in projects with several hundred million gates. The challenge is to find a scalable HW/SW design style for future CMOS technologies. Tiled architectures suggest a possible path: "small" processing tiles connected by "short" (next-neighbour) wires. We propose a tiled architectural strategy that employs building blocks that scale on future silicon technologies.

The DNP receives commands issued by the masters (RISC or DSP processors) on its AMBA AHB slave port and uses up to two AHB master ports to sustain the data flow between the communication source and destination. For intra-tile communications, this set of master and slave interface ports suffices. Inter-tile communication requires the cooperation of at least two DNPs hosted on different tiles, so the DNP is also equipped with a set of inter-tile interfaces. A first set of these interfaces is used for off-chip communications on the 3DT (3D toroidal next-neighbour topology) and on a collective communication tree (CTV). A second set of interfaces will be used for on-chip communication through a dedicated NoC architecture.

The DNP is packet based: it sends, receives and routes packets with a fixed-size header and a variable-size payload.
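The packet structure described above (fixed-size header, variable-size payload) can be illustrated with a minimal Python sketch. The poster does not give the actual DNP header fields, so the layout used here (destination and source tile coordinates plus a 16-bit payload length, 8 bytes in total) and the names `Packet`, `encode` and `decode` are purely hypothetical.

```python
import struct
from dataclasses import dataclass

# Hypothetical header layout (NOT the real DNP format):
# destination x, y, z and source x, y, z (one byte each),
# followed by the payload length in bytes (16-bit, big-endian).
HEADER_FORMAT = ">6BH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # always 8 bytes


@dataclass(frozen=True)
class Packet:
    dest: tuple      # (x, y, z) of the destination tile
    src: tuple       # (x, y, z) of the source tile
    payload: bytes   # variable-size payload


def encode(packet):
    """Serialize a packet as fixed-size header followed by its payload."""
    if len(packet.payload) > 0xFFFF:
        raise ValueError("payload too large for the 16-bit length field")
    header = struct.pack(HEADER_FORMAT, *packet.dest, *packet.src,
                         len(packet.payload))
    return header + packet.payload


def decode(stream):
    """Decode one packet from the front of stream.

    Returns the packet and the bytes that follow it.
    """
    if len(stream) < HEADER_SIZE:
        raise ValueError("incomplete header")
    *coords, length = struct.unpack(HEADER_FORMAT, stream[:HEADER_SIZE])
    end = HEADER_SIZE + length
    if len(stream) < end:
        raise ValueError("incomplete payload")
    packet = Packet(tuple(coords[:3]), tuple(coords[3:]),
                    bytes(stream[HEADER_SIZE:end]))
    return packet, stream[end:]
```

Because the header size is constant, a receiver can always read the first `HEADER_SIZE` bytes, learn the payload length from them, and then know exactly where the next packet starts.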
[Figure label: mAgicV Floating-Point VLIW DSP]

Spidergon-STNoC
Spidergon-STNoC (S-STNoC) is the Network-on-Chip (NoC) technology currently being developed at STMicroelectronics. It consists of a network of micro-routers interconnected in a Spidergon topology. The main task of S-STNoC in SHAPES is to provide the inter-tile on-chip communication services. Different tiles can be connected on the same silicon chip through the DNP to S-STNoC interconnection: the DNP has a port connected to the S-STNoC Network Interface (S-STNoC NI), a block responsible for converting packets travelling from the DNP to the S-STNoC into an S-STNoC-compatible packet format, and packets travelling from the S-STNoC to the DNP into the DNP packet format.

SHAPES tiled hardware architecture
The RDT Tile
The basic SHAPES tile, the RDT (RISC DSP Tile), will be equipped with:
• one RISC microcontroller;
• one VLIW FP DSP (mAgicV);
• one DNP (Distributed Network Processor);
• a DPM (Distributed Program Memory);
• a DDM (Distributed Data Memory);
• the POT (a set of Peripherals On Tile);
• one interface for the DXM (Distributed External Memory) owned by each tile.

The Interfaces of the Distributed Network Processor
The DNP uses a deterministic routing policy to implement communications on the 3D torus network: a packet follows a fixed rule as it traverses the network, hopping from one DNP to the next until it reaches its final destination. DNP routing is deadlock-free, which is achieved by implementing virtual channels.

The DNP may host a small processor to implement some advanced features in an easy way, without resorting to complex VHDL coding, while allowing for easy upgrades and bug fixing.

As for programming models, the DNP will support three different kinds of network API: Systolic Communications, Remote Data Memory Access (RDMA) and Message Passing Interface (MPI).
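The poster does not say which fixed rule the DNP applies on the 3D torus. One widely used deterministic policy is dimension-order routing, in which a packet first corrects its X coordinate, then Y, then Z, taking the shorter way around each ring. The Python sketch below illustrates that policy under this assumption only. The function names are hypothetical, the port names reuse the 3DT X+/X-/Y+/Y-/Z+/Z- labels of the DNP interfaces, and virtual channels are not modelled.

```python
AXES = ("X", "Y", "Z")


def next_port(current, dest, dims):
    """Return the output port ('X+', 'Y-', ...) for the next hop.

    Returns None when the packet has already reached dest.
    Assumed policy: dimension-order routing, shortest way around each ring.
    """
    for axis, c, d, n in zip(AXES, current, dest, dims):
        forward = (d - c) % n          # hops needed in the '+' direction
        if forward == 0:
            continue                   # this dimension is already correct
        # Ties (exactly half the ring) are resolved towards '+'.
        return axis + ("+" if forward <= n - forward else "-")
    return None


def step(current, port, dims):
    """Coordinates of the neighbouring tile reached through port."""
    i = AXES.index(port[0])
    delta = 1 if port[1] == "+" else -1
    nxt = list(current)
    nxt[i] = (nxt[i] + delta) % dims[i]   # wrap around the torus
    return tuple(nxt)


def route(src, dest, dims):
    """List of ports traversed from src to dest, one entry per hop."""
    ports = []
    cur = tuple(src)
    while (port := next_port(cur, dest, dims)) is not None:
        ports.append(port)
        cur = step(cur, port, dims)
    return ports
```

On a 4x4x4 torus, `route((0, 0, 0), (3, 1, 0), (4, 4, 4))` yields `['X-', 'Y+']`: the wrap-around link makes X- a single hop instead of three hops through X+. Since every DNP applies the same rule to the same header, the path depends only on the source and destination coordinates.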
Spidergon NoC Topology
• Networks-on-Chip are mainly based on three kinds of component:
  • Routers
  • Network Interfaces
  • Physical Links
• Each Router is connected point-to-point to three other routers and to one Network Interface.
• NoCs hide interconnect-specific implementation details from the IP resources attached to them. All an external IP needs in order to transmit data through the network is a specifically designed Network Interface.

mAgicV VLIW DSP
The ATMEL mAgicV VLIW DSP is a low-power, high-performance numerical processor operating on IEEE 754 40-bit extended precision floating-point data (10 operations per cycle) and 32-bit integer data (16 operations per cycle). Power consumption is expected to be 200 mW/GFlops at 65 nm. mAgicV is equipped with one AHB master port and one AHB slave port for system-on-chip integration. It has 256 data registers, 64 address registers, 10 independent arithmetic operating units, 2 independent address generation units and a DMA engine driving the AHB master port.

[Figure: The SHAPES RDT Tile]

Contacts
Alessandro.Lonardo@roma1.infn.it
Mersia.Perra@roma1.infn.it
Davide.Rossetti@roma1.infn.it

References
www.shapes-p.org
Paolucci, P. S., Jerraya, A. A., Leupers, R., Thiele, L., and Vicini, P. 2006. "SHAPES: a tiled scalable software hardware architecture platform for embedded systems." In Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS '06), Seoul, Korea. ACM Press, 167-172.

Distributed Network Processor
The INFN Distributed Network Processor (DNP) provides data transport functionality to the tile, performing inter-tile communications both on-chip and off-chip. The DNP also acts as a DMA controller for intra-tile communications (e.g. between the DSP internal data memory DDM and the tile memory DXM).
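As a side note on the Spidergon NoC topology summarized earlier in this section, the "three routers" attached to each router can be made concrete with a short Python sketch. It assumes the commonly published description of Spidergon: an even number of routers arranged in a ring, each linked to its clockwise neighbour, its counterclockwise neighbour and the router diametrically across the ring. The poster itself only states that each router connects to three others, and the function name `spidergon_neighbors` is hypothetical.

```python
def spidergon_neighbors(node, num_nodes):
    """Routers directly linked to node in a Spidergon ring of num_nodes routers.

    Assumes the usual Spidergon definition: ring links in both directions
    plus one 'across' link to the diametrically opposite router.
    """
    if num_nodes < 4 or num_nodes % 2 != 0:
        raise ValueError("this sketch expects an even number of routers, at least 4")
    if not 0 <= node < num_nodes:
        raise ValueError("node index out of range")
    return {
        "clockwise": (node + 1) % num_nodes,
        "counterclockwise": (node - 1) % num_nodes,
        "across": (node + num_nodes // 2) % num_nodes,
    }
```

For example, in an 8-router network, router 0 is linked to routers 1, 7 and 4. The across link is what keeps hop counts low compared with a plain ring, while every router still needs only three router-to-router ports plus one for its Network Interface.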