Message Passing Systems Packaging Design Space

hina

Presentation Transcript


  1. Message Passing Systems Packaging Design Space • Integration of processor and network interface on one chip • Board-level integration • System-level integration

  2. Integration of Processor and Network Interface • nCube/2, Transputer (INMOS), Blue Gene processor (IBM) • Enables low latency communication

  3. Board-level Integration • A node is implemented on a single board. • Boards include the network interface. • Based on commodity processors. • Boards might also include multiple processors. • Multiple nodes per board (CM5) • Four Sun Sparc1 nodes • First level of the 4-ary tree network • SMP nodes (Altix) • Most current architectures belong to this class.

  4. System-level Integration • Nodes are individual workstations. • Network cards are plugged into the I/O bus. • The networks used are high-speed commodity networks such as Myrinet, Quadrics, and InfiniBand. • Clusters of this kind, built from conventional SMP nodes, are also used for high-availability services such as databases, because the nodes are independent of each other.

  5. Communication Architecture Design Space • Physical DMA • User-level access • Dedicated message passing processing

  6. Physical DMA [Figure: two nodes connected by the network; each node shows Proc, MEMORY, and DMA registers labeled Address, Length, Ready, CMD, and Status/Inter]

  7. Properties • DMA: the address, length, and status registers are memory mapped, or privileged instructions are used to access them. • Addresses are usually physical. • A send request traps to the OS. • Incoming messages are deposited blindly into memory. Input channels have to be open all the time to avoid deadlock. • Message arrival causes an interrupt. • Message handling is usually implemented in the kernel to ensure protection. • Messages are copied into a system buffer. The kernel adds routing information etc. • Some information, such as an error correction code, can be added by hardware. • Protocol overhead is very high for a shared address space: a context switch occurs for each remote access.
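The send and receive path described on this slide can be sketched as a small simulation. This is a minimal illustration, not a model of any real machine: the `Node` and `DmaEngine` classes, the two-byte header layout, and the buffer addresses are all assumptions made for the example.

```python
from dataclasses import dataclass, field

SYSTEM_BUFFER = 128  # assumed physical address of the kernel's system buffer
RECV_BUFFER = 0      # assumed physical address where incoming data lands


@dataclass
class DmaEngine:
    """Hypothetical DMA register set: address, length, and ready flag."""
    address: int = 0
    length: int = 0
    ready: bool = False


@dataclass
class Node:
    node_id: int
    memory: bytearray = field(default_factory=lambda: bytearray(256))
    dma: DmaEngine = field(default_factory=DmaEngine)
    traps: int = 0       # number of send requests that trapped to the OS
    interrupts: int = 0  # number of message-arrival interrupts


def network_transfer(src: Node, dst: Node) -> None:
    """Move the bytes named by the sender's DMA registers to the receiver."""
    start, n = src.dma.address, src.dma.length
    # The receiver has no say: data is deposited blindly into its memory.
    dst.memory[RECV_BUFFER:RECV_BUFFER + n] = src.memory[start:start + n]
    dst.dma.address, dst.dma.length = RECV_BUFFER, n
    src.dma.ready = False
    dst.interrupts += 1  # message arrival raises an interrupt


def kernel_send(src: Node, dst: Node, payload: bytes) -> None:
    """Kernel side of a send: count the trap, copy, add a header, start DMA."""
    src.traps += 1
    # Assumed header: destination node number, then payload length.
    packet = bytes([dst.node_id, len(payload)]) + payload
    src.memory[SYSTEM_BUFFER:SYSTEM_BUFFER + len(packet)] = packet
    src.dma.address, src.dma.length, src.dma.ready = SYSTEM_BUFFER, len(packet), True
    network_transfer(src, dst)


def kernel_receive(node: Node) -> bytes:
    """Interrupt handler side: strip the header and return the payload."""
    start = node.dma.address
    length = node.memory[start + 1]
    return bytes(node.memory[start + 2:start + 2 + length])
```

Every message costs one trap on the sender and one interrupt on the receiver, which is the overhead the last bullet refers to.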

  8. User-Accessible FIFOs [Figure: two nodes connected by the network; each node shows Proc, MEMORY, Status/Inter, and separate User/System FIFOs]

  9. Properties • Distinguishes user-level and system-level messages. • No DMA. • The processor writes into memory-mapped FIFOs. • The network interface performs the protection check, the translation of logical to physical node numbers, and error checking. • User-level messages are delivered without kernel intervention. • Separate FIFOs for user and system messages. • User messages remain in the FIFO until handled (polling), while system messages are handled via interrupts. • Under back pressure, user-level messages also have to be handled via interrupts. • Some state of the parallel application resides in the FIFOs and has to be saved if programs are checkpointed or swapped.
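A rough sketch of these properties follows. The class name `FifoInterface`, the dictionary used as the "network", and the permission set are assumptions chosen for illustration; real hardware would implement the checks in the network interface itself.

```python
from collections import deque


class FifoInterface:
    """Hypothetical network interface with separate user and system FIFOs."""

    def __init__(self, node_map, allowed):
        self.node_map = node_map    # logical node number -> physical node number
        self.allowed = allowed      # logical nodes user code may address
        self.user_fifo = deque()
        self.system_fifo = deque()
        self.interrupts = 0

    def send(self, network, logical_dest, payload, system=False):
        # Protection check is done by the interface, not by the kernel.
        if not system and logical_dest not in self.allowed:
            raise PermissionError(f"logical node {logical_dest} not permitted")
        physical = self.node_map[logical_dest]  # logical-to-physical translation
        network[physical].receive(payload, system)

    def receive(self, payload, system):
        if system:
            self.system_fifo.append(payload)
            self.interrupts += 1    # system messages raise an interrupt
        else:
            self.user_fifo.append(payload)  # user messages wait to be polled

    def poll(self):
        """User-level receive: return the next user message, or None."""
        return self.user_fifo.popleft() if self.user_fifo else None
```

A user message sits in `user_fifo` until the application calls `poll()`, which is also why this state has to be saved when the program is checkpointed or swapped.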

  10. Dedicated Communication Coprocessor [Figure: two nodes connected by the network; each node shows MEMORY, Status/Inter, a ComputeProc in user mode, and a CommProc in system mode]

  11. Dataflow between Main and Communication Processor [Figure: same two-node layout as slide 10, with MEMORY, Status/Inter, ComputeProc (user mode), CommProc (system mode), and the network]

  12. Properties • The communication processor can run in privileged mode. • Clean abstraction, since all hardware details are handled by the communication processor. • Complex protocols can be implemented, e.g. virtual shared memory. • Efficiency is influenced by the cache coherency protocol. • Examples: Intel Paragon, ASCI Red
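The division of labor between the two processors can be sketched with two threads sharing a queue. This is only an analogy under stated assumptions: the shared `queue.Queue` stands in for shared memory, and the frame layout `(dest, length, payload)` is invented for the example.

```python
import queue
import threading


def comm_processor(outbox, network):
    """Stand-in for the communication processor: drains requests and
    handles all protocol details on behalf of the compute processor."""
    while True:
        request = outbox.get()
        if request is None:          # shutdown marker used by this sketch
            break
        dest, payload = request
        # Assumed protocol work: build a frame with destination and length.
        network.append((dest, len(payload), payload))


def run_node(messages):
    """Compute processor side: hand messages over and carry on computing."""
    outbox = queue.Queue()           # stands in for shared memory
    network = []                     # frames that reached the network
    worker = threading.Thread(target=comm_processor, args=(outbox, network))
    worker.start()
    for dest, payload in messages:
        outbox.put((dest, payload))  # no trap, no protocol code here
    outbox.put(None)
    worker.join()
    return network
```

The compute side only enqueues requests; everything hardware-specific stays in `comm_processor`, which mirrors the clean abstraction the slide mentions.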

  13. Communication Processor integrated in Network Interface [Figure: two nodes connected by the network; each node shows MEMORY, a Compute Proc, and a CP inside the network interface]

  14. VIA and InfiniBand • VIA (Virtual Interface Architecture) • Standardized user-level network interface • Specification of the software interface, not of the NIC implementation • Can be fully implemented in the hardware NIC, or major parts of the protocol processing can be on-loaded onto the host processor. • Allows bypassing the OS on the data path • Consumers acquire one or more virtual interfaces via the kernel agent (control path) • Efficient sharing of NICs • Becoming more important with multicore processors

  15. Virtual Interface Stack

  16. Virtual Interface • A VI consists of • Send and receive queue (queue pair, QP) • The consumer puts work requests (WR) in the queues instead of directly accessing the network adapter • Send requests • Usually contain a virtual address and a length • Multiple blocks can be specified for hardware-assisted scatter/gather • Short messages: the request might already contain the payload • Receive requests • Only contain virtual address references

  17. Notification of new requests • Each queue has a doorbell register in the VI network adapter • A store to the doorbell signals new work. • The adapter keeps track of all outstanding requests and processes them autonomously. • VIs are asynchronous interfaces
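The queue pair and doorbell mechanism from slides 16 and 17 can be sketched as follows. All names (`WorkRequest`, `VirtualInterface`, `Adapter`) are hypothetical, the doorbell is modeled as a simple counter, and virtual addresses are treated as plain offsets into one buffer; none of this is taken from the VIA specification.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkRequest:
    # (virtual_address, length) pairs; more than one pair means scatter/gather.
    segments: list


class VirtualInterface:
    """Hypothetical VI: a send queue, a receive queue, and their doorbells."""

    def __init__(self):
        self.send_queue = deque()
        self.recv_queue = deque()
        self.send_doorbell = 0
        self.recv_doorbell = 0

    def post_send(self, wr):
        self.send_queue.append(wr)
        self.send_doorbell += 1   # the store to the doorbell signals new work

    def post_recv(self, wr):
        self.recv_queue.append(wr)
        self.recv_doorbell += 1


class Adapter:
    """Hypothetical adapter that works through outstanding send requests."""

    def __init__(self, memory):
        self.memory = memory      # stands in for the consumer's address space
        self.wire = []            # messages that left the adapter

    def process(self, vi):
        while vi.send_doorbell:
            wr = vi.send_queue.popleft()
            vi.send_doorbell -= 1
            # Gather all blocks named by the request into one message.
            message = b"".join(bytes(self.memory[a:a + n]) for a, n in wr.segments)
            self.wire.append(message)
```

Note that `post_send` returns before anything is transmitted; the data only moves when the adapter gets around to it, which is what makes the interface asynchronous.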

  18. Notification of finished requests • Completion queue • Every work queue can be associated with a completion queue • The consumer can request a completion queue element (CQE)
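One way to picture completion queues is sketched below. It assumes that the consumer asks for a CQE per work request through a hypothetical `signaled` flag, and that several work queues may share one completion queue; the slide does not spell out either detail, so treat both as assumptions.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkRequest:
    wr_id: int
    signaled: bool = True   # assumed flag: does the consumer want a CQE?


class CompletionQueue:
    def __init__(self):
        self.entries = deque()

    def poll(self):
        """Return the next completion queue element, or None if empty."""
        return self.entries.popleft() if self.entries else None


class WorkQueue:
    """Hypothetical work queue bound to a completion queue."""

    def __init__(self, cq):
        self.cq = cq
        self.pending = deque()

    def post(self, wr):
        self.pending.append(wr)

    def adapter_complete_all(self):
        """Stand-in for the adapter finishing every pending request."""
        while self.pending:
            wr = self.pending.popleft()
            if wr.signaled:
                self.cq.entries.append(("done", wr.wr_id))
```

With a shared completion queue, the consumer checks one place for finished work instead of scanning every work queue.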

  19. Kernel agent • Device driver provided by the NIC vendor • Responsible for • setting up, managing, and terminating network connections associated with a VI • Error handling and interrupt processing • Management of system memory used by the NIC

  20. Zero-Copy Interface • WRs include virtual addresses of the buffers in user space • Application programs cannot translate the virtual addresses into physical addresses. • TCP/IP stacks use their own memory buffers and copy the data into those buffers. • I/O devices need to be able to translate virtual addresses into physical addresses, hence the need for their own MMU • VIA requires the memory to be pinned. • Applications register the address ranges with the kernel agent • The kernel agent pins the memory and sets up the translation tables. • Thus setting up a communication is expensive!
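Memory registration can be illustrated with a toy translation table. The 4 KiB page size, the frame numbering starting at 100, and the names `KernelAgent` and `nic_translate` are assumptions for this sketch only.

```python
PAGE_SIZE = 4096  # assumed page size


class KernelAgent:
    """Hypothetical kernel agent that pins pages and fills the NIC's table."""

    def __init__(self):
        self.translation = {}   # virtual page number -> physical frame number
        self.pinned = set()     # virtual pages that may not be swapped out
        self._next_frame = 100  # arbitrary starting frame for the sketch

    def register(self, vaddr, length):
        """Pin every page touched by [vaddr, vaddr + length); return the count."""
        first = vaddr // PAGE_SIZE
        last = (vaddr + length - 1) // PAGE_SIZE
        for vpn in range(first, last + 1):
            if vpn not in self.translation:
                self.translation[vpn] = self._next_frame
                self._next_frame += 1
            self.pinned.add(vpn)
        return last - first + 1


def nic_translate(agent, vaddr):
    """NIC-side lookup: virtual address -> physical address, no copy needed."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in agent.pinned:
        raise LookupError("address was never registered")
    return agent.translation[vpn] * PAGE_SIZE + offset
```

The per-page loop in `register` hints at why setup is expensive: the cost grows with the size of the registered range, so buffers are best registered once and reused.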

  21. VIA Communication
