
Fault-Tolerant Network-Interface for Spatial Division Multiplexing Based Network-on-Chip

This presentation describes a fault-tolerant network interface for Network-on-Chip systems and compares a centralized and a distributed design, with results and conclusions. Fault tolerance is motivated by rising transistor density and the need for graceful performance degradation when faults occur. The experimental setup and a comparison of the two designs are presented.



Presentation Transcript


  1. Fault-Tolerant Network-Interface for Spatial Division Multiplexing Based Network-on-Chip By Anup Das

  2. Content • NoC Overview • TDM-Based • SDM-Based • Existing NI Architecture • New Area Optimized Architecture • Need for Fault-Tolerance • Fault-Tolerant NI Architectures • Centralized Approach • Distributed Approach • Results • Conclusion

  3. Network-on-Chip • Increasing number of IPs/PEs per die • Communication bottleneck with shared bus • Need for a scalable alternative • Use of networking concepts • NoC proposed by Benini et al. [Figure: NoC diagram with Switch, NI, and IP blocks]

  4. Network-on-Chip (contd.) • Two techniques for communication • Time Division Multiplexing • Spatial Division Multiplexing [Figure: TDM-based NoC and SDM-based NoC, each drawn with Switch, NI, and IP blocks and connections labeled A, B, C]

  5. Network Interface Architecture • N to 1 bit serializers – one for each outgoing wire • Data Distributor to send data from output queues to one of the serializers • Each distributor can send data to each of the serializers • Not all the distributors are loaded all the time • A single distributor can serve all the serializers

  6. Network Interface Architecture [Figure: block diagram with PE, Queues 1 to 3, Distributors 1 to 3, n-to-1 serializers on out[0] to out[7], and Switch; buses are 32 bits wide]
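The n-to-1 serializers described above turn a parallel word into a stream of single bits on one outgoing wire. The following is a minimal behavioural sketch in Python, not the author's hardware design. The function names and the LSB-first bit order are assumptions made for illustration; the slides do not state the bit order.

```python
def serialize(word, width=32):
    """Yield the bits of `word` one at a time (assumed order: LSB first)."""
    for i in range(width):
        yield (word >> i) & 1


def deserialize(bits):
    """Rebuild a word from an LSB-first bit stream (the receiving side)."""
    value = 0
    for i, bit in enumerate(bits):
        value |= bit << i
    return value


# One 32-bit word sent over a single wire and reassembled.
word = 0xDEADBEEF
assert deserialize(serialize(word)) == word
```

With 8 outgoing wires and one serializer per wire, up to eight such streams can be in flight at once, which is the spatial multiplexing the slides refer to.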

  7. New Area Optimized NI • Single distributor for all the serializers • New component called “requester” added for interfacing with the queues • 2 IDs introduced – serializer ID (sID) and queue ID (qID) • At connection setup time, each serializer is assigned to a queue • Serializer requests data; the request is forwarded to the corresponding queue • Data from the queue travels back to the requesting serializer

  8. New Area Optimized NI [Figure: block diagram with PE, Queues 1 to 3, Requester, a single Distributor, 32-to-1 serializers on out[0] to out[7], and Switch; buses are 32 bits wide]
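The request path of the area-optimized NI (serializer, distributor, requester, queue, and back) can be sketched as a small software model. This is an illustrative sketch only: the class and method names are hypothetical, and returning `None` for an empty queue is an assumption, since the slides do not say what happens in that case.

```python
from collections import deque


class AreaOptimizedNI:
    """Toy model: one distributor and one requester shared by all serializers."""

    def __init__(self, num_queues=3, num_serializers=8):
        self.queues = [deque() for _ in range(num_queues)]
        self.num_serializers = num_serializers
        self.assignment = {}  # sID -> qID, filled at connection setup

    def setup_connection(self, s_id, q_id):
        """Assign serializer s_id to queue q_id at connection setup time."""
        if not 0 <= s_id < self.num_serializers:
            raise ValueError("unknown serializer ID")
        if not 0 <= q_id < len(self.queues):
            raise ValueError("unknown queue ID")
        self.assignment[s_id] = q_id

    def enqueue(self, q_id, word):
        """The PE writes a word into one of its output queues."""
        self.queues[q_id].append(word)

    def _requester_fetch(self, q_id):
        """Requester side: use qID to pick the queue and read one word."""
        queue = self.queues[q_id]
        return queue.popleft() if queue else None  # None: assumed "no data"

    def request(self, s_id):
        """Serializer s_id asks for data; the sID routes the word back."""
        q_id = self.assignment[s_id]
        word = self._requester_fetch(q_id)
        return s_id, word


ni = AreaOptimizedNI()
ni.setup_connection(s_id=0, q_id=1)
ni.enqueue(1, 0xAB)
print(ni.request(0))
```

The point of the two IDs is visible in `request`: qID selects where the data comes from, and sID tells the single shared distributor where to deliver it.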

  9. Need for Fault-Tolerance • Transistor density on the rise • Shrinking feature size • Increasing number of faults manifesting post fabrication • Yield loss • Need for fault-tolerance • IP/PE level • Interconnect level • Idea is to provide graceful degradation of performance in the event of faults

  10. NI Fault-Tolerance - Centralized • Controller introduced between distributors and IP queues • Changes data mapping dynamically, with load balancing, when a fault occurs [Figure: block diagram with PE, Queues 1 to 3, Controller, Distributors 1 to 3, n-to-1 serializers on out[0] to out[7], and Switch; buses are 32 bits wide]

  11. Centralized NI Operation [Figure: three panels, each showing serializers S1 to S8, distributors D1 to D3, the Controller, and Queues 1 to 3]
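The controller's dynamic remapping can be illustrated with a short sketch. The slides do not give the remapping policy or say which component is assumed faulty, so everything here is an assumption: the sketch treats distributors as the faulty units and spreads queues round-robin over the healthy ones as a simple stand-in for load balancing. The function name `remap` is hypothetical.

```python
def remap(queues, distributors, faulty):
    """Map each queue to a healthy distributor, round-robin (assumed policy)."""
    healthy = [d for d in distributors if d not in faulty]
    if not healthy:
        raise RuntimeError("no healthy distributor left")
    return {q: healthy[i % len(healthy)] for i, q in enumerate(queues)}


queues = ["Queue 1", "Queue 2", "Queue 3"]
distributors = ["D1", "D2", "D3"]

# Fault-free: one queue per distributor.
print(remap(queues, distributors, faulty=set()))
# D2 fails: its queue is moved and the load is shared by D1 and D3.
print(remap(queues, distributors, faulty={"D2"}))
```

In this model the NI keeps working, at reduced capacity, as long as one distributor is healthy, which is the graceful degradation the earlier slide asks for.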

  12. NI Fault-Tolerance - Distributed • Multiple Distributors and Requesters – each capable of fault recovery • Two other IDs included – dID (distributor ID) and rID (requester ID) • When forwarding a request to a requester, the distributor forwards dID, sID and qID • qID – used by the requester to forward the request to a queue • dID – used by the requester to send data from the queue back to the requesting distributor • sID – used by the distributor to send data to the requesting serializer

  13. Distributed NI Operation [Figure: three panels, each showing serializers S1 to S8, distributors D1 and D2, requesters R1 and R2, and Queues 1 to 3]
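The role of the three IDs in the distributed NI can be sketched as a small model of one request and its reply. This is an illustration under stated assumptions, not the author's Verilog: the class names, the `faulty` flag, and the "first healthy requester" fallback are hypothetical, since the slides only say that each distributor and requester is capable of fault recovery.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    d_id: int  # which distributor to reply to
    s_id: int  # which serializer finally gets the data
    q_id: int  # which queue to read from


class Requester:
    def __init__(self, r_id, queues):
        self.r_id = r_id
        self.queues = queues
        self.faulty = False  # hypothetical fault flag

    def serve(self, req):
        """Use qID to read the queue, then reply tagged with dID and sID."""
        queue = self.queues[req.q_id]
        word = queue.popleft() if queue else None
        return req.d_id, req.s_id, word


class Distributor:
    def __init__(self, d_id, requesters):
        self.d_id = d_id
        self.requesters = requesters
        self.delivered = {}  # sID -> words handed to that serializer

    def fetch(self, s_id, q_id):
        """Forward (dID, sID, qID) to a healthy requester and route the reply."""
        requester = next((r for r in self.requesters if not r.faulty), None)
        if requester is None:
            raise RuntimeError("no healthy requester left")
        d_id, s_id_back, word = requester.serve(Request(self.d_id, s_id, q_id))
        assert d_id == self.d_id  # the reply came back to the right distributor
        self.delivered.setdefault(s_id_back, []).append(word)
        return requester.r_id, word


queues = [deque([10, 11]), deque([20])]
r1, r2 = Requester(1, queues), Requester(2, queues)
d1 = Distributor(1, [r1, r2])
print(d1.fetch(s_id=3, q_id=0))  # served by requester 1
r1.faulty = True
print(d1.fetch(s_id=3, q_id=0))  # requester 2 takes over
```

Because every request carries its own dID, sID and qID, any healthy requester can serve it, so no single component has to hold the whole mapping.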

  14. Results

  15. Experimental Setup • NoC considered with 8 links per node • Data packets of size 32 bits • Centralized Design coded in VHDL • Distributed Design in Verilog • Synopsys Design Compiler for ASIC synthesis • UMC 65nm Standard Cells • Area and power numbers from the synthesis tool • Area numbers converted to gate count for comparison across technologies
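The area-to-gate-count conversion mentioned above is commonly done by dividing the synthesized area by the area of a reference gate, typically a 2-input NAND from the same cell library. The slides do not state which reference gate was used, so the sketch below assumes that convention. The function name is hypothetical and the numbers in the usage line are placeholders, not results from the presentation.

```python
def gate_count(area_um2, nand2_area_um2):
    """Express a synthesized area as equivalent 2-input NAND gates (assumed convention)."""
    if nand2_area_um2 <= 0:
        raise ValueError("reference gate area must be positive")
    return area_um2 / nand2_area_um2


# Placeholder values only: 1000 um^2 of logic with a 2 um^2 reference gate.
print(gate_count(1000.0, 2.0))
```

Normalizing this way removes most of the dependence on the process node, which is what makes a comparison across technologies meaningful.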

  16. Area Breakup Centralized Design Distributed Design

  17. Area and Power Comparison

  18. Increasing Fault-Tolerance

  19. Throughput

  20. Summary • Distributed Design more area and power efficient but centralized design becomes more efficient with more distributors • Single fault in the controller of centralized design will render it useless • No single fault will affect distributed NI behavior • Next Step – • Increase granularity of load balancing • Fault-tolerance of Serializer

  21. Thank you
