M. Modarressi, H. Sarbazi-Azad, and A. Tavakkol

Performance and Power Efficient On-Chip Communication Using Adaptive Virtual Point-to-Point Connections. M. Modarressi, H. Sarbazi-Azad, and A. Tavakkol, Computer Engineering Department, Sharif University of Technology, Tehran, Iran (modarressi@ce.sharif.edu).


Presentation Transcript


  1. Performance and Power Efficient On-Chip Communication Using Adaptive Virtual Point-to-Point Connections
     M. Modarressi, H. Sarbazi-Azad, and A. Tavakkol
     Computer Engineering Department, Sharif University of Technology, Tehran, Iran
     modarressi@ce.sharif.edu

  2. Outline
     • Introduction and Motivations
     • Virtual Point-to-Point (VIP) Connections
     • Static VIP Construction Scheme
     • Dynamic VIP Construction Scheme
     • Setup Network
     • Evaluation Results
     • Conclusions and Future Work

  3. On-Chip Communication Mechanisms
     • Packet-Switched NoCs
       • Good Resource Utilization
       • Modest Design Effort/Time Due to Structured and Predictable Links
       • Some Power and Performance Overheads Due to Multi-Stage Pipelined Routers
     • Dedicated Point-to-Point Links
       • Ideal Power and Performance
       • Poor Scalability: Significant Area Overhead for Large Systems
       • Significant Design Effort/Time Due to Non-Predictable Link Properties
     • Virtual Point-to-Point Connections in a Packet-Switched NoC

  4. VIP Connections
     • VIP: VIrtual Point-to-point Connections
     • Over One VC (Virtual Channel) of Each Physical Channel
     • Bypass Some Router Pipeline Stages
     • Inexpensive Extensions to a Traditional Wormhole Router
       • Router Control Unit, Arbiter, Buffer of the VIP Virtual Channels

  5. Router Architecture
     • Buffer at the VIP Virtual Channels Is Replaced by a Register (1-Flit Buffer)
     • VIP Paths Are Kept by VIP Allocator Units at Output Ports
       • Each Allocator Determines Which Input Is Connected to Its Port Along the VIP
       • Allocates the Output Port to the VIP When Control Signals Indicate That the VIP Has an Incoming Flit to Forward
     • A Flow-Control Mechanism Prevents Starvation of Packet-Switched Flits

  6. VIP Connections
     • A VIP Is Constructed by Chaining the VIP Registers in the Routers Between the Source and Destination Nodes of a Communication Flow
     • Provides a Virtual Dedicated Pipelined Link With 1-Flit VIP Buffers as Staging Registers
     • Flits Only Travel Over the Crossbars and Links Which Cover the Actual Physical Distance Between Their Source and Destination Nodes
       • Skip the Buffer Read, Buffer Write, and Allocation Operations

  7. VIP Connections
     • VIPs Are Not Allowed to Share a Common Link
       • To Remove Buffering, Arbitration, …
     • A Limited Number of VIPs in a Network
     • But VIPs Cover a Significant Portion of On-Chip Traffic Due to Communication Locality
       • In Most Multi-Core SoC Applications Each Core Communicates With a Few Other Cores
       • In CMP Workloads Each Node Tends to Have a Small Number of Favored Destinations for Its Messages

  8. VIP Construction Algorithm - Static
     • Based on Application Traffic Pattern
     • Input Applications Are Described by a Task-Graph (TG)
     • A Heuristic Algorithm
       • Map the TG Cores onto the Nodes of a Mesh-Based NoC
       • Construct VIPs for TG Edges in Order of Their Communication Volumes
       • Find a Path Through the Packet-Switched Network for a TG Edge If There Are Not Sufficient Free Resources to Build a VIP for It
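The heuristic on slide 8 can be sketched as a greedy allocator. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the cores are taken as already mapped to mesh coordinates, only the XY and YX dimension-order routes are tried as candidates, and "free resources" is modeled as directed links not yet claimed by another VIP (following the no-sharing rule on slide 7). The names `dimension_order_path` and `build_static_vips` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Node = Tuple[int, int]    # (x, y) mesh coordinate (assumed representation)
Link = Tuple[Node, Node]  # directed link between two adjacent nodes
Flow = Tuple[Node, Node]  # (source, destination) of a task-graph edge


def dimension_order_path(src: Node, dst: Node, x_first: bool) -> List[Link]:
    """Return the links of the XY (x_first=True) or YX minimal route."""
    links: List[Link] = []
    cur = src
    for axis in ((0, 1) if x_first else (1, 0)):
        while cur[axis] != dst[axis]:
            step = 1 if dst[axis] > cur[axis] else -1
            nxt = (cur[0] + step, cur[1]) if axis == 0 else (cur[0], cur[1] + step)
            links.append((cur, nxt))
            cur = nxt
    return links


def build_static_vips(edges: Dict[Flow, float]) -> Dict[Flow, Optional[List[Link]]]:
    """Greedily give VIPs to task-graph edges, heaviest volume first.

    A flow maps to None when neither candidate route is free; such a
    flow would be carried by the ordinary packet-switched network.
    """
    used: Set[Link] = set()  # links already claimed by some VIP
    result: Dict[Flow, Optional[List[Link]]] = {}
    for flow, _volume in sorted(edges.items(), key=lambda kv: -kv[1]):
        result[flow] = None
        for x_first in (True, False):
            path = dimension_order_path(flow[0], flow[1], x_first)
            if not any(link in used for link in path):
                used.update(path)
                result[flow] = path
                break
    return result
```

Trying only two candidate routes keeps the sketch short; a fuller search over all minimal paths would find free routes in more cases.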

  9. VIPs for the VOPD Application
     • VIPs Cover 100% of the On-Chip Traffic for This Application
     • Static VIP Construction Scheme:
       • Benchmarks: VOPD, MWD, MPEG, MP3+H263
       • Up to 58% Reduction in Message Latency (39% on Average)
       • Up to 65% Reduction in Power Consumption (49% on Average)

  10. VIPs vs. Physical Point-to-Point Connections
      • VIPs Offer:
        • Power and Performance Close to Dedicated Physical Point-to-Point Connections
        • More Flexibility
          • Dynamically Reconfigurable Based on the Traffic Pattern of the Running Application
        • Less Design Effort
          • Customized Dedicated Connections Over Regular Components

  11. Dynamic VIP Construction
      • An Alternative VIP Construction Scheme
      • Dynamically Changes the VIP Connections in Response to Communication Requirements Imposed by the Running Application
        • Monitoring the NoC Traffic
        • Detecting High-Volume Communications and Constructing a VIP for Them
        • Selecting the Best Route for a VIP Using a Simple Setup Network

  12. Setup Network
      • Setup Network Structure
        • A Light-Weight Control Network
        • Simple Node Structure and Small Bit-Width
        • The Same Topology as the Main Data Network
      • Setup Network Operation
        • Keeps Track of the Number and Destination of Packets Sent by Each Node
        • Selects Traffic Flows With a Weight Higher Than a Threshold (bit/sec.)
        • Finds a Path Along One of the Shortest Paths Between the Source and Destination Nodes of the Traffic Flow to Construct a VIP
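The monitoring step on slide 12 (tracking packets per destination and selecting flows above a threshold) might look like the following. This is a rough Python sketch only: the slides do not say how the rate is measured, so the fixed measurement window, the integer node IDs, and the `FlowMonitor` class with its method names are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class FlowMonitor:
    """Per-node counter of bits sent to each destination (hypothetical)."""

    def __init__(self, threshold_bits_per_sec: float, window_sec: float) -> None:
        self.threshold = threshold_bits_per_sec
        self.window = window_sec  # assumed fixed measurement window
        self.bits: Dict[int, int] = defaultdict(int)

    def record_packet(self, dst: int, size_bits: int) -> None:
        """Account for one packet sent to node `dst`."""
        self.bits[dst] += size_bits

    def heavy_flows(self) -> List[Tuple[int, float]]:
        """Return (destination, bit/sec) pairs above the threshold,
        heaviest first, then reset the counters for the next window."""
        rates = [(dst, b / self.window) for dst, b in self.bits.items()]
        self.bits.clear()
        heavy = [r for r in rates if r[1] > self.threshold]
        return sorted(heavy, key=lambda r: -r[1])
```

Each destination returned by `heavy_flows` would be a candidate for a VIP setup request; everything below the threshold stays on the packet-switched path.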

  13. Dynamic VIP Construction
      • Establishing a New VIP May Tear Down Some Existing VIPs
      • Cost of a VIP: The Cumulative Weight (bit/sec.) of the VIPs That Will Be Torn Down by This New VIP
      • Setup Network:
        • Finds the Path With Minimum Cost
        • Sends the Cost to the Source Node to Decide on Establishing the New VIP
      • A New VIP Is Established If the Cumulative Weight of the Torn-Down VIPs Is Less Than the Weight of the Requesting Traffic Flow
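The acceptance rule on slide 13 reduces to one comparison. The Python sketch below is illustrative only: the link and VIP identifiers, the two dictionaries, and the function names are hypothetical, and it assumes that a VIP occupying several links of the candidate path is counted once in the cost.

```python
from typing import Dict, Iterable, Tuple

Link = Tuple[int, int]  # (from_node, to_node), assumed link identifier


def teardown_cost(path: Iterable[Link],
                  link_owner: Dict[Link, str],
                  vip_weight: Dict[str, float]) -> float:
    """Cumulative weight (bit/sec) of the existing VIPs that the new VIP
    would displace. Each conflicting VIP is counted once (assumption)."""
    victims = {link_owner[link] for link in path if link in link_owner}
    return sum(vip_weight[v] for v in victims)


def should_establish(flow_weight: float,
                     path: Iterable[Link],
                     link_owner: Dict[Link, str],
                     vip_weight: Dict[str, float]) -> bool:
    """Slide 13 rule: establish the VIP only if the displaced weight is
    strictly less than the weight of the requesting flow."""
    return teardown_cost(path, link_owner, vip_weight) < flow_weight
```

With a strict comparison, a request whose weight merely equals the displaced weight is rejected, which avoids tearing down VIPs for no net gain.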

  14. Setup Network
      • VIP Setup Procedure:
        • Arbitrating Among VIP Setup Requests
        • Running the Distributed VIP Setup Algorithm
        • Setting Up a VIP in the Data Network by Configuring the VIP Allocators of the Nodes Along the VIP Path
        • Tearing Down Conflicting VIPs
      • Each Setup Network Node Contains the Configuration Information of Its Corresponding Data Network Node
      • Short Reconfiguration Time Due to the Distributed Nature of the Algorithm

  15. [Figure: Mesh Example With Source Node S and Destination Node D; Each Port Is Labeled With Its Port Cost (Weight of the VIP Using It)]
      • 1. Add the Received Cost (4) to the Weight of Ports Along the Shortest Path (the W and N Ports) Toward the Destination Node
      • 2. Send the New Costs (9 and 12) to the Neighboring Nodes Toward the Destination Node
      • Select the Minimum Cost and Keep the Port From Which the Smaller Cost Is Received
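The cost propagation on slide 15 can be mimicked centrally with a small dynamic program over the minimal-path rectangle between S and D. The Python sketch below is not the distributed hardware algorithm itself; it computes the same kind of result (minimum accumulated port cost over shortest paths, plus the ports kept along the way) under the assumptions that nodes are (x, y) mesh coordinates and that `link_weight` gives the weight of the VIP currently using each directed link, with 0 for free links. The function name `min_cost_minimal_path` is hypothetical.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]    # (x, y) mesh coordinate (assumed representation)
Link = Tuple[Node, Node]  # directed link between adjacent nodes


def min_cost_minimal_path(src: Node, dst: Node,
                          link_weight: Dict[Link, float]) -> Tuple[float, List[Node]]:
    """Return (cost, node list) of the cheapest shortest path src -> dst."""
    sx = 1 if dst[0] >= src[0] else -1  # x direction toward the destination
    sy = 1 if dst[1] >= src[1] else -1  # y direction toward the destination
    dx, dy = abs(dst[0] - src[0]), abs(dst[1] - src[1])
    cost: Dict[Node, float] = {src: 0.0}
    came_from: Dict[Node, Node] = {}  # the "kept port" of each node
    for i in range(dx + 1):
        for j in range(dy + 1):
            node = (src[0] + sx * i, src[1] + sy * j)
            if node == src:
                continue
            # Predecessors are the neighbors one hop closer to the source.
            preds: List[Node] = []
            if i > 0:
                preds.append((node[0] - sx, node[1]))
            if j > 0:
                preds.append((node[0], node[1] - sy))
            # Add the port weight to each received cost and keep the minimum.
            best = min(preds, key=lambda p: cost[p] + link_weight.get((p, node), 0.0))
            cost[node] = cost[best] + link_weight.get((best, node), 0.0)
            came_from[node] = best
    path = [dst]
    while path[-1] != src:
        path.append(came_from[path[-1]])
    path.reverse()
    return cost[dst], path
```

The returned cost is what the source node would compare against the weight of the requesting flow before deciding to establish the VIP.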

  16. Dynamic VIP Construction
      • The Setup Network Operates in Parallel With Packet Transmission in the Packet-Switched Network
        • Hides the Setup Time
      • The Setup Network Has a Small Bit-Width and Operates Infrequently (Only When a High-Volume Flow Is Detected)
        • Negligible Power and Area Overhead

  17. Evaluation Results
      • XMulator NoC Simulator (www.xmulator.org)
        • A C#-Based Simulator
        • Orion Power Library
      • Comparison With a Conventional NoC (5-Stage Pipelined Wormhole Switch)
      • Multi-Core SoC Traffic:
        • H.263 Decoder + MP3 Decoder, H.263 Decoder + MP3 Encoder, MP3 Decoder + MP3 Encoder
        • 38% Reduction in Message Latency, 46% Reduction in Power Consumption

  18. Evaluation Results
      • Synthetic Traffic:
        • N-Hot Traffic: 80% of Messages to Exactly N Destinations, 20% to Randomly Chosen Nodes
      • [Plots: Message Latency (Cycles for 8-Flit Packets) and Power (nJ/Cycle)]
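The N-hot pattern on slide 18 can be illustrated with a small destination generator. This is a sketch in Python, not code from the XMulator simulator: the function name, the integer node IDs, and the choice to draw the random 20% uniformly from all nodes other than the sender (which may include the hot nodes) are assumptions.

```python
import random
from typing import Sequence


def n_hot_destination(src: int, hot: Sequence[int], num_nodes: int,
                      rng: random.Random, hot_fraction: float = 0.8) -> int:
    """Pick a destination for one message sent by node `src`.

    With probability `hot_fraction` (0.8 on the slide) the message goes
    to one of the N hot destinations; otherwise to a uniformly chosen
    node other than the sender (assumption).
    """
    if rng.random() < hot_fraction:
        return rng.choice(hot)
    others = [n for n in range(num_nodes) if n != src]
    return rng.choice(others)
```

Passing an explicit `random.Random` instance keeps runs reproducible when a fixed seed is used.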

  19. Summary and Future Work
      • Adaptable Virtual Point-to-Point Connections in a Packet-Switched NoC
        • Benefit From the Advantages of Both Communication Methods
        • Two VIP Construction Schemes: Static and Dynamic
        • Significant Power/Latency Reduction
      • Future Work
        • Comparing the Method With Related Work: Express Virtual Channels, Single-Cycle Routers, …
        • Precise Area/Power Results by Implementing the NoC in Hardware
          • Analytical Models Show Small Area Overhead

  20. Thank You
      • Questions? modarressi@ce.sharif.edu
