Introspective 3D Chips
S. Mysore, B. Agrawal, N. Srivastava, S. Lin, K. Banerjee, T. Sherwood (UCSB), ASPLOS 2006
Shimin Chen (LBA Reading Group Presentation)
Motivation
• Focus: run-time monitoring for development
• Tool overhead limits the amount of analysis possible at test time
• Previous research: specialized on-chip h/w modules
• At odds with the economics of consumer microprocessors
• May require a significant amount of area
• Often introduce interconnect congestion
• Replicated on every processor whether used or not
• Challenge: enabling these techniques with minimal impact on typical end-user systems
Solution: Add-On using 3D
• Optionally add a layer to a processor specifically for analysis
• Developers: processors with this layer
• End users: processors without this layer
Outline
• Introduction
• Benefits of Introspection in 3D
• Quantifying the Technology (Methodology)
• Architectural Ramifications (Evaluation)
• Conclusion
Benefits of Introspection in 3D
• Cutting interconnect impact
• Reducing cost for commodity parts
• Enabling more powerful software analysis
Cutting Interconnect Impact
• Previous: gathering data from all over the chip for centralized analysis
• Global interconnect
  • Crosses almost every design block
  • Consumes significant top metal layer
  • Runs at high speed
  • Requires wire buffering and even pipeline latches
  • Reserves silicon for buffers
Cutting Interconnect Impact
• Previous: global interconnect
• 3D: area for inter-layer vias is localized to the positions of the taps
Reducing Cost for Commodity Parts
• 225 million PCs in use vs. 0.7 million programmers
• Need to consider two costs:
  • Cost of a consumer system:
    • circuit that drives the post and the vertical column of vias
  • Cost of a developer system:
    • adding an extra layer
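The two counts on this slide imply that only a small fraction of shipped processors would ever need the analysis layer. A minimal Python sketch of that arithmetic, using only the slide's figures; the variable names are illustrative, and treating each programmer as needing exactly one developer system is an assumption, not something the slide states:

```python
# Figures from the slide; names are illustrative.
PCS_IN_USE = 225e6      # PCs in use
PROGRAMMERS = 0.7e6     # programmers

# Assumption: one developer system per programmer.
pcs_per_programmer = PCS_IN_USE / PROGRAMMERS
developer_share = PROGRAMMERS / PCS_IN_USE

print(f"PCs per programmer: {pcs_per_programmer:.0f}")
print(f"Share of systems needing the extra layer: {developer_share:.2%}")
```

Under that assumption there are roughly 321 PCs per programmer, so about 0.31% of systems would carry the extra layer while every consumer part pays only for the post driver circuit and via column.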
Enabling More Powerful SW Analysis
• More h/w resources allocated to analysis
  • Area
  • Power
Outline
• Introduction
• Introspection in 3D
• Quantifying the Technology
• Architectural Ramifications
• Conclusion
Cross Section of 3D Chip
• Posts: 5 µm x 5 µm cross-section, 30-40 µm high (compare a normal metal wire: 1 µm x 1 µm)
Estimating Interconnect Overhead
• Optimal buffer size and inter-buffer separation
• 2D interconnect overhead
• 3D interconnect overhead
• Metalization area
Number of Vertical Posts
• Estimate that 1024 bits of profile data will be generated per cycle (?)
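As a rough sanity check, the post dimensions and the 1024-bit estimate can be combined into a back-of-the-envelope footprint. The sketch below assumes one post per profile bit and counts only the raw post cross-section; spacing rules and driver circuitry are ignored, so the result is not a figure from the paper:

```python
# Values taken from the slides.
POST_SIDE_UM = 5.0             # post cross-section is 5 um x 5 um
WIRE_SIDE_UM = 1.0             # normal metal wire is 1 um x 1 um
PROFILE_BITS_PER_CYCLE = 1024  # estimated profile data per cycle

# Assumption (not stated on this slide): one post per profile bit.
num_posts = PROFILE_BITS_PER_CYCLE

post_area_um2 = POST_SIDE_UM ** 2
wire_area_um2 = WIRE_SIDE_UM ** 2
total_post_area_mm2 = num_posts * post_area_um2 / 1e6  # 1 mm^2 = 1e6 um^2

print(f"Post vs. wire cross-section: {post_area_um2 / wire_area_um2:.0f}x")
print(f"Raw area of {num_posts} posts: {total_post_area_mm2:.4f} mm^2")
```

Each post has 25 times the cross-section of a normal wire, and 1024 of them occupy about 0.0256 mm² of raw area under these assumptions.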
Example HW Monitor
• [Figure: block diagram; labels: 16KB, 32KB, 32KB, 32KB, RISC ARM, 16KB]
• 130 nm technology, area: 16 mm², power: 2.7 W
Outline
• Introduction
• Introspection in 3D
• Quantifying the Technology
• Architectural Ramifications
• Conclusion
Four Types of Systems to Compare
• Basic system (Sbase)
• System with integrated profiling HW (Sintegrated)
• System with profiling HW stacked (Sstacked)
• System with profiling stubs (Sstubs)
Routability
• Based on Pentium 4 analysis
• Sintegrated:
  • Total wire length = 5682.3 mm
  • Total buffers = ~20,000
• Sstacked:
  • Total buffers = 1024 (one per post)
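The buffer counts above can be compared directly. A short Python sketch, treating the approximate ~20,000 figure for Sintegrated as exact for the sake of the arithmetic (names are illustrative):

```python
# Buffer counts from the slide; the integrated figure is approximate (~20,000).
BUFFERS_INTEGRATED = 20_000
BUFFERS_STACKED = 1024  # one per post

ratio = BUFFERS_INTEGRATED / BUFFERS_STACKED
reduction = 1 - BUFFERS_STACKED / BUFFERS_INTEGRATED

print(f"Integrated uses about {ratio:.1f}x as many buffers as stacked")
print(f"Buffer reduction from stacking: about {reduction:.1%}")
```

On these numbers the stacked design needs roughly a twentieth of the buffers (about 19.5x fewer, a reduction of about 94.9%).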
Conclusion
• Economic argument: cost of specialized H/W is decoupled from the consumer market
• H/W stubs add only 0.021 mm² area and 0.9% power