Minimal Skew Clock Embedding Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu Hu Partially supported by NSF and UC MICRO funds.
Outline • Backgrounds and Motivations • Modeling and Problem Formulation • Algorithms • Experimental Results • Conclusions
Clock Tree Synthesis in Synchronous Circuits • Clock signals synchronize data transfer between functional elements in a synchronous design • Different clock structures exist (tree, mesh, hybrid, etc.) • Clock skew is the delay difference between two sinks of the clock tree • Clock skew has become one of the most significant concerns in clock tree synthesis for high-performance designs [Figure: clock distribution from a PLL source to system blocks (audio, video, display, memory controller)]
Methodologies for Clock Skew Minimization • Sources of skew • Unbalanced clock distribution • Process, supply voltage, and temperature (PVT) variation • Uncertainty from loading • Methodologies • Active de-skew circuits using a micro-controller [Rusu'00] • Passive balanced embedding by CAD algorithms [Tsay'91] [Edahiro'91] [Chao'92] [Boese'92] [Cong'98] • Variation-induced skew needs to be considered! [Figure: topology generation and embedding of a clock tree with nodes s0 to s4, a, b, and v]
Existing Work and Our Contributions • This work focuses on reducing temperature-variation-induced skew • Existing work on temperature-aware clock skew minimization [Cho:ICCAD'05] • Considered only spatial temperature variations • Ignored time-variant temperature variation • Assumed the worst-case temperature map was given • The major contributions of this work • Build a parameterized macromodel for temperature variations • Present an effective algorithm, PECO, which considers time-variant temperature variation with correlation • PECO reduces worst-case skew by up to 5x compared with the ZST/DME algorithm
Outline • Backgrounds and Motivations • On-chip Temperature Variation Modeling • Variation Sources: Spatial & Temporal • Temperature Correlations • Algorithms • Experimental Results • Conclusions
Spatial Temperature Variation Induced Skew • Spatial variation: non-uniform power density generates an on-chip temperature gradient • Clock tree embedding considering spatial temperature variation: TACO [Cho:ICCAD'05] • TACO ignores the time-variant temperature under different workloads
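To make the link between a temperature gradient and skew concrete, here is a minimal sketch. It assumes a linear temperature dependence of wire resistance and a single-segment Elmore delay; the coefficient `beta`, the reference temperature, and all component values are illustrative assumptions, not numbers from the slides.

```python
def wire_resistance(r0, temp_c, beta=0.004, t_ref=25.0):
    """Assumed linear model: R(T) = R0 * (1 + beta * (T - T_ref))."""
    return r0 * (1.0 + beta * (temp_c - t_ref))


def elmore_branch_delay(r0, c_wire, c_load, temp_c):
    """Elmore delay of one distributed RC wire driving a lumped load."""
    r = wire_resistance(r0, temp_c)
    return r * (c_wire / 2.0 + c_load)


def skew(delays):
    """Skew is the largest difference between any two sink delays."""
    return max(delays) - min(delays)


# Two geometrically identical branches (hypothetical values).
R0, C_WIRE, C_LOAD = 100.0, 50e-15, 10e-15

# Uniform temperature: the balanced tree has zero skew.
nominal = [elmore_branch_delay(R0, C_WIRE, C_LOAD, 25.0) for _ in range(2)]

# One branch crosses a hotter region: the same tree now has skew.
hot = [elmore_branch_delay(R0, C_WIRE, C_LOAD, t) for t in (60.0, 110.0)]

print(skew(nominal), skew(hot))
```

In this toy setting a tree that is perfectly balanced at nominal temperature picks up skew purely from the gradient, which is the effect a temperature-aware embedding tries to cancel.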
Temporal Temperature Variation Induced Skew • Significantly different temperature maps from two SPEC2000 applications: Ammp and Gzip • Dilemma: optimizing skew for one application hurts the other.
Problem Formulation • Given: • The source, sinks, and an initial embedding of the clock tree • Each region is modeled by the mean and variance of its temperature, and by the correlation between variations • To find: • A re-embedding of the clock tree • To minimize the worst-case skew under all temperature variations
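The objective above is a min-max problem: choose the embedding whose largest skew over all temperature maps is smallest. The following is only a sketch of that selection rule; the candidate names and delay numbers are hypothetical and the enumeration of candidates stands in for whatever search the actual algorithm performs.

```python
def skew(delays):
    """Skew of one clock tree under one temperature map."""
    return max(delays) - min(delays)


def worst_case_skew(delays_per_map):
    """Largest skew over all temperature maps for one embedding."""
    return max(skew(d) for d in delays_per_map)


def pick_embedding(candidates):
    """Return the candidate name with the smallest worst-case skew.

    candidates maps a name to a list of per-map sink-delay lists.
    """
    return min(candidates, key=lambda name: worst_case_skew(candidates[name]))


# Hypothetical sink delays (ps) under two temperature maps, A and B.
candidates = {
    "tuned_for_map_A": [[10.0, 10.0], [10.0, 14.0]],
    "tuned_for_map_B": [[10.0, 14.0], [10.0, 10.0]],
    "compromise": [[10.0, 12.0], [10.0, 12.0]],
}

best = pick_embedding(candidates)
print(best, worst_case_skew(candidates[best]))
```

Here each single-workload optimum has zero skew on its own map but a worst case of 4 ps, while the compromise embedding is never optimal for one map yet has the smaller worst case of 2 ps.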
Correlations in Temperature Variation • Spatial and temporal correlation: strong correlations exist between temperatures for different workloads and different regions on the chip • Resource sharing between workloads causes temporal correlation • Considering temperature correlations during optimization can compress the search space! [Figure: correlation matrix; entry (i,j) is the correlation between areas i and j]
Outline • Backgrounds and Motivations • Modeling and Problem Formulation • Re-embedding Algorithm • Experimental Results • Conclusions
Re-embedding Process (An example) [Figure: clock tree with nodes v, x, y, a, b, c, d; legend: sink, original merging point, perturbation option]
Re-embedding Process (An example) [Figure: the same clock tree with nodes v, x, y, a, b, c, d after perturbation; legend: new merging point]
Delay and Skew Calculation for the Clock Tree • The clock tree is a SIMO (single-input, multiple-output) linear system • We care about the impulse response at each sink • Perturbed Modified Nodal Analysis (MNA) • x holds the state variables for the source, sinks, and merging points • L selects the sink responses • Define a new state variable containing both the nominal (x) and perturbed (Δx) state variables • This yields a structured, parameterized state matrix • The number of perturbation configurations, I = 5^N, is huge! (N is the number of merging points)
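The equations on this slide did not survive extraction. As a hedged sketch only, a standard first-order perturbed MNA formulation takes the form below. The symbols G, C, B, ΔG, ΔC, and u are assumptions following common MNA notation and may differ from the authors' own; x, Δx, and L are the quantities named on the slide.

```latex
\begin{align}
(G + sC)\,x(s) &= B\,u(s), \qquad y(s) = L^{T} x(s) \\
\bigl(G + \Delta G + s\,(C + \Delta C)\bigr)\,(x + \Delta x) &= B\,u(s) \\
\begin{bmatrix} G + sC & 0 \\ \Delta G + s\,\Delta C & G + sC \end{bmatrix}
\begin{bmatrix} x \\ \Delta x \end{bmatrix}
&= \begin{bmatrix} B \\ 0 \end{bmatrix} u(s)
\end{align}
```

The third line follows from the second by subtracting the nominal equation and dropping second-order terms such as ΔG Δx. The block lower-triangular matrix is one way to read "structured and parameterized state matrix": the nominal system appears on the diagonal and the perturbation enters only through the off-diagonal block.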
Compressing the State Matrix by Temperature Correlation • Motivations • Spatial and temporal correlation of the temperature values removes the need to exhaustively evaluate all perturbation combinations • Highly correlated merging points should be perturbed in the same fashion • Solution • Cluster merging points based on correlation strength • Perform the same perturbation for all points within one cluster
Merging Points Clustering by Temperature Correlation • Objective • Given the correlation matrix C of the N merging points, which is low-rank (rank K, N >> K) • Partition the N merging points into K clusters • Maximize the correlation strength within each of the K clusters [Figure: correlation matrix C]
Merging Points Clustering by Temperature Correlation • Objective • Given the correlation matrix C of the N merging points, which is low-rank (rank K, N >> K) • Partition the N merging points into K clusters • Decide the number of clusters K • Singular Value Decomposition (SVD) reveals the real rank (K) of C • Partition the merging points into K clusters • The K-Means clustering algorithm is employed • Example: K = 4, N = 70 • Perturbation configurations reduced from 5^70 to 5^4 [Figure: low-rank approximation of C]
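The two steps above (SVD to pick K, then K-Means to partition) can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the authors' implementation: the relative threshold `rel_tol`, the farthest-point seeding, clustering directly on the rows of C, and the toy correlation matrix are all choices made for this example.

```python
import numpy as np


def estimate_rank(C, rel_tol=0.05):
    """Count singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(C, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))


def kmeans(X, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [X[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cluster_merging_points(C, rel_tol=0.05):
    """Pick K from the singular values, then cluster the rows of C."""
    k = estimate_rank(C, rel_tol)
    return k, kmeans(C, k)


# Hypothetical toy data: 6 merging points in 2 strongly correlated groups.
group = np.array([0, 0, 0, 1, 1, 1])
C = np.where(group[:, None] == group[None, :], 0.95, 0.05)
np.fill_diagonal(C, 1.0)

k, labels = cluster_merging_points(C)
print(k, labels)
```

With five perturbation options per merging point, perturbing each cluster as a unit shrinks the search from 5^N to 5^K configurations; for K = 4 that is 5^4 = 625.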
Structural Reduction & Transient Time Analysis • Cluster-based reduction (SVD + K-Means) • Structural reduction [Hao Yu, DAC'06] • Transient time analysis (backward Euler)
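As an illustration of the last step only, the sketch below applies backward Euler to a tiny RC tree and reads off the 50% delay at each sink. The three-node topology, the component values, the time step, and the function name `sink_delays` are assumptions for this example; the reduction steps from the flow are not reproduced here.

```python
import numpy as np


def sink_delays(r_trunk, r_left, r_right, c_node=50e-15, h=1e-13, steps=5000):
    """50% step-response delay at two sinks of a small RC tree.

    Nodes: 0 = merging point, 1 = left sink, 2 = right sink. A unit step
    source drives node 0 through r_trunk. The system C dx/dt + G x = B u
    is integrated with backward Euler:
        (C/h + G) x[n+1] = (C/h) x[n] + B u[n+1]
    """
    g1, g2, g3 = 1.0 / r_trunk, 1.0 / r_left, 1.0 / r_right
    G = np.array([[g1 + g2 + g3, -g2, -g3],
                  [-g2, g2, 0.0],
                  [-g3, 0.0, g3]])
    C = np.diag([c_node] * 3)
    B = np.array([g1, 0.0, 0.0])
    A = C / h + G
    x = np.zeros(3)
    delays = {}
    for n in range(1, steps + 1):
        x = np.linalg.solve(A, (C / h) @ x + B * 1.0)
        for name, idx in (("left", 1), ("right", 2)):
            if name not in delays and x[idx] >= 0.5:
                delays[name] = n * h  # first step at or above 50%
        if len(delays) == 2:
            break
    return delays


# Balanced branches versus a right branch with higher (hotter) resistance.
balanced = sink_delays(100.0, 100.0, 100.0)
heated = sink_delays(100.0, 100.0, 134.0)
print(balanced, heated)
```

Backward Euler is implicit and unconditionally stable for this kind of RC system, so the step size is limited by accuracy rather than stability.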
Outline • Backgrounds and Motivations • Modeling and Problem Formulation • Algorithms • Experimental Results • Conclusions
Experimental Settings • Temperature variation profiles are obtained from a micro-architecture-level power-temperature transient simulator [Liao,TCAD'05] running 6 SPEC2000 applications • 100 temperature profiles are collected, sampled every 10 million clock cycles • Two algorithms are compared: • DME method: minimizes wire-length for zero skew under the Elmore delay model at nominal temperature • Our PECO: minimizes skew under a more accurate high-order macromodel with temperature variations
Skew Distribution • Across 100 temperature maps, PECO reduces both the worst-case skew and the mean skew
Experimental Results (cont.) • PECO reduces the worst-case skew by up to 5X (for net r5) • Skew is measured with a higher-order delay model considering temperature variations for all applications • Skew reduction increases for larger clock nets • PECO increases wire-length by less than 1% • Runtime • The optimization time of PECO is less than that of DME • Model building time is still long, but the model is more accurate
Outline • Backgrounds and Motivations • Modeling and Problem Formulation • Algorithms • Experimental Results • Conclusions
Conclusions • Studied clock optimization for workload-dependent temperature variation • Reduced the worst-case skew by up to 5X with only 1% wire-length overhead compared to the best existing method • The methodologies can be extended to handle • PVT variations with spatial correlations • Other design freedoms such as floorplanning, power/ground optimization, etc.
Thank you! ACM International Symposium on Physical Design 2007 Hao Yu (graduated), Yu Hu, Chun-Chen Liu and Lei He Minimal Skew Clock Embedding Considering Time Variant Temperature Gradient