Branch Prediction using Advanced Neural Methods Sunghoon Kim CS252 Project
Introduction • Dynamic branch prediction • No doubt about its importance to speculation performance • Given a history of branch behavior, predict the branch behavior at the next step • Common solutions: gshare, bimode, hybrid… • Replace saturating counters with neural methods?
Neural methods • Capable of classification (predicting into which class a particular instance will fall) • Learn correlations between inputs and outputs, and generalize that learning to other inputs • Potential to solve problems of most two-level predictors
Simulation Models - Gshare • 20-bit global history shift register • Per-address history table (PHT) with 2-bit saturating counters [Diagram: BranchPC>>2 combined with the GHT indexes the PHT; the selected 2-bit saturating counter gives the prediction; counters are updated with the outcome]
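The gshare scheme in this slide can be sketched as follows. This is an illustrative model, not the project's simulator code; the history length is shortened from the slide's 20 bits to keep the table small.

```python
# Sketch of a gshare predictor: XOR of the shifted branch PC with the
# global history register (GHR) indexes a table of 2-bit saturating
# counters. History length is illustrative (the slide uses 20 bits).

HIST_BITS = 8
PHT_SIZE = 1 << HIST_BITS
pht = [1] * PHT_SIZE   # 2-bit counters, start at "weakly not taken"
ghr = 0                # global history register

def predict(branch_pc):
    """Counter values 2 and 3 predict taken; 0 and 1 predict not taken."""
    idx = ((branch_pc >> 2) ^ ghr) & (PHT_SIZE - 1)
    return pht[idx] >= 2

def update(branch_pc, taken):
    """Saturating-counter update, then shift the outcome into the GHR."""
    global ghr
    idx = ((branch_pc >> 2) ^ ghr) & (PHT_SIZE - 1)
    if taken:
        pht[idx] = min(3, pht[idx] + 1)
    else:
        pht[idx] = max(0, pht[idx] - 1)
    ghr = ((ghr << 1) | int(taken)) & (PHT_SIZE - 1)
```

On a branch that is always taken, the predictor starts wrong but converges once the history and the indexed counter stabilize.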
Simulation Models - Perceptron • 14-bit global history shift register • Per-address history table with 8-bit weights and a bias • Indexed by gshare (BranchPC>>2 XOR GHT) or by BranchPC alone [Diagram: the index selects a perceptron from the PHT; its output gives the prediction; weights and bias are trained on the outcome]
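One PHT entry of the perceptron model above might look like this sketch: one weight per history bit plus a bias, with the standard perceptron training rule. The threshold value and the unclamped weights are illustrative assumptions (the slide specifies 8-bit weights).

```python
# Illustrative perceptron predictor entry: output is bias plus the dot
# product of weights with the history (encoded as +1 taken / -1 not taken).
# Weights are unclamped here; the slide's model clamps them to 8 bits.

class PerceptronEntry:
    def __init__(self, hist_len=14):
        self.w = [0] * hist_len
        self.bias = 0

    def output(self, hist):
        return self.bias + sum(wi * xi for wi, xi in zip(self.w, hist))

    def predict(self, hist):
        return self.output(hist) >= 0          # non-negative => taken

    def train(self, hist, taken, theta=41):
        # train on a misprediction or when confidence is below theta
        t = 1 if taken else -1
        y = self.output(hist)
        if (y >= 0) != taken or abs(y) <= theta:
            self.bias += t
            for i, xi in enumerate(hist):
                self.w[i] += t * xi
```

The training threshold `theta` is a tuning parameter from the perceptron-predictor literature, not something stated on the slide.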
Simulation Models - Backpropagation • 10-bit GHR • Sigmoid transfer function • Floating-point computation • Floating-point weights and biases • One hidden layer with 20 neurons [Diagram: BranchPC>>2 or the gshare index selects a network from the PHT; GHT bits are the inputs; weights and biases are trained on the outcome]
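A minimal version of the backpropagation model above, assuming squared-error loss and plain gradient descent (the slide does not specify either); layer sizes and the learning rate are illustrative.

```python
import math
import random

# One-hidden-layer sigmoid network trained with backpropagation, mirroring
# the floating-point model described above. Sizes are illustrative.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class MLPredictor:
    def __init__(self, hist_len=10, hidden=20, lr=1.0, seed=0):
        rnd = random.Random(seed)
        self.w1 = [[rnd.uniform(-0.5, 0.5) for _ in range(hist_len)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rnd.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0
        self.lr = lr

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
             for ws, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def predict(self, x):
        return self.forward(x)[1] >= 0.5       # output >= 0.5 => taken

    def train(self, x, taken):
        t = 1.0 if taken else 0.0
        h, y = self.forward(x)
        dy = (y - t) * y * (1.0 - y)           # output-layer delta
        for j, hj in enumerate(h):
            dh = dy * self.w2[j] * hj * (1.0 - hj)   # hidden delta (pre-update w2)
            self.w2[j] -= self.lr * dy * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= self.lr * dh * xi
            self.b1[j] -= self.lr * dh
        self.b2 -= self.lr * dy
```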
Simulation Models – Radial Basis Networks • Transfer function for a radial basis neuron: exp(−n²) • The net input n is the distance between an input vector and a weight vector [Diagram: same PHT indexing, prediction, and training structure as the previous models]
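A single radial basis neuron as described on this slide can be sketched directly: Euclidean distance between the input and weight vectors as the net input n, then exp(−n²) as the transfer function. (The slide does not state which distance is used; Euclidean is assumed here.)

```python
import math

# One radial-basis neuron: the net input n is the Euclidean distance
# between the input vector x and the weight vector w, and the transfer
# function is exp(-n^2). Output is 1 when x == w and decays with distance.

def radbas(n):
    return math.exp(-n * n)

def rbf_neuron(x, w):
    n = math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))
    return radbas(n)
```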
Simulation Models – Elman Networks • Feedback from the hidden-layer outputs to the first layer [Diagram: same PHT indexing, prediction, and training structure as the previous models]
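The defining feature on this slide, hidden-layer outputs fed back to the first layer, can be sketched as a context vector that is concatenated with each new input. Weights here are fixed constants for illustration only; a real model trains them.

```python
import math

# Minimal Elman-style recurrent layer: the previous hidden outputs (the
# "context") are fed back alongside the new inputs on every step.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class ElmanLayer:
    def __init__(self, n_in, n_hidden):
        self.context = [0.0] * n_hidden        # fed back each step
        # constant weights for illustration; training would adjust these
        self.w = [[0.1] * (n_in + n_hidden) for _ in range(n_hidden)]

    def step(self, x):
        full = x + self.context                # input plus feedback
        h = [sigmoid(sum(wi * xi for wi, xi in zip(ws, full)))
             for ws in self.w]
        self.context = h                       # remember for the next step
        return h
```

Because the context changes after every step, the same input can produce different outputs over time, which is what lets the network exploit sequence history.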
Simulation Models – Learning Vector Quantization Networks • Distance function as in the radial basis model, but without biases • The competitive function outputs one for the winning input (the largest value) and zero for the others [Diagram: same PHT indexing, prediction, and training structure as the previous models]
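The competitive layer described here can be sketched as follows: negative distances to each prototype (no biases), then a winner-take-all function that outputs one for the largest value and zero for the rest. The prototype and class names are illustrative.

```python
# LVQ-style competitive classification: the closest prototype (largest
# negative distance) wins, and its class label is the prediction.

def compet(values):
    """One-hot list: 1 at the position of the largest value, 0 elsewhere."""
    winner = max(range(len(values)), key=lambda i: values[i])
    return [1 if i == winner else 0 for i in range(len(values))]

def lvq_classify(x, prototypes, classes):
    # negative Euclidean distance, so the closest prototype is the maximum
    neg_dist = [-sum((xi - wi) ** 2 for xi, wi in zip(x, w)) ** 0.5
                for w in prototypes]
    a = compet(neg_dist)
    return classes[a.index(1)]
```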
Simulation Environment • SimpleScalar tool set • Some of the SPEC2000 benchmarks • Execute 100,000,000 instructions and dump conditional branch histories • 5,000 branch instructions are used for training • All predictors are given the same PHT budget • Floating-point values are 4 bytes
Hardware constraints • Predictors must produce a prediction within a cycle (or a few) • Gshare: easy to achieve • Perceptron: needs only integer adders, a possible alternative; more accurate with more layers • Other advanced neural nets: hard to implement, would require floating-point functional units
Future Work • Replace floating-point weights and biases with scaled integer ones? • Replace the floating-point transfer function with an approximately equivalent integer function, using a Taylor series? • Without budget constraints, what would the best performance of advanced neural network methods be? • Review the code carefully for mistakes
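One way the first two bullets might combine, sketched under loud assumptions: the floating-point sigmoid is replaced by scaled-integer (fixed-point) arithmetic and a low-order Taylor expansion around zero, sigmoid(x) ≈ 1/2 + x/4 − x³/48. The Q8.8 scaling, the clamping range, and the function name are all illustrative choices, not the project's design.

```python
# Fixed-point sigmoid approximation using scaled integers (Q8.8: value
# times 256) and the cubic Taylor expansion of sigmoid about 0.
# Only integer operations are used, as the Future Work bullets suggest.

SCALE = 256  # Q8.8 fixed point

def sigmoid_fixed(x_fx):
    """x_fx is a Q8.8 integer; returns a Q8.8 approximation of sigmoid."""
    # clamp to the region where the cubic approximation is well behaved
    x_fx = max(-2 * SCALE, min(2 * SCALE, x_fx))
    x3 = (x_fx * x_fx // SCALE) * x_fx // SCALE     # x^3 in Q8.8
    return SCALE // 2 + x_fx // 4 - x3 // 48
```

Near zero this tracks the true sigmoid closely (e.g. sigmoid(1) ≈ 0.731, i.e. about 187 in Q8.8), which is the regime where prediction confidence is low and accuracy matters most.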
Conclusions • On the same budget as gshare, advanced neural networks offer little benefit and are sometimes worse • Among the advanced methods, Elman networks perform best • Hard to implement in hardware unless floating-point computation is cheap • Neural networks can be alternative predictors if well designed