Branch Prediction using Advanced Neural Methods Sunghoon Kim CS252 Project
Introduction • Dynamic branch prediction • No doubt about its importance to speculation performance • Given a history of branch behavior, predict the branch behavior at the next step • Common solutions: gshare, bimode, hybrid… • Replace saturating counters with neural methods?
Neural methods • Capable of classification (predicting into which class a particular instance will fall) • Learn correlations between inputs and outputs, and generalize that learning to other inputs • Potential to solve problems of most two-level predictors
Simulation Models - Gshare • 20-bit global history shift register • Per-address history table (PHT) with 2-bit saturating counters [Diagram: BranchPC>>2 combined with the GHT indexes the PHT; the selected 2-bit saturating counter gives the prediction; counters are updated with the outcome]
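The gshare scheme in this slide can be sketched as follows. This is an illustrative model, not the project's simulator code; the history length is shortened from the slide's 20 bits to keep the table small.

```python
# Sketch of a gshare predictor: XOR of the shifted branch PC with the
# global history register (GHR) indexes a table of 2-bit saturating
# counters. History length is illustrative (the slide uses 20 bits).

HIST_BITS = 8
PHT_SIZE = 1 << HIST_BITS
pht = [1] * PHT_SIZE   # 2-bit counters, start at "weakly not taken"
ghr = 0                # global history register

def predict(branch_pc):
    """Counter values 2 and 3 predict taken; 0 and 1 predict not taken."""
    idx = ((branch_pc >> 2) ^ ghr) & (PHT_SIZE - 1)
    return pht[idx] >= 2

def update(branch_pc, taken):
    """Saturating-counter update, then shift the outcome into the GHR."""
    global ghr
    idx = ((branch_pc >> 2) ^ ghr) & (PHT_SIZE - 1)
    if taken:
        pht[idx] = min(3, pht[idx] + 1)
    else:
        pht[idx] = max(0, pht[idx] - 1)
    ghr = ((ghr << 1) | int(taken)) & (PHT_SIZE - 1)
```

On a branch that is always taken, the predictor starts wrong but converges once the history and the indexed counter stabilize.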
Simulation Models - Perceptron • 14-bit global history shift register • Per-address history table with 8-bit weights and a bias • Indexed by gshare (BranchPC>>2 XOR GHT) or by BranchPC alone [Diagram: the index selects a perceptron from the PHT; its output gives the prediction; weights and bias are trained on the outcome]
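One PHT entry of the perceptron model above might look like this sketch: one weight per history bit plus a bias, with the standard perceptron training rule. The threshold value and the unclamped weights are illustrative assumptions (the slide specifies 8-bit weights).

```python
# Illustrative perceptron predictor entry: output is bias plus the dot
# product of weights with the history (encoded as +1 taken / -1 not taken).
# Weights are unclamped here; the slide's model clamps them to 8 bits.

class PerceptronEntry:
    def __init__(self, hist_len=14):
        self.w = [0] * hist_len
        self.bias = 0

    def output(self, hist):
        return self.bias + sum(wi * xi for wi, xi in zip(self.w, hist))

    def predict(self, hist):
        return self.output(hist) >= 0          # non-negative => taken

    def train(self, hist, taken, theta=41):
        # train on a misprediction or when confidence is below theta
        t = 1 if taken else -1
        y = self.output(hist)
        if (y >= 0) != taken or abs(y) <= theta:
            self.bias += t
            for i, xi in enumerate(hist):
                self.w[i] += t * xi
```

The training threshold `theta` is a tuning parameter from the perceptron-predictor literature, not something stated on the slide.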
Simulation Models - Backpropagation • 10-bit GHR • Sigmoid transfer function • Floating-point computation • Floating-point weights and biases • One hidden layer with 20 neurons [Diagram: BranchPC>>2 or the gshare index selects a network from the PHT; GHT bits are the inputs; weights and biases are trained on the outcome]
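A minimal version of the backpropagation model above, assuming squared-error loss and plain gradient descent (the slide does not specify either); layer sizes and the learning rate are illustrative.

```python
import math
import random

# One-hidden-layer sigmoid network trained with backpropagation, mirroring
# the floating-point model described above. Sizes are illustrative.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class MLPredictor:
    def __init__(self, hist_len=10, hidden=20, lr=1.0, seed=0):
        rnd = random.Random(seed)
        self.w1 = [[rnd.uniform(-0.5, 0.5) for _ in range(hist_len)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rnd.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0
        self.lr = lr

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
             for ws, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def predict(self, x):
        return self.forward(x)[1] >= 0.5       # output >= 0.5 => taken

    def train(self, x, taken):
        t = 1.0 if taken else 0.0
        h, y = self.forward(x)
        dy = (y - t) * y * (1.0 - y)           # output-layer delta
        for j, hj in enumerate(h):
            dh = dy * self.w2[j] * hj * (1.0 - hj)   # hidden delta (pre-update w2)
            self.w2[j] -= self.lr * dy * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= self.lr * dh * xi
            self.b1[j] -= self.lr * dh
        self.b2 -= self.lr * dy
```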
Simulation Models – Radial Basis Networks • Transfer function for a radial basis neuron: exp(−n²) • The net input n is the distance between an input vector and a weight vector [Diagram: same PHT indexing, prediction, and training structure as the previous models]
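A single radial basis neuron as described on this slide can be sketched directly: Euclidean distance between the input and weight vectors as the net input n, then exp(−n²) as the transfer function. (The slide does not state which distance is used; Euclidean is assumed here.)

```python
import math

# One radial-basis neuron: the net input n is the Euclidean distance
# between the input vector x and the weight vector w, and the transfer
# function is exp(-n^2). Output is 1 when x == w and decays with distance.

def radbas(n):
    return math.exp(-n * n)

def rbf_neuron(x, w):
    n = math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))
    return radbas(n)
```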
Simulation Models – Elman Networks • Feedback from the hidden-layer outputs to the first layer [Diagram: same PHT indexing, prediction, and training structure as the previous models]
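The defining feature on this slide, hidden-layer outputs fed back to the first layer, can be sketched as a context vector that is concatenated with each new input. Weights here are fixed constants for illustration only; a real model trains them.

```python
import math

# Minimal Elman-style recurrent layer: the previous hidden outputs (the
# "context") are fed back alongside the new inputs on every step.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class ElmanLayer:
    def __init__(self, n_in, n_hidden):
        self.context = [0.0] * n_hidden        # fed back each step
        # constant weights for illustration; training would adjust these
        self.w = [[0.1] * (n_in + n_hidden) for _ in range(n_hidden)]

    def step(self, x):
        full = x + self.context                # input plus feedback
        h = [sigmoid(sum(wi * xi for wi, xi in zip(ws, full)))
             for ws in self.w]
        self.context = h                       # remember for the next step
        return h
```

Because the context changes after every step, the same input can produce different outputs over time, which is what lets the network exploit sequence history.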
Simulation Models – Learning Vector Quantization Networks • Distance function as in the radial basis model, but without biases • The competitive function outputs one for the winning input (the largest value) and zero for the others [Diagram: same PHT indexing, prediction, and training structure as the previous models]
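The competitive layer described here can be sketched as follows: negative distances to each prototype (no biases), then a winner-take-all function that outputs one for the largest value and zero for the rest. The prototype and class names are illustrative.

```python
# LVQ-style competitive classification: the closest prototype (largest
# negative distance) wins, and its class label is the prediction.

def compet(values):
    """One-hot list: 1 at the position of the largest value, 0 elsewhere."""
    winner = max(range(len(values)), key=lambda i: values[i])
    return [1 if i == winner else 0 for i in range(len(values))]

def lvq_classify(x, prototypes, classes):
    # negative Euclidean distance, so the closest prototype is the maximum
    neg_dist = [-sum((xi - wi) ** 2 for xi, wi in zip(x, w)) ** 0.5
                for w in prototypes]
    a = compet(neg_dist)
    return classes[a.index(1)]
```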
Simulation Environment • SimpleScalar tool set • Some of the SPEC2000 benchmarks • Execute 100,000,000 instructions and dump conditional branch histories • 5,000 branch instructions are used for training • All predictors are given the same PHT budget • Floating-point values are 4 bytes
Hardware constraints • Predictors must produce a prediction within a cycle (or a few) • Gshare: easy to achieve • Perceptron: needs only integer adders, a possible alternative; more accurate with more layers • Other advanced neural nets: hard to implement, would require floating-point functional units
Future Work • Replace floating-point weights and biases with scaled integer ones? • Replace the floating-point transfer function with an approximately equivalent integer function, using a Taylor series? • Without budget constraints, what would the best performance of advanced neural network methods be? • Review the code carefully for mistakes
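One way the first two bullets might combine, sketched under loud assumptions: the floating-point sigmoid is replaced by scaled-integer (fixed-point) arithmetic and a low-order Taylor expansion around zero, sigmoid(x) ≈ 1/2 + x/4 − x³/48. The Q8.8 scaling, the clamping range, and the function name are all illustrative choices, not the project's design.

```python
# Fixed-point sigmoid approximation using scaled integers (Q8.8: value
# times 256) and the cubic Taylor expansion of sigmoid about 0.
# Only integer operations are used, as the Future Work bullets suggest.

SCALE = 256  # Q8.8 fixed point

def sigmoid_fixed(x_fx):
    """x_fx is a Q8.8 integer; returns a Q8.8 approximation of sigmoid."""
    # clamp to the region where the cubic approximation is well behaved
    x_fx = max(-2 * SCALE, min(2 * SCALE, x_fx))
    x3 = (x_fx * x_fx // SCALE) * x_fx // SCALE     # x^3 in Q8.8
    return SCALE // 2 + x_fx // 4 - x3 // 48
```

Near zero this tracks the true sigmoid closely (e.g. sigmoid(1) ≈ 0.731, i.e. about 187 in Q8.8), which is the regime where prediction confidence is low and accuracy matters most.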
Conclusions • On the same budget as gshare, advanced neural networks offer little benefit and are sometimes worse • Among the advanced methods, Elman networks perform best • Hard to implement in hardware unless floating-point computation is cheap • Neural networks can be alternative predictors if well designed