This report presents the results of BDT training in the HH 2l+2h channel using the randomization and over-sampling methods, each with and without re-scaling. ROC curves are shown for ntrees = 1000, depth = 2, mcw = 1 and lr = 0.01.
Status of the BDT Training in the HH 2l+2h Channel (Part 11)
R. K. Dewanjee
KBFI Group Meeting
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
dR03mvaLoose, w/o Re-scaling
• Randomization method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 1.6%, 8.4%, 9.1%, 1.9%, 13.8%, 6.8%
• Over-sampling method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 0.7%, 5.7%, 4.2%, 1.2%, 4.8%, 3.6%
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
dR03mvaLoose, with Re-scaling
• Randomization method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 1.3%, 2.8%, 4.2%, 1.8%, 3%, 4.5%
• Over-sampling method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 0.1%, 0.2%, 1.2%, 0.1%, 0.4%, 1.8%
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
dR03mvaVLoose, w/o Re-scaling
• Randomization method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 2%, 10%, 9.2%, 2%, 9.5%, 6.7%
• Over-sampling method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 0.8%, 5%, 3%, 0.8%, 4%, 3%
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
dR03mvaVLoose, with Re-scaling
• Randomization method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 2%, 3.5%, 5%, 1.6%, 3.4%, 4%
• Over-sampling method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 0.1%, 0.4%, 1.5%, 0.1%, 0.6%, 2%
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
dR03mvaVVLoose, w/o Re-scaling
• Randomization method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 2%, 9.9%, 10%, 5.7%, 11%, 2%
• Over-sampling method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 3.2%, 5.1%, 1%, 2.5%, 0.8%, 5.1%
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
dR03mvaVVLoose, with Re-scaling
• Randomization method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 5.4%, 2%, 2.7%, 3.7%, 4.4%, 1.8%
• Over-sampling method, panels at 300, 500 and 800 GeV. Percentages printed on the panels (panel assignment not recoverable): 0.2%, 1.7%, 0.3%, 0.1%, 2.5%, 0.7%
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
dR03mvaLoose, Low Masses only (300 GeV) and High Masses only (500 GeV)
• Randomization method, four panels: Low Masses only at 300 GeV and High Masses only at 500 GeV, each with and w/o Re-scaling. Percentages printed on the panels (panel assignment not recoverable): 6%, 1.9%, 27%, 5%, 8%, 2.1%, 33%, 4.3%
• Over-sampling method, four panels: Low Masses only at 300 GeV and High Masses only at 500 GeV, each with and w/o Re-scaling. Percentages printed on the panels (panel assignment not recoverable): 6%, 0.1%, 0.5%, 2%, 8%, 0.2%, 0.9%, 1%
dR03MVALoose INPUT VARIABLES: TRAIN DATASET-1 OVER-SAMPLING METHOD (w/o RE-SCALING)
dR03MVALoose INPUT VARIABLES: TEST DATASET-1 OVER-SAMPLING METHOD (w/o RE-SCALING)
dR03MVALoose INPUT VARIABLES: TRAIN DATASET-2 OVER-SAMPLING METHOD (w/o RE-SCALING)
dR03MVALoose INPUT VARIABLES: TEST DATASET-2 OVER-SAMPLING METHOD (w/o RE-SCALING)
dR03MVALoose INPUT VARIABLES: TRAIN DATASET-1 RANDOMIZATION METHOD (w/o RE-SCALING)
dR03MVALoose INPUT VARIABLES: TEST DATASET-1 RANDOMIZATION METHOD (w/o RE-SCALING)
dR03MVALoose INPUT VARIABLES: TRAIN DATASET-2 RANDOMIZATION METHOD (w/o RE-SCALING)
dR03MVALoose INPUT VARIABLES: TEST DATASET-2 RANDOMIZATION METHOD (w/o RE-SCALING)
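The slides do not spell out how the two methods assign gen_mHH to background events. A common convention for parametrized classifiers is that randomization gives each background event one mass drawn at random from the signal mass points, while over-sampling copies each background event once per mass point. The sketch below illustrates that convention only; the function names, the event dictionaries and the 1/N weight scaling in the over-sampling case are assumptions made for illustration, not the setup used for these plots.

```python
import random

# Mass points shown on the slides (GeV).
SIGNAL_MASSES = [300, 500, 800]

def randomize_background(background, masses, seed=0):
    """Assumed 'randomization': one random gen_mHH per background event."""
    rng = random.Random(seed)
    return [dict(evt, gen_mHH=rng.choice(masses)) for evt in background]

def oversample_background(background, masses):
    """Assumed 'over-sampling': one copy of each background event per mass.

    The weight is divided by len(masses) so that the total background
    weight is unchanged (an illustrative choice, not taken from the slides).
    """
    out = []
    for evt in background:
        for m in masses:
            out.append(dict(evt, gen_mHH=m, weight=evt["weight"] / len(masses)))
    return out

# Toy background events with a single hypothetical input variable "x".
background = [{"x": 0.1, "weight": 1.0}, {"x": 0.7, "weight": 2.0}]
randomized = randomize_background(background, SIGNAL_MASSES)
oversampled = oversample_background(background, SIGNAL_MASSES)
```

Under these assumptions the randomized sample keeps the original number of background events, while the over-sampled one is three times larger but carries the same total weight.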
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
Randomization method, w/o Re-scaling
• Panels (all masses except 300 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
• Panels (only 300 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
Over-sampling method, w/o Re-scaling
• Panels (all masses except 300 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
• Panels (only 300 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
Randomization method, w/o Re-scaling
• Panels (all masses except 500 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
• Panels (only 500 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
Over-sampling method, w/o Re-scaling
• Panels (all masses except 500 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
• Panels (only 500 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
Randomization method, w/o Re-scaling
• Panels (all masses except 800 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
• Panels (only 800 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
ROC Curves: ntrees = 1000, depth = 2, mcw = 1, lr = 0.01
Over-sampling method, w/o Re-scaling
• Panels (all masses except 800 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
• Panels (only 800 GeV): dR03mvaLoose, dR03mvaVLoose, dR03mvaVVLoose
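Comparing ROC curves for "only X GeV" against "all masses except X GeV" amounts to evaluating the same separation metric on different mass selections. As a rough illustration, the sketch below computes the area under the ROC curve as the probability that a signal event scores higher than a background event, restricted to a chosen set of gen_mHH values. The event layout (`score`, `label`, `gen_mHH`), the helper names and the toy numbers are hypothetical, and it assumes that background events also carry a gen_mHH value.

```python
def roc_auc(sig_scores, bkg_scores):
    """AUC as P(signal score > background score); ties count as 1/2."""
    wins = 0.0
    for s in sig_scores:
        for b in bkg_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(sig_scores) * len(bkg_scores))

def auc_for_masses(events, keep_mass):
    """AUC using only events whose gen_mHH passes the keep_mass predicate."""
    sig = [e["score"] for e in events if e["label"] == 1 and keep_mass(e["gen_mHH"])]
    bkg = [e["score"] for e in events if e["label"] == 0 and keep_mass(e["gen_mHH"])]
    return roc_auc(sig, bkg)

# Toy scored events (label 1 = signal, 0 = background); values are made up.
events = [
    {"score": 0.9, "label": 1, "gen_mHH": 800},
    {"score": 0.6, "label": 1, "gen_mHH": 300},
    {"score": 0.4, "label": 1, "gen_mHH": 300},
    {"score": 0.5, "label": 0, "gen_mHH": 300},
    {"score": 0.2, "label": 0, "gen_mHH": 300},
    {"score": 0.3, "label": 0, "gen_mHH": 800},
]

auc_only_300 = auc_for_masses(events, lambda m: m == 300)
auc_except_300 = auc_for_masses(events, lambda m: m != 300)
```

The pairwise loop is quadratic in the number of events, which is fine for a sketch but would be replaced by a rank-based computation on real samples.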
The similar performance of the upper-half test ROC curves for increasing “gen_mHH” indicates that the BDT interpolation might not be working as expected.
• This is despite the fact that the higher-mass (500, 800 GeV) gen_mHH signals have better S/B separation than the lower-mass (300 GeV) signals, as is evident in the lower-half ROC curves.
• To check this apparent insensitivity of the parametrized BDT to changes in gen_mHH, Christian asked me to compute raw BDT output values for fixed events from different signal MCs in two cases:
  • When they have the correct gen_mHH allotted to them.
  • When we deliberately assign them a wrong gen_mHH, keeping the rest of the input variables the same as above.
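The check described above can be sketched as a scan over mass hypotheses for a single fixed event. In the sketch below, `predict` stands in for whatever returns the raw BDT output for one event; it, the helper `scan_gen_mHH`, the input variable `x` and the two toy models are hypothetical names introduced for illustration, not the code used for the plots.

```python
def scan_gen_mHH(predict, event, true_mass, masses):
    """Score one event at every mass hypothesis, all other inputs fixed."""
    scores = {}
    for m in masses:
        features = dict(event, gen_mHH=m)  # overwrite only the mass input
        scores[m] = predict(features)
    return {
        "correct": scores[true_mass],
        "wrong": {m: s for m, s in scores.items() if m != true_mass},
    }

# Two toy stand-ins for a trained model, for illustration only.
def insensitive_model(features):
    # Ignores gen_mHH entirely.
    return 0.5 + 0.1 * features["x"]

def sensitive_model(features):
    # Output changes with gen_mHH.
    return features["x"] * features["gen_mHH"] / 1000.0

event = {"x": 1.0}          # a fixed event from the 300 GeV signal sample
masses = [300, 500, 800]    # GeV

flat = scan_gen_mHH(insensitive_model, event, 300, masses)
varying = scan_gen_mHH(sensitive_model, event, 300, masses)
```

If the scores under the wrong mass hypotheses match the score under the correct one, as for `flat` here, the model is effectively not using gen_mHH for that event.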
dR03mvaLoose
• Randomization Method: Case-1, Case-2
• Over-sampling Method: Case-1, Case-2
dR03mvaVLoose
• Randomization Method: Case-1, Case-2
• Over-sampling Method: Case-1, Case-2
dR03mvaVVLoose
• Randomization Method: Case-1, Case-2
• Over-sampling Method: Case-1, Case-2