110 likes | 123 Views
Hands-on Session RooStats and TMVA Exercises. Getting Started. A few advice words: Install version 5.25.02 locally setting up . root_Installaton_directory/bin/thisroot.sh (or .csh) recommend to compile your macros and run with: .x macro.C+; or .L macro.C+ then macro();,
E N D
Getting Started • A few advice words: • Install version 5.25.02 locally • setting up • . root_Installaton_directory/bin/thisroot.sh (or .csh) • recommend to compile your macros and run with: • .x macro.C+; or .L macro.C+ then macro();, • include: using namespace RooFit; using namespace RooStats;, • avoid running a macro multiple times in the same ROOT session • If not working locally: • setting ROOT 5.25.02 using the CERN AFS installation: • on SLC4: • /afs/cern.ch/sw/lcg/app/releases/ROOT/5.25.02/slc4_amd64_gcc34/root/bin/thisroot.csh or .sh
RooStats Exercises (1) • Generate a Poisson Model (use RooFit factory) • Use RooStats calculators class to find limits, significances and produce plots // Use a RooWorkspace to store the PDF models, prior // informations, list of parameters, ... RooWorkspace myWS("myWS"); // Number of signal and background events myWS.factory("S[2,0,10]"); // default value 2 and range [0,10] myWS.factory("B[1]"); // value fixed to 1 // Observable (dummy observable used) myWS.factory("x[0,1]"); // arbitrary range [0,1] myWS.var("x")->setBins(1); // Signal and background distribution of the observable myWS.factory("Uniform::sigPdf(x)"); myWS.factory("Uniform::bkgPdf(x)"); // S+B and B-only models (both extended PDFs) myWS.factory("SUM::model(S*sigPdf,B*bkgPdf"); myWS.factory("ExtendPdf::modelBkg(bkgPdf,B)"); // generate binned data with fixed number of events RooAbsData* data = myWS.pdf("model")->generateBinned(*myWS.var("x"),myWS.var("S")->getVal()+myWS.var("B")->getVal(),Name("data")); //plot the data
ProfileLikelihood Exercise • ProfileLikelihoodCalculator • get interval: • get significance: RooRealVar * S = myWS.var("S");RooArgSet POI(*S);RooAbsPdf * model = myWS.pdf("model");ProfileLikelihoodCalculator plc(*data,*model,POI); //set thest sizeplc.SetTestSize(0.10); model->fitTo(*data,SumW2Error(kFALSE));LikelihoodInterval * interval = plc.GetInterval();const double lowerLimit = interval->LowerLimit(*S);const double upperLimit = interval->UpperLimit(*S);LikelihoodIntervalPlot lplot(interval);lplot.Draw() // Create a copy of the POI parameters to set the values to zeroRooArgSet nullparams;nullparams.addClone(*myWS.var("S"));((RooRealVar *) (nullparams.first()))->setVal(0);plc.SetNullParameters(nullparams); //get significance HypoTestResult* plcResult = plc.GetHypoTest();const double significance = plcResult->Significance();
Result • for S=2 you should get: Significance = 1.60987
Gaussian Model • Generate Gaussian signal over flat background • systematics in sigma of mass and background RooWorkspace myWS("myWS"); // ObservablemyWS.factory("mass[0,500]"); // range [0,500]// Signal and background distribution of the observable myWS.factory("Gaussian::sigPdf(mass,200,sigSigma[0,100])") ;myWS.factory("Uniform::bkgPdf(mass)") ;myWS.factory("SUM::model(S[5,0,30]*sigPdf,B[10,0,100]*bkgPdf") ;// Background only pdfmyWS.factory("ExtendPdf::modelBkg(bkgPdf,B)") ; // Prior for signalmyWS.factory("Uniform::priorPOI(S)") ;// Priors for nuisance parameters (signal + backg) myWS.factory("Gaussian::prior_sigSigma(sigSigma,50,5)") ;myWS.factory("Gaussian::prior_B(B,10,3)") ;myWS.factory("PROD::priorNuisance(prior_sigSigma,prior_B)") ; // generate unbinned data myWS.defineSet("observables","mass"); RooAbsData * data = myWS.pdf("model")->generate(*myWS.set("observables"),Extended(),Name("data"));
RooStats Exercise • Compute: • 68% CL 2-sided confidence interval and significance from the (profiled-) likelihood ratio • plot profile log-likelihood ratio • Frequentist p-value in the S+B and B-only hypotheses, signal significance, CL_S ratio (HybridCalculator) • Repeat including systematic uncertainty in the background and signal width • Use also Bayesian calculator , MCMCCalculator and Neyman construction for finding limits (with Poisson or Gaussian example) • More examples and code available in • https://twiki.cern.ch/twiki/bin/view/RooStats/TutorialsOctober2009
HybridCalculator HybridCalculator hc("hc","HybridCalculator",*data,*modelSB,*modelB); hc.SetNumberOfToys(5000); hs.SetTestStatistics(1); //Run and retrieve the results HybridResult* hcResult = hc.GetHypoTest(); double p_value_sb = hcResult->AlternatePValue(); double p_value_b = hcResult->NullPValue(); double cl_s = hcResult->CLs(); double significance = hcResult->Significance(); //Making a plot of the results HybridPlot* hcPlot = hcResult->GetPlot("hcPlot","p-Values plot",100); hcPlot->Draw();
Bayesian Calculator BayesianCalculator bc(data,*model,RooArgSet(*POI),*priorPOI,&nuisanceParameters); //Compute the credibility interval //Set the confidence level of the credibility interval and compute it. Returns a SimpleInterval. bc.SetTestSize(0.05); SimpleInterval* interval = bc.GetInterval(); double lowerLimit = interval->LowerLimit(); double upperLimit = interval->UpperLimit(); // The code below produce a plot: RooAbsPdf* fPosteriorPdf = bcalc.GetPosteriorPdf(); RooPlot* plot = POI->frame(); plot->SetTitle(TString("Posterior probability of parameter \"")+TString(POI->GetName())+TString("\"")); fPosteriorPdf->plotOn(plot,RooFit::Range(interval->LowerLimit(),interval->UpperLimit(),kFALSE),RooFit::VLines(),RooFit::DrawOption("F"),RooFit::MoveToBack(),RooFit::FillColor(kGray)); fPosteriorPdf->plotOn(plot); plot->GetYaxis()->SetTitle("posterior probability"); plot->Draw();
RooStats Exercises (2) • Generate a multi-dimensional model • have both S and mass as parameter of interest • Compare results of Profile likelihood, Neyman construction and MCMC calculator • examples are to be downloaded from: • http://www.cern.ch/moneta/temp/roostats2.tar • run to generate data: • rs500e_PrepareWorkspace_GaussOverFlat_withSystematics_floatingMass.C • to run exercise: • rs501_ThreeTypesOfLimits.C
TMVA exercises • Download macro code from • http://www.cern.ch/moneta/temp/tmva_exercises.tar • Try first generating different data-sets (use RooFit for doing it ) • macro makesample.C • usage: makesample(int sampleID) • ID=0,1,2,3,4 depending on type of data to generate • Try and play with the various methods • macro driver.C • driver(Int_t sampleID = 0, TString obsList = "x,y,z", TString myMethodList = "Fisher,Likelihood,MLP,BDT") • Obtaine TMVA GUI from macro • produce plots of the signal and background distribution • look at the result, classifier outputs and performances (ROC curve )