Fourier Studies: Looking at Data

A. Cerri

Outline • Introduction • Data Sample • Toy Montecarlo • Expected Sensitivity • Expected Resolution • Frequency Scans: • Fourier • Amplitude Significance • Amplitude Scan • Likelihood Profile • Conclusions

Introduction • Principles of the Fourier-based method presented on 12/6/2005, 12/16/2005, 1/31/2006, 3/21/2006 • Methods documented in CDF7962 & CDF8054 • Full implementation described on 7/18/2006 at BLM • Aims: • Settle on a completely Fourier-transform-based procedure • Provide a tool for possible analyses, e.g.: • J/ direct CP terms • DsK direct CP terms • Perform the complete exercise on the main mode () • All you will see is restricted to . Focusing on this mode alone for the time being • Not our aim: bless a mixing result on the full sample

Data Sample • Full 1 fb-1 • Ds, main Bs peak only • ~1400 events in [5.33,5.41] consistent with baseline analysis • S/B ~ 8:1 • Background modeled from [5.7,6.4] • Efficiency curve measured on MC • Taggers modeled after winter ’05 (cut-based) + OSKT

Toy Montecarlo • Exercise the whole procedure on a realistic case (see BML 7/18) • Toy simulation configured to emulate the sample from the previous page • Access to MC truth: • Study of pulls (see BML 7/18) • Projected sensitivity • Construction of confidence bands to measure false alarm/detection probability • Projected Δm resolution

Toy Montecarlo: sensitivity • Reminder: golden sample only • Reduced sensitivity, but in line with what is expected for the statistics • All this obtained without a t-dependent fit • By iterating we can build confidence bands

Distribution of Maxima • Run toy montecarlo several times • “Signal”: default toy • “Background”: toy with scrambled taggers • Apply peak-fitting machinery • Derive distribution of maxima (position, height) • Max A/σ: limited separation and uniform peak distribution for background, but not model (& tagger parameterization) dependent • Min log L ratio: improved separation and localized peak distribution for background
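The signal/background toy pairing on this slide can be illustrated with a minimal sketch. It assumes a heavily simplified toy (exponential proper time, perfect tagging, no resolution or efficiency effects) and reads "scrambled taggers" as shuffling the tag decisions among events, which is only one plausible interpretation. The names `make_toy`, `scramble_tags` and `cosine_transform`, and the lifetime and oscillation frequency used, are illustrative and not taken from the actual tool.

```python
import math
import random

def make_toy(n_events, delta_m, tau, rng):
    """Return (t, q) pairs: proper time in ps, tag q = +1 (unmixed) or -1 (mixed)."""
    events = []
    for _ in range(n_events):
        t = rng.expovariate(1.0 / tau)
        p_unmixed = 0.5 * (1.0 + math.cos(delta_m * t))
        q = 1 if rng.random() < p_unmixed else -1
        events.append((t, q))
    return events

def scramble_tags(events, rng):
    """Shuffle tag decisions among events, destroying any time-tag correlation."""
    tags = [q for _, q in events]
    rng.shuffle(tags)
    return [(t, q) for (t, _), q in zip(events, tags)]

def cosine_transform(events, freq):
    """Real part of the tag-weighted Fourier sum at frequency freq (ps^-1)."""
    return sum(q * math.cos(freq * t) for t, q in events)

# Illustrative values only (assumptions, not the analysis configuration).
rng = random.Random(1)
signal = make_toy(2000, delta_m=17.0, tau=1.5, rng=rng)
background = scramble_tags(signal, rng)
peak_signal = cosine_transform(signal, 17.0)
peak_background = cosine_transform(background, 17.0)
```

In this sketch the "signal" toy shows a large transform at the generated frequency, while the scrambled toy fluctuates around zero, which is what makes it usable as a background model for the peak search.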

Toy Montecarlo: confidence bands • Signal or background: depth of the deepest minimum in toys • Tail integral of the distribution gives detection & false alarm probabilities
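As a sketch of the tail-integral step, the snippet below counts the fraction of toys whose deepest minimum is at least as deep as a chosen threshold. The function name `tail_probability` and all numbers are hypothetical, made up purely for illustration; in practice the lists would hold one entry per signal or background toy.

```python
def tail_probability(depths, observed):
    """Fraction of toys whose deepest minimum is at least as deep as `observed`."""
    if not depths:
        raise ValueError("need at least one toy")
    return sum(1 for d in depths if d <= observed) / len(depths)

# Hypothetical depths of the deepest minimum, one entry per toy.
signal_toys = [-6.1, -4.8, -3.9, -3.0, -2.2, -1.5]
background_toys = [-3.1, -2.0, -1.6, -1.2, -0.9, -0.5]

threshold = -2.5  # hypothetical observed depth
detection = tail_probability(signal_toys, threshold)        # from signal toys
false_alarm = tail_probability(background_toys, threshold)  # from background toys
```

With these made-up inputs, four of six signal toys and one of six background toys pass the threshold.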

Toy Montecarlo: m resolution • Two approaches: • Fit pulls distributions and measure width • Fit two parabolic branches to L minimum in a toy by toy basis RMS~0.5 Negative Error Positive Error

Data • All the plots you are going to see are based on Fourier transform & toy montecarlo distributions, unless explicitly mentioned

Data: Fourier and Amplitude

Compare with standard A-scan

Data: Where we look for a Peak • Automated code looks for the -log(L ratio) minimum • Depth of the minimum compared to toy MC distributions gives signal/background probabilities [Plots: background and signal toy distributions]

Data Results • Peak in L ratio is: -2.84 (A/σ=2.53) • Detection (signal) probability: 53% • False Alarm (background fake) probability: 25% • Likelihood profile:

Conclusions • Worked the exercise all the way through • Method: • Assessed • Viable • Power equivalent to standard technique • Completely independent set of tools/code from the standard analysis, and consistent with it! • Tool is ready and mature for a full-blown study • Next: document and bless result as proof-of-principle

Backup

Tool Structure [Diagram boxes: Data; Toy MC; Bootstrap; ASCII flat file (ct, σct, Dexp, tag dec., Kfactor); Configuration parameters: Signal (Δms,,ct,Dtag,tag,Kfactor), Background (S/B,A,Dtag,tag, fprompt, ct, prompt, longliv,), curves (4x[fi(t-b)(t-b)2e-t/]); Functions: (Re,Im) (+,-,0, tags)(S,B) Re(~[Δms=])(); Fourier Transform; Amplitude Scan; Ct Histograms] • Same ingredients as standard L-based A-scan • Consistent framework for: • Data analysis • Toy MC generation/Analysis • Bootstrap Studies • Construction of CL bands

Validation: • Toy MC Models • “Fitter” Response

Ingredients in Fourier space [Plots: resolution curve (e.g. single Gaussian) and ct efficiency curve (random example); axes ct (ps) and Δm (ps-1)]
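For the single-Gaussian resolution example, the Fourier-space counterpart is the well-known damping factor exp(-σ²ω²/2). The sketch below checks this numerically with a midpoint-rule integral; the resolution value of 0.1 ps and the function names are illustrative assumptions, not parameters of the analysis.

```python
import math

def gaussian_ft_numeric(sigma, omega, half_range=8.0, steps=4000):
    """Midpoint-rule estimate of the integral of G(t; sigma) * cos(omega * t)."""
    lo = -half_range * sigma
    dt = 2.0 * half_range * sigma / steps
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * dt
        total += norm * math.exp(-0.5 * (t / sigma) ** 2) * math.cos(omega * t) * dt
    return total

def gaussian_ft_analytic(sigma, omega):
    """Closed form: a Gaussian in ct maps to a Gaussian damping in frequency."""
    return math.exp(-0.5 * (sigma * omega) ** 2)

sigma_ct = 0.1   # ps, illustrative resolution
omega = 17.0     # ps^-1, illustrative frequency
numeric = gaussian_ft_numeric(sigma_ct, omega)
analytic = gaussian_ft_analytic(sigma_ct, omega)
```

The imaginary (sine) part vanishes by symmetry for a Gaussian centered at zero, so only the cosine integral is evaluated.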

Toy Montecarlo • As realistic as it can get: • Use histogrammed ct, Dtag, Kfactor • Fully parameterized curves • Signal: • Δm, ,  • Background: • Prompt+long-lived • Separate resolutions • Independent curves [Plots: Data+Toy and Realistic MC+Toy, toy vs data in ct (ps)]

Flavor-neutral checks • Re(+)+Re(-)+Re(0) is analogous to a lifetime fit: • Unbiased with respect to mixing • Sensitive to: • Efficiency curve • Resolution [Plots: Realistic MC+Model and Realistic MC+Toy, with ct efficiency and resolution, in ct (ps) and Δm (ps-1); Realistic MC+Wrong Model in Δm (ps-1), showing what happens when things go wrong]

“Lifetime Fit” on Data • Comparison in ct and Δm spaces of data and toy MC distributions [Plots: Data vs Prediction and Data vs Toy; axes Δm (ps-1) and ct (ps)]

“Fitter” Validation: “pulls” • Re(x), or the difference Re(+)-Re(-): predicted (value, σ) vs simulated • Analogous to likelihood-based fit pulls • Checks: • Fitter response • Toy MC • Pull width/RMS vs Δms shows perfect agreement • Toy MC and analytical models perfectly consistent • Same reliability and consistency you get for L-based fits [Plots: pull mean and pull RMS vs Δm (ps-1)]
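A pull test of this kind can be sketched on a much simpler quantity than the real one: the untagged cosine sum over exponentially distributed proper times, for which the predicted value and σ are known in closed form. No resolution, efficiency, tagging or background is included, and the names `predicted_sum` and `run_pulls` and all parameter values are illustrative assumptions.

```python
import math
import random

def predicted_sum(n_events, freq, tau):
    """Analytic mean and sigma of sum_i cos(freq * t_i) for t_i ~ Exp(tau)."""
    mean_c = 1.0 / (1.0 + (freq * tau) ** 2)
    mean_c2 = 0.5 * (1.0 + 1.0 / (1.0 + (2.0 * freq * tau) ** 2))
    return n_events * mean_c, math.sqrt(n_events * (mean_c2 - mean_c ** 2))

def run_pulls(n_toys, n_events, freq, tau, seed=0):
    """Return (simulated - predicted) / sigma for each toy."""
    rng = random.Random(seed)
    value, sigma = predicted_sum(n_events, freq, tau)
    pulls = []
    for _ in range(n_toys):
        simulated = sum(math.cos(freq * rng.expovariate(1.0 / tau))
                        for _ in range(n_events))
        pulls.append((simulated - value) / sigma)
    return pulls

pulls = run_pulls(n_toys=400, n_events=500, freq=2.0, tau=1.5)
pull_mean = sum(pulls) / len(pulls)
pull_rms = math.sqrt(sum((p - pull_mean) ** 2 for p in pulls) / len(pulls))
```

If prediction and simulation agree, the pull mean should be compatible with 0 and the RMS with 1, within the statistical precision of the number of toys.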

Unblinded Data • Cross-check against available blessed results • No bias since it’s all unblinded already • Using OSTags only • Red: our sample, blessed selection • Black: blessed event list • This serves mostly as a proof of principle to show the status of this tool! • Next plots are based on skimmed data, using the OST only, in the winter blessing style. No box has been opened. [Plot: mass distribution, M (GeV)]

From Fourier to Amplitude • Recipe is straightforward: • Compute the Fourier transform at each frequency • Compute the expected normalization N(freq), i.e. the transform predicted for Δm=freq • Obtain A as the ratio of the transform to N(freq) • No more data driven [N(freq)] • Uses all ingredients of A-scan • Still no minimization involved though! • Here looking at Ds() only (350 pb-1, ~500 evts) • Compatible with blessed results [Plots: Fourier transform + error + normalization, vs Δm (ps-1)]
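The recipe can be sketched on an idealized toy: exponential proper time, a single tagging dilution, and no resolution, efficiency or background, so that the expected normalization has a closed form. These simplifications, the quick error estimate (which neglects the small signal term in the variance), and the names `make_toy` and `amplitude` are assumptions for illustration rather than the tool's actual implementation.

```python
import math
import random

def make_toy(n_events, delta_m, tau, dilution, rng):
    """Return (t, q) pairs with tag q = +1 (unmixed) or -1 (mixed)."""
    events = []
    for _ in range(n_events):
        t = rng.expovariate(1.0 / tau)
        p_unmixed = 0.5 * (1.0 + dilution * math.cos(delta_m * t))
        events.append((t, 1 if rng.random() < p_unmixed else -1))
    return events

def amplitude(events, freq, tau, dilution):
    """Return (A, sigma_A) at one frequency."""
    # Step 1: real part of the tag-weighted transform, and a simple error.
    transform = sum(q * math.cos(freq * t) for t, q in events)
    error = math.sqrt(sum(math.cos(freq * t) ** 2 for t, _ in events))
    # Step 2: expected transform if delta_m were equal to freq,
    # N * D * <cos^2(freq * t)> for an exponential lifetime tau.
    norm = len(events) * dilution * 0.5 * (
        1.0 + 1.0 / (1.0 + (2.0 * freq * tau) ** 2))
    # Step 3: the amplitude is the ratio, with no minimization involved.
    return transform / norm, error / norm

# Illustrative values only (assumptions, not the analysis configuration).
rng = random.Random(7)
toy = make_toy(5000, delta_m=17.0, tau=1.5, dilution=0.5, rng=rng)
a_on, sigma_on = amplitude(toy, 17.0, tau=1.5, dilution=0.5)
a_off, sigma_off = amplitude(toy, 10.0, tau=1.5, dilution=0.5)
```

In this idealized setting the amplitude is expected to be compatible with 1 at the generated frequency and with 0 far from it, as in a standard amplitude scan.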

Toy MC • Same configuration as Ds() but ~1000 events • Realistic toy of sensitivity at higher effective statistics (more modes/taggers) • Able to run on data (ASCII file) and even generate toy MC from it [Plots: Fourier transform + error + normalization, vs Δm (ps-1)]

Confidence Bands

Peak Search • Minuit-based search for maxima/minima in the chosen parameter vs Δm • Two approaches: • Mostly data driven: use A/σ • Less prone to systematics • Less sensitive • Use the full information (L ratio): • More information needed • Better sensitivity (Reminder: here sensitivity is defined as ‘discovery potential’ rather than the formal sensitivity defined in the mixing context) • We will follow both approaches in parallel

“Toy” Study • Based on full-fledged toy montecarlo • Same efficiency and ct as in the first toy • Higher statistics (~1500 events) • Full tagger set used to derive D distribution • Take with a grain of salt: optimistic assumptions in the toy parameters • The idea behind this: going all the way through with our studies before playing with data

Distribution of Maxima • Run toy montecarlo several times • “Signal”: default toy • “Background”: toy with scrambled taggers • Apply peak-fitting machinery • Derive distribution of maxima (position, height) • Max A/σ: limited separation and uniform peak distribution for background, but not model (& tagger parameterization) dependent • Min log L ratio: improved separation and localized peak distribution for background

Maxima Heights • Separation gets better when more information is added to the “fit” • Both methods viable “with a grain of salt”. Not advocating one over the other at this point: comparing them in a real case will be an additional cross-check • ‘False Alarm’ and ‘Discovery’ probabilities can be derived by integration

Integral Distributions of Maxima Heights [Plots: linear scale and logarithmic scale]

Determining the Peak Position

Measuring the Peak Position • Two ways of evaluating the stat. uncertainty on the peak position: • Bootstrap off data sample • Generate toy MC with the same statistics • At some point will have to decide which one to pick as ‘baseline’ but a cross check is a good thing! • Example: Δms=17 ps-1
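The bootstrap option can be sketched as follows: resample the events with replacement, redo the peak search on each replica, and take the spread of the resulting positions. For brevity the sketch replaces the Minuit-based peak fit with a plain grid scan of the tag-weighted cosine sum and uses an idealized toy with perfect tagging; both are simplifying assumptions, as are the function names and the parameter values.

```python
import math
import random

def make_toy(n_events, delta_m, tau, rng):
    """Idealized toy: exponential proper time, perfect tag q = +1 or -1."""
    events = []
    for _ in range(n_events):
        t = rng.expovariate(1.0 / tau)
        p_unmixed = 0.5 * (1.0 + math.cos(delta_m * t))
        events.append((t, 1 if rng.random() < p_unmixed else -1))
    return events

def peak_position(events, grid):
    """Grid frequency that maximizes the tag-weighted cosine sum."""
    return max(grid, key=lambda f: sum(q * math.cos(f * t) for t, q in events))

def bootstrap_spread(events, grid, n_replicas, rng):
    """RMS of the peak position over samples drawn with replacement."""
    peaks = [peak_position(rng.choices(events, k=len(events)), grid)
             for _ in range(n_replicas)]
    mean = sum(peaks) / len(peaks)
    return math.sqrt(sum((p - mean) ** 2 for p in peaks) / len(peaks))

# Illustrative values only (assumptions, not the analysis configuration).
rng = random.Random(3)
events = make_toy(1000, delta_m=17.0, tau=1.5, rng=rng)
grid = [15.5 + 0.1 * i for i in range(31)]
peak = peak_position(events, grid)
spread = bootstrap_spread(events, grid, n_replicas=30, rng=rng)
```

The toy MC option differs only in how replicas are produced: instead of resampling the one available sample, each replica is generated afresh with the same statistics, which is why the two make a natural cross-check.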

Error on Peak Position • “Peak width” is our goal (σ of Δms) • Several definitions: histogram RMS, core Gaussian, positive+negative fits • Fit strongly favors two Gaussian components • No evidence for different +/- widths • The rest is a matter of taste…

Next Steps • Measure accurately for the whole fb-1 the ‘fitter ingredients’: • Efficiency curves • Background shape • D and ct distributions • Re-generate toy montecarlos and repeat above study all the way through • Apply same study with blinded data sample • Be ready to provide result for comparison to main analysis • Freeze and document the tool, bless as procedure

Conclusions • Full-fledged implementation of the Fourier “fitter” • Accurate toy simulation • Code scrutinized and mature • The exercise has been carried all the way through • Extensively validated • All ingredients are settled • Ready for more realistic parameters • After that look at data (blinded first)
