
Statistics Forum Follow-up info for Physics Coordination 24 June, 2011


Presentation Transcript


  1. Statistics Forum Follow-up info for Physics Coordination, 24 June 2011. Glen Cowan, Eilam Gross.

  2. Main questions: What do we see as the main way forward with CMS? What do we recommend in the short term (summer 2011)? What do we recommend after summer 2011?

  3. The way forward with CMS. We met again with CMS on the evening of 23 June 2011 (ATLAS: Cowan, Gross, Murray, Read, Cranmer; CMS: Cousins, Lyons, Dorigo, Demortier). Cousins more or less ruled out supporting either CLs or PCL as a long-term recommendation for CMS. We tried to clarify whether this was his own view or that of CMS. He believes that his own view, which is to use Feldman-Cousins unified (two-sided) intervals, would be the one followed in CMS. We replied that the prevailing view in ATLAS has been to quote a one-sided upper limit, and it is difficult to envisage adopting F-C in its place. So at present there is no single frequentist method that would have long-term support from both ATLAS and CMS. In the short term, there is support for CLs in both collaborations as an interim solution to allow for comparison of limits.

  4. The way forward with CMS (2). Bayesian methods emerged as a solution with support from both sides. Such methods have always been viewed as a useful complement to the frequentist limit. Furthermore, one can study and report the frequentist properties of Bayesian intervals (i.e., the fraction of times they would cover the true parameter value), and in many examples this coverage turns out to be very good; a toy study of this kind is sketched below. Both sides agreed to consider, as a common method, Bayesian methods with priors chosen to have good frequentist properties. At a more detailed level it will take more time to agree on and implement the procedures, so in the short term this is not a realistic solution for analyses where Bayesian methods have not already been developed.
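As an illustration of such a coverage study, here is a minimal sketch (ours, not an official ATLAS or CMS tool) for a counting experiment n ~ Pois(s + b) with known background b and a flat prior on s ≥ 0; the function names and the choice b = 3 are illustrative only:

```python
# A minimal coverage-study sketch (ours, not an official ATLAS/CMS tool):
# Bayesian upper limit for a counting experiment n ~ Pois(s + b), known
# background b, flat prior on s >= 0.
import numpy as np
from scipy.special import gammaincc
from scipy.optimize import brentq

def bayes_upper_limit(n, b, cl=0.95):
    """Credible upper limit on s; the posterior tail probability is
    P(s > s_up | n) = Gamma(n+1, s_up + b) / Gamma(n+1, b)."""
    alpha = 1.0 - cl
    tail = lambda s: gammaincc(n + 1, s + b) / gammaincc(n + 1, b) - alpha
    return brentq(tail, 0.0, 50.0 + 10.0 * n)

def coverage(s_true, b, n_toys=20000, rng=np.random.default_rng(1)):
    """Fraction of toy experiments in which the limit covers s_true."""
    n = rng.poisson(s_true + b, size=n_toys)
    limit_for = {k: bayes_upper_limit(k, b) for k in np.unique(n)}
    return np.mean([limit_for[k] >= s_true for k in n])

for s in [0.0, 1.0, 3.0, 10.0]:
    print(f"s_true = {s:5.1f}: coverage = {coverage(s, b=3.0):.3f}")
```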

  5. Recommendation on minimum power for PCL from 16% to 50%. For summer 2011 (and beyond), we recommend quoting PCL limits with a minimum power of 50%. The reasons for moving the minimum power to 50% are both theoretical and practical: 50% removes the possibility that a conservative treatment of systematics leads to a stronger limit; some computational issues related to low-count analyses are less problematic with 50%; and there is a slight reduction in the burden on the analyst, since the 50% quantile (the median) needed for the power constraint is easier to find than the 16% quantile (the −1σ edge of the error band). A sketch of the constrained limit for the simple Gaussian problem follows.
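The power constraint itself is simple to state for the Gaussian measurement problem x ~ Gauss(μ, σ): the quoted limit is the unconstrained one-sided limit, but never below the limit whose power under the background-only model reaches the minimum. A minimal sketch, assuming σ = 1 and 95% CL (variable names are ours):

```python
# Sketch of the power-constrained limit (PCL) for x ~ Gauss(mu, sigma);
# variable names are ours.  At 95% CL with min_power = 0.50 the constraint
# is the median upper limit under the background-only (mu = 0) model.
from scipy.stats import norm

def pcl_upper_limit(x, sigma=1.0, cl=0.95, min_power=0.50):
    mu_up = x + sigma * norm.ppf(cl)   # unconstrained one-sided limit
    # Smallest limit for which the probability (under mu' = 0) of
    # excluding mu reaches min_power:
    # mu_min = sigma * (Phi^-1(cl) + Phi^-1(min_power)).
    mu_min = sigma * (norm.ppf(cl) + norm.ppf(min_power))
    return max(mu_up, mu_min)

# A strong downward fluctuation no longer drags the quoted limit far down:
print(pcl_upper_limit(-2.0))                  # -> 1.645 (50% constraint active)
print(pcl_upper_limit(-2.0, min_power=0.16))  # -> about 0.65 (16% constraint)
print(pcl_upper_limit(0.5))                   # -> 2.145 (constraint inactive)
```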

  6. Recommendation on minimum power for PCL from 16% to 50% (2). A 50% minimum power also gives a slight reduction in the “psychological burden” on conference speakers: one would see a sizable difference between PCL and CLs less often, and then only in cases where a strong downward fluctuation leads to a stronger CLs limit (see the figure on the next slide, and recall that under the background-only model μ̂ lies 68% of the time between −1σ and +1σ). Owing to the short notice before EPS, it may be desirable to leave the minimum power at 16% for the short term. This should depend on whether groups feel they need more time to shift from 16% to 50%; in practice this step should not take any more time, and in some cases will save time.

  7. Upper limits for Gaussian problem. [Figure not preserved in the transcript: upper limits as a function of the measurement for the Gaussian problem, with annotations “(unknown) true value” and “measurement”.]
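Since the figure itself is not preserved, the following sketch reconstructs what such a plot would contain, under assumed values σ = 1 and 95% CL:

```python
# Our numerical reconstruction of the figure's content: 95% CL upper
# limits on mu versus the measured x for x ~ Gauss(mu, sigma), sigma = 1
# (assumed values; the original axes are not preserved).
from scipy.stats import norm

alpha, sigma = 0.05, 1.0

def limits(x):
    uncon = x + sigma * norm.ppf(1 - alpha)                        # classical one-sided
    cls   = x + sigma * norm.ppf(1 - alpha * norm.cdf(x / sigma))  # CLs
    pcl16 = max(uncon, sigma * (norm.ppf(1 - alpha) + norm.ppf(0.16)))
    pcl50 = max(uncon, sigma * (norm.ppf(1 - alpha) + norm.ppf(0.50)))
    return uncon, cls, pcl16, pcl50

print("   x   uncon    CLs  PCL16  PCL50")
for x in [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0]:
    print("{:4.1f} {:7.2f} {:6.2f} {:6.2f} {:6.2f}".format(x, *limits(x)))
# For x << 0 the classical limit goes negative, CLs flattens out but stays
# positive, and PCL clips at its constraint value; as x increases the CLs
# limit approaches the classical one and the PCL constraint becomes inactive.
```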

  8. Conclusions. We recommend using PCL with a minimum power of 50% as the primary result. For the short term, we also support reporting CLs limits to allow for comparison with CMS. In the longer term, the Bayesian approach appears to have common support in both ATLAS and CMS; this will take some time to implement for many analyses, though for others it is already available. Search analyses should also report the discovery significance (the p-value of the background-only hypothesis).

  9. Extra material (repeated from the 23 June talk).

  10. ATLAS/CMS discussions on one-sided limits. Some prefer to report one-sided frequentist upper limits (CLs, PCL); others prefer unified (Feldman-Cousins) limits, where the lower edge may or may not exclude zero. The prevailing view in the ATLAS Statistics Forum has been that in searches for new phenomena, one wants to know whether a cross section is excluded because its predicted rate is too high relative to the observation, not excluded on some other grounds (e.g., a mixture of too high or too low). Among statisticians there is support for both approaches.

  11. Discussions concerning flip-flopping. One-sided limits (CLs, PCL) can suffer from “flip-flopping”, i.e., a violation of the coverage probability if one decides, based on the data, whether to report an upper limit or a measurement with error bars (a two-sided interval). This can be avoided by “always” reporting both: (1) an upper limit based on a one-sided test, and (2) the discovery significance (equivalent to the p-value of the background-only hypothesis), as in the sketch below. In practice, “always” can mean “for every analysis carried out as a search”, i.e., until the existence of the process is well established (e.g., at 5σ). That is, we only require that what is done in practice maps approximately onto the idealized infinite ensemble.
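A minimal sketch of this “report both” prescription for the Gaussian problem (our illustration; the function name and input values are hypothetical):

```python
# Sketch of the "always report both" prescription for x ~ Gauss(mu, sigma);
# function name and numbers are illustrative.
from scipy.stats import norm

def report(x, sigma=1.0, cl=0.95):
    mu_up = x + sigma * norm.ppf(cl)  # one-sided upper limit on mu
    p0 = norm.sf(x / sigma)           # p-value of the background-only model
    z = norm.isf(p0)                  # equivalent significance Z
    return mu_up, p0, z

mu_up, p0, z = report(x=1.0)
print(f"upper limit = {mu_up:.2f}, p0 = {p0:.3f}, Z = {z:.2f} sigma")
# Since both numbers are reported for every search, no data-dependent
# choice between "limit" and "measurement" is ever made.
```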

  12. Discussions on CLs and F-C. CLs has been criticized because, as a method for preventing spurious exclusion, it leads to significant overcoverage that is in practice not communicated to the reader; this was the motivation behind PCL (a toy illustration of the overcoverage follows). We have also not supported using the upper edge of a Feldman-Cousins interval as a substitute for a one-sided upper limit, since when used in this way F-C has lower power. Furthermore, F-C unified intervals protect against small (or null) intervals by counting the probability of upward data fluctuations, which are not relevant if the goal is to establish an upper limit.
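The overcoverage of CLs is easy to demonstrate with toys; the following sketch (ours, assuming the simple Gaussian problem with σ = 1) estimates the coverage probability of the 95% CLs upper limit for several true values of μ:

```python
# Toy demonstration (ours) of CLs overcoverage for x ~ Gauss(mu, sigma = 1):
# the 95% CLs upper limit covers the true mu more than 95% of the time,
# markedly so for small mu.
import numpy as np
from scipy.stats import norm

def cls_upper_limit(x, sigma=1.0, alpha=0.05):
    # Solves CLs(mu) = Phi((x - mu)/sigma) / Phi(x/sigma) = alpha for mu.
    return x + sigma * norm.ppf(1 - alpha * norm.cdf(x / sigma))

rng = np.random.default_rng(7)
for mu_true in [0.0, 0.5, 1.0, 2.0, 4.0]:
    x = rng.normal(mu_true, 1.0, size=200000)
    cov = np.mean(cls_upper_limit(x) >= mu_true)
    print(f"mu_true = {mu_true:3.1f}: coverage = {cov:.3f}")  # ~1.000 at mu = 0
```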

  13. Discussions concerning PCL. PCL has been criticized because it does not obviously map onto a Bayesian result for some choice of prior (CLs equals the Bayesian limit in special cases, e.g., x ~ Gauss(μ, σ) with a constant prior for μ ≥ 0; a numerical check of this correspondence is sketched below). We are not convinced of the need for this: the frequentist properties of PCL are well defined, and as with all frequentist limits one should not interpret them as representing Bayesian credible intervals. A further criticism of PCL relates to the unconstrained limit, which could exclude all values of μ; a remnant of this problem could survive after application of the power constraint (cf. “negatively biased relevant subsets”). PCL does not have negatively biased relevant subsets (nor does our unconstrained limit, as it never excludes μ = 0). On both points the debate is still ongoing.
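The special case quoted above can be verified numerically; this sketch (ours) compares the CLs limit with the Bayesian upper credible limit from a flat prior on μ ≥ 0 for the Gaussian problem:

```python
# Numerical check (ours) of the special case quoted above: for
# x ~ Gauss(mu, sigma) with a flat prior on mu >= 0, the Bayesian upper
# credible limit coincides with the CLs upper limit.
from scipy.stats import norm
from scipy.optimize import brentq

def cls_limit(x, sigma=1.0, alpha=0.05):
    return x + sigma * norm.ppf(1 - alpha * norm.cdf(x / sigma))

def bayes_limit(x, sigma=1.0, alpha=0.05):
    # Posterior is a Gaussian truncated at mu = 0; solve the tail
    # condition P(mu > mu_up | x) = alpha for mu_up.
    tail = lambda mu: norm.sf((mu - x) / sigma) / norm.cdf(x / sigma) - alpha
    return brentq(tail, 0.0, x + 10.0 * sigma)

for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    print(f"x = {x:4.1f}: CLs = {cls_limit(x):.4f}, Bayes = {bayes_limit(x):.4f}")
```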
