CALICE Software Review: Introduction
Paul Dauncey
Some background to the review
• CALICE has made, or is planning, several “physics prototype” calorimeters
  • Silicon-tungsten (Si-W) electromagnetic calorimeter (ECAL)
  • Scintillator-tungsten (Sc-W) ECAL
  • Scintillating tile analogue hadron calorimeter (AHCAL)
  • RPC digital hadron calorimeters (DHCAL)
  • Scintillating strip tail catcher and muon tagger (TCMT)
• These have been tested in beams over the last two years
  • The first run was at DESY in spring 2006 with just the Si-W ECAL
  • The AHCAL and TCMT were added in summer 2006 for the CERN runs
  • These systems were extended to be more complete for the 2007 CERN runs
  • The Sc-W ECAL ran at DESY early in 2007
• It is assumed that we will run at FNAL in 2008/9, comparing the ECALs, and the AHCAL and DHCAL, all run with the TCMT
• In the second half of 2009, there are likely to be further beam tests involving ILC-like “technical prototype” calorimeters
• The analysis of all these data is likely to go on beyond the end of the decade
The need for an analysis model
• Initially, when only ECAL data were available, the central software was mainly written by a small group (essentially Roman and Götz Gaycken) working closely together
  • This allowed an informal model to be created through discussion between them
  • There was no particular need to document it
• Since the AHCAL and TCMT became involved and the analyses of the data became much more sophisticated, the number of people involved has increased significantly, by at least an order of magnitude
• This has led to an increasing proportion of Roman’s effort being taken up handling individual requests for information, checking code submissions, processing data runs, etc, effectively saturating his time
Some definitions
• “Central ILC code” = LCIO, LCCD, Marlin, Mokka/GEANT4, etc; code written outside of the collaboration
  • Not specific to CALICE; requires requests to other developers for changes
  • Can lead to significant time from identification of need to implementation
• “Reconstruction” = process of producing the reco files from the raw data files
  • In bulk, usually done centrally by expert(s)
  • Experts or semi-experts contribute code
  • Some user studies are done on raw data (e.g. calibrations); those are also considered to be reconstruction for this review, as the results are used for reconstruction
• “Digitisation” = conversion of SimXxxHits in Mokka files to “something” which can be used by reconstruction
  • Usage and comments as for reconstruction
  • Usually run as part of reconstruction jobs for MC events
• “Analysis” = studies done on reco files
  • Usually done by semi- or non-experts
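For orientation, user-level analysis code in this chain is typically packaged as a Marlin processor that reads collections from the LCIO reco files. The sketch below is a minimal, hypothetical example, not part of the actual CALICE reconstruction: the processor name, parameter name and collection name (ExampleHitSumProcessor, HitCollection, EmcCalorimeter_Hits) are illustrative, and it assumes the standard Marlin/LCIO headers from ilcsoft.

// Minimal sketch of a Marlin analysis processor summing CalorimeterHit
// energies from a reco file. All names are illustrative only.
#include "marlin/Processor.h"
#include "EVENT/LCEvent.h"
#include "EVENT/LCCollection.h"
#include "EVENT/CalorimeterHit.h"
#include <iostream>
#include <string>

class ExampleHitSumProcessor : public marlin::Processor {
public:
  virtual marlin::Processor* newProcessor() { return new ExampleHitSumProcessor; }

  ExampleHitSumProcessor() : marlin::Processor("ExampleHitSumProcessor") {
    // Collection name is steerable from the Marlin XML steering file
    registerProcessorParameter("HitCollection",
                               "Name of the CalorimeterHit collection to read",
                               _colName,
                               std::string("EmcCalorimeter_Hits"));
  }

  virtual void processEvent(EVENT::LCEvent* evt) {
    // getCollection throws if the collection is missing; a real processor
    // would guard against that
    EVENT::LCCollection* col = evt->getCollection(_colName);

    // Sum the calibrated hit energies in this event
    double eSum = 0.;
    for (int i = 0; i < col->getNumberOfElements(); ++i) {
      EVENT::CalorimeterHit* hit =
          dynamic_cast<EVENT::CalorimeterHit*>(col->getElementAt(i));
      if (hit) eSum += hit->getEnergy();
    }
    streamlog_out(MESSAGE) << "Event " << evt->getEventNumber()
                           << ": total energy " << eSum << std::endl;
  }

private:
  std::string _colName{};
};

// Global instance registers the processor with the Marlin framework
ExampleHitSumProcessor aExampleHitSumProcessor;

Such a processor would be run by listing it in a Marlin steering file together with the LCIO reco files to read; the same pattern applies whether the input is data or centrally produced MC.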
How we hoped to improve the structure
• Make the reconstruction much more idiot-proof
  • So idiots like me can run it without messing up...
  • This should allow reprocessings to be handled by other people
• Centralise the MC production, digitisation and reconstruction
  • Remove (most of) the need for user MC generation/reconstruction
  • Release standardised files for direct comparison with data
• Make access to (particularly conditions) data more streamlined
  • Different developers have implemented different solutions
  • Try to standardise so users only have to learn one access route
• Make the connection with the ILC detector studies more transparent
  • Use what we learn from the data, both within ILD and SiD
• An analysis questionnaire was circulated to the collaboration over the summer
  • To find out what is difficult, what works well, what is needed for the future, etc, so we could hope to improve these things
  • I hope we will have some discussion of the findings later
The Charge to the Review Committee
• The CALICE collaboration is studying calorimetry for ILC detectors. The collaboration has acquired a large dataset from calorimeter beam tests in 2006 and 2007 and expects to approximately double this during 2008. The total dataset so far is around 300M events, occupying 25 TBytes. The dataset has significant complexity, being taken at different locations with differing beam conditions, energies and detectors.
• The ILC detectors have been charged with producing Letters of Intent by Oct 2008, and initial Engineering Design Reports are expected by 2010. Hence, it is imperative that the collaboration extracts results from these data and publishes them in a timely manner. However, it is also expected that the final analyses of all the data will not be complete until three or four years from now.
• The main aim of the data analysis is fourfold. Firstly, it is to measure the performance of the prototype calorimeters used in the beam tests. Secondly, it is to compare Monte Carlo models with data so as to measure the degree of accuracy of the models. Thirdly, it is to apply the knowledge gained so as to optimise the ILC detector calorimeters with a verified, realistic and trustworthy simulation. Fourthly, it is to develop calorimeter jet reconstruction algorithms and test them on real data as well as simulation.
The Charge (cont.)
• A significant offline software structure has already been put together to accomplish these aims, built on a previously determined conceptual model. The purpose of the review is to examine the implementation of this structure and comment on whether it does (or can in future) meet the aims of the collaboration. Some important points are:
  • If missing or ineffective areas can be identified, the review should suggest possible solutions or alternatives.
  • Recommendations to streamline the reconstruction, simulation or analysis of the data, to save effort or time, should be made.
  • The review should examine how well suited the structure is for the connection to the longer term detector studies and the development of jet reconstruction algorithms.
  • Comments on whether the organisational structure is appropriate would be useful.
• There are limited numbers of people involved in the collaboration, and so any recommendations from the review need to be made with this in mind. In particular, some aspects of the software structure, such as the use of general ILC software, are probably too widely used to be realistically changed at this point. However, as a major user of the central ILC software, our experience should be useful to help improve it. If the review identifies constraints or bottlenecks arising from the use of this central software, comments on these would be very welcome.
The Review Committee
• The members of the Review Committee today are:
  • David Bailey (Manchester)
  • Günter Eckerlin (DESY)
  • Steve Magill (ANL)
  • George Mavromanolakis (Cambridge/FNAL)
  • Vishnu Zutshi (NIU)
• I will act as organiser, provocateur and review secretary
• The process will be:
  • Go through the various aspects of the analysis model
  • Meet to discuss the views and recommendations of the committee
  • Present some preliminary feedback at the end of the afternoon
  • Draft a written report over the next few weeks
  • Present this to the Technical Board in early 2008