AliRoot survey: Analysis. P.Hristov 11/06/2013. Are you involved in analysis activities? (85.1% Yes, 14.9 % No). Involved since 4.5±2.4 years Dedicated time (51±27)%. What is your current level of experience with ALICE analysis tools?. Mean 3.4. User Profile. Types of analysis.
AliRoot survey: Analysis P. Hristov 11/06/2013
Are you involved in analysis activities? (85.1% Yes, 14.9% No) Involved for 4.5±2.4 years Dedicated time: (51±27)%
What is your current level of experience with ALICE analysis tools? Mean 3.4
How often do you run your analysis in the following way? (1 - rarely, 5 - very often) • Locally: 3.4 • CAF or other analysis facility: 2 • On a batch system: 2.7 • On GRID: 3.4 • In a LEGO train: 2.7
Quality of services (1 - lowest, 5 - highest): • ease of developing a new task: 3.5 • documentation of the analysis framework: 2.4 • documentation of services: 2.3 • AliEn handler functionality: 3.1 • ease of deploying an analysis and running on large data sets: 3.2 • analysis tutorials: 2.9 • existing example code or code of others: 4.2 • available analysis pages: 2.9 • analysis mailing list: 3.4
Time to complete analysis (hours) • Achieved: 22±19 • Expected: 10±10 • Is it fast enough?
Are you aware of the LEGO framework? Yes: 77.8% No: 6.4% No answer: 15.8%
Do you use the LEGO framework? Yes: 53% No: 42.9% No answer: 4.1%
Reasons why people do not use LEGO trains • Code under development / not stable / not ready for the LEGO trains • Missing documentation/instructions • No ESD trains / only AOD-centric analysis • Don't believe in AOD analysis and use a batch farm • So far because my needs could not be satisfied • For new analysis code, I run the analysis task in a 'let-me-start' way, i.e. the output of my jobs is a big tree. I want to check as many cut variables as possible.
Reasons why people do not use LEGO trains • Local batch farm performs better and is easier to use • Better debugging possibilities • Higher flexibility • Faster processing • Better control • Because most of what I do can be done off-grid, on (S)AF • My analysis activities mostly involve detector studies and are mostly done on my laptop or batch systems • Running on CAF; the LEGO train framework did not exist yet; testing of new analysis • Merging of large outputs, more flexibility needed • Frequent change of parameters and code
How useful do you consider the LEGO framework (1 - not useful, 5 - very useful)? Mean: 4.7
Do you need extra functionality from the analysis framework?
Required functionality • Easy possibility to write custom streams. • More details for basic selection. • Larger size of the buffer for event mixing? (Do not really know if this is easily doable.) • Mainly I need better documentation. I think it's all written, but the documentation directly linked to on the offline pages is all >3 years old. Several features are completely undocumented, or at least the documentation is so obscure as to effectively not exist. • Trigger information at analysis level. • Smarter merging. Now it only starts once all jobs are finished. Why not start merging when the first files are created, and then just do bookkeeping? Now we are in a situation where people do their own merging on local clusters, since that saves 2 days!
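The "smarter merging" request above (merge as soon as the first outputs exist, then just do bookkeeping) could be sketched as follows. This is only an illustration under simplifying assumptions, not AliRoot or AliEn code: histograms are modelled as plain bin-name-to-count mappings, and the names `IncrementalMerger`, `add_output` and `is_complete` are hypothetical.

```python
from collections import Counter


class IncrementalMerger:
    """Folds each subjob output into a running total as soon as it arrives."""

    def __init__(self, expected_jobs):
        self.expected = set(expected_jobs)  # bookkeeping: all jobs we wait for
        self.merged = set()                 # bookkeeping: jobs already folded in
        self.total = Counter()              # running merged "histogram"

    def add_output(self, job_id, histogram):
        """Merge one finished subjob; reject unknown or already merged jobs."""
        if job_id not in self.expected or job_id in self.merged:
            return False
        self.total.update(histogram)        # bin-by-bin addition of counts
        self.merged.add(job_id)
        return True

    def is_complete(self):
        """True once every expected subjob has been merged."""
        return self.merged == self.expected


# Outputs are merged in arrival order; no need to wait for the last subjob.
merger = IncrementalMerger(expected_jobs=["job1", "job2", "job3"])
merger.add_output("job2", {"bin0": 5, "bin1": 1})
merger.add_output("job1", {"bin0": 2, "bin2": 4})
print(dict(merger.total), merger.is_complete())
```

With this scheme the only work left after the last subjob finishes is one more addition, instead of a full merging pass over all outputs.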
Required functionality • Access from within AODs to general information useful for all analyses, like trigger configuration (to get scalers, downscaling, etc.) and LHC information (to get luminosity, etc.). Those things are mostly run-based and not event-based. • While running the merging (with a user job), a subjob can fail due to a corrupted/inaccessible file. This implies a significant loss of statistics that could be avoided if the error were automatically detected and the job resubmitted, skipping that specific file. • Embedding of real data and MC digits is missing. We run custom jobs which simulate events, read ESDs of real data, convert ESD objects to digits, merge data digits with MC digits, run reconstruction, produce a new ESD, then filter to AOD. Only after that can we use the analysis framework to analyze such embedded AODs.
Priorities for the analysis framework (1 - low priority, 5 - high priority) • better user support: 3.6 • more functionality: 2.8 • development of common analysis tools: 3.6 • better speed: 3.5 • better documentation and examples: 4.4 • better PROOF support: 3.0 • better LEGO support: 3.1
Conclusions • The most important issue is the documentation • Some of the requests show that not everybody is ready to use AODs in the analysis. This needs additional investigation
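The request about corrupted files during merging can be illustrated with a small sketch. It is not the framework's actual merging code: each subjob output is assumed to be a JSON text holding bin counts, and `merge_skipping_bad` is a hypothetical helper. The point is only that an unreadable input is recorded and skipped instead of failing the whole merge.

```python
import json
from collections import Counter


def merge_skipping_bad(inputs):
    """Merge per-subjob histograms, skipping inputs that cannot be read.

    `inputs` maps a (hypothetical) file name to its raw JSON text.
    Returns the merged histogram and the list of skipped file names.
    """
    total = Counter()
    skipped = []
    for name, raw in inputs.items():
        try:
            histogram = json.loads(raw)
            if not isinstance(histogram, dict):
                raise ValueError("not a histogram")
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            skipped.append(name)  # keep a record so the loss is visible
            continue
        total.update(histogram)   # bin-by-bin addition of counts
    return total, skipped


inputs = {
    "sub1.json": '{"bin0": 3, "bin1": 1}',
    "sub2.json": '{"bin0": 2, "bi',   # truncated, i.e. corrupted
    "sub3.json": '{"bin1": 4}',
}
merged, skipped = merge_skipping_bad(inputs)
print(dict(merged), skipped)
```

The list of skipped names could then drive a resubmission of only the affected inputs, so that the statistics from the healthy files are not lost.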