ATLAS Computing Status, early results and issues

ATLAS Computing Status, early results and issues. James Shank, April 2010. Topics: 2010 collision startup; ATLAS Computing model and resource estimates; highlights of the recent ATLAS Distributed Computing meeting at BNL; computing issues in the near future.

Presentation Transcript


  1. ATLAS Computing Status, early results and issues James Shank April, 2010 J. Shank DOSAR April 2010, Johannesburg

  2. 2010 Collision startup • ATLAS Computing model and resource estimates • Highlights of the recent ATLAS Distributed Computing meeting at BNL • Computing issues in the near future

  10. The ATLAS Computing Model • Latest round of revision driven by the Computing Resources Scrutiny Group (CRSG) • A committee of the LHCC, reporting to CERN management • Much pressure from some funding agencies to reduce computing resource pledges since the LHC is delayed • This forced us to revise our data replication policy • No full AOD at T1s in initial distribution • Can be revised by “thermodynamic” data distribution • Discussion still ongoing; will culminate in the Resource Review Board meeting

  11. The ATLAS Computing Model (2)

  12. PanDA Usage for User Analysis jobs

  13. PanDA Usage for User Analysis jobs (2)

  14. PanDA Usage for User Analysis jobs (3)

  15. ATLAS Distributed Computing • Workshop held last week at BNL • http://indico.cern.ch/conferenceDisplay.py?confId=84669 • Some highlights in the following slides…

  17. Overview • PanDA-based production system is ~3 years old • Core system has remained unchanged for past ~2 years • Many bug fixes, feature requests, and tuning • Some recent major changes • PanDA: moved servers to CERN, migrated to Oracle, pilot changes • AKTR: bulk task submission, job error management • Monitoring: DaTRI • Info system: AGIS (Kaushik De)

  18. Goals for 2010-11 • Entering LHC operations period • Stability of production system is very important • Changes should be operations-driven: bug fixes, tuning… • Production system support team • We expect increased support load with LHC data • Alas, support team is shrinking • Also, fewer experts with deep knowledge of system • Need more automation to keep functioning smoothly • Need better documentation of procedures, errors, checklists • Some big software updates still needed • Motivation: new feature requests, better automation • These changes must be tested outside the running production system

  19. Somewhat Random Wishlist • Pilot Factory integration • gLExec implementation • SchedConfigDB automation • AGIS evolution • Error code database – improve task completion • Software installation integrated with PanDA • Documentation, documentation, documentation… • Automation, automation, automation… • Relaxing cloud constraint • Generic merge trf: automatic merge of _sub by PanDA • Implement shorter time limits for holding, starting… • Debug ‘tail’ of task completion • “Additional Production” implementation

  20. New ProdSys? • Do we need to rewrite ProdSys from scratch? • Eventually, all software systems become too old • Some components could benefit from a rewrite • Plan for major upgrade in 2012?

  21. DDM: current status and possible future plans Simone Campana on behalf of DDM team

  22. FAQs • Are there known limitations of the SW we are using now? • What SW products shall we use after 2011? • How will existing SW evolve? • Do we need a revolution in any of the existing products, or will it be evolution? • Do we need better integration of ADC SW with non-ADC SW? • What are our dependencies on grid SW? • For a long time our slogan was ‘Operation needs are driving SW development’ • How will we address increasing user requests? • What level of support do we expect from the Facilities? • How can we automate our SW?

  23. The DDM stack

  24. Storage Management • In response to concerns expressed by LHC experiments • “The LHC experiment managements have expressed concern over the performance and scalability of access to data, particularly for their analysis use cases.” • “…focused on setting the scope and goals for work that would address these issues with a tentative timescale of 2013 for large scale use.” • “user access to data and the resulting system should hide the details of the back end mass storage systems and their implementations.” • Work Areas: • Data Archives and Storage Cloud • Data Access Layer • Output Datasets • Global home directory facilities • Catalogues • Authorization mechanisms • Workshop in early June for next steps…

  25. Conclusions • ATLAS Distributed Computing can survive early data-taking • Long runs in 2010 and 2011 will put a strain on many systems • Still many questions about the future • Evolution or Revolution?

  26. Other stuff…
