
A full demonstration based on a “real” analysis scenario

Tadashi Maeno (BNL). Distributed Analysis in the US: PANDA is the distributed processing system for distributed analysis, designed and developed to support analysis as well as production. pathena delivers analysis jobs to PANDA.


Presentation Transcript


  1. A full demonstration based on a “real” analysis scenario Tadashi Maeno (BNL)

  2. Distributed Analysis in the US
     • PANDA is the distributed processing system for distributed analysis
       • Designed and developed to support analysis as well as production
     • pathena delivers analysis jobs to PANDA
       • A command-line tool with an easy interface familiar to Athena users
       • Athena-based analysis
     • ATLAS has a two-stage analysis model, i.e., Athena and ROOT
     • Recently PROOF has been investigated and deployed as a distributed analysis tool for semi-interactive ROOT-based analysis
     • PANDA/xrootd integration supports easy downstream analysis using PROOF on PANDA/pathena outputs

  3. PANDA/xrootd/PROOF Integration
     [Diagram labels: Panda, Job, SE (CASTOR/dCache), AOD, DPD, Direct Writing or DQ2 Subscription, xrootd, PROOF, End-user]

  4. Analysis using PANDA/pathena (1/2)
     • Athena-based analysis
       • Panda can run batch-style ROOT jobs, but there is no plan to support them officially → PROOF for ROOT-based analysis
     • Supports all sorts of Athena job types
       • All production steps (evgen, simul, pileup, digi, reco, merge, analysis)
       • Arbitrary package configuration
         • Add new packages
         • Modify cmt/requirements in any package
       • Multiple input streams
         • E.g., Signal + Minimum-bias
       • TAG/AANT-based analysis
       • Back-Navigation
       • Production releases, AtlasPoint1, nightlies (dev/bugfix), pcache nightlies
       • Full support for Configurable
       • ByteStream Reading/Writing

  5. Analysis using PANDA/pathena (2/2)
     • Requirements for pathena
       • Athena
         • Any release version
         • AFS for BNL/CERN
         • Kit for other sites
       • GRID UI
         • OSG, gLite, NG UI
       • Join ATLAS VO
     • No exotic dependencies
     • US and non-US users can submit jobs even from a laptop

  6. What happens when submitting a job
     • User submission → buildJob x 1, runAthena x N
     [Diagram labels: User, PANDA, source.tgz, Storage, buildJob, libraries.tgz, DDM, runAthena (inputs, outputs), output dataset]
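
The flow on this slide (one buildJob followed by N runAthena jobs) can be sketched in plain Python. This is a toy model, not Panda code: the helper `plan_submission`, the `files_per_job` parameter, and the input and output file names are assumptions made for illustration. Only `source.tgz`, `libraries.tgz`, `buildJob`, and `runAthena` come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    kind: str  # "buildJob" or "runAthena"
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


def plan_submission(input_files: List[str], files_per_job: int) -> List[Job]:
    """Return one buildJob plus N runAthena jobs (hypothetical helper)."""
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    # The single buildJob turns the user's source.tgz into libraries.tgz.
    jobs = [Job("buildJob", ["source.tgz"], ["libraries.tgz"])]
    # Each runAthena job takes libraries.tgz plus its own slice of the inputs.
    for start in range(0, len(input_files), files_per_job):
        chunk = input_files[start:start + files_per_job]
        index = start // files_per_job
        jobs.append(
            Job("runAthena", ["libraries.tgz"] + chunk, [f"output_{index}.root"])
        )
    return jobs


if __name__ == "__main__":
    files = [f"AOD_{i}.pool.root" for i in range(5)]  # made-up file names
    for job in plan_submission(files, files_per_job=2):
        print(job.kind, job.inputs, "->", job.outputs)
```

In this model the build step runs once and every runAthena job reuses its output, which is the point of the "x 1" and "x N" labels on the slide.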

  7. Analysis Sites
     • BNL is the primary analysis site
     • AGLT2, UTA, OU, LYON, TRIUMF are also in use
     • Deploying at US/FR/CA T2s
     • Testing in other clouds has lower priority unless someone from the cloud explicitly helps
     • “pathena --site=AUTO” automatically sends jobs to the site that holds the largest share of the dataset → users do not need to know which of the many Panda sites runs their jobs
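
The site choice described for `--site=AUTO` amounts to picking the site that already holds the most files of the input dataset. The sketch below shows that idea only; `choose_site`, the replica map, and the tie-breaking rule are assumptions, and the real brokerage inside Panda may weigh other factors.

```python
from typing import Dict, Set


def choose_site(dataset_files: Set[str], site_replicas: Dict[str, Set[str]]) -> str:
    """Pick the site holding the most files of the dataset (hypothetical helper)."""
    if not site_replicas:
        raise ValueError("no candidate sites")
    # Count, per site, how many of the dataset's files are already there.
    coverage = {
        site: len(dataset_files & files) for site, files in site_replicas.items()
    }
    # Iterate over sorted site names so ties resolve to the alphabetically first site.
    return max(sorted(coverage), key=coverage.get)


if __name__ == "__main__":
    dataset = {"a.root", "b.root", "c.root"}  # made-up file names
    replicas = {
        "BNL": {"a.root", "b.root", "c.root"},
        "AGLT2": {"a.root"},
        "UTA": {"a.root", "b.root"},
    }
    print(choose_site(dataset, replicas))
```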

  8. Example with pathena (1/7) • Set up Athena and CVS

  9. Example with pathena (2/7) • Check-out PandaTools from CVS

  10. Example with pathena (3/7) • Make PandaTools

  11. Example with pathena (4/7) • Run athena to make sure jobO works

  12. Example with pathena (5/7) • Submit the job using pathena

  13. Example with pathena (6/7) • Check job status using Panda Monitor

  14. Example with pathena (7/7) • An e-mail notification should arrive

  15. Troubleshooting
     • Panda Monitor
       • Error dialog
       • Log file: https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda#6_How_to_debug_jobs_when_they_fa
     • FAQ in Wiki: https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda#FAQ
     • HyperNews
       • Operational problems such as site outages should be reported in HyperNews: https://hypernews.cern.ch/HyperNews/Atlas/get/pandaPathena.html
     • Savannah for bug reports: https://savannah.cern.ch/bugs/?func=additem&group=panda

  16. Writing pathena outputs to xrootd
     • pathena/PANDA writes outputs to the same SE as the input by default
     • “pathena --destSE” allows users to write outputs to a different SE
       • --destSE BNL_XRD: direct writing to BNL xrootd instead of dCache
       • --destSE SLACXRD: transfer via DQ2 subscription to SLAC
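
As a rough illustration of how `--destSE` fits into a submission, the sketch below assembles a pathena argument list without executing it. The helper `build_pathena_command`, the job options file name, and the dataset names are hypothetical. `--destSE` is taken from the slide; `--inDS` and `--outDS` are assumed from common pathena usage and should be checked against `pathena --help` for the installed version.

```python
import shlex
from typing import List, Optional


def build_pathena_command(
    job_options: str,
    in_ds: str,
    out_ds: str,
    dest_se: Optional[str] = None,
) -> List[str]:
    """Assemble a pathena argument list (hypothetical helper; nothing is run)."""
    # --inDS / --outDS are assumed option names, not taken from the slides.
    cmd = ["pathena", job_options, "--inDS", in_ds, "--outDS", out_ds]
    if dest_se is not None:
        # --destSE is the option named on the slide; without it the outputs
        # go to the same SE as the input.
        cmd += ["--destSE", dest_se]
    return cmd


if __name__ == "__main__":
    # All names below are placeholders.
    cmd = build_pathena_command(
        "MyAnalysis_jobOptions.py",
        "some.input.dataset.AOD",
        "user.SomeUser.test.output",
        dest_se="BNL_XRD",
    )
    print(shlex.join(cmd))
```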

  17. Analysis using PROOF
     • ATLAS has a two-stage analysis model
       • The first stage occurs in Athena and the second stage in ROOT
     • pathena is designed as a tool for Athena-based analysis
       • A complementary service/tool is needed for ROOT-based analysis
     • Requirements for ROOT-based analysis
       • Parallel processing, interactivity and fast turnaround
     • PROOF has been investigated at GLOW-ATLAS and BNL
       • Good results so far

  18. Example with PROOF (1/4) • Set up ROOT

  19. Example with PROOF (2/4) • Run ROOT

  20. Example with PROOF (3/4) • Start a PROOF session and access DQ2

  21. Example with PROOF (4/4) • Plot a variable using Draw()

  22. TSelector and PyROOT
     • TSelector for more complicated analysis
       [Screenshot: Example with TSelector]
     • PyROOT in PROOF
       • One has to write a wrapper for TSelector to hook Python functions
       • ROOT will provide something to generate the wrapper automatically
       [Screenshot: Using PyROOT in PROOF]
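
A TSelector-style analysis runs a per-entry processing step on each worker and merges the partial results at the end. The sketch below imitates that pattern in plain Python so that it runs without ROOT. It is an analogy, not the ROOT or PROOF API: the class `CountingSelector`, the helper `run_on_workers`, and the `pt` field are illustrative assumptions, and the workers are emulated sequentially rather than in parallel.

```python
from collections import Counter
from typing import Dict, List


class CountingSelector:
    """Toy selector: histograms the integer part of each entry's 'pt'."""

    def begin(self) -> None:
        # Called once per worker before any entry is processed.
        self.hist: Counter = Counter()

    def process(self, entry: Dict[str, float]) -> None:
        # Called once per entry; fills the worker-local histogram.
        self.hist[int(entry["pt"])] += 1

    def terminate(self) -> Counter:
        # Called once per worker; hands back the partial result.
        return self.hist


def run_on_workers(entries: List[Dict[str, float]], n_workers: int) -> Counter:
    """Split entries across n_workers selectors and merge their histograms."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    merged: Counter = Counter()
    for w in range(n_workers):
        selector = CountingSelector()
        selector.begin()
        # Worker w handles every n_workers-th entry, starting at index w.
        for entry in entries[w::n_workers]:
            selector.process(entry)
        merged += selector.terminate()
    return merged


if __name__ == "__main__":
    sample = [{"pt": 1.2}, {"pt": 1.9}, {"pt": 3.5}, {"pt": 3.1}, {"pt": 7.0}]
    print(run_on_workers(sample, n_workers=2))
```

Because the partial histograms are simply added, the merged result does not depend on how many workers share the entries.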

  23. Conclusions
     • Distributed Analysis in the US follows the two-stage analysis model for ATLAS
       • PANDA/pathena for Athena-based analysis
       • PROOF for ROOT-based analysis
     • PANDA/xrootd/PROOF are integrated for a seamless analysis sequence
