Learn about PANDA for Athena-based analysis and PROOF for ROOT-based analysis, including deployment, setup steps, and troubleshooting resources.
A full demonstration based on a “real” analysis scenario
Tadashi Maeno (BNL)
Distributed Analysis in the US
• PANDA is the distributed processing system for distributed analysis
  • Designed and developed to support analysis as well as production
• pathena delivers analysis jobs to PANDA
  • A command-line tool with an easy interface familiar to Athena users
  • Athena-based analysis
• ATLAS has a two-stage analysis model, i.e., Athena and ROOT
• Recently PROOF has been investigated and deployed as a distributed analysis tool for semi-interactive ROOT-based analysis
• PANDA/xrootd integration supports easy downstream analysis using PROOF on PANDA/pathena outputs
PANDA/xrootd/PROOF Integration
[Diagram. Labels: Panda, Job, SE (CASTOR/dCache), AOD, DPD, Direct Writing or DQ2 Subscription, xrootd, PROOF, End-user]
Analysis using PANDA/pathena (1/2)
• Athena-based analysis
  • Panda can run batch-style ROOT jobs, but there is no plan to support this officially; PROOF is the tool for ROOT-based analysis
• Support all sorts of Athena job types
  • All production steps (evgen, simul, pileup, digi, reco, merge, analysis)
  • Arbitrary package configuration
    • Add new packages
    • Modify cmt/requirements in any package
  • Multiple-input streams
    • E.g., Signal + Minimum-bias
  • TAG/AANT-based analysis
  • Back-Navigation
  • Production releases, AtlasPoint1, nightlies (dev/bugfix), pcache nightlies
  • Full support for Configurable
  • ByteStream Reading/Writing
Analysis using PANDA/pathena (2/2)
• Requirements for pathena
  • Athena
    • Any release version
    • AFS for BNL/CERN
    • Kit for other sites
  • GRID UI
    • OSG, gLite, NG UI
  • Join the ATLAS VO
• No exotic dependencies
  • US and non-US users can submit jobs even from a laptop
What happens when submitting a job
[Diagram: the user submits buildJob x 1 and runAthena x N to PANDA. Labels: source.tgz, Storage, buildJob, libraries.tgz, DDM, runAthena (inputs, outputs), output dataset]
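The "buildJob x 1, runAthena x N" split in the diagram can be illustrated with a small Python model. This is only a sketch under assumptions: it is not Panda's code, and the function name `plan_submission`, the `files_per_job` parameter, and the output file names are hypothetical. It assumes that the single buildJob turns source.tgz into libraries.tgz and that each runAthena job takes libraries.tgz plus a slice of the input files.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Job:
    kind: str                      # "buildJob" or "runAthena"
    inputs: List[str]
    outputs: List[str] = field(default_factory=list)


def plan_submission(input_files: Sequence[str],
                    files_per_job: int) -> Tuple[Job, List[Job]]:
    """Return one buildJob and N runAthena jobs (illustrative model only)."""
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")

    # One buildJob: source.tgz in, libraries.tgz out.
    build = Job("buildJob", ["source.tgz"], ["libraries.tgz"])

    # N runAthena jobs: each reads libraries.tgz plus its slice of inputs.
    runs: List[Job] = []
    for start in range(0, len(input_files), files_per_job):
        chunk = list(input_files[start:start + files_per_job])
        runs.append(Job("runAthena",
                        ["libraries.tgz"] + chunk,
                        [f"output_{len(runs)}.root"]))  # hypothetical name
    return build, runs
```

With five input files and two files per job, this model yields one buildJob and three runAthena jobs, the last of which reads a single input file.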
Analysis Sites
• BNL is the primary analysis site
• AGLT2, UTA, OU, LYON, TRIUMF are also in use
• Deploying at US/FR/CA T2s
• Testing in other clouds has lower priority unless someone from the cloud explicitly helps
“pathena --site=AUTO” automatically sends jobs to the site that holds the largest share of the input dataset, so users do not have to care that Panda works across many sites
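The idea behind site=AUTO, sending jobs to the site that holds the most of the dataset, can be sketched in a few lines of Python. This is an illustrative model rather than Panda's brokerage code: the function `choose_site`, the shape of the `replicas` mapping, and the file counts in the usage note are all assumptions.

```python
from typing import Dict


def choose_site(replicas: Dict[str, int]) -> str:
    """Pick the site holding the most files of the input dataset.

    `replicas` maps a site name to the number of dataset files it holds
    (hypothetical input format). Ties are broken alphabetically so the
    result is deterministic.
    """
    if not replicas:
        raise ValueError("no site holds any part of the dataset")
    # max() returns the first maximal element, so sorting the site names
    # first makes the alphabetically smallest name win a tie.
    return max(sorted(replicas), key=lambda site: replicas[site])
```

For example, `choose_site({"BNL": 120, "AGLT2": 80})` selects "BNL" because it holds more files in this made-up replica map.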
Example with pathena (1/7) • Set up Athena and CVS
Example with pathena (2/7) • Check out PandaTools from CVS
Example with pathena (3/7) • Make PandaTools
Example with pathena (4/7) • Run athena to make sure the jobO works
Example with pathena (5/7) • Submit the job using pathena
Example with pathena (6/7) • Check the job status using Panda Monitor
Example with pathena (7/7) • An e-mail notification should arrive
Troubleshooting
• Panda Monitor
  • Error dialog
  • Log file https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda#6_How_to_debug_jobs_when_they_fa
• FAQ in Wiki https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda#FAQ
• HyperNews (HN) https://hypernews.cern.ch/HyperNews/Atlas/get/pandaPathena.html
  • Operational problems such as site outages should be reported in HN
• Savannah for bug reports https://savannah.cern.ch/bugs/?func=additem&group=panda
Writing pathena outputs to xrootd
• By default, pathena/PANDA writes outputs to the same SE as the input
• “pathena --destSE” allows users to write outputs to a different SE
  • --destSE BNL_XRD: direct writing to BNL xrootd instead of dCache
  • --destSE SLACXRD: transfer via DQ2 subscription to SLAC
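As a rough illustration of how these options combine on the command line, the following Python helper assembles a pathena argument list without running it. The --site and --destSE options come from the slides; --inDS and --outDS are assumed here as pathena's input and output dataset options, and the helper name `build_pathena_command`, the job options file, and the dataset names are hypothetical placeholders.

```python
from typing import List, Optional


def build_pathena_command(job_options: str,
                          in_ds: str,
                          out_ds: str,
                          site: Optional[str] = None,
                          dest_se: Optional[str] = None) -> List[str]:
    """Assemble (but do not execute) a pathena command line."""
    # --inDS/--outDS are assumed option names, not taken from the slides.
    cmd = ["pathena", job_options, "--inDS", in_ds, "--outDS", out_ds]
    if site is not None:
        # e.g. site="AUTO" to let Panda choose the site
        cmd.append(f"--site={site}")
    if dest_se is not None:
        # Without --destSE, outputs go to the same SE as the input.
        cmd += ["--destSE", dest_se]
    return cmd
```

Joining the result of `build_pathena_command("jobO.py", "some.input.dataset", "user.some.output", dest_se="BNL_XRD")` with spaces gives a command that, under the assumptions above, would request direct writing to BNL xrootd.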
Analysis using PROOF
• ATLAS has a two-stage analysis model
  • The first stage occurs in Athena and the second stage in ROOT
• pathena is designed as a tool for Athena-based analysis
  • A complementary service/tool is needed for ROOT-based analysis
• Requirements for ROOT-based analysis
  • Parallel processing, interactivity and fast turnaround
• PROOF has been investigated at GLOW-ATLAS and BNL
  • Good results so far
Example with PROOF (1/4) • Set up ROOT
Example with PROOF (2/4) • Run ROOT
Example with PROOF (3/4) • Start a PROOF session and access DQ2
Example with PROOF (4/4) • Plot a variable using Draw()
TSelector and PyROOT
• TSelector for more complicated analysis
[Screenshot: Example with TSelector]
• PyROOT in PROOF
  • One has to write a wrapper for TSelector to hook Python functions
  • ROOT will provide something to generate the wrapper automatically
[Screenshot: Using PyROOT in PROOF]
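The wrapper idea, hooking user-written Python functions into a selector's begin/process/terminate cycle, can be sketched without ROOT. The class below is a plain-Python stand-in and not ROOT's TSelector API: the class name `PySelectorWrapper`, the hook signatures, and the sample entries in the usage note are hypothetical, and the sketch only assumes a selector lifecycle of one setup call, one call per entry, and one final call.

```python
from typing import Any, Callable, Dict, Iterable

State = Dict[str, Any]


class PySelectorWrapper:
    """Plain-Python stand-in for a TSelector-style wrapper (not ROOT code)."""

    def __init__(self,
                 begin: Callable[[], State],
                 process: Callable[[State, Dict[str, float]], None],
                 terminate: Callable[[State], Any]) -> None:
        # The three hooks are ordinary Python functions supplied by the user.
        self._begin = begin
        self._process = process
        self._terminate = terminate

    def run(self, entries: Iterable[Dict[str, float]]) -> Any:
        state = self._begin()               # one-time setup
        for entry in entries:
            self._process(state, entry)     # called once per entry
        return self._terminate(state)       # final result


def count_begin() -> State:
    return {"selected": 0}


def count_process(state: State, entry: Dict[str, float]) -> None:
    # Hypothetical cut: keep entries with pt above 10.
    if entry["pt"] > 10.0:
        state["selected"] += 1


def count_terminate(state: State) -> int:
    return state["selected"]
```

A selector built as `PySelectorWrapper(count_begin, count_process, count_terminate)` and run over a list of entry dictionaries returns the number of entries passing the cut. A real PROOF wrapper would additionally have to handle per-worker setup and the merging of results, which this sketch leaves out.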
Conclusions
• Distributed Analysis in the US follows the two-stage analysis model for ATLAS
  • PANDA/pathena for Athena-based analysis
  • PROOF for ROOT-based analysis
• PANDA, xrootd and PROOF are integrated to provide a seamless analysis sequence