A full demonstration based on a “real” analysis scenario
Tadashi Maeno (BNL)
Distributed Analysis in the US
• PANDA is the distributed processing system for distributed analysis
  • Designed and developed to support analysis as well as production
• pathena delivers analysis jobs to PANDA
  • A command-line tool with an easy interface familiar to Athena users
  • Athena-based analysis
• ATLAS has a two-stage analysis model, i.e., Athena and ROOT
• Recently PROOF has been investigated and deployed as a distributed analysis tool for semi-interactive ROOT-based analysis
• PANDA/xrootd integration supports easy downstream analysis using PROOF on PANDA/pathena outputs
PANDA/xrootd/PROOF Integration
[Diagram. Labels: Panda job; SE (CASTOR/dCache); AOD; DPD; direct writing or DQ2 subscription; xrootd; PROOF; end user]
Analysis using PANDA/pathena (1/2)
• Athena-based analysis
  • Panda can run batch-style ROOT jobs, but there is no plan to support them officially; PROOF is the tool for ROOT-based analysis
• Supports all sorts of Athena job types
  • All production steps (evgen, simul, pileup, digi, reco, merge, analysis)
  • Arbitrary package configuration
    • Add new packages
    • Modify cmt/requirements in any package
  • Multiple input streams
    • E.g., signal + minimum-bias
  • TAG/AANT-based analysis
  • Back-navigation
  • Production releases, AtlasPoint1, nightlies (dev/bugfix), pcache nightlies
  • Full support for Configurables
  • ByteStream reading/writing
Analysis using PANDA/pathena (2/2)
• Requirements for pathena
  • Athena
    • Any release version
    • AFS for BNL/CERN
    • Kit for other sites
  • GRID UI
    • OSG, gLite, NG UI
  • Join the ATLAS VO
• No exotic dependencies
• US and non-US users can submit jobs even from a laptop
What happens when submitting a job
[Diagram: the user submits source.tgz to PANDA as buildJob x 1 and runAthena x N. buildJob produces libraries.tgz, which goes to storage; each runAthena job takes libraries.tgz and its inputs and writes outputs to the output dataset (DDM).]
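The job layout on this slide can be sketched in a few lines of Python. This is only an illustration of the one-buildJob-plus-N-runAthena structure, not PANDA or pathena code: the `Job` class, the `plan_submission` function, the `files_per_job` parameter, and all file names other than source.tgz and libraries.tgz are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    """One job in the sketch (hypothetical structure, not the PANDA job spec)."""
    kind: str  # "buildJob" or "runAthena"
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


def plan_submission(input_files: List[str], files_per_job: int) -> List[Job]:
    """Return one buildJob followed by N runAthena jobs.

    buildJob turns the user's source.tgz into libraries.tgz; every runAthena
    job takes libraries.tgz plus its own slice of the input files.
    """
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")

    jobs = [Job("buildJob", inputs=["source.tgz"], outputs=["libraries.tgz"])]
    for index, start in enumerate(range(0, len(input_files), files_per_job)):
        chunk = input_files[start:start + files_per_job]
        jobs.append(
            Job(
                "runAthena",
                inputs=["libraries.tgz"] + chunk,
                # Illustrative output name only (assumption of this sketch).
                outputs=[f"output_{index}.root"],
            )
        )
    return jobs


if __name__ == "__main__":
    files = [f"AOD_{i}.pool.root" for i in range(5)]  # hypothetical input names
    for job in plan_submission(files, files_per_job=2):
        print(job.kind, job.inputs, "->", job.outputs)
```

With five input files and two files per job, the sketch yields one buildJob and three runAthena jobs, all of which list libraries.tgz among their inputs.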
Analysis Sites
• BNL is the primary analysis site
• AGLT2, UTA, OU, LYON, TRIUMF are also in use
• Deploying at US/FR/CA T2s
• Testing in other clouds, with lower priority unless someone from the cloud explicitly helps
• “pathena --site=AUTO” automatically sends jobs to the site that holds the largest part of the dataset
• Users don’t have to care that Panda works at many sites
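The idea behind --site=AUTO, picking the site that holds most of the requested dataset, can be illustrated as follows. This is a minimal sketch under stated assumptions, not the brokerage code in PANDA: the function `choose_site`, the replica map, the tie-breaking rule, and the file names are all hypothetical; only the site names come from the slide.

```python
from typing import Dict, Iterable, Set


def choose_site(dataset_files: Iterable[str], site_replicas: Dict[str, Set[str]]) -> str:
    """Return the site holding the most files of the requested dataset.

    Ties are broken by alphabetical site name so the result is deterministic
    (an assumption of this sketch).
    """
    if not site_replicas:
        raise ValueError("no candidate sites")
    wanted = set(dataset_files)
    # Sort key: most matching files first, then site name.
    return min(site_replicas, key=lambda site: (-len(wanted & site_replicas[site]), site))


if __name__ == "__main__":
    replicas = {
        "BNL": {"f1", "f2", "f3"},
        "AGLT2": {"f1"},
        "LYON": {"f2", "f3"},
    }
    print(choose_site(["f1", "f2", "f3"], replicas))
```

In the usage above BNL holds all three requested files, so it is the site selected.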
Example with pathena
• (1/7) Set up Athena and CVS
• (2/7) Check out PandaTools from CVS
• (3/7) Make PandaTools
• (4/7) Run athena to make sure the jobO works
• (5/7) Submit the job using pathena
• (6/7) Check the job status using Panda Monitor
• (7/7) An e-mail notification should arrive
Troubleshooting
• Panda Monitor
  • Error dialog
  • Log file
  https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda#6_How_to_debug_jobs_when_they_fa
• FAQ in Wiki
  https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda#FAQ
• HyperNews
  • Operational problems such as site outages should be reported in HyperNews
  https://hypernews.cern.ch/HyperNews/Atlas/get/pandaPathena.html
• Savannah for bug reports
  https://savannah.cern.ch/bugs/?func=additem&group=panda
Writing pathena outputs to xrootd
• pathena/PANDA writes outputs to the same SE as the input by default
• “pathena --destSE” allows users to write outputs to a different SE
  • --destSE BNL_XRD: direct writing to BNL xrootd instead of dCache
  • --destSE SLACXRD: transfer via DQ2 subscription to SLAC
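The destination rule on this slide can be written down as a small lookup. The sketch below is illustrative only and is not pathena code: the function `resolve_output`, the mode strings, and the input SE name `BNL_DCACHE` are hypothetical, and the table covers only the two --destSE values named on the slide.

```python
from typing import Optional, Tuple

# Transfer mode for the two --destSE values named on the slide.
DEST_SE_MODES = {
    "BNL_XRD": "direct writing",
    "SLACXRD": "DQ2 subscription",
}


def resolve_output(input_se: str, dest_se: Optional[str] = None) -> Tuple[str, str]:
    """Return (storage element, transfer mode) for a job's outputs.

    Without a destination, outputs go to the same SE as the input.
    """
    if dest_se is None:
        return input_se, "default"
    if dest_se not in DEST_SE_MODES:
        # The sketch knows only the destinations listed on the slide.
        raise ValueError(f"destination not covered by this sketch: {dest_se}")
    return dest_se, DEST_SE_MODES[dest_se]


if __name__ == "__main__":
    print(resolve_output("BNL_DCACHE"))             # hypothetical input SE name
    print(resolve_output("BNL_DCACHE", "BNL_XRD"))
    print(resolve_output("BNL_DCACHE", "SLACXRD"))
```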
Analysis using PROOF
• ATLAS has a two-stage analysis model
  • The first stage occurs in Athena and the second stage in ROOT
• pathena is designed as a tool for Athena-based analysis
  • A complementary service/tool is needed for ROOT-based analysis
• Requirements for ROOT-based analysis
  • Parallel processing, interactivity and fast turnaround
• PROOF has been investigated at GLOW-ATLAS and BNL
  • Good results so far
Example with PROOF
• (1/4) Set up ROOT
• (2/4) Run ROOT
• (3/4) Start a PROOF session and access DQ2
• (4/4) Plot a variable using Draw()
TSelector and PyROOT
• TSelector for more complicated analysis
  [Figure: Example with TSelector]
• PyROOT in PROOF
  • One has to write a wrapper for TSelector to hook Python functions
  • ROOT will provide something to generate the wrapper automatically
  [Figure: Using PyROOT in PROOF]
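The wrapper idea, a selector object whose callbacks forward to user-supplied Python functions, can be illustrated without ROOT. The sketch below is a plain-Python stand-in and does not import or subclass anything from ROOT: the method names Begin, Process and Terminate mirror those of TSelector, while the class `SelectorWrapper`, the `run` driver, the `state` dictionary, and the example cut on `pt` are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable


class SelectorWrapper:
    """Plain-Python stand-in for a TSelector wrapper (does not use ROOT)."""

    def __init__(
        self,
        begin: Callable[[Dict[str, Any]], None],
        process: Callable[[Dict[str, Any], Dict[str, float]], None],
        terminate: Callable[[Dict[str, Any]], None],
    ) -> None:
        self._begin = begin
        self._process = process
        self._terminate = terminate
        self.state: Dict[str, Any] = {}  # shared between the hooked functions

    def Begin(self) -> None:
        self._begin(self.state)

    def Process(self, entry: Dict[str, float]) -> bool:
        self._process(self.state, entry)
        return True

    def Terminate(self) -> None:
        self._terminate(self.state)


def run(selector: SelectorWrapper, entries: Iterable[Dict[str, float]]) -> None:
    """Drive the selector the way an event loop would: begin, per entry, terminate."""
    selector.Begin()
    for entry in entries:
        selector.Process(entry)
    selector.Terminate()


# User-side Python functions hooked into the wrapper (hypothetical analysis).
def begin(state: Dict[str, Any]) -> None:
    state["selected"] = []


def process(state: Dict[str, Any], entry: Dict[str, float]) -> None:
    if entry["pt"] > 20.0:
        state["selected"].append(entry["pt"])


def terminate(state: Dict[str, Any]) -> None:
    state["n_selected"] = len(state["selected"])


if __name__ == "__main__":
    selector = SelectorWrapper(begin, process, terminate)
    run(selector, [{"pt": 10.0}, {"pt": 25.0}, {"pt": 40.0}])
    print(selector.state["n_selected"])
```

Keeping the user code as three free functions means the analysis logic stays in Python, and only the thin wrapper would need to be generated for a real selector.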
Conclusions
• Distributed Analysis in the US follows the two-stage analysis model for ATLAS
  • PANDA/pathena for Athena-based analysis
  • PROOF for ROOT-based analysis
• PANDA/xrootd/PROOF are integrated for a seamless analysis sequence