MCRunJob in SAMGrid i.t.
Outline
• Architectural relationship
• JIM’s requirements/expectations
• SAMGrid status w.r.t. D0 MC in relation to RunJob/Shahkar
Igor Terekhov, FNAL
SAMGrid/MCRunJob Interaction
[Diagram; labels in order of appearance: JIM User Interface; Client Site; Sanity Checks; SAM DH Services; Req details; JIM Local Job Submission; Generate Local Macro; Head node at exec site; MCRunJob; (Retrieve)/Store Files; Execute Local Macro; Worker node at exec site; JIM Local Job Execution (sandboxing)]
Interactions, continued
• Sanity Check – see if the request can be executed at any site. Add requirements on sites based on D0 software (to go away)
• Macro Generation – request details retrieval (a must) and local settings incorporation (being revised)
• Execution – the JIM sandbox package initializes the environment and calls MCRunJob
• DH – store files from worker nodes (now) or prepare for merging (later)
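The sanity check above can be pictured as a set comparison between what a request needs and what each site offers. The following is only a minimal sketch under that assumption; `Site`, `Request`, `sanity_check` and the capability labels are hypothetical names, not part of JIM or MCRunJob.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Site:
    name: str
    # Hypothetical capability labels a site advertises (e.g. installed software).
    capabilities: Set[str] = field(default_factory=set)


@dataclass
class Request:
    request_id: str
    # Hypothetical requirements the request places on an execution site.
    requirements: Set[str] = field(default_factory=set)


def sanity_check(request: Request, sites: List[Site]) -> List[str]:
    """Return the names of the sites that satisfy every requirement."""
    return [s.name for s in sites if request.requirements <= s.capabilities]


# Illustrative data only.
sites = [
    Site("site_a", {"d0_release_x", "sam_station"}),
    Site("site_b", {"sam_station"}),
]
request = Request("req-1", {"d0_release_x", "sam_station"})
eligible = sanity_check(request, sites)
print(eligible if eligible else "request cannot be executed at any site")
```

An empty result means the request should be rejected before any macro is generated.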
Abstracted Interactions
• Need to validate the request
• Need to prepare job execution
• Need to actually execute it (using external job submission)
• Need to use a Grid data handling system like SAM for file access
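One way to read the four abstracted steps is as independent services that a driver chains together. The sketch below assumes each step is a plain callable; `run_request` and the stand-in lambdas are hypothetical and do not reflect the actual SAMGrid or MCRunJob interfaces.

```python
from typing import Callable, Dict, List


def run_request(
    request: Dict[str, str],
    validate: Callable[[Dict[str, str]], bool],
    prepare: Callable[[Dict[str, str]], str],
    submit: Callable[[str], List[str]],
    store: Callable[[List[str]], None],
) -> bool:
    """Chain the four steps; each one is a separate, replaceable service."""
    if not validate(request):
        return False
    macro = prepare(request)   # job preparation only, nothing is executed here
    outputs = submit(macro)    # execution is delegated to an external submitter
    store(outputs)             # file access goes through a data handling service
    return True


# Stand-in implementations, for illustration only.
stored: List[str] = []
ok = run_request(
    {"request_id": "req-1"},
    validate=lambda r: "request_id" in r,
    prepare=lambda r: f"macro for {r['request_id']}",
    submit=lambda macro: [f"{macro}.out"],
    store=stored.extend,
)
print(ok, stored)
```

Keeping the steps as separate arguments means the submitter or the storage backend can be swapped per site without touching validation or preparation.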
Requirement Derivation
• Must be able to provide several services rather than mix everything into one command
• Must use (externally configured) Fabric Adapter services such as:
  • SAM batch adapters
  • JIM sandboxing
• Must NOT assume that “qsub” is in the path, that file systems are shared, or that “rcp” works
• Must have a thin configuration layer
  • The goal is (statistical) reproducibility of Grid job results
  • On-site editing of core files is out of the question
  • More than that, there should not be 20 obscure parameters that ensure your results will differ
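The "externally configured" and "do not assume qsub" requirements can be illustrated with a small adapter that takes its submit command from site configuration and checks that it exists before use. This is a sketch under stated assumptions: `FabricConfig` and `build_submit_argv` are hypothetical names, and the demo uses the running Python interpreter purely as a stand-in for a batch adapter executable.

```python
import shutil
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FabricConfig:
    # Hypothetical fields, filled from site configuration rather than job code.
    submit_command: str          # site-specific batch adapter executable
    copy_command: Optional[str]  # None when the site offers no remote copy tool


def build_submit_argv(config: FabricConfig, job_script: str) -> List[str]:
    """Return the argv for submitting job_script, or raise if the adapter is missing."""
    resolved = shutil.which(config.submit_command)
    if resolved is None:
        raise RuntimeError(
            f"batch adapter {config.submit_command!r} not found on this site"
        )
    return [resolved, job_script]


# Demo: the Python interpreter stands in for a real batch adapter.
demo = FabricConfig(submit_command=sys.executable, copy_command=None)
print(build_submit_argv(demo, "job.sh"))
```

The job code never names a batch system; a site without the configured adapter fails early with a clear error instead of a missing-command failure on a worker node.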
Xmas Wish List (Requirements, Continued)
• Should NOT assume control of everything it gets hold of
• Should not have concepts like “HOME directory” (WM – a grid site is a time-shared condo, not your house)
• Should be able to statistically validate results from multiple sites
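The slides do not say how results from multiple sites would be validated statistically. As one possible illustration, the sketch below compares the mean of some per-job quantity from two sites using Welch's t statistic; the choice of test, the threshold of 3.0, the function names and the sample numbers are all assumptions made for the example.

```python
import math
import statistics
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for the difference between two sample means."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def sites_agree(a: Sequence[float], b: Sequence[float], threshold: float = 3.0) -> bool:
    """Treat two sites as consistent when |t| stays below an assumed threshold."""
    return abs(welch_t(a, b)) < threshold


# Made-up values of some per-job quantity, for illustration only.
site_1 = [10.1, 9.9, 10.0, 10.2, 9.8]
site_2 = [10.0, 10.1, 9.9, 10.3, 9.7]
site_3 = [12.0, 12.1, 11.9, 12.2, 11.8]
print(sites_agree(site_1, site_2), sites_agree(site_1, site_3))
```

A real validation would pick quantities and thresholds appropriate to the physics output; the point here is only that the comparison is automatic and site-independent.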
JIM status and MCRunJob
• Initial phase of SAMGrid integration complete
  • Separated job preparation from execution
  • Inserted fabric management tools
  • Reimplemented SAM file storing in new Executor framework
  • Miscellaneous bug fixes
• Much work was done by JIMmers due to core folks doing re-processing
• We are running SAMGrid MC jobs at Wisconsin, Manchester and Lyon
• We need to freeze MCRunJob to complete MC commissioning with SAMGrid:
  • Understand brokering issues
  • Understand monitoring issues
  • Decompose jobs at Grid level
  • Use SAMGrid for unified management of data and job files
T<10min?
Status, Continued
• Expect a few months (depends on D0 participation; warning: Rod is leaving)
• After that (or in parallel, if resources come from outside the SAMGrid team), migrate to the next-generation mc_runjob