CMS Distributed Production

Michael Thomas, Sep. 20, 2006. With thanks to Ajit Mohapatra from UW.


Presentation Transcript


  1. CMS Distributed Production
     Michael Thomas, Sep. 20, 2006
     With thanks to Ajit Mohapatra from UW
     CMS Production System

  2. Old Production Model
     • Large-scale MC production in CMS has been going on for the past several years.
     • Based on the agreement and willingness of individual CMS institutes to participate in MC production.
     • CMS software at the participating sites was installed and maintained by each site's own people.
     • MC production, storage, and transfer were managed by the sites.
     • Each site was responsible for arranging and negotiating the resources it needed.
     • Each production site performed its tasks following directions from the production management group at CERN.

  3. New Grid-based Distributed Production
     • Based on the new CMSSW software framework
     • Centralized software installation at both LCG and OSG sites
     • (Centralized) Production Manager maintains all work to be done (1 OSG operator, 3 for LCG)
     • Remote Production Agents pull work from the Production Manager, then run and track jobs
     • Jobs are monitored and resubmitted in case of failure
     • Makes extensive use of Grid resources
     • In use since July 2006
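
The pull-and-resubmit pattern on this slide can be illustrated with a minimal Python sketch. It is not ProdAgent code: the class names, the `run_job` callback, and the limit of three attempts are all assumptions made for illustration, and a real agent would submit to a grid scheduler rather than call a local function.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkItem:
    """Hypothetical unit of work handed out by the manager."""
    job_spec_id: str
    attempts: int = 0


class ProductionManager:
    """Hypothetical central queue holding all work still to be done."""

    def __init__(self, job_spec_ids):
        self._pending = deque(WorkItem(j) for j in job_spec_ids)

    def pull(self):
        # Agents ask for work; the manager never pushes it.
        return self._pending.popleft() if self._pending else None


class ProductionAgent:
    """Hypothetical remote agent: pulls work, runs it, resubmits failures."""

    def __init__(self, manager, run_job, max_attempts=3):
        self.manager = manager
        self.run_job = run_job  # callable(job_spec_id, attempt) -> bool
        self.max_attempts = max_attempts  # assumed retry limit
        self.succeeded = []
        self.failed = []

    def drain(self):
        while (item := self.manager.pull()) is not None:
            while item.attempts < self.max_attempts:
                item.attempts += 1
                if self.run_job(item.job_spec_id, item.attempts):
                    self.succeeded.append(item.job_spec_id)
                    break
            else:
                # Every attempt was used up without a success.
                self.failed.append(item.job_spec_id)


def flaky_run(job_spec_id, attempt):
    # Stand-in for a grid job: "job-2" fails once, "job-3" always fails.
    if job_spec_id == "job-2":
        return attempt >= 2
    return job_spec_id != "job-3"


manager = ProductionManager(["job-1", "job-2", "job-3"])
agent = ProductionAgent(manager, flaky_run)
agent.drain()
print("succeeded:", agent.succeeded)
print("failed:", agent.failed)
```

Because the agent asks for work instead of receiving it, more agents can be added without changing the manager, which matches the multi-agent scaling goal listed at the end of the talk.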

  4. CMS Computing Infrastructure

  5. ProdAgent

  6. Production System
     [Diagram: one Production Manager, two Production Agents, three Sites, and multiple Jobs per site]

  7. Production System
     [Diagram labels, in order: Production/Workflow Manager, Workflow Spec (*), Production Agent, Job Spec (*), Job (*), Application]

  8. Production System
     Workflow Spec
     • Request/assignment-level template
     • Has a unique ID
     • Corresponds to a CTDR task
     • Used by the (central) Production Manager process

  9. Production System II
     Job Spec
     • Processing job definition
     • Has a unique ID within a workflow/request
     • Used to create physical jobs
     • Dealt with by the Production Agent

  10. Production System III
     Job
     • Physical job created from a Job Spec
     • The combination of JobSpecID and batch/grid job ID is unique
     • Corresponds to a CTDR job
     • Multiple jobs created from the same JobSpec are the same job in terms of physics (retries, etc.)

  11. Production System IV
     Application
     • Application-like nodes within a job, managed by SHREEK
     • CMSSW, StageOut, and Rescue types of tasks
     • Multiple CMSSW apps may run in a single job
     • Each app has a unique node name within the workflow
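
The four levels described on the preceding slides (Workflow Spec, Job Spec, Job, Application) and their uniqueness rules can be sketched as plain Python data classes. This is an illustrative model only: the class and field names, the sample IDs, and the `same_physics_as` helper are hypothetical and are not taken from the ProdAgent code base.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Application:
    node_name: str  # must be unique within the workflow
    kind: str       # e.g. "CMSSW", "StageOut", "Rescue"


@dataclass
class WorkflowSpec:
    workflow_id: str  # unique ID of the request-level template
    applications: List[Application] = field(default_factory=list)

    def add_application(self, node_name, kind):
        # Enforce the "unique node name within the workflow" rule.
        if any(a.node_name == node_name for a in self.applications):
            raise ValueError(f"duplicate node name: {node_name}")
        self.applications.append(Application(node_name, kind))


@dataclass(frozen=True)
class JobSpec:
    workflow_id: str
    job_spec_id: str  # unique within its workflow/request


@dataclass(frozen=True)
class Job:
    job_spec: JobSpec
    grid_job_id: str  # ID assigned by the batch/grid system

    @property
    def key(self):
        # JobSpecID plus batch/grid job ID identifies one physical job.
        return (self.job_spec.job_spec_id, self.grid_job_id)

    def same_physics_as(self, other):
        # Retries of one JobSpec are the same job in terms of physics.
        return self.job_spec == other.job_spec


workflow = WorkflowSpec("wf-001")
workflow.add_application("cmsRun1", "CMSSW")
workflow.add_application("stageOut1", "StageOut")

spec = JobSpec("wf-001", "wf-001-job-7")
first_try = Job(spec, "grid-100")
retry = Job(spec, "grid-101")
print(first_try.key != retry.key, first_try.same_physics_as(retry))
```

Separating the physical job key from the Job Spec identity lets a retry be tracked as a new grid job while still counting as the same piece of physics work.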

  12. Monitoring
     • Prod. Manager receives notifications from Prod. Agent
        • Allocated
        • Final states: success, failure
     • Prod. Agent monitors the high-level status of jobs via the Message Service and reports to the CMS Dashboard
        • Created
        • Running
        • Finished
        • Fail (requeue)
     • Job Tracker monitors the job status
        • Queued
        • % completed
        • Output text
     • Job publishes data
        • BOSS wrapper captures stdout
        • Job may use ApMon directly to publish to MonALISA
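
The high-level job states the Prod. Agent reports (Created, Running, Finished, Fail with requeue) can be read as a small state machine. The sketch below is an assumption-laden illustration: the slide names the states but not the allowed transitions, so the transition table, the `advance` and `replay` helpers, and the modelling of "requeue" as a return to Created are all hypothetical.

```python
from enum import Enum


class JobState(Enum):
    CREATED = "Created"
    RUNNING = "Running"
    FINISHED = "Finished"
    FAILED = "Fail"


# Assumed transitions; "Fail (requeue)" is modelled as Fail -> Created.
ALLOWED = {
    JobState.CREATED: {JobState.RUNNING},
    JobState.RUNNING: {JobState.FINISHED, JobState.FAILED},
    JobState.FAILED: {JobState.CREATED},
    JobState.FINISHED: set(),
}


def advance(current, new):
    """Return the new state, or raise if the transition is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new


def replay(events):
    """Apply a sequence of states starting from Created; return the history."""
    state = JobState.CREATED
    history = [state]
    for new in events:
        state = advance(state, new)
        history.append(state)
    return history


# One failed attempt, a requeue, then a successful run.
history = replay([
    JobState.RUNNING,
    JobState.FAILED,
    JobState.CREATED,
    JobState.RUNNING,
    JobState.FINISHED,
])
print(" -> ".join(s.value for s in history))
```

Rejecting illegal transitions in one place makes it easier to spot lost or out-of-order status messages before they reach a dashboard.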

  13. Data Storage on OSG

  14. Data Transfer and Management

  15. Production Summary

  16. Production Summary

  17. Looking Forward
     • Integration of Prod Manager
     • Scale to multiple Prod Agents
     • Extend production to non-CMS OSG sites
     • Ramp up for CSA06
