
PROOF integration with Condor



Presentation Transcript


  1. PROOF integration with Condor Neng Xu

  2. Progress on PROOF integration with Condor
     • Concentrating on the Batch PROOF+Condor mode.
     • Building a database for dataset management.
     • Two daemons:
       • Session management daemon
       • Dataset stage management daemon
     • The session management part is almost there.
     • Dataset stage management is still under consideration.
     Neng Xu, Wisconsin Group at CERN

  3. Batch PROOF+CONDOR Model
     Diagram labels:
     • PROOF requests
     • Interactive PROOF jobs (only for coding and configuration, with only a small number of files)
     • Batch PROOF jobs (real jobs)
     • Condor Scheduler for PROOF
     • PROOF Master
     • Pure PROOF pool
     • D-Cache, CASTOR, Xrootdfs, or DDM (Grid)

  4. Batch PROOF+Condor Model
     • Designed for a large-scale CAF with:
       • A dedicated PROOF pool.
       • More than 100 users.
     • Main concerns:
       • Avoiding a large number of PROOF sessions running at the same time.
       • How to schedule the sessions that are WAITING.
       • How to provide a “fair share” environment for a large number of users.
       • How to make sure the data is staged into the pool.

  5. Condor Scheduler for PROOF
     • Service for scheduling: Condor Master, Condor Collector, Condor Scheduler
     • Service for PROOF jobs: Condor Starter
     • Job slots for PROOF sessions: slot1@pcuw104, slot2@pcuw104, slot3@pcuw104, slot4@pcuw104
       These slots can be used to limit the total number of running PROOF sessions.
     • Job slots for file stage-in (these jobs can run in the background): slot5@pcuw104, slot6@pcuw104, slot7@pcuw104, slot8@pcuw104, slot9@pcuw104, slot10@pcuw104
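The way a fixed set of slots caps the number of running PROOF sessions can be illustrated with a small simulation. This is only a sketch of the idea, not Condor code: the `SlotPool` class, its methods, and the session names are hypothetical, and only the slot names come from the slide.

```python
from collections import deque


class SlotPool:
    """Toy model of a fixed set of slots reserved for PROOF sessions (hypothetical)."""

    def __init__(self, slot_names):
        self.free = deque(slot_names)  # idle slots
        self.running = {}              # slot name -> session id
        self.waiting = deque()         # sessions queued until a slot frees up

    def submit(self, session_id):
        """Start the session if a slot is free, otherwise queue it."""
        if self.free:
            slot = self.free.popleft()
            self.running[slot] = session_id
            return slot
        self.waiting.append(session_id)
        return None

    def finish(self, slot):
        """Free a slot and hand it to the next waiting session, if any."""
        del self.running[slot]
        if self.waiting:
            self.running[slot] = self.waiting.popleft()
        else:
            self.free.append(slot)


# Four session slots, as on the slide; six sessions are submitted.
pool = SlotPool([f"slot{i}@pcuw104" for i in range(1, 5)])
placements = [pool.submit(f"session{n}") for n in range(1, 7)]
print(placements)
```

With four slots and six submissions, the last two sessions stay in the waiting queue until `finish` is called on an occupied slot, so the number of running sessions never exceeds the number of slots.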

  6. Batch PROOF+CONDOR Model
     Diagram labels: Database for Datasets; PROOF/xrootd pool; Stage server; D-Cache, CASTOR, Xrootdfs, or DDM (Grid).
     Arrow labels: Check files' status; Read the requirement; Stage in the files; Release the job.
     PROOF batch jobs list (the third column is unlabeled on the slide):
       Name   Input dataset                                (unlabeled)
       Job1   mc08.017506.PythiaB_bbmu6mu4X.evgen.e306     500
       Job2   mc08.017506.PythiaB_bbmu6mu6X.evgen.e306     400
       Job3   mc08.0175068.PythiaB_bbmu6mu4X.evgen.e306    30
       Job4   mc08.017506.PythiaB_bbmu6mu4X.evgen          50
       Job5   mc08.017888.PythiaB_bbmu6mu4X.evgen.e306     100
       Job6   mc08.017506.PythiaB_bbmu6mu4X.evgen.e306     120
     These Condor jobs are set to “Held” by default. The stage server releases each job once its dataset is staged into the PROOF/Xrootd pool. Dataset stage-in also has a priority, which depends on the number of requests, the number of files, the waiting time, etc.
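The hold-then-release flow described on this slide could be sketched roughly as follows. Everything here is an assumption made for illustration: the `Job` class, the `stage_priority` formula and its weights, the reading of the unlabeled numbers as file counts, and the choice that more requests and longer waits raise priority while larger datasets lower it. The slide only says that priority depends on these quantities, not how.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    dataset: str
    n_files: int             # assumption: the unlabeled number in the job list
    waited_min: float = 0.0  # minutes spent waiting so far
    state: str = "Held"      # jobs start out held, as on the slide


def stage_priority(jobs, dataset):
    """Hypothetical score for staging a dataset; the weights are made up."""
    reqs = [j for j in jobs if j.dataset == dataset and j.state == "Held"]
    if not reqs:
        return 0.0
    n_requests = len(reqs)
    longest_wait = max(j.waited_min for j in reqs)
    n_files = max(j.n_files for j in reqs)
    return 10.0 * n_requests + 1.0 * longest_wait - 0.01 * n_files


def stage_next(jobs, staged):
    """Stage the best pending dataset, then release held jobs whose data is staged."""
    pending = {j.dataset for j in jobs if j.state == "Held" and j.dataset not in staged}
    best = None
    if pending:
        # sorted() makes tie-breaking deterministic
        best = max(sorted(pending), key=lambda d: stage_priority(jobs, d))
        staged.add(best)  # stand-in for the real stage-in transfer
    for j in jobs:
        if j.state == "Held" and j.dataset in staged:
            j.state = "Released"
    return best


A = "mc08.017506.PythiaB_bbmu6mu4X.evgen.e306"
B = "mc08.017506.PythiaB_bbmu6mu6X.evgen.e306"
jobs = [Job("Job1", A, 500), Job("Job2", B, 400), Job("Job6", A, 120)]
staged = set()
first = stage_next(jobs, staged)
```

In this example, two held jobs request dataset `A`, so under the assumed weights it is staged first and Job1 and Job6 are released together, while Job2 stays held until a later call stages `B`.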
