
PanDA HPC integration. Current status.

PanDA HPC integration. Current status. Danila Oleynik, BigPanda F2F meeting, 13 August 2013. Outline: HPC access, architecture, specialty; current PanDA implementation; PanDA architecture for Kraken, Titan; initial testing; next step: Pilot - SAGA integration.



  1. PanDA HPC integration. Current status. Danila Oleynik, BigPanda F2F meeting, 13 August 2013

  2. Outline
     • HPC access, architecture, specialty
     • Current PanDA implementation
     • PanDA architecture for Kraken, Titan
     • Initial testing
     • Next step: Pilot - SAGA integration

  3. HPC specialty
     • Kraken Cray XT5 (access since the beginning of August)
       • 9,408 nodes
       • node: 12 cores, 16 GB RAM
     • Titan Cray XK7 (access request in process)
       • 18,688 nodes
       • node: 16 cores, 32 + 6 GB RAM (2 GB per core)
     • Parallel file system shared between nodes
     • Access only to interactive nodes (worker nodes have extremely limited connectivity)
     • One-time password authentication
     • Internal job management tool: PBS/TORQUE
     • One job occupies at least one node (12-16 cores)
     • Limit on the number of jobs in the scheduler per user

  4. Current PanDA implementation
     • One Pilot per WN
     • Pilot executes on same node as job
     • SW distribution through CVMFS

  5. PanDA architecture for Kraken, Titan
     • Pilots execute on the HPC interactive node
     • Each pilot interacts with the local job scheduler to manage its job
     • Number of executing pilots = number of available slots in the local scheduler
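The slot rule on the slide above can be sketched as follows. This is an illustration only: `FakeScheduler` and `launch_pilots` are hypothetical names, and the scheduler is an in-memory stand-in for PBS/TORQUE rather than a real client. It only shows how the per-user job limit caps the number of pilots running on the interactive node.

```python
class FakeScheduler:
    """Hypothetical in-memory stand-in for the local batch scheduler."""

    def __init__(self, max_jobs_per_user):
        self.max_jobs_per_user = max_jobs_per_user
        self.active = set()
        self._next_id = 0

    def free_slots(self):
        """Slots still available to this user under the per-user job limit."""
        return self.max_jobs_per_user - len(self.active)

    def submit(self, payload):
        """Accept a job and return its id, or fail if the limit is reached."""
        if self.free_slots() <= 0:
            raise RuntimeError("per-user job limit reached")
        self._next_id += 1
        self.active.add(self._next_id)
        return self._next_id

    def finish(self, job_id):
        """Mark a job as done, which frees its slot."""
        self.active.discard(job_id)


def launch_pilots(scheduler, pending_payloads):
    """Start one pilot per free slot; return {job_id: payload} for started pilots.

    Payloads that do not fit stay in pending_payloads for a later call.
    """
    started = {}
    while pending_payloads and scheduler.free_slots() > 0:
        payload = pending_payloads.pop(0)
        started[scheduler.submit(payload)] = payload
    return started
```

For example, with a limit of two jobs and three waiting payloads, the first call starts two pilots and leaves one payload queued until a slot is freed.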

  6. Initial testing
     • Some initial testing was done to prove that PanDA components will be able to run in the HPC environment on interactive nodes
     • Sergey succeeded in starting APF and pilots on Titan; an outbound HTTPS connection was confirmed, so pilots can communicate with PanDA
     • I successfully tested the SAGA API on Kraken. In general, the SAGA API allows managing jobs in the local HPC job scheduler
     • Because the interactive node and the worker nodes use a shared file system, we did not need any special internal data-management process

  7. Next step: Pilot - SAGA integration
     • Actually it’s a rather big step, which can be split technically:
       • SAGA source integration with the pilot code
       • Reviewing and reverse engineering the runJob class
       • Implementing a runJobHPC class based on the SAGA API
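One possible shape for such a class is sketched below. It is not the pilot's actual code and does not call SAGA: `RunJobHPC`, `LocalBackend`, and the description keys are hypothetical, and the backend is assumed to offer a describe, create, run, wait sequence. A real implementation would swap in a backend that talks to the batch scheduler.

```python
import subprocess
import sys


class RunJobHPC:
    """Hypothetical payload runner that delegates execution to a job backend."""

    def __init__(self, backend, cores_per_node):
        self.backend = backend
        self.cores_per_node = cores_per_node

    def build_description(self, executable, arguments, walltime_minutes):
        """Describe the payload; key names are illustrative, not a library API."""
        return {
            "executable": executable,
            "arguments": list(arguments),
            # A job occupies at least one node, so request a full node.
            "total_cpu_count": self.cores_per_node,
            "wall_time_limit": walltime_minutes,
        }

    def execute(self, executable, arguments=(), walltime_minutes=60):
        """Submit the payload, block until it finishes, return its exit code."""
        description = self.build_description(executable, arguments, walltime_minutes)
        job = self.backend.create_job(description)
        job.run()
        job.wait()
        return job.exit_code


class _LocalJob:
    """Runs the payload as a local subprocess (stand-in for a batch job)."""

    def __init__(self, description):
        self.description = description
        self.exit_code = None
        self._process = None

    def run(self):
        cmd = [self.description["executable"]] + self.description["arguments"]
        self._process = subprocess.Popen(cmd)

    def wait(self):
        self.exit_code = self._process.wait()


class LocalBackend:
    """Hypothetical backend for testing away from the HPC scheduler."""

    def create_job(self, description):
        return _LocalJob(description)


if __name__ == "__main__":
    runner = RunJobHPC(LocalBackend(), cores_per_node=12)
    print(runner.execute(sys.executable, ["-c", "print('payload ran')"]))
```

Keeping the backend behind a small interface means the runner's logic can be exercised locally before a scheduler-backed implementation is plugged in.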
