
Process Management & Monitoring WG




  1. Process Management & Monitoring WG: Quarterly Report, October 10, 2002

  2. Components
     • Process Management
       • Process Manager
       • Checkpoint Manager
     • Monitoring
       • Job Monitor
       • System/Node Monitors
       • Meta Monitoring
     • Data Migration
     PMWG Quarterly Report

  3. “Next Steps” From June 2002
     • Prototyping will continue
     • Interfaces will stabilize
       • Checkpoint Manager
       • Process Manager
       • Monitors
       • Monitoring data…

  4. Group Progress
     • Prototyping and development continue
       • How to interface to something for which we can’t yet visualize the implementation?
     • Some interface progress
       • Validating schema for Process Manager
       • Early Node Monitor schema
     • Conference calls on alternate weeks

  5. Component Progress
     • Checkpoint Manager (LBNL)
     • Process Manager (ANL)
     • Monitoring (NCSA)

  6. Checkpoint Manager: Work at LBNL
     • Serial (intranode) checkpoints
       • Checkpoint job(s) on a single node
     • Parallel (internode) checkpoints
       • Checkpoint a multi-node job
     • Scalable Systems Checkpoint Manager
       • Scalable Systems XML interfaces

  7. Checkpoint Manager: Work at LBNL
     • Serial (intranode) checkpoints
       • System-level for best coverage
       • Full requirements in a technical report
       • Prototype to demonstrate here
       • Based on pre-existing vmadump code
       • Extended for multi-threaded processes
       • Provides hooks for runtime libraries
       • Coverage is still limited

  8. Checkpoint Manager: Work at LBNL
     • Parallel (internode) checkpoints
       • Works with the job control system
       • Cooperates with the runtime libraries
       • Working with LAM/MPI team to implement
       • We will have a joint demo at SC02
       • NAS Parallel Benchmarks (NPBs) as a realistic goal
       • Runtime interfaces are due May ’03

  9. Checkpoint Manager: Work at LBNL
     • Scalable Systems Checkpoint Manager
       • Will provide Scalable Systems interface to the parallel checkpoint capability
       • Interface only roughly defined
       • Interface refinement still to follow
       • XML interfaces are due May ’03

  10. Process Manager: Work at ANL
     • Rusty Lusk…

  11. Monitoring: Work at NCSA
     • Mike Showerman…

  12. Data Migration
     • Still no work done here
     • Mostly dismissed at last meeting
     • Will disappear at next meeting

  13. Next Steps
     • Prototyping will inevitably continue
     • Interfaces will continue to stabilize
       • Checkpoint Manager
       • Process Manager
       • Monitors
       • Monitoring data…
     • Now have a framework started
