SSS Deployment using OSCAR

SSS Deployment using OSCAR. John Mugler, Thomas Naughton, Phil Pfeiffer & Stephen Scott. Aug 2004, Argonne, IL SSS Face-to-face meeting. OSCAR: Cluster Toolkit. Framework for cluster management


Presentation Transcript


  1. SSS Deployment using OSCAR
     John Mugler, Thomas Naughton, Phil Pfeiffer & Stephen Scott
     Aug 2004, Argonne, IL
     SSS Face-to-face meeting

  2. OSCAR: Cluster Toolkit
     • Framework for cluster management
       • Simplifies installation, configuration and operation
       • Reduces time/learning curve for cluster build
       • Requires: pre-installed headnode with a supported Linux distribution
       • Thereafter: wizard guides user through setup/install of entire cluster
     • Package-based framework
       • Content: Software + Configuration, Tests, Docs
       • Types:
         • Core: SIS, C3, Switcher, ODA, OPD, (Support Libs)
         • Non-core: selected & third-party
       • Access: repositories accessible via OPD/OPDer
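
The core/non-core split above can be illustrated with a small lookup. This is a minimal sketch, not OSCAR code: the `CORE_PACKAGES` set only copies the names listed on the slide (the support libraries are left out because the slide does not name them), `package_type` is a hypothetical helper, and treating every unlisted name as non-core is an assumption of the sketch.

```python
# Core package names as listed on the slide; the unnamed "Support Libs"
# are omitted because the slide does not enumerate them.
CORE_PACKAGES = frozenset({"SIS", "C3", "Switcher", "ODA", "OPD"})


def package_type(name: str) -> str:
    """Classify a package name as "core" or "non-core" (hypothetical helper).

    Assumption: any name not in CORE_PACKAGES is treated as non-core.
    """
    return "core" if name in CORE_PACKAGES else "non-core"


if __name__ == "__main__":
    for pkg in ("SIS", "C3", "Gold"):
        print(pkg, "->", package_type(pkg))
```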

  3. OSCAR Wizard
     * OSCAR-3.0 release

  4. Using OSCAR for SSS
     Problem: Helping users obtain and install SSS software.
     Solution: Leverage the OSCAR framework to package and distribute the SSS suite, sss-oscar.
     sss-oscar: A release of OSCAR containing all SSS software in a single downloadable bundle.

  5. OSCAR-ized SSS Components
     • Bamboo – Queue/Job Manager
     • BLCR – Berkeley Checkpoint/Restart
     • Gold – Accounting & Allocation Management System
     • LAM/MPI (w/ BLCR) – Checkpoint/Restart enabled MPI
     • MAUI-SSS – Job Scheduler
     • SSSLib – SSS Communication library
       • Includes: SD, EM, PM, BCM, NSM, NWI
     • Warehouse – Distributed System Monitor
     • MPD2 – MPI Process Manager
     * As of April 2004

  6. Current Status
     • Several 0.2 cuts (latest 0.2a8)
     • 2 items remain in 0.2 Tracker, http://sf.net/projects/sss-oscar/
       • Both items have fixes in CVS, pending testing
     • Added $OSCAR_PACKAGE_TEST_HOME and workarounds for current testing framework
     • Ready for 0.2a9 & can test during meeting!
       • After testing, post as pre-release on main www site
     • Starting work on 0.3, etc.
       • New Gold pkg
       • OSCAR support for part of BCWG schema

  7. TODO
     • Integrate Gold into new releases
     • Integrate APItest into OSCAR
       • SSS Component authors create their APItest cases
     • Update individual SSS Pkgs as needed
     • Update/Improve Documentation for v1.0
     • Start weekly builds for testing (next slide)
     • Improve testing/bug reporting (fixing)

  8. Release Schedule
     ADJUST - previous Sept. 3 freeze date for SC’04 release.
     [Nov 8] SC’04 release sss-oscar-1.0
       • Whoo-hooo!
     [Oct 4] SC’04 freeze
       • No changes except to fix approved bugs
     [Sep *] weekly builds
       • Available first day of week by 12 noon
       • Untested “as-is” tarballs for tests/bug
       • Each developer to test their component for acceptance
         • Your pkg & any dependent pkgs install properly
         • Report/Respond to bugs in Tracker
         • Make appropriate fixes in CVS to remedy any errors
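
The schedule above can be sanity-checked with a little date arithmetic. The sketch below is only an illustration, assuming that "first day of week" means Monday and that the year is 2004 (inferred from "SC’04"); the function name `weekly_build_dates` is hypothetical and not part of sss-oscar.

```python
from datetime import date, timedelta

# Dates taken from the slide; the year 2004 is assumed from "SC'04".
FREEZE = date(2004, 10, 4)    # [Oct 4] SC'04 freeze
RELEASE = date(2004, 11, 8)   # [Nov 8] sss-oscar-1.0 release


def weekly_build_dates(year: int, month: int, weekday: int = 0) -> list[date]:
    """Return every date in the given month that falls on `weekday`.

    weekday follows date.weekday(): 0 is Monday. Treating Monday as the
    "first day of week" is an assumption, not something the slide states.
    """
    day = date(year, month, 1)
    # Advance to the first matching weekday in the month.
    day += timedelta(days=(weekday - day.weekday()) % 7)
    builds = []
    while day.month == month:
        builds.append(day)
        day += timedelta(days=7)
    return builds


if __name__ == "__main__":
    for build in weekly_build_dates(2004, 9):
        print("weekly build:", build.isoformat())
    print("days from freeze to release:", (RELEASE - FREEZE).days)
```

Under those assumptions the September builds fall on the 6th, 13th, 20th and 27th, and the freeze and release dates are both Mondays, five weeks apart.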

  9. Resources
     • ORNL “Test1” cluster
       • Full install tests & restore of headnode (2 compute victims)
       • Access via ORNL Login Server (examples/info pending)
       • Must do reservations/coordinate use
       • Restart nodes with care: no remote power mgmt
     • Developer CVS repository
       • Hosted at http://sss-oscar.sf.net
       • Account requests: torc@msr.csm.ornl.gov
     • OSCAR Homepage
       • http://oscar.OpenClusterGroup.org
       • Includes “HOWTO: Create an OSCAR Package” document
