
Scalable Systems Software Suite: Updates, Tests, and Scalability

Explore the latest updates and scalability tests of the Scalable Systems Software Suite from the working group meeting, covering resource management, job scheduling, and user management for efficient operation of large computing centers.


Presentation Transcript


  1. Working Group Updates, Suite Tests and Scalability, Race Conditions, SSS-OSCAR Releases, and Hackerfest. Al Geist, August 17-19, 2005, Oak Ridge, TN. Welcome to Oak Ridge National Lab! First quarterly meeting here.

  2. Demonstration: Faster-than-Light Computer, able to calculate the answer before the problem is specified. [Diagram: ORNL "Faster than Light Computer"]

  3. Scalable Systems Software
     Focus areas: Resource Management; Accounting & User Management; System Monitoring; System Build & Configure; Job Management
     Participating organizations: ORNL, ANL, LBNL, PNNL, SNL, LANL, Ames, IBM, Cray, Intel, SGI, NCSA, PSC
     Problem
     • Computer centers use incompatible, ad hoc sets of systems tools
     • Present tools are not designed to scale to multi-teraflop systems
     Goals
     • Collectively (with industry) define standard interfaces between systems components for interoperability
     • Create scalable, standardized management tools for efficiently running our large computing centers
     To learn more visit www.scidac.org/ScalableSystems

  4. Scalable Systems Software Suite: any updates to this diagram?
     Components written in any mixture of C, C++, Java, Perl, and Python can be integrated into the Scalable Systems Software Suite; components communicate through standard XML interfaces with shared authentication and communication layers.
     [Architecture diagram]
     • Meta Services / Grid Interfaces: Meta Scheduler, Meta Monitor, Meta Manager
     • Suite components: Service Directory, Event Manager, Scheduler, Accounting, Allocation Management, Usage Reports, System & Job Monitor, Node State Manager, Node Configuration & Build Manager, Process Manager, Job Queue Manager, Hardware Infrastructure Manager, Checkpoint / Restart, Validation & Testing
     • Packaging: SSS-OSCAR
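
To make the component/XML idea concrete, here is a minimal, hypothetical sketch of how a suite component might announce itself to a Service Directory by sending an XML message over a socket. The element names, attributes, host, and port are assumptions for illustration only, not the actual SSS wire protocol or schema.

```python
# Hypothetical sketch: a component registering with a Service Directory
# over a plain TCP socket carrying an XML message. Element names, host,
# and port are illustrative assumptions, not the real SSS schema.
import socket
import xml.etree.ElementTree as ET

def build_registration(component, host, port, protocol="xml"):
    """Build an XML registration message for a suite component."""
    root = ET.Element("register")
    loc = ET.SubElement(root, "location")
    loc.set("component", component)
    loc.set("host", host)
    loc.set("port", str(port))
    loc.set("protocol", protocol)
    return ET.tostring(root)

def send_registration(sd_host, sd_port, payload):
    """Send the registration to the (assumed) Service Directory endpoint."""
    with socket.create_connection((sd_host, sd_port), timeout=5) as conn:
        conn.sendall(payload)
        return conn.recv(4096)  # the directory's XML acknowledgement

if __name__ == "__main__":
    msg = build_registration("event-manager", "node01.example.org", 5150)
    print(msg.decode())
    # reply = send_registration("sd.example.org", 5151, msg)
```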

  5. Components in Suites: multiple component implementations exist.
     [Diagram mapping suite interfaces to implementations]
     • Meta Manager / Meta Services: Grid scheduler, Warehouse
     • Scheduler: Maui sched; System & Job Monitor: Warehouse (superMon, NWPerf)
     • Accounting / Usage Reports: Gold
     • Service Directory (SD), Event Manager (EM): ssslib
     • Node State Manager (NSM), Build & Configure Manager (BCM); Validation & Testing: APITest
     • Queue Manager (QM): Bamboo, compliant with PBS and LoadLeveler job scripts; Process Manager (PM)
     • Hardware Infrastructure Manager (HIM); Checkpoint/Restart: BLCR

  6. Scalable Systems Users
     • Production use today:
     • Running an SSS suite at ANL and Ames
     • ORNL industrial cluster (soon)
     • Running components at PNNL
     • Maui w/ SSS API (3000/mo), Moab (Amazon, Ford, TeraGrid, …)
     • Who can we involve before the end of the project?
       - National Leadership-class facility? NLCF is a partnership between ORNL (Cray), ANL (BG), PNNL (cluster)
       - NERSC and NSF centers
       - NCSA cluster(s)
       - NERSC cluster?
       - NCAR BG

  7. Goals for This Meeting
     • Updates on the Integrated Software Suite components
     • Change in Resource Management group: Scott Jackson left PNNL
     • Planning for SciDAC phase 2: discuss new directions
     • Preparing for the next SSS-OSCAR software suite release: what needs to be done at hackerfest?
     • Getting more outside users: production use and feedback to the suite

  8. Since Last Meeting
     • FastOS meeting in DC: any chatting about leveraging our system software?
     • SciDAC 2 meeting in San Francisco: Scalable Systems poster, talk on ISICs; several SSS members there. Anything to report?
     • Telecoms and new entries in electronic notebooks: pretty sparse since last meeting

  9. Agenda – August 17
     8:00  Continental Breakfast, CSB room B226
     8:30  Al Geist – Project Status
     9:00  Craig Steffen – Race Conditions in Suite
     9:30  Paul Hargrove – Process Management and Monitoring
     10:30 Break
     11:00 Todd Kordenbrock – Robustness and Scalability Testing
     12:00 Lunch (on own at cafeteria)
     1:30  Brett Bode – Resource Management Components
     2:30  Narayan Desai – Node Build, Configure, Cobalt Status
     3:30  Break
     4:00  Craig Steffen – SSSRMAP in ssslib
     4:30  Discuss proposal ideas for SciDAC 2
     4:30  Discussion of getting SSS users and feedback
     5:30  Adjourn for dinner

  10. Agenda – August 18
     • 8:00  Continental Breakfast
     • 8:30  Thomas Naughton – SSS-OSCAR Software Releases
     • 9:30  Discussion and voting (your name here)
     • 10:30 Group discussion of ideas for SciDAC-2
     • 11:30 Discussion of Hackerfest goals; set next meeting date/location
     • 12:00 Lunch (walk over to cafeteria)
     • 1:30  Hackerfest begins, room B226
     • 3:00  Break
     • 5:30  (or whenever) Break for dinner

  11. Agenda – August 19
     8:00  Continental Breakfast
     8:30  Hackerfest continues
     12:00 Hackerfest ends

  12. What is going on in SciDAC 2
     • Executive panel
     • Five workshops in the past 5 weeks
     • Preparing a SciDAC 2 program plan at LBNL today!
     • ISIC section has words about system software and tools

  13. View to the Future: HW, CS, and science teams all contribute to the science breakthroughs.
     [Diagram]
     • Ultrascale hardware: Rainier, Blue Gene, Red Storm (OS/HW teams); leadership-class platforms
     • Computing environment: common look & feel across diverse HW; software & libs (SciDAC CS teams)
     • SciDAC science teams: high-end science problem, research team, tuned codes
     • Result: breakthrough science

  14. SciDAC Phase 2 and CS ISICs
     • Future CS ISICs need to be mindful of the needs of:
     • National Leadership Computing Facility
       - w/ Cray, IBM BG, SGI, clusters, multiple OS
       - No one architecture is best for all applications
     • SciDAC science teams
       - Needs depend on the application areas chosen
       - End stations? Do they have special SW needs?
     • FastOS research projects
       - Complement, don't duplicate these efforts
     • Cray software roadmap
       - Making the leadership computers usable, efficient, fast

  15. Gaps and Potential Next Steps
     • Heterogeneous leadership-class machines
       - Science teams need a robust environment that presents similar programming interfaces and tools across the different machines
     • Fault tolerance requirements in apps and systems software
       - Particularly as systems scale up to petascale around 2010
     • Support for application users submitting interactive jobs
       - Computational steering as a means of scientific discovery
     • High-performance file system and I/O research
       - Increasing demands of security, scalability, and fault tolerance
     • Security
       - One-time passwords and their impact on scientific progress

  16. Heterogeneous Machines
     • Heterogeneous architectures
       - Vector architectures, scalar, SMP, hybrids, clusters
       - How is a science team to know what is best for them?
     • Multiple OS
       - Even within one machine, e.g. Blue Gene, Red Storm
       - How to effectively and efficiently administer such systems?
     • Diverse programming environment
       - Science teams need a robust environment that presents similar programming interfaces and tools across the different machines
     • Diverse system management environment
       - Managing and scheduling multiple node types
       - System updates, accounting, … everything will be harder in round 2

  17. Fault Tolerance
     • Holistic fault tolerance
       - Research into schemes that take into account the full impact of faults: application, middleware, OS, and hardware
     • Fault tolerance in systems software
       - Research into prediction and prevention
       - Survivability and resiliency when faults cannot be avoided
     • Application recovery
       - Transparent failure recovery
       - Research into intelligent checkpointing based on active monitoring, sophisticated rule-based recovery, diskless checkpointing, …
       - For petascale systems, research into recovery without checkpointing
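
As a concrete illustration of the diskless-checkpointing idea mentioned above: nodes keep checkpoint data in memory and a parity buffer (the XOR of all checkpoints) allows a single lost node's state to be rebuilt without touching the parallel file system. The sketch below is a toy, single-process illustration of XOR parity; the buffer contents and sizes are made up, and it is not the suite's checkpoint code.

```python
# Toy illustration of XOR-parity diskless checkpointing (not the SSS code):
# N "nodes" hold checkpoint buffers, a parity buffer is the XOR of all of
# them, and any single lost buffer can be rebuilt from the survivors.
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(checkpoints):
    """Parity buffer = XOR of every node's checkpoint (equal lengths assumed)."""
    return reduce(xor_bytes, checkpoints)

def recover(lost_index, checkpoints, parity):
    """Rebuild the checkpoint of the single failed node from parity + survivors."""
    survivors = [c for i, c in enumerate(checkpoints) if i != lost_index]
    return reduce(xor_bytes, survivors, parity)

if __name__ == "__main__":
    ckpts = [bytes([i] * 8) for i in range(4)]   # pretend per-node state
    parity = make_parity(ckpts)
    rebuilt = recover(2, ckpts, parity)
    assert rebuilt == ckpts[2]
    print("node 2 checkpoint recovered from parity")
```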

  18. Interactive Computing
     • Batch jobs are not always the best for science
       - Good for large numbers of users and a wide mix of jobs, but the National Leadership Computing Facility has a different focus
     • Computational steering as a paradigm for discovery
       - Break the cycle: simulate, dump results, analyze, rerun simulation
       - More efficient use of the computer resources
     • Needed for application development
       - Scaling studies on terascale systems
       - Debugging applications which only fail at scale
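
To illustrate what computational steering means in practice, here is a minimal, hypothetical sketch: a simulation loop that checks for updated parameters between timesteps, so a user can adjust the run instead of dumping results and resubmitting. The control-file name and the parameter are invented for the example.

```python
# Minimal sketch of a computational-steering loop (illustrative only):
# between timesteps the simulation polls a small control file, so a user
# can adjust a parameter while the job runs instead of resubmitting it.
import json
import os
import time

CONTROL_FILE = "steer.json"   # hypothetical control channel written by the user

def read_controls(params):
    """Merge any user-supplied overrides into the current parameters."""
    if os.path.exists(CONTROL_FILE):
        with open(CONTROL_FILE) as f:
            params.update(json.load(f))
    return params

def run(steps=100):
    params = {"viscosity": 0.01}          # made-up simulation parameter
    state = 0.0
    for step in range(steps):
        params = read_controls(params)    # pick up steering input, if any
        state += params["viscosity"]      # stand-in for one real timestep
        if step % 10 == 0:
            print(f"step {step}: viscosity={params['viscosity']} state={state:.3f}")
        time.sleep(0.01)

if __name__ == "__main__":
    run()
```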

  19. File System and I/O Research
     • Lustre is today's answer
       - There are already concerns about its capabilities as systems scale up to 100+ TF
     • What is the answer for 2010?
       - Research is needed to explore the file system and I/O requirements of the petascale systems that will be here in 5 years
     • I/O continues to be a bottleneck in large systems
       - Hitting the memory access wall on a node
       - Too expensive to scale I/O bandwidth with teraflops across nodes
       - Research needed to understand how to structure applications or modify I/O so applications run efficiently
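
One common way applications are restructured for I/O, in the spirit of the last bullet, is to funnel writes through a subset of aggregator ranks rather than having every process hit the file system at once. The sketch below uses mpi4py and NumPy purely for illustration; the aggregation factor, file names, and data are assumptions, and this is not something the suite itself prescribes.

```python
# Illustrative sketch of I/O aggregation with mpi4py (not part of the SSS suite):
# every rank produces data, but only a few "aggregator" ranks gather it and
# write, reducing the number of clients hammering the parallel file system.
from mpi4py import MPI
import numpy as np

AGGREGATORS_EVERY = 8   # assumed aggregation factor for the example

def main():
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    local = np.full(4, rank, dtype=np.int32)          # stand-in for real output

    # Split ranks into groups of AGGREGATORS_EVERY; rank 0 of each group writes.
    group = comm.Split(color=rank // AGGREGATORS_EVERY, key=rank)
    gathered = group.gather(local, root=0)

    if group.Get_rank() == 0:
        data = np.concatenate(gathered)
        np.save(f"chunk_{rank // AGGREGATORS_EVERY}.npy", data)

if __name__ == "__main__":
    main()
```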

  20. Security
     • New, stricter access policies at computer centers
       - Attacks on supercomputer centers have gotten worse
       - One-time passwords, PIV?
       - Sites are shifting policies, tightening firewalls, going to SecurID tokens
     • Impact on scientific progress
       - Collaborations within international teams
       - Foreign nationals' clearance delays
       - Access to data and computational resources
     • Advances required in system software
       - To allow compliance with different site policies and handle the tightest requirements
       - Study how to reduce the impact on scientists

  21. Meeting notes
     Al Geist – see slides.
     Craig Steffen – Exciting new race condition!
     • Nodes go offline and Warehouse doesn't find out quickly enough; the Event Manager, scheduler, and lots of other components are affected.
     • The problem grows linearly with system size.
     • Order of operations needs to be considered, something we haven't considered before.
     • The issue can be reduced but can't be solved entirely; good discussion on ways to reduce race conditions.
     SSS use at NCSA
     • Paul Egli rewrote Warehouse: many new features added, now monitoring sessions, all configuration is dynamic, multiple debugging channels.
     • Sandia user tested it to 1024 virtual nodes.
     • Web site: http://arrakis.ncsa.uiuc.edu/warehouse/
     • New hire full time on SSS; lining up T2 scheduling (500 proc).
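
To make the "order of operations" point concrete, one generic mitigation is for the scheduler to sequence-number node-status events, keep only the newest report per node, and re-check freshness immediately before dispatching a job. The sketch below is a made-up illustration of that pattern; it is not the actual Warehouse or Maui logic, and the freshness window is an assumed value.

```python
# Generic sketch of one way to reduce the node-offline race: keep only the
# newest status event per node and re-check freshness right before dispatch.
# Illustrative only; not the actual Warehouse/Maui implementation.
import time

MAX_AGE = 10.0   # seconds a status report is trusted (assumed value)

class NodeStateCache:
    def __init__(self):
        self._state = {}   # node -> (sequence, timestamp, status)

    def on_event(self, node, sequence, status):
        """Apply a status event only if it is newer than what we already have."""
        current = self._state.get(node)
        if current is None or sequence > current[0]:
            self._state[node] = (sequence, time.time(), status)

    def usable(self, node):
        """A node is schedulable only if its last report is 'up' and fresh."""
        entry = self._state.get(node)
        if entry is None:
            return False
        _, seen, status = entry
        return status == "up" and (time.time() - seen) < MAX_AGE

if __name__ == "__main__":
    cache = NodeStateCache()
    cache.on_event("node07", sequence=41, status="up")
    cache.on_event("node07", sequence=40, status="down")   # stale, ignored
    print("dispatch to node07?", cache.usable("node07"))
```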

  22. Meeting notes
     Paul Hargrove – Checkpoint Manager / BLCR status
     • AMD64/EM64T port now in beta (crashes some users' machines).
     • Recently discovered a kernel panic during signal interaction (must fix at hackerfest).
     • Next step: process groups/sessions; begin next week.
     • LRS-XML and Events "real soon now".
     • Open MPI checkpoint/restart support by SC2005.
     • Torque integration done at U. Mich. for a PhD thesis (needs hardening).
     • Process manager: MPD rewrite ("refactoring"); getting a PM stable and working on BG.
     Todd K – Scalability and Robustness tests
     • ESP2 efficiency ratio 0.9173 on 64 nodes.
     • Scalability: Bamboo 1000-job submission; Gold (Java version) reservation slow, Perl version not tested; Warehouse up to 1024 nodes; Maui on 64 nodes (needs more testing).
     • Durability: node warm stop, 30 seconds to Maui notification; node warm start, 10 seconds; node cold stop, 30 seconds.
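
For context on how BLCR is typically driven, here is a hedged sketch of scripting its standard commands (cr_run, cr_checkpoint, cr_restart). The exact options and the default context-file name vary by BLCR version, so both are flagged as assumptions below; the application path is a placeholder.

```python
# Hedged sketch of driving BLCR from a script. It assumes cr_run,
# cr_checkpoint, and cr_restart are installed, and that cr_checkpoint
# writes a context file for the given PID (default name assumed to be
# "context.<pid>"; check your BLCR version's documentation).
import subprocess
import time

def run_with_blcr(cmd):
    """Start an application under BLCR so it can be checkpointed later."""
    return subprocess.Popen(["cr_run"] + cmd)

def checkpoint(pid):
    """Take a checkpoint of a running BLCR-enabled process."""
    subprocess.run(["cr_checkpoint", str(pid)], check=True)
    return f"context.{pid}"            # assumed default context-file name

def restart(context_file):
    """Restart a process from a previously written context file."""
    return subprocess.Popen(["cr_restart", context_file])

if __name__ == "__main__":
    proc = run_with_blcr(["./my_app"])   # hypothetical application
    time.sleep(30)
    ctx = checkpoint(proc.pid)
    # ... later, possibly after a node reboot:
    # restart(ctx)
```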

  23. Meeting notes
     Todd K – testing, continued
     • Single node failure: good.
     • Resource hog (stress); resource exhaustion on the service node (Gold fails in its logging package).
     • Anomalies seen in Maui, Warehouse, Gold, happynsm.
     To do
     • Test the BLCR module.
     • Retest on a larger cluster.
     • Get the latest release of all software and retest.
     • Write a report on the results.
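
For reference, a throughput run like the "1000 job submission" test above is usually just a timed loop of submissions. The sketch below is a generic harness with a placeholder submit command and job script, since the actual test scripts are not shown in the notes.

```python
# Rough sketch of a job-submission throughput test in the spirit of the
# "Bamboo 1000 job submission" run. The submit command and job script are
# placeholders; substitute whatever the queue manager under test expects.
import subprocess
import time

SUBMIT_CMD = ["qsub", "sleep_job.sh"]   # placeholder submission command
NUM_JOBS = 1000

def main():
    failures = 0
    start = time.time()
    for _ in range(NUM_JOBS):
        result = subprocess.run(SUBMIT_CMD, capture_output=True)
        if result.returncode != 0:
            failures += 1
    elapsed = time.time() - start
    print(f"{NUM_JOBS - failures}/{NUM_JOBS} jobs accepted "
          f"in {elapsed:.1f}s ({NUM_JOBS / elapsed:.1f} submissions/s)")

if __name__ == "__main__":
    main()
```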

  24. Meeting notes
     Brett Bode – Resource Management status
     • New release of components: Bamboo v1.1, Maui 3.2.6p13, Gold 2b2.10.2.
     • Gold being used on a Utah cluster; SSS suite on several systems at Ames.
     • New Fountain component to front-end Supermon, Ganglia, etc.
     • Demos a new tool called Goanna for looking at Fountain output; it has the same interface as Warehouse, so it could plug right in.
     • General release of Gold 2.0.0.0 available: new Perl CGI GUI, no Java dependency at all in Gold.
     • X509 support in Mcom (for Maui and Silver).
     • Cluster scheduler: bunch of new features. Grid scheduler: enabled basic accounting for grid jobs.
     • Future work: Gary needs to get up to speed on the Gold code; make it all work with LRS.
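
Since Gold and Maui talk over SSSRMAP-style XML requests, a rough sketch of building such a query is shown below. The element and attribute names here are simplified stand-ins paraphrasing the general shape of the protocol, not the exact SSSRMAP schema, framing, or signature rules; consult the protocol specification for the real format.

```python
# Illustrative sketch of building an SSSRMAP-style XML request to an
# accounting component such as Gold. The element and attribute names here
# are simplified stand-ins, not the exact SSSRMAP schema; see the protocol
# spec for the real envelope, framing, and signature rules.
import xml.etree.ElementTree as ET

def build_query(actor, object_name, get_fields):
    """Build a hypothetical Query request for the given object and fields."""
    envelope = ET.Element("Envelope")
    body = ET.SubElement(envelope, "Body")
    request = ET.SubElement(body, "Request", {"action": "Query", "actor": actor})
    obj = ET.SubElement(request, "Object")
    obj.text = object_name
    for field in get_fields:
        ET.SubElement(request, "Get", {"name": field})
    return ET.tostring(envelope, encoding="unicode")

if __name__ == "__main__":
    # e.g. "what is the allocation balance for each account?"
    print(build_query(actor="scheduler", object_name="Account",
                      get_fields=["Id", "Name", "Balance"]))
```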

  25. Meeting notes
     Narayan – LRS conversion status
     • All components in the center cloud converted to LRS: Service Directory, Event Manager, BCM stack, Process Manager.
     • Targeted for SC05 release.
     • SSSlib changeover: completed. SDK support: completed.
     Cobalt overview
     • SSS suite on Chiba and BG.
     • Motivations: scalability, flexibility, simplicity, support for research ideas.
     • Tools included: parallel programming tools.
     • Porting has been easy; now running on Linux, MacOS, and BG/L. Only about 5K lines of code. Targeted for Cray XT3, X1, ZeptoOS.
     • Unique features: small partition support on BG/L, OS spec support.
     • Agile: swap out components; user and admin requests are easier to satisfy.
     • Running at ANL and NCAR (evaluation at other BG sites); may be running on JAZZ soon.
     • Future: better scheduler, new platforms, more front ends, better docs.

  26. Meeting notes
     • Narayan – parallel tool development
     • Parallel Unix tools suite
     • File staging
     • Parallel rsync
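
As an illustration of what a "parallel rsync" style staging tool does, the sketch below fans one rsync invocation out to many nodes concurrently. The host list, paths, rsync flags, and worker count are placeholders; the actual tool under development is not shown in the notes.

```python
# Illustrative fan-out of rsync to many nodes in parallel, in the spirit of
# the parallel Unix tools / file staging work. Hosts, paths, and the choice
# of rsync flags are placeholders, not the project's actual tool.
import subprocess
from concurrent.futures import ThreadPoolExecutor

HOSTS = [f"node{i:03d}" for i in range(1, 65)]   # hypothetical node list
SRC = "/home/user/dataset/"                      # trailing slash: copy contents
DEST = "/scratch/dataset/"

def push(host):
    """Stage SRC to DEST on one node; return (host, exit code)."""
    cmd = ["rsync", "-az", SRC, f"{host}:{DEST}"]
    result = subprocess.run(cmd, capture_output=True)
    return host, result.returncode

def main():
    with ThreadPoolExecutor(max_workers=16) as pool:
        for host, rc in pool.map(push, HOSTS):
            if rc != 0:
                print(f"staging failed on {host} (rc={rc})")

if __name__ == "__main__":
    main()
```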
