Extreme Scalability RAT Report
www.teragridforum.org/mediawiki/index.php?title=RATs#Extreme_Scalability
Sergiu Sanielevici, sergiu@psc.edu

Mission and Membership
• Mission: Recommend how TeraGrid should meet the challenges faced by our users and ourselves in the productive utilization of Track 2 Systems as integrated into the TeraGrid.
• RAT chartered 6/28
• Members volunteered from all RPs: Alameda, Brown, Dennis, Gaither, Lathrop, Lynch, Majumdar, Milfeld, Nystrom, Sanielevici, Sheppard, Whitson.
• Deliverables:
  • Whitepaper describing challenges and proposed paths to solution
  • Draft charter of a new working group to deal with these challenges

Whitepaper Recommendations: Technical Challenge Areas
• Designing applications for scaling and robustness
• Coding for performance on multi-core systems
• Coding for performance on specific T2 architectures
• Tools for debugging applications at scale
• Tools for optimizing applications at scale
• Work and data flows for extracting knowledge from petascale simulations
TG to deal with these Challenges by building the infrastructure proposed in the following slides:

Whitepaper Recommendations: Infrastructure (1)
• TG to charter a new Extreme Scalability Working Group (XSWG)
• XSWG to spearhead creation of an Extreme Scalability R&D Grid (XSG):
  • Suitable TG machines including Track-2 access
  • Consistent R&D environment, including XS Kit in CTSS
  • Meta-scheduling system and policies
  • Documentation

Whitepaper Recommendations: Infrastructure (2)
• XSWG to draft, then help to implement, a new Extreme Scalability Allocations policy for granting access to the XSG
  • SU amount at today’s MRAC level, but supporting application and tools R&D and training
  • Clear-cut eligibility criteria, e.g.
    • NSF PetaApps awards and their equivalents sponsored by other programs or agencies;
    • Relevant NSF SDCI awards and their equivalents sponsored by other programs or agencies;
    • Applications, currently running on TG resources, that have demonstrated scalability to at least 4000 cores with at least 60% parallel efficiency;
    • Academic scientists, commercial vendor personnel and TeraGrid staff who collaborate on R&D projects undertaken in support of the XSWG mission;
    • Academic scientists, commercial vendor personnel and TeraGrid staff who collaborate on EOT projects undertaken in support of the XSWG mission.
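
The scalability criterion above (at least 4000 cores with at least 60% parallel efficiency) can be illustrated with a short calculation. The slides do not say how parallel efficiency is to be measured, so this is only a sketch under an assumption: it uses the common strong-scaling definition, measured speedup divided by ideal speedup relative to a smaller baseline run of the same problem. The function names and the timings in the example are hypothetical, not taken from the whitepaper.

```python
def parallel_efficiency(base_cores, base_seconds, cores, seconds):
    """Strong-scaling efficiency relative to a baseline run (assumed definition).

    efficiency = (base_seconds / seconds) / (cores / base_cores)
    """
    if min(base_cores, cores) <= 0 or min(base_seconds, seconds) <= 0:
        raise ValueError("core counts and run times must be positive")
    speedup = base_seconds / seconds      # measured speedup over the baseline
    ideal_speedup = cores / base_cores    # speedup if scaling were perfect
    return speedup / ideal_speedup


def meets_scalability_criterion(base_cores, base_seconds, cores, seconds,
                                min_cores=4000, min_efficiency=0.60):
    """Check the thresholds quoted on the slide (4000 cores, 60% efficiency)."""
    efficiency = parallel_efficiency(base_cores, base_seconds, cores, seconds)
    return cores >= min_cores and efficiency >= min_efficiency


# Hypothetical timings: 1000 s on 512 cores, 200 s on 4096 cores.
# Speedup is 5x against an ideal 8x, so efficiency is 0.625.
print(parallel_efficiency(512, 1000.0, 4096, 200.0))
print(meets_scalability_criterion(512, 1000.0, 4096, 200.0))
```

The result depends on the choice of baseline: efficiency measured against a 512-core run is generally higher than efficiency measured against a single-core run, so an actual policy would need to state which baseline applies.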

Whitepaper Recommendations: Infrastructure (3)
• XSWG to foster collaborative R&D projects
  • Between RPs, computational and computer scientists, applied mathematicians, and commercial vendors
  • Methods: write joint proposals to funding agencies, licensing, access to XSG, etc.
  • Resulting software may become part of CTSS XS Kit
• XSWG to coordinate the creation and teaching of HPC University workshops and modules.
• XSWG to work with the TeraGrid User Facing and EOT teams on documentation and dissemination.

Whitepaper Recommendations: XSWG Charter
• Report via GIG User Support Coordination Area
• Membership encouraged from all RPs, required from Track-2 sites
• Ensure presence of skills required to execute XSWG tasks by appointing Core Members:
  • WG Leader(s);
  • Leaders for each Framework task and each Challenge area;
  • Alternates to ensure presence at all WG activities.