Advanced User Support
Amit Majumdar
5/7/09
Outline
• Three categories of AUS
• Update on Operational Activities
• AUS.ASTA
• AUS.ASP
• AUS.ASEOT
Three Categories of AUS
• Advanced Support for TeraGrid Applications (AUS.ASTA)
  • AUS staff work on a particular user’s code (usually)
  • Guided/initiated by the allocations process
• Advanced Support for Projects (AUS.ASP)
  • Project initiated by AUS staff (jointly with users) – impacts many users
• Advanced Support for EOT (AUS.ASEOT)
  • Advanced HPC/CI part of EOT
Update on Operational Activities
• Biweekly telecon among AUS POCs from every RP site
  • Matching of AUS staff to ASTA projects; discussion of ASP and EOT activities
• Web/telecon among AUS technical staff
  • Biweekly technical tele/web conference on ASTA and other projects, using ReadyTalk
  • 13 presentation sessions (including today), 22 technical presentations
ASTAs Started in April 2009 - TRAC
• Total number of active ASTAs: ~40
ASTA Update - PI: Durisen, IU, Astrophysics
AUS staff: Henschel, Berry (IU)
• Legacy code, OpenMP parallel, benchmarked on three Altix systems (PSC, NCSA, ZIH): performance differed between them
• Optimized the subroutines that calculate the gravitational potential – 1.8X speedup
• Simulations generate several TBs of data, which are then analyzed interactively using IDL
• Transferring this amount of data to IU via traditional methods (ftp, scp, etc.) is extremely time-consuming and tedious
• By mounting the Data Capacitor at PSC on Pople, the user can write data directly to IU and then access it from servers in their department
• Extensive profiling and optimization of I/O performance on local scratch vs. Data Capacitor – eventually ~30% speedup in I/O
• Files appear locally as they are generated by the simulation
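A comparison such as local scratch vs. Data Capacitor can start from a simple write-throughput measurement. The sketch below is only illustrative and is not the profiling method used in this project; the function name `write_throughput_mb_s` and the block and file sizes are assumptions chosen for the example.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, size_mb=8, block_kb=1024):
    """Write size_mb of zeros into `directory`; return throughput in MB/s.

    Hypothetical helper for illustration: sizes are arbitrary defaults.
    """
    block = b"\0" * (block_kb * 1024)
    n_blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # make sure data has left the page cache
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    # Guard against a zero reading from a very coarse timer.
    return size_mb / max(elapsed, 1e-9)
```

Calling the function once per file system (for example, once on a scratch directory and once on a mounted remote directory) gives two numbers that can be compared directly; real measurements would use file sizes and access patterns that match the simulation output.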
ASTA Update - PI: Scheraga, Cornell, Biophysics
AUS staff: Blood, Mahmoodi (PSC)
• Identified a serial bottleneck and a load imbalance problem that limited parallel scaling
• Eliminated the serial bottleneck and restructured the code to eliminate the imbalance
• Resulting code performs 4 times faster for large systems
• Never-ending optimization
• The computation/communication balance has changed in the new code – further profiling ongoing using CrayPAT, TAU
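Why a serial bottleneck and load imbalance limit scaling can be illustrated with two standard measures: Amdahl's law and the ratio of maximum to mean per-process time. This is a generic sketch, not the analysis performed on this code; the function names and sample numbers are assumptions.

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Ideal speedup on n_procs when serial_fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


def load_imbalance(times):
    """Ratio of the slowest process time to the mean; 1.0 means perfectly balanced."""
    mean = sum(times) / len(times)
    return max(times) / mean


if __name__ == "__main__":
    # Hypothetical numbers: a 10% serial part caps the speedup below 10x.
    for p in (8, 64, 512):
        print(p, round(amdahl_speedup(0.10, p), 2))
    # Hypothetical per-process timings (seconds) with one overloaded process.
    print(load_imbalance([2.0, 1.0, 1.0, 0.0]))
```

All processes wait for the slowest one at each synchronization point, so an imbalance ratio of 2.0 means roughly half of the aggregate processor time is spent idle.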
ASTA Update - PI: Van de Walle, UCSB, Condensed Matter Physics
AUS staff: Liu (TACC), Vanmoer (NCSA)
• Main effort was to identify performance issues in VASP
• Identified the only routine that needed a lower optimization level (-O1); the others were compiled with -O3: resulted in ~10% performance improvement
• MKL on Ranger had SMP enabled with a default OMP_NUM_THREADS of 4; this caused overhead – fixed with proper wayness and thread settings
• Proper setting of NPAR (determines process grouping for band diagonalization and FFT) showed a 3-4 times speedup
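One plausible reading of the MKL overhead is oversubscription: if every MPI task on a node spawns 4 threads, the node runs more threads than it has cores. The sketch below shows the arithmetic for matching wayness (MPI tasks per node) to a thread count. The function names are hypothetical, and the figure of 16 cores per node is an assumption used for the example.

```python
def threads_per_task(cores_per_node, wayness):
    """Largest thread count per MPI task that does not oversubscribe a node."""
    if wayness < 1 or wayness > cores_per_node:
        raise ValueError("wayness must be between 1 and cores_per_node")
    return max(1, cores_per_node // wayness)


def oversubscription(cores_per_node, wayness, omp_num_threads):
    """Threads running per core; values above 1.0 indicate oversubscription."""
    return (wayness * omp_num_threads) / cores_per_node


if __name__ == "__main__":
    cores = 16  # assumed cores per node
    # 16 tasks per node with 4 threads each: 64 threads on 16 cores.
    print(oversubscription(cores, wayness=16, omp_num_threads=4))
    # Either run one thread per task, or lower the wayness to 4.
    print(threads_per_task(cores, wayness=16))
    print(threads_per_task(cores, wayness=4))
```

The resulting value would typically be exported as OMP_NUM_THREADS in the job script, alongside the wayness chosen in the batch request.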
Advanced Support Projects
• Two projects ongoing
  • Benchmarking Molecular Dynamics codes
  • Benchmarking Materials Science codes
• Other potential ones we are looking into
  • Multi-core performance analysis
  • Usage-based perf/profiling tools
Comparison of MD Benchmark on TeraGrid Machines at Different Parallel Efficiencies
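Parallel efficiency in a comparison like this is commonly defined relative to a reference run: E = (T_ref × p_ref) / (T_p × p). The sketch below uses that standard definition, which may differ from the one used for the benchmark; the function names and timings are hypothetical.

```python
def parallel_efficiency(t_ref, p_ref, t_p, p):
    """Efficiency of a run on p cores relative to a reference run on p_ref cores."""
    return (t_ref * p_ref) / (t_p * p)


def largest_core_count(timings, threshold):
    """Largest core count whose efficiency, relative to the smallest run, meets threshold.

    `timings` maps core count -> wall-clock time.
    """
    p_ref = min(timings)
    t_ref = timings[p_ref]
    ok = [p for p, t in timings.items()
          if parallel_efficiency(t_ref, p_ref, t, p) >= threshold]
    return max(ok)


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by core count.
    timings = {16: 100.0, 32: 52.0, 64: 30.0, 128: 20.0}
    print(largest_core_count(timings, 0.90))
    print(largest_core_count(timings, 0.50))
```

Comparing machines at a fixed efficiency, rather than at a fixed core count, shows how far each one can usefully scale the same benchmark.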
Advanced Support EOT
• Advanced HPC classes at various RP sites
• TG09
  • AUS staff participating in organizing TG09; reviewing papers
  • AUS staff will be presenting papers at TG09; presenting tutorials
  • Joint US-AUS-XS working group meeting at TG09