Balint Joo, Jefferson Lab
BU SciDAC Meeting
Anisotropic Clover
• Why do it?
  • Anisotropy -> Fine Temporal Lattice Spacing at moderate cost
  • Combine with Group Theoretical Baryon Operators -> Access to Excited States
  • Nice preliminary results - with just Wilson
    • Excited states
    • States with spin 5/2+
http://arxiv.org/pdf/hep-lat/0601029
http://arxiv.org/pdf/hep-lat/0609052
Anisotropic Clover
• Why do it?
  • Part of the JLab three-prong Lattice QCD programme
    • Prong 1: Dynamical Anisotropic Clover
    • Prong 2: DWF on a staggered sea (MILC Configs)
    • Prong 3: Large Scale Dynamical DWF
  • This programme was specially commended by the DOE at our recent Science and Technology Review
  • Anisotropic Clover is a major part of the INCITE proposal (for XT3 and BG/? machines)
Anisotropic Clover
• Level 2
  • Clover Term and Inverse & Force Term
• Wired into Chroma -> Provides HMC/RHMC
• Our Choice of Gauge Action:
  • Plaquette + Rectangle + Adjoint Term
• Fermion Action
  • Anisotropic Clover + Stout Smearing
  • Stout Force Recursion
• Usual Barrage of DF techniques
  • Hasenbusch + Chronology for 2 flavours
  • RHMC for the +1 flavour
  • Multi time scale integrators
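As a reminder of what "multi time scale integrators" refers to, here is a minimal sketch of a nested leapfrog, one common scheme of this kind. It is not Chroma's integrator: the toy one-dimensional oscillator, the spring constants `K_FAST` and `K_SLOW`, and every function name are assumptions made for illustration. The force named "slow" stands in for the expensive one, applied on the coarse step; the "fast" force is the cheap one, applied on the fine step.

```python
def nested_leapfrog(q, p, dt, n_outer, n_inner, force_slow, force_fast):
    """Two-level leapfrog: slow force on step dt, fast force on step dt / n_inner."""
    h = dt / n_inner
    for _ in range(n_outer):
        p += 0.5 * dt * force_slow(q)          # outer half-kick
        for _ in range(n_inner):
            p += 0.5 * h * force_fast(q)       # inner half-kick
            q += h * p                         # drift
            p += 0.5 * h * force_fast(q)       # inner half-kick
        p += 0.5 * dt * force_slow(q)          # outer half-kick
    return q, p


# Toy model (assumption): H = p^2/2 + (K_FAST + K_SLOW) q^2 / 2
K_FAST, K_SLOW = 100.0, 1.0


def energy(q, p):
    return 0.5 * p * p + 0.5 * (K_FAST + K_SLOW) * q * q


q0, p0 = 1.0, 0.0
q1, p1 = nested_leapfrog(
    q0, p0, dt=0.1, n_outer=10, n_inner=10,
    force_slow=lambda q: -K_SLOW * q,
    force_fast=lambda q: -K_FAST * q,
)
dH = energy(q1, p1) - energy(q0, p0)
print("energy violation dH =", dH)
```

In an HMC setting `dH` is what enters the accept/reject step, so the point of the nesting is to keep it small while evaluating the expensive force as rarely as possible.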
CG Inverter Performance
We only got 7.3 Tflops on 8K CPUs :( - but we didn't work much at all on optimization
Clover Work Under SciDAC 2
• Performance is OK but want better...
• Optimizations
  • Clover
  • SSE Optimizations for Clusters & XT3
  • BAGEL terms for BG/???
  • Multi Mass Inverter, Trace Terms
• Would like to optimize the actual bottleneck
  • CG Inverter is not the current bottleneck
  • Help from our friends at RENCI at identifying the exact hotspots? (Right now we rely on gprof)
• Algorithmic: Temporal Preconditioning (later)
Thoughts at the back of my mind
• Are we actually going to get any time at ORNL?
  • We asked for a lot
  • I think 20M CPU hours just for the clover stuff
• INCITE proposal was extremely hurried
  • We had to respond very quickly
  • Many small groups did not have (stand?) a chance
• How much effort should we be investing?
• Should we be focusing on BlueGene/? and clusters more?
CRE and ILDG
• Progress on CRE has been slow. Why?
  • Manpower reasons in SciDAC 1?
  • People are happily running production already without it? In which case is it just LOW VALUE?
  • Where are the 'armies of new users' who need it?
• What are the issues?
  • Intimately tied to infrastructure at each site.
    • site infrastructure leverages off experiments
    • different everywhere
  • High Maintenance
    • PBS, LoadLeveler, NFS? dcache anyone?
    • upgrade of mvapich, OpenMPI, IB fabric etc
  • Inherently non-portable (what about ANL/ORNL)
CRE and ILDG
• If it has low value, no user demand, is high maintenance and won't work outside our sites...
  • is it worth doing?
  • can we just drop it? PLEASE?
• Anyway, common environments are so passé and 90s. Nowadays we should think about 'interoperable grid environments' - they're IN!
ILDG
• Middleware has progressed
  • but still on eXist MDC
  • dumb RC: (just remap the LFN to a FNAL dcache name)
• Issues:
  • Where is all the markup?
  • Eventually need more sophisticated RC?
  • Markup is NOT anisotropy aware (future fights in the MDWG - will take time)
• Working towards interoperability
  • Meeting at JLab Dec 11-13. Can folks from BNL and FNAL come?
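For concreteness, a "dumb" replica catalogue of this kind amounts to a string rewrite. The sketch below is illustrative only: the `lfn://` scheme handling, the function name `lfn_to_pfn`, and the dcache prefix are placeholders, not the actual JLab or FNAL configuration.

```python
def lfn_to_pfn(lfn, storage_prefix):
    """Remap a logical file name onto a storage path by swapping the prefix."""
    scheme = "lfn://"
    if not lfn.startswith(scheme):
        raise ValueError("not a logical file name: " + lfn)
    return storage_prefix.rstrip("/") + "/" + lfn[len(scheme):]


# Placeholder prefix (assumption), not a real dcache door.
DCACHE_PREFIX = "dcap://dcache.example.org/pnfs/lqcd"

pfn = lfn_to_pfn("lfn://mygroup/ensemble/cfg_0100.lime", DCACHE_PREFIX)
print(pfn)
```

A more sophisticated RC would instead look the name up and could return several replicas at different sites, which a pure prefix rewrite cannot express.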
Testing and Release
• Unit Testing vs. End to End Testing
• Too much existing code
  • We intermix
    • QMP, QDP++, QIO, XpathReader, LIME, Chroma, Wilson Dslash or BAGEL Dslash, possibly BAGEL linear algebra, level 3 CG-DWF
  • Unit testing all of these is difficult
• End to End Tests: Compare the final result
  • e.g. correlation functions
  • Lots of output - selective diffs?
  • QDP++ uses XML, selective diffs through XMLDiff
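The idea of a selective diff can be sketched as follows: only the elements named in a metric table are compared, each within its own tolerance, and everything else in the output (timings, host names and so on) is ignored. This is not the XMLDiff tool itself; the function name, the metric format and the sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def selective_diff(expected_xml, actual_xml, metrics):
    """Compare only the paths listed in metrics; return a list of mismatches."""
    expected = ET.fromstring(expected_xml)
    actual = ET.fromstring(actual_xml)
    failures = []
    for path, tol in metrics.items():
        exp_nodes = expected.findall(path)
        act_nodes = actual.findall(path)
        if len(exp_nodes) != len(act_nodes):
            # Structural mismatch: report the element counts.
            failures.append((path, len(exp_nodes), len(act_nodes)))
            continue
        for e, a in zip(exp_nodes, act_nodes):
            ev, av = float(e.text), float(a.text)
            if abs(ev - av) > tol:
                failures.append((path, ev, av))
    return failures


# Made-up sample output: the timing differs wildly, the correlator barely.
EXPECTED = "<run><timing>12.5</timing><corr><elem>0.5</elem><elem>0.25</elem></corr></run>"
ACTUAL = "<run><timing>99.0</timing><corr><elem>0.5000001</elem><elem>0.25</elem></corr></run>"

loose = selective_diff(EXPECTED, ACTUAL, {"./corr/elem": 1e-5})
strict = selective_diff(EXPECTED, ACTUAL, {"./corr/elem": 1e-9})
print(loose)
print(strict)
```

With the loose tolerance the run passes even though the timing changed; with the strict one the first correlator element is reported.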
Structure
• Test Consists of
  • Executable, Input XML, Expected Output XML
  • Metric file to decide which bits of the Output we need to check
• Runner - abstract away running
  • Trivial Runner (just re-echoes your commands)
  • MPIRUN runner (runs on 2 JLab IB nodes)
  • prototype YOD runner (for XT3)
  • LoadLeveler runner (for BG/L) - yucky
• Driver Scripts
  • run interactively (e.g. scalar targets) & check
  • submit jobs to a queue, check later (for queues)
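A minimal sketch of the runner abstraction, assuming each runner only has to turn a test command into the command line that actually launches it. The class names, the `wrap` method and the executable name `./test_exe` are hypothetical; the YOD and LoadLeveler runners are left out because their launch lines are site specific.

```python
import subprocess
import sys


class TrivialRunner:
    """Runs the command exactly as given (scalar, interactive targets)."""

    def wrap(self, argv):
        return list(argv)


class MpirunRunner:
    """Prefixes the command with an mpirun launch line."""

    def __init__(self, nprocs):
        self.nprocs = nprocs

    def wrap(self, argv):
        return ["mpirun", "-np", str(self.nprocs)] + list(argv)


def run_test(runner, argv):
    """Launch argv through the runner and return its standard output."""
    result = subprocess.run(
        runner.wrap(argv), capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical test executable and input file.
cmd = ["./test_exe", "input.xml"]
print(TrivialRunner().wrap(cmd))
print(MpirunRunner(2).wrap(cmd))

# Exercise run_test with a command that exists wherever Python does.
print(run_test(TrivialRunner(), [sys.executable, "-c", "print('ok')"]))
```

The driver scripts then need to know only about `run_test`; switching from an interactive scalar build to a parallel one means passing a different runner.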
What has testing taught us?
• We run through this regression framework nightly: gcc3, gcc4, scalar, parscalar-ib
• What runs fine with gcc3.x on RHEL won't necessarily run fine with gcc4.x on FC5
• Maintenance:
  • Keep up with compilers - identify problems
    • ICC - catastrophic error: can't allocate register (SSE inline)
    • VACPP (XLC) - 'Internal Compiler error: Please contact IBM representative' on templates
    • PGI: No inline assembler? intrinsics?
  • we really MUST focus on this issue
  • or will it be GCC 3.4.x forever (seems most stable so far)
SciDAC Release Pages?
• What's the actual problem here?
  • JLab page has releases that live in the JLab CVS
  • release directory with previous versions (by vox populi)
  • We strive to keep the pages up to date
• Not everyone uses JLab CVS. Why?
  • do you prefer to run your own repository?
  • do you want to use Subversion?
  • do you think only sissies use version control?
• Centralizing release management is bad
  • imagine if I had to be responsible for the release of a code that I myself could only pick up by web page?
• Is it only John Kogut who is unhappy?
A possible solution ...
• ... to the problem which may or may not exist
• A SourceForge-like setup (Gforge)
• Provides Per Project
  • Web-Space
  • Release Tarball Space
  • Source Code Management Modules (CVS & SVN)
    • May be able to 'proxy' for your own repo.
  • Mailing Lists, Bugtracker, Newsfeeds yadda yadda
  • Wiki-like authentication
• Our new Sysadmins are installing this at JLab
• But all the effort is wasted if folks don't use it...