CTSS Rollout Update • Mike Showerman, JP Navarro • April 6, 2006
CTSS V3 software • Coordinated software amongst DTF resources • Intel compilers update to 8.1 • Common MPICH-GM • Common BLAS libs • Binary compatible goal
CTSS V3 environment • Variables • GLOBUS_LOCATION • HDF4_HOME • HDF5_HOME • MYPROXY_SERVER • SRB_HOME • TG_APPS_PREFIX • TG_CLUSTER_GPFS • TG_CLUSTER_HOME • TG_CLUSTER_PFS • TG_CLUSTER_PVFS • TG_CLUSTER_SCRATCH • TG_COMMUNITY • TG_EXAMPLES • TG_NODE_SCRATCH • Softenv keys
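A quick way to sanity-check a login environment against the variable list on this slide is sketched below. It is only an illustration: the variable names are copied from the slide, but the helper name `missing_ctss_variables` is hypothetical, and the assumption that every resource defines every variable may not hold (for example, a site without GPFS may reasonably leave `TG_CLUSTER_GPFS` unset).

```python
import os

# Variable names copied from the "CTSS V3 environment" slide.
CTSS_V3_VARIABLES = [
    "GLOBUS_LOCATION", "HDF4_HOME", "HDF5_HOME", "MYPROXY_SERVER", "SRB_HOME",
    "TG_APPS_PREFIX", "TG_CLUSTER_GPFS", "TG_CLUSTER_HOME", "TG_CLUSTER_PFS",
    "TG_CLUSTER_PVFS", "TG_CLUSTER_SCRATCH", "TG_COMMUNITY", "TG_EXAMPLES",
    "TG_NODE_SCRATCH",
]


def missing_ctss_variables(environ=None):
    """Return the listed variables that are unset or empty in `environ`.

    Hypothetical helper: `environ` defaults to the current process
    environment, and any mapping can be passed in for testing.
    """
    env = os.environ if environ is None else environ
    return [name for name in CTSS_V3_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_ctss_variables()
    if missing:
        print("Unset CTSS V3 variables: " + ", ".join(missing))
    else:
        print("All listed CTSS V3 variables are set.")
```

Run on a login node after loading the site's default softenv keys, this reports which of the listed variables are absent; whether an absent variable is an error depends on the resource.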
Rollout strategy • Plan involved a 3-phase rollout • Client software • Initial service deployment • Find problems in configuration/deployment • Final software release
Timelines • Initial timeline • November 11: • GIG packaging team completes initial Globus 4.0.1 package • GT4 client software deployed by SC05 • ~50% completion • December 1: • Software ready for deployment • December 13: • Final packages installed and ready for testing • Software completion by Jan 1 • Target missed
Timeline 2 • November 11: • GIG packaging team completes initial Globus 4.0.1 package • December 13: • GIG packaging team GT 4.0.1-r2 package complete and available for deployment • December 22: • RP sites complete deployment of GT 4.0.1-r2 package and begin service configuration • January 17: • Deadline established to start GT 4.0.1-r2 services • January 31: • RP sites completed the deployment of the GT 4.0.1-r2 services • February 13: • GIG packaging team begins building and testing GT 4.0.1-r3 packages along with v2 updated software • February 20: • GIG packaging team delivers GT 4.0.1-r3 package along with updated components to CTSS v2 for deployment at RP sites • March 6: • TeraGrid CTSS v3 software and services available to TeraGrid user support and Gateways communities for testing/validation • March 19: • TeraGrid CTSS v3 software stack and services available to TeraGrid users for testing/validation • April 1 (really the 3rd): • CTSS V3 production stack in place • NCSA adds 2 resources to TeraGrid production
Where We are • Time for a new timeline • Need to hit our deadlines • Need to define what constitutes “ready” • For testing • For switching to the default environment • I think it would be a mistake not to be solidly on the new stack before TeraGrid 06
Platforms • New Platforms • PPC: AIX 5.3 • IA-32: RH 9 • IA-64: SGI ProPack 3.4/RHEL3 • PPC: SLES9 BG/L • AMD64: SUSE9 Cray XT3 • Others coming • Existing Platforms • PPC: AIX 5.2 • IA-32: Debian Sarge, RHEL3, RHEL4, SLES8, SUSE9.1 • IA-64: SLES8 • Sparc: Solaris 9 • Alpha: Tru64 • 18 different machines • 13 platforms, more than NMI supports
Deployment as of April 3 • Software • Pacman 3.14+ • Intel compilers 8.1 and above on most Linux platforms (was 8.0) • mpich-gm 1.2.6..14b (where applicable) • TG Usage 2.1 • Condor 6.7.17 • TGCP 0.9.9 (pre-production) • Globus 4.0.1 (-r3) • Striped GridFTP (all sites) • Pre-WS GRAM (all sites) • WS GRAM, RFT, MDS4 (most sites) • RLS (some sites) • Full Client Toolkit (all sites)
Punch List as of 4/6/2006 • Globus • Static globus-url-copy for tgcp • Updated PBS JobManager • MDS4 configuration instructions • MPI-flavored Globus libraries • Other • Condor-G (installation documentation) • tgcp 1.0.0 (installation testing) • HDF4, HDF5, PHDF5 (packaging and installation testing) • SRB 3.4.0 client, GridFTP with SRB DSI support • MPICH-G2 • TG Resource ID • Debugging Globus vendor and MPI flavor build issues on some platforms • Debugging Globus service issues on some platforms
New Timeline • April 6 • GIG packaging team completes remaining GT4 additions • April 14 • GIG packaging team completes remaining items to be packaged • April 21 • RP sites complete the installation of remaining packages • Initial documentation complete • April 24 • Internal testing begins on all resources: • Testing of user apps with libraries/compilers • Verification of upgraded services (GRAM jobs representing user workload, GridFTP) • Validation that new services meet the requested needs • Specifically web services • May 5 • Documentation updates complete • May 8 • User testing begins • June 1 • Default software stack switch • Services on default ports • July 1 • Removal of old services and non-default v2 software
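To see how much time the new timeline leaves between milestones, the dates above can be laid out as offsets from the April 6 start. The sketch below assumes every date falls in 2006 (taken from the date of this update); the milestone labels are shortened paraphrases of the slide and `days_from_start` is a hypothetical helper.

```python
from datetime import date

# Milestone dates from the "New Timeline" slide; the year 2006 is assumed
# from the date of this update. Labels are shortened paraphrases.
MILESTONES = [
    (date(2006, 4, 6), "Remaining GT4 additions complete"),
    (date(2006, 4, 14), "Remaining items packaged"),
    (date(2006, 4, 21), "RP installation and initial documentation complete"),
    (date(2006, 4, 24), "Internal testing begins"),
    (date(2006, 5, 5), "Documentation updates complete"),
    (date(2006, 5, 8), "User testing begins"),
    (date(2006, 6, 1), "Default software stack switch"),
    (date(2006, 7, 1), "Old services and non-default v2 software removed"),
]


def days_from_start(milestones):
    """Return (label, days since the first milestone) for each milestone."""
    start = milestones[0][0]
    return [(label, (day - start).days) for day, label in milestones]


if __name__ == "__main__":
    for label, offset in days_from_start(MILESTONES):
        print(f"day {offset:3d}: {label}")
```

Laid out this way, the gaps between milestones are easy to compare, for example the interval between the start of internal testing and the start of user testing.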