PPS: Release and Interactions with ITR team
Author: Antonio Retico (SA1)
Location: CERN (29-Jun-07)
PPS in a nutshell
• Pre-Production Service for EGEE
• Mandate: “To give early access to new services in order for WLCG/EGEE users to evaluate new features and changes in the release”
• PPS grid counts ~30 sites
• Operations supported by the EGEE ROCs
• Coordination done at CERN (responsible: Nick Thackray)
www.cern.ch/pps
PPS “+”
• The quality of gLite benefits from PPS:
  • Testing of deployment procedures and software in real operational conditions
  • Debugging of new functionality done by the applications/VOs
  • Feedback for early bug fixes to the release before moving to production
• PPS is the “production” grid for the Diligent VO
  • 6 sites in PPS exclusively support Diligent
  • https://twiki.cern.ch/twiki/bin/view/DILIGENT/DiligentInfrastructurePps
  • The DILIGENT project aims to support a new research operational mode by enabling the creation of on-demand digital libraries.
PPS “–”
• Share of work spent “around” PPS (last three months):
  • Standard Usage (5%): VOs use regularly released SW
  • Special Activities (37%): VOs test non-certified SW
    • SRMv2 integration
    • Fix for VoViews
    • …
  • Release Testing (15%): a few selected sites do pre-deployment testing
  • Operations (43%): ~30 sites keep a service running
• High operation costs compared to poor (standard) usage by VOs => revision of the mandate under study
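The cost/usage imbalance can be made concrete with a little arithmetic. The following is a minimal sketch using only the four percentages quoted on the slide; the dictionary keys and the `ratio` helper are names invented here for illustration.

```python
# Shares of effort "around" PPS over the last three months, as quoted on the slide.
# The key names are illustrative labels, not official category identifiers.
effort_share = {
    "standard_usage": 5,
    "special_activities": 37,
    "release_testing": 15,
    "operations": 43,
}


def ratio(numerator: str, denominator: str) -> float:
    """Return how many units of one activity are spent per unit of another."""
    return effort_share[numerator] / effort_share[denominator]


# The four shares should account for all of the effort.
total = sum(effort_share.values())

# Operations effort spent for each unit of standard VO usage.
ops_per_standard_usage = ratio("operations", "standard_usage")

print(f"total = {total}%")
print(f"operations / standard usage = {ops_per_standard_usage:.1f}")
```

Under these figures, operations take several times the effort that standard usage accounts for, which is the imbalance behind the proposed revision of the mandate.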
ITR and PPS
• ITR and PPS Coordination work together on:
  • Release steering in EMT
    • priorities in release
    • actions on severity of bugs
    • fast-tracking of urgent fixes
  • Middleware releases
    • shared tools and procedures (see next slide)
    • https://twiki.cern.ch/twiki/bin/view/LCG/PPSReleaseProcedures
ITR and PPS: Release Process
[Diagram: release timeline. Labels: Thu, Fri, Mon, Tue, “…3.5 weeks later...”, Mon, Wed, “list of issues”, Tue]
Release Schedule: Theory
Release Schedule: Practice
• Estimation of work done in PPS (last three months):
  • 3 releases per month on average
[Chart labels: VoViews, slc4-WN, SRMv2, Standard Releases]
Two words of explanation
• Theory and practice differ because of:
  • Fast-tracking (decided by EMT)
    • Security patches
    • Important fixes
    • [optional] Fixes for new bugs introduced by “Important fixes”
  • Special activities
    • requested by VOs
    • based on installation of uncertified middleware
    • supported by a restricted number of sites
    • managed separately from the release process
    • e.g. integration of SRMv2 pilot testbed in PPS
ITR and PPS: Emoticons
• Friendly and constructive interactions
• Solid, agreed and unambiguous procedures
• High-quality release documentation
• Significant improvements expected from new YAIM
• PPS site admins happy and responsive to releases
• Frequent requests for fast-tracking patches
  • overlapping releases and broken upgrade sequences
• Deployment task forces (e.g. WMS 3.1)
  • good for start-up but likely to overlook operational aspects
  • (short) stage in PPS always recommended
• quick fixes => new bugs
• still a lot of services “skip” PPS
quick fixes => new bugs
• PPS may be a buffer to production, but it is not a testbed: PPS is a grid and a service
• Recovering the service after a bad upgrade costs time (money) at more than 30 sites (not counting GD time)
• We do our best to protect PPS against accidents
  • APT mirroring, pre-deployment test
• But help from ITR is always appreciated
  • No fast-tracking unless really needed
  • Avoid grouping quick fixes and new features in a single patch
  • Mention known and possible issues with a patch in advance:
    • “high-risk” patches could possibly be tried in “isolated compartments” in PPS
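The three requests to ITR read like a pre-submission checklist. As an illustration only, here is a minimal sketch of how such a checklist could be expressed; the `Patch` fields and the `review` function are hypothetical and do not correspond to any existing PPS or ITR tool.

```python
from dataclasses import dataclass, field


@dataclass
class Patch:
    """Hypothetical patch description; every field name is an assumption."""
    name: str
    change_kinds: set                 # e.g. {"fix"}, {"feature"}, or both
    fast_track: bool = False          # fast-tracking has been requested
    urgent: bool = False              # e.g. security patch or important fix
    high_risk: bool = False
    known_issues: list = field(default_factory=list)


def review(patch: Patch) -> list:
    """Return the warnings a patch would raise against the three requests."""
    warnings = []
    if patch.fast_track and not patch.urgent:
        warnings.append("fast-tracking requested without urgent need")
    if {"fix", "feature"} <= patch.change_kinds:
        warnings.append("quick fixes and new features grouped in one patch")
    if patch.high_risk and not patch.known_issues:
        warnings.append("high-risk patch with no known or possible issues listed")
    return warnings


# Illustrative patch that mixes a fix with a feature and asks for fast-tracking.
example = Patch("example-patch", {"fix", "feature"}, fast_track=True)
for warning in review(example):
    print(warning)
```

A patch that raises no warnings is not guaranteed to be safe; the sketch only mirrors the three points listed on the slide.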
Services “skipping” PPS
• One valid argument: “no point in staying in the PPS stage so long if nobody tries out the software there”
• True. PPS nevertheless offers some added value:
  • pre-deployment testing (release notes, instructions)
  • “ops” SAM testing
• Skipping basic testing proved to be dangerous (issues in production caused by trivial bugs)
• Work needs to be done here at process level
• PPS openings (free thinking for further development):
  • middleware is service-oriented but updates to PPS are still handled as a single object
  • The correlation between PPS and PROD release numbers is practical, but it is an arbitrary choice
  • The same fixed stage in PPS for all the services is not strictly a requirement
  • The really important thing is to keep the sequence of updates for the same service
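The last point, keeping the sequence of updates per service rather than per release, can be sketched as a simple check. This is an illustrative assumption about how such a rule might be verified, not a description of an existing PPS procedure; the function name, the per-service sequence numbers and the sample deployment log are all made up for the example.

```python
from collections import defaultdict


def out_of_sequence(updates):
    """Return the updates that were deployed out of order for their own service.

    `updates` is a list of (service, sequence_number) pairs in deployment order.
    Each service is tracked independently, so services may advance at different
    speeds; only the order within a single service matters.
    """
    last_applied = defaultdict(int)   # highest in-order update seen per service
    violations = []
    for service, seq in updates:
        if seq == last_applied[service] + 1:
            last_applied[service] = seq
        else:
            violations.append((service, seq))
    return violations


# Hypothetical deployment log: WMS update 3 arrives before WMS update 2.
deployed = [("WMS", 1), ("SRMv2", 1), ("WMS", 3), ("SRMv2", 2), ("WMS", 2)]
print(out_of_sequence(deployed))
```

In this sketch the two services are never compared with each other, which reflects the idea that a common fixed stage for all services is not strictly required.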
Questions?