
PPS : Release and Interactions with ITR team

PPS : Release and Interactions with ITR team. Author: Antonio Retico (SA1) Location: CERN (29-Jun-07). PPS in a nutshell. Pre-Production Service for EGEE


Presentation Transcript


  1. PPS : Release and Interactions with ITR team Author: Antonio Retico (SA1) Location: CERN (29-Jun-07)

  2. PPS in a nutshell • Pre-Production Service for EGEE • Mandate: “To give early access to new services in order for WLCG/EGEE users to evaluate new features and changes in the release” • PPS grid counts ~30 sites • Operations supported by the EGEE ROCs • Coordination done at CERN (resp. Nick Thackray) • www.cern.ch/pps

  3. PPS “+” • The quality of gLite benefits from PPS: • Testing of deployment procedures and software in real operational conditions • Debugging of new functionality done by the applications/VOs • Feedback for early bug fixes to the release before moving to production • PPS is the “production” grid for the Diligent VO • 6 sites in PPS exclusively support Diligent • https://twiki.cern.ch/twiki/bin/view/DILIGENT/DiligentInfrastructurePps • The DILIGENT project aims to support a new research operational mode by enabling the creation of on-demand digital libraries.

  4. PPS “–” • Share of work spent “around” PPS (last three months): • Standard Usage (5%): VOs use regularly released SW • Special Activities (37%): VOs test non-certified SW • SRMv2 integration • FIX for VoViews • … • Release Testing (15%): a few selected sites do pre-deployment testing • Operations (43%): ~30 sites keep a service running • High operation costs compared to poor (standard) usage by VOs => Revision of the mandate under study
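As a quick arithmetic check on the figures in slide 4, the sketch below confirms that the four shares account for all of the effort and compares operations with standard usage. The dictionary keys are labels chosen here for illustration; only the percentages come from the slide.

```python
# Effort shares from slide 4, in percent of work spent "around" PPS
# over the last three months. Key names are illustrative labels.
shares = {
    "standard_usage": 5,
    "special_activities": 37,
    "release_testing": 15,
    "operations": 43,
}

# The four categories should cover the whole effort.
total = sum(shares.values())

# How many times more effort goes into operations than into standard usage.
ops_to_usage = shares["operations"] / shares["standard_usage"]

print(f"total = {total}%")
print(f"operations / standard usage = {ops_to_usage:.1f}")
```

The ratio makes the slide's point concrete: operations take several times the effort that standard VO usage justifies.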

  5. ITR and PPS • ITR and PPS Coordination work together on: • Release steering in EMT • priorities in release • actions on severity of bugs • fast-tracking of urgent fixes • Middleware releases • shared tools and procedures (see next slide) • https://twiki.cern.ch/twiki/bin/view/LCG/PPSReleaseProcedures

  6. ITR and PPS: Release Process [Diagram: release timeline with the day labels Thu, Fri, Mon, Tue, “…3.5 weeks later...”, Mon, Wed, Tue and a “list of issues” item]

  7. Release Schedule: Theory

  8. Release Schedule: Practice • Estimation of work done in PPS (last three months): • 3 releases per month on average [Chart legend: VoViews, slc4-WN, SRMv2, Standard Releases]

  9. Two words of explanation • Theory and practice differ because of: • Fast-tracking (decided by EMT) • Security patches • Important fixes • [optional] Fixes for new bugs introduced by “Important fixes” • Special activities • requested by VOs • based on installation of uncertified middleware • supported by a restricted number of sites • managed separately from the release process • e.g. Integration of SRMv2 pilot testbed in PPS

  10. ITR and PPS: Emoticons • Friendly and constructive interactions • Solid, agreed and unambiguous procedures • High-quality release documentation • Significant improvements expected from new YAIM • PPS site admins happy and responsive to releases • Frequent requests for fast-tracking patches • overlapping releases and broken upgrade sequences • Deployment task forces (e.g. WMS 3.1) • good for start-up but likely to forget operational aspects • (short) stage in PPS always recommended • quick fixes => new bugs • still a lot of services “skip” PPS

  11. quick fixes => new bugs • PPS may be a buffer to production, but it is not a testbed: PPS is a grid and a service • Recovering the service after a bad upgrade costs time (money) to more than 30 sites (not counting GD time) • We do our best to protect PPS against accidents • APT mirroring, pre-deployment test • But help from ITR is always appreciated • No fast-tracking unless really needed • Avoid grouping quick fixes and new features in a single patch • Mention known and possible issues with a patch in advance: • “high-risk” patches eventually to be tried in “isolated compartments” in PPS

  12. Services “skipping” PPS • One valid argument: “no point to stay in PPS stage so long if nobody tries out the software there” • True. PPS offers nevertheless some added value: • pre-deployment testing (release notes, instructions) • “ops” SAM testing • Skipping basic testing proved to be dangerous (issues in production caused by trivial bugs) • Work needs to be done here at process level • PPS openings (free-thinking for further development): • middleware is service-oriented but updates to PPS are still handled as a single object • The correlation between PPS and PROD release numbers is practical, but it is an arbitrary choice • The same fixed stage in PPS for all the services is not strictly a requirement • The really important thing is to keep the sequence in the updates for the same service
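The last point of slide 12, keeping updates in sequence for each service even if services move through PPS independently, can be sketched as a small ordering check. This is only an illustration under stated assumptions: the `(service, sequence_number)` tuple layout, the integer numbering starting at 1, and the function name `out_of_sequence` are hypothetical and do not describe any actual PPS or gLite tool.

```python
from collections import defaultdict


def out_of_sequence(updates):
    """Return the updates that break per-service ordering.

    `updates` is a list of (service, sequence_number) pairs in deployment
    order. The pair layout and the integer numbering (1, 2, 3, ...) are
    hypothetical, chosen only for this illustration.
    """
    last_applied = defaultdict(int)  # highest in-sequence update per service
    violations = []
    for service, seq in updates:
        if seq == last_applied[service] + 1:
            last_applied[service] = seq  # next expected update: accept it
        else:
            violations.append((service, seq))  # skipped or repeated update
    return violations


# Services may interleave freely; only the order within a service matters.
deployed = [("WMS", 1), ("SRMv2", 1), ("WMS", 2), ("SRMv2", 3), ("WMS", 3)]
print(out_of_sequence(deployed))
```

Here the WMS updates are applied in order, while the SRMv2 update numbered 3 is flagged because update 2 was never applied. Tracking state per service rather than per release reflects the slide's suggestion that a single release number for all services is a convenience, not a requirement.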

  13. Questions?
