
The Staged Roll-out in the Transition EGEE → EGI

The Staged Roll-out in the Transition EGEE EGI. Antonio Retico EGEE09 Barcelona – 23 Sep 2009. Contents. Good Afternoon. The Staged Roll-out in the EGI development and deployment model The Staged Roll-out in the SA3/SA1 implementation


Presentation Transcript


  1. The Staged Roll-out in the Transition EGEE → EGI. Antonio Retico, EGEE09, Barcelona – 23 Sep 2009

  2. Contents
     Good Afternoon
     • The Staged Roll-out in the EGI development and deployment model
     • The Staged Roll-out in the SA3/SA1 implementation
     • Some technical considerations
     • Highlight on the status of the transition
     The Staged Roll-out - Antonio Retico - EGEE09 - 23rd Sep 2009

  3. What is the Staged Roll-out: a bit of context

  4. EGEE Development & Deployment / EGI Development & Deployment
     [Diagram comparing the two models. Recoverable labels: JRA1, SA3, SA1, Certification, PPS, Rollout; EGI.eu Middleware Unit, EGI.eu Operations Unit; Software Provider / Product Team / Cluster of Competence (Component Maintenance, Testing, Certification); NGI (Staged Rollout, Deployment Tests, Exp. Services, Pilot Services).]
     Source: Plans for Year II - Steven Newhouse - EGEE-III First Review 24-25 June 2009

  5. UMD and EGI: acquired facts
     • MW products developed and tested by independent “Product Teams”
     • Release to EGI (production) → SW released by Product Teams into the “Beta” repository
     • A thin “validation” layer provided by the EGI.eu Middleware Unit (MU) → “green light” → staged roll-out

  6. UMD, EGI and Production Service
     A process for MW staged roll-out is to be implemented
     • Deemed necessary by SA1, SA3 and WLCG Management
     • Protection mechanism for the production service
     • Applies to all MW updates
     Not a surprise for Operations people:
     • Local “buffering” solutions already applied at various regions and sites
     • The new idea is to share the results
     Implementation: work in progress [4]
     • Part of the EGEE → EGI transition
     • Currently using PPS resources to validate the new procedures

  7. Staged Roll-out: how we (will) do it

  8. Staged roll-out in a nutshell [2] (April 2009)
     A MW release v.N+1 is announced to the rollout sites by the MU. As defined by their SLAs, sites are expected to update ‘their’ services and to report on failures within the SLA-specified time period.
     • If no issues are filed within the SLA period, the release is ‘good’ for wider deployment
     Staged roll-out is not a compulsory waiting time: sites can skip the waiting time and proceed earlier, at their own risk.
     Staged roll-out is transparent for the product teams: for them, the component is released in production once it is given to the MU.

  9. Staged roll-out in a nutshell [2] To whom? How? When?
     Same statement as on slide 8, with the last sentence reworded: staged roll-out is transparent for the product teams: for them, the component is released once it is in the ‘beta’ repository.

  10. Staged roll-out in a nutshell [2]
      • Early Adopter (EA) sites are a club of production sites that commit to EGI to provide this service (OPT-IN approach)
      • Trade-off: receive the release earlier at the price of a risk of instability
      • Communication and announcements to EA sites happen through formal deployment tasks
      • The release pages for v.N+1 are ready but not yet the public default
        • Default release pages stop at Update N (stable release)
        • Link to the N+1 pages provided for sites that want to update (at their own risk)

  11. Staged roll-out in a nutshell [2] How? SLA?

  12. Staged roll-out in a nutshell [2]
      • Updates of the EA sites through a “delta” sw repository
        • Two operational repositories: “Current” (stable) and “Next” (newer version)
        • Both run by MU on behalf of OU
        • “Next” contains the final production package (e.g. not pps-* meta-pkg)
      • Upgrade issues in the task report
      • Operational issues reported through standard channels (e.g. GGUS, Savannah)
      • SLA
        • Time for upgrade: 1 day (tunable)
        • Quarantine: ~ 3 days / 1 week
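The SLA figures on this slide can be turned into dates with a few lines of Python. This is only a sketch: the names `RolloutSLA`, `sla_deadlines` and `good_for_wider_deployment` are hypothetical, the 3-day quarantine is one of the two values quoted, and counting the quarantine from the end of the upgrade window is an assumption the slides do not confirm.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RolloutSLA:
    # Figures quoted on the slide; both are described as tunable/approximate.
    upgrade_window: timedelta = timedelta(days=1)
    quarantine: timedelta = timedelta(days=3)


def sla_deadlines(announced, sla=RolloutSLA()):
    """Return (upgrade_deadline, quarantine_end) for one announcement."""
    upgrade_deadline = announced + sla.upgrade_window
    # Assumption: the quarantine starts when the upgrade window closes.
    quarantine_end = upgrade_deadline + sla.quarantine
    return upgrade_deadline, quarantine_end


def good_for_wider_deployment(now, quarantine_end, issues_filed):
    """'Good' means the quarantine has ended and no issues were filed."""
    return now >= quarantine_end and issues_filed == 0
```

A one-week quarantine would be expressed as `RolloutSLA(quarantine=timedelta(weeks=1))`.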

  13. Staged roll-out in a nutshell [2] Otherwise?

  14. Staged roll-out in a nutshell [2]
      • If there are issues (e.g. major problems introduced by the update)
        • The “Next” repository is emptied and the update is rejected
        • EA sites (still production sites) need to roll back (not necessarily simple; support from the MU may be needed)
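The two possible outcomes of the quarantine can be sketched as one small function. The dictionary layout and the name `close_quarantine` are hypothetical, and clearing “Next” after a successful promotion is an assumption; the slides only state that it is emptied on rejection.

```python
def close_quarantine(repos, product, issues):
    """Promote or reject the version of `product` held in "Next".

    `repos` maps repository name -> {product: version}, e.g.
    {"Current": {"X": "N"}, "Next": {"X": "N+1"}} (hypothetical layout).
    Returns "promoted" or "rejected".
    """
    # The candidate leaves "Next" in both cases (assumption for promotion).
    candidate = repos["Next"].pop(product)
    if issues:
        # Update rejected: "Current" is untouched, so early-adopter sites
        # roll back to the version that is still published there.
        return "rejected"
    repos["Current"][product] = candidate
    return "promoted"
```

Because “Current” is only written on promotion, the production repository never points at a rejected version.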

  17. SW Repos for Staged Roll-out: some technical considerations

  18. The Operational Repositories
      [Diagram: path of Product X version N+1 from the product team to the Production Infrastructure]
      • UMD Product team: release to “production” → “Beta” REPO (Product X version N+1)
      • Validation by the EGI Middleware Unit → PROD “Next” REPO (Product X version N+1, delta) → “early adopter” SITE, Service X(N+1)
      • End of “quarantine” → PROD “Current” REPO (Product X version N) → “standard” SITE, Service X(N)
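Read as a pipeline, the diagram has three repositories and two events that move a version forward. The sketch below encodes that reading; the names `Repo`, `TRANSITIONS`, `SITE_REPOS` and `path_to_production` are hypothetical, and the idea that early-adopter sites use “Current” plus the “Next” delta is an interpretation of the word “delta”, not something the slide states.

```python
from enum import Enum


class Repo(Enum):
    BETA = "Beta"
    NEXT = "Next"
    CURRENT = "Current"


# Event that moves a product version on to the following repository.
TRANSITIONS = {
    Repo.BETA: ("validation by the Middleware Unit", Repo.NEXT),
    Repo.NEXT: ("end of quarantine", Repo.CURRENT),
}

# Assumption: "Next" only holds the delta, so early adopters use both repos.
SITE_REPOS = {
    "early adopter": (Repo.CURRENT, Repo.NEXT),
    "standard": (Repo.CURRENT,),
}


def path_to_production(start=Repo.BETA):
    """Return the (from, event, to) steps from `start` up to "Current"."""
    steps, repo = [], start
    while repo in TRANSITIONS:
        event, following = TRANSITIONS[repo]
        steps.append((repo.value, event, following.value))
        repo = following
    return steps
```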

  19. The Operational Repositories
      Requirement: production repository in a consistent state, always
      Staged Roll-out process flexible with respect to implementations
      • No assumptions on the structure of repos
      • No strong assumptions on version naming conventions
      • BUT logistics rely a lot on the current Savannah “jra1mdw” patch configuration
      But we can give advice
      Process inherently sequential (leap-frogging not allowed)
      • While the staged roll-out of v.N+1 is pending, v.N+2 has to wait (or obsolete it)
      • Exceptions (e.g. critical security patches) manageable by increasing the release number of version N (e.g. 2.1.4-1 → 2.1.4-2)
      Independent product repositories possibly more efficient (parallel deployment)
      Alternatively, a bundle ID for the content of the delta repo may help (e.g. gLite Update #)
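The sequential rule and the release-number exception can be illustrated as follows. Both helpers are hypothetical, and the `<version>-<release>` format is taken only from the 2.1.4-1 example above; the slides explicitly make no strong assumptions on naming conventions.

```python
def bump_release(version):
    """Increase the release number: '2.1.4-1' -> '2.1.4-2'."""
    base, sep, release = version.rpartition("-")
    if not sep or not release.isdigit():
        raise ValueError(f"expected '<version>-<release>', got {version!r}")
    return f"{base}-{int(release) + 1}"


def next_action(pending, obsoletes_pending=False):
    """Decide what a new candidate may do (hypothetical helper).

    `pending` is the version currently in staged roll-out, or None.
    """
    if pending is None:
        return "start roll-out"
    # Leap-frogging is not allowed: wait, or obsolete the pending version.
    return "replace pending" if obsoletes_pending else "wait"
```

A critical security patch on version N would then be published as `bump_release("2.1.4-1")` without disturbing the pending roll-out of v.N+1.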

  20. Status of the transition: slowly but steadily getting there

  21. Rough timeline [4]
      [Timeline figure; the placement of items along the axis is not recoverable from the transcript]
      Date axis: 30 Jun, 31 Jul, 31 Aug, 30 Sep, 31 Oct, 30 Nov, 31 Dec
      Milestones: workplan ready, All sites meeting, EGEE09, Repos ready, LHC start?, GOCDB4?, Topology DB?
      Phases: preparation, transition 1, transition 2, consolidation
      Items listed after “preparation / transition 1”:
      • Task-based reporting
      • Populate PPS registry
      • Documentation
      • Management procedures
      • Test reports pages
      • Start the operations
      • Discontinue PPS deployment test
      • SAM and GridMap displays
      Items listed after “transition 2 / consolidation”:
      • Commitments into GOCDB
      • Modify PPS tools
      • Transfer resource mgmt to ROCs/NGI
      • Add more PROD sites
      • Interface with regional MW re-distributions
      Further items: Transition plan, Coordination with SA3, Requirements for GOCDB4, Prepare release documentation, Adapt PPS tools
      EGEE-SA1 Coordination Meeting – 28th Jul 2009

  22. References
      • [1] EGI: Managing the Software Process http://indico.cern.ch/getFile.py/access?sessionId=2&resId=0&materialId=1&confId=57092
      • [2] SA1: proposal and requirements for staged-roll-out of middleware updates https://edms.cern.ch/document/997514/
      • [3] SA1/SA3: Staged roll-out of grid middleware: general lines https://twiki.cern.ch/twiki/bin/view/EGEE/StagedRolloutOverview
      • [4] SA1: Implementation details and roadmap https://twiki.cern.ch/twiki/bin/view/EGEE/StagedRolloutSA1
      • All of them available on the PPS web site http://www.cern.ch/pps/index.php?dir=./rollout/

  23. Questions?
