Continuous Delivery on an Enterprise Java Stack • Marc Fasel, Senior Consultant, Shine Technologies • @marcfasel • http://blog.shinetech.com
Shine Technologies • Specialises in enterprise software development • Blue-chip clients • Technology focus • Enterprise Java • Ruby • AWS • Mobile development
What is Continuous Delivery? • Rapid, repeatable, reliable, low-risk deployments • Automation of the software delivery process • Extension of Continuous Integration • Continuous check-in by developers • Automated build • Automated unit testing => Detect integration problems early
Business Motivation • Put the business in control of the software release cycle • Reduce risk • Deliver small batches to production • Increase agility • Allow for short iterations • Trigger deployments at any time
Continuous Delivery Project • Blue-chip Australian retailer • Website relaunch with nationwide marketing campaign • High-performance web application • Expected load of 1,000,000 requests per hour • 24/7 availability • Greenfield development • Rapid release of new features
Deployment Status Quo • Monthly deployments of software • Highly manual process • Expert knowledge required • Tedious and unpleasant • Failures due to manual process • Not rapid, repeatable, reliable, and low risk
Implementation of Continuous Delivery • Automated deploy to application server • Automated acceptance testing • Push-button promotion to user acceptance testing (UAT) environment • Manual user acceptance testing • Push-button release to production
Continuous Delivery Pipeline • (Diagram; source: Wikipedia)
Starting Point • Continuous Integration was already in place • Agile practices were emerging • Automated provisioning of virtual machines using Puppet was in place, so machines were reliably the same • Development and deployment were both managed by Shine => DevOps • Access to production machines
Environments (diagram) • Continuous Integration Server • Local • Build (Deploy) • UAT (Promote) • Production (Promote)
Production Environment • Cluster of 20 nodes • Load Balancer • Use of Puppet to create nodes • Tomcat and Apache connected to a single Oracle database
1. Automated Deploy to Build Server • Different parts work together: Build artifact, configuration, data • Build artifact is copied into application server • Configuration files (Tomcat, Apache Httpd) are copied • Manual non-destructive database migration beforehand
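The copy steps on this slide can be sketched in a few lines. This is a minimal illustration only, not the script used on the project: the function name `deploy`, the directory layout (`webapps`, `conf`) and the file names in the test data are assumptions, and stopping or restarting the application server is left out.

```python
import shutil
from pathlib import Path


def deploy(artifact: Path, config_dir: Path, tomcat_home: Path) -> None:
    """Copy a build artifact and its configuration files into a Tomcat-style layout.

    Hypothetical sketch: the directory names below are assumptions,
    not taken from the project described in the slides.
    """
    webapps = tomcat_home / "webapps"
    conf = tomcat_home / "conf"
    webapps.mkdir(parents=True, exist_ok=True)
    conf.mkdir(parents=True, exist_ok=True)

    # Step 1: copy the build artifact into the application server.
    shutil.copy2(artifact, webapps / artifact.name)

    # Step 2: copy every configuration file that belongs to this environment.
    for cfg in sorted(config_dir.iterdir()):
        if cfg.is_file():
            shutil.copy2(cfg, conf / cfg.name)
```

The database migration is deliberately absent, matching the slide: it happens manually beforehand.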
Database Migration • Ideal: Automated database migration • Database schema changes are in version control • Automated execution of database scripts during deployment and promotion • Not possible at client • DBA team get change requests with attached scripts • Review of database changes • Execution independent of software release cycle
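The "ideal" case above (versioned scripts executed automatically during deployment) might look roughly like the sketch below. It uses SQLite from the Python standard library purely so the example is runnable; the project used Oracle, and the `schema_version` table and the `migrate` function are hypothetical names introduced for illustration.

```python
import sqlite3
from typing import Dict, List


def migrate(conn: sqlite3.Connection, scripts: Dict[int, str]) -> List[int]:
    """Apply pending migration scripts in version order.

    `scripts` maps a version number to SQL text, as it might be loaded
    from files kept in version control. Returns the versions applied now.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}

    newly_applied = []
    for version in sorted(scripts):
        if version in applied:
            continue  # already executed in an earlier deployment
        conn.executescript(scripts[version])
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied
```

Because applied versions are recorded, running the same call again during promotion to the next environment is harmless: only scripts that environment has not yet seen are executed.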
2. Automated Acceptance Testing • Selenium • Run through major use cases • Write code; don’t record scripts • Use the Page Object pattern to aid reusability • Run with every deploy for quick feedback • Tests only ran against Firefox
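A short sketch of the page pattern mentioned above: each page gets one class that owns its locators, so tests call intent-level methods and survive markup changes. To keep the example self-contained it runs against a recording stand-in rather than a real browser; `FakeDriver`, `type_into`, `click` and the element ids are all hypothetical and are not the Selenium WebDriver API.

```python
class LoginPage:
    """Page object: the only place that knows the login page's element ids."""

    USERNAME_ID = "username"   # hypothetical element ids
    PASSWORD_ID = "password"
    SUBMIT_ID = "login-button"

    def __init__(self, driver):
        self.driver = driver

    def log_in(self, username: str, password: str) -> "LoginPage":
        # Tests call this one method instead of repeating locator details.
        self.driver.type_into(self.USERNAME_ID, username)
        self.driver.type_into(self.PASSWORD_ID, password)
        self.driver.click(self.SUBMIT_ID)
        return self


class FakeDriver:
    """Stand-in for a browser driver; it only records the requested actions."""

    def __init__(self):
        self.actions = []

    def type_into(self, element_id: str, text: str) -> None:
        self.actions.append(("type", element_id, text))

    def click(self, element_id: str) -> None:
        self.actions.append(("click", element_id))
```

In a real suite the stand-in would be replaced by a thin wrapper around the browser driver, while the tests that use `LoginPage` stay unchanged.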
Problems with Automated Acceptance Testing • Data must be set up or reset in the environment • No collision with test data used by developers
3. Push-button Promotion to UAT • Artifact, configuration, and data • Promotion: Copy existing artifact, don’t rebuild • All environment-specific configuration external • Database changes had been done beforehand
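"Copy existing artifact, don't rebuild" can be made checkable by comparing checksums after the copy, so the binary that reaches UAT is provably the one that passed the earlier stage. The sketch below is an assumption about how such a step could be written; `promote` and `sha256_of` are illustrative names, not part of the project's script.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def promote(artifact: Path, target_dir: Path) -> Path:
    """Copy an already-built artifact to the next environment without rebuilding."""
    target_dir.mkdir(parents=True, exist_ok=True)
    promoted = target_dir / artifact.name
    shutil.copy2(artifact, promoted)

    # The promoted file must be byte-identical to the one that was tested.
    if sha256_of(promoted) != sha256_of(artifact):
        raise RuntimeError(f"checksum mismatch while promoting {artifact.name}")
    return promoted
```

This only works when the artifact contains no environment-specific settings, which is why the slide insists that all such configuration lives outside it.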
4. Manual User Acceptance Testing • UAT for small features is no problem • The bottleneck was the manual regression tests; even with a good test plan they took two hours of tedious work • Batching of features was necessary to reduce the number of times the manual regression tests had to be performed • A good balance was two-week iterations with a few days of UAT and half a day of manual regression testing
Automated Multi-browser Acceptance Testing • Run Selenium tests against supported browsers and operating systems • Maintenance of such an environment is time-consuming • Testers were not technical enough to take charge of scripted acceptance testing • => Reliance on manual acceptance tests and, painfully, manual regression tests
Manual Regression Test • Run the manual regression test only after UAT is done; otherwise, if UAT fails, the manual regression test has to be redone => lots of work • Time pressure was high because of the half day needed after UAT was done
5. Push-button Release to Production • Same script as for UAT promotion: copy artifact, copy production configuration from svn, database setup
Zero Outage Deployment • Deploy at any time • Production deployment needs to be transparent to the user • Round-robin deployment in cluster • Session sharing • Programmatically remove node from load balancer cluster
Session Sharing • Different approaches possible • Application servers allow for automatic session replication • Sessions can be stored outside of the application server • Sessions can be kept on the client • We chose to implement client-side session sharing
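The slides do not say how the client-side sessions were implemented. One common approach, shown here strictly as an assumption, is to put the session data into a cookie and sign it with a secret shared by all nodes, so any node can verify and read it without server-side state. `encode_session` and `decode_session` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json


def encode_session(data: dict, secret: bytes) -> str:
    """Serialise session data and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(
        json.dumps(data, sort_keys=True).encode("utf-8")
    )
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_session(cookie: str, secret: bytes) -> dict:
    """Verify the signature and return the session data, or raise ValueError."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        raise ValueError("session cookie signature mismatch")
    return json.loads(base64.urlsafe_b64decode(payload))
```

Note that the payload is signed, not encrypted: the user can read it but cannot change it undetected, so nothing confidential should be stored this way.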
Load Balancer Visibility • Programmatically take cluster nodes out of Load Balancer • Remove from cluster • Deploy new software version • Add server back to cluster • Creative solution: health check file in Apache HTTPD
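The remove, deploy, add-back loop can be sketched as follows. The assumption here is that the load balancer polls a static file served by Apache HTTPD on each node and stops routing to a node when the file disappears; the slides only name the "health check file" idea, so the file name `healthcheck.html`, the function `rolling_deploy` and the polling behaviour are illustrative.

```python
from pathlib import Path
from typing import Callable, Iterable


def rolling_deploy(
    docroots: Iterable[Path],
    deploy: Callable[[Path], None],
    health_file_name: str = "healthcheck.html",  # hypothetical file name
) -> None:
    """Deploy to one node at a time, hiding each node from the load balancer."""
    for docroot in docroots:
        health_file = docroot / health_file_name

        # Remove from cluster: the health check now fails for this node.
        health_file.unlink(missing_ok=True)
        # A real script would wait here until the load balancer has noticed
        # and in-flight requests have finished.

        # Deploy the new version. If this raises, the node stays out of
        # rotation and the remaining nodes are left untouched.
        deploy(docroot)

        # Add the server back to the cluster.
        health_file.write_text("OK\n")
```

Because only one node is out of rotation at any moment, the rest of the cluster keeps serving traffic throughout the deployment.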
Advantages of Zero-Outage Deployment • Deploy any time vs. deployment window 7:00am Friday morning • Deployment becomes Business-As-Usual • If deployment fails we can redo it an hour later • Support for agile development • Controlled environment: no more manual steps • Deliver hot fixes within hours not days or weeks
Problem: Different Software Versions in Production • Round-robin deployment means multiple software versions are in the same cluster • User lands on server with new front-end feature • User submits page • User lands on server with previous version • Error • Only happens during deployment • We chose to ignore this
Reporting • Monitoring of # of exceptions in logs • Monitoring of down time of servers
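Counting exceptions in logs can be as simple as the sketch below, which tallies Java exception class names per log line. The regular expression and the assumed log format are illustrative guesses, not the monitoring setup used on the project.

```python
import re
from collections import Counter
from typing import Iterable

# Matches names such as java.lang.NullPointerException or java.sql.SQLException:
# an optional lowercase package prefix, then a class name ending in
# "Exception" or "Error".
EXCEPTION_NAME = re.compile(r"\b(?:[a-z_]\w*\.)*[A-Z]\w*(?:Exception|Error)\b")


def count_exceptions(lines: Iterable[str]) -> Counter:
    """Count the first exception class name found on each log line."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_NAME.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts
```

Comparing these counts before and after a release gives a quick signal that a deployment has introduced a new failure.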
Disaster Recovery • Roll Back • Deploy latest working version • No analysis of problem necessary • Almost immediately back in business • Roll Forward • Fix the problem and release again • Analysis may take time • Not feasible for a 24/7 application
Disaster Recovery: Example • Database table was missing index • UAT database had large tables truncated • Long running queries blocked each other • Whole cluster stopped working • Roll-back restored operations • Root-cause analysis
Continuous Deployment • Takes Continuous Delivery one step further • Promote code to Production as soon as it is ready • Reduce batch size to one feature per release • Beyond two-week sprints: deliver ASAP
Disadvantages of Continuous Deployment • Business may request more deployments • Manual testing is still the most important test gate for regression; the more often you deploy, the less thorough this step will be • Continuous Delivery is not free: manual testing still takes time • Regression testing often takes more time than testing new features
Not Covered • Automated Performance Testing • Spot-check by developers using JMeter • Was done after major rewrites of the software by an external party • Production database passwords in version control • Systems that are not easily automated
Achievements • Rapid, reliable, repeatable, low-risk deployments • Extension of existing agile process • Two-week releases • Hotfixes any time • Happy developers • Happy business users